The 4CAT Capture and Analysis Toolkit
A Modular Tool for Transparent and Traceable Social Media Research
Keywords:
research tools, computational humanities, web scraping, 4cat, digital methodsAbstract
This paper introduces the 4CAT Capture and Analysis Toolkit (4CAT), an open-source Web-based research tool. 4CAT can capture data from a variety of online sources (including Twitter, Telegram, Reddit, 4chan, 8kun, BitChute, Douban and Parler) and analyze them through analytical processors. 4CAT seeks to make robust data capture and analysis available to researchers not familiar with computer programming, without ‘black-boxing’ the implemented research methods. Before outlining the practical use of 4CAT, we discuss three ‘affordances’ that inform its design: modularity, transparency, and traceability. 4CAT is modular because new data sources and analytical processors can be easily added and changed; transparent because it aims to render legible its inner workings; and traceable because of automatic and shareable documentation of intermediate analysis steps. We then show how 4CAT operationalizes these features through a description of its general setup and a short walkthrough. Finally, we discuss how 4CAT strives for an ‘ethics by design’ development philosophy that enables ethically sound data-driven research. 4CAT is then positioned as both an answer to and a further call for ‘tool criticism’ in computational social research.