The added value of topic modelling for the humanities

Humanities scholars and the challenge of analysing very, very large collections

As digitally available textual repositories grow ever larger, distant reading has become increasingly relevant to the humanities.


Traditional methods are no longer suitable

Humanities scholars are increasingly confronted with the challenge of applying quantitative approaches in their research, because traditional close reading methods are no longer suitable for analysing such an unprecedented mass of digital data. One such quantitative approach is Topic Modelling (TM), a computational, statistical method for discovering patterns and topics in large collections of unstructured text.
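
To make the idea of discovering topics more concrete, here is a minimal sketch in Python. It assumes Latent Dirichlet Allocation (LDA), a widely used topic model, fitted with a toy collapsed Gibbs sampler; the text above does not say which model or software the workshop uses, so this is an illustration only. The function name `fit_lda`, its parameters, and the four tiny example documents are invented for this example.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not production code).

    docs is a list of tokenised documents (lists of words).
    Returns one dict per topic mapping each vocabulary word to its probability.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample: how well topic k fits the document,
                # times how well the word fits topic k.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * len(vocab))
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # A topic is a probability distribution over the whole vocabulary.
    return [
        {
            w: (topic_word[k][w] + beta) / (topic_total[k] + beta * len(vocab))
            for w in vocab
        }
        for k in range(num_topics)
    ]


# Made-up miniature corpus, already tokenised.
docs = [
    "ship port harbour voyage ship".split(),
    "voyage ship passenger port".split(),
    "wage strike factory labour wage".split(),
    "labour factory strike union".split(),
]

topics = fit_lda(docs)
for k, dist in enumerate(topics):
    top_words = sorted(dist, key=dist.get, reverse=True)[:3]
    print(f"topic {k}: {top_words}")
```

In this sketch a 'topic' is simply a ranked list of words with probabilities attached. Deciding what such a list means for a given corpus is the interpretive step that remains with the researcher.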


A workflow for the humanities

While many TM programs and tutorials are available, what still appears to be missing is a description of a generalisable TM workflow for the humanities. With this tutorial/workshop, I offer step-by-step guidance for a versatile method that can be applied across different datasets. Specifically, I provide a way to enrich the distant reading technique of TM with the qualitative information necessary for contextualising the TM results and opening up avenues for interpretation. This workflow is partially based on my article, Viola and Verheul (2019).

A three-hour session to speed up your data analysis

In this three-hour session, learners first get acquainted with the computational concept of 'topic'. After a practical demonstration, they can start experimenting with their own data and obtain first results for their research!

Please do get in touch if you are interested in organising this session at your institution.


