Event

Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the United States, 1898-1920

  • Speaker  Lorella Viola

  • Location

    Webex platform

    LU

  • Topic(s)
    Humanities

Online seminar with Lorella Viola, postdoctoral researcher at the University of Luxembourg/C²DH

Topic modelling (TM) is a computational, statistical method for discovering patterns and topics in large collections of (unstructured) text (e.g., emails, book chapters, newspaper articles, letters, reports). Its main value lies in its ability to find semantic patterns that would be difficult to identify otherwise. Such patterns may, for instance, help to categorise documents in large archives or to discover the underlying topical structure of big corpora without the need to read each individual document. Although TM is potentially a very efficient distant reading technique, its output may sometimes be difficult to interpret.

In this seminar, Lorella Viola presents a method that uses the close reading technique Discourse-Historical Approach (DHA – Reisigl & Wodak 2001) to refine TM. This combined methodology, “discourse-driven topic modelling” (DDTM), aims to enable researchers to triangulate linguistic, social, and historical data in order to make the topics more interpretable and unlock the full potential of TM. To test the methodology, she investigates public discourses produced by Italian migrants in the United States using a corpus of digitised Italian ethnic newspapers published in the United States between 1898 and 1920 (ChroniclItaly – Viola 2018). The results proved DDTM to be effective in obtaining a relatively quick categorisation of the topics discussed in the immigrant press. Moreover, the changing distribution of topics over time revealed how the Italian immigrant community negotiated their sense of connectedness with both the host country and the homeland, and how the changing experience of migration, identity construction, and assimilation was reflected over time in the accounts of the minorities themselves. At the same time, without jeopardising the analytical depth of the findings, the method proved its value in minimising the risk of bias when identifying the topics, which stemmed from the data rather than from preconceived assumptions.

The seminar is organised by the Doctoral Training Unit “Digital History & Hermeneutics”.

If you want to participate, please send an email to vanessa.napolitano@ext.uni.lu to receive the Webex link.