In this article, Jessica Di Cocco discusses a topic around which different schools of thought, interpretations, and creeds exist: how to measure populism. In particular, she explores the new frontiers of measuring populism in speeches: automated content analysis, machine learning and text-as-data.
A sparkling, refreshing article, to cross old borders and chart new directions.
Enjoy the read.
When someone asks me what I do in my research, and I answer that one of my main focuses is the study of populism using textual approaches, I often notice a particular interest on the part of my interlocutor. This interest generally tends to grow when I specify that I work with computational tools, drawing on the boundless world of machine learning and natural language processing. The interlocutor’s interest is driven by the most diverse motivations: curiosity for interdisciplinarity or aversion to it; interest in controversial and debated phenomena; personal reasons; and individual civic interests, among others. I have noticed, in short, that there is a motivated curiosity that drives my fellow researchers, my students, and people outside academia to want to create a debate, nurture an exchange of views, and advance the discussion on these issues. When talking about populism and machine learning, we could think that we are putting together two of the hottest words of the moment. This article highlights a few reasons why the interest we feel in this combination might not be unfounded.
Let’s start with populism. Whether we work in academia or not, we have all heard of ‘populism’, even if not always with the same meaning. In the last decade or so, there has been a substantial increase in research on this topic and a growth in the use of the term in the mass media and everyday language. This increased attention has helped consolidate populism (and adjacent phenomena) among the most studied and sexy topics in political science, borrowed by other disciplines with a certain eagerness. One of the reasons for the impressive attention to this topic is the multiple impacts that the populist phenomenon has on political, social and economic levels. It can be observed from so many perspectives that one is spoilt for choice. It has been analysed as a political trend, a social movement, a consequence of the economic crisis, and a wake-up call for societies increasingly dissatisfied with democratic and economic systems. Scholars have highlighted and explored its strong links with radical ideologies, the loss of status, reduced social recognition, emotions, and inequality (for a general review of research on populism, see Hunger and Paxton). This brief overview of how the subject has been studied and explored gives us a rough idea of how far this topic has come in the academic literature. The phenomenon has been reviewed and dissected so much that we could almost speak of an infodemic of populism.

When in doubt…it is always useful to have a look at Ben Stanley’s schmopulism filter
One of the crucial questions remains the definition of populism. It is not only a matter of exploring the theoretical dimension but also of identifying the boundaries of this varied phenomenon for its inclusion in empirical analyses. Those familiar with quantitative research know that you cannot build a model without clearly understanding what the different variables measure and represent. Hence, a conceptualisation of populism, however sketchy and partial, is necessary for these purposes. To date, a general agreement is beginning to emerge on what populism is and how it can be defined.
Nonetheless, the very characteristics of the populist phenomenon, including its context-specificity and difficulty of definition, have made it challenging to quantify, so much so that studies have used a variety of methodologies, sometimes with unclear or conflicting results. The debate on the measurement of populism deserves some attention. For years, researchers have used purely dichotomous classifications indicating whether a party is populist or not. More recently, many scholars have stressed the added value of measuring levels of populism rather than sticking to a black-or-white classification (see Deegan‐Krause and Haughton; Pauwels; Aslanidis; Hawkins and Rovira Kaltwasser; Di Cocco and Monechi; Meijers and Zaslove). For example, measuring levels of populism might make it possible to capture variations and nuances that a binary classification misses. A related advantage of graduated measures is that they fit the assumption that populism can vary over time, both between and within political actors. A further advantage is the possibility of identifying populism’s ‘traces’ in parties and leaders that are not conventionally considered populist. As we explore the phenomenon and its facets, we look for methods that allow us to grasp (and measure) its complexity. This is where machine learning and text-as-data approaches, among others, come in.
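To make the contrast between binary and graduated measures concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each sentence has already been labelled populist or not, and it scores each party in each year by the share of its sentences labelled populist. The party names, years and labels are invented, and this share-based score is only one possible way of building a graduated measure, not the method of any study cited above.

```python
from collections import defaultdict

def populism_scores(labelled_sentences):
    """Return the share of sentences labelled populist for each (party, year)."""
    totals = defaultdict(int)
    populist = defaultdict(int)
    for party, year, is_populist in labelled_sentences:
        totals[(party, year)] += 1
        if is_populist:
            populist[(party, year)] += 1
    return {key: populist[key] / totals[key] for key in totals}

# Hypothetical, hand-made labels: (party, year, sentence labelled populist?)
labelled = (
    [("Party A", 2013, True)] * 2 + [("Party A", 2013, False)] * 2
    + [("Party A", 2018, True)] * 3 + [("Party A", 2018, False)] * 1
    + [("Party B", 2013, False)] * 4
    + [("Party B", 2018, True)] * 1 + [("Party B", 2018, False)] * 3
)

scores = populism_scores(labelled)
for (party, year), score in sorted(scores.items()):
    print(party, year, score)
```

In this toy data, a binary classification would simply call Party A populist and Party B non-populist. The graduated score also shows Party A's level rising between the two years and a small populist 'trace' appearing in Party B.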
Talking about machine learning and text-as-data in the era of rapid technological progress and big data is extremely attractive. Never before have we had access to such vast amounts of information and data, nor the tools and skills to analyse them. Inferring positions directly from the texts of political actors, analysing variations in discourse, studying the prominence or absence of specific topics, and investigating emotions and feelings are just some of the possibilities that text-as-data offers us. In the case of populism, these possibilities are complemented by the opportunity to deconstruct the phenomenon. Deconstruction potentially allows for analysing populism’s sub-components, such as anti-elitism, people-centrism and the Manichean view of society.
The world of text-as-data is as boundless as that of populism. When I talk about text-as-data, I usually (though not necessarily) move into the field of automated textual analysis. Scholars working in this field usually apply techniques typical of natural language processing, a subfield of linguistics, computer science, and artificial intelligence concerned with computerised processing and analysis of large amounts of natural language data. Text data consist of documents made up of words, sentences or paragraphs of text. They need to be pre-processed for machine learning methods to work with them (e.g., through tokenisation, lemmatisation, sentence splitting, stemming and so forth). Given the different applications of text-as-data, it is not surprising that they are also used to study populism and adjacent phenomena. I constantly work with automated textual analysis to measure the levels of populism in parties and analyse their trends. I would not have been able to conduct a large part of my research without automated analysis, because manual coding would have required too much money and labour.
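The pre-processing steps mentioned above can be sketched in a few lines of Python. This is a deliberately naive illustration using only the standard library: the sentence splitter, the tokeniser and especially the suffix-stripping `crude_stem` function are simplified stand-ins written for this example, not the tools used in actual research, and the example sentence is invented.

```python
import re

def split_sentences(text):
    """Naive sentence splitting: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenise(sentence):
    """Lower-case the sentence and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def crude_stem(token):
    """Toy stemmer: strip a few common English suffixes (not a real stemming algorithm)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Split into sentences, tokenise each sentence, then stem each token."""
    return [[crude_stem(t) for t in tokenise(s)] for s in split_sentences(text)]

# Invented example text
example = "The elites ignored the people. We are taking back control!"
processed = preprocess(example)
print(processed)
```

Real pipelines typically rely on established NLP libraries for these steps, and the choice between stemming and lemmatisation depends on the language and the research question.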
Whether automatic or not, text-based methods also have their limitations. For example, a ‘textual specificity’ component could prevent the results from generalising. It is also not always easy to collect enough texts to carry out a robust quantitative analysis. There are also issues related to the concepts we want to study through textual methods. Let’s take populism, which remains the core of this article, as an example. The difficulty of identifying the boundaries of this phenomenon, its relationship with host ideologies and the presence of multiple sub-components are real challenges for those who intend to try their hand at textual analysis to extract information about it.
Before giving some examples, it might be helpful to introduce machine learning a little better. Even for the uninitiated, the concept of ‘machine learning’ goes hand in hand with that of ‘algorithms’. However, not everyone knows that algorithms are commonly divided into supervised and unsupervised. Supervised algorithms learn from labelled examples to predict specific data labels (e.g. ‘is there a cat in this picture?’ or ‘is this sentence populist or non-populist?’). Unsupervised algorithms extract statistical patterns from the data without any specific target. An example of an unsupervised algorithm frequently used in textual analysis is topic modelling, a type of statistical model for discovering abstract ‘topics’ occurring in a collection of documents.
In my case, I find myself mainly working with supervised algorithms for several reasons: partly because they respond better to my research needs, and partly because the populist topic is so complex and overlaps with so many other issues that it would be challenging to think that an algorithm could independently capture it. When working with supervised algorithms, one of the indispensable steps is to understand how to carry out the training, not only in terms of tuning hyperparameters but also from a substantive point of view.
Indeed, algorithms learn to replicate human actions, so we must teach them effectively. Therefore, if I want to teach a supervised algorithm how to recognise populism and (or) its sub-components in a text, I have to be very careful in defining the concepts so that I am transparent about what I am teaching the algorithm. If the concepts are clear in my mind, I will also be able to identify them in the text and attribute the labels to a corpus more precisely.
Consequently, when the supervised algorithm uses that labelled corpus to learn how to recognise populism (or its sub-components), it will obtain much more accurate results by replicating my adequately precise actions. Of course, the explanation I have just given is a bit simplistic. For example, I have not touched on equally fundamental technical aspects, such as the importance of testing for prediction accuracy and reliability. Let us say that, in this article, we will mainly discuss the ‘substantive’ dimension of studying populism through automated textual analysis. We have repeatedly noted that populism is a complex phenomenon, and there is no unique and univocal view of it. We have also mentioned the distinction between supervised and unsupervised algorithms; we will now focus on the former, i.e. the supervised ones. Here is the question that, in my opinion, is worth pondering. If supervised algorithms learn to replicate human actions, then the way we label texts (whatever their length) affects the prediction results of that algorithm.
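The dependence of predictions on labelling can be illustrated with a minimal sketch. The Python code below implements a small multinomial Naive Bayes classifier from scratch, chosen here only because it is short and transparent. It is not presented as the algorithm used in the research described in this article, and the four training sentences and their labels are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict

def tokenise(text):
    """Lower-case the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train_naive_bayes(labelled_sentences):
    """Count words per label and sentences per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labelled_sentences:
        label_counts[label] += 1
        word_counts[label].update(tokenise(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for tok in tokenise(text):
            if tok in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy corpus with hand-assigned labels
training = [
    ("The corrupt elite has betrayed the ordinary people", "populist"),
    ("The people must take power back from the corrupt establishment", "populist"),
    ("The committee will publish the budget report next week", "non-populist"),
    ("The ministry announced a new schedule for road maintenance", "non-populist"),
]
new_sentence = "A corrupt elite ignores the people"

prediction = predict(train_naive_bayes(training), new_sentence)

# Same texts, opposite labels: the algorithm faithfully replicates the new coding
flipped = [(text, "non-populist" if label == "populist" else "populist")
           for text, label in training]
flipped_prediction = predict(train_naive_bayes(flipped), new_sentence)

print(prediction, flipped_prediction)
```

The classifier has no notion of populism of its own. It reproduces whatever the coder's labels imply, which is why the substantive side of training matters as much as the technical one.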
Consequently, our reasoning or training activities cannot ignore a broader element not linked explicitly to automated analysis, that of intersubjectivity. Intersubjectivity implies that even if each of us has a subjective view of the phenomenon’s characteristics, this view should at least be shared with others in our community. This ensures that our result and, more precisely, our way of labelling the data is not entirely subjective and, therefore, potentially subject to personal biases that make it hardly generalisable. As you can imagine, it is difficult to reach a level of spontaneous intersubjectivity on a complex and varied phenomenon such as populism. For this reason, when we code a corpus that will be used as a labelled dataset for training the algorithm, we prefer to rely on the activity of several coders. We measure how well these coders agree on attributing labels through a measure of inter-coder reliability (e.g., Cohen’s Kappa). Although they have their limitations, measures of inter-coder reliability give us essential information about how much intersubjectivity exists between coders and to what extent their agreement is not the result of chance. Clearly, some conditions must be met to facilitate agreement between coders. For example, it may be helpful to provide guidance on coding through a specific codebook.
Pitfalls are around the corner when coding text in search of populist elements. We may have prepared the most detailed codebook to train the coders, but some things may be beyond our control. For example, depending on how the sentences are ‘fragmented’, some may lose contextual references. It is not surprising that there is no univocal agreement among researchers even on the unit of analysis. Some prefer to work on sentences, some on paragraphs, and some on whole texts. Each approach has advantages and disadvantages that I will not elaborate on here, but the debate on the matter is lively.
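As a worked illustration of the inter-coder reliability measure mentioned above, the following Python sketch computes Cohen's Kappa for two coders. Kappa compares the observed agreement p_o with the agreement p_e expected by chance given each coder's label frequencies: kappa = (p_o - p_e) / (1 - p_e). The two lists of labels are invented for the example.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal label shares, summed over labels
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("Kappa is undefined when both coders use a single identical label")
    return (observed - expected) / (1 - expected)

# Invented labels for ten sentences ("pop" = populist, "non" = non-populist)
coder_a = ["pop", "pop", "pop", "non", "non", "non", "non", "non", "pop", "non"]
coder_b = ["pop", "pop", "non", "non", "non", "non", "non", "pop", "pop", "non"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

Here the coders agree on 8 of 10 sentences (p_o = 0.8), but since each uses 'pop' four times and 'non' six times, chance alone would produce p_e = 0.52, so Kappa is about 0.58 rather than 0.8.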

April 2022, Storace insists on his crusade on the powerful elites controlling our economy, in this case Mario Draghi
Therefore, going beyond the question of the unit of analysis, let us take the following sentence as an example to figure out possible pitfalls:
Perhaps I would speak of banking seigniorage, national sovereignty, the right to the money, in other words, issues that may attract citizens’ attention
This text is taken from a 2013 speech by Francesco Storace, an Italian right-wing leader. If we do not know who this leader is, in what context the sentence was uttered, or even the type of prosody, it may be difficult to label it as populist or non-populist. I posed this question to my Twitter community, asking whether this sentence could be considered populist, non-populist or ambiguous. Forty-six people, probably researchers in the political science field, participated in the survey game. 71.7% of respondents labelled the sentence as non-populist, 15.2% labelled it as populist, and 13% said they did not know how to classify it. Judging from the comments, one of the main difficulties in coding is the absence of references to the context.
For instance, knowing the subsequent sentence would help us better understand whether this one was used with populist or anti-populist tones. Another crucial element that emerged from the comments is the question of the sub-components of populism. In short, which of them must be present in a sentence for it to be considered populist? Some scholars agree on the presence of five sub-components (see Meijers and Zaslove): people-centrism, anti-elitism, the Manichean view of society, people’s general will and indivisibility. However, not all researchers define these five sub-components in the same way or give them the same relevance. There are also researchers focusing on other aspects, such as the presence of the charismatic leader or the moral dimension. Hence, the question ‘what sub-components should a sentence include to be coded as populist’ is not trivial: first, because it implies agreement on the sub-components; second, because it is also related to quantity, i.e. ‘how many sub-components must be present’. Of course, if we coded elements about healthy eating, we would all be more or less in agreement in labelling the texts; this is certainly not the case with populism.
I could continue to delve into automated textual analysis applied to populism studies by highlighting this and that aspect. It is an exciting but challenging world with room for numerous theoretical and practical reflections. However, I suppose that I have given you enough information for a first overview of the subject and a taste of the debate that accompanies it. I don’t know whether you are now more aware of why the appeal of combining words like ‘textual analysis’ and ‘populism’ is not unjustified. In my case, I am fascinated by it more and more every day. Obviously, as I have briefly shown in this article, there are limits and advantages. Still, it is advisable to know them, identify them and be aware of them, just as in many other branches of the social sciences. Research in this area is advancing with such speed that we can expect some interesting developments, and the texts to be analysed, fortunately, will never end. Stay tuned.
Jessica Di Cocco is a Max Weber Fellow at the European University Institute in Fiesole. She works on text-based approaches to the study of populism and adjacent topics. She is interested in measuring the populist phenomenon using textual sources, the use of sentiments and emotions in electoral campaigns, and comparative analyses of historical trends through textual analysis. She also deals with more theoretical aspects of using texts to investigate party behaviour and is interested in studying expert and non-expert positions on sensitive topics. She is fascinated by exploring the nexus between inequality and voting behaviour using less conventional datasets, such as anonymised geolocalised data.