Call for Papers! Big Data, Digital Data, Textual Data. Milan 15-17 Sept 2016


Big Data, Digital Data, Textual Data: Restructuring Political Science? 

Chairs: Andrea Ceron & Luigi Curini (Università degli Studi di Milano)

Where and When: Milan, 15-17 September 2016.

Deadline to submit abstract: 5 June 2016

Link to: panel description. Submit here!

Call for Papers:

This panel is open to scholars from a wide range of fields, from political science to communication, computer science, and information technology. The aim is to gather papers that use up-to-date statistical methods (including text analysis techniques) to analyze large-N collections of digital data, in either textual or non-textual form. Applications of Big Data analysis (i.e. open data, social media data, or any large digital textual data) to the study of political institutions or public opinion dynamics are particularly welcome, but the panel also accepts papers on other topics linked to politics and society. Secondary analyses of Big Data performed through traditional statistical techniques are suitable too, particularly if they integrate different types of data (e.g. survey data and sentiment analysis) or combine datasets from multiple sources (e.g. roll call votes, manifesto data, data on conflicts, news items, etc.). We accept case studies and longitudinal analyses of Italy or any other country, as well as cross-sectional comparative analyses covering several countries, whether they concern the present or the past. Through such contributions, the panel aims to show how the “Big Data revolution” can help us solve puzzles involving traditional political science topics (e.g. legislative politics, coalition governments, electoral campaigns, accountability and responsiveness, peace and conflicts, democratization, collective action, agenda setting, etc.).


Political science is undergoing a complex, threefold revolution, which can be summarized under the label “Big Data revolution”. The discipline is changing radically, from relying on sparse datasets produced by scholars working in isolation to building collaborative, interdisciplinary, lab-style research teams that analyze increasing quantities of diverse, highly informative data. This transformation, from studying problems to solving them, can explain why, at least in some countries, “the influence of quantitative social science (including the related technologies, methodologies, and data) on the real world has been growing fast” (King 2014).

Big Data (i.e. large-N digital or textual data) certainly play a crucial role in this transformation and can contribute to restructuring political science. The process benefits from three sources of data that are becoming ever more accessible to scholars: 1) open data, provided by public or private organizations; 2) a wide array of textual data, produced by political institutions, which are increasingly available in digital format; 3) digital data, in textual and non-textual form, generated by a growing crowd of Internet and social media users (encompassing citizen-to-citizen and citizen-to-elite interactions, online news, and top-down elite communication).

This “Big Data revolution” is not only about data sources. The evolution of our societies toward a “digital world” is a necessary premise. However, two further developments are also important in bringing about this transformation: the methodological contribution of information technology, which allows us to gather and store huge quantities of data and to process them very quickly, and new developments in statistics and political methodology, particularly in the field of text analysis (Grimmer and Stewart 2013).

Indeed, recent improvements in automated and supervised text analysis techniques dramatically reduce the cost of analyzing large collections of textual data and allow scholars to study politics and political conflicts through the analysis of written and spoken words. A wide range of such techniques is now increasingly used by political scientists. They range from scaling techniques, like Wordscores (Laver, Benoit and Garry 2003) and Wordfish (Slapin and Proksch 2008), which measure similarities and differences between political actors, to topic models (Grimmer 2010; Quinn et al. 2010), which allow scholars to identify the topics discussed in a text.
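To give a concrete flavor of what a scaling technique does, here is a minimal sketch of the core Wordscores idea from Laver, Benoit and Garry (2003): words inherit a score from reference texts whose positions are known, and a new text is scored as the frequency-weighted mean of its word scores. The function names, the whitespace tokenizer, and the toy texts are illustrative assumptions only. The sketch omits the rescaling and uncertainty estimates of the published method.

```python
from collections import Counter


def relative_freqs(text):
    """Relative frequency of each lowercased, whitespace-separated token."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score each word as the average of the reference positions,
    weighted by how strongly the word is associated with each text."""
    freqs = [relative_freqs(t) for t in reference_texts]
    vocabulary = set().union(*freqs)
    scores = {}
    for word in vocabulary:
        f = [fr.get(word, 0.0) for fr in freqs]
        total = sum(f)
        # f_i / total approximates P(reference text i | word)
        scores[word] = sum(fi / total * pos
                           for fi, pos in zip(f, reference_positions))
    return scores


def score_text(text, scores):
    """Score a new text as the frequency-weighted mean of its word scores.
    Words never seen in the reference texts are ignored."""
    counts = Counter(t for t in text.lower().split() if t in scores)
    total = sum(counts.values())
    if total == 0:
        return None  # no overlap with the reference vocabulary
    return sum(n / total * scores[w] for w, n in counts.items())


# Toy example: two reference texts placed at +1 and -1 on one dimension.
scores = word_scores(
    ["tax cut tax cut", "welfare spending welfare spending"],
    [1.0, -1.0],
)
print(score_text("tax cut welfare", scores))
```

In practice, scholars would rely on an established implementation and on much larger reference corpora. The toy example only shows how known positions propagate from reference texts to words and then to unscored texts.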

These techniques can greatly enhance our knowledge of how political institutions function, particularly when applied to large digital datasets gathered by collaborative research groups such as the Comparative Agendas Project (Baumgartner, Green-Pedersen and Jones 2006) or the Comparative Manifesto Project (Lehmann et al. 2015).

The spread of Internet access and the growing share of citizens worldwide who are active on social networking sites like Facebook and Twitter (30% of the world population in 2015) have pushed this revolution further. In this new “digital world” citizens share information and opinions online, thereby generating a large amount of data about their tastes and attitudes. The evolution of sentiment analysis (Hopkins and King 2010; Ceron, Curini and Iacus 2016) allows researchers to extract information from these rich sources.

This information can then be exploited to study in greater depth the formation and evolution of public opinion (Schober et al. 2016), particularly by integrating sentiment analysis with traditional survey data (Couper 2013), as well as to study political mobilization (Bennett and Segerberg 2011) or to nowcast and forecast elections (Ceron et al. 2014; Gayo-Avello 2013).


Baumgartner, Frank R., Christoffer Green-Pedersen, and Bryan D. Jones, eds. 2006. Comparative Studies of Policy Agendas. Special issue of the Journal of European Public Policy 13(7).

Bennett, W. L., and A. Segerberg. 2011. Digital media and the personalization of collective action: Social technology and the organization of protests against the global economic crisis. Information, Communication & Society 14(6):770–99.

Ceron, Andrea, Luigi Curini, and Stefano M. Iacus. 2016. Social Media and Politics: Nowcasting and Forecasting Elections with Big Data. London: Ashgate, forthcoming.

Ceron, Andrea, Luigi Curini, Stefano M. Iacus, and Giuseppe Porro. 2014. Every tweet counts? How sentiment analysis of social media can improve our knowledge of citizens’ political preferences with an application to Italy and France. New Media & Society 16:340–58.

Couper, Mick P. 2013. Is the sky falling? New technology, changing media, and the future of surveys. Survey Research Methods 7(3):145–56.

Gayo-Avello, D. 2013. A meta-analysis of state-of-the-art electoral prediction from Twitter data. Social Science Computer Review 31(6):649–79.

Grimmer, J., and B. M. Stewart. 2013. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21(3):267–97.

Hopkins, Daniel, and Gary King. 2010. A method of automated nonparametric content analysis for social science. American Journal of Political Science 54(1):229–47.

King, G. 2014. Restructuring the social sciences: Reflections from Harvard’s Institute for Quantitative Social Science. PS: Political Science & Politics 47(1):165–72.

Laver, Michael, Kenneth Benoit, and John Garry. 2003. Extracting policy positions from political texts using words as data. American Political Science Review 97(2):311–31.

Lehmann, P., T. Matthieß, N. Merz, S. Regel, and A. Werner. 2015. Manifesto Corpus. Version: 2015a. Berlin: WZB Berlin Social Science Center.

Quinn, Kevin M., Burt L. Monroe, Michael Colaresi, Michael H. Crespin, and Dragomir R. Radev. 2010. How to analyze political attention with minimal assumptions and costs. American Journal of Political Science 54(1):209–28.

Schober, Michael F., Josh Pasek, Lauren Guggenheim, Cliff Lampe, and Frederick G. Conrad. 2016. Social media analyses for social measurement. Public Opinion Quarterly 80(1):180–211.

Slapin, Jonathan, and Sven-Oliver Proksch. 2008. A scaling model for estimating time-series party positions from texts. American Journal of Political Science 52(3):705–22.