Project Citation: 

Yang, Yinping. COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2020-07-18. https://doi.org/10.3886/E120321V1

Project Description

Summary: We collected and processed a dataset and are making it available so that the research community can study the COVID-19 pandemic from multiple angles.

Scope of Project

Subject Terms: COVID-19; pandemic; twitter; social media; sentiment analysis; emotion recognition
Geographic Coverage: Global
Time Period(s): 1/28/2020 – 7/1/2020
Universe: Twitter posts
Data Type(s): other; program source code; text
Collection Notes: This resource describes a large dataset covering over 63 million coronavirus-related Twitter posts from more than 13 million unique users, from 28 January to 1 July 2020. Because the tweets express strong concerns and emotions, we analyzed their content using natural language processing techniques and machine learning-based algorithms, and inferred seventeen latent semantic attributes for each tweet: 1) ten attributes indicating the tweet’s relevance to ten detected topics, 2) five quantitative attributes indicating the intensity of valence (i.e., unpleasantness/pleasantness) and of four primary emotions (fear, anger, sadness and joy), and 3) two qualitative attributes indicating the sentiment category and the most dominant emotion category, respectively. To illustrate how the dataset can be used, we present descriptive statistics for the topic, sentiment and emotion attributes and their temporal distributions, and discuss possible applications in communication, psychology, public health, economics and epidemiology.
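
For readers planning to load the data, the seventeen attributes described above can be pictured as one record per tweet. The following is a minimal sketch only: the class name, field names and value types are hypothetical and chosen for illustration, and the released files may use different column names, encodings and scales.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: all names below are illustrative, not the dataset's actual column names.
N_TOPICS = 10  # ten detected topics, per the collection notes


@dataclass(frozen=True)
class TweetAttributes:
    # 1) Ten topic-relevance attributes (assumed numeric here; could be binary flags).
    topic_relevance: Tuple[float, ...]
    # 2) Five quantitative attributes: valence plus four primary emotions.
    valence_intensity: float
    fear_intensity: float
    anger_intensity: float
    sadness_intensity: float
    joy_intensity: float
    # 3) Two qualitative attributes.
    sentiment_category: str
    dominant_emotion: str

    def __post_init__(self) -> None:
        # Guard against records that do not carry exactly ten topic values.
        if len(self.topic_relevance) != N_TOPICS:
            raise ValueError(f"expected {N_TOPICS} topic relevance values")

    def attribute_count(self) -> int:
        # Ten topic attributes + five intensities + two categories.
        return len(self.topic_relevance) + 5 + 2


# Example record with made-up values, purely to show the shape.
example = TweetAttributes(
    topic_relevance=(0.0,) * 9 + (1.0,),
    valence_intensity=0.4,
    fear_intensity=0.6,
    anger_intensity=0.3,
    sadness_intensity=0.5,
    joy_intensity=0.1,
    sentiment_category="negative",
    dominant_emotion="fear",
)
print(example.attribute_count())
```

Grouping the ten topic values in one tuple keeps the sketch short; a flat layout with one column per topic would work equally well.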


Related Publications

This study is unpublished. See below for other available versions.

This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.