Name File Type Size Last Modified
  Twitter COVID dataset - Aug 2020 09/04/2020 11:19 AM

Project Citation: 

Gupta, Raj, Vishwanath, Ajay, and Yang, Yinping. COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2020-09-04. https://doi.org/10.3886/E120321V5

Project Description

Summary: This project presents a large dataset for researchers studying public conversation on Twitter about the COVID-19 pandemic. Because the publicly available tweets express strong concerns and emotions, we annotated each public tweet with seventeen latent semantic attributes using natural language processing techniques and machine learning-based algorithms. The latent semantic attributes include: 1) ten attributes indicating the tweet’s relevance to ten detected topics, 2) five quantitative attributes indicating the intensity of valence (i.e., unpleasantness/pleasantness) and of four primary emotions: fear, anger, sadness, and joy, and 3) two qualitative attributes indicating the sentiment category and the most dominant emotion category, respectively.
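The attribute layout described in the summary can be pictured as one record per tweet. The following is only a minimal sketch in Python: the class name, field names, value types, and example values are hypothetical, since the summary does not state the dataset's actual column names, file format, or value ranges.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical names throughout: the project summary gives no column names.
EMOTIONS = ("fear", "anger", "sadness", "joy")  # the four emotions named in the summary
NUM_TOPICS = 10  # the summary mentions ten detected topics


@dataclass(frozen=True)
class TweetAttributes:
    """Illustrative container for one tweet's seventeen latent attributes."""

    topic_relevance: Tuple[float, ...]  # ten values, one per detected topic
    valence_intensity: float            # unpleasantness/pleasantness
    fear_intensity: float
    anger_intensity: float
    sadness_intensity: float
    joy_intensity: float
    sentiment_category: str             # qualitative sentiment label
    dominant_emotion: str               # qualitative most-dominant-emotion label

    def __post_init__(self):
        # Reject records that do not carry exactly ten topic attributes.
        if len(self.topic_relevance) != NUM_TOPICS:
            raise ValueError(f"expected {NUM_TOPICS} topic attributes")

    def attribute_count(self) -> int:
        # 10 topic attributes + 5 quantitative intensities + 2 qualitative labels
        return len(self.topic_relevance) + 5 + 2

    def strongest_emotion_by_intensity(self) -> str:
        # Pick the emotion whose intensity score is highest.
        scores = {name: getattr(self, f"{name}_intensity") for name in EMOTIONS}
        return max(scores, key=scores.get)


# Made-up values, for illustration only.
example = TweetAttributes(
    topic_relevance=(0.0,) * 9 + (1.0,),
    valence_intensity=0.3,
    fear_intensity=0.7,
    anger_intensity=0.2,
    sadness_intensity=0.4,
    joy_intensity=0.1,
    sentiment_category="negative",
    dominant_emotion="fear",
)
print(example.attribute_count())
```

Grouping the ten topic attributes in one tuple while keeping the five intensities and two labels as separate fields mirrors the three groups in the summary; the real files may well store all seventeen as flat columns.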

Scope of Project

Subject Terms: COVID-19; pandemic; twitter; social media; sentiment analysis; emotion recognition
Geographic Coverage: Global
Time Period(s): 1/28/2020 – 7/1/2020
Universe: Twitter posts
Data Type(s): other; program source code; text


This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.