Name File Type Size Last Modified
jstor_data_submission.csv text/csv 26.8 MB 03/07/2023 10:47 PM

Project Citation: 

Boros, Krisztián, and Kmetty, Zoltán. Identifying Missing Data Handling Methods with Text Mining. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2023-03-08. https://doi.org/10.3886/E185961V1

Project Description

Summary: Missing data is an inevitable aspect of all empirical research. Researchers have developed several techniques to handle missing data in order to avoid information loss and bias. Over the past 50 years, these methods have become more efficient and also more complex. Building on previous review studies, this paper analyzes which missing data handling methods are used across scientific disciplines. For the analysis, we used nearly 50,000 scientific articles published between 1999 and 2016. JSTOR provided the data in text format, and we used a text-mining approach to extract the necessary information from this corpus. Our results show that the use of advanced missing data handling methods, such as Multiple Imputation and Full Information Maximum Likelihood estimation, grew steadily over the examination period. At the same time, simpler methods, such as listwise and pairwise deletion, remain in widespread use.
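The text-mining step mentioned in the summary is not specified on this page. As an illustration only, the following minimal sketch shows one way such an extraction could work, assuming each article is available as a (year, text) pair. The regular expressions, the method labels, and the helper names methods_mentioned and count_by_year are hypothetical and are not taken from the depositors' code or search dictionary.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; the depositors' actual
# search terms are not given in this record.
METHOD_PATTERNS = {
    "multiple imputation": re.compile(r"\bmultiple\s+imputation\b", re.IGNORECASE),
    "FIML": re.compile(
        r"\bfull\s+information\s+maximum\s+likelihood\b|\bFIML\b", re.IGNORECASE
    ),
    "listwise deletion": re.compile(r"\blist-?wise\s+deletion\b", re.IGNORECASE),
    "pairwise deletion": re.compile(r"\bpair-?wise\s+deletion\b", re.IGNORECASE),
}


def methods_mentioned(text):
    """Return the set of method labels whose pattern occurs in the text."""
    return {
        label for label, pattern in METHOD_PATTERNS.items() if pattern.search(text)
    }


def count_by_year(articles):
    """Count, per year, how many articles mention each method.

    `articles` is an iterable of (year, text) pairs (an assumed input shape).
    Each article contributes at most 1 to a method's count for its year.
    """
    counts = {}
    for year, text in articles:
        counts.setdefault(year, Counter()).update(methods_mentioned(text))
    return counts
```

Counting each article at most once per method keeps long articles from dominating the yearly totals; a real analysis would likely also need to handle spelling variants and mentions that discuss a method without applying it.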

Scope of Project

Subject Terms: missing data; text mining
Time Period(s): 1/1/1999 – 12/31/2016
Data Type(s): text

Methodology

Data Source: JSTOR Data for Research
Note: This service has been discontinued and replaced by Constellate (https://constellate.org/)
Collection Mode(s): other
Unit(s) of Observation: articles

This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.