Name | File Type | Size | Last Modified
Iowa_State_Census_1915_data.dta | application/x-stata | 21.8 MB | 08/01/2019 01:54 PM
LIDO_score_1950.dta | application/x-stata | 83.9 MB | 07/30/2019 07:22 AM
LIDO_score_1950_Iowa.dta | application/x-stata | 297 KB | 01/24/2018 12:16 PM
ReadMe.docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | 13.4 KB | 08/05/2019 08:26 AM
census1950_2000.dta | application/x-stata | 1.3 GB | 08/05/2019 06:54 AM
construct_1950_based_LIDO.do | text/x-stata-syntax | 1 KB | 07/30/2019 06:51 AM
construct_2000_based_LIDO.do | text/x-stata-syntax | 1.4 KB | 08/05/2019 05:46 AM
figure_1_replication.do | text/x-stata-syntax | 2.5 KB | 08/05/2019 08:03 AM
figure_2_replication.do | text/x-stata-syntax | 3 KB | 08/01/2019 01:53 PM
lasso_2000.dta | application/x-stata | 32.6 MB | 11/29/2016 05:59 PM

Project Citation: 

Saavedra, Martin, and Twinam, Tate. A Machine Learning Approach to Improving Occupational Income Scores. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2019-08-05. https://doi.org/10.3886/E111103V2

Project Description

Summary: These files are the replication files for "A Machine Learning Approach to Improving Occupational Income Scores" by Martin Saavedra and Tate Twinam.

Abstract: Historical studies of labor markets frequently lack data on individual income. The occupational income score (OCCSCORE) is often used as an alternative measure of labor market outcomes. We consider the consequences of using OCCSCORE when researchers are interested in earnings regressions. We estimate race and gender earnings gaps in modern decennial Censuses as well as the 1915 Iowa State Census. Using OCCSCORE biases results towards zero and can result in gaps of the wrong sign. We use a machine learning approach to construct a new adjusted score based on industry, occupation, and demographics. The new income score provides estimates closer to earnings regressions. Lastly, we consider the consequences for estimates of intergenerational mobility elasticities.

Scope of Project

Subject Terms: Occupational Income Scores; OCCSCORE; Intergenerational Mobility
Geographic Coverage: United States


This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.