Data and Code for: Robot Hubs: The Skewed Distribution of Robots in U.S. Manufacturing
Principal Investigator(s): ERIK BRYNJOLFSSON, Stanford Digital Economy Lab; CATHERINE BUFFINGTON, U.S. Census Bureau; NATHAN GOLDSCHLAG, U.S. Census Bureau; J. FRANK LI, Stanford Digital Economy Lab; JAVIER MIRANDA, IWH and Friedrich-Schiller University Jena; ROBERT SEAMANS, NYU Stern School of Business
Version: V1
Name | File Type | Size | Last Modified |
---|---|---|---|
aea_pnp_replication | | | 05/01/2023 01:33 PM |
| | text/plain | 3.4 KB | 05/01/2023 11:09 AM |
Scope of Project
O30 Innovation; Research and Development; Technological Change; Intellectual Property Rights: General
records (e.g., employment) or are entirely imputed (e.g., using capital expenditures). The mail-eligible sample, roughly one-third of all manufacturing establishments (approximately 102,000), assigns all plants a stratified random probability of receiving a form; large plants are sampled with certainty, and the remainder are assigned probabilities proportionate to size and are sampled within industries and product classes. Of the mail-eligible sample, roughly half will be surveyed. Sample weights are applied to surveyed plants to recover the full mail-eligible sample. Our analyses focus on the subset of the mail-eligible sample with reported values, weighted with sample weights throughout. Our analysis shows that robot users are relatively large and disproportionately likely to fall into the eligible sample that we focus on. That said, in future years, it will be increasingly important to monitor the behavior of non-mail units, particularly if robots become more accessible to smaller establishments.
Methodology
1. Establishments that are eligible to be sent a report form:
This is defined as the mail stratum. It comprises larger single-location manufacturing companies and all manufacturing establishments of multi-location companies, 102,468 establishments in total. On an annual basis, the mail stratum is supplemented with large, newly active single-location companies identified from a list provided by the IRS and new manufacturing locations of multi-location companies identified from the Census Bureau’s Company Organization Survey (COS).
The 2019 ASM sample design is similar to the 2014-2018 sample design. The only significant change is that the products universe file was created from the North American Product Classification System (NAPCS) codes.
Establishments in the 2017 Economic Census - Manufacturing that satisfied any of the following criteria are included in the sample with certainty: (1) the total 2017 employment for the establishment is greater than or equal to 1,000; (2) the establishment is identified as one of the ten largest establishments within the industry (based on employment); (3) the establishment is classified within an industry with fewer than 20 establishments; (4) the establishment is classified in the computer or flat-glass or sugar industry; (5) the establishment is located within a state where there are fewer than 20 additional establishments in the same North American Industry Classification System (NAICS) group (NAICS group is defined as the set of NAICS industries that have the same first four digits); or (6) the establishment is one of the largest establishments in terms of cost of fuels used, cost of electricity used, end-of-year inventories, end-of-year assets, or LIFO inventories. Collectively, 16,621 establishments are selected with certainty. These establishments accounted for approximately 70 percent of the total value of shipments in the 2017 Economic Census - Manufacturing.
Establishments in the remaining portion of the mail stratum are sampled with probabilities ranging from 0.05 to 1.00. Each of the 360 industries and 2,184 product classes is considered to be a separate population. Using variable reliability constraints, each establishment within a given population is assigned an initial probability of selection that reflects its relative importance within the population. Establishments producing products in multiple product classes receive multiple initial probabilities. The final probability of selection for a specific establishment is defined as the largest of its initial probabilities.
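The final step described above, taking the largest of an establishment's initial probabilities, can be illustrated with a minimal Python sketch. The function name, the dictionary keys, and the probability values are hypothetical and chosen only for illustration; the sketch does not reproduce how the Census Bureau derives the initial probabilities from its reliability constraints.

```python
def final_selection_probability(initial_probabilities):
    """Return the largest of an establishment's initial selection probabilities."""
    probs = list(initial_probabilities)
    if not probs:
        raise ValueError("an establishment needs at least one initial probability")
    return max(probs)


# Hypothetical establishment: one initial probability from its industry
# population and one from each of two product-class populations.
example_plant = {
    "industry": 0.05,
    "product_class_a": 0.20,
    "product_class_b": 0.12,
}

final_p = final_selection_probability(example_plant.values())
print(final_p)
```

In this made-up case the product class in which the plant matters most determines its chance of selection, which is consistent with the stated goal of reliable product class and industry estimates.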
This method of assigning probabilities is motivated by the Census Bureau's primary desire to produce reliable estimates of both product class and industry shipments. The high correlation between shipments and employment, value-added, and other general statistics assures that these variables will also be well represented. For sample selection purposes, each establishment is assigned to an industry stratum. Within each of the 360 industry strata, an independent sample is selected using the final probability of selection associated with the establishments classified within the stratum. A fixed-sample size methodology is used to assure that the desired sample size is realized. The total sample size for 2019 is 49,414.
2. Establishments not eligible to be sent a report form:
This is defined as the nonmail stratum. The nonmail stratum consists of small- and medium-sized, single-establishment companies from the Economic Census - Manufacturing. The initial nonmail stratum of the 2019 sample contained 186,670 single-establishment companies from the 2017 Economic Census - Manufacturing.
The nonmail stratum is supplemented annually using the list of newly active single-location companies provided by the IRS. Data for establishments included in the nonmail stratum are estimated using information obtained from the administrative records of the IRS and Social Security Administration (SSA) and are included in the published ASM estimates. This administrative information, which includes payroll, total employment, industry classification, and physical location, is obtained under conditions that safeguard the confidentiality of both tax and census records.
Most of the ASM estimates derived for the mail stratum are computed using a difference estimator. The difference estimator takes advantage of the fact that, for manufacturing establishments, there is a strong correlation between the current-year data values and the previous Census values. Because of this correlation, difference estimates are generally more reliable than comparable estimates developed from the current sample data alone. The ASM difference estimates are computed at the establishment level by adding the weighted difference (between the current data and the Census data) to the Census data. That is,
Difference Estimate = Census value + weight(Current value - Census value)
Or equivalently
Difference Estimate = weight(Current value) + (1-weight)Census value
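As a small worked check, the two forms of the difference estimate can be written in Python and compared on one made-up establishment. The function names and the numbers (a census value of 100, a current value of 120, a sample weight of 2.5) are assumptions for illustration only and do not come from the ASM data.

```python
def difference_estimate(census_value, current_value, weight):
    """First form: census value plus the weighted change since the census."""
    return census_value + weight * (current_value - census_value)


def difference_estimate_alt(census_value, current_value, weight):
    """Second form: weighted current value plus (1 - weight) times the census value."""
    return weight * current_value + (1 - weight) * census_value


# Hypothetical establishment-level values, for illustration only.
census, current, w = 100.0, 120.0, 2.5

first = difference_estimate(census, current, w)       # 100 + 2.5 * (120 - 100)
second = difference_estimate_alt(census, current, w)  # 2.5 * 120 + (1 - 2.5) * 100

print(first, second)
```

Both forms give 150 here. Expanding the first formula term by term yields the second, so the choice between them is purely one of presentation.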
Estimates for the capital expenditures variables are not generated using the difference estimator because the year-to-year correlations are considerably weaker. The standard linear estimator is used for these variables.
For the nonmail stratum, estimates for payroll are directly tabulated from the administrative-record data provided by the IRS and the SSA. Estimates of the other data variables are developed from industry averages. Although the establishments in the nonmail stratum are far more numerous than those in the mail stratum, they account for less than 6 percent of the value of shipments estimate at the total manufacturing level.
Corresponding estimates for the mail and nonmail components are combined to produce the estimates included in this publication.
This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.