Promises and pitfalls of using computer vision to make inferences about landscape preferences: Evidence from an urban-proximate park system (data and code)
Principal Investigator(s): Emily J. Wilkins, Utah State University; Jordan W. Smith, Utah State University
Version: V1
Name | File Type | Size | Last Modified |
---|---|---|---|
GoogleVision | | | 10/31/2021 10:36 PM |
code | | | 10/31/2021 10:32 PM |
shapefiles | | | 05/04/2021 12:53 PM |
survey | | | 10/31/2021 11:43 PM |
Project Citation:
Wilkins, Emily J., and Smith, Jordan W. Promises and pitfalls of using computer vision to make inferences about landscape preferences: Evidence from an urban-proximate park system (data and code). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2021-10-31. https://doi.org/10.3886/E139681V1
Project Description
Summary:
We compare preferences for landscape features derived through a computer vision algorithm (Google Cloud Vision) applied to social media photographs with preferences derived through a traditional on-site intercept survey. We surveyed visitors on Boulder Open Space and Mountain Parks (OSMP) lands in Colorado (USA) in May and June 2018. We downloaded all Flickr photographs within Boulder OSMP lands from 2004 to 2018 and ran them through Google Cloud Vision to obtain up to 10 labels for each image. We compare the content of the Flickr photographs with the features that visitors reported in the survey as having positively impacted their experience.
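As an illustration of the labeling step described above, the following is a minimal sketch of how a Cloud Vision label request limited to 10 labels can be built and its response parsed. It uses the public REST request format (`images:annotate` with a `LABEL_DETECTION` feature) and is not the script used in this project. The function names and the example response are hypothetical, and no network call is made.

```python
import base64

# Public REST endpoint for Cloud Vision image annotation (shown for reference;
# this sketch only builds the request body and parses a response).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_labels=10):
    """Build the JSON body for one LABEL_DETECTION request (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_labels}],
            }
        ]
    }


def parse_labels(response_json):
    """Return (description, score) pairs from the first response, if any."""
    responses = response_json.get("responses", [])
    if not responses:
        return []
    annotations = responses[0].get("labelAnnotations", [])
    return [(a["description"], a.get("score")) for a in annotations]


# Hypothetical response, shaped like the documented API output (not project data).
example_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Mountain", "score": 0.97},
                {"description": "Sky", "score": 0.95},
            ]
        }
    ]
}

body = build_label_request(b"placeholder image bytes")
print(body["requests"][0]["features"])
print(parse_labels(example_response))
```

Sending the body requires a Google Cloud project and credentials, which are outside the scope of this sketch.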
This paper is currently under review.
Contents of this repository:
GoogleVision:
Contains raw data exported from Google Vision, as well as a codebook for how we coded each label to match the landscape categories we asked about in the survey. Also contains a full database connecting the Google Vision labels and presence/absence of each feature to the Flickr data.
Code:
Contains one R script that makes the maps and runs the spatial cluster analysis, and one R script that does all the data cleaning and analysis for the Flickr and survey data. To run this code, you will need to download the contents of the GoogleVision, shapefiles, and survey folders. This folder also contains a Python script that we used to download Flickr data within Boulder through the Flickr API, and another R script used only to generate Table E.1 in the supplementary material.
Shapefiles:
Contains all the spatial data needed to reproduce maps and run R code. This includes OSMP trails, trailheads, lands, landscape character areas, survey locations, and coordinates of Flickr points.
Survey:
Contains the data from a visitor survey in Boulder OSMP lands from May and June 2018 (in a CSV), as well as a codebook to interpret the data, and the survey instrument.
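The coding step described for the GoogleVision folder (mapping each label to a landscape category and recording presence/absence per photograph) can be sketched as follows. This is an illustration only: the codebook entries, category names, and function names below are hypothetical and do not reproduce the project's actual codebook, which is provided in the GoogleVision folder.

```python
# Hypothetical codebook: Vision label (lowercase) -> landscape feature category.
# The real codebook is in the GoogleVision folder; these entries are placeholders.
CODEBOOK = {
    "mountain": "mountains",
    "ridge": "mountains",
    "tree": "vegetation",
    "wildflower": "vegetation",
    "deer": "wildlife",
    "bird": "wildlife",
}


def code_photo(labels, codebook=CODEBOOK):
    """Return a presence (1) / absence (0) flag per category for one photo."""
    categories = sorted(set(codebook.values()))
    present = {codebook[label.lower()] for label in labels if label.lower() in codebook}
    return {category: int(category in present) for category in categories}


def share_with_feature(coded_photos):
    """Return the proportion of photos in which each category is present."""
    if not coded_photos:
        return {}
    total = len(coded_photos)
    return {
        category: sum(photo[category] for photo in coded_photos) / total
        for category in coded_photos[0]
    }


# Two hypothetical photos, each with a list of Vision labels.
photos = [["Mountain", "Sky", "Deer"], ["Tree", "Ridge"]]
coded = [code_photo(labels) for labels in photos]
print(coded)
print(share_with_feature(coded))
```

Labels that are absent from the codebook (such as "Sky" here) are ignored, so a photo can have every category coded as absent.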
Funding Sources:
Boulder Open Space and Mountain Parks
Scope of Project
Subject Terms:
public land;
urban parks;
machine learning;
surveys;
social media;
image content analysis;
landscape preferences
Geographic Coverage:
Boulder, CO, USA
Time Period(s):
2004–2018 (Flickr data from 2004–2018; survey data from 2018)
Collection Date(s):
2018 – 2018
Universe:
Visitors to Boulder's Open Space and Mountain Parks lands.
Data Type(s):
geographic information system (GIS) data;
program source code;
survey data
Methodology
Response Rate:
84.3% response rate to the visitor survey; 81.6% response rate after non-usable surveys were removed.
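For readers who want to see how two such rates relate, the following is a minimal sketch of the arithmetic. It assumes that both rates share the same denominator (visitors approached) and that the second rate counts only usable surveys; this record does not state the exact definition. The counts below are hypothetical round numbers, not the study's counts.

```python
def response_rate(completed, approached):
    """Return the response rate as a percentage, rounded to one decimal."""
    if approached <= 0:
        raise ValueError("approached must be positive")
    return round(100 * completed / approached, 1)


# Hypothetical counts for illustration only (not the study's actual numbers).
approached = 200  # visitors asked to take the survey
returned = 170    # surveys handed back
usable = 160      # surveys kept after removing non-usable ones

print(response_rate(returned, approached))  # 85.0
print(response_rate(usable, approached))    # 80.0
```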
Sampling:
We distributed surveys at 18 Boulder Open Space and Mountain Parks trailheads in May and June 2018. We selected locations using a stratified sampling approach based on the six OSMP landscape character areas, combined with a spatial cluster analysis of geotagged Flickr data to determine the most popular locations in the parks.
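The general idea of stratifying by landscape character area and then favoring popular locations can be sketched as below. This is a simplified stand-in and not the project's procedure: it ranks candidate trailheads by a plain count of nearby Flickr photos rather than by a spatial cluster analysis, and it assumes an equal number of locations per area, which this record does not state. All trailhead names, area names, and counts are hypothetical.

```python
from collections import defaultdict


def select_trailheads(candidates, per_area):
    """Pick the most-photographed trailheads within each stratum.

    candidates: iterable of (trailhead_name, area_name, photo_count) tuples.
    Returns a dict mapping each area to its selected trailhead names.
    """
    by_area = defaultdict(list)
    for name, area, photo_count in candidates:
        by_area[area].append((photo_count, name))

    selection = {}
    for area, items in by_area.items():
        # Sort by photo count (descending), breaking ties by name.
        items.sort(key=lambda item: (-item[0], item[1]))
        selection[area] = [name for _, name in items[:per_area]]
    return selection


# Hypothetical candidates: (trailhead, landscape character area, nearby photos).
candidates = [
    ("TH1", "Area A", 120),
    ("TH2", "Area A", 40),
    ("TH3", "Area A", 300),
    ("TH4", "Area B", 15),
    ("TH5", "Area B", 90),
]
print(select_trailheads(candidates, per_area=2))
```

The R script in the code folder runs the spatial cluster analysis actually used to identify popular locations.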
Data Source:
- Geotagged social media data are from the Flickr API (https://www.flickr.com/services/api/).
- Some of the shapefiles are from the city of Boulder, Colorado (https://open-data.bouldercolorado.gov/search?categories=recreation).
- Google Cloud Vision (https://cloud.google.com/vision).
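As an example of querying the first source listed above, a geotagged photo search against the Flickr API can be sketched as follows. This is not the Python script included in the code folder. It only builds a `flickr.photos.search` request URL for a bounding box and date range. The API key, the bounding box (a rough rectangle around Boulder, not the OSMP boundary), and the function name are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, bbox, min_taken, max_taken, page=1):
    """Build a flickr.photos.search URL for geotagged photos (hypothetical helper).

    bbox is (min_longitude, min_latitude, max_longitude, max_latitude).
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "bbox": ",".join(str(value) for value in bbox),
        "min_taken_date": min_taken,
        "max_taken_date": max_taken,
        "has_geo": 1,
        "extras": "geo,date_taken",
        "per_page": 250,  # geo queries return at most 250 photos per page
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urlencode(params)


# Placeholder key and an approximate, hypothetical bounding box around Boulder, CO.
url = build_search_url(
    api_key="YOUR_API_KEY",
    bbox=(-105.4, 39.9, -105.1, 40.1),
    min_taken="2004-01-01 00:00:00",
    max_taken="2018-12-31 23:59:59",
)
query = parse_qs(urlsplit(url).query)
print(query["method"][0], query["bbox"][0])
```

Results are paginated, so a full download would request successive `page` values and then filter the returned points to the park boundary, for example with the shapefiles in this repository.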
Collection Mode(s):
on-site questionnaire;
web scraping
Scales:
Several Likert-type scales were used in the visitor survey.
Unit(s) of Observation:
Individuals
This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.