Promises and pitfalls of using computer vision to make inferences about landscape preferences: Evidence from an urban-proximate park system (data and code)

Principal Investigator(s): Emily J. Wilkins, Utah State University; Jordan W. Smith, Utah State University

Version: V1

Name	File Type	Size	Last Modified
GoogleVision			10/31/2021 10:36 PM
code			10/31/2021 10:32 PM
shapefiles			05/04/2021 12:53 PM
survey			10/31/2021 11:43 PM

Project Citation:

Wilkins, Emily J., and Smith, Jordan W. Promises and pitfalls of using computer vision to make inferences about landscape preferences: Evidence from an urban-proximate park system (data and code). Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2021-10-31. https://doi.org/10.3886/E139681V1

Project Description

Summary: We compare preferences for landscape features derived from social media photographs analyzed with a computer vision algorithm (Google Cloud Vision) against preferences derived from a traditional on-site intercept survey. We surveyed visitors on Boulder Open Space and Mountain Parks (OSMP) lands in Colorado (USA) in May and June 2018. We downloaded all Flickr photographs located within Boulder OSMP lands from 2004 to 2018 and ran them through Google Cloud Vision to obtain up to 10 labels per image. We compare the content of the Flickr photographs with the features that survey respondents said positively impacted their experience.
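The labeling step described above can be illustrated with a minimal sketch. It is not the depositors' script: it only builds the JSON body for a label-detection request to the Google Cloud Vision REST endpoint, capped at 10 labels per image. The function name `build_label_request` and the placeholder image bytes are assumptions made for illustration, and sending the request (which requires credentials) is left out.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes: bytes, max_labels: int = 10) -> dict:
    """Build the JSON body for a single LABEL_DETECTION request.

    Hypothetical helper for illustration; max_labels=10 mirrors the
    "up to 10 labels for each image" described in the summary.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_labels}
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a downloaded Flickr photograph.
    body = build_label_request(b"placeholder, not a real image")
    print(json.dumps(body["requests"][0]["features"], indent=2))
```

The response to such a request contains a list of label annotations per image, each with a description and a score; how the depositors stored those values is documented in the GoogleVision folder, not here.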

This paper is currently under review.

Contents of this repository:

GoogleVision:
Contains raw data exported from Google Vision, as well as a codebook for how we coded each label to match the landscape categories we asked about in the survey. Also contains a full database connecting the Google Vision labels and presence/absence of each feature to the Flickr data.
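As a rough illustration of how a label codebook can yield presence/absence values per photograph, the sketch below maps labels to landscape categories and then computes the share of photographs showing each category. The labels, category names, and function names are all hypothetical; the actual codebook in this folder defines the real mapping.

```python
# Hypothetical codebook: Vision label -> survey landscape category.
# None marks a label that matches no surveyed feature.
CODEBOOK = {
    "mountain": "mountains",
    "rock": "rock formations",
    "tree": "trees",
    "wildflower": "wildflowers",
    "dog": None,
}

# Category columns for the presence/absence table.
CATEGORIES = sorted({cat for cat in CODEBOOK.values() if cat is not None})


def code_photo(labels):
    """Return a 0/1 presence flag per category for one photo's labels."""
    present = {CODEBOOK.get(label.lower()) for label in labels}
    return {cat: int(cat in present) for cat in CATEGORIES}


def share_with_feature(photos):
    """Return the proportion of photos in which each category is present.

    `photos` maps a photo id to its list of labels.
    """
    totals = {cat: 0 for cat in CATEGORIES}
    for labels in photos.values():
        for cat, flag in code_photo(labels).items():
            totals[cat] += flag
    n = len(photos)
    return {cat: (count / n if n else 0.0) for cat, count in totals.items()}


if __name__ == "__main__":
    example = {"p1": ["Mountain", "Sky"], "p2": ["Tree", "Mountain"]}
    print(share_with_feature(example))
```

Labels missing from the codebook (such as "Sky" in the example) are simply ignored, which is one possible design choice; a real workflow might instead flag them for manual review.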

Code:
Contains one R script that makes the maps and runs the spatial cluster analysis, and one R script that does all the data cleaning and analysis for the Flickr and survey data. You will need to download the contents of the GoogleVision, shapefiles, and survey folders to run this code. This folder also contains a Python script that we used to download Flickr data within Boulder through the Flickr API, and another R script used only to generate Table E.1 in the supplementary material.
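For readers unfamiliar with the Flickr API, the following is a minimal sketch of how a geotagged photo search can be expressed as a `flickr.photos.search` request. It is not the Python script in this folder: the function name, the API key placeholder, and the bounding box (a rough rectangle around Boulder, not the study extent) are assumptions for illustration, and the sketch only builds the request URL without calling the API.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Flickr REST endpoint.
FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, bbox, min_taken, max_taken, page=1):
    """Build a flickr.photos.search URL for geotagged photos in a bbox.

    bbox is (min_lon, min_lat, max_lon, max_lat), the order Flickr expects.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "bbox": ",".join(str(v) for v in bbox),
        "min_taken_date": min_taken,
        "max_taken_date": max_taken,
        "has_geo": 1,
        "extras": "geo,date_taken",  # include coordinates and capture date
        "per_page": 250,             # Flickr caps geo queries at 250 per page
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urlencode(params)


if __name__ == "__main__":
    # Illustrative bounding box only; the real extent comes from the shapefiles.
    url = build_search_url(
        "YOUR_API_KEY", (-105.4, 39.9, -105.1, 40.1),
        "2004-01-01", "2018-12-31",
    )
    print(url)
```

A bounding-box search returns points outside irregular park boundaries, so results would still need to be clipped to the OSMP lands polygon, for example with the shapefiles provided in this repository.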

Shapefiles:
Contains all the spatial data needed to reproduce maps and run R code. This includes OSMP trails, trailheads, lands, landscape character areas, survey locations, and coordinates of Flickr points.

Survey:
Contains the data from a visitor survey in Boulder OSMP lands from May and June 2018 (in a CSV), as well as a codebook to interpret the data, and the survey instrument.

Funding Sources: Boulder Open Space and Mountain Parks

Scope of Project

Subject Terms: public land; urban parks; machine learning; surveys; social media; image content analysis; landscape preferences

Geographic Coverage: Boulder, CO, USA

Time Period(s): 2004 – 2018 (Flickr data from 2004-2018; Survey data from 2018)

Collection Date(s): 2018 – 2018

Universe: Visitors to Boulder's Open Space and Mountain Parks lands.

Data Type(s): geographic information system (GIS) data; program source code; survey data

Methodology

Response Rate: 84.3% response rate to the visitor survey; 81.6% response rate after non-usable surveys were removed.

Sampling: We distributed surveys at 18 Boulder Open Space and Mountain Parks trailheads in May and June 2018. We selected locations using a stratified sampling approach based on the six OSMP landscape character areas, combined with a spatial cluster analysis of geotagged Flickr data to determine the most popular locations in the parks.

Data Source:
- Geotagged social media data are from the Flickr API (https://www.flickr.com/services/api/).
- Some of the shapefiles are from the City of Boulder, Colorado (https://open-data.bouldercolorado.gov/search?categories=recreation).
- Image labels are from Google Cloud Vision (https://cloud.google.com/vision).

Collection Mode(s): on-site questionnaire; web scraping

Scales: Several Likert-type scales were used in the visitor survey.

Unit(s) of Observation: Individuals

Published Versions

V1 [2021-10-31]

This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.