Reads in and compiles a folder of stay-at-home SafeGraph data structured the way it comes from AWS (e.g. the folder 2020/04/03 holds the data for April 3, 2020).

read_distancing(
  start,
  end,
  dir = ".",
  gen_fips = TRUE,
  by = c("state_fips", "county_fips"),
  filter = NULL,
  select = c("origin_census_block_group", "device_count",
    "completely_home_device_count", "part_time_work_behavior_devices",
    "full_time_work_behavior_devices"),
  ...
)

Arguments

start

Date object with the starting date to read in stay-at-home data.

end

Ending date of the stay-at-home data to read in.

dir

The folder in which the "2020" (etc.) folder resides.

gen_fips

Set to TRUE to use the origin_census_block_group variable to generate state_fips and county_fips as numeric variables. This will also result in origin_census_block_group being converted to character.

by

After reading, collapse to this level by summing all the data. Usually c('state_fips','county_fips') with gen_fips = TRUE. Set to NULL to aggregate across all initial rows, or set to FALSE to not aggregate at all.

filter

A character string describing a logical statement for filtering the data. For example, filter = 'state_fips == 6' would give you only data from California. The string is used as the i argument in a data.table; see help(data.table). Filtering here instead of afterwards can cut down on time and memory demands.

select

Character vector of variables to get from the file. Set to NULL to get all variables.

...

Other arguments to be passed to data.table::fread when reading in the file. For example, use nrows to read in only a certain number of rows.
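As a rough illustration of how gen_fips, filter, and by relate to one another, the sketch below walks a toy table through the same three ideas in base R. It is not the function's implementation (which relies on data.table): the toy values, the object names toy, kept, and collapsed, and the order of the steps are assumptions made for the example. The one fixed fact it uses is that a 12-digit census block group code begins with the 2-digit state FIPS code followed by the 3-digit county FIPS code.

```r
# Toy rows standing in for data read from the distancing files (values are made up).
toy <- data.frame(
  origin_census_block_group = c("060372077101", "060372077102",
                                "060590626101", "360610001001"),
  device_count = c(100, 50, 80, 40),
  completely_home_device_count = c(40, 20, 30, 10),
  stringsAsFactors = FALSE
)

# gen_fips-style step: digits 1-2 are the state FIPS code,
# digits 3-5 are the county FIPS code; both stored as numeric.
toy$state_fips <- as.numeric(substr(toy$origin_census_block_group, 1, 2))
toy$county_fips <- as.numeric(substr(toy$origin_census_block_group, 3, 5))

# filter-style step: a character string evaluated as a logical statement
# against the columns of the table (a base R analogue of the i argument).
filter <- "state_fips == 6"
kept <- toy[eval(parse(text = filter), envir = toy), ]

# by-style step: sum the count columns within each state/county pair.
collapsed <- aggregate(
  cbind(device_count, completely_home_device_count) ~ state_fips + county_fips,
  data = kept,
  FUN = sum
)
print(collapsed)
```

Because the filter string refers to state_fips, it only makes sense once the FIPS columns exist, which is why the example filter 'state_fips == 6' goes together with gen_fips = TRUE.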

Details

The stay-at-home data is no longer being updated as of April 19, 2021. This function should still work on the data released before then.

Note that after reading in data, if gen_fips = TRUE, state and county names can be merged in using data(fips_to_names).
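A minimal sketch of that merge in base R follows. To keep it runnable without the package, it builds a small hand-made lookup table in place of fips_to_names. The name fips_lookup and the column names statename and countyname are assumptions for illustration and should be checked against the real table after calling data(fips_to_names).

```r
# Collapsed output as it might look with gen_fips = TRUE (values are made up).
distancing <- data.frame(
  state_fips = c(6, 6, 36),
  county_fips = c(37, 59, 61),
  device_count = c(150, 80, 40)
)

# Hypothetical stand-in for the fips_to_names lookup table.
# With the package loaded, data(fips_to_names) would supply the real one;
# the name columns used here are assumed, not confirmed.
fips_lookup <- data.frame(
  state_fips = c(6, 6, 36),
  county_fips = c(37, 59, 61),
  statename = c("California", "California", "New York"),
  countyname = c("Los Angeles County", "Orange County", "New York County"),
  stringsAsFactors = FALSE
)

# Left join on the two numeric FIPS columns so every data row is kept.
named <- merge(distancing, fips_lookup,
               by = c("state_fips", "county_fips"), all.x = TRUE)
print(named)
```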

Examples


if (FALSE) {

# distdat holds the path to the folder the distancing data was downloaded to from AWS.
# Read and compile all distancing data from May 1 to May 7
distancing <- read_distancing(
    start = lubridate::ymd('2020-05-01'),
    end = lubridate::ymd('2020-05-07'),
    dir = distdat
)

}
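To check that a download has the layout described above before reading it in, it can help to list the dated folders a given range corresponds to. The sketch below does this in base R. The folder name "distdat" and the variable names are hypothetical, and it assumes both the start and end dates are included; it illustrates the year/month/day folder structure, not the function's internals.

```r
# Hypothetical download folder containing the "2020" folder.
data_dir <- "distdat"

start <- as.Date("2020-05-01")
end <- as.Date("2020-05-07")

# One entry per day in the range, assuming both endpoints are included.
days <- seq(start, end, by = "day")

# Year/month/day subfolders, e.g. distdat/2020/05/01.
folders <- file.path(data_dir, format(days, "%Y/%m/%d"))
print(folders)

# Flags any dated folders that are missing from the download.
missing_folders <- folders[!dir.exists(folders)]
```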