Opens a ZIP file from the SafeGraph shop and reads in all of the data it contains, processing the patterns files with read_patterns.

read_shop(
  filename,
  dir = ".",
  keeplist = c("patterns", "normalization_stats.csv", "home_panel_summary.csv",
    "visit_panel_summary.csv", "brand_info.csv"),
  exdir = dir,
  cleanup = TRUE,
  by = NULL,
  fun = sum,
  na.rm = TRUE,
  filter = NULL,
  expand_int = NULL,
  expand_cat = NULL,
  expand_name = NULL,
  multi = NULL,
  naics_link = NULL,
  select = NULL,
  gen_fips = FALSE,
  silent = FALSE,
  start_date = NULL,
  ...
)

Arguments

filename

The filename of the .zip file from the shop.

dir

The directory the file is in.

keeplist

Character vector of the files in the ZIP to read in. Use 'patterns' to refer to the patterns files.

exdir

Name of the directory to unzip to.

cleanup

Set to TRUE to delete all the unzipped files after they have been read in.

by, fun, na.rm, filter, expand_int, expand_cat, expand_name, multi, naics_link, select, gen_fips, silent, ...

Other arguments to be passed to read_patterns, specified as in help(read_patterns). Note that gen_fips is FALSE here by default, rather than TRUE as elsewhere, because files from the shop often do not contain the poi_cbg variable needed to use it. Check which state indicator variables you have access to; region may be one of them.

start_date

An argument to be passed to read_patterns giving the first date present in the file, as a date object. When using read_shop this should usually be included, since the patterns file names in the shop files are not in a format read_patterns can pick up on automatically.
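
As a minimal sketch of preparing these arguments, the base R snippet below builds a start_date of the right class and lists the file names inside a ZIP so you can decide what to put in keeplist. It makes no SafeGraphR calls. The helper list_zip_contents is hypothetical (it is not part of the package), and the file name 'shop_file.zip' is only a placeholder.

```r
# Sketch only: base R, no SafeGraphR calls.

# start_date should be a Date object. as.Date() from base R produces one;
# lubridate::ymd('2020-03-01'), as used in the Examples, returns the same class.
start_date <- as.Date("2020-03-01")
inherits(start_date, "Date")

# Hypothetical helper (not part of SafeGraphR): list the file names inside
# a ZIP without extracting it, to help choose values for keeplist.
list_zip_contents <- function(filename, dir = ".") {
  # unzip(list = TRUE) returns a data.frame with columns Name, Length, Date
  utils::unzip(file.path(dir, filename), list = TRUE)$Name
}

# Usage, assuming 'shop_file.zip' is in the working directory:
# list_zip_contents("shop_file.zip")
```

Listing the contents first is optional. It may help when a shop download uses file names that differ from the defaults in keeplist.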

Details

The result is a named list with one element for each component of the data that was read in.
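
To illustrate what working with such a list might look like, here is a sketch using a stand-in object. The element names follow the Examples section (a patterns element holding the aggregations 'all' and 'location_dwells', plus 'home_panel_summary.csv'). The empty data frames are placeholders rather than real output, and the exact structure you get back depends on the arguments you pass.

```r
# Stand-in for a value returned by read_shop(). The structure mirrors the
# Examples section; the contents are empty placeholders, not SafeGraph data.
mydata <- list(
  patterns = list(
    all = data.frame(),              # would hold visits aggregated by region
    location_dwells = data.frame()   # would hold dwell times by location_name
  ),
  "home_panel_summary.csv" = data.frame()
)

# See which components are present
names(mydata)
names(mydata$patterns)

# Pull out individual pieces; [[ ]] is convenient for names containing a dot
visits_by_region <- mydata$patterns$all
home_panel <- mydata[["home_panel_summary.csv"]]
```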

Examples


if (FALSE) {
# In the working directory I have the file 'shop_file.zip' to read in

mydata <- read_shop('shop_file.zip',
                    # I only want some of the files
                    keeplist = c('patterns','home_panel_summary.csv'),
                    # For patterns, only keep these variables
                    select = c('raw_visit_counts', 'region', 'bucketed_dwell_times', 'location_name'),
                    # I want two aggregations of patterns - one of total visits by state ('region')
                    # and another by location_name that has the dwell times for each brand
                    multi = list(
                      list(name = 'all',
                           by = 'region'),
                      list(name = 'location_dwells',
                           by = 'location_name',
                           expand_cat = 'bucketed_dwell_times',
                           expand_name = 'bucketed_times')
                      ),
                    # Be sure to specify start_date for read_shop
                    start_date = lubridate::ymd('2020-03-01'))

# The result is a list with two items: patterns and home_panel_summary.csv.
# patterns itself is a list with two data.tables inside, 'all' and 'location_dwells',
# aggregated as specified in multi.
}