This accepts a directory. It uses read_shop to load every zip file in that folder, assuming they are all files downloaded from the SafeGraph shop. It then row-binds each of the subfiles across zip files, so you get a list in which one entry is all the normalization data row-bound together, another is all the patterns files, and so on. Note that after reading in data, if gen_fips = TRUE, state and county names can be merged in using data(fips_to_names).
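The row-binding described above can be pictured with a small base R sketch. This is only an illustration of the idea, not the package's actual implementation: the two lists stand in for what two shop zip files might produce, and all table contents and column names are made up.

```r
# Toy stand-ins for the per-zip results: each is a named list of tables.
# Contents and column names are hypothetical.
march <- list(
  patterns = data.frame(region = c("CA", "NY"), raw_visit_counts = c(10, 20)),
  normalization_stats.csv = data.frame(month = 3, total_visits = 1000)
)
april <- list(
  patterns = data.frame(region = c("CA", "NY"), raw_visit_counts = c(15, 25)),
  normalization_stats.csv = data.frame(month = 4, total_visits = 1200)
)
per_zip <- list(march, april)

# For each subfile name, pull that table out of every zip's list and stack them.
subfiles <- names(per_zip[[1]])
combined <- lapply(subfiles, function(nm) {
  do.call(rbind, lapply(per_zip, function(z) z[[nm]]))
})
names(combined) <- subfiles

# One entry per subfile, each holding the rows from all zips.
str(combined, max.level = 1)
```

The result has the same shape as the list described above: one element per kept subfile, each containing the rows from every zip file.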

read_many_shop(
  dir = ".",
  recursive = FALSE,
  filelist = NULL,
  start_date = NULL,
  keeplist = c("patterns", "normalization_stats.csv", "home_panel_summary.csv",
    "visit_panel_summary.csv", "brand_info.csv"),
  exdir = dir,
  cleanup = TRUE,
  by = NULL,
  fun = sum,
  na.rm = TRUE,
  filter = NULL,
  expand_int = NULL,
  expand_cat = NULL,
  expand_name = NULL,
  multi = NULL,
  naics_link = NULL,
  select = NULL,
  gen_fips = FALSE,
  silent = FALSE,
  ...
)

Arguments

dir

Name of the directory the files are in.

recursive

Look for files in all subdirectories as well.

filelist

Optionally specify only a subset of the filenames to read in.

start_date

A vector of dates giving the first date present in each zip file, as date objects, to be passed to read_patterns. When using read_many_shop this **really** should be included, since the patterns file names in the shop files are not in a format read_patterns can pick up on automatically. If left unspecified, will produce an error. To truly go ahead unspecified, set this to FALSE.
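Because start_date needs one entry per zip file, it can help to build the vector alongside an explicit file listing. The following is a minimal sketch in base R, assuming one zip per month and that the files are read in sorted filename order (that ordering is an assumption, not something stated here). The folder and file names are hypothetical placeholders.

```r
# Hypothetical setup: a scratch folder with two empty files standing in for
# shop downloads. Real files would be actual zip archives.
shop_dir <- file.path(tempdir(), "shop_demo")
dir.create(shop_dir, showWarnings = FALSE)
invisible(file.create(file.path(shop_dir, c("shop_2020_03.zip", "shop_2020_04.zip"))))

# List the zips in a fixed (sorted) order so dates can be matched by position.
zips <- sort(list.files(shop_dir, pattern = "\\.zip$"))

# One start date per file: the first of each month, beginning in March 2020.
start_dates <- seq(as.Date("2020-03-01"), by = "month", length.out = length(zips))

# Inspect the pairing before passing start_dates as start_date.
data.frame(file = zips, start_date = start_dates)
```

Checking the pairing by eye before reading is cheap. Passing the same listing as filelist may help keep the order explicit, assuming filelist accepts file names relative to dir.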

keeplist, exdir, cleanup

Arguments to be passed to read_shop, specified as in help(read_shop).

by, fun, na.rm, filter, expand_int, expand_cat, expand_name, multi, naics_link, select, gen_fips, silent, ...

Other arguments to be passed to read_patterns, specified as in help(read_patterns).

Examples


if (FALSE) {
# In the working directory we have two shop ZIP files, one for March and one for April.
mydata <- read_many_shop(# I only want some of the sub-files
                         keeplist = c('patterns','home_panel_summary.csv'),
                         # For patterns, only keep these variables
                         select = c('raw_visit_counts', 'region', 'bucketed_dwell_times', 'location_name'),
                         # I want two aggregations of patterns - one of total visits by state ('region')
                         # and another by location_name that has the dwell times for each brand
                         multi = list(
                           list(name = 'all',
                                by = 'region'),
                           list(name = 'location_dwells',
                                by = 'location_name',
                                expand_cat = 'bucketed_dwell_times',
                                expand_name = 'bucketed_times')
                           ),
                         # Be sure to specify start_date, one date per ZIP file
                         start_date = c(lubridate::ymd('2020-03-01'),lubridate::ymd('2020-04-01')))

# The result is a list with two items - patterns and home_panel_summary.csv
# patterns itself is a list with two data.tables inside - 'all' and 'location_dwells',
# aggregated as given.

}
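The description mentions merging state and county names in after reading with gen_fips = TRUE. A minimal sketch of that merge follows, using base R and toy data so it runs on its own. The lookup table here is a hand-made stand-in for data(fips_to_names), and its column names (state_fips, county_fips, statename, countyname) are an assumption about that dataset; check names(fips_to_names) before relying on them.

```r
# Toy aggregated patterns, as might result from gen_fips = TRUE with
# by = c("state_fips", "county_fips"). Visit counts are made up.
patterns <- data.frame(
  state_fips = c(6, 36),
  county_fips = c(37, 61),
  raw_visit_counts = c(100, 250)
)

# Stand-in for data(fips_to_names); the column names are an assumption.
fips_lookup <- data.frame(
  state_fips = c(6, 36),
  county_fips = c(37, 61),
  statename = c("California", "New York"),
  countyname = c("Los Angeles County", "New York County")
)

# Left join: keep every patterns row, attach names where the codes match.
named <- merge(patterns, fips_lookup,
               by = c("state_fips", "county_fips"), all.x = TRUE)
named
```

With the real data, the same merge call would use the loaded fips_to_names object in place of fips_lookup, provided the key columns match in name and type.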