read_shop.Rd
Opens a ZIP file from the SafeGraph shop and reads in all of the data, processing the patterns files with read_patterns.
read_shop(
filename,
dir = ".",
keeplist = c("patterns", "normalization_stats.csv", "home_panel_summary.csv",
"visit_panel_summary.csv", "brand_info.csv"),
exdir = dir,
cleanup = TRUE,
by = NULL,
fun = sum,
na.rm = TRUE,
filter = NULL,
expand_int = NULL,
expand_cat = NULL,
expand_name = NULL,
multi = NULL,
naics_link = NULL,
select = NULL,
gen_fips = FALSE,
silent = FALSE,
start_date = NULL,
...
)
The filename of the .zip file from the shop.
The directory the file is in.
Character vector of the files in the ZIP to read in. Use 'patterns' to refer to the patterns files.
Name of the directory to unzip to.
Set to TRUE to delete all the unzipped files after they have been read in.
Other arguments to be passed to read_patterns, specified as in help(read_patterns). Note that gen_fips is FALSE here by default, rather than TRUE as elsewhere, because files from the shop often do not contain the poi_cbg variable it requires. Check which state indicator variables you have access to, perhaps region.
An argument to be passed to read_patterns giving the first date present in the file, as a date object. When using read_shop this should usually be included, since the patterns file names in the shop files are not in a format read_patterns can pick up on automatically.
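As a minimal sketch of the start_date note above, a date object can be built with base R's as.Date() (or with lubridate::ymd(), as in the example further down, if lubridate is installed). The date used here is purely illustrative and is not taken from any particular shop file.

```r
# Illustrative date only: use the first date actually present in your shop file
start_date <- as.Date("2020-03-01")

# The documentation asks for a date object rather than a character string
inherits(start_date, "Date")

# Round-trip back to text to confirm the value was parsed as intended
format(start_date, "%Y-%m-%d")
```

The resulting object can then be supplied as start_date = start_date in the call to read_shop.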
The result will be a named list with each of the components of the data.
if (FALSE) {
# In the working directory I have the file 'shop_file.zip' to read in
mydata <- read_shop('shop_file.zip',
                    # I only want some of the files
                    keeplist = c('patterns', 'home_panel_summary.csv'),
                    # For patterns, only keep these variables
                    select = c('raw_visit_counts', 'region',
                               'bucketed_dwell_times', 'location_name'),
                    # I want two aggregations of patterns - one of total visits by state ('region')
                    # and another by location_name that has the dwell times for each brand
                    multi = list(
                      list(name = 'all',
                           by = 'region'),
                      list(name = 'location_dwells',
                           by = 'location_name',
                           expand_cat = 'bucketed_dwell_times',
                           expand_name = 'bucketed_times')
                    ),
                    # Be sure to specify start_date for read_shop
                    start_date = lubridate::ymd('2020-03-01'))
# The result is a list with two items: patterns and home_panel_summary.csv
# patterns itself is a list with two data.tables inside, 'all' and 'location_dwells',
# aggregated as given.
}
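The shape of the result described in the comments above can be sketched with a hand-built mock. The list below is a stand-in and not output from read_shop: the column values are invented for illustration, and a real result would hold data.tables rather than base data.frames. It also shows one way to check which state indicator column is present before relying on gen_fips.

```r
# Mock of the structure described above: NOT real read_shop output
mock_result <- list(
  patterns = list(
    all = data.frame(region = c("CA", "WA"),
                     raw_visit_counts = c(120, 45)),
    location_dwells = data.frame(location_name = "Example Cafe",
                                 bucketed_times = "<5",
                                 bucketed_dwell_times = 10)
  ),
  home_panel_summary.csv = data.frame(region = c("CA", "WA"))
)

# Top-level names follow the keeplist entries
names(mock_result)

# Each aggregation requested through multi is reached by its name
by_state <- mock_result$patterns$all
dwells <- mock_result[["patterns"]][["location_dwells"]]

# File-based components are named after the file they came from
panel <- mock_result[["home_panel_summary.csv"]]

# Check which state indicator is available before turning gen_fips on
"poi_cbg" %in% names(by_state)
"region" %in% names(by_state)
```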