expand_cat_json.Rd
This function accepts a data.table
along with a set of grouping variables and a character-format category-style JSON column (i.e. starts with curly brackets, not square).
expand_cat_json(
dt,
expand,
index = "index",
by = NULL,
fast = FALSE,
fun = sum,
na.rm = TRUE,
set_key = TRUE
)
data.table object (or something that can be coerced to data.table)
String indicating the JSON column to be expanded.
String indicating the name of the new index column
Character vector indicating the variables to group by after expanding. Set to NULL
to aggregate across all initial rows, or set to FALSE
to not aggregate at all (this will also add an initial_rowno
column showing the original row number).
Assumes that all the JSON vectors are of the exact same categories, and adds the values together rather than using whatever is in fun
.
Function that takes a vector and returns a single value to use when collapsing to the by
level. Requires fast = FALSE
.
Ignore missing values of expand
Set the key of dt
to by
. Set to FALSE
if you have already set the key or want it returned without key.
It expands that JSON column into long format, with one row per observation per value of the JSON column, and then collapses everything according to the set of grouping variables.
# Raw example data for expanding/collapsing
patterns <- data.table::data.table(state_fips = c(1,1,2,2),
cat_origin = c('{"a": "2", "b": "3"}',
'{"a": "3", "b": "4"}',
'{"a": "4", "b": "5"}',
'{"a": "5", "b": "6"}'))
expand_cat_json(patterns, 'cat_origin', by = 'state_fips')[]
#> index state_fips cat_origin
#> 1: a 1 5
#> 2: b 1 7
#> 3: a 2 9
#> 4: b 2 11