fromJSON is very slow in simplifyDataFrame function #203

Open
andrewkho opened this issue Sep 25, 2017 · 1 comment

Comments

@andrewkho

Hi, I'm using a package which translates ElasticSearch queries (JSON format) to data.tables. The issue is that it's very slow, and the majority of the time seems to be spent in jsonlite's 'simplifyDataFrame' function.

If I call:
jsonList <- jsonlite::fromJSON(m, flatten = TRUE)

where m is a 17.9 Mb JSON string, it takes over 35 seconds and results in a 120.8 Mb object (jsonList). I ran it through Rprof, and most of the time appears to be spent in simplifyDataFrame. Here are the first few rows of the profiling summary; I'll also attach the raw profile data:

$by.total
                           total.time total.pct self.time self.pct
"chomp_aggs"                     44.72     99.91      0.00     0.00
"lapply"                         42.98     96.02      2.68     5.99
"FUN"                            42.92     95.89      4.76    10.63
"fromJSON_string"                35.86     80.12      0.00     0.00
"jsonlite::fromJSON"             35.86     80.12      0.00     0.00
"simplifyDataFrame"              35.02     78.24      2.00     4.47
"simplify"                       35.02     78.24      0.00     0.00
"vapply"                          9.56     21.36      3.72     8.31
"unpack_nested_data"              8.52     19.03      0.00     0.00
"unlist"                          7.44     16.62      1.34     2.99
"data.frame"                      4.94     11.04      0.84     1.88
"unique"                          4.58     10.23      1.78     3.98
"as.data.table.data.frame"        4.00      8.94      0.12     0.27

Is there any way this function could be sped up?
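
For anyone who wants to check how much of the time goes to simplification on their own data, a rough sketch is to time the same parse with simplification turned off. The JSON string built here is a small synthetic stand-in (an assumption, not the ElasticSearch response from this report), and the field names id, meta, tag and score are made up for illustration. The timings will differ from the numbers above.

```r
library(jsonlite)

# Hypothetical stand-in for `m`: an array of records with one nested object.
set.seed(1)
n <- 20000
m <- paste0(
  "[",
  paste(
    sprintf('{"id":%d,"meta":{"tag":"%s","score":%f}}',
            seq_len(n), sample(letters, n, replace = TRUE), runif(n)),
    collapse = ","
  ),
  "]"
)

# Parse with full simplification (the call from the report).
t_full <- system.time(full <- fromJSON(m, flatten = TRUE))

# Parse only, leaving the result as nested lists (no simplifyDataFrame step).
t_raw <- system.time(raw <- fromJSON(m, simplifyVector = FALSE))

print(rbind(simplified = t_full, parse_only = t_raw)[, "elapsed"])
```

If the parse-only timing is much smaller, the difference is roughly the cost of simplification, and building the table directly from the nested list may be worth trying as a workaround. Whether that is faster for a given structure is something to measure, not something this sketch shows.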

@exaexa
exaexa commented Aug 1, 2019

+1, same issue (triggered by pushing large amounts of data to browsers from Shiny)

Is there any update since the bug was opened?
