-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use mutate(.keep = "none")
in transmute()
#483
base: main
Are you sure you want to change the base?
Conversation
The code change looks good. There's a tradeoff to the change, because if you're doing an immediate library(bench)
library(data.table)
library(magrittr)
n <- 1e6
dt <- replicate(10, sample(n), simplify = FALSE) %>%
setNames(., tail(letters, length(.))) %>%
as.data.table()
mark(
add_then_remove = copy(dt)[, `:=`(double_x = x * 2)][, `:=`(names(dt), NULL)],
list_in_j = dt[, .(double_x = x * 2)],
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 add_then_remove 6.8ms 8.96ms 91.7 47.6MB 504.
#> 2 list_in_j 2.19ms 2.86ms 315. 15.3MB 137. Created on 2024-12-25 with reprex v2.0.2 One thought I had is that transmute could be a |
Ah yep I'll make the reorder change.
We could also do something like this to preserve GForce but I think it makes the translation look odd. library(dtplyr)
library(dplyr)
df <- tibble(x = 1:2, y = 3:4) %>%
lazy_dt()
df %>%
reframe(y, mean_x = mean(x), .row_preserve = row_number()) %>%
select(-.row_preserve)
#> Source: local data table [2 x 2]
#> Call: `_DT1`[, .(y = y, mean_x = mean(x), .row_preserve = seq_len(.N))][,
#> `:=`(".row_preserve", NULL)]
#>
#> y mean_x
#> <int> <dbl>
#> 1 3 1.5
#> 2 4 1.5
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results |
Actually sorry I completely did the column order thing wrong 😅 Let me take another pass at it. |
Ok good to review now |
cols_group <- group_vars(.data) | ||
cols_group <- setdiff(cols_group, cols_expr) | ||
cols_retain <- c(cols_group, cols_expr) | ||
select(out, all_of(cols_retain)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some columns could have been removed in mutate
/transmute
with colnm = NULL
, e.g.
tibble(x = 1, y = 2) %>%
lazy_dt() %>%
transmute(x, z = 3, z = NULL)
So, maybe intersect()
with the colnames of the mutate
before select
?
cols_retain <- intersect(c(cols_group, cols_expr), out$vars)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or, could use any_of()
Closes #470
Note: I removed most of the dedicated
transmute()
tests as they're now covered bymutate()
tests