Skip to content

melt needs a variable.value parameter. #2065

@juanpide

Description

@juanpide

In order to reshape data.tables such as this one:

mydata <- data.table(ID=1:5, ZA_2001=1:5, ZA_2002=5:1, BB_2001=rep(3,5), BB_2002=rep(6,5), CC_2007=6:2)
ID ZA_2001 ZA_2002 BB_2001 BB_2002 CC_2007
1       1       5       3       6       6
2       2       4       3       6       5
3       3       3       3       6       4
4       4       2       3       6       3
5       5       1       3       6       2

to this one

ID  measure ZA BB CC
1    2001  1  3 NA
1    2002  5  6 NA
1    2007 NA NA  6
2    2001  2  3 NA
2    2002  4  6 NA
2    2007 NA NA  5
3    2001  3  3 NA
3    2002  3  6 NA
3    2007 NA NA  4
4    2001  4  3 NA
4    2002  2  6 NA
4    2007 NA NA  3
5    2001  5  3 NA
5    2002  1  6 NA
5    2007 NA NA  2

and keep the original variables' order I need to do:

idvars =  grep("_20[0-9][0-9]$",names(mydata) , invert = TRUE)
temp <- melt(mydata, id.vars = idvars)  
temp[, `:=`(var = sub("_20[0-9][0-9]$", '', variable), 
measure = sub('.*_', '', variable), variable = NULL)]  
temp[,var:=factor(var, levels=unique(var))]
dcast( temp,   ... ~ var, value.var='value' )

But it needs too many lines of code, a lot of memory, and it forces type conversions.

It would be great if we could just do it using melt:

melt(mydata, measure.vars=patterns("ZA","BB","CC"), value.name =c("ZA","BB","CC"), id.vars="ID", variable.name="year")   

(or using regex instead).

But it doesn't work, it only produces 10 rows, and it would need an option to specify where to get the year from.

something like variable.values=sub(".*_","",names(mydata) )[-1]

or even better a syntax such as the splitstackshape package:
Reshape(mydata,id.vars="ID",var.stubs =... ,sep="_")
it would be great.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions