tidyr 0.2.0 is now available on CRAN. tidyr makes it easy to “tidy” your data, storing it in a consistent form so that it’s easy to manipulate, visualise and model. Tidy data has variables in columns and observations in rows, and is described in more detail in the tidy data vignette. Install tidyr with:

install.packages("tidyr")

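There are three important additions to tidyr 0.2.0, each illustrated below: expand(), which generates every combination of the variables you give it and, combined with dplyr::left_join(), makes implicit missing values explicit; unnest(), which turns a list-column into a regular column, duplicating rows as needed; and a new extra argument to separate(), which controls what happens when a value splits into more pieces than expected. The examples assume tidyr has been loaded with library(tidyr).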

sales <- dplyr::data_frame(
  year = rep(c(2012, 2013), c(4, 2)),
  quarter = c(1, 2, 3, 4, 2, 3),
  sales = sample(6) * 100
)

# Missing sales data for 2013 Q1 & Q4
sales
#> Source: local data frame [6 x 3]
#>
#>   year quarter sales
#> 1 2012       1   400
#> 2 2012       2   200
#> 3 2012       3   500
#> 4 2012       4   600
#> 5 2013       2   300
#> 6 2013       3   100

# Missing values are now explicit
sales %>%
  expand(year, quarter) %>%
  dplyr::left_join(sales)
#> Joining by: c("year", "quarter")
#> Source: local data frame [8 x 3]
#>
#>   year quarter sales
#> 1 2012       1   400
#> 2 2012       2   200
#> 3 2012       3   500
#> 4 2012       4   600
#> 5 2013       1    NA
#> 6 2013       2   300
#> 7 2013       3   100
#> 8 2013       4    NA
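
For comparison, the same result can be sketched in base R with expand.grid() and merge(). This is only an illustration of the idea, not a description of how expand() is implemented; the sales values are fixed to match the printed output above rather than sampled, and the names all_combos and complete are made up for this example.

```r
# Same data as above, with fixed sales values instead of sample()
sales <- data.frame(
  year = rep(c(2012, 2013), c(4, 2)),
  quarter = c(1, 2, 3, 4, 2, 3),
  sales = c(400, 200, 500, 600, 300, 100)
)

# Every combination of the observed years and quarters (2 x 4 = 8 rows)
all_combos <- expand.grid(
  year = unique(sales$year),
  quarter = unique(sales$quarter)
)

# all.x = TRUE keeps combinations with no matching sales row, filling sales with NA
complete <- merge(all_combos, sales, by = c("year", "quarter"), all.x = TRUE)
complete <- complete[order(complete$year, complete$quarter), ]
complete
```

The two rows with NA sales are the 2013 Q1 and Q4 combinations that were absent from the original data.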

raw <- dplyr::data_frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h")
)
# y is character vector containing comma separated strings
raw
#> Source: local data frame [3 x 2]
#>
#>   x     y
#> 1 1     a
#> 2 2 d,e,f
#> 3 3   g,h

# y is a list of character vectors
as_list <- raw %>% dplyr::mutate(y = strsplit(y, ","))
as_list
#> Source: local data frame [3 x 2]
#>
#>   x        y
#> 1 1 <chr[1]>
#> 2 2 <chr[3]>
#> 3 3 <chr[2]>

# y is a character vector; rows are duplicated as needed
as_list %>% unnest(y)
#> Source: local data frame [6 x 2]
#>
#>   x y
#> 1 1 a
#> 2 2 d
#> 3 2 e
#> 4 2 f
#> 5 3 g
#> 6 3 h
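
To see what the row duplication amounts to, here is a rough base R equivalent of the split-then-unnest steps above. It is a sketch under the assumption that only one column needs expanding, not a description of unnest() internals; pieces and unnested are names chosen for this example.

```r
raw <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  stringsAsFactors = FALSE
)

# One character vector per row, as in the list-column above
pieces <- strsplit(raw$y, ",", fixed = TRUE)

# Repeat each x once per piece, then flatten the pieces into a single column
unnested <- data.frame(
  x = rep(raw$x, lengths(pieces)),
  y = unlist(pieces),
  stringsAsFactors = FALSE
)
unnested
```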
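
# By default, separate() fails when a value doesn't split into the expected number of pieces
raw %>% separate(y, c("trt", "B"), ",")
#> Error: Values not split into 2 pieces at 1, 2

# extra = "drop" discards the extra pieces; rows with too few pieces get NA
raw %>% separate(y, c("trt", "B"), ",", extra = "drop")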
#> Source: local data frame [3 x 3]
#>
#>   x trt  B
#> 1 1   a NA
#> 2 2   d  e
#> 3 3   g  h

# extra = "merge" keeps the extra pieces together in the last column
raw %>% separate(y, c("trt", "B"), ",", extra = "merge")
#> Source: local data frame [3 x 3]
#>
#>   x trt   B
#> 1 1   a  NA
#> 2 2   d e,f
#> 3 3   g   h
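
The difference between the two settings can be mimicked in base R, which may help when reasoning about which one you want. The sketch below reproduces the outputs shown above for a two-column split; it is an illustration only, and the helper names (pieces, first, second_drop, second_merge) are invented for this example.

```r
raw <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  stringsAsFactors = FALSE
)
pieces <- strsplit(raw$y, ",", fixed = TRUE)

# First piece becomes trt in both cases
first <- vapply(pieces, function(p) p[1], character(1))

# "drop": keep only the second piece (NA when there is none), discard the rest
second_drop <- vapply(pieces, function(p) p[2], character(1))

# "merge": glue everything after the first piece back together
second_merge <- vapply(pieces, function(p) {
  if (length(p) < 2) NA_character_ else paste(p[-1], collapse = ",")
}, character(1))

dropped <- data.frame(x = raw$x, trt = first, B = second_drop, stringsAsFactors = FALSE)
merged <- data.frame(x = raw$x, trt = first, B = second_merge, stringsAsFactors = FALSE)
dropped
merged
```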

To read about the other minor changes and bug fixes, please consult the release notes.

reshape2 1.4.1

There’s also a new version of reshape2, 1.4.1. It includes three bug fixes for melt.data.frame() contributed by Kevin Ushey. Read all about them in the release notes and install it with:

install.packages("reshape2")