Releases · tidyverse/tidyr

24 Jan 21:40

hadley

v1.0.2

434c453

tidyr 1.0.2

Minor fixes for dev versions of rlang, tidyselect, and tibble.

(Was supposed to be 1.0.1; accidentally released as 1.0.2)

13 Sep 14:48

hadley

v1.0.0

5295af0

tidyr 1.0.0

Breaking changes

See vignette("in-packages") for a detailed transition guide.

nest() and unnest() have new syntax. The majority of existing usage
should be automatically translated to the new syntax with a warning.
If that doesn't work, put this in your script to use the old versions
until you can take a closer look and update your code:
```
library(tidyr)
# Temporarily mask the new functions with their pre-1.0.0 implementations.
nest <- nest_legacy
unnest <- unnest_legacy
```
nest() now preserves grouping, which has implications for downstream calls
to group-aware functions, such as dplyr::mutate() and filter().
The first argument of nest() has changed from data to .data.
unnest() uses the emerging tidyverse standard
to disambiguate unique names. Use names_repair = tidyr_legacy to
request the previous approach.
unnest_()/nest_() and the lazyeval methods for unnest()/nest() are
now defunct. They have been deprecated for some time, and, since the interface
has changed, package authors will need to update to avoid deprecation
warnings. I think one clean break should be less work for everyone.

All other lazyeval functions have been formally deprecated, and will be
made defunct in the next major release. (See lifecycle vignette for
details on deprecation stages).
crossing() and nesting() now return 0-row outputs if any input is a
length-0 vector. If you want to preserve the previous behaviour which
silently dropped these inputs, you should convert empty vectors to NULL.
(More discussion on this general pattern at
tidyverse/design#24)
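
As a minimal sketch of two of the changes listed above, assuming tidyr 1.0.0 or later is installed: the first half shows the new nest()/unnest() syntax, and the second half shows how crossing() treats length-0 and NULL inputs. The data frame `df` and its columns `g`, `x` and `y` are made up for illustration.

```r
library(tidyr)

# Hypothetical example data: one grouping column and two value columns.
df <- tibble::tibble(g = c("a", "a", "b"), x = 1:3, y = 4:6)

# New nest() syntax: name the list-column and state which columns go into it.
nested <- nest(df, data = c(x, y))

# New unnest() syntax: state which list-columns to unnest via `cols`.
flat <- unnest(nested, cols = c(data))

# A length-0 input now produces a 0-row result ...
empty <- crossing(x = 1:3, y = integer())

# ... while a NULL input is dropped, which mimics the previous behaviour.
kept <- crossing(x = 1:3, y = NULL)
```

Here `nested` has one row per value of `g`, and `flat` has the same rows and columns as `df`.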

Pivoting

New pivot_longer() and pivot_wider() provide modern alternatives to spread() and gather(). They have been carefully redesigned to be easier to learn and remember, and include many new features. Learn more in vignette("pivot").

These functions resolve multiple existing issues with spread()/gather(). Both functions now handle multiple value columns (#149/#150), support more vector types (#333), use tidyverse conventions for duplicated column names (#496, #478), and are symmetric (#453). pivot_longer() gracefully handles duplicated column names (#472), and can directly split column names into multiple variables. pivot_wider() can now aggregate (#474), select keys (#572), and has control over generated column names (#208).

To demonstrate how these functions work in practice, tidyr has gained several new datasets: relig_income, construction, billboard, us_rent_income, fish_encounters and world_bank_pop.
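
The symmetry of the two functions can be sketched with a small round trip. This is only an illustration, assuming tidyr 1.0.0 or later. The tibble `wide` and the column names `key` and `value` are invented here and are not one of the bundled datasets.

```r
library(tidyr)

# Hypothetical wide data: one row per id, one column per measurement.
wide <- tibble::tibble(id = 1:2, x = c(10, 20), y = c(30, 40))

# Lengthen: the names of x and y go to `key`, their values go to `value`.
long <- pivot_longer(wide, cols = c(x, y), names_to = "key", values_to = "value")

# Widen again: column names come from `key`, cell values from `value`.
back <- pivot_wider(long, names_from = key, values_from = value)
```

The same calls should work on the bundled datasets, for example with `relig_income` in place of `wide` and a selection such as `-religion` for `cols`.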

Finally, tidyr demos have been removed. They are dated, and have been superseded by vignette("pivot").

Rectangling

tidyr contains four new functions to support rectangling, turning a deeply nested list into a tidy tibble: unnest_longer(), unnest_wider(), unnest_auto(), and hoist(). They are documented in a new vignette: vignette("rectangle").

unnest_longer() and unnest_wider() make it easier to unnest list-columns of vectors into either rows or columns (#418). unnest_auto() automatically picks between _longer() and _wider() using heuristics based on the presence of common names.

New hoist() provides a convenient way of plucking components of a list-column out into their own top-level columns (#341). This is particularly useful when you are working with deeply nested JSON, because it provides a convenient shortcut for the mutate() + map() pattern:

df %>% hoist(metadata, name = "name")
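
A slightly fuller sketch of the rectangling functions, assuming tidyr 1.0.0 or later. The tibble `df` and the fields `name` and `tags` are hypothetical stand-ins for parsed JSON.

```r
library(tidyr)

# Hypothetical list-column, shaped like parsed JSON records.
df <- tibble::tibble(
  id = 1:2,
  metadata = list(
    list(name = "a", tags = c("x", "y")),
    list(name = "b", tags = "z")
  )
)

# Pull one component out into its own top-level column.
hoisted <- hoist(df, metadata, name = "name")

# Spread every component of each record into its own column ...
wide <- unnest_wider(df, metadata)

# ... then give each element of `tags` its own row.
long <- unnest_longer(wide, tags)
```

hoist() is the lighter tool when only a few components are needed. unnest_wider() followed by unnest_longer() flattens the whole structure.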

02 Mar 14:40

hadley

v0.8.3

c267c48

tidyr 0.8.3

crossing() preserves factor levels (#410) and now works with list-columns
(#446, @SamanthaToet). (These changes also help expand(), which is built on top
of crossing().)
nest() is compatible with dplyr 0.8.0.
spread() works when the id variable has names (#525).
unnest() preserves column being unnested when input is zero-length (#483),
using list_of() attribute to correctly restore columns, where possible.
unnest() will run with named and unnamed list-columns of same length
(@hlendway, #460).

29 Oct 14:18

hadley

v0.8.2

8a42d07

tidyr 0.8.2

separate() now accepts NA as a column name in the into argument to
denote columns which are omitted from the result. (@markdly, #397).
Minor updates to ensure compatibility with dependencies.
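
A short sketch of the NA placeholder in into, assuming tidyr 0.8.2 or later. The tibble `df` and the column names `year` and `day` are made up for illustration.

```r
library(tidyr)

# Hypothetical data: ISO dates stored as strings.
df <- tibble::tibble(x = c("2019-09-13", "2020-01-24"))

# The middle piece (the month) is matched by NA and left out of the result.
out <- separate(df, x, into = c("year", NA, "day"), sep = "-")
```

The result keeps only the `year` and `day` columns.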

18 May 14:04

hadley

v0.8.1

1b3bcfc

tidyr 0.8.1

unnest() weakens test of "atomicity" to restore previous behaviour when
unnesting factors and dates (#407).

30 Jan 15:44

hadley

v0.8.0

bb69f87

tidyr 0.8.0

Breaking changes

There are no deliberate breaking changes in this release. However, a number
of packages are failing with errors related to numbers of elements in columns,
and row names. It is possible that these are accidental API changes or new
bugs. If you see such an error in your package, I would sincerely appreciate
a minimal reprex.
separate() now correctly uses -1 to refer to the far right position,
instead of -2. If you depended on this behaviour, you'll need to switch
on packageVersion("tidyr") > "0.7.2".

New features

Increased test coverage from 84% to 99%.
uncount() performs the inverse operation of dplyr::count() (#279).
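
A minimal illustration of uncount(), assuming tidyr 0.8.0 or later. The tibble `df` and its count column `n` are hypothetical.

```r
library(tidyr)

# Hypothetical counted data: "a" was seen once, "b" twice.
df <- tibble::tibble(x = c("a", "b"), n = c(1, 2))

# Repeat each row according to its count; the count column is dropped.
expanded <- uncount(df, n)
```

Each row is duplicated `n` times, so `expanded` has three rows and a single column `x`.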

Bug fixes and minor improvements

complete(data) now returns data rather than throwing an error (#390).
complete() with zero-length completions returns original input (#331).
crossing() preserves NAs (#364).
expand() with empty input gives empty data frame instead of NULL (#331).
expand(), crossing(), and complete() now complete empty factors instead
of dropping them (#270, #285)
extract() has a better error message if regex does not contain the
expected number of groups (#313).
drop_na() no longer drops columns (@jennybryan, #245), and works with
list-cols (#280). Equivalent of NA in a list column is any empty
(length 0) data structure.
nest() is now faster, especially when a long data frame is collapsed into
a nested data frame with few rows.
nest() on a zero-row data frame works as expected (#320).
replace_na() no longer complains if you try and replace missing values in
variables not present in the data (#356).
replace_na() now also works with vectors (#342, @flying-sheep), and
can replace NULL in list-columns. It throws a better error message if
you attempt to replace with something other than length 1.
separate() no longer checks that ... is empty, allowing methods to make
use of it. This check was added in tidyr 0.4.0 (2016-02-02) to deprecate
previous behaviour where ... was passed to strsplit().
separate() and extract() now insert columns in correct position when
drop = TRUE (#394).
separate() now correctly counts from the RHS when using negative
integer sep values (@markdly, #315).
separate() gets improved warning message when pieces aren't as expected
(#375).
separate_rows() supports list columns (#321), and works with empty tibbles.
spread() now consistently returns 0 row outputs for 0 row inputs (#269).
spread() now works when key column includes NA and drop is FALSE
(#254).
spread() no longer returns tibbles with row names (#322).
spread(), separate(), extract() (#255), and gather() (#347) now
replace existing variables rather than creating an invalid data frame with
duplicated variable names (matching the semantics of mutate).
unite() now works (as documented) if you don't supply any variables (#355).
unnest() gains preserve argument which allows you to preserve list
columns without unnesting them (#328).
unnest() can now unnest list-columns that contain lists of lists (#278).
unnest(df) now works if df contains no list-cols (#344).
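
Two of the fixes above, for replace_na() and drop_na(), can be sketched as follows, assuming tidyr 0.8.0 or later. The tibble `df` and the replacement values are invented for illustration.

```r
library(tidyr)

# replace_na() works directly on a vector.
filled_vec <- replace_na(c(1, NA, 3), 0)

# Hypothetical data frame with missing values in both columns.
df <- tibble::tibble(x = c(1, NA), y = c("a", NA))

# On a data frame, supply one replacement value per column.
filled_df <- replace_na(df, list(x = 0, y = "unknown"))

# drop_na() removes incomplete rows but keeps every column.
complete_rows <- drop_na(df)
```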

17 Oct 14:09

hadley

v0.7.2

0ed08c2

tidyr 0.7.2

The SE variants gather_(), spread_() and nest_() now
treat non-syntactic names in the same way as pre tidy eval versions
of tidyr (#361).
Fix tidyr bug revealed by R-devel.

12 Sep 06:24

lionel-

v0.7.1

bd0c6b0

tidyr 0.7.1

This is a hotfix release to account for some tidyselect changes in the
unit tests.

Note that the upcoming version of tidyselect backtracks on some of the
changes announced for 0.7.0. The special evaluation semantics for
selection have been changed back to the old behaviour because the new
rules were causing too much trouble and confusion. From now on data
expressions (symbols and calls to : and c()) can refer to both
registered variables and to objects from the context.

However the semantics for context expressions (any calls other than to
: and c()) remain the same. Those expressions are evaluated in the
context only and cannot refer to registered variables. If you're
writing functions and refer to contextual objects, it is still a good
idea to avoid data expressions by following the advice of the 0.7.0
release notes.

16 Aug 14:10

lionel-

v0.7.0

8926723

tidyr 0.7.0

This release includes important changes to tidyr internals. Tidyr now
supports the new tidy evaluation framework for quoting (NSE)
functions. It also uses the new tidyselect package as selecting
backend.

Breaking changes

If you see error messages about objects or functions not found, it
is likely because the selecting functions are now stricter in their
arguments. An example of a selecting function is gather() and its
... argument. This change makes the code more robust by
disallowing ambiguous scoping. Consider the following code:
```
x <- 3
df <- tibble(w = 1, x = 2, y = 3)
gather(df, "variable", "value", 1:x)
```
Does it select the first three columns (using the x defined in the
global environment), or does it select the first two columns (using
the column named x)?

To solve this ambiguity, we now make a strict distinction between
data and context expressions. A data expression is either a bare
name or an expression like x:y or c(x, y). In a data expression,
you can only refer to columns from the data frame. Everything else
is a context expression in which you can only refer to objects that
you have defined with <-.

In practice this means that you can no longer refer to contextual
objects like this:
```
mtcars %>% gather(var, value, 1:ncol(mtcars))

x <- 3
mtcars %>% gather(var, value, 1:x)
mtcars %>% gather(var, value, -(1:x))
```
You now have to be explicit about where to find objects. To do so,
you can use the quasiquotation operator !! which will evaluate its
argument early and inline the result:
```
mtcars %>% gather(var, value, !! 1:ncol(mtcars))
mtcars %>% gather(var, value, !! 1:x)
mtcars %>% gather(var, value, !! -(1:x))
```
An alternative is to turn your data expression into a context
expression by using seq() or seq_len() instead of :. See the
section on tidyselect for more information about these semantics.
Following the switch to tidy evaluation, you might see warnings
about the "variable context not set". This is most likely caused by
supplying helpers like everything() to underscored versions of
tidyr verbs. Helpers should always be evaluated lazily. To fix
this, just quote the helper with a formula: drop_na(df, ~everything()).
The selecting functions are now stricter when you supply integer
positions. If you see an error along the lines of
```
`-0.949999999999999`, `-0.940000000000001`, ... must resolve to
integer column positions, not a double vector
```
please round the positions before supplying them to tidyr. Double
vectors are fine as long as they are rounded.
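
The two ways of referring to a contextual object mentioned above (unquoting with !!, or using seq_len() so the selection becomes a context expression) can be sketched like this. It is an illustration only, and the variable name `n` is arbitrary. Both forms are expected to select the same columns.

```r
library(tidyr)

# A contextual object, defined outside the data frame.
n <- 3

# Option 1: unquote, so 1:n is evaluated early and inlined as 1:3.
by_unquote <- gather(mtcars, var, value, !!(1:n))

# Option 2: a call other than `:` or c() is evaluated in the context only.
by_seq_len <- gather(mtcars, var, value, seq_len(n))
```

In both cases the first three columns of mtcars (mpg, cyl and disp) are gathered.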

Switch to tidy evaluation

tidyr is now a tidy evaluation grammar. See the
programming vignette
in dplyr for practical information about tidy evaluation.

The tidyr port is a bit special. While the philosophy of tidy
evaluation is that R code should refer to real objects (from the data
frame or from the context), we had to make some exceptions to this
rule for tidyr. The reason is that several functions accept bare
symbols to specify the names of new columns to create (gather()
being a prime example). This is not tidy because the symbols do not
represent any actual object. Our workaround is to capture these
arguments using rlang::quo_name() (so they still support
quasiquotation and you can unquote symbols or strings). This type of
NSE is now discouraged in the tidyverse: symbols in R code should
represent real objects.
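
Because these name arguments support quasiquotation, the names of new columns can be supplied programmatically by unquoting strings. A small sketch follows. The tibble `df` and the variables `key_name` and `value_name` are hypothetical, and the capture mechanism inside tidyr may differ between versions.

```r
library(tidyr)

# Hypothetical data with two measurement columns.
df <- tibble::tibble(id = 1:2, x = c(1, 2), y = c(3, 4))

# Names for the new columns, held in ordinary variables.
key_name <- "measure"
value_name <- "reading"

# Unquote the strings so they become the new column names.
long <- gather(df, !!key_name, !!value_name, x, y)
```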

Following the switch to tidy eval the underscored variants are softly
deprecated. However they will remain around for some time and without
warning for backward compatibility.

Switch to the tidyselect backend

The selecting backend of dplyr has been extracted in a standalone
package tidyselect which tidyr now uses for selecting variables. It is
used for selecting multiple variables (in drop_na()) as well as
single variables (the col argument of extract() and separate(),
and the key and value arguments of spread()). This implies the
following changes:

The arguments for selecting a single variable now support all
features from dplyr::pull(). You can supply a name or a position,
including negative positions.
Multiple variables are now selected a bit differently. We now make a
strict distinction between data and context expressions. A data
expression is either a bare name or an expression like x:y or
c(x, y). In a data expression, you can only refer to columns from
the data frame. Everything else is a context expression in which you
can only refer to objects that you have defined with <-.

You can still refer to contextual objects in a data expression by
being explicit. One way of being explicit is to unquote a variable
from the environment with the tidy eval operator !!:
```
x <- 2
drop_na(df, 2)     # Works fine
drop_na(df, x)     # Object 'x' not found
drop_na(df, !! x)  # Works as if you had supplied 2
```
On the other hand, select helpers like starts_with() are context
expressions. It is therefore easy to refer to objects and they will
never be ambiguous with data columns:
```
x <- "d"
drop_na(df, starts_with(x))
```
While these special rules are in contrast to most dplyr and tidyr
verbs (where both the data and the context are in scope) they make
sense for selecting functions and should provide more robust and
helpful semantics.
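
For the single-variable arguments, the dplyr::pull()-style selection described above can be sketched with separate(). This assumes a tidyr version of 0.7.0 or later in which col is resolved by name or position. The tibble `df` and the column names are made up for illustration.

```r
library(tidyr)

# Hypothetical data: the column to split is the second (and last) one.
df <- tibble::tibble(id = 1:2, ym = c("2017-08", "2017-09"))

# Select the column by name ...
by_name <- separate(df, ym, into = c("year", "month"), sep = "-")

# ... by position counted from the left ...
by_position <- separate(df, 2, into = c("year", "month"), sep = "-")

# ... or by position counted from the right.
by_negative <- separate(df, -1, into = c("year", "month"), sep = "-")
```

All three calls are expected to give the same result.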

19 May 14:26

hadley

0.6.3

eb70bfe

tidyr 0.6.3

Patch tests to be compatible with dev tibble
