unnest column of data.frames #1112

@geotheory

unnest() does not work with data.frame columns.


unnest() works with a list-column of nested data.frames but fails with a data.frame column holding the same data. Given that data.frame columns are a common format (e.g. typical fromJSON() output), this feels like an unnecessary trap for the unaware.

The documentation does say "list-column" in its title, but otherwise the distinction between the two column types isn't discussed. It seems to me that a data.frame column is quite likely to be confused with a list-column of nested data.frames.
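To make the distinction concrete, here is a minimal base R sketch. The object names `df_col` and `list_col` are illustrative only and do not come from the reprex below. A data.frame column is a single data frame with one row per parent row, whereas a list-column is a list holding one element per row.

```r
# data.frame column: one data frame, as many rows as the parent
df_col <- data.frame(id = c("a", "b"))
df_col$data <- data.frame(count = c(49L, 93L), max = c(100L, 120L))

# list-column: a list with one (here one-row) data frame per parent row
list_col <- data.frame(id = c("a", "b"))
list_col$data <- list(
  data.frame(count = 49L, max = 100L),
  data.frame(count = 93L, max = 120L)
)

# is.list() cannot tell them apart, because a data frame is itself a list
is.list(df_col$data)          # TRUE
is.list(list_col$data)        # TRUE

# is.data.frame() does distinguish the two
is.data.frame(df_col$data)    # TRUE
is.data.frame(list_col$data)  # FALSE
```

Checking `is.data.frame()` on the column before calling unnest() is one way to see which case applies.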

# library() errors if a package is missing; require() only warns
library(jsonlite)
library(tidyr)
library(dplyr)

j = '{
  "observations": [{
    "id": "a",
    "data": {
      "count": 49,
      "max": 100
    }
  }, {
    "id": "b",
    "data": {
      "count": 93,
      "max": 120
    }
  }, {
    "id": "c",
    "data": {
      "count": 27,
      "max": 88
    }
  }]
}'

d = fromJSON(j)$observations %>% as_tibble()

glimpse(d)
#> Rows: 3
#> Columns: 2
#> $ id   <chr> "a", "b", "c"
#> $ data <df[,2]> <data.frame[3 x 2]>

# Fails: `data` is a data.frame column, not a list-column
d %>% unnest(data)
#> Error: Assigned data `map(data[[col]], as_df, col = col)` must be compatible with existing data.
#> x Existing data has 3 rows.
#> x Assigned data has 2 rows.
#> ℹ Only vectors of size 1 are recycled.

# Workaround: wrap each row of `data` in a list so it becomes a list-column
d %>% rowwise() %>% mutate(data = list(data)) %>% unnest(data)
#> # A tibble: 3 x 3
#>   id    count   max
#>   <chr> <int> <int>
#> 1 a        49   100
#> 2 b        93   120
#> 3 c        27    88
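As a possible alternative to the rowwise() workaround, tidyr (1.0.0 and later) provides unpack(), which is intended for data.frame columns. The sketch below is not part of the original report; it rebuilds an equivalent of `d` by hand so it runs without jsonlite, and the name `flat` is illustrative.

```r
library(tibble)
library(tidyr)

# Rebuild the shape fromJSON() produced: an `id` column plus a
# data.frame column named `data` (constructed by hand here)
df <- data.frame(id = c("a", "b", "c"))
df$data <- data.frame(count = c(49L, 93L, 27L), max = c(100L, 120L, 88L))
d <- as_tibble(df)

# unpack() lifts the columns of a data.frame column into the parent
flat <- unpack(d, data)
names(flat)  # "id" "count" "max"
```

If the nesting only arises from the JSON import, passing `flatten = TRUE` to fromJSON() is another route; it flattens nested data frames at read time, with column names prefixed by the parent name.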

Labels: df-col 👜; feature (a feature request or enhancement); rectangling 🗄️ (converting deeply nested lists into tidy data frames)