DesignMatrix should have to_dataframe() method

This would be useful, for example, when I want to use a design matrix as both a raw numpy array and a pandas DataFrame.

I suppose I could specify `return_type="dataframe"` and then get the numpy array from `df.values`, and it's also not hard to build the DataFrame from scratch. Still, a method would be particularly handy for interactive use, where it would provide a useful shortcut (e.g., `X.to_dataframe().plot()` or `X.to_dataframe().head()`).
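For reference, here is a minimal sketch of the two workarounds mentioned above, using the current API. The toy `data` frame and the formula are made up for illustration, and the second workaround assumes no rows were dropped when the matrix was built.

``` python
import numpy as np
import pandas as pd
from patsy import dmatrix

# Toy input (illustrative only), with a non-default index.
data = pd.DataFrame({"x1": [1.0, 2.0, 3.0], "x2": [4.0, 5.0, 6.0]},
                    index=["a", "b", "c"])

# Workaround 1: ask for a DataFrame up front, then pull out the raw array.
df = dmatrix("x1 + x2", data, return_type="dataframe")
arr = df.values

# Workaround 2: start from a DesignMatrix and rebuild the DataFrame by hand.
# Reusing data.index is only valid if no rows were dropped (e.g. for NaNs).
X = dmatrix("x1 + x2", data)
df2 = pd.DataFrame(np.asarray(X), columns=X.design_info.column_names,
                   index=data.index)
```

Both work, but the second one requires me to carry the original index around myself, which is the part a `to_dataframe()` method could take care of.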

To do this right, the new method would be factored out of build_design_matrices. Roughly speaking, it would look like this:

``` python
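def to_dataframe(self):
    """Return this design matrix as a pandas.DataFrame (rough sketch)."""
    if not have_pandas:
        raise PatsyError("pandas.DataFrame was requested, but "
                         "pandas is not installed")
    di = self.design_info
    # pandas_index does not exist yet; it is the proposed new attribute.
    df = pandas.DataFrame(self, columns=di.column_names,
                          index=di.pandas_index)
    # Attach the design info, as return_type="dataframe" does today.
    df.design_info = di
    return df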
```

The main design change would be that DesignInfo (or DesignMatrix) would need to gain a `pandas_index` attribute, which would keep track of any index from the original data.
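To illustrate what tracking the index might involve, here is a toy sketch. `IndexedMatrix` and `build` are hypothetical stand-ins that I made up for this example, not patsy classes or functions; the point is only that the surviving index has to be recorded at build time, since rows with missing values may be dropped.

``` python
import numpy as np
import pandas as pd


class IndexedMatrix:
    """Hypothetical stand-in for DesignMatrix + DesignInfo (not patsy's API)."""

    def __init__(self, values, column_names, pandas_index=None):
        self.values = np.asarray(values, dtype=float)
        self.column_names = list(column_names)
        # None when the original data carried no pandas index.
        self.pandas_index = pandas_index

    def to_dataframe(self):
        # With index=None, pandas falls back to a default RangeIndex.
        return pd.DataFrame(self.values, columns=self.column_names,
                            index=self.pandas_index)


def build(data, columns):
    """Toy builder: drop rows with NaNs and remember the surviving index."""
    kept = data[columns].dropna()
    return IndexedMatrix(kept.to_numpy(), columns, pandas_index=kept.index)


data = pd.DataFrame({"x1": [1.0, np.nan, 3.0], "x2": [4.0, 5.0, 6.0]},
                    index=["a", "b", "c"])
X = build(data, ["x1", "x2"])
df = X.to_dataframe()  # rows "a" and "c" only; "b" was dropped
```

Whether the attribute belongs on DesignInfo or on DesignMatrix is still open; the sketch stores it next to the values simply because the index is a property of one particular matrix.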

If this seems reasonable, I could put together a pull request.


DesignMatrix should have to_dataframe() method #30
