Patsy prints uninformative error message when user places "Intercept" in a formula #14

Open
setjmp opened this issue Feb 4, 2013 · 1 comment

setjmp commented Feb 4, 2013

This is observed on patsy 0.1.0.

I saw that the design_info object of a design matrix uses "Intercept" as the name of the intercept term, so I wondered what would happen if a programmer chose this as the name for a feature.

The ideal scenario is that patsy either:
a. does some name mangling, or
b. throws an error telling me exactly what I did wrong, if this is not going to be supported.
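
Option (b) could look roughly like the sketch below. It assumes the failure comes from two columns both ending up with the name "Intercept"; I have not confirmed that this is what the assertion in design_info.py checks. `check_unique_column_names` is a hypothetical helper, not an existing patsy function.

```python
from collections import Counter


def check_unique_column_names(column_names):
    """Return the names as a list, or raise a descriptive error on duplicates.

    Hypothetical helper for illustration only; it is not part of patsy.
    """
    names = list(column_names)
    counts = Counter(names)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        listed = ", ".join(repr(name) for name in duplicates)
        raise ValueError(
            f"duplicate design matrix column name(s): {listed}. "
            "A variable in the formula may share its name with a column "
            "that is generated automatically (for example the intercept); "
            "rename the variable and try again."
        )
    return names
```

With a check like this, the formula "sl ~ Intercept" would fail with a message that names the clashing column instead of a bare AssertionError.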

What actually happens is that an uninformative AssertionError is raised:

Traceback (most recent call last):
  File "failure.py", line 5, in <module>
    y,X = patsy.dmatrices("sl ~ Intercept",dataFrame)
  File "build/bdist.macosx-10.8-intel/egg/patsy/highlevel.py", line 283, in dmatrices
  File "build/bdist.macosx-10.8-intel/egg/patsy/highlevel.py", line 150, in _do_highlevel_design
  File "build/bdist.macosx-10.8-intel/egg/patsy/build.py", line 860, in build_design_matrices
  File "build/bdist.macosx-10.8-intel/egg/patsy/build.py", line 776, in _build
  File "build/bdist.macosx-10.8-intel/egg/patsy/build.py", line 757, in design_info
  File "build/bdist.macosx-10.8-intel/egg/patsy/design_info.py", line 78, in __init__
AssertionError

Here is the code that produces the error:

import pandas
import patsy

dataFrame = pandas.io.parsers.read_csv("salary2.txt") 
y,X = patsy.dmatrices("sl ~ Intercept",dataFrame) 

njsmith commented Feb 4, 2013

Oo, sneaky.

Yeah, we should probably do some name mangling, since the same thing could happen when people create custom factor objects. Maybe I can re-use the name mangling for the automatic name creation (i.e. just say that unnamed columns are called "x" and then let the name mangler turn that into "x1", "x2", ...).
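
A minimal sketch of that kind of name mangling, assuming repeated names simply get a numeric suffix and names that occur once are left alone. `mangle_names` is a hypothetical function, not anything patsy provides, and whether the real intercept column should keep the plain name "Intercept" is left open here.

```python
from collections import Counter


def mangle_names(names):
    """Make repeated names unique by appending 1, 2, ... to each occurrence.

    Hypothetical helper for illustration only; it is not part of patsy.
    Names that occur exactly once are returned unchanged.
    """
    names = list(names)
    counts = Counter(names)
    # Names already spoken for: every name that occurs once keeps its spelling.
    taken = {name for name, n in counts.items() if n == 1}
    next_suffix = {}
    result = []
    for name in names:
        if counts[name] == 1:
            result.append(name)
            continue
        k = next_suffix.get(name, 1)
        candidate = f"{name}{k}"
        # Skip suffixes that would collide with a name already in use.
        while candidate in taken:
            k += 1
            candidate = f"{name}{k}"
        taken.add(candidate)
        next_suffix[name] = k + 1
        result.append(candidate)
    return result
```

Calling every unnamed column "x" and passing the list through this function gives "x1", "x2", and so on, and the same pass would separate two columns that both claim the name "Intercept".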
