jamesrkg changed the title from "Creating derived multiple_response with explicit subvariables/categories" to "Creating derived multiple_response with control over base" on Jan 17, 2018
@xbito @jjdelc can we talk about this one? It prevents us from using derived variables in a lot of the situations that most warrant it. However, I'm not sure whether this is an issue that scrunch can solve on its own.
This ticket is the result of testing for ways around the behavior of `ds.create_categorical(..., multiple=True)` described in #196. To summarize briefly, that behavior is that the resulting variable always has the full base of the entire dataset, regardless of the way it is constructed, the way the case statements are written, or the way in which the source variable itself is populated.

The categories given to a new `multiple_response` using this method are:

The result always having a base-all is due to both of these categories being defined with `missing=False`.

It's possible to get a non-all base on the new `multiple_response` created this way by following up with:

However, this only gives the ability to get a base equal to the rows for which any of these subvariables are selected, meaning that it's not possible to have both some "Not selected" and others "Missing" at the same time.
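
To make the base behavior concrete, here is a minimal plain-Python model. It is not scrunch or Crunch code. It assumes the base of a `multiple_response` is the number of rows where at least one subvariable holds a non-missing category, and that the follow-up amounts to flagging "Not selected" as missing. The category ids and names, the row layout, and the `base` helper are all hypothetical.

```python
# Hypothetical model of how category "missing" flags drive the base of a
# multiple_response. This is not scrunch or Crunch code.

def base(rows, categories):
    """Count rows where at least one subvariable holds a non-missing category."""
    missing_ids = {c["id"] for c in categories if c["missing"]}
    return sum(any(value not in missing_ids for value in row) for row in rows)

# Each row holds one category id per subvariable
# (assumed ids: 1 = selected, 2 = not selected).
rows = [
    (1, 2),  # first subvariable selected
    (2, 1),  # second subvariable selected
    (2, 2),  # asked, nothing selected
    (2, 2),  # never asked, but indistinguishable from the row above
]

# Assumed current defaults: both categories are valid, so every row counts.
default_categories = [
    {"id": 1, "name": "Selected", "missing": False},
    {"id": 2, "name": "Not selected", "missing": False},
]

# Assumed workaround: "Not selected" flagged as missing, so only rows with
# at least one selection count toward the base.
workaround_categories = [
    {"id": 1, "name": "Selected", "missing": False},
    {"id": 2, "name": "Not selected", "missing": True},
]

base_all = base(rows, default_categories)
base_any_selected = base(rows, workaround_categories)
print(base_all, base_any_selected)  # 4 2
```

With only two categories, a respondent who was asked and selected nothing cannot be separated from one who was never asked: either both are in the base or neither is.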
It's necessary to `integrate` the variable first because we can't yet edit `missing` on derived variables (see #148).

There are a few things that might need to be done to address this.
1. Firstly, the default categories given to a new `multiple_response` using this method should probably be:

   This will protect the ability to make the important distinction between "Not selected" and "Missing" when necessary.
2. `ds.create_categorical(..., multiple=True)` should become the 'simple' use case version of this request where the:
   1
   2
   -1
3. A new method for fully explicit control over the new variable is provided. API for this to be discussed/defined.
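
As a rough illustration of what the first item is after, here is a minimal plain-Python sketch. It is not scrunch or Crunch code. Only the ids 1, 2 and -1 appear in this ticket; the category names, the "No Data" category being flagged missing, and the `base` and `count` helpers are assumptions made for the example.

```python
# Hypothetical model, not scrunch or Crunch code. Category names and the
# missing "No Data" category are assumptions; only the ids 1, 2, -1 come
# from the ticket.

PROPOSED_CATEGORIES = [
    {"id": 1, "name": "Selected", "missing": False},
    {"id": 2, "name": "Not selected", "missing": False},
    {"id": -1, "name": "No Data", "missing": True},
]

def base(rows, categories):
    """Count rows where at least one subvariable holds a non-missing category."""
    missing_ids = {c["id"] for c in categories if c["missing"]}
    return sum(any(value not in missing_ids for value in row) for row in rows)

def count(rows, category_id):
    """Count rows where every subvariable holds the given category id."""
    return sum(all(value == category_id for value in row) for row in rows)

# Each row holds one category id per subvariable.
rows = [
    (1, 2),    # asked, first subvariable selected
    (2, 2),    # asked, nothing selected: stays in the base
    (2, 2),    # asked, nothing selected: stays in the base
    (-1, -1),  # not asked: drops out of the base
]

proposed_base = base(rows, PROPOSED_CATEGORIES)
not_selected_rows = count(rows, 2)
missing_rows = count(rows, -1)
print(proposed_base, not_selected_rows, missing_rows)  # 3 2 1
```

Under these assumptions, "Not selected" rows stay in the base while "Missing" rows fall out of it, which is the distinction the current two-category default cannot express.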