Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No slugify of multiselect answers #181

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 5 additions & 9 deletions survey/forms.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@
from django.conf import settings
from django.forms import models
from django.urls import reverse
from django.utils.text import slugify

from survey.models import Answer, Category, Question, Response, Survey
from survey.signals import survey_completed
Expand Down Expand Up @@ -164,18 +163,15 @@ def get_question_initial(self, question, data):
if answer:
# Initialize the field with values from the database if any
if question.type == Question.SELECT_MULTIPLE:
initial = []
if answer.body == "[]":
pass
elif "[" in answer.body and "]" in answer.body:
Copy link
Author

@johanneswilm johanneswilm Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pierre-Sassoulas I don't fully understand what the idea was here. It looks to me like a "]" and a "[" anywhere in the answer.body would cause this to be initiated for example ("Subtitled [English]"). Maybe that's why you decided to slugify? To make sure there wouldn't be any "[" nor "]" in there?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, slugifying means that unique choices can be come none unique. If choices are something like "<40" and ">40" then both are translated into "40"

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, I remember this was a bit hacky, we're serializing / deserializing a list of answers here.

initial = []
unformated_choices = answer.body[1:-1].strip()
for unformated_choice in unformated_choices.split(settings.CHOICES_SEPARATOR):
choice = unformated_choice.split("'")[1]
initial.append(slugify(choice))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I was wondering why the contents of answer.body are not just JSON or some other standardized method. That way one doesn't run risks of this mechanism breaking as one can fall back to existing json deserialization methods.

It's not quite json as of now because it's using single quotation marks rather than double quotation marks. But it's already quite close.

My code below doesn't change that but I think it's slightly more robust in some cases.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small sql database, so I suppose I wanted to avoid dumping a json.

Copy link
Owner

@Pierre-Sassoulas Pierre-Sassoulas Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A standardized method (possibily json) would indeed be better than this custom parsing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pierre-Sassoulas Ok, but it wouldn't take up any more space in the database. Instead of

['one', 'two', 'three']

in JSON that would simple be:

["one", "two", "three"]

An advantage would be that it would automatically be able to read things like this as well:

["'one'", "t,w,o", "[three]"]

I think my code below can do most of that as well though. So maybe it's not needed.

I spent a day last week cleaning up a ~90k row dump turning everything back from the slugified version to the original options because loaddata wouldn't accept the data before realizing that I needed to override this code as well to actually make the original data work. That's the reason I raise this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like it's going to be hard to make a migration from slugified to the new value (as there is less information in slugified answer than the actual answers) so putting that in a breaking change for version 2.0.0 and removing all the migrations files seems in order, what do you think ?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe create a "best effort" migration file ?

Copy link
Author

@johanneswilm johanneswilm Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to make a data migration file in which one takes all the options, converts them using slugify and that way make a pretty good guess as to which of the original options is the right one. Two cases where I don't think that's possible:

  • If the options have changed after some respondents answered. The slugified version will either correspond to none of the current choices or may even correspond to an incorrect one.
  • Cases where two or more options turn into the same slugified value.

So yeah, probably major version change and some way to gracefully handle the two above situations telling users that "sorry, some data was lost and we can no longer figure out what your respondents actually answered".

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good !

elif answer.body[0] == "[" and answer.body[-1] == "]":
initial = [
choice.strip(" '") for choice in answer.body.strip("[]").split(settings.CHOICES_SEPARATOR)
]
else:
# Only one element
initial.append(slugify(answer.body))
initial = [answer.body]
else:
initial = answer.body
if data:
Expand Down
2 changes: 1 addition & 1 deletion survey/models/question.py
Original file line number Diff line number Diff line change
Expand Up @@ -372,7 +372,7 @@ def get_choices(self):
"""
choices_list = []
for choice in self.get_clean_choices():
choices_list.append((slugify(choice, allow_unicode=True), choice))
choices_list.append((choice, choice))
choices_tuple = tuple(choices_list)
return choices_tuple

Expand Down