
Fix byte/str handling for python3#2

Open
ndevenish wants to merge 6 commits into main from py3_str

ndevenish commented May 14, 2021

These are the patches from dials/cbflib#19.

Problem: All char * fields, both input[1] and output, are mapped to str. This works on Python 2, but on Python 3 it means that binary data comes back encoded in a string and has to be converted via encode('utf-8', errors='surrogateescape') (see SWIG docs). This can be worked around by setting SWIG_PYTHON_STRICT_BYTE_CHAR in the build, but then all char * fields are bytes, meaning that you need to encode any strings that you pass into pycbf (see e.g. cctbx/dxtbx@44b6d72).
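To illustrate the round trip described above, here is a minimal pure-Python sketch. It does not call pycbf or SWIG; it only simulates the default mapping, assuming (per the SWIG docs) that a returned char * buffer is decoded as UTF-8 with the surrogateescape error handler. The payload bytes are made up for the example.

```python
# Simulated binary payload, as it might come back from a char * data field.
# (Arbitrary example bytes, not real CBF data.)
raw = bytes([0x00, 0x7F, 0x80, 0xFF, 0x41])

# Assumption: the default SWIG mapping on Python 3 behaves like this decode.
# Bytes that are not valid UTF-8 become lone surrogate code points.
as_str = raw.decode("utf-8", errors="surrogateescape")

# Recovering the original bytes needs the same error handler on encode.
recovered = as_str.encode("utf-8", errors="surrogateescape")

# A plain encode fails, because lone surrogates are not valid UTF-8.
try:
    as_str.encode("utf-8")
    plain_encode_ok = True
except UnicodeEncodeError:
    plain_encode_ok = False
```

The point is that callers receiving such a str cannot treat it as ordinary text: every consumer of binary data would have to remember the surrogateescape step.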

This commit fixes this. When using Python 3, functions that return data will return it as a Python bytes object, and those that accept data will accept a Python bytes object. Everything else will accept str.

[1] Input field mapping is mostly solved by 647ffcb, which means that anything that accepts an explicit string argument will accept both string and bytes. However, we don't want data fields to accept strings - so input handling is still important.
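The intended input contract can be sketched in plain Python. This is not the SWIG typemap code from these commits; `as_c_string` and `as_c_data` are hypothetical helper names used only to show the distinction between string arguments (str or bytes accepted) and data arguments (bytes only). Encoding str as UTF-8 is an assumption of this sketch.

```python
def as_c_string(value):
    """Hypothetical helper: coerce a textual argument to bytes.

    Mirrors the dual handling described for string arguments:
    both str and bytes are accepted.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        # Assumption: text arguments are encoded as UTF-8.
        return value.encode("utf-8")
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def as_c_data(value):
    """Hypothetical helper: accept binary data only as bytes.

    A str is rejected rather than silently encoded, so text cannot be
    passed by accident where a binary buffer is expected.
    """
    if isinstance(value, bytes):
        return value
    raise TypeError("expected bytes, got %s" % type(value).__name__)


def rejects_str_as_data():
    """Return True if passing str as data raises TypeError."""
    try:
        as_c_data("not binary")
    except TypeError:
        return True
    return False
```

With this split, `as_c_string("name")` and `as_c_string(b"name")` give the same bytes, while `as_c_data("...")` raises `TypeError`.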

  • Need to check this strictly doesn't conflict with the byte/string dual handling from 647ffcb
