C extension (speeds up simpleion dump/load) #149
amazon/ion/ioncmodule.c
Outdated
IONCHECK(c_string_from_py(decimal_str, &decimal_c_str, &decimal_c_str_len));

ION_DECIMAL decimal_value;
IONCHECK(ion_decimal_from_string(&decimal_value, decimal_c_str, &dec_context));
Replaced the previous decQuadFromString with ion_decimal_from_string.
Refer to here for previous code.
Same question here as with int. Can we use math to translate between the types instead of roundtripping via string?
As discussed offline, we only support 34 decimal digits. There is a test with more than 34 digits that throws a NUMBER_FLOW exception.
_IT.DECIMAL: (
(_D('1.1999999999999999555910790149937383830547332763671875e0'), # <- value
b'1.1999999999999999555910790149937383830547332763671875')) # <- expected value
dump/write decimal function here.
load/read decimal function here.
The Travis build failed because of the wheel build. Will look into this later.
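To make the 34-digit limit concrete, here is a minimal Python sketch that counts the coefficient digits of the test value above. It assumes the limit comes from the decQuad (IEEE 754 decimal128) format, whose precision is 34 digits; the helper names coefficient_digits and fits_in_decquad are hypothetical and are not part of ion-python.

```python
from decimal import Decimal

# Precision of decQuad / IEEE 754 decimal128, assumed to be the source of the limit.
DECQUAD_MAX_DIGITS = 34


def coefficient_digits(value: Decimal) -> int:
    """Return the number of digits in the coefficient of a finite Decimal."""
    return len(value.as_tuple().digits)


def fits_in_decquad(value: Decimal) -> bool:
    """Return True if the coefficient fits in 34 digits (hypothetical helper)."""
    return coefficient_digits(value) <= DECQUAD_MAX_DIGITS


# The value from the failing test case. The Decimal constructor keeps every
# digit, regardless of the current context precision.
test_value = Decimal('1.1999999999999999555910790149937383830547332763671875e0')
print(coefficient_digits(test_value), fits_in_decquad(test_value))
```

A check like this could be used to decide up front whether a value needs a path that is not limited to 34 digits, instead of relying on the overflow error.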
It's coming along well. I'm excited for the performance improvements.
Once this review is finished, we should read/write some large sample files and try to detect memory leaks using, e.g. Valgrind or whatever the best tool is for memory analysis with Python C extensions. The manual memory management in the C extension is tricky and it's possible we missed something.
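As a lightweight complement to Valgrind, a rough first check can be done from Python with the standard tracemalloc module. The sketch below is an assumption about how such a check might look, not something from this PR: traced_growth is a hypothetical helper, and the two round-trip functions are stand-ins for repeated simpleion dump/load calls. Note that tracemalloc only sees memory allocated through Python's allocators, so memory allocated directly by ion-c would still need a native tool.

```python
import gc
import tracemalloc


def traced_growth(func, rounds=50):
    """Return the net bytes of traced memory retained after calling func repeatedly."""
    gc.collect()
    tracemalloc.start()
    try:
        func()  # warm-up call so one-time caches are not counted
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(rounds):
            func()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


_retained = []


def leaky_roundtrip():
    # Stand-in for a dump/load round trip that keeps a reference alive.
    _retained.append(bytearray(10_000))


def clean_roundtrip():
    # Stand-in for a round trip that releases everything it allocates.
    bytearray(10_000)


print(traced_growth(leaky_roundtrip))
print(traced_growth(clean_roundtrip))
```

Steady growth across rounds for the real dump/load calls would suggest a missing reference decrement on the Python side; flat numbers here do not rule out leaks inside the C library.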
1. What I did:
2. After finishing automation part:
Let me know if anything needs to be prioritized!
amazon/ion/ioncmodule.c
Outdated
if (PyObject_RichCompareBool(temp, PyLong_FromLong(0), Py_LT) == 1) {
    ion_int_value._signum = -1;
    temp = PyNumber_Negative(temp);
} else if (PyObject_RichCompareBool(temp, PyLong_FromLong(0), Py_GT) == 1) {
Should PyLong_FromLong(0) be stored in a variable to avoid recomputing it? Or is 0 a special case that requires no computation?
Changed.
amazon/ion/ioncmodule.c
Outdated
// Python equivalence: digit = temp / 2^31, temp = temp % 2^31
res = PyNumber_Divmod(temp, pow_value);
py_digit = PyNumber_Long(PyTuple_GetItem(res, 0));
py_reminder = PyTuple_GetItem(res, 1);
py_reminder -> py_remainder
Changed.
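For readers following the integer conversion in this thread, here is a pure-Python sketch of the idea behind the divmod snippet: record the sign separately, then split the magnitude into base 2^31 digits. It assumes digits are stored most significant first and that zero is a single zero digit; neither detail is confirmed by the snippets on this page, and the actual loop in ioncmodule.c may be organized differently. The function names are hypothetical.

```python
BITS_PER_DIGIT = 31
BASE = 1 << BITS_PER_DIGIT  # 2**31, the base named in the C comment


def to_sign_and_digits(value: int):
    """Split an int into (signum, digits), digits in base 2**31, most significant first."""
    if value < 0:
        signum, magnitude = -1, -value
    elif value > 0:
        signum, magnitude = 1, value
    else:
        return 0, [0]  # assumed representation of zero
    digits = []
    while magnitude:
        # divmod yields the quotient (what is left) and the remainder (next digit).
        magnitude, digit = divmod(magnitude, BASE)
        digits.append(digit)
    digits.reverse()  # extracted least significant first, so flip the order
    return signum, digits


def from_sign_and_digits(signum: int, digits) -> int:
    """Inverse of to_sign_and_digits, used here to check the round trip."""
    magnitude = 0
    for digit in digits:
        magnitude = magnitude * BASE + digit
    return signum * magnitude


print(to_sign_and_digits(-(2 ** 31)))
```

Working on the magnitude and the sign separately mirrors the C snippet, which sets _signum and negates temp before extracting digits.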
Summary:
Looks good. Do you know why the Travis builds are failing?
highlights:
The Travis build failed due to known unit test failures; I manually checked that they are all equivalent. As discussed offline, I'll push it to
Description:
This PR adds the C extension (both the code and the configuration).
The distribution part may change, but I'll create separate PRs for that as needed.
I added comments for myself based on the feedback from last time (e.g. don't use force-python-impl, check the C skip list, etc.).
Suggested Review Order
The C extension consists of three parts: simpleion dump/load, the C extension (e.g. on Mac it's ionc.so), and the ion-c binaries. We will go through each of them below.

C extension --- setup.py:
Highlights:
We add the C extension information here; setup.py will build the C extension during the build process. We can also build the extension manually with python setup.py build_ext.

Ion-c binaries --- install.py:
Then we look into install.py, where we prepare the ion-c binaries and headers. It starts from here.
Highlights:
ionc and decNumber into ion-c-build.
4. Copy ion-c-build to amazon/ion
for distribution.

Python --- simpleion.py:
Now we look into simpleion.py and how we connect ion-python to the C extension.
Highlights:
force_python_impl, I renamed the previous dump/load to dump_original/load_original. The new dump method wraps dump_original and dump_extension (the C extension dump API). So does load.
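The wrapping described for dump and load can be pictured with a small Python sketch. This is an assumption about the shape of the dispatch, not the code in simpleion.py: make_dump is a hypothetical factory, the two callables passed to it are stand-ins for dump_original and dump_extension, and the meaning given to force_python_impl here (skip the C extension when true) is a guess based on its name.

```python
def make_dump(dump_original, dump_extension=None):
    """Build a dump() that prefers the C extension and falls back to pure Python.

    dump_extension is None when the C extension could not be imported.
    """
    def dump(obj, *args, force_python_impl=False, **kwargs):
        if dump_extension is not None and not force_python_impl:
            return dump_extension(obj, *args, **kwargs)
        return dump_original(obj, *args, **kwargs)
    return dump


# Stand-ins that only report which implementation handled the call.
def fake_dump_original(obj):
    return ('python', obj)


def fake_dump_extension(obj):
    return ('c', obj)


dump = make_dump(fake_dump_original, fake_dump_extension)
print(dump({'a': 1}))
print(dump({'a': 1}, force_python_impl=True))
```

Keeping the original implementation reachable under its own name also makes it easy to compare both paths in tests.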

Distribute all required files --- MANIFEST.in:
After preparing all three parts, the next step is letting PyPI know which files we are going to distribute.
We added some required files to MANIFEST.in for distribution: the ion-c binaries in ion-c-build, the _ioncmodule.h header file for the C module, and the install script install.py. Now when we distribute ion-python, both the C extension and ion-c will be included.

Next, let's look at the ion-c module and its tests:
ion-c module --- ioncmodule.c and _ioncmodule.h
For dump (write), it starts from here. This method initializes the required variables and parses the Python arguments into C types. It eventually calls ionc_write_value to write the value. ionc_write_value checks the type of the value and uses the corresponding ion-c API to write it.
Likewise for load, it starts from here and eventually calls ionc_read_value to read files.

After seeing what the C module looks like, we move on to the unit tests:
tests/test_cookbook.py
tests/test_simpleion.py
tests/test_vectors.py
Refer to comments for more details.
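As a rough illustration of the type check that ionc_write_value is described as doing in the ion-c module section, here is a Python sketch that picks an Ion type name from the Python type of a value. It is a simplified assumption, not the logic in ioncmodule.c: ion_type_for is a hypothetical function, it returns a label instead of calling an ion-c writer API, and it ignores annotations, symbols, timestamps, and other Ion types.

```python
from decimal import Decimal


def ion_type_for(value):
    """Return the Ion type name a writer would choose for a Python value (sketch)."""
    if value is None:
        return 'null'
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return 'bool'
    if isinstance(value, int):
        return 'int'
    if isinstance(value, float):
        return 'float'
    if isinstance(value, Decimal):
        return 'decimal'
    if isinstance(value, str):
        return 'string'
    if isinstance(value, (bytes, bytearray)):
        return 'blob'
    if isinstance(value, (list, tuple)):
        return 'list'
    if isinstance(value, dict):
        return 'struct'
    raise TypeError('unsupported type: %s' % type(value).__name__)


for sample in (None, True, 7, 1.5, Decimal('1.5'), 'x', b'x', [1], {'a': 1}):
    print(type(sample).__name__, '->', ion_type_for(sample))
```

The ordering matters in the same way it would in C: the more specific check (bool) has to come before the more general one (int), otherwise True would be written as the integer 1.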
The last file, .github/workflows/python-publish.yml, is minor and no longer needs review; I'm going to find another way to automate distribution.
I'll temporarily keep this script until we find a way to distribute.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.