Error from astype() on StringArray and inconsistencies with zeros_like() #199
Comments
I suppose I can use
Converting strings that say
>>> numpy.array(["True", "False", "True", "True", "False"]).astype(bool)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'True'
>>> numpy.array(["true", "false", "true", "true", "false"]).astype(bool)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'true'
In Awkward, a
Right, the numpy thing is a bug, which is mentioned in the issue that I linked. I think the proper thing to do is to handle conversion the same way that Python does natively. The specific
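For reference, here is a minimal sketch of what "the same way that Python does natively" means for strings: only the empty string is falsy, regardless of what the text says. The list `strings` and the name `mask` are made up for illustration and are not part of awkward or numpy.

```python
# Python's native truthiness rule for str: empty is False, anything else is True.
# Note that the *content* is ignored, so "False" and "\x00" are both truthy.
strings = ["True", "False", "", "true", "\x00"]

mask = [bool(s) for s in strings]
print(mask)  # [True, True, False, True, True]
```

So a conversion that follows Python's rule answers "is this string non-empty?", not "does this string spell a true value?".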
I don't think there's even a (Actually, I only implemented As for returning
>>> a = awkward.fromiter(["True", "", "False", "", "", "True"])
>>> a
<StringArray ['True' '' 'False' '' '' 'True'] at 0x7f8cc3ce98d0>
>>> a.counts
array([4, 0, 5, 0, 0, 4])
>>> a.counts.astype(bool)
array([ True, False, True, False, False, True])
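The same length-based trick can be sketched in plain NumPy, as a rough analogue only: `numpy.char.str_len` stands in here for the `counts` attribute of the awkward `StringArray`, and the variable names are illustrative assumptions.

```python
import numpy as np

# Same strings as in the awkward example above, but as a plain NumPy string array.
words = np.array(["True", "", "False", "", "", "True"])

# Per-element string lengths; plays the role of `a.counts` for a StringArray.
lengths = np.char.str_len(words)

# Integer-to-bool casting is well defined: 0 -> False, non-zero -> True.
mask = lengths.astype(bool)

print(lengths.tolist())  # [4, 0, 5, 0, 0, 4]
print(mask.tolist())     # [True, False, True, False, False, True]
```

Casting the lengths rather than the strings themselves sidesteps the string-to-bool conversion entirely.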
Well, what I originally wanted was an array with the same shape as the string array but every entry filled with (
Right, and your example (with
My use case: I need to be able to make a mask for a JaggedArray containing strings, starting with something like this:
but this fails on a couple different levels. The first is that StringArray seems to have a problem with astype():
Independently, zeros_like() has some problematic behavior on StringArray as well:
My issue with this is that a string of null bytes actually evaluates to True and can't even be directly converted to a number:
For comparison, numpy's zeros_like() converts strings to empty strings:
Empty strings do convert to False (i.e., bool('') is False).
As an aside, astype(bool) oddly doesn't actually work on this ndarray:
But the following does work (and unfortunately doesn't have an equivalent in awkward as far as I'm aware):
Edit: Turns out this known problem in numpy has been sitting around for a couple years: numpy/numpy#9875
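The points about null bytes and empty strings can be checked with a short sketch in plain Python and NumPy. This is not a reconstruction of the snippets originally posted in this issue; the array `strings` and the other names are assumptions chosen for illustration, and nothing here calls awkward.

```python
import numpy as np

# A string of null bytes is non-empty, so Python treats it as True...
null_string = "\x00\x00\x00"
null_is_truthy = bool(null_string)

# ...and it cannot be parsed as a number.
try:
    int(null_string)
    null_parses_as_int = True
except ValueError:
    null_parses_as_int = False

# NumPy's zeros_like() on a string array yields empty strings, which are falsy.
strings = np.array(["abc", "de", ""])
zeros = np.zeros_like(strings)

# Two ways to get a boolean mask without casting strings to bool:
all_false = np.zeros(strings.shape, dtype=bool)  # same shape, every entry False
nonempty = strings != ""                         # True where the string is non-empty

print(null_is_truthy, null_parses_as_int)  # True False
print(zeros.tolist())                      # ['', '', '']
print(all_false.tolist())                  # [False, False, False]
print(nonempty.tolist())                   # [True, True, False]
```

Building the mask from the shape or from a comparison avoids depending on how a particular NumPy version casts strings to bool.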