Idea/Discussion - Applying a similar technique to dictionaries and their keys? #87

Open
aggieNick02 opened this issue Jun 14, 2023 · 2 comments

@aggieNick02

One common lesson learned for a Python dev using json.dumps followed by json.loads is that JSON requires string keys, so numeric keys become strings after a serialize/deserialize cycle.

It's interesting to think about applying the same technique pyjson_tricks uses to support so many data types to dictionaries and their keys. A Python dict could become a special JSON object with a __dict__ magic key, along with an entry holding an array of [key, value] arrays for the key/value pairs. Then the dictionary keys would be able to survive a serialize/deserialize cycle with their type intact.

For example, right now we have this behavior:

>>> a={1:2,3:4}
>>> json_tricks.dumps(a)
'{"1": 2, "3": 4}'
>>> json_tricks.loads(json_tricks.dumps(a))
OrderedDict([('1', 2), ('3', 4)])

Instead we could have this, with keys preserved:

>>> a={1:2,3:4}
>>> json_tricks.dumps(a)
'{"__dict__": null, "keys_and_values": [[1, 2], [3, 4]]}'
>>> json_tricks.loads(json_tricks.dumps(a))
OrderedDict([(1, 2), (3, 4)])

pyjson_tricks lets you do a non-destructive dumps/loads cycle on a much larger set of types than regular json while avoiding the security and readability drawbacks of pickle. Having it be able to do that for dictionaries with non-string keys would increase that set of types significantly.
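To make the idea concrete, here is a minimal sketch of the round trip using only the standard json module, with the conversion done as a pre-processing step before dumps and an object_hook on loads. This is not how json_tricks works; the names to_jsonable and restore_dict are hypothetical, and the "__dict__" / "keys_and_values" layout is simply the one proposed above.

```python
import json

MARKER = "__dict__"        # magic key from the proposal above
PAIRS = "keys_and_values"  # entry holding the [key, value] pairs


def to_jsonable(obj):
    """Recursively replace dicts that have non-string keys with a tagged pair list."""
    if isinstance(obj, dict):
        if all(isinstance(k, str) for k in obj):
            # Plain string-keyed dicts stay ordinary JSON objects.
            return {k: to_jsonable(v) for k, v in obj.items()}
        pairs = [[to_jsonable(k), to_jsonable(v)] for k, v in obj.items()]
        return {MARKER: None, PAIRS: pairs}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item) for item in obj]
    return obj


def restore_dict(d):
    """object_hook for json.loads: rebuild a dict from a tagged pair list."""
    if MARKER in d and PAIRS in d:
        return {k: v for k, v in d[PAIRS]}
    return d


original = {1: 2, 3: 4, "nested": {2.5: [True, None]}}
text = json.dumps(to_jsonable(original))
restored = json.loads(text, object_hook=restore_dict)
print(restored == original)
```

This only covers keys that JSON can represent as hashable scalars (int, float, bool, None, str). A tuple key would come back as a list and fail on restore, so a real implementation would need its own tagging for those.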

@mverleg
Owner

mverleg commented Jun 14, 2023

It's a nice idea but I don't think it's going to work, because I don't think custom encoders get called when the object is already a dict.

from json_tricks import dumps, loads


def dict_non_str_key_encoder(obj, **kwargs):
	if isinstance(obj, dict):
		# this is where the conversion code would go, but it is never reached
		raise Exception("raise if dict detected")
	return obj


def test_dict_hook():
	data = dict()
	data[1] = 2
	# named json_str to avoid shadowing the standard json module
	json_str = dumps(data, extra_obj_encoders=(dict_non_str_key_encoder,))
	bck = loads(json_str)
	# fails: the encoder never saw the dict, so the int key came back as a string
	assert data == bck

# {1: 2} != OrderedDict([('1', 2)])

If there's a way around that then it might be a nice addition. Same if json-tricks ever switches to not wrapping the standard json encoders.

@aggieNick02
Author

Ah, thanks for explaining. I didn't realize the extension point (default) for JSONEncoder was only for things that otherwise can't be serialized. It seems like it could have been designed to be called for supported types as well, but oh well.
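A small illustration of that behavior with the standard json module (the helper name recording_default is made up for this example): the default hook is handed the set, which json cannot serialize on its own, but never the dict that contains it, so the int key is stringified without the hook getting a chance to intervene.

```python
import json

calls = []  # type names of every object handed to the default hook


def recording_default(obj):
    """Record what the encoder passes in, then make sets serializable."""
    calls.append(type(obj).__name__)
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"cannot serialize {type(obj).__name__}")


text = json.dumps({1: {2, 3}}, default=recording_default)
print(text)   # {"1": [2, 3]}
print(calls)  # ['set']
```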

I think one could monkey-patch JSONEncoder._iterencode_dict to create a new dict to encode whenever the keys are not all strings, but obviously that's brittle across Python versions, etc.
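A possibly less invasive variant, sketched here as an assumption rather than a tested fix for json_tricks: subclass json.JSONEncoder and rewrite the object before the built-in encoding starts, instead of patching the internals. PairDictEncoder and _tag_dicts are hypothetical names. The sketch leans on CPython's encode() delegating to iterencode() and on the underscore-prefixed _one_shot parameter, so it is not fully immune to version changes either.

```python
import json


def _tag_dicts(obj):
    """Recursively turn dicts with non-string keys into the tagged pair form."""
    if isinstance(obj, dict):
        if all(isinstance(k, str) for k in obj):
            return {k: _tag_dicts(v) for k, v in obj.items()}
        pairs = [[_tag_dicts(k), _tag_dicts(v)] for k, v in obj.items()]
        return {"__dict__": None, "keys_and_values": pairs}
    if isinstance(obj, (list, tuple)):
        return [_tag_dicts(item) for item in obj]
    return obj


class PairDictEncoder(json.JSONEncoder):
    """Hypothetical encoder that pre-processes the whole object tree once."""

    def iterencode(self, o, _one_shot=False):
        # In CPython, json.dumps reaches this via encode() and json.dump calls it directly.
        return super().iterencode(_tag_dicts(o), _one_shot=_one_shot)


text = json.dumps({1: 2, 3: 4}, cls=PairDictEncoder)
print(text)  # {"__dict__": null, "keys_and_values": [[1, 2], [3, 4]]}
```

Objects returned later by a default hook are not re-examined in this sketch, so a dict with non-string keys produced there would still have its keys stringified.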
