Skip to content

ENH: Enabling parsing ulonglong from json #44770

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

deponovo
Copy link
Contributor

@deponovo deponovo commented Dec 5, 2021

@pep8speaks
Copy link

pep8speaks commented Dec 5, 2021

Hello @deponovo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-12-10 12:01:29 UTC

@jbrockmendel
Copy link
Member

cc @WillAyd

@WillAyd
Copy link
Member

WillAyd commented Dec 6, 2021

I think ujson added this upstream. See

ultrajson/ultrajson@6430079

@deponovo I think better to just copy what they've already done down rather than have a new implementation

@deponovo
Copy link
Contributor Author

deponovo commented Dec 6, 2021

I think ujson added this upstream. See

ultrajson/ultrajson@6430079

@deponovo I think better to just copy what they've already done down rather than have a new implementation

No idea it was based on an external lib. That commit is even from 2014. I'll replace my implementation by the commit you mentioned.

@deponovo
Copy link
Contributor Author

deponovo commented Dec 6, 2021

@WillAyd Pushed mods according to the requests. Now that I notice that pandas' ujson is based on an external lib, one might even consider updating the whole local code with the latest stable version of ujson.

@deponovo
Copy link
Contributor Author

deponovo commented Dec 7, 2021

@WillAyd I just tested replacing the whole ujson base code with the latest version. The tests have the same result as per this PR. Should we instead perform that upgrade? I would only push after a positive answer.

@WillAyd
Copy link
Member

WillAyd commented Dec 7, 2021

I think we have some extensions in our code base that would be lost with a blanket update. If you wanted to do that its probably worth a dedicated issue and providing an analysis of the diff to see what might be changing.

For the scope of this PR, I think you are getting build failures from the definitions of DECLARE_NAN and DECLARE_INF that aren't being used

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking reasonable. I think the failing test needs to be updated

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions IO JSON read_json, to_json, json_normalize labels Dec 8, 2021
@WillAyd
Copy link
Member

WillAyd commented Dec 8, 2021

@deponovo this PR looks good to me. Can you add a whatsnew note so users are aware of the enhancement?

@jreback over to you

@jreback jreback added this to the 1.4 milestone Dec 9, 2021
@deponovo
Copy link
Contributor Author

deponovo commented Dec 9, 2021

@deponovo this PR looks good to me. Can you add a whatsnew note so users are aware of the enhancement?

I am confused to how I should add an entry here. Should I had my text on v1.4.0? Not sure I fully understand the (reference)[https://pandas.pydata.org/pandas-docs/stable/development/contributing_codebase.html#documenting-your-code] help.

@WillAyd
Copy link
Member

WillAyd commented Dec 9, 2021

@deponovo this PR looks good to me. Can you add a whatsnew note so users are aware of the enhancement?

I am confused to how I should add an entry here. Should I had my text on v1.4.0? Not sure I fully understand the (reference)[https://pandas.pydata.org/pandas-docs/stable/development/contributing_codebase.html#documenting-your-code] help.

Yea 1.4 looks right. Should be good in the “Other Enhancements” section

https://github.com/pandas-dev/pandas/blob/master/doc/source/whatsnew/v1.4.0.rst#other-enhancements

@jreback jreback merged commit 6b9e93a into pandas-dev:master Dec 10, 2021
@jreback
Copy link
Contributor

jreback commented Dec 10, 2021

thanks @deponovo very nice!

would certainly take an update to the ujson vendored code.

@jbrockmendel
Copy link
Member

@deponovo any interest in porting more more improvements from upstream? in particular in ultrajsondec.c is looks like some overflow-handling improvements have been made.

@deponovo
Copy link
Contributor Author

@jbrockmendel @jreback I already started with it. But there are some pandas specific code injections, so I can just swap code blindfolded. For now time is lacking. But I am working on it.

@deponovo deponovo deleted the enabling_parsing_ulonglong_from_json branch December 29, 2021 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions IO JSON read_json, to_json, json_normalize
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read_json: ValueError: Value is too big
5 participants