read_json: ValueError: Value is too big #26068


Closed
jmh045000 opened this issue Apr 12, 2019 · 9 comments · Fixed by #44770
Labels: IO JSON (read_json, to_json, json_normalize)

Comments

@jmh045000 commented Apr 12, 2019

Reopening issue #14530; the close description is incorrect. The JSON specification explicitly leaves numeric limits to implementations rather than imposing any itself.

From https://tools.ietf.org/html/rfc7159#section-6

This specification allows implementations to set limits on the range and precision of numbers accepted

The standard json library in Python parses arbitrarily large integers, so the language itself handles JSON containing such values:

Python 3.6.8 (default, Apr  7 2019, 21:09:51)
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> j = """{"id": 9253674967913938907}"""
>>> json.loads(j)
{'id': 9253674967913938907}

Loading a JSON file with large integers (> 2^32) results in "Value is too big". I have tried changing the orient to "records" and also passing in dtype={'id': numpy.dtype('uint64')}; the error is the same.

import pandas
data = pandas.read_json('''{"id": 10254939386542155531}''')
print(data.describe())

Expected Output

                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1

Actual Output (even with dtype passed in)

 File "./parse_dispatch_table.py", line 34, in <module>
    print(pandas.read_json('''{"id": 10254939386542155531}''', dtype=dtype_conversions).describe())
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 234, in read_json
    date_unit).parse()
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 302, in parse
    self._parse_no_numpy()
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 519, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Value is too big

No problem using read_csv:

import pandas
import io
print(pandas.read_csv(io.StringIO('''id\n10254939386542155531''')).describe())

Output using read_csv

                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1

@WillAyd added the IO JSON (read_json, to_json, json_normalize) label on Apr 15, 2019
@artdgn commented Nov 7, 2019

Workaround:

A workaround that worked for me is to monkey-patch the loads function that read_json uses internally. You can patch it with the standard python json module or with simplejson. E.g. if you monkeypatch pandas like this:

import json
import simplejson
import pandas as pd
import pandas.io.json  # makes the _json submodule reachable for patching

# monkeypatch using the standard python json module
pd.io.json._json.loads = lambda s, *a, **kw: json.loads(s)

# or monkeypatch using the faster simplejson module
pd.io.json._json.loads = lambda s, *a, **kw: simplejson.loads(s)

# or normalise (unnest) at the same time (for nested JSON)
pd.io.json._json.loads = lambda s, *a, **kw: pd.io.json.json_normalize(simplejson.loads(s))

After this patch, read_json() should work on the example above (pandas.read_json('''{"id": 10254939386542155531}''', orient='index')) and on the JSON that was giving me trouble.

Suggestion for a future solution:

A possible solution for pandas is to provide a parameter through which one can override (inject) the loads function when calling read_json(), e.g. read_json(data, *args, loads_fn=None, **kwargs). If the maintainers think that's a viable solution, I'd be happy to submit a PR.
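For illustration, here is a minimal sketch of the idea from the caller's side. Note that loads_fn is hypothetical, not an existing read_json parameter; the wrapper below only approximates the proposed behavior by parsing first and handing the result to the DataFrame constructor:

import json
import pandas as pd

# Hypothetical future call, if the parameter existed:
# df = pd.read_json(data, loads_fn=json.loads)

# A thin wrapper approximating the same idea today:
def read_json_injected(data, loads_fn=json.loads):
    # Parse with an arbitrary-precision parser, then build the frame
    # from the resulting dict (big ints land in an object column).
    return pd.DataFrame(loads_fn(data))

print(read_json_injected('{"id": {"0": 10254939386542155531}}'))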

@mroeschke (Member)

Going to close this issue as I believe it's the same issue as #20599

@deponovo (Contributor) commented Nov 16, 2021

I don't think this issue is solved.

Test data (test.csv):

data1,data2
10254939386542155531,12764512476851234

Test code:

import pandas as pd
import json

df = pd.read_csv(r'test.csv')  # all fine
df
Out[45]: 
                  data1              data2
0  10254939386542155531  12764512476851234

df.dtypes
Out[46]: 
data1    uint64
data2     int64
dtype: object

pd.read_json(json.dumps(df.to_dict()))  # problem: raises ValueError

Results in error:

  File "<env_path>\lib\site-packages\pandas\io\json\_json.py", line 1140, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
ValueError: Value is too big

then:

json.loads(json.dumps(df.to_dict()))
Out[48]: {'data1': {'0': 10254939386542155531}, 'data2': {'0': 12764512476851234}}

This behavior is therefore inconsistent: the standard json module round-trips the value without complaint.
Also, 10254939386542155531 fits within uint64.

(Pandas 1.3.4, json 2.0.9, numpy 1.21.3)
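As a quick sanity check outside pandas (a sketch, assuming numpy is available):

import numpy as np

# uint64 tops out at 2**64 - 1 = 18446744073709551615, so the value fits.
arr = np.array([10254939386542155531], dtype=np.uint64)
print(arr, arr.dtype)  # [10254939386542155531] uint64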

@jreback (Contributor) commented Nov 16, 2021

@deponovo this was redirected to another open issue; take a look there and see whether the example is sufficient.

Issues get fixed when members of the community put up pull requests.

The core team can provide code review.

@deponovo (Contributor)

Is there a wiki describing how I could do this? It would be my first time.

@deponovo (Contributor) commented Dec 1, 2021

take

@krpatter-intc commented Mar 3, 2022

I'm using pandas 1.4.1 and can still reproduce this issue:

pandas.read_json(pandas.DataFrame([{"foo":441350462101887021235463396661461057}]).to_json(orient="table"), orient="table")

One thing to note: to_json seems to know this is a problem and records the field type as string in the schema, but it still writes the value to the JSON as a Python long instead of a quoted string:

'{"schema":{"fields":[{"name":"index","type":"integer"},{"name":"foo","type":"string"}],"primaryKey":["index"],"pandas_version":"1.4.0"},"data":[{"index":0,"foo":441350462101887021235463396661461057}]}'

@krpatter-intc

@deponovo Would it be helpful to open a new issue for this?

Is it OK to do the json overrides I see in other related issues? Is there a performance trade-off, and is that why pandas uses a custom JSON parser?

@dmitriyshashkin

The workaround posted above doesn't seem to work anymore. Does anyone have a new one that works?
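If the monkey-patch no longer applies, one fallback that avoids the ujson-based parser entirely (a sketch, not tied to a particular pandas version) is to parse with the stdlib json module and build the frame yourself:

import json
import pandas as pd

raw = '{"id": 10254939386542155531}'
# Stdlib json parses arbitrary-precision integers, and json_normalize
# (a public pandas API) flattens the parsed dict into a one-row frame.
df = pd.json_normalize(json.loads(raw))
print(df.dtypes)  # id is object dtype, holding the Python int exactly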
