"ValueError: Must have equal len keys and value when setting with an iterable" when updating an object type cell using .loc with a nd.array #57962

Open
3 tasks done
isabelladegen opened this issue Mar 22, 2024 · 10 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@isabelladegen

isabelladegen commented Mar 22, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd

df = pd.DataFrame({1: [1, 2, 3], 2: [np.zeros(3), np.ones(3), np.zeros(0)]})

print(df.dtypes)

df.loc[1, 2] = np.zeros(3)

Issue Description

The .loc assignment raises "ValueError: Must have equal len keys and value when setting with an iterable" even though column 2 has dtype object. This started when I updated pandas from 1.4.4 to 2.2.1; in 1.4.4 this syntax worked. The issue also occurs when assigning to multiple columns at the same time, as soon as one of the new values is an iterable.
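To compare behaviour across pandas versions, the reproducible example can be wrapped so that it reports the outcome instead of stopping at the traceback. This is only a sketch: the `outcome` variable is introduced here for illustration, and whether the assignment succeeds or raises depends on the installed pandas version.

```python
import numpy as np
import pandas as pd

# Same frame as in the reproducible example: column 2 holds arrays (dtype object).
df = pd.DataFrame({1: [1, 2, 3], 2: [np.zeros(3), np.ones(3), np.zeros(0)]})

try:
    df.loc[1, 2] = np.zeros(3)
    outcome = "assigned"
except ValueError as exc:
    # Reported on 2.2.1 and 2.2.2 in this thread.
    outcome = f"ValueError: {exc}"

# Print the version next to the result so reports are easy to compare.
print(pd.__version__, df[2].dtype, outcome)
```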

Expected Behavior

The assignment should set the cell in the second row of column 2 to an np.array of zeros.

Installed Versions

INSTALLED VERSIONS

commit : bdc79c1
python : 3.9.19.final.0
python-bits : 64
OS : Darwin
OS-release : 23.4.0
Version : Darwin Kernel Version 23.4.0: Wed Feb 21 21:44:06 PST 2024; root:xnu-10063.101.15~2/RELEASE_ARM64_T8103
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_GB.UTF-8

pandas : 2.2.1
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 69.2.0
pip : 24.0
Cython : 3.0.9
pytest : 8.1.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 5.1.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.18.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.2.0
gcsfs : None
matplotlib : 3.8.1
numba : 0.59.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 15.0.2
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.11.3
sqlalchemy : 2.0.28
tables : None
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None
None

@isabelladegen isabelladegen added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 22, 2024
@dimitsev

dimitsev commented Jul 2, 2024

@isabelladegen The exact same problem appears on the main branch of pandas 2.2.2 if you do df.loc[1, 2] = list(np.zeros(3)). Please update the title to say that the problem happens with any iterable, not just a numpy array. Please also tick the checkbox that "this bug exists on the main branch of pandas".

@onejgordon

Just ran into this today with 2.2.2. Is anyone aware of a workaround?

@isabelladegen
Author

@onejgordon This is how I worked around it. Warning: it is ugly. For the toy example above, instead of indexing the row and column(s) to update the cell like this:
df.loc[1, 2] = np.zeros(3)

I update the whole row like this:
df.loc[1] = {1:2, 2:np.zeros(3)}

@onejgordon

Thanks for your response @isabelladegen.

It appears one of the factors influencing my case was a non-unique index. Dropping duplicates from the index column resolved the error and I was able to set both arrays and individual cells with:

df.loc[3, 'arr'] = np.array([1, 2])

and

df.loc[[2, 4, 6], 'arr'] = np.array([1, 2])
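A minimal sketch of checking for a non-unique index and dropping the duplicates before assigning. The frame, the `arr` column contents and the choice to keep the first occurrence of each label are assumptions made for illustration, and the final assignment uses `.at` rather than `.loc` so that it does not depend on the `.loc` behaviour discussed in this issue.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a duplicated index label (3 appears twice).
df = pd.DataFrame({"arr": [None, None, None]}, index=[3, 3, 4])
print(df.index.is_unique)

# Keep the first row for each label; .copy() gives an independent frame.
deduped = df[~df.index.duplicated(keep="first")].copy()
print(deduped.index.is_unique)

# With a unique index and an object column, a single cell can hold an array.
deduped.at[3, "arr"] = np.array([1, 2])
```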

@saeub

saeub commented Dec 4, 2024

Using df.at instead of df.loc seems to do the job if you only want to set a single value:

df.at[1, 2] = np.zeros(3)
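Applied to the frame from the original report, the `.at` workaround might look like the sketch below. It assumes a unique index and an object-dtype target column, which later comments suggest both matter.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({1: [1, 2, 3], 2: [np.zeros(3), np.ones(3), np.zeros(0)]})

# .at addresses exactly one cell (row label 1, column label 2),
# so the whole array is stored there as a single object.
df.at[1, 2] = np.zeros(3)

cell = df.at[1, 2]
print(type(cell), cell)
```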

@AlastairKelly

AlastairKelly commented Dec 5, 2024

I'm having the same problem with df.at, unfortunately.
df.at[row.Index,"pmids_old_before"] = results.get('idlist')
is generating the same error as
df.loc[row.Index,"pmids_old_before"] = results.get('idlist')

Oddly, this was working fine for me until today, and I haven't changed anything in my coding environment, so I'm very puzzled about the break. Also, it still works on two dataframes before suddenly generating the error on the third that uses this function. I can't figure out any salient differences. The function creates and initializes this column with None values before trying to make this assignment, so it should be identical conditions for each dataframe.

@saeub

saeub commented Dec 16, 2024

@AlastairKelly does your column pmids_old_before have dtype=object?

@JakeHightower

Using df.at instead of df.loc seems to do the job if you only want to set a single value:

df.at[1, 2] = np.zeros(3)

@saeub's method worked for me; I had the identical issue described here. Ty.

@tcourat

tcourat commented Feb 24, 2025

Same issue in pandas 2.2.2 and nothing works (.at or wrapping with a list)

@Pesec1

Pesec1 commented Apr 25, 2025

Same issue in pandas 2.2.2 and nothing works (.at or wrapping with a list)

@tcourat Try converting the column dtype to object. It helped me.

(Pdb) dataframe[field].dtype
string[python]
(Pdb) dataframe[field] = dataframe[field].astype(str)
(Pdb) dataframe.at[index, field] = [1,1]
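The same idea as a standalone sketch. The frame and the `field` column name are hypothetical stand-ins for the objects in the pdb session, and `astype(object)` is used here to make the target dtype explicit, whereas the session above used `astype(str)`.

```python
import pandas as pd

# Hypothetical frame whose column uses the string extension dtype.
dataframe = pd.DataFrame({"field": pd.Series(["a", "b", "c"], dtype="string")})
print(dataframe["field"].dtype)

# Convert to object so a cell can hold an arbitrary Python object such as a list.
dataframe["field"] = dataframe["field"].astype(object)
print(dataframe["field"].dtype)

# Store the list in the single cell at row label 0.
dataframe.at[0, "field"] = [1, 1]
```

Checking `dataframe[field].dtype` first, as @saeub asked above, is a quick way to tell whether this conversion is needed.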

8 participants