[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST/BUG (string dtype): Fix and adjust indexes string tests #59544

Merged
merged 9 commits into from
Sep 9, 2024

Conversation

phofl
Copy link
Member
@phofl phofl commented Aug 17, 2024

xref #54792

@jorisvandenbossche jorisvandenbossche added the Strings String extension data type and string data label Aug 18, 2024
Comment on lines 85 to 88
if using_infer_string and HAS_PYARROW:
tm.assert_extension_array_equal(
new_index.values, pd.array(arr, dtype="str")
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I don't understand. Why is index.values a numpy array in case of "python" storage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought I rolled that one back, the change was just something I tried.

Updated now

Comment on lines 359 to 361
elif index.dtype == "str" and not index.dtype.storage == "python":
with pytest.raises(NotImplementedError, match="i8"):
index.view("i8")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just testing a different error, right? (just to understand why this has a separate branch)

(ideally we should also make it a TypeError, I assume, maybe can add a TODO note)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a TODO

Comment on lines 506 to +508
# non-EA dtype indexes have special casting logic, so we punt here
pass
if isinstance(data, (set, frozenset)):
data = list(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to what you mentioned on slack, I assume?

idx = pd.Index({1, 2}, dtype="Int64")
idx = pd.Index({1, 2}, dtype="int64")

Whether we consider a set as sequence as valid input for constructors or not (IMO it would also be fine to not do that more generally)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct. It#s explicitly allowed for Index, so we shouldn't break this now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, for Series(..) we disallow it explicitly, but for Index(..) it currently works for non-EA dtypes. Sounds good to allow that for now.

@jorisvandenbossche
Copy link
Member

Ah, still some mypy errors ..

@jorisvandenbossche jorisvandenbossche changed the title TST (string-dtype): Adjust indexes string tests TST?BUG (string-dtype): Fix and adjust indexes string tests Sep 9, 2024
@jorisvandenbossche jorisvandenbossche changed the title TST?BUG (string-dtype): Fix and adjust indexes string tests TST/BUG (string dtype): Fix and adjust indexes string tests Sep 9, 2024
@jorisvandenbossche jorisvandenbossche merged commit 13f45e7 into pandas-dev:main Sep 9, 2024
47 checks passed
@jorisvandenbossche jorisvandenbossche deleted the string-indexes branch September 9, 2024 10:53
@jorisvandenbossche
Copy link
Member

Thanks @phofl

ammar-qazi pushed a commit to ammar-qazi/pandas that referenced this pull request Sep 9, 2024
…ev#59544)

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
@jorisvandenbossche jorisvandenbossche added this to the 2.3 milestone Oct 10, 2024
jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this pull request Oct 10, 2024
…ev#59544)

Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
jorisvandenbossche added a commit that referenced this pull request Oct 10, 2024
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backported Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants