Pandas 3.0#729

Open
genedan wants to merge 11 commits into main from pandas3

Conversation

@genedan
Collaborator

@genedan genedan commented Apr 29, 2026

Updates chainladder to be compatible with pandas 3.0. Closes #664.

Applies the following updates:


Note

Medium Risk
Upgrades core datetime handling and IO paths to match pandas 3.0 behavior, which can subtly change valuation timestamps and equality comparisons across the library. Dropping Python 3.10 and loosening the pandas pin may also surface version-specific edge cases in downstream environments.

Overview
Updates the package to target pandas 3.0 by bumping pandas to >=3 and requiring Python >=3.11 (CI matrices also drop 3.10).

Aligns datetime behavior with pandas 3 defaults by switching valuation/ultimate sentinel handling from nanoseconds to microseconds (e.g., options.ULT_VAL, TriangleBase.valuation, and ddims produced by aggregation), and loosens dtype checks to is_datetime64_dtype / is_string_dtype where pandas now prefers string.
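To make the unit change concrete, here is a minimal NumPy sketch of the end-of-period calculation under both resolutions. The helper name `period_end` and the sample origin date are illustrative assumptions, not code from the PR; only the `astype(...) - np.timedelta64(1, unit)` pattern mirrors the change in chainladder/core/base.py.

```python
import numpy as np


def period_end(origin, dev_months, unit="us"):
    """Last representable instant before the month `dev_months` after `origin`.

    Hypothetical helper: `unit` is "us" (microseconds) or "ns" (nanoseconds).
    """
    start_of_next = origin.astype("datetime64[M]") + np.timedelta64(dev_months, "M")
    return start_of_next.astype(f"datetime64[{unit}]") - np.timedelta64(1, unit)


# Illustrative origin period; 12 months of development ends on 2023-12-31.
origin = np.array(["2023-01-01"], dtype="datetime64[D]")
end_us = period_end(origin, 12, "us")
end_ns = period_end(origin, 12, "ns")

# Both fall on the same calendar day, but the timestamps are not equal,
# so exact equality checks written against one unit fail against the other.
same_day = (end_us.astype("datetime64[D]") == end_ns.astype("datetime64[D]")).all()
exactly_equal = (end_us == end_ns).all()
```

The two results differ by 999 nanoseconds, which is the kind of subtle difference in valuation timestamps and equality comparisons that the risk note describes.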

Fixes pandas 3 IO strictness by wrapping raw HTML/JSON strings in StringIO (pd.read_html, pd.read_json), and updates tests for pandas 3 groupby(axis=1) behavior by grouping via transpose instead.
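A minimal sketch of both patterns, assuming pandas is installed. The JSON payload, column labels, and variable names are made up for illustration and are not taken from the chainladder test suite.

```python
from io import StringIO

import pandas as pd

# Wrap the raw JSON string in StringIO instead of passing the literal
# string to pd.read_json.
raw_json = '{"origin": {"0": 2020, "1": 2021}, "paid": {"0": 100.0, "1": 80.0}}'
df = pd.read_json(StringIO(raw_json))

# Group columns that share a label without groupby(axis=1):
# transpose, group on the (former column) index, then transpose back.
wide = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "a", "b", "b"])
summed = wide.T.groupby(level=0).sum().T
```

Both forms should also run on recent pandas 2 releases, so they do not by themselves force a pandas 3 requirement.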

Reviewed by Cursor Bugbot for commit c65e5d8. Bugbot is set up for automated code reviews on this repo.

@codecov

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 85.26%. Comparing base (5f23ed5) to head (c65e5d8).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| chainladder/__init__.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #729   +/-   ##
=======================================
  Coverage   85.25%   85.26%           
=======================================
  Files          85       85           
  Lines        4952     4953    +1     
  Branches      645      645           
=======================================
+ Hits         4222     4223    +1     
  Misses        521      521           
  Partials      209      209           
| Flag | Coverage Δ |
|---|---|
| unittests | 85.26% <85.71%> (+<0.01%) ⬆️ |

Comment thread pyproject.toml Outdated
keywords = ["actuarial", "reserving", "insurance", "chainladder", "IBNR"]
dependencies = [
- "pandas >=2.0, <3.0",
+ "pandas >=3.0",
Collaborator

This makes sense in the Pandas3 branch. But do we need to force everyone to pandas 3?

Comment thread chainladder/core/base.py
origin = np.minimum(self.odims, np.datetime64(self.valuation_date))
val_array = origin.astype("datetime64[M]") + np.timedelta64(ddims[0], "M")
- val_array = val_array.astype("datetime64[ns]") - np.timedelta64(1, "ns")
+ val_array = val_array.astype("datetime64[us]") - np.timedelta64(1, "us")
Collaborator

lol, I never knew us stood for microsecond. Good stuff

@kennethshsu
Collaborator

I'm good with this PR. @henrydingliu, do you think we should hold off?

I generally prefer not to maintain too many versions, especially when it means supporting older dependencies, and I encourage users to adopt the latest versions of dependent packages.

@henrydingliu
Collaborator

jumping from requiring 2.0 straight to requiring 3.0 seems a little steep

agreed on not maintaining too many versions. but if we are not maintaining too many versions, isn't the best practice to allow as old as possible on dependencies?

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 2ef0131.

@genedan
Collaborator Author

genedan commented Apr 29, 2026

One thing users can do if they want to keep using Pandas 2 is to use an old version of Chainladder. At our next release (or some other future one), we can announce in the release notes that it'll be "the last tested version on pandas 2.3.3. Future versions of Chainladder will support Pandas 3 only". This will give users a searchable way to find the latest version of Chainladder that still works with their pandas 2 setup while giving them a heads up that they need to take steps to upgrade their environment.

@henrydingliu
Collaborator

> One thing users can do if they want to keep using Pandas 2 is to use an old version of Chainladder. At our next release (or some other future one), we can announce in the release notes that it'll be "the last tested version on pandas 2.3.3. Future versions of Chainladder will support Pandas 3 only". This will give users a searchable way to find the latest version of Chainladder that still works with their pandas 2 setup while giving them a heads up that they need to take steps to upgrade their environment.

The mechanism of providing a warning and still letting users use an older version is sound. But I just don't see how it applies to us. Do we actually need anything that's new in pandas 3.0? Aren't we dissatisfied with the axis situation and think Pandas will eventually revert in a future version?

@genedan
Collaborator Author

genedan commented Apr 30, 2026

Pandas is facing some pretty steep competition from other libraries, like Polars, so my hope is that as time passes, most of the changes on the whole will be better for the user. That's a situation I think we need to keep an eye on, potentially making a switch if deemed necessary (like we did from pip -> uv).

Performance enhancements and bug fixes will continue to be made in Pandas 3, whereas they won't be in Pandas 2.3.3. The axis change was unfortunate, but I think more users will be moving on to the latest version or Polars, so my preference overall is to take steps to keep up with the broader community.

So, here's what I propose as our strategy. When we get word that a new major release is out (let's say from version 2 -> 3), we fix up whatever we can, test it on the last previous version, and announce in the release notes "This is the last stable Chainladder version that will support package v2. For future releases, users are expected to upgrade to version v3." We're a small team, so there's not much capacity for maintaining backwards compatibility. I think this approach gives a good balance between giving users a heads-up, while freeing us to move forward.

Chainladder is open source: even as we release new versions, the old versions remain available, can be forked, etc.

As far as Pandas 2 is concerned, I think we should do one more release. Let's bump the pyproject.toml to have pandas >2.3.3, <3, make sure that it passes the tests, and then inform the users that this is the last version they can use for both Pandas 2 and Python 3.10. Then our next release will require Pandas 3 and Python >= 3.11.

@henrydingliu
Collaborator

> When we get word that a new major release is out (let's say from version 2 -> 3), we fix up whatever we can, test it on the last previous version, and announce in the release notes "This is the last stable Chainladder version that will support package v2.

I don't know if we have the manpower to actually do this. The package currently has 7 dependencies, 3 of which don't have any version requirements.

@casact casact deleted a comment from cursor Bot May 1, 2026
@casact casact deleted a comment from cursor Bot May 1, 2026

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 1cca8f8.

Comment thread chainladder/core/triangle.py Outdated
@kennethshsu
Collaborator

@henrydingliu do you want to review this one?

@henrydingliu
Collaborator

> @henrydingliu do you want to review this one?

This PR doesn't reflect any of the discussion that we've been having on this topic. It's deprecating 3.10 ahead of schedule. And it's pushing Pandas 3.0 without another release. As presently committed, this PR is not aligned with the goals of the collaborator group.

The original intent of #664 was to work towards Pandas 3.0 compatibility, which resolving 4 of the 5 sub-issues should achieve. Perhaps we should split up this PR for the time being.

@genedan
Collaborator Author

genedan commented May 2, 2026

After making good faith attempts at dual Pandas 2 & 3 compatibility in the commits of this PR, I don't think it's a good idea, and neither is trying to hang on to soon-to-be-deprecated Python 3.10. Dual compatibility can technically be achieved by:

  1. Adding a nano- or microsecond time adjustment in several locations depending on which installation of Pandas the user has
  2. Doubling the number of GitHub Actions workflows by testing each combo of pandas 2 and 3 with Python 3.10 thru 3.14. To ensure compatibility, we would need to run this matrix of workflows for as long as we plan on supporting both versions of Pandas due to the breaking changes. How long would we plan to do this for? If it's just until the deprecation of Python 3.10 in 6 months, it might not be worth it.
  3. Any new features we add to Chainladder will have to be checked for both Pandas 2 and Pandas 3 compatibility until we drop Pandas 2 support.
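The size of the doubled matrix in point 2 can be sketched as follows. The version lists come from the comment above; the variable names are illustrative, and a real setup would express this in the GitHub Actions matrix rather than in Python.

```python
from itertools import product

# Versions named in the comment above.
python_versions = ["3.10", "3.11", "3.12", "3.13", "3.14"]
pandas_majors = ["2", "3"]

# Single-pandas testing: one job per Python version.
single_pandas_jobs = [(py,) for py in python_versions]

# Dual-pandas testing: one job per (Python, pandas) combination.
dual_pandas_jobs = list(product(python_versions, pandas_majors))

print(len(single_pandas_jobs), "jobs vs", len(dual_pandas_jobs), "jobs")
```

This counts every pairing; any combination that cannot actually be installed would have to be excluded from a real matrix.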

Maintaining backwards compatibility increases the complexity of both the codebase and testing, so I conclude now that if a user wants to stay on Python 3.10 or Pandas 2, they'll need to use an old version of Chainladder.

@henrydingliu
Collaborator

henrydingliu commented May 3, 2026

We keep talking past each other. Let's align on some first principles.

The initial issue was raised for Pandas 3 compatibility. Supporting the newest version of a key dependency is a no-brainer, assuming we are not giving up something. If there are now technical issues with dual compatibility, that doesn't mean the issue automatically changes itself into full migration to Pandas 3. Very few noticed this issue when Pandas 3 came out, meaning most users of the package are still on an older version of Pandas. So requiring them to go to 3 will have a sizable impact. I'm not comfortable with increasing the Pandas requirement on the main branch with little to no notice. At a minimum, we should hold off on this PR till the next meeting where others can voice their opinions and preferences.

Pushing for newer versions of dependent packages is a good thing in general. It's also something that this package historically has not cared about. As I have previously noted, of the 7 dependencies in this package, we currently don't have any version requirements on 3. Pushing for Pandas 3 when we aren't planning on leveraging any of the Pandas 3 enhancements in this package is a poor Product Management decision. We would be adding to the user's burden without offering a benefit. And we can't even offer a weak "oh we always make an attempt to go to the latest version" because we don't.

Asking users who don't want to keep up to use an older version of chainladder is also a viable mechanism in general. However, I can't see how that's a good idea at the moment as we are introducing a slew of doc and QoL improvements. By pushing for Pandas 3 in the middle of these improvements, we are implicitly requiring users to either upgrade to Pandas 3 or be unable to enjoy those improvements. Yes, we'd be tying some benefit to the Pandas 3 migration, but entirely unnecessarily. It would cost us literally nothing to wait till a stable version where most of the proposed doc improvements are completed. Why are we in a rush to push for Pandas 3 now? What is the benefit to the package, the collaborators, or the users?

Now onto the more technical topics

> Adding a nano- or microsecond time adjustment in several locations depending on which installation of Pandas the user has

We can refactor the code to take the default from the Pandas version in use. It might be tedious, but probably a good practice anyway.
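One way such a refactor might look, as a rough sketch only. The helper names `default_time_unit` and `last_tick_before` are hypothetical, and the mapping of pandas 3 to microseconds and earlier versions to nanoseconds is an assumption based on the description of this PR.

```python
import numpy as np


def default_time_unit(pandas_version: str) -> str:
    """Pick the datetime unit for a pandas version string (hypothetical helper).

    Assumption: pandas 3 and later use microseconds ("us"),
    earlier versions use nanoseconds ("ns").
    """
    major = int(pandas_version.split(".")[0])
    return "us" if major >= 3 else "ns"


def last_tick_before(start: np.datetime64, pandas_version: str) -> np.datetime64:
    """Return the last representable instant before `start` in the chosen unit."""
    unit = default_time_unit(pandas_version)
    return start.astype(f"datetime64[{unit}]") - np.timedelta64(1, unit)


start = np.datetime64("2024-01", "M")
old_style = last_tick_before(start, "2.3.3")  # nanosecond resolution
new_style = last_tick_before(start, "3.0.0")  # microsecond resolution
```

In the library, the version string would presumably come from `pd.__version__`, so the unit is decided once instead of being hard-coded at each call site.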

> Doubling the number of GitHub Actions workflows by testing each combo of pandas 2 and 3 with Python 3.10 thru 3.14. To ensure compatibility, we would need to run this matrix of workflows for as long as we plan on supporting both versions of Pandas due to the breaking changes. How long would we plan to do this for? If it's just until the deprecation of Python 3.10 in 6 months, it might not be worth it.

I don't think this is necessary. Most of our users are on Pandas 2, so that's what we test with until the majority of users are on Pandas 3. As early adopters of Pandas 3 run into issues, they can fix those issues in the package in a compatible way.

> Any new features we add to Chainladder will have to be checked for both Pandas 2 and Pandas 3 compatibility until we drop Pandas 2 support.

We shouldn't be dropping Pandas 2 for as long as the user base is still using Pandas 2. New features just need to be checked in Pandas 2. Then if early adopters of Pandas 3 have issues with the new features, they can fix them.

@EKtheSage
Contributor

Maybe I’m missing something about the expected user workflow here, but I thought tools like uv generally encourage installing into an isolated project environment from pyproject.toml / uv.lock, rather than forcing users to mutate their global Python environment.

Would that reduce the practical burden of supporting Pandas 3? In other words, users who want the latest chainladder could install it into a fresh isolated environment without necessarily disrupting their existing projects.

Maybe Henry's concern is this:

A user may have an existing project like:

dependencies = [
  "chainladder",
  "pandas<3",
  "some-other-package-that-requires-pandas<3"
]

If new chainladder says:

dependencies = ["pandas>=3"]

then uv cannot satisfy both:

pandas < 3
pandas >= 3

So the user has to choose:

  • upgrade Pandas / fix compatibility issues
    or
  • stay on older chainladder
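The conflict can be illustrated with a small standard-library sketch. This is a simplified stand-in for what a resolver such as uv does, not its actual algorithm; the candidate version list and helper names are illustrative assumptions.

```python
import operator

# Only the two operators used in the example above are supported here.
OPS = {"<": operator.lt, ">=": operator.ge}


def parse(version):
    """Turn a dotted version like "2.3.3" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, constraints):
    """Check a version against every (operator, bound) pair."""
    v = parse(version)
    for op, bound in constraints:
        b = parse(bound)
        # Pad with zeros so that "3" and "3.0.0" compare as equal.
        width = max(len(v), len(b))
        v_padded = v + (0,) * (width - len(v))
        b_padded = b + (0,) * (width - len(b))
        if not OPS[op](v_padded, b_padded):
            return False
    return True


candidates = ["2.2.3", "2.3.3", "3.0.0"]  # illustrative pandas versions
project_pin = [("<", "3")]                # the user's existing project
new_chainladder = [(">=", "3")]           # a chainladder requiring pandas 3

allowed_by_project = [c for c in candidates if satisfies(c, project_pin)]
allowed_by_both = [c for c in candidates if satisfies(c, project_pin + new_chainladder)]
```

`allowed_by_both` comes out empty, which is why the user is left with the two choices listed above.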

And maybe we should seriously think about using Polars as an alternative backend, as Gene alluded to earlier, since new Python users are slowly moving toward it.

@henrydingliu
Collaborator

> In other words, users who want the latest chainladder could install it into a fresh isolated environment without necessarily disrupting their existing projects.

of course they can. But why are we asking them to do that? What's in it for them? What's in it for us? What are some tangible benefits of requiring Pandas 3?

@EKtheSage
Contributor

I agree that we should be thoughtful about increasing dependency requirements, especially if there is no immediate benefit to users.

That said, I think part of the disconnect here is the assumed workflow. Modern Python workflows are increasingly based on isolated, project-specific environments. In that setup:

  • environments are explicit and reproducible
  • dependencies are pinned per project
  • upgrading a dependency does not disrupt other work

So the practical burden of moving to something like Pandas 3 is lower than it would be in a shared or global environment.

That does not mean we should force the upgrade without reason. I agree that if we are not leveraging Pandas 3 features yet, the benefit is unclear. In fact, I think we should move to polars instead 😅

My main point is that with the expectation of reproducible, isolated environments becoming the norm, user burden is probably lower than we're assuming.

@henrydingliu
Collaborator

> That said, I think part of the disconnect here is the assumed workflow. Modern Python workflows are increasingly based on isolated, project-specific environments. In that setup:

When we did the uv demo a few meetings back, did we get a good sense that everyone in the room was already fully leveraging it or something like it? Not saying that a workflow for pandas 3-based chainladder would be cumbersome to set up as an additional environment for someone who knows how to set up additional environments. But do we know for a fact that most of our users, even most of our collaborators, actually have workflow management ready to go?

Labels

None yet

Projects

None yet

4 participants