Conversation
…ity with pandas 3.0.
…, for pandas 3.0 compatibility.
…y_axis1`, for compatibility with pandas 3.0.
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #729 +/- ##
=======================================
Coverage 85.25% 85.26%
=======================================
Files 85 85
Lines 4952 4953 +1
Branches 645 645
=======================================
+ Hits 4222 4223 +1
Misses 521 521
Partials 209 209
|
| keywords = ["actuarial", "reserving", "insurance", "chainladder", "IBNR"] | ||
| dependencies = [ | ||
| "pandas >=2.0, <3.0", | ||
| "pandas >=3.0", |
This makes sense in the Pandas3 branch. But do we need to force everyone to pandas 3?
| origin = np.minimum(self.odims, np.datetime64(self.valuation_date)) | ||
| val_array = origin.astype("datetime64[M]") + np.timedelta64(ddims[0], "M") | ||
| val_array = val_array.astype("datetime64[ns]") - np.timedelta64(1, "ns") | ||
| val_array = val_array.astype("datetime64[us]") - np.timedelta64(1, "us") |
lol, I never knew us stood for microsecond. Good stuff
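For readers following the diff above, here is a minimal sketch of what the `ns` to `us` switch changes. The origin dates and the 12-month development lag are made-up illustration values, not data from the package; only the `astype`/`timedelta64` pattern mirrors the changed lines.

```python
import numpy as np

# Hypothetical origin periods and development lag, for illustration only.
origin = np.array(["2020-01-01", "2020-04-01"], dtype="datetime64[D]")
dev_months = 12

# Truncate to month resolution, then step forward by the development lag.
start_of_next_period = origin.astype("datetime64[M]") + np.timedelta64(dev_months, "M")

# Old behaviour: last nanosecond before the next period starts.
val_ns = start_of_next_period.astype("datetime64[ns]") - np.timedelta64(1, "ns")

# New behaviour: last microsecond before the next period starts.
val_us = start_of_next_period.astype("datetime64[us]") - np.timedelta64(1, "us")

print(val_ns[0])
print(val_us[0])

# The two sentinels are different instants, so equality checks that mix
# the two conventions will not match.
print((val_us == val_ns).any())
```

The two arrays differ by 999 nanoseconds per element, which is why comparisons against stored valuation timestamps can change when the unit does.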
|
I'm good with this PR. @henrydingliu, do you think we should hold off? I generally prefer not to maintain too many versions, especially when it means supporting older dependencies, and I encourage users to adopt the latest versions of dependent packages. |
|
Jumping from requiring 2.0 straight to requiring 3.0 seems a little steep. Agreed on not maintaining too many versions. But if we are not maintaining too many versions, isn't the best practice to allow as old as possible on dependencies? |
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 2ef0131.
|
One thing users can do if they want to keep using Pandas 2 is to use an old version of Chainladder. At our next release (or some other future one), we can announce in the release notes that it'll be "the last tested version on pandas 2.3.3. Future versions of Chainladder will support Pandas 3 only". This will give users a searchable way to find the latest version of Chainladder that still works with their pandas 2 setup while giving them a heads up that they need to take steps to upgrade their environment. |
The mechanism of providing a warning and still letting users use an older version is sound. But I just don't see how it applies to us. Do we actually need anything that's new in pandas 3.0? Aren't we dissatisfied with the axis situation and think Pandas will eventually revert in a future version? |
|
Pandas is facing some pretty steep competition from other libraries, like Polars, so my hope is that as time passes, most of the changes on the whole will be better for the user. That's a situation I think we need to keep an eye on, potentially making a switch if deemed necessary (like we did from pip -> uv). Performance enhancements and bug fixes will continue to be made in Pandas 3, whereas they won't be in Pandas 2.3.3. So, here's what I propose as our strategy. When we get word that a new major release is out (let's say from version 2 -> 3), we fix up whatever we can, test it on the last previous version, and announce in the release notes "This is the last stable Chainladder version that will support package v2. For future releases, users are expected to upgrade to version v3." We're a small team, so there's not much capacity for maintaining backwards compatibility. I think this approach gives a good balance between giving users a heads-up and freeing us to move forward. Chainladder is open source; while we release new versions, the old versions are still available, can be forked, etc. As far as Pandas 2 is concerned, I think we should do one more release. Let's bump the |
I don't know if we have the manpower to actually do this. The package currently has 7 dependencies, 3 of which don't have any version requirements. |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 1cca8f8.
|
@henrydingliu do you want to review this one? |
This PR doesn't reflect any of the discussion that we've been having on this topic. It's deprecating 3.10 ahead of schedule. And it's pushing Pandas 3.0 without another release. As presently committed, this PR is not aligned with the goals of the collaborator group. The original intent of #664 was to work towards Pandas 3.0 compatibility, which resolving 4 of the 5 sub-issues should achieve. Perhaps we should split up this PR for the time being. |
|
After making good faith attempts at dual Pandas 2 & 3 compatibility in the commits of this PR, I don't think it's a good idea, and neither would be trying to hang on to soon-to-be-deprecated Python 3.10. Dual compatibility can be technically achieved by:
Maintaining backwards compatibility increases the complexity of both the codebase and testing, so I conclude now that if a user wants to stay on Python 3.10 or Pandas 2, they'll need to use an old version of Chainladder. |
|
We keep talking past each other. Let's align on some first principles. The initial issue was raised for Pandas 3 compatibility. Supporting the newest version of a key dependency is a no-brainer, assuming we are not giving up something. If there are now technical issues with dual compatibility, that doesn't mean the issue automatically changes itself into full migration to Pandas 3. Very few noticed this issue when Pandas 3 came out, meaning most users of the package are still on an older version of Pandas. So requiring them to go to 3 will have a sizable impact. I'm not comfortable with increasing the Pandas requirement on the main branch with little to no notice. At a minimum, we should hold off on this PR till the next meeting where others can voice their opinions and preferences. Pushing for newer versions of dependent packages is a good thing in general. It's also something that this package historically has not cared about. As I have previously noted, of the 7 dependencies in this package, we currently don't have any version requirements on 3. Pushing for Pandas 3 when we aren't planning on leveraging any of the Pandas 3 enhancements in this package is a poor Product Management decision. We would be adding to the user's burden without offering a benefit. And we can't even offer a weak "oh we always make an attempt to go to the latest version" because we don't. Asking users who don't want to keep up to use an older version of chainladder is also a viable mechanism in general. However, I can't see how that's a good idea at the moment as we are introducing a slew of doc and QoL improvements. By pushing for Pandas 3 in the middle of these improvements, we are implicitly requiring users to upgrade to Pandas 3 or not be able to enjoy those improvements. Yes, we'd be tying some benefit to the Pandas 3 migration, but entirely unnecessarily. It would cost us literally nothing to wait till a stable version where most of the proposed doc improvements are completed.
Why are we in a rush to push for Pandas 3 now? What is the benefit to the package, the collaborators, or the users? Now onto the more technical topics
We can refactor the code to take the default from the Pandas version in use. It might be tedious, but probably a good practice anyway.
I don't think this is necessary. Most of our users are on Pandas 2. So that's what we test with, until the majority of users are on Pandas 3. As early adopters of Pandas 3 run into issues, they can fix those issues in the package in a compatible way.
We shouldn't be dropping Pandas 2 for as long as the user base is still using Pandas 2. New features just need to be checked in Pandas 2. Then if early adopters of Pandas 3 have issues with the new features, they can fix them. |
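As a rough illustration of "take the default from the Pandas version in use", the sketch below picks a datetime unit from a version string. It assumes the version-dependent default in question is the nanosecond versus microsecond unit touched by this PR's diff; the function name `default_datetime_unit` is hypothetical and does not exist in chainladder or pandas.

```python
def default_datetime_unit(pandas_version: str) -> str:
    """Pick the sub-second unit matching the installed pandas major version.

    Hypothetical helper: assumes pandas >= 3 should use microseconds ("us")
    and older releases should keep nanoseconds ("ns").
    """
    major = int(pandas_version.split(".")[0])
    return "us" if major >= 3 else "ns"


# In real code the argument would come from the installed library,
# e.g. default_datetime_unit(pd.__version__).
print(default_datetime_unit("2.3.3"))
print(default_datetime_unit("3.0.0"))
```

Centralizing the choice in one helper would keep the version check out of individual call sites, though every call site that hard-codes a unit would still need to be routed through it.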
|
Maybe I’m missing something about the expected user workflow here, but I thought tools like uv generally encourage installing into an isolated project environment from Would that reduce the practical burden of supporting Pandas 3? In other words, users who want the latest chainladder could install it into a fresh isolated environment without necessarily disrupting their existing projects. Maybe Henry's concern is this: A user may have an existing project like: If new chainladder says: then uv cannot satisfy both: So the user has to choose:
And maybe we should seriously think about using |
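The conflict described in the comment above can be sketched with a toy range check. The two chainladder pins are the ones shown in this PR's `pyproject.toml` diff; the existing project's pin is a made-up example, and the `intersect` helper is a simplification written for illustration, not how uv resolves dependencies.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]
# A range is (inclusive lower bound, exclusive upper bound); None means unbounded.
Range = Tuple[Optional[Version], Optional[Version]]


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two version ranges, or None if they are disjoint."""
    lows = [v for v in (a[0], b[0]) if v is not None]
    highs = [v for v in (a[1], b[1]) if v is not None]
    low = max(lows) if lows else None
    high = min(highs) if highs else None
    if low is not None and high is not None and low >= high:
        return None
    return (low, high)


existing_project = ((2, 0), (3, 0))  # hypothetical project pin: pandas >=2.0, <3.0
old_chainladder = ((2, 0), (3, 0))   # pin before this PR: pandas >=2.0, <3.0
new_chainladder = ((3, 0), None)     # pin after this PR: pandas >=3.0

print(intersect(existing_project, old_chainladder))
print(intersect(existing_project, new_chainladder))
```

An empty overlap is the situation in which a resolver has to refuse the install, leaving the user to relax their own pin, stay on an older chainladder, or use a separate environment.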
of course they can. But why are we asking them to do that? What's in it for them? What's in it for us? What are some tangible benefits of requiring Pandas 3? |
|
I agree that we should be thoughtful about increasing dependency requirements, especially if there is no immediate benefit to users. That said, I think part of the disconnect here is the assumed workflow. Modern Python workflows are increasingly based on isolated, project-specific environments. In that setup:
So the practical burden of moving to something like Pandas 3 is lower than it would be in a shared or global environment. That does not mean we should force the upgrade without reason. I agree that if we are not leveraging Pandas 3 features yet, the benefit is unclear. In fact, I think we should move to My main point is that with the expectation of reproducible, isolated environments becoming the norm, user burden is probably lower than we're assuming. |
When we did the uv demo a few meetings back, did we get a good sense that everyone in the room was already fully leveraging it or something like it? Not saying that a workflow for pandas 3-based chainladder would be cumbersome to set up as an additional environment for someone who knows how to set up additional environments. But do we know for a fact that most of our users, even most of our collaborators, actually have workflow management ready to go? |

Updates chainladder to be compatible with pandas 3.0. Closes #664.
Applies the following updates:
`TrianglePandas.groupby()`, instead, changes the test expectation to match pandas 3.0 behavior. Closes #727 (Deprecate axis argument from `TrianglePandas.groupby()`).
Note
Medium Risk
Upgrades core datetime handling and IO paths to match pandas 3.0 behavior, which can subtly change valuation timestamps and equality comparisons across the library. Dropping Python 3.10 and loosening the pandas pin may also surface version-specific edge cases in downstream environments.
Overview
Updates the package to target pandas 3.0 by bumping `pandas` to `>=3` and requiring Python `>=3.11` (CI matrices also drop 3.10).
Aligns datetime behavior with pandas 3 defaults by switching valuation/ultimate sentinel handling from nanoseconds to microseconds (e.g., `options.ULT_VAL`, `TriangleBase.valuation`, and ddims produced by aggregation), and loosens dtype checks to `is_datetime64_dtype`/`is_string_dtype` where pandas now prefers `string`.
Fixes pandas 3 IO strictness by wrapping raw HTML/JSON strings in `StringIO` (`pd.read_html`, `pd.read_json`), and updates tests for pandas 3 `groupby(axis=1)` behavior by grouping via transpose instead.
Reviewed by Cursor Bugbot for commit c65e5d8. Bugbot is set up for automated code reviews on this repo.
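Two of the adjustments listed in the overview can be sketched on toy data. The JSON payload, column names, and grouping map below are invented for illustration and are not taken from chainladder's code or tests; the sketch only shows the general pattern of wrapping a raw string in `StringIO` and of grouping columns through a transpose.

```python
from io import StringIO

import pandas as pd

# 1) Wrap a raw JSON string in StringIO before handing it to pd.read_json,
#    rather than passing the literal string directly.
raw_json = '{"paid": [100, 150], "incurred": [120, 160]}'
frame = pd.read_json(StringIO(raw_json))
print(frame)

# 2) Group columns without groupby(axis=1): transpose, group the rows,
#    aggregate, then transpose back.
wide = pd.DataFrame({"a1": [1, 2], "a2": [3, 4], "b1": [5, 6]})
column_groups = {"a1": "a", "a2": "a", "b1": "b"}  # hypothetical mapping
grouped = wide.T.groupby(column_groups).sum().T
print(grouped)
```

Both patterns should also run on pandas 2, so they do not by themselves force the version pin either way.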