fix: Parse 1mon ago #1329

Open
serhii73 wants to merge 5 commits into master from fix/1123

Conversation

@serhii73
Collaborator

@serhii73 serhii73 commented May 6, 2026

Close #1123

@codecov

codecov Bot commented May 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.10%. Comparing base (373ede9) to head (cc8c4c1).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1329   +/-   ##
=======================================
  Coverage   97.10%   97.10%           
=======================================
  Files         235      235           
  Lines        2904     2904           
=======================================
  Hits         2820     2820           
  Misses         84       84           

serhii73 added 4 commits May 6, 2026 12:54
…mpat

Python 3.14 changed the json C encoder to access dict subclasses via
direct C-level storage rather than calling Python-level __iter__/items().
ruamel.yaml's CommentedMap relies on its Python-level iteration for
correct key ordering, so the C shortcut produced a non-deterministic key
order on 3.14, causing test_dateparser_data_integrity to fail.

Add _to_plain_types() to recursively convert CommentedMap → OrderedDict
and CommentedSeq → list (preserving insertion order via Python-level
iteration) before passing data to json.dumps in write_complete_data().
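
A minimal sketch of the conversion this commit message describes. It is not the PR's actual _to_plain_types(): to stay self-contained it does not import ruamel.yaml, and instead treats any Mapping as a stand-in for CommentedMap and any non-string Sequence as a stand-in for CommentedSeq. The names to_plain_types and FakeCommentedMap, and the sample data, are illustrative assumptions.

```python
import json
from collections import OrderedDict
from collections.abc import Mapping, Sequence


def to_plain_types(obj):
    """Recursively copy mappings and sequences into plain containers."""
    if isinstance(obj, Mapping):
        # .items() goes through the Python-level iteration protocol, so the
        # order the mapping itself reports is the order that gets copied.
        return OrderedDict(
            (key, to_plain_types(value)) for key, value in obj.items()
        )
    if isinstance(obj, Sequence) and not isinstance(obj, (str, bytes, bytearray)):
        return [to_plain_types(item) for item in obj]
    return obj


class FakeCommentedMap(dict):
    """Hypothetical stand-in for a dict subclass such as CommentedMap."""


data = FakeCommentedMap(
    month=["month", "months", "mon"],
    year=FakeCommentedMap(skip=["years"]),
)

# Convert first, then serialize, so json.dumps only ever sees plain types.
plain = to_plain_types(data)
serialized = json.dumps(plain, ensure_ascii=False, indent=2)
```

Converting before serialization keeps the ordering decision in Python code, rather than depending on how a particular json encoder walks a dict subclass.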
Contributor

Copilot AI left a comment

Pull request overview

This PR aims to fix issue #1123 by making the freshness/relative-time parser understand compact English month abbreviations like 1mon ago by updating English translation/simplification data and adding regression tests. It also tweaks the data-generation script to produce deterministic JSON output across Python versions.

Changes:

  • Add English simplification for \d+mon(s) → \d+ month and extend month tokens in EN translation data.
  • Update freshness parser tests to assert that 1mon ago/2mon ago/3mons ago are parsed as relative months.
  • Make write_complete_data.py convert ruamel YAML types to plain mapping/sequence types before json.dumps for stable output.
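
The simplification in the first bullet can be tried in isolation with the standard re module. This sketch copies the pattern and replacement from the PR's EN data, but the simplify helper is hypothetical and says nothing about how dateparser itself applies simplifications (flags, ordering, normalization).

```python
import re

# Pattern and replacement as they appear in the PR's EN translation data.
MON_PATTERN = re.compile(r"(\d+)\s*mons?\b")
MON_REPLACEMENT = r"\1 month"


def simplify(text: str) -> str:
    """Rewrite compact forms such as '1mon' to '1 month' (illustrative helper)."""
    return MON_PATTERN.sub(MON_REPLACEMENT, text)


for sample in ("1mon ago", "2mon ago", "3mons ago", "1 month ago"):
    print(f"{sample!r} -> {simplify(sample)!r}")
```

The trailing \b is what leaves "1 month ago" untouched: after "mon" the next character is "t", so there is no word boundary and the pattern does not match.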

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File: Description
tests/test_freshness_date_parser.py: Adds regression coverage asserting *mon(s) ago parses as relative months.
dateparser/data/date_translation_data/en.py: Updates EN month tokens and adds a simplification regex for mons?.
dateparser_scripts/write_complete_data.py: Normalizes YAML-loaded structures to stable plain types before serialization.
dateparser_data/supplementary_language_data/date_translation_data/en.yaml: Mirrors EN month-token + simplification updates in supplementary YAML source.

  "month",
- "months"
+ "months",
+ "mon",
Contributor

@AdrianAtZyte AdrianAtZyte May 12, 2026

This is also my main concern with this change.

I wonder if you can use AI to expand the tests with inputs that would become problematic after this change and inputs that were problematic before it, and then decide which ones are best left broken, keeping them as expected failures.
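
One possible first pass at finding such inputs, sketched with plain re only. The candidate strings are hypothetical, and the check only shows where the new pattern fires; it does not run dateparser, so it cannot tell whether any of these inputs actually regress.

```python
import re

# The simplification pattern added by this PR.
MON_PATTERN = re.compile(r"(\d+)\s*mons?\b")

# Hypothetical candidates, lowercased by assumption.
candidates = [
    "1mon ago",      # the input this PR targets
    "2 months ago",  # already supported; should stay untouched
    "may 5 mon",     # "mon" possibly meant as a weekday abbreviation
    "monday",        # no leading number, so out of the pattern's reach
    "3 mons",        # plural compact form at end of string
]


def is_rewritten(text: str) -> bool:
    """Return True when the simplification would change the text."""
    return MON_PATTERN.search(text) is not None


rewritten = [text for text in candidates if is_rewritten(text)]
untouched = [text for text in candidates if not is_rewritten(text)]
```

Strings that land in rewritten but were not meant as month counts, such as a day number followed by a weekday abbreviation, are the ones worth turning into real test cases or expected failures.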

- years
month:
- months
- mon
],
"simplifications": [
{
"(\\d+)\\s*mons?\\b": "\\1 month"
- (\d+[.,]?\d*) decades? ago

simplifications:
- (\d+)\s*mons?\b: \1 month
Comment on lines 2365 to +2379
  @parameterized.expand(
      [
-         param("1mon ago"),  # 1116
+         param("1mon ago", ago={"months": 1}, period="month"),  # 1123
+         param("2mon ago", ago={"months": 2}, period="month"),  # 1123
+         param("3mons ago", ago={"months": 3}, period="month"),  # 1123
      ]
  )
- def test_known_issues(self, date_string):
+ def test_known_issues(self, date_string, ago, period):
      self.given_parser()
      self.given_date_string(date_string)
      self.when_date_is_parsed()
      self.then_error_was_not_raised()
-     self.assertEqual(None, self.result["date_obj"])
+     self.then_date_was_parsed_by_freshness_parser()
+     self.then_date_obj_is_exactly_this_time_ago(ago)
+     self.then_period_is(period)
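
For readers unfamiliar with what an assertion such as ago={"months": 1} implies, here is a standard-library sketch of "N calendar months ago". It assumes calendar-month arithmetic with the day clamped to the end of shorter months, which is the convention dateutil's relativedelta follows; the months_ago helper is hypothetical and is not the test suite's implementation.

```python
import calendar
from datetime import datetime


def months_ago(moment: datetime, months: int) -> datetime:
    """Return the same wall-clock time `months` calendar months earlier."""
    # Count months from year 0 so the subtraction can borrow across years.
    index = moment.year * 12 + (moment.month - 1) - months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    # Clamp the day so that e.g. March 31 maps to the last day of February.
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


now = datetime(2026, 5, 6, 12, 54)
one_month_back = months_ago(now, 1)
three_months_back = months_ago(datetime(2026, 1, 15), 3)
clamped = months_ago(datetime(2026, 3, 31), 1)
```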
Development

Successfully merging this pull request may close these issues.

Parse 1mon ago

3 participants