Skip to content

Simplify defunct stubs and fix O(n^2) pagination concats#225

Open
thodson-usgs wants to merge 1 commit intoDOI-USGS:mainfrom
thodson-usgs:simplify
Open

Simplify defunct stubs and fix O(n^2) pagination concats#225
thodson-usgs wants to merge 1 commit intoDOI-USGS:mainfrom
thodson-usgs:simplify

Conversation

@thodson-usgs
Copy link
Copy Markdown
Collaborator

Summary

A few targeted simplifications and bug fixes turned up by a code review of recent commits.

Cleanup

  • nwis.py: Reduce 5 defunct function stubs (get_qwdata, get_discharge_measurements, get_gwlevels, get_pmcodes, get_water_use) to minimal **kwargs signatures — they only raise NameError, so the full typed parameter lists were dead weight.
  • samples.py: Replace the 230-line get_usgs_samples pass-through (which duplicated 22 parameters and their docstrings verbatim from waterdata.get_samples) with a small **kwargs wrapper.

Performance fixes

  • nwis.py::_read_json: Fix O(n²) pd.concat in a per-site loop — now collects DataFrames in a list and concats once. Also drops an unnecessary str(record_json).replace("'", '"')StringIOpd.read_json round-trip, since record_json is already parsed Python data and pd.DataFrame(record_json) works directly.
  • waterdata/utils.py::get_stats_data: Same collect-then-concat fix for the pagination loop — previously the concat was inside the while next_token: loop.

Bug fixes

  • nwis.py::get_record: Restore the missing elif service == "peaks": dispatch branch. The docstring has long advertised "peaks" as a supported service but the branch was dropped in an old lint/format pass (#164), so get_record(service="peaks") has been raising TypeError: peaks service not yet implemented until now.
  • Demo notebooks (Peaks, Ratings, Statistics, WaterUse under demos/hydroshare/): Fix a SyntaxError from an import line that got joined without a newline (from dataretrieval import nwisfrom dataretrieval import waterdata), apparently introduced by ruff-format commit #219.
  • WaterUse_Examples.ipynb: Comment out two cells that still referenced a pennsylvania variable whose assignment had been commented out as # [Defunct].
  • NWIS_demo_1.ipynb, WaterData_demo.ipynb: nb-clean pass (stripped stale version metadata only).

Test plan

  • pytest tests/ — 116 passed (excluding the untracked ngwmn_test.py)
  • Execute all 16 canonical demo notebooks via jupyter nbconvert --execute against live USGS APIs — 16/16 pass after the fixes (11/16 before)
  • Sanity-check that the _read_json and get_stats_data paths are still exercised by existing tests (tests/nwis_test.py, tests/waterservices_test.py, tests/waterdata_test.py)

🤖 Generated with Claude Code

- nwis.py: reduce 5 defunct function stubs (get_qwdata,
  get_discharge_measurements, get_gwlevels, get_pmcodes, get_water_use)
  to minimal **kwargs signatures since they only raise NameError
- nwis.py: fix O(n^2) pd.concat in _read_json by collecting per-site
  DataFrames in a list and concatenating once; also drop an unnecessary
  str() + pd.read_json round-trip since the source is already parsed
- nwis.py: restore missing `elif service == "peaks"` dispatch branch in
  get_record (lost in an earlier lint/format pass)
- waterdata/utils.py: fix O(n^2) pd.concat in get_stats_data pagination
  loop using the same collect-then-concat pattern
- samples.py: replace the 230-line get_usgs_samples pass-through that
  duplicated 22 parameters verbatim with a **kwargs wrapper
- demos/hydroshare/*Peaks,Ratings,Statistics,WaterUse*.ipynb: fix
  SyntaxError from a joined `from dataretrieval import nwisfrom ...`
  import line; comment out two WaterUse cells that referenced a variable
  from a now-defunct get_water_use call
- nb-clean metadata on NWIS_demo_1 and WaterData_demo notebooks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant