[rid] Add lock on subscriptions #1523

Draft
the-glu wants to merge 1 commit into interuss:master from Orbitalize:fix_1509

the-glu (Member) commented Jun 12, 2026

This PR adds a lock on RID subscriptions when working on ISAs.

It fixes #1509 (or at least improves the situation).

This follows the same logic as the one in SCD, preventing (or at least reducing) the cascade of transaction restarts observed through tracing with the parameters / code in #1509 (comment).

image_2026-06-12_09-31-14
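
To illustrate why taking a lock first can cut down on restarts, here is a deliberately simplified toy model, not the DSS code. It assumes the worst case where every concurrent transaction touches the same subscription rows and only one of them can commit per round. The function names are hypothetical, and a real database will behave less pessimistically than this.

```python
def optimistic_attempts(n: int) -> int:
    """Total attempts when n transactions conflict on the same rows.

    Toy assumption: in each round every pending transaction runs,
    exactly one commits, and all the others restart.
    """
    attempts = 0
    remaining = n
    while remaining > 0:
        attempts += remaining  # every pending transaction runs this round
        remaining -= 1         # one commits, the rest restart
    return attempts


def locked_attempts(n: int) -> int:
    """Total attempts when each transaction first locks the shared rows.

    Transactions queue on the lock instead of restarting, so each one
    runs exactly once.
    """
    return n
```

In this model the optimistic case grows quadratically with the number of parallel requests while the locked case grows linearly, at the price of requests waiting in a queue.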

⚠️ This has the same issues as the locks in SCD:

  • A large number of subscriptions will make it slow
  • A global lock / per-cell locks are not implemented but could be good tradeoffs

However, ISAs don't create an automatic subscription, so the case with a lot of subscriptions may not be relevant.
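
As a rough sketch of what per-cell locking could look like (this PR does not implement it), the in-memory lock table below stands in for what would be locks in the database. `CellLocks` and its methods are hypothetical names. A global lock is the degenerate case where every writer passes the same single key.

```python
import threading
from contextlib import contextmanager


class CellLocks:
    """Hypothetical in-memory lock table keyed by cell ID."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the dict itself
        self._locks = {}

    def _lock_for(self, cell):
        # Create the per-cell lock lazily, exactly once per cell.
        with self._guard:
            return self._locks.setdefault(cell, threading.Lock())

    @contextmanager
    def hold(self, cells):
        # Acquire in sorted order so two writers with overlapping cells
        # cannot deadlock by taking the same locks in opposite orders.
        ordered = [self._lock_for(c) for c in sorted(set(cells))]
        for lock in ordered:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(ordered):
                lock.release()
```

Writers whose cells do not overlap never wait on each other here, which is the tradeoff against a single global lock: more bookkeeping, less queueing.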

This has been verified locally by running the tests as described previously (parameters + a specific non-merged version of the monitoring repo). The timeouts present before are gone.

image

Before:

Running configuration(s): configurations.dev.netrid_concurrency
warning: `--no-sync` has no effect when used outside of a project
2026-06-12 07:20:13.900 | INFO     | __main__:main:304 - ========== Running uss_qualifier for configuration configurations.dev.netrid_concurrency ==========
2026-06-12 07:20:13.921 | INFO     | __main__:run_config:194 - Validating configuration...
2026-06-12 07:20:14.425 | DEBUG    | __main__:sign:102 - Computing signatures of inputs
2026-06-12 07:20:14.425 | INFO     | __main__:run_config:238 - Test definition description:
Codebase version: Orbitalize/monitoring/v0.28.0-d8e6f8a2-dirty
Commit hash: d8e6f8a2545311b46a4a895721d768828dbc9977
Baseline signature: TB-34ee356 34ee356e3719d0a1485e3e03dbd7cf865fd0348fba1d9fa4b686cc45f0ee836a
Environment signature: TE-aedb222 aedb222c12ad1e2e2d3a57cca66c2c6c6a31f4343631d4946db48d37a6a71757
2026-06-12 07:20:14.425 | INFO     | __main__:run_config:260 - Executing test run
2026-06-12 07:20:14.425 | INFO     | __main__:execute_test_run:127 - Instantiating resources
2026-06-12 07:20:14.429 | INFO     | __main__:execute_test_run:140 - Instantiating top-level test suite action
2026-06-12 07:20:16.039 | INFO     | __main__:execute_test_run:149 - Running top-level test suite action
2026-06-12 07:20:16.039 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 07:20:44.427 | WARNING  | monitoring.monitorlib.fetch.fetch_async:query_and_describe:136 - query_and_describe attempt 1 from PID 10 to PUT /dss/identification_service_areas/00000176-e36d-40be-8d38-beca6ca30007 failed with exception TimeoutError: 
At File "/app/monitoring/uss_qualifier/scenarios/astm/netrid/common/dss/heavy_traffic_concurrent.py", line 271, in _create_isa
2026-06-12 07:20:56.427 | WARNING  | monitoring.monitorlib.fetch.fetch_async:query_and_describe:136 - query_and_describe attempt 1 from PID 10 to PUT /dss/identification_service_areas/00000176-e36d-40be-8d38-beca6ca30021 failed with exception TimeoutError: 
At File "/app/monitoring/uss_qualifier/scenarios/astm/netrid/common/dss/heavy_traffic_concurrent.py", line 271, in _create_isa
2026-06-12 07:21:09.427 | WARNING  | monitoring.monitorlib.fetch.fetch_async:query_and_describe:136 - query_and_describe attempt 1 from PID 10 to PUT /dss/identification_service_areas/00000176-e36d-40be-8d38-beca6ca30038 failed with exception TimeoutError: 
At File "/app/monitoring/uss_qualifier/scenarios/astm/netrid/common/dss/heavy_traffic_concurrent.py", line 271, in _create_isa
2026-06-12 07:21:20.427 | WARNING  | monitoring.monitorlib.fetch.fetch_async:query_and_describe:136 - query_and_describe attempt 2 from PID 10 to PUT /dss/identification_service_areas/00000176-e36d-40be-8d38-beca6ca30038 failed with exception TimeoutError: 
At File "/app/monitoring/uss_qualifier/scenarios/astm/netrid/common/dss/heavy_traffic_concurrent.py", line 271, in _create_isa
2026-06-12 07:21:26.371 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 1 to http://dss1.uss1.localutm/rid/v2 with default scopes <none> last used 138295.944389596 (now 138352.944969718)

After:

(Notice there is an unrelated error on subscriber notification)

./monitoring/uss_qualifier/run_locally.sh configurations.dev.netrid_concurrency
make: 'image' is up to date.
Running configuration(s): configurations.dev.netrid_concurrency
warning: `--no-sync` has no effect when used outside of a project
2026-06-12 07:53:42.031 | INFO     | __main__:main:304 - ========== Running uss_qualifier for configuration configurations.dev.netrid_concurrency ==========
2026-06-12 07:53:42.054 | INFO     | __main__:run_config:194 - Validating configuration...
2026-06-12 07:53:42.656 | DEBUG    | __main__:sign:102 - Computing signatures of inputs
2026-06-12 07:53:42.657 | INFO     | __main__:run_config:238 - Test definition description:
Codebase version: Orbitalize/monitoring/v0.28.0-d8e6f8a2-dirty
Commit hash: d8e6f8a2545311b46a4a895721d768828dbc9977
Baseline signature: TB-34ee356 34ee356e3719d0a1485e3e03dbd7cf865fd0348fba1d9fa4b686cc45f0ee836a
Environment signature: TE-aedb222 aedb222c12ad1e2e2d3a57cca66c2c6c6a31f4343631d4946db48d37a6a71757
2026-06-12 07:53:42.657 | INFO     | __main__:run_config:260 - Executing test run
2026-06-12 07:53:42.657 | INFO     | __main__:execute_test_run:127 - Instantiating resources
2026-06-12 07:53:42.660 | INFO     | __main__:execute_test_run:140 - Instantiating top-level test suite action
2026-06-12 07:53:44.333 | INFO     | __main__:execute_test_run:149 - Running top-level test suite action
2026-06-12 07:53:44.333 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 07:53:47.278 | WARNING  | monitoring.uss_qualifier.suites.suite:_print_failed_check:75 - New failed check:
  details: Attempting to notify subscriber for ISA 00000176-e36d-40be-8d38-beca6ca30046
    at https://testdummy.interuss.org/interuss/monitoring/uss_qualifier/scenarios/astm/netrid/common/dss/heavy_traffic_concurrent/preexisting_sub
    resulted in 999
  documentation_url: https://github.com/interuss/monitoring/blob/d8e6f8a2545311b46a4a895721d768828dbc9977/monitoring/uss_qualifier/scenarios/astm/netrid/v22a/dss/test_steps/clean_workspace.md#notified-subscriber-check
  name: Notified subscriber
  participants: []
  query_report_timestamps:
  - '2026-06-12T07:53:47.272384Z'
  requirements:
  - astm.f3411.v22a.NET0730
  severity: High
  summary: Could not notify ISA subscriber
  timestamp: '2026-06-12T07:53:47.277997Z'
  
2026-06-12 07:54:04.835 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 07:54:04.835 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 07:55:00.528 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 07:55:00.528 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 07:55:01.833 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 1 to http://dss1.uss1.localutm/rid/v2 with default scopes <none> last used 140311.407245411 (now 140368.407564035)
2026-06-12 07:55:57.532 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 2 to http://dss2.uss1.localutm/rid/v2 with default scopes <none> last used 140367.101519989 (now 140424.106835682)
2026-06-12 07:55:59.602 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 07:55:59.602 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 07:56:56.602 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 3 to http://dss3.uss1.localutm/rid/v2 with default scopes <none> last used 140426.176028091 (now 140483.176294125)
2026-06-12 07:57:09.192 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 4 to http://dss1.uss2.localutm/rid/v2 with default scopes <none> last used 140438.766193421 (now 140495.766413843)
2026-06-12 07:58:34.770 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 4 to http://dss1.uss2.localutm/rid/v2 with default scopes <none> last used 140524.344532964 (now 140581.344883151)
2026-06-12 07:59:23.202 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 07:59:23.202 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 08:00:20.200 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 4 to http://dss1.uss2.localutm/rid/v2 with default scopes <none> last used 140629.774583328 (now 140686.774837416)
2026-06-12 08:00:31.724 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 5 to http://dss2.uss2.localutm/rid/v2 with default scopes <none> last used 140641.298082505 (now 140698.298351829)
2026-06-12 08:01:58.163 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 5 to http://dss2.uss2.localutm/rid/v2 with default scopes <none> last used 140727.736776077 (now 140784.737117289)
2026-06-12 08:02:52.036 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 08:02:52.036 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 08:03:49.035 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 5 to http://dss2.uss2.localutm/rid/v2 with default scopes <none> last used 140838.60913193 (now 140895.609494248)
2026-06-12 08:04:00.958 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 6 to http://dss3.uss2.localutm/rid/v2 with default scopes <none> last used 140850.532617472 (now 140907.532837481)
2026-06-12 08:05:28.534 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 6 to http://dss3.uss2.localutm/rid/v2 with default scopes <none> last used 140938.108235326 (now 140995.108507731)
2026-06-12 08:06:23.254 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 08:06:23.254 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 08:07:20.254 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 6 to http://dss3.uss2.localutm/rid/v2 with default scopes <none> last used 141049.827818052 (now 141106.828250085)
2026-06-12 08:07:34.315 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 7 to http://dss1.uss3.localutm/rid/v2 with default scopes <none> last used 141063.889556253 (now 141120.889712562)
2026-06-12 08:09:02.469 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 7 to http://dss1.uss3.localutm/rid/v2 with default scopes <none> last used 141152.04297001 (now 141209.043263422)
2026-06-12 08:09:59.126 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 08:09:59.127 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 08:10:56.126 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 7 to http://dss1.uss3.localutm/rid/v2 with default scopes <none> last used 141265.699955304 (now 141322.700228459)
2026-06-12 08:11:08.858 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 8 to http://dss2.uss3.localutm/rid/v2 with default scopes <none> last used 141278.431888446 (now 141335.432050845)
2026-06-12 08:12:36.282 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 8 to http://dss2.uss3.localutm/rid/v2 with default scopes <none> last used 141365.856158436 (now 141422.856454844)
2026-06-12 08:13:34.934 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 08:13:34.934 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:158 - Running "ASTM NetRID DSS: Concurrent Requests" scenario...
2026-06-12 08:14:31.933 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 8 to http://dss2.uss3.localutm/rid/v2 with default scopes <none> last used 141481.507452336 (now 141538.507695863)
2026-06-12 08:14:44.417 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 9 to http://dss3.uss3.localutm/rid/v2 with default scopes <none> last used 141493.990931223 (now 141550.991194007)
2026-06-12 08:16:15.118 | DEBUG    | monitoring.monitorlib.infrastructure:close_if_idle:179 - Closing idle UTMClientSession 9 to http://dss3.uss3.localutm/rid/v2 with default scopes <none> last used 141584.691693192 (now 141641.691981483)
2026-06-12 08:17:09.222 | INFO     | monitoring.uss_qualifier.suites.suite:_run_test_scenario:188 - "ASTM NetRID DSS: Concurrent Requests" scenario completed
2026-06-12 08:17:09.222 | INFO     | __main__:execute_test_run:151 - Top-level test suite action complete
2026-06-12 08:17:09.222 | DEBUG    | monitoring.uss_qualifier.reports.artifacts:generate_artifacts:41 - Writing artifacts to /app/monitoring/uss_qualifier/output/netrid_concurrency
2026-06-12 08:17:09.222 | INFO     | monitoring.uss_qualifier.reports.artifacts:generate_artifacts:55 - Redacting access tokens from report
2026-06-12 08:17:12.819 | INFO     | monitoring.uss_qualifier.reports.artifacts:make_raw_report:63 - Writing raw report to output/netrid_concurrency/report.json
2026-06-12 08:17:12.838 | INFO     | monitoring.uss_qualifier.reports.artifacts:make_sequence_view:115 - Writing sequence view to output/netrid_concurrency/sequence
2026-06-12 08:17:12.847 | INFO     | monitoring.uss_qualifier.reports.artifacts:make_timing_report:143 - Writing timing report to output/netrid_concurrency/timing
2026-06-12 08:17:13.109 | INFO     | monitoring.uss_qualifier.reports.artifacts:make_timing_report:146 - Wrote timing report in 0.3s
2026-06-12 08:17:13.329 | INFO     | monitoring.uss_qualifier.reports.artifacts:make_raw_report:72 - Wrote raw report in 0.5s
2026-06-12 08:17:14.587 | INFO     | monitoring.uss_qualifier.reports.artifacts:make_sequence_view:121 - Wrote sequence view in 1.7s
2026-06-12 08:17:14.651 | INFO     | __main__:main:323 - ========== Completed uss_qualifier for configuration configurations.dev.netrid_concurrency ==========

the-glu (Member, Author) commented Jun 18, 2026

I switched this one to draft. Running tests with the newly proposed tool in interuss/monitoring#1519, with 27 queries in parallel, shows these results:

image
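
For readers who want to reproduce the general shape of such a run, the "N queries in parallel" pattern can be sketched as below. This is not the tool from interuss/monitoring#1519: `run_batch` and `fake_put_isa` are hypothetical names, and the fake request only sleeps instead of calling a DSS.

```python
import asyncio
import time


async def run_batch(send, n_parallel):
    """Start n_parallel calls to `send` at once.

    Returns the duration of each call in seconds, in call order.
    """
    async def timed(i):
        start = time.perf_counter()
        await send(i)
        return time.perf_counter() - start

    return await asyncio.gather(*(timed(i) for i in range(n_parallel)))


async def fake_put_isa(i):
    # Hypothetical stand-in for a PUT on an ISA; a real run would issue
    # an HTTP request to the DSS here.
    await asyncio.sleep(0.01)


if __name__ == "__main__":
    durations = asyncio.run(run_batch(fake_put_isa, 27))
    print(f"{len(durations)} requests, slowest took {max(durations):.3f}s")
```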

Things seem faster without latency, but as soon as it increases things explode, and that's probably due to a reason similar to SCD: too much contention.

Trying again with the minimum number of queries in parallel, 9, we get:

image

Same problem, but with higher latency.

Not sure how to decide between the different tradeoffs.

With this PR, requests can be processed faster (especially under light load), but there is a point (dependent on latency) where it's worse.
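
One back-of-the-envelope way to see why the crossover depends on latency: while a lock is held, every round trip inside the transaction pays the latency, and queued requests wait for all of that. The helper below is only this arithmetic. Its name and the number of round trips are assumptions, not measurements from this PR.

```python
def worst_case_wait(n_queued: int, round_trips: int, latency_s: float) -> float:
    """Time until the last of n_queued requests finishes, in seconds.

    Assumes all requests queue on the same lock and each one holds it
    for `round_trips` database round trips of `latency_s` each.
    """
    return n_queued * round_trips * latency_s


# Example with assumed values: 27 queued requests, 4 round trips each.
for latency in (0.001, 0.05):
    print(latency, worst_case_wait(27, 4, latency))
```

Under these assumptions the queueing delay scales linearly with latency, so a lock that is cheap on a local network can dominate once round trips become slow.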

Development

Successfully merging this pull request may close these issues.

[NRID] DSS 5xx Errors during Concurrent Requests USS Qualifier Test Scenario