Add compositional analysis code and example by beyazit-y · Pull Request #57 · BerkeleyLearnVerify/VerifAI

beyazit-y · 2025-11-11T04:06:28Z

No description provided.

dfremont · 2026-04-17T17:33:25Z

+
+* Train a reinforcement learning policy using `train.py`.
+* Test the policy and save generated traces using `test.py`.
+* See `analyze.py` for an example of how to perform compositional analysis on generated traces.


If I understand correctly, just running these three scripts in a row won't work: analyze.py looks for files which are only created by run_exp.py. Please give an example of how to run that script. (Or do you want people to run exps.sh instead? That file isn't explained either.)

Even after running exps.sh I get the following error from analyze.py:

Traceback (most recent call last; use -b to show Scenic internals): File "/Users/dfremont/Documents/Berkeley/repositories/BerkeleyLearnVerify/VerifAI/examples/compositional_analysis/analyze.py", line 18, in <module> scenario_base = ScenarioBase(logs, delta=0.01) File "/Users/dfremont/Documents/Berkeley/repositories/BerkeleyLearnVerify/VerifAI/src/verifai/compositional_analysis.py", line 38, in __init__ raise FileNotFoundError(f"CSV file for scenario '{name}' not found: {path}") FileNotFoundError: CSV file for scenario 'S' not found: storage/traces/S/traces.csv

dfremont · 2026-04-17T17:41:39Z

+    """
+    delta = 1 - confidence_level
+    n = math.log(2 / delta) / (2 * error_bound ** 2)
+    return int(math.ceil(n))


math.ceil already returns an int

dfremont · 2026-04-17T17:42:26Z

+
+    # Launch all processes
+    processes = []
+    start_time = time.time()


Timings should use time.monotonic() so as not to be affected by changes to the system clock.

dfremont · 2026-04-17T17:46:11Z

+                with open(csv_path, 'r') as f:
+                    lines = f.readlines()
+
+                if len(lines) <= 1:  # Only header or empty
+                    current_count = 0
+                else:
+                    # Count unique trace_ids in final file
+                    trace_ids = set()
+                    for line in lines[1:]:  # Skip header
+                        parts = line.split(',')
+                        if parts:
+                            trace_ids.add(parts[0])
+                    current_count = len(trace_ids)


This code looks basically identical to the code in the previous loop -- please consolidate into a helper function.

dfremont · 2026-04-17T17:46:54Z

+            else:
+                # Process completed successfully
+                # Count episodes in completed file
+                with open(csv_path, 'r') as f:


Likewise, let's use a helper function (maybe the same one) for all this CSV reading stuff.

dfremont · 2026-04-17T17:47:55Z

+    return results
+
+
+def parse_scenario(input_scenario):


What is the purpose of this function? It looks equivalent to set(input_scenario).

dfremont · 2026-04-17T17:49:47Z

+    print("\n" + "="*60)
+    print("SUMMARY")
+    print("="*60)
+    print(f"Command: python compare_analysis.py --scenario \"{input_scenario}\" "


This would be easier to read as a triple-quoted f-string (you wouldn't need to escape the quotes or keep writing f" at the start of each line).

dfremont · 2026-04-17T17:50:18Z

+
+
+if __name__ == "__main__":
+    import argparse


Should be at top of file (there's no need to defer this import).

dfremont · 2026-04-17T17:52:51Z

+    assert args.confidence_level is not None
+    assert (args.error_bound is not None) or (args.time_budget is not None)


These are provided by the user, so you should raise an exception if they are invalid, not fail an assertion. For confidence_level, if it is required you can just tell argparse that.

dfremont · 2026-04-17T17:54:57Z

+from train import make_env
+from functools import partial
+import matplotlib.pyplot as plt
+from stable_baselines3 import PPO


New dependencies such as this one will need to be added to the examples extra in pyproject.toml so that pip install verifai[examples] gets everything needed to run the examples.

dfremont · 2026-04-17T17:55:53Z

+from stable_baselines3.common.utils import set_random_seed
+from metadrive.component.map.pg_map import MapGenerateMethod
+from metadrive.utils.draw_top_down_map import draw_top_down_map
+from stable_baselines3.common.vec_env.subproc_vec_env import SubprocVecEnv


In this and all other files, please arrange imports in 3 blocks separated by 1 blank line: Python stdlib packages, external packages, and imports from VerifAI itself.

dfremont · 2026-04-17T17:56:53Z

+    # while True:
+    #     env=make_env(scenario=args.scenario, monitor=False)
+    #     env.reset()
+    #     ret = draw_top_down_map(env.current_map)
+    #     # ret = env.render(mode="topdown", window=False)
+    #     # ret = env.render(mode="topdown",
+    #     #                  window=False,
+    #     #                  # screen_size=(600, 600),
+    #     #                  # camera_position=(50, 50)
+    #     #                  )
+    #     env.close()
+    #     plt.axis("off")
+    #     plt.imshow(ret)
+    #     plt.show()
+    #     clear_output()


Either delete this or move it into a separate function for debugging.

dfremont · 2026-04-17T18:00:13Z

+        csv_path = os.path.join(args.save_dir, "traces.csv")
+        if os.path.exists(csv_path):
+            os.remove(csv_path)
+        f = open(csv_path, "w", newline="")


Should use finally: to ensure this file is closed even if an exception is raised. You'll probably want to move the main loop into a helper function so that the try: isn't huge.

dfremont · 2026-04-17T18:02:33Z

+        help="Scenario string")
+    args = parser.parse_args()
+
+    # while True:


Remove debugging code

dfremont · 2026-04-17T18:03:29Z

+    """
+
+    if not expert:
+        assert model_path is not None, "You must provide a valid model_path (.zip file)"


Raise ValueError instead

dfremont · 2026-04-17T18:04:02Z

+
+    if expert:
+        # print("USING EXPERT POLICY")
+        from metadrive.policy.expert_policy import ExpertPolicy


Is there a reason to defer this import? If not, it should be at the top of the file.

dfremont · 2026-04-17T18:04:29Z

+        f.close()
+
+    env.close()
+    return


dfremont · 2026-04-17T18:06:19Z

+from train import make_env
+
+
+def generate_traces(


This function seems almost the same as the main code in test.py. Can you use it there, with appropriate options? If not, it would be good to consolidate as much as possible into common helper functions.

dfremont · 2026-04-17T18:11:49Z

+            t_last = df_t.sort_values("step").groupby("trace_id").tail(1)
+
+            # KDE features
+            if features:


Promote this check out of the loop (i.e. write if not features: and raise the exception before the loop).

dfremont · 2026-04-17T18:13:39Z

+                    continue
+
+            # KDE features
+            if features:


As above. In fact, can some of the code of check and falsify be unified?

dfremont · 2026-04-17T18:19:44Z

+    parser.add_argument(
+        "--n-envs",
+        type=int,
+        default=16,


This is a big default (remember people may try this on their laptops). I'd suggest either a smaller constant, like 4, or using the total number of logical cores on the system minus 1.

dfremont · 2026-04-17T18:24:16Z

+This example uses the [MetaDrive](https://metadriverse.github.io/metadrive/) simulator.
+
+* Train a reinforcement learning policy using `train.py`.
+* Test the policy and save generated traces using `test.py`.


Running this I got the following warning which would be good to investigate:

[WARNING] env.vehicle will be deprecated soon. Use env.agent instead (base_env.py:734)

dfremont · 2026-04-17T18:26:24Z

+        scenario: List[str],
+        features: Optional[List[str]] = None,
+        center_feat_idx: Optional[List[int]] = None,
+        bw_method: str | int = 10,


This type union syntax was not added until Python 3.10, but VerifAI supports 3.8 and 3.9 (and in fact Alejandro's example requires 3.9). Please fix and test your code on 3.8 and 3.9. (This would have been caught by tox if you had any test cases, which I'll mention in the top-level review.)

dfremont

Hi Beyazit, thanks for making this PR! With a few changes I was able to run everything except the final analysis (see comments), and the code looks OK overall. A few bigger things to improve:

There should be unit tests for the new functionality. They can be basic, not using a neural network or anything (we want the tests to run without optional dependencies such as stable baselines anyway), but should exercise the main code paths and check that the output is reasonable.
There should be a new page in the documentation describing the compositional analysis features. It can explain the example in more detail (for example, what do S, X, SXS, etc. actually mean) and also explain how to use the top-level APIs ScenarioBase and CompositionalAnalysisEngine. The docstrings for those are a good starting point (and you can auto-generate documentation from them, as we do in docs/feature_api.rst for example), but they need to be fleshed out: for example, the analysis methods accept a list of scenario names, but the example only ever calls them with a single string.

If you can help me solve the issue I had when running analyze.py I will try to run the example again. Thanks!

dfremont · 2026-04-17T19:31:56Z

+    parser.add_argument(
+        "--save-dir",
+        type=str,
+        default="storage",


Let's also add a .gitignore file in this folder saying to ignore storage; otherwise running the example makes Git think you've made a bunch of changes.

beyazit-y and others added 21 commits November 10, 2025 20:05

Add compositional analysis code and example

6e99c5d

Use expert policy to generate traces if model is not given

faf0b60

Fix compositional SMC

ff693da

Fix compositional falsification

ffae171

Add exp script

54c88bd

Fix exp code

221fbeb

Fix exp code

5735c24

Add exp bash script

b112585

Add exp bash script

95e3176

Add exp bash script

9539352

Add exp bash script

45c2b39

Add exp bash script

76cd260

Add exp bash script

0e3ba2f

Add exp bash script

90972cb

Add exp bash script

c47d027

Add exp bash script

332b85b

Add some results

113f743

Add some results

f0dbeec

Lower error bound in exps

3493635

Add new results

68da456

Fix typo

93c686e

kevinchang73 approved these changes Apr 6, 2026

View reviewed changes

Remove results

5225a1d

beyazit-y requested a review from kevinchang73 April 8, 2026 18:01

kevinchang73 approved these changes Apr 8, 2026

View reviewed changes

dfremont reviewed Apr 17, 2026

View reviewed changes

Comment thread examples/compositional_analysis/utils.py

f.close()

env.close()

return

Copy link
Copy Markdown

Collaborator

dfremont Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant

dfremont reviewed Apr 17, 2026

View reviewed changes

dfremont requested changes Apr 17, 2026

View reviewed changes

dfremont reviewed Apr 17, 2026

View reviewed changes

		assert args.confidence_level is not None
		assert (args.error_bound is not None) or (args.time_budget is not None)

Conversation

beyazit-y commented Nov 11, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dfremont left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants