feat(evaluation): add option to save eval results to CSV by vaibhav-patel · Pull Request #6182 · google/adk-python

vaibhav-patel · 2026-06-22T09:54:07Z

Summary

Adds an optional, opt-in way to persist AgentEvaluator results to a CSV file, as requested in #2652. A new output_file parameter is added to both AgentEvaluator.evaluate and AgentEvaluator.evaluate_eval_set. When provided, per-invocation results for every metric (both passing and failing) are written to the given path; when omitted (the default), behavior is unchanged.

Motivation

Running agent evals from pytest currently only prints detailed tables (and only for failing metrics). Users want to save results to disk for later inspection/tracking. This implements the CSV suggestion from #2652, adapted to the current evaluation architecture (the codebase has changed substantially since the issue was filed).

What changed

New parameter output_file: Optional[str] = None on AgentEvaluator.evaluate and AgentEvaluator.evaluate_eval_set.
Results are flattened to one row per metric per invocation. Columns: eval_set_id, eval_id, metric_name, threshold, score, eval_status, prompt, expected_response, actual_response, expected_tool_calls, actual_tool_calls.
Parent directory is created if needed; rows are appended (header written once) so evaluating a directory of .test.json files accumulates results in a single file.
Reuses existing content/tool-call formatting helpers; CSV writing uses pandas, already declared in the eval optional dependencies.
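
For readers who want a picture of the flatten-then-append behavior described above, here is a minimal sketch. It is not the code in this PR: it uses the standard-library csv module instead of pandas to stay dependency-free, and the helper names (flatten_results, append_rows) and the shape of the per-invocation dicts are assumptions made up for illustration. Only the column names come from the list above.

```python
import csv
import os
from typing import Any, Optional

# Column order taken from the PR description.
COLUMNS = [
    "eval_set_id", "eval_id", "metric_name", "threshold", "score",
    "eval_status", "prompt", "expected_response", "actual_response",
    "expected_tool_calls", "actual_tool_calls",
]


def flatten_results(
    eval_set_id: str,
    eval_id: str,
    metric_name: str,
    threshold: float,
    per_invocation: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Hypothetical helper: build one row per invocation for a single metric.

    Each item is assumed to look like
    {"prompt", "score", "eval_status", "expected": {...} or None, "actual": {...}}.
    """
    rows = []
    for item in per_invocation:
        # The expected invocation may be missing; fall back to empty cells.
        expected: dict[str, Any] = item.get("expected") or {}
        actual: dict[str, Any] = item.get("actual") or {}
        rows.append({
            "eval_set_id": eval_set_id,
            "eval_id": eval_id,
            "metric_name": metric_name,
            "threshold": threshold,
            "score": item.get("score"),
            "eval_status": item.get("eval_status"),
            "prompt": item.get("prompt", ""),
            "expected_response": expected.get("response", ""),
            "actual_response": actual.get("response", ""),
            "expected_tool_calls": expected.get("tool_calls", ""),
            "actual_tool_calls": actual.get("tool_calls", ""),
        })
    return rows


def append_rows(rows: list[dict[str, Any]], output_file: Optional[str]) -> None:
    """Hypothetical helper: append rows, writing the header only once."""
    if not output_file:
        return  # Feature is opt-in: no path, no file.
    parent = os.path.dirname(output_file)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Write the header only when the file is new or empty.
    needs_header = (
        not os.path.exists(output_file) or os.path.getsize(output_file) == 0
    )
    with open(output_file, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        if needs_header:
            writer.writeheader()
        writer.writerows(rows)
```

Calling append_rows repeatedly with the same path, once per test file, yields a single CSV with one header line followed by all accumulated rows.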

Backward compatibility

Fully backward-compatible. The feature is disabled unless output_file is set.

Testing

Added tests/unittests/evaluation/test_agent_evaluator.py covering row flattening, the missing-expected-invocation case, single-file writing with directory creation, and append-without-duplicate-header. All pass; the full tests/unittests/evaluation/ suite still collects cleanly.

Fixes #2652.

Add an optional `output_file` parameter to `AgentEvaluator.evaluate` and `AgentEvaluator.evaluate_eval_set`. When set, per-invocation evaluation results for every metric (both passing and failing) are flattened and written to the given path as a CSV file, making it easy to persist and inspect results from pytest-based eval runs. The option is disabled by default, so existing behavior is unchanged. The parent directory is created if needed, and rows are appended so results from a directory of test files accumulate in a single file. CSV writing reuses the existing text/tool-call formatting helpers and relies on pandas, which is already part of the `eval` optional dependencies. Fixes google#2652.

google-cla · 2026-06-22T09:54:25Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

vaibhav-patel · 2026-06-22T10:00:43Z

@googlebot I signed it!

chore: re-trigger CI checks

aa74e0d

vaibhav-patel wants to merge 2 commits into google:main from vaibhav-patel:fix/2652-agentevaluator-csv-export