
Add AgenticCodeExecution sample solution - MCP code-execution agents #82

Open
rbogdano wants to merge 13 commits into opea-project:main from rbogdano:feature/agentic-code-execution

Conversation


@rbogdano rbogdano commented Apr 1, 2026

Adds a new sample solution AgenticCodeExecution — a self-contained MCP-based agentic code execution demo supporting retail, airline, stocks, banking, and triage domains.

What's included:

  • Two-server MCP architecture (tools-server + sandbox-server) with per-session DB isolation
  • Flowise 3.0.12 as visual agent UI (included in docker-compose)
  • Auto-download of tau2-bench databases (airline, retail) on first run
  • LLM deployment guide (Enterprise Inference Helm charts + standalone Docker)
  • 5 domain-specific Flowise agent flow templates and system prompts
  • Comprehensive troubleshooting section (proxy, Flowise SSE, NUMA/NRI, vLLM)
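
The per-session DB isolation mentioned above can be sketched as below. This is an illustrative minimal version, not the PR's actual code: the function name, file layout, and copy-on-first-access strategy are assumptions.

```python
import json
import shutil
import tempfile
from pathlib import Path

def get_session_db(base_db: Path, session_id: str, workdir: Path) -> Path:
    """Copy the shared domain DB into a per-session file so concurrent
    sessions never observe each other's writes (hypothetical sketch)."""
    session_db = workdir / f"{session_id}.db.json"
    if not session_db.exists():
        shutil.copy(base_db, session_db)
    return session_db

# Usage sketch: two sessions get independent copies of the same base DB.
workdir = Path(tempfile.mkdtemp())
base_db = workdir / "db.json"
base_db.write_text(json.dumps({"orders": {}}))
db_a = get_session_db(base_db, "session-a", workdir)
db_b = get_session_db(base_db, "session-b", workdir)
```

With this scheme a write through session A's copy leaves session B's view of the database untouched.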

@rbogdano rbogdano force-pushed the feature/agentic-code-execution branch from f82d439 to 794bb42 on April 1, 2026 at 12:56
Comment thread: sample_solutions/AgenticCodeExecution/docker-compose.yml (outdated)
@rbogdano rbogdano force-pushed the feature/agentic-code-execution branch from 5d44018 to 888d65b on April 7, 2026 at 10:30
Comment thread: sample_solutions/AgenticCodeExecution/README.md (outdated)
Comment thread: sample_solutions/AgenticCodeExecution/requirements.txt
Comment thread: sample_solutions/AgenticCodeExecution/examples/banking/data/db.json
Comment thread: sample_solutions/AgenticCodeExecution/examples/stocks/data/db.json
@rbogdano rbogdano force-pushed the feature/agentic-code-execution branch 2 times, most recently from f144c79 to 30f7c67 on April 14, 2026 at 11:55

You need a running vLLM endpoint serving `Qwen/Qwen3-Coder-30B-A3B-Instruct` (or a compatible tool-calling model).
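
A quick way to verify the endpoint is to POST a minimal OpenAI-compatible chat-completions payload. The snippet below only builds and validates the request body; the base URL is a placeholder, not something defined by this PR.

```python
import json

# Placeholder; substitute your actual vLLM service address.
BASE_URL = "http://localhost:8000/v1"

payload = {
    "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "messages": [{"role": "user", "content": "Reply with the word pong."}],
    "max_tokens": 8,
}
body = json.dumps(payload)
# POST `body` to f"{BASE_URL}/chat/completions" with
# Content-Type: application/json, e.g. via curl or the openai client.
```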

### Pre-download model (recommended)
Collaborator:

No need for this step; EI should handle it. You can remove this section.


### Option A: Enterprise Inference (Kubernetes)

Deploy vLLM via the [Enterprise Inference](../../docs/README.md) Helm charts. `Qwen/Qwen3-Coder-30B-A3B-Instruct` is not in the EI pre-validated model menu, but vLLM supports it natively. See the [EI deployment guide](../../docs/README.md) for prerequisites and cluster setup.
Collaborator:

Now that you have added model support in Enterprise Inference, you can remove the statement that the model is not in the pre-validated EI menu. Just say to deploy the model, giving the model name and the EI README link you already have.



#### TP=1 (recommended for simplicity)
Collaborator:

There is CPU pinning and other settings in the helm install command. Ideally EI will handle all of these. Since the model is now added to the list, I feel we can remove all of these TP=1, TP=2, etc. variants.


> The K8s service listens on port **80** (not 8000). Use the URL above without a port number.

#### vLLM v0.16.0: LOGNAME fix
Collaborator:

These things should be taken care of by EI. You can remove this.

```shell
kubectl rollout restart deployment/vllm-qwen3-coder -n default
```

#### Behind a corporate proxy
Collaborator:

This is also taken care of by EI, and there is documentation explaining how to deploy a model behind a corporate proxy. You can remove this section as well.

```shell
kubectl get configmap -n kube-system nri-resource-policy-balloons-config -o yaml | grep -A5 reservedResources
```

### vLLM v0.16.0: `getpwuid(): uid not found: 1001`
Collaborator:

You can remove this.

- Reduce `VLLM_CPU_KVCACHE_SPACE` to `5` or use a smaller model
- With TP=1, use `VLLM_CPU_OMP_THREADS_BIND="all"` to avoid NUMA strict binding

### vLLM: `sched_setaffinity errno: 22`
Collaborator:

This automatically happens in EI. We can remove this.

- URL must use host IP: `http://<host-ip>:5051/sse`
- Check logs: `docker compose logs -f sandbox-server tools-server`

### vLLM OOMKilled (exit code 137)
Collaborator:

We can remove this and just add the machine's minimum requirements, e.g. cores: 32, memory: 128 GB, etc.

Contributor @vivekrsintc left a comment:

📄 Third-Party Notice Required for τ-bench

This solution uses data and domain concepts from τ-bench (tau2-bench) by Sierra Research, licensed under MIT. A third-party notice file should be added.

Suggestion: Create sample_solutions/AgenticCodeExecution/THIRD_PARTY_NOTICES with the following content:

This project includes or references components from the following third-party projects:

τ-bench (tau2-bench)
https://github.com/sierra-research/tau2-bench
License: MIT
Copyright (c) Sierra Research

Contributor:

Please add a top-level _disclaimer field to all bundled data files to clarify these are synthetic:

"_disclaimer": "This is synthetic data generated for testing and demonstration purposes only. No personal or real-world data is used."

The retail/airline db.json files are fetched from tau-bench at runtime — please also note their synthetic nature in the README's Data Attribution section.
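
One way to apply the suggested field across the bundled files is a small script like the sketch below. The helper name and the rebuild-the-dict trick are assumptions; only the `_disclaimer` key and its text come from the comment above.

```python
import json
import tempfile
from pathlib import Path

DISCLAIMER = ("This is synthetic data generated for testing and demonstration "
              "purposes only. No personal or real-world data is used.")

def add_disclaimer(db_path: Path) -> None:
    """Insert _disclaimer as the first top-level key of a db.json file."""
    data = json.loads(db_path.read_text())
    data.pop("_disclaimer", None)
    # dicts preserve insertion order, so rebuilding puts the key on top
    data = {"_disclaimer": DISCLAIMER, **data}
    db_path.write_text(json.dumps(data, indent=2))

# Usage sketch on a throwaway file standing in for a bundled db.json.
db = Path(tempfile.mkdtemp()) / "db.json"
db.write_text(json.dumps({"accounts": []}))
add_disclaimer(db)
```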


Turn 5: execute_python → o = actions.get_order_details("#W003")
Turn 6: execute_python → p = actions.get_product_details("123")
Turn 7: execute_python → p = actions.get_product_details("456")
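
The turn pattern above can be mimicked with a toy sketch: each `execute_python` call runs the agent's snippet with an `actions` namespace in scope and returns the bindings it created. Everything here is a hypothetical stand-in; the real tools-server exposes DB-backed functions and a hardened sandbox.

```python
class Actions:
    """Illustrative stand-in for the domain tool namespace."""
    def __init__(self, db):
        self.db = db

    def get_order_details(self, order_id):
        return self.db["orders"].get(order_id)

def execute_python(code: str, actions: Actions) -> dict:
    """Run one agent turn: exec the snippet with `actions` in scope and
    return the new names it defined (no sandboxing shown here)."""
    scope = {"actions": actions}
    exec(code, scope)
    return {k: v for k, v in scope.items()
            if k not in ("actions", "__builtins__")}

db = {"orders": {"#W003": {"status": "shipped"}}}
turn = execute_python('o = actions.get_order_details("#W003")', Actions(db))
```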

Contributor:

Since this solution includes an execute_python tool that lets the LLM agent generate and run arbitrary Python code, please add a prominent disclaimer near the top of the README, as below:

**Disclaimer:** This is a reference application intended for demonstration and evaluation purposes only. The execute_python tool allows the LLM agent to generate and execute Python code in a sandboxed environment. Review and harden the sandbox security configuration before any production use.
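
For illustration of why hardening matters, one common (and by itself insufficient) step is restricting the builtins available to `exec`. This is a sketch under assumed names, not the PR's actual sandbox implementation.

```python
# Restricted-builtins exec is NOT a complete sandbox: real deployments
# need process-level isolation (containers, seccomp, resource limits).
SAFE_BUILTINS = {"len": len, "range": range, "sum": sum, "min": min, "max": max}

def run_sandboxed(code: str) -> dict:
    """Execute agent code with only an allow-listed set of builtins."""
    scope = {"__builtins__": SAFE_BUILTINS}
    exec(code, scope)
    return scope

scope = run_sandboxed("total = sum(range(5))")
```

With this restriction a snippet calling, say, `open()` fails with `NameError` instead of touching the filesystem.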

sgurunat previously approved these changes Apr 23, 2026
amberjain1 previously approved these changes Apr 23, 2026
@rbogdano rbogdano dismissed stale reviews from amberjain1 and sgurunat via a1168ad April 23, 2026 13:03
MCP-based agentic code execution demo with Flowise, supporting
retail, airline, stocks, banking, and triage domains.

- Two-server MCP architecture: tools-server + sandbox-server
- Flowise 3.0.12 as visual agent UI (docker-compose included)
- Auto-download of tau2-bench databases (airline, retail)
- LLM deployment guide (EI Helm + standalone Docker)
- Comprehensive troubleshooting section
- Per-session database isolation for concurrent users

Source: https://github.com/intel-sandbox/Agentic-Code-Execution
Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
…tead

- Remove Flowise service from docker-compose.yml (use plugins/agenticai)
- Update README to reference EI agenticai plugin for Flowise deployment
- Update Flowise MCP config, troubleshooting sections for K8s context
- Remove FLOWISE_PORT from .env

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
…to table

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
Sync with standalone Agentic-Code-Execution repo refactor:
- Replace tools-server/, system-prompts/, data/ with examples/{domain}/ structure
- Each domain (retail, airline, stocks, banking, triage) is self-contained
- Remove start_all.sh (unused)
- Update docker-compose.yml, Dockerfile, .gitignore, README paths
- Update MCP server imports and default DB/session paths

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
Reference EI deployment guide in Option A section for prerequisites
and cluster setup context.

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
Pin all dependencies to exact versions verified on a working installation:
- requirements.txt: fastmcp==3.2.3, mcp==1.27.0, pydantic==2.13.0, etc.
- examples/requirements.txt: fastmcp==3.2.3, pydantic==2.13.0, uvicorn==0.44.0, starlette==1.0.0
- sandbox-server/requirements.txt: + pydantic-monty==0.0.11

Also fix code-mode git URL (intel-sandbox -> universal-tool-calling-protocol).

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
rbogdano and others added 5 commits April 23, 2026 06:06
…pt updates

- Retail: port dict returns, direct access, and optimized prompts from tau2-bench
- Retail: improvements for Code Execution Retail Agent
- Remove <policy> tags from system prompt (confuses agent)
- Rename Flowise retail agentflow to agentflow_fast_code_execution_retail.json
- Update error_hints.py with improved json.loads guidance
- Sandbox server improvements

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
…ise rename

- Update retail system prompt with best tau2 score version
- Add example retail conversation to README
- Revert Flowise retail agentflow rename back to agentflow_code_execution_retail.json

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
- Add Qwen/Qwen3-Coder-30B-A3B-Instruct to EI model menu (CPU option 27)
- Add model config in xeon-values.yaml (qwen3_coder parser, 10GB KV cache)
- Add deploy/uninstall tasks in deploy-inference-models.yml
- README: retail-first copy-paste flow, remove redundant Flowise summary
- README: add test user credentials (Mia Garcia, Aarav Anderson)

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
- Remove pre-download model section (EI handles this)
- Simplify Option A: single helm install, link to EI docs
- Remove Option B (Standalone Docker) — not needed for EI context
- Remove EI-handled troubleshooting (OOMKilled, sched_setaffinity, LOGNAME, proxy, ECR)
- Add security disclaimer for execute_python sandbox
- Add THIRD_PARTY_NOTICES for τ-bench (MIT, Sierra Research)
- Add _disclaimer field to bundled banking and stocks db.json
- Update Data Attribution with synthetic data notice
- Sync with latest Agentic-Code-Execution upstream changes

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
- README: add hardware requirements table, NUMA considerations section,
  and multi-NUMA warning for EI deployment
- docker-compose: add PYTHON_BASE_IMAGE build arg for both services

Signed-off-by: Rafal Bogdanowicz <rafal.bogdanowicz@intel.com>
@rbogdano rbogdano force-pushed the feature/agentic-code-execution branch from a1168ad to d74d1dd on April 23, 2026 at 13:08