🏆 #1 on OpenAI MLE-Bench
- [6/17/2026] The AIBuildAI Agent 2.5 version is made available to Max users: a live build dashboard, run replay, cross-run memory, MCP research tools, and DeepSeek support. Release notes and download.
- [5/26/2026] The AIBuildAI Agent 2.0 version is made available to Pro users.
- [5/1/2026] In the TGS Salt Identification Challenge hosted by Kaggle, the model automatically developed by our AIBuildAI Agent ranked in the top 5.7%. Among 3,219 teams composed of human experts, this performance reaches the level of top-tier human AI experts. Model and code developed by the Agent: tasks/tgs-salt-identification-challenge.
- [4/27/2026] Excited to announce AIBuildAI Agent 2.0! It has once again achieved #1 on OpenAI's MLE-Bench, reaching a score of 70.7% and substantially outperforming the agents ranked 2nd through 5th. Compared to version 1.0, Agent 2.0 introduces several technical advancements, which we will detail in an upcoming technical report. The 2.0 version will be available to Pro users soon.
- [4/24/2026] In a heart disease prediction competition hosted by Kaggle, the model automatically developed by our AIBuildAI Agent ranked in the top 6.6%. Among 4,370 teams composed of human experts, this performance reaches the level of top-tier human AI experts. Model and code developed by the agent: tasks/playground-series-s6e2.
demo.mp4
AIBuildAI is an AI agent that automatically builds AI models. Given a task, it runs an agent loop that analyzes the problem, designs models, writes code to implement them, trains them, tunes hyperparameters, evaluates model performance, and iteratively improves the models. By automating the model development workflow, AIBuildAI reduces much of the manual effort required to build AI models.
On OpenAI MLE-Bench, AIBuildAI ranked #1, demonstrating strong performance on real-world AI model building tasks.
AIBuildAI requires a Linux x86_64 machine (Ubuntu 20.04 or newer).
There are three versions. V2.5 (Max) is the current, most capable version. V2 (Pro) and V1 (free) remain available.
| Version | Plan | To run it |
|---|---|---|
| V2.5 (Max) — current | Max subscription | aibuildai login (Max plan) + a model API key (Anthropic or DeepSeek) |
| V2 (Pro) | Pro subscription | aibuildai login (Pro plan) + an Anthropic API key |
| V1 | free | an Anthropic API key (no account) |
Subscriptions are managed at accounts.aibuildai.io; see the account page for the available plans.
-
Subscribe. Create an account at accounts.aibuildai.io/sign-up and switch to the Max plan.
-
Install.
curl -fsSL https://github.com/aibuildai/AI-Build-AI/releases/download/v2.5-latest/aibuildai-linux-x86_64.tar.gz | tar xz && ./aibuildai-linux-x86_64-*/install.sh
-
Log in (required before running):
aibuildai login # opens a browser to sign in aibuildai whoami # should show an active Max plan
-
Set your model-provider API key. This is separate from the aibuildai subscription above: the subscription (
aibuildai login) lets you run the product; this key pays for the AI model calls. aibuildai uses Anthropic (Claude) by default, so set your Anthropic API key:export AIBUILDAI_API_KEY=your-anthropic-api-keyDeepSeek is also supported — set a
deepseek-*id in the config'sllm.modeland put your DeepSeek key in the sameAIBUILDAI_API_KEY. -
Run. V2.5 is driven by a YAML config:
aibuildai config > task.yaml # writes a starter config with every field # edit task.yaml: set run.task_name, run.data_root, run.instruction, run.playground_root aibuildai run task.yaml
Other commands:
aibuildai memorize(summarize past runs into memory),aibuildai replay <run-dir>(replay a finished run),aibuildai --help.
-
Subscribe to the Pro plan at accounts.aibuildai.io/sign-up.
-
Install.
curl -fsSL https://github.com/aibuildai/AI-Build-AI/releases/download/v2.0-latest/aibuildai-linux-x86_64.tar.gz | tar xz && ./aibuildai-linux-x86_64-*/install.sh
-
Log in.
aibuildai login aibuildai whoami # should show an active Pro plan -
Set credentials.
export ANTHROPIC_API_KEY=your-api-key -
Run. V2 is driven by command-line flags:
aibuildai run --task-name <name> --data-dir <path> \ --playground-dir <path> --instruction "$(cat task.md)" --no-form
Or run
aibuildaiwith no flags to fill in the parameters in an interactive form.
No account or subscription required.
-
Install.
curl -fsSL https://github.com/aibuildai/AI-Build-AI/releases/download/v1.0-latest/aibuildai-linux-x86_64.tar.gz | tar xz && ./aibuildai-linux-x86_64-*/install.sh
-
Set your Anthropic API key.
export ANTHROPIC_API_KEY=your-api-key -
Run with command-line flags:
aibuildai --task-name <name> --data-dir <path> \ --playground-dir <path> --instruction "$(cat task.md)" --no-form
Or run
aibuildaiwith no flags to fill in the parameters in an interactive form.
Important: run the command directly in your terminal. Do not wrap it in a .sh/.bash script — running it through a script may cause the TUI (Text User Interface) to crash.
After a run completes, the output directory usually looks like (structure may slightly vary by task):
├── candidate_1/ candidate_2/ candidate_3/ # Auto-generated training scripts and model checkpoints
├── checkpoint.pth # Best model checkpoint
├── inference.py # Standalone inference script for the final model
├── submission.csv # Test predictions (if test inputs are provided)
└── progress.pdf # Visual progress report
The main outputs of an AIBuildAI run are the model checkpoints and the script inference.py, which runs predictions with the final model on any data. When the task data folder includes unlabeled test inputs, AIBuildAI also writes a predicted-label file submission.csv.
We provide ready-to-run task markdowns and datasets in the tasks/ folder of this repository. You can also write your own task description and point the run at your own dataset.
git clone https://github.com/aibuildai/AI-Build-AI.gitThis project is licensed under the MIT License - see the LICENSE file for details.
@article{zhang2026aibuildai,
title={AIBuildAI: An AI Agent for Automatically Building AI Models},
author={Ruiyi Zhang and Peijia Qin and Qi Cao and Li Zhang and Pengtao Xie},
year={2026},
journal={arXiv},
url={https://arxiv.org/abs/2604.14455}
}
