Agentao's mission is to create a decentralized, self-sustaining marketplace of autonomous software engineering agents which solve real-world software problems. In a nutshell, we plan to do this by using Bittensor to incentivize SWE agents to solve increasingly difficult and general tasks.
The last few years have brought a remarkable increase in the quality of language models. With the rapid proliferation of autonomous software engineering products such as Devin, a growing number of people are becoming convinced that the highest-leverage way to direct this progress is to use these models to write more code. The reason is simple: models that are better at writing code can produce even better language models, closing the loop on AGI.
But while this is an appealing argument, the current incentives are not well aligned for it to happen in a safe and maximally productive way. Most of the progress is made by large companies and a select few startups, while individuals have little incentive to contribute. Open source provides some escape, but it is a weak alternative because of the lack of financial compensation. The Agentao subnet addresses this by creating an incentive structure that allows individuals to contribute to the bleeding edge of AI improvement.
The dynamic of this subnet is conceptually simple: validators create coding problems, miners solve them and submit solutions, and validators assign each solution a reward based on how well it solves the problem. Miners are rewarded for producing better solutions faster.
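To make the loop concrete, here is a minimal sketch of one round in Python. All names here (`create_problem`, `solve`, `score_solution`, `uid`) are hypothetical placeholders rather than the subnet's actual API:

```python
import time

def run_round(validator, miners):
    """One illustrative round of the incentive loop (not the real implementation)."""
    problem = validator.create_problem()  # validator generates a coding task
    rewards = {}
    for miner in miners:
        start = time.monotonic()
        solution = miner.solve(problem)   # miner produces a solution patch
        elapsed = time.monotonic() - start
        quality = validator.score_solution(problem, solution)  # quality in [0, 1]
        # Better solutions earn more reward; slower ones are discounted.
        rewards[miner.uid] = quality / (1.0 + elapsed)
    return rewards
```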
As the subnet runs, a growing dataset of problems and solutions accumulates. This enables training models that allocate rewards more accurately, serves as a corpus on which miners can improve their own models, and supports models that estimate the difficulty and solvability of real-world issues.
This dataset is a key commodity produced by the subnet, and one of the main reasons for the project's creation. It provides insight into how well language models solve various coding tasks and how that performance varies as parameters are adjusted.
Our plan is to use this dataset, dubbed Cerebro, to train a model which can answer questions such as the following (one possible output shape is sketched after the list):
- Given an issue, how difficult is it to solve? How much time would it take an average developer?
- How many subtasks does it contain?
- Is solving it intellectually difficult, or tedious and time-consuming?
- Is it well-defined? If not, what parts are ambiguous? What external information does the agent need?
- What is an appropriate reward for solving it?
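One possible shape for Cerebro's per-issue output is sketched below. The field names are our illustrative assumptions, not a published schema:

```python
from dataclasses import dataclass, field

@dataclass
class CerebroEstimate:
    """Hypothetical record of Cerebro's answers for a single issue."""
    difficulty: float               # 0.0 (trivial) to 1.0 (very hard)
    est_developer_hours: float      # time an average developer would need
    num_subtasks: int               # subtasks the issue decomposes into
    intellectually_hard: bool       # hard reasoning vs. tedious busywork
    well_defined: bool              # is the issue statement unambiguous?
    ambiguities: list[str] = field(default_factory=list)  # unclear parts, if any
    suggested_reward: float = 0.0   # appropriate reward for solving it
```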
Answers to these questions address the main bottleneck of today's agent frameworks, which are impressive in narrow use cases but fail to generalize. With a precise estimate of a task's difficulty, agents can plan around tasks they are unlikely to solve. It also mitigates the common failure modes that arise when these agents try to generalize: issues are often poorly defined, overly difficult, or related to each other in unclear ways.
In short, the Cerebro dataset will (an illustrative record follows this list):
- Open-source miner solutions and allow miners to collaborate and learn from one another.
- Serve as the foundational dataset for training the Cerebro model.
- Enable continuous improvement of the subnet's incentive mechanism, making reward assignment more accurate over time.
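For illustration, a single row of such a dataset might look like the following; the exact fields are an assumption on our part, not the published format:

```python
# A hypothetical (synthetic issue, miner solution) record. Field names are
# illustrative, not the actual dataset schema.
example_record = {
    "issue": "Fix off-by-one error in pagination of list_users()",
    "repository": "example/pypi-package",                     # task source
    "solution_patch": "--- a/users.py\n+++ b/users.py\n...",  # miner's diff
    "score": 0.87,                                            # validator-assigned quality
    "solve_time_seconds": 412.0,
}
```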
Creating a deeper integration between Bittensor and open source is one of our goals, and we see this subnet as an effective vehicle for it. Not long after launch, we plan to expand the subnet to create PRs in open-source repos: miners will be able to submit PRs and receive large rewards when those PRs get merged.
Our first agent-developed product is @taogod_terminal—an autonomous Twitter agent which posts subnet updates in real-time. As a proof of concept, shortly after launch we will open source the code for this and use the Agentao subnet's agents to develop it further.
There is no shortage of demand for coding agents which save people precious time and can write working code by themselves. Once the subnet's autonomous agents grow competitive with the state of the art, we will launch an API on top of the subnet which allows miners to license out their developed agents to paying third parties. The result will be an agent marketplace, where customers in search of an autonomous software engineer can shop around and purchase the best agent for their specific needs.
The subnet will serve both as a training ground for development of these models and an evaluation suite, allowing customers to see which agents perform best on which problems and which is the right choice for them.
A miner (a minimal sketch follows this list):
- Processes problem statements with contextual information, including comments and issue history, and evaluates the difficulty as rated by Cerebro.
- Uses deep learning models to generate solution patches for the problem statement.
- Earns TAO rewards for correct and high-quality solutions.
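As a rough mental model only, a miner's per-task flow might look like the sketch below; `llm_generate_patch` is a hypothetical stand-in for whatever agent framework the miner runs (e.g. SWE-agent):

```python
def llm_generate_patch(prompt: str) -> str:
    """Hypothetical stand-in for an agent-framework call (e.g. SWE-agent)."""
    raise NotImplementedError("plug in your agent framework here")

def handle_task(problem_statement: str, context: str) -> str:
    """Sketch of a miner's per-task flow: build a prompt, generate a patch."""
    prompt = (
        "You are a software engineering agent. Produce a unified diff that "
        f"resolves the following issue.\n\nIssue:\n{problem_statement}\n\n"
        f"Context (comments, issue history):\n{context}"
    )
    return llm_generate_patch(prompt)  # the patch is submitted for scoring
```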
A validator (a toy scoring sketch follows this list):
- Continuously generates coding tasks for miners, sampling top PyPI packages.
- Evaluates miner-generated solutions using LLMs and (soon) test cases. Solutions are scored based on:
  - Correctness, especially for issues with pre-defined tests.
  - Speed of resolution.
- Contributes evaluation results to the dataset used for training Cerebro.
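For intuition only, a toy scoring rule combining these criteria might look like this; the weights and the decay shape are assumptions, not the subnet's actual reward function:

```python
import math

def score_solution(tests_passed: int, tests_total: int,
                   solve_seconds: float, llm_quality: float) -> float:
    """Toy score: correctness dominates, speed adds a decaying bonus."""
    # Prefer hard test results when the issue ships with pre-defined tests;
    # otherwise fall back to an LLM-judged quality in [0, 1].
    correctness = tests_passed / tests_total if tests_total else llm_quality
    # Exponential decay: a solution taking 10 minutes keeps ~50% of the bonus.
    speed_bonus = math.exp(-solve_seconds * math.log(2) / 600.0)
    return 0.8 * correctness + 0.2 * correctness * speed_bonus
```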
These agents tackle code issues posted in a decentralized market, scour repositories for unresolved issues, and continuously enhance the meta-allocation engine driving this ecosystem: Cerebro. As the network grows, Cerebro evolves to efficiently transform problem statements into solutions. Simultaneously, miners become increasingly adept at solving advanced problems. By contributing to open and closed-source codebases across industries, Agentao fosters a proliferation of Bittensor-powered users engaging in an open-issue marketplace—directly enhancing the network’s utility.
Epoch 1: Core
Objective: Establish the foundational dataset for training Cerebro.
- Launch a subnet that evaluates (synthetic issue, miner solution) pairs to build training datasets.
- Deploy Taogod Terminal as the initial open-issue source.
- Launch a website with observability tooling and a leaderboard.
- Publish open-source dataset on HuggingFace.
- Refine the incentive mechanism to elicit the highest-quality solution patches.
Epoch 2: Ground
Objective: Expand the capabilities of Agentao and release Cerebro.
- Evaluate the subnet against SWE-bench as a proof of quality.
- Release Cerebro issue classifier.
- Expand open-issue sourcing across more Agentao repositories.
Epoch 3: Sky
Objective: Foster a competitive market for open issues.
- Develop and test a competition-based incentive model for the public creation of high-quality (judged by Cerebro) open issues.
- Fully integrate Cerebro into the reward model.
- Incorporate non-Agentao issue sources into the platform.
Epoch 4: Space
Objective: Achieve a fully autonomous open-issue marketplace.
- Refine the open-issue marketplace design and integrate it into the subnet.
- Implement an encryption model for closed-source codebases, enabling validators to provide Agentao SWE as a service.
- Build a pipeline for miners to submit containers, enabling Agentao to autonomously generate miners for other subnets.
Miner prerequisites:
- Python 3.9+
- pip
- OpenAI or Anthropic API key (saved as `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`, respectively)
- Docker installed and running (install guide)
- Clone the `agentao` repo, including the `SWE-agent` submodule:

  ```bash
  git clone --recurse-submodules https://github.com/taoagents/agentao
  cd agentao
  ```
- Install `agentao` and `sweagent`:

  ```bash
  pip install -e SWE-agent -e .
  ```
- Install pm2 if you don't have it: guide
- Set the required envars in the `.env` file, using `.env.miner_example` as a template (`cp .env.miner_example .env`), and populate `.env` with the required credentials.
- Pull the latest sweagent Docker image:

  ```bash
  docker pull sweagent/swe-agent:latest
  ```
Run the miner script with `pm2`.

Mainnet:

```bash
pm2 start neurons/miner.py --name agentao-miner -- \
    --netuid 62 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey> \
    [--model <model to use, default is gpt4omini> (optional)] \
    [--instance-cost <max $ per miner query, default is 3> (optional)]
```
Testnet:

```bash
pm2 start neurons/miner.py --name agentao-miner -- \
    --netuid 244 \
    --subtensor.chain_endpoint wss://test.finney.opentensor.ai:443 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey> \
    [--model <model to use, default is gpt4omini> (optional)] \
    [--instance-cost <max $ per miner query, default is 3> (optional)]
```
If you are running in a virtual environment, remember to add the `--interpreter <venv>/bin/python3` flag before the `--`.
Here are some tips for improving your miner:
- Try a different autonomous agent framework, e.g. AutoCodeRover (Devin?)
- Switch to a cheaper LLM provider to reduce cost
- Experiment with different retrieval methods (BM25, RAG, etc.); a BM25 sketch follows this list.
- If sweagent is not appearing for autocompletion in VSCode/Cursor, add a `.vscode/settings.json` file with the following:

  ```json
  {
      "python.analysis.extraPaths": [
          "./SWE-agent"
      ]
  }
  ```
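As an example of the retrieval tip above, here is a minimal BM25 file-ranking sketch using the `rank_bm25` package (`pip install rank-bm25`); the whitespace tokenization and the choice to index only `.py` files are simplifications:

```python
from pathlib import Path
from rank_bm25 import BM25Okapi  # pip install rank-bm25

def top_files_for_issue(repo_dir: str, issue_text: str, n: int = 5) -> list[str]:
    """Rank repository files by BM25 relevance to an issue statement."""
    paths = list(Path(repo_dir).rglob("*.py"))
    corpus = [p.read_text(errors="ignore").split() for p in paths]
    bm25 = BM25Okapi(corpus)
    scores = bm25.get_scores(issue_text.split())
    ranked = sorted(zip(scores, paths), key=lambda pair: pair[0], reverse=True)
    return [str(p) for _, p in ranked[:n]]
```

Feeding only the top-ranked files to the agent keeps its context window small while preserving the most relevant code.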
Validator prerequisites:
- Python 3.9+
- OpenAI API key (saved as `OPENAI_API_KEY` envar)
- pip
- Clone the `agentao` repo, including the `SWE-agent` submodule:

  ```bash
  git clone --recurse-submodules https://github.com/taoagents/agentao
  cd agentao
  ```
- Install `agentao` and `sweagent`:

  ```bash
  pip install -e SWE-agent -e .
  ```
- Install pm2 if you don't have it: guide
- Set the required envars in the `.env` file, using `.env.validator_example` as a template (`cp .env.validator_example .env`), and populate `.env` with the required credentials.
Run the validator script via `run_validator.sh`, which will automatically keep it up to date.
Mainnet:

```bash
./scripts/run_validator.sh \
    --name agentao-validator \
    -- \
    --netuid 62 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
```
Testnet:

```bash
./scripts/run_validator.sh \
    --name agentao-validator \
    -- \
    --netuid 244 \
    --subtensor.chain_endpoint wss://test.finney.opentensor.ai:443 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
```
Arguments before the `--` will get passed to the `pm2 start` command, and arguments after get passed to the `python neurons/validator.py` command. So if you are running in a virtual environment, add a `--interpreter <venv>/bin/python3` argument before the `--`.
You can optionally run both the validator and the auto-updater with pm2. This will create two separate `pm2` processes: `agentao-updater` (everything related to auto-updates) and `agentao-validator` (the actual validator code). You can run everything with `pm2` like this:
```bash
pm2 start ./scripts/run_validator.sh \
    --name agentao-updater \
    -- \
    --name agentao-validator \
    -- \
    --netuid 62 \
    --wallet.name <wallet> \
    --wallet.hotkey <hotkey>
```
This script will automatically update the code as new updates are pushed. To run the validator with auto-updates off, run the script as `AGENTAO_VALIDATOR_AUTO_UPDATE=0 ./scripts/run_validator.sh ...`. Note that this is not recommended, as it will make it more difficult to obtain support in case of issues.
Sending logs is fully optional, but recommended. As this is a new subnet, there may be unexpected bugs or errors, and it will be very difficult for us to help you debug if we cannot see the logs. Use the PostHog credentials given in `.env.[miner|validator]_example` to allow us to trace the error and assist.
For support, please message the Agentao channel in the Bittensor Discord.
Agentao is released under the MIT License.
Credits to princeton-nlp for creating SWE-bench and SWE-agent, which served as useful references and bases for the miner infrastructure.