AI Code Mentor is a runtime environment designed for building applications with integrated autonomous AI agents. These applications are defined as workflows, written in Markdown files (.wf.md), which are interpreted and executed by the Code Mentor system.
The AI agents (multiple instances are supported) generate and execute commands (e.g., Bash shell commands) directly, feeding the output back to the AI for analysis and iteration. This allows the agents to iteratively improve on their tasks when necessary. Currently, OpenAI's platform is integrated, with plans to add alternative AI models in the future.
- Local Execution: Data remains local and is processed inside a Docker container, avoiding unnecessary cloud uploads.
- Autonomous Execution: AI-generated commands are executed automatically within the container, eliminating the need for manual copy-pasting.
- Multi-Agent Support: Multiple AI agents can operate simultaneously within the same workflow, distributing tasks efficiently.
- Structured Workflow Definition: Workflows define clear goals, paths, branches, and alternative solutions, providing better AI context compared to free-text prompting.
- Scalability & Automation: Execute workflows on multiple targets, e.g., grading 100+ source code submissions automatically.
- Traceability & Transparency: All AI-generated commands and their results are logged, providing visibility into the execution process.
AI Code Mentor enables autonomous evaluation of source code projects based on program requirements and specifications. The AI agents:
- Analyze student submissions.
- Provide feedback.
- Assign grades automatically.
- Iterate if necessary to refine evaluations.
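The generate-execute-feedback cycle described above can be sketched roughly as follows. This is an illustrative simplification, not the actual CodeMentor implementation; the function names and the stub "agent" are made up, and a real agent would call an LLM API to derive the next command from the transcript.

```python
import subprocess

def run_agent_step(generate_command, transcript):
    """One hypothetical iteration of the loop: ask the AI for a shell
    command, execute it, and feed the output back via the transcript."""
    cmd = generate_command(transcript)  # in reality: an LLM API call
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    transcript.append((cmd, result.stdout + result.stderr))
    return result.returncode

# Stub agent that ignores the transcript and always emits the same command;
# a real agent would analyze previous outputs to decide the next step.
transcript = []
rc = run_agent_step(lambda t: "echo analysing submission", transcript)
```

The transcript of (command, output) pairs is what gives the agent context for its next iteration, and it doubles as the traceability log mentioned above.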
This is particularly useful for software development educators, reducing manual grading effort while maintaining fairness and transparency.
For more details, see AI CodeMentor - Automating the Evaluation of Programming Assignments.
Latest News in version 0.1.5
- Added batch-processing, e.g. to run benchmarks
- AI-Agent configuration parameter support, e.g. temperature, n_top, f_penalty, ...
- AI-Agent telemetry recording, e.g. tokens used, duration, iterations
- Added support for Google Gemini and Anthropic Claude as AI agents
For the version history, see app/version.py.
- Integrate locally running AI agents (using Hugging Face)
- create agent instead of "Prompt: System"
- Fix COTH mixup with reasoning AI-Agents
- Implement the PLaG technique (see docs/literature/PLaG.md) as a sample; it should fit CodeMentor well
- Enhance AI agent feedback loops for self-improvement: adapt the temperature setting, starting at 0 and increasing it on the IMPROVE path
- Implement workflow for fhtw-bif5-swkom-paperless
- Workflow Validation
- Implement command execution whitelists/blacklists and a reputation mechanism for security.
- Develop a collaboration model for AI agents.
- Introduce a server mode with a REST API (eliminating volume mount dependencies).
- CodeRunner Integration? (as docker-container)
- Follow Google's AI Agent definition:
- Integration with external systems.
- Session-based interactions with multi-turn inference.
- Native tool integration within the AI agent architecture.
- AI agents learn from feedback without external intervention.
- AI-generated output is inherently non-deterministic, requiring flexible error-handling mechanisms.
- Unlike structured programming, AI-driven execution lacks formal grammars, requiring robust parsers for interpreting agent outputs.
- AI-generated outputs should be iteratively improved rather than statically parsed.
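The last point above, tolerating non-deterministic output, can be illustrated with a small parser sketch. This is not CodeMentor's actual parser; it is a minimal example, assuming the agent is expected to reply with a shell command that may arrive fenced, inline-quoted, or bare:

```python
import re
from typing import Optional

def extract_command(reply: str) -> Optional[str]:
    """Tolerantly pull a shell command out of an AI reply.  The model may
    wrap it in a ```bash fence, in backticks, or emit it bare, so try the
    strictest pattern first and fall back gradually."""
    fence = re.search(r"```(?:bash|sh)?\s*\n(.*?)```", reply, re.DOTALL)
    if fence:
        return fence.group(1).strip()
    inline = re.search(r"`([^`\n]+)`", reply)
    if inline:
        return inline.group(1).strip()
    lines = reply.strip().splitlines()
    return lines[0].strip() if lines else None
```

In an iterative setup, a None result (or a command that fails validation) would trigger a re-prompt rather than a hard failure, which is what "iteratively improved rather than statically parsed" amounts to in practice.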
- Install Docker: Get Docker
- Create an account and API credentials with the cloud AI vendor of your choice (e.g. OpenAI, Google, or Anthropic): see docs/setup.md
- Create a .env file in the docker/ directory, based on the docker/.env.sample file.
Linux:
bin/run_codementor.sh [options] <workflow-file.md> [<key=value> ...]
Windows:
bin\run_codementor.ps1 [options] <workflow-file.md> [<key=value> ...]
For more details, see bin/README.md.
| Option | Description |
|---|---|
| -h, --help | Show help message |
| -v, --version | Display version info |
| --verbose | Show log output in console |
- <workflow-file.md>: Markdown file defining the workflow (e.g., workflows/check-toolchain.wf.md).
- [<key=value> ...]: Optional key-value parameters passed to the workflow.
Run the check-toolchain workflow with a specific REPO_URL:
Linux:
bin/run_codementor-java.sh workflows/check-toolchain.wf.md REPO_URL=https://github.com/BernLeWal/fhtw-bif5-swkom-paperless.git
Windows:
PS > bin\run_codementor-java.ps1 workflows/check-toolchain.wf.md REPO_URL=https://github.com/BernLeWal/fhtw-bif5-swkom-paperless.git
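The batch/scalability scenario mentioned earlier (e.g., grading 100+ submissions) can be sketched as a small driver script. The helper below is hypothetical, not the shipped batch-processing feature, and the repository URLs are made up:

```python
def batch_commands(workflow, repo_urls):
    """Build one CLI invocation per submission (hypothetical helper;
    the built-in batch-processing feature may expose this differently)."""
    return [
        ["bin/run_codementor.sh", workflow, f"REPO_URL={url}"]
        for url in repo_urls
    ]

# Illustrative run over two made-up student repositories; in real use each
# command list would be passed to subprocess.run() instead of printed.
repos = [
    "https://github.com/student-a/paperless.git",
    "https://github.com/student-b/paperless.git",
]
for cmd in batch_commands("workflows/check-toolchain.wf.md", repos):
    print(" ".join(cmd))
```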
- Use forward slashes (/) in file paths, since the application runs in a Linux-based environment.
- The console (stdout/stderr) is reserved for CLI output.
- Logs are stored in: log/codementor.log
- Docker logs remain empty unless running with --verbose or in --server mode.
See the docs/tutorial to get started with creating your own workflows.
├── app               # The CodeMentor sources (Python)
├── artwork           # Logos, etc.
├── bin               # Scripts to run (and build) the CodeMentor
├── docker            # Docker environment in which the CodeMentor will run
│   ├── codementor        # - CodeMentor can execute BASH (and Python) commands
│   └── codementor-java   # - CodeMentor can execute BASH commands and has a Java21+Maven environment
├── docs              # Documentation
├── log               # AI CodeMentor will output the application logs into that directory
├── output            # AI CodeMentor will output the trace-files here, for detailed investigations
├── test              # Unit-Tests for the Python application
└── workflows         # Contains the workflow files, which AI CodeMentor will execute
Install Python & Dependencies
sudo apt install python3 python3-pip -y
pip install virtualenv
python -m virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt
- Set PYTHONPATH before running:
  - Linux (Bash): export PYTHONPATH=$(pwd)
  - Windows (PowerShell): $env:PYTHONPATH = (Get-Location).Path
- Run the application with a workflow file:
  python app/main.py workflows/check-toolchain.wf.md
- Help & Usage:
  python app/main.py -h
- Run the test suite (Pytest is used for the unit tests; start it from the project root directory):
  pytest
python app/main.py workflows/source-eval/paperless-sprint1.wf.md REPO_URL=https://github.com/BernLeWal/fhtw-bif5-swkom-paperless.git
For software architecture and implementation details, see docs/README.md.
User documentation and tutorials are located in docs/tutorial.
This project is licensed under the AGPLv3 open-source license.