
Data-Juicer Q&A Copilot

Q&A Copilot is the question-answering component of Data-Juicer Agents. It runs as an AgentScope-based web service and answers Data-Juicer ecosystem questions with a combination of LLM reasoning, GitHub MCP retrieval, and operator lookup tools.

You can chat with Juicer on the official Data-Juicer documentation site.

Core Components

Agent: ReActAgent-based Q&A service
GitHub MCP Integration: search_repositories, search_code, and get_file_contents
Operator Tools: retrieve_operators_api (llm mode) and get_operator_info
Session Storage: JSON-based storage by default, Redis optional
Web API: REST endpoints for chat, memory, clear, and feedback

Quick Start

Prerequisites

Python >=3.10, <=3.12
DashScope API key
GitHub token
Redis server (only required when SESSION_STORE_TYPE=redis)

Installation

Install dependencies from the parent directory, then return to qa-copilot.

cd ..
uv pip install '.[copilot]'
cd qa-copilot

Export required environment variables.

export DASHSCOPE_API_KEY="your_dashscope_api_key"
export GITHUB_TOKEN="your_github_token"

Optional session storage configuration.

export SESSION_STORE_TYPE="json"  # or "redis"

# JSON mode
export SESSION_STORE_DIR="./sessions"
export SESSION_TTL_SECONDS="21600"
export SESSION_CLEANUP_INTERVAL="1800"

# Redis mode
export REDIS_HOST="localhost"
export REDIS_PORT="6379"
export REDIS_DB="0"
export REDIS_PASSWORD=""
export REDIS_MAX_CONNECTIONS="10"

Optional service configuration.

export DJ_COPILOT_SERVICE_HOST="127.0.0.1"
export DJ_COPILOT_SERVICE_PORT="8080"
export DJ_COPILOT_ENABLE_LOGGING="true"
export DJ_COPILOT_LOG_DIR="./logs"
export FASTAPI_CONFIG_PATH=""
export SAFE_CHECK_HANDLER_PATH=""

Start the service.
```
bash setup_server.sh
```

Runtime Behavior

Model

Default model: qwen3.6-plus
Transport: DashScope OpenAI-compatible endpoint
Streaming: enabled
The runtime truncates context locally through OpenAIChatFormatter.
The provider-side context window is 1M tokens. The local formatter truncates conservatively at 0.8M tokens, which leaves headroom for tokenizer mismatch between DashScope/Qwen serving and the local OpenAI-compatible token counter.
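
The headroom implied by these two limits can be worked out directly. This is a minimal sketch that assumes "1M" and "0.8M" are decimal token counts; the constant names are illustrative and do not come from the qa-copilot source.

```python
# Illustrative arithmetic only. The constant names are assumptions,
# not identifiers from the qa-copilot code.
PROVIDER_CONTEXT_TOKENS = 1_000_000  # "1M" provider-side window (assumed decimal)
LOCAL_TRUNCATION_TOKENS = 800_000    # "0.8M" local formatter limit (assumed decimal)

# Tokens left unused so that a local under-count cannot overflow the provider window.
headroom_tokens = PROVIDER_CONTEXT_TOKENS - LOCAL_TRUNCATION_TOKENS
headroom_ratio = headroom_tokens / PROVIDER_CONTEXT_TOKENS

print(f"headroom: {headroom_tokens} tokens ({headroom_ratio:.0%} of the window)")
```

Under these assumptions the local counter could under-count by up to 200,000 tokens before a request would exceed the provider-side window.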

Mounted Tools

The current QA runtime mounts these tools:

GitHub MCP:
- search_repositories
- search_code
- get_file_contents
Operator tools:
- retrieve_operators_api
- get_operator_info

retrieve_operators_api is wrapped so that QA always uses llm retrieval mode internally.
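
The wrapping described above can be pictured as a thin function that fixes the mode argument and does not expose it to the caller. The sketch below is hypothetical: `retrieve_operators_stub` is a stand-in with an assumed signature, not the real `retrieve_operators_api`, and `retrieve_operators_for_qa` is an illustrative name.

```python
def retrieve_operators_stub(query: str, mode: str = "auto") -> dict:
    """Hypothetical stand-in for the retrieval tool; the real signature may differ."""
    return {"query": query, "mode": mode}


def retrieve_operators_for_qa(query: str) -> dict:
    """Wrapper that exposes only the query, so the mode is always "llm"."""
    return retrieve_operators_stub(query, mode="llm")


result = retrieve_operators_for_qa("filter out very short text samples")
print(result["mode"])
```

Because the wrapper takes no mode parameter, a caller has no way to request a different retrieval mode through it.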

API

1. Q&A Conversation

POST /process
Content-Type: application/json

{
  "input": [
    {
      "role": "user",
      "content": [{"type": "text", "text": "How do I use Data-Juicer for data cleaning?"}]
    }
  ],
  "session_id": "your_session_id",
  "user_id": "user_id"
}
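
The request above could be built from Python with only the standard library. This is a minimal sketch, assuming the service listens on the default host and port listed in this README; the helper names are illustrative, and the response is read as plain text lines because its exact streaming format is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # default host and port from this README


def build_process_request(text: str, session_id: str, user_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST /process request with one user message."""
    payload = {
        "input": [
            {"role": "user", "content": [{"type": "text", "text": text}]}
        ],
        "session_id": session_id,
        "user_id": user_id,
    }
    return urllib.request.Request(
        f"{BASE_URL}/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_process(request: urllib.request.Request, timeout: float = 60.0):
    """Yield response lines as they arrive. Requires a running service."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8", errors="replace")
```

With the service running, `for line in stream_process(build_process_request("How do I use Data-Juicer for data cleaning?", "your_session_id", "user_id")): print(line, end="")` would print the raw response lines.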

2. Get Session History

POST /memory
Content-Type: application/json

{
  "session_id": "your_session_id",
  "user_id": "user_id"
}

3. Clear Session History

POST /clear
Content-Type: application/json

{
  "session_id": "your_session_id",
  "user_id": "user_id"
}
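
Since /memory and /clear take the same body, one helper can cover both. This is a sketch under the same assumptions as the default host and port in this README; `build_session_request` is an illustrative name, not part of the service.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # default host and port from this README


def build_session_request(endpoint: str, session_id: str, user_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /memory or /clear."""
    if endpoint not in ("memory", "clear"):
        raise ValueError(f"unsupported endpoint: {endpoint!r}")
    payload = {"session_id": session_id, "user_id": user_id}
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request once the service is running.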

4. Submit User Feedback

POST /feedback
Content-Type: application/json

{
  "data": {
    "message_id": "message_id_here",
    "feedback_type": "like",
    "comment": "optional user comment"
  },
  "session_id": "your_session_id",
  "user_id": "user_id"
}

Feedback parameters:

message_id: target message id
feedback_type: like or dislike
comment: optional free-form comment
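
A client can validate these parameters before sending. The sketch below builds the /feedback body and rejects any feedback_type other than like or dislike; omitting the comment key when no comment is given is an assumption about what the service accepts, and the function name is illustrative.

```python
from typing import Optional

ALLOWED_FEEDBACK_TYPES = ("like", "dislike")


def build_feedback_payload(
    message_id: str,
    feedback_type: str,
    session_id: str,
    user_id: str,
    comment: Optional[str] = None,
) -> dict:
    """Return the JSON body for POST /feedback as a Python dict."""
    if feedback_type not in ALLOWED_FEEDBACK_TYPES:
        raise ValueError(f"feedback_type must be one of {ALLOWED_FEEDBACK_TYPES}")
    data = {"message_id": message_id, "feedback_type": feedback_type}
    if comment is not None:
        # Assumption: the service tolerates a missing "comment" key.
        data["comment"] = comment
    return {"data": data, "session_id": session_id, "user_id": user_id}
```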

WebUI

You can launch the Runtime WebUI with:

npx @agentscope-ai/chat agentscope-runtime-webui --url http://localhost:8080/process

If you change DJ_COPILOT_SERVICE_PORT, update the WebUI URL accordingly.
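
One way to keep the WebUI URL in step with the service settings is to derive it from the same environment variables. This is an illustrative sketch; `process_url` is a hypothetical helper, and the defaults are the ones listed in this README.

```python
import os
from typing import Mapping, Optional


def process_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the /process URL from the service host and port settings."""
    env = os.environ if env is None else env
    host = env.get("DJ_COPILOT_SERVICE_HOST", "127.0.0.1")
    port = env.get("DJ_COPILOT_SERVICE_PORT", "8080")
    return f"http://{host}:{port}/process"


print(process_url({"DJ_COPILOT_SERVICE_PORT": "9000"}))
```

The printed value is what would be passed to the `--url` option of the WebUI command.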

See AgentScope Runtime WebUI for more details.

Environment Variables

JSON session settings only apply when SESSION_STORE_TYPE=json. Redis settings only apply when SESSION_STORE_TYPE=redis.

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `DASHSCOPE_API_KEY` | ✅ Yes | - | DashScope API key |
| `GITHUB_TOKEN` | ✅ Yes | - | GitHub token for MCP integration |
| `SESSION_STORE_TYPE` | ❌ No | `"json"` | Session storage type: `"json"` or `"redis"` |
| `SESSION_STORE_DIR` | ❌ No | `"./sessions"` | Session file directory in JSON mode |
| `SESSION_TTL_SECONDS` | ❌ No | `21600` | Session TTL in JSON mode |
| `SESSION_CLEANUP_INTERVAL` | ❌ No | `1800` | Cleanup interval in JSON mode |
| `REDIS_HOST` | ❌ No | `"localhost"` | Redis host in Redis mode |
| `REDIS_PORT` | ❌ No | `6379` | Redis port in Redis mode |
| `REDIS_DB` | ❌ No | `0` | Redis database number |
| `REDIS_PASSWORD` | ❌ No | unset | Redis password |
| `REDIS_MAX_CONNECTIONS` | ❌ No | `10` | Redis max connections |
| `DJ_COPILOT_SERVICE_HOST` | ❌ No | `"127.0.0.1"` | Service host |
| `DJ_COPILOT_SERVICE_PORT` | ❌ No | `8080` | Service port |
| `DJ_COPILOT_ENABLE_LOGGING` | ❌ No | `"true"` | Enable session logging |
| `DJ_COPILOT_LOG_DIR` | ❌ No | `qa-copilot/logs` | Log directory. If unset, logs are written under the `logs` directory next to `session_logger.py` |
| `FASTAPI_CONFIG_PATH` | ❌ No | `""` | Optional FastAPI config JSON file |
| `SAFE_CHECK_HANDLER_PATH` | ❌ No | `""` | Optional safe-check handler module |
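
The rule that JSON and Redis settings apply only in their own mode can be sketched as a small loader. This is not the service's actual configuration code: `SessionStoreSettings` and `load_session_settings` are hypothetical names, and only the variable names and defaults are taken from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SessionStoreSettings:
    store_type: str
    options: dict


def load_session_settings(env: Optional[Mapping[str, str]] = None) -> SessionStoreSettings:
    """Read only the settings that apply to the selected storage type."""
    env = os.environ if env is None else env
    store_type = env.get("SESSION_STORE_TYPE", "json")
    if store_type == "json":
        options = {
            "dir": env.get("SESSION_STORE_DIR", "./sessions"),
            "ttl_seconds": int(env.get("SESSION_TTL_SECONDS", "21600")),
            "cleanup_interval": int(env.get("SESSION_CLEANUP_INTERVAL", "1800")),
        }
    elif store_type == "redis":
        options = {
            "host": env.get("REDIS_HOST", "localhost"),
            "port": int(env.get("REDIS_PORT", "6379")),
            "db": int(env.get("REDIS_DB", "0")),
            # An empty or missing password is treated as "no password".
            "password": env.get("REDIS_PASSWORD") or None,
            "max_connections": int(env.get("REDIS_MAX_CONNECTIONS", "10")),
        }
    else:
        raise ValueError(f"unsupported SESSION_STORE_TYPE: {store_type!r}")
    return SessionStoreSettings(store_type=store_type, options=options)
```

Calling `load_session_settings()` with no argument reads from the process environment; passing a dict makes the behaviour easy to check in isolation.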

Troubleshooting

Common Issues

Redis connection failure when SESSION_STORE_TYPE=redis
- Run redis-cli ping and confirm that it returns PONG
- Verify REDIS_HOST, REDIS_PORT, REDIS_DB, and REDIS_PASSWORD
MCP startup failure
- Ensure GITHUB_TOKEN is exported
- Confirm the token has the required access for GitHub MCP
DashScope authentication or quota failure
- Verify DASHSCOPE_API_KEY
- Check Model Studio quota and model availability
Custom config or safe-check handler not loading
- Verify FASTAPI_CONFIG_PATH points to a valid JSON file
- Verify SAFE_CHECK_HANDLER_PATH points to an importable Python module
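
Several of these checks can be automated before starting the service. The following preflight sketch is hypothetical and not part of qa-copilot: it verifies that the required variables are set, that the Redis port accepts TCP connections in Redis mode, and that FASTAPI_CONFIG_PATH parses as JSON. It does not validate tokens, quotas, or the safe-check handler.

```python
import json
import os
import socket
from typing import List, Mapping, Optional


def preflight(env: Optional[Mapping[str, str]] = None, check_network: bool = True) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    env = os.environ if env is None else env
    problems: List[str] = []

    # Required credentials.
    for name in ("DASHSCOPE_API_KEY", "GITHUB_TOKEN"):
        if not env.get(name):
            problems.append(f"{name} is not set")

    # Redis reachability (TCP only, no authentication check).
    if check_network and env.get("SESSION_STORE_TYPE", "json") == "redis":
        host = env.get("REDIS_HOST", "localhost")
        port = int(env.get("REDIS_PORT", "6379"))
        try:
            with socket.create_connection((host, port), timeout=2.0):
                pass
        except OSError as exc:
            problems.append(f"cannot reach Redis at {host}:{port}: {exc}")

    # Optional FastAPI config must be readable, valid JSON.
    config_path = env.get("FASTAPI_CONFIG_PATH", "")
    if config_path:
        try:
            with open(config_path, encoding="utf-8") as handle:
                json.load(handle)
        except (OSError, ValueError) as exc:
            problems.append(f"FASTAPI_CONFIG_PATH is not a valid JSON file: {exc}")

    return problems
```

Running `print(preflight())` in the shell where the variables were exported lists anything that still needs attention.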

Acknowledgments

Parts of the service scaffolding and MCP integration were adapted from AgentScope Samples - Alias.

License

This project uses the same license as the main project. See LICENSE for details.
