[Nadiad] Yugkumar Mistry — Vibe Coding Submission by Yug-Mistry · Pull Request #9 · nasscomAI/rag-to-mcp

Yug-Mistry · 2026-04-14T10:47:41Z

RAG-to-MCP — Submission PR

Name: Yugkumar Nadiad
City / Group: Nadiad
Date: 14 April 2026
AI tool(s) used: GitHub Copilot (Claude Sonnet 4.6)

Submission Checklist

UC-0A — Complaint Classifier

Which failure mode did you encounter first?

Taxonomy drift — the naive prompt invented category names like "Road Issue" and "Drainage Problem" instead of using the exact schema values. The same complaint type received different labels across rows.

Which enforcement rule fixed it? Quote from your agents.md:

"Category must be exactly one value from the allowed list: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. No variations or invented names."

Your commit message for UC-0A:

UC-0A Generated agents.md and skills.md from README, implemented classifier

Verification checkpoints:

All severity-signal rows (injury/child/school/hospital keywords) classified as Urgent
No invented categories outside the defined taxonomy
Justification column present and non-empty for every row
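
The three checkpoints above can be checked mechanically against the classifier output. The sketch below is illustrative only: the row keys (`description`, `category`, `priority`, `justification`) and the `check_row` helper are assumptions, not names taken from the submission. The category list is copied from the quoted agents.md rule.

```python
import re

# Allowed values copied from the agents.md rule quoted in this PR.
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}

# Severity signals named in the first checkpoint. Prefix match, so
# "children" and "schools" also count; "injured" does not match "injury".
SEVERITY_PATTERN = re.compile(r"\b(?:injury|child|school|hospital)", re.IGNORECASE)


def check_row(row: dict) -> list[str]:
    """Return the checkpoint violations for one output row (empty if clean).

    The row keys used here are hypothetical column names.
    """
    problems = []
    if row["category"] not in ALLOWED_CATEGORIES:
        problems.append(f"invented category: {row['category']!r}")
    if SEVERITY_PATTERN.search(row["description"]) and row["priority"] != "Urgent":
        problems.append("severity signal present but priority is not Urgent")
    if not row["justification"].strip():
        problems.append("empty justification")
    return problems
```

Running `check_row` over every output row and failing on any non-empty result would catch taxonomy drift such as "Road Issue" before the file is committed.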

UC-RAG — RAG Server

Which failure mode did you encounter?

Chunk boundary failure — policy clause 5.2 ("requires approval from the Department Head and the HR Director") was split across two fixed-size chunks. Neither chunk alone contained the complete dual-approver obligation, so retrieval returned an incomplete answer.

What chunking strategy did you use and why?

Sentence-boundary chunking: text is split on sentence-ending punctuation and sentences are accumulated until the 400-token limit is reached. If adding the next sentence would exceed the limit, the current chunk is flushed first and the sentence opens a new chunk. This keeps every sentence whole, so a single-sentence clause such as 5.2 is never split across chunks, whatever its length.
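
A minimal sketch of the accumulate-and-flush strategy described above. It assumes a whitespace word count as a stand-in for the real tokenizer and a simple punctuation regex for sentence splitting. `count_tokens` and `chunk_by_sentence` are hypothetical names, not functions from the submitted server.

```python
import re


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def chunk_by_sentence(text: str, max_tokens: int = 400) -> list[str]:
    """Group whole sentences into chunks of at most max_tokens tokens."""
    # Split after ., ! or ? when followed by whitespace. "5.2" stays intact
    # because no whitespace follows its period.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Flush first if the next sentence would push the chunk over the limit.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

In this sketch, a single sentence longer than `max_tokens` becomes its own oversized chunk rather than being cut. Whether the submitted implementation handles that case the same way is not stated.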

Did your system correctly refuse "What is the flexible working culture?"?

Yes — no chunk scored above the 0.3 threshold for this query. The refusal template was returned with all retrieved chunk sources listed and no LLM call was made.
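
The refusal path described above can be sketched as a gate that runs before any generation step. The 0.3 threshold is taken from this PR. The `Retrieved` type, the `generate` callback and the refusal wording are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

SCORE_THRESHOLD = 0.3  # value reported in this PR


@dataclass
class Retrieved:
    source: str        # e.g. "policy_it_acceptable_use.txt"
    chunk_index: int
    score: float       # similarity score, higher is better
    text: str


def answer_or_refuse(
    query: str,
    retrieved: list[Retrieved],
    generate: Callable[[str, list[Retrieved]], str],
) -> dict:
    """Call the LLM only when at least one chunk scores above the threshold."""
    passing = [c for c in retrieved if c.score > SCORE_THRESHOLD]
    if not passing:
        # Refusal wording is hypothetical; sources are listed for transparency.
        sources = ", ".join(f"{c.source} chunk {c.chunk_index}" for c in retrieved)
        return {
            "refused": True,
            "answer": f"This question is not covered by the indexed documents. Sources checked: {sources}",
        }
    return {"refused": False, "answer": generate(query, passing)}
```

Because the gate returns before `generate` is invoked, a refused query costs no LLM call, which matches the behaviour reported above.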

Did your system retrieve the correct document for "Can I use my personal phone for work files?"?

Yes — top retrieved chunks were policy_it_acceptable_use.txt chunk 0 and policy_it_acceptable_use.txt chunk 1. No HR leave chunks appeared in the passing set.

Which enforcement rule in agents.md prevented answers outside retrieved context?

"Answers must use only information present in the retrieved chunks. Never add context, assumptions, or qualifications from outside the retrieved set."

Your commit message for UC-RAG:

UC-RAG Generated agents.md and skills.md from README, implemented RAG server

Verification checkpoints:

At least 3 test queries return grounded answers (cited from retrieved context)
"What is the flexible working culture?" returns the refusal template (not a hallucinated answer)
"Can I use my personal phone for work files?" retrieves IT policy, not HR leave policy
Chunking produces more than 1 chunk per document (not whole-document embedding)

UC-MCP — MCP Server

Paste your tool description from mcp_server.py TOOL_DEFINITION:

"Answers questions about City Municipal Corporation (CMC) policy documents: HR Leave Policy, IT Acceptable Use Policy, and Finance Reimbursement Policy. Returns answers grounded in retrieved document chunks with cited sources. Questions outside these three documents return a refusal message — this tool does not answer general knowledge questions, budget forecasts, or topics not covered by the indexed CMC policy documents."

Does it state the document scope explicitly?

Yes — names all three policy documents and explicitly states what the tool will not answer.
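
For context, a rough sketch of where such a description sits in an MCP tool definition. The `name`, `description` and `inputSchema` fields follow the MCP tool schema. The tool name `query_policy_documents` and the `question` parameter are hypothetical, and the description string is abbreviated from the one quoted above.

```python
# Hypothetical TOOL_DEFINITION layout; only the description text comes from this PR.
TOOL_DEFINITION = {
    "name": "query_policy_documents",  # assumed name, not from the submission
    "description": (
        "Answers questions about City Municipal Corporation (CMC) policy documents: "
        "HR Leave Policy, IT Acceptable Use Policy, and Finance Reimbursement Policy. "
        "Returns answers grounded in retrieved document chunks with cited sources. "
        "Questions outside these three documents return a refusal message."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "question": {  # assumed parameter name
                "type": "string",
                "description": "A question about the indexed CMC policy documents.",
            }
        },
        "required": ["question"],
    },
}
```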

Run result: python test_client.py --run-all

✅ tools/list — tool discovered with correct scope description
✅ In-scope: "Who approves leave without pay?" — answer returned
✅ Cross-doc: "Can I use my personal phone for work files?" — answer returned
✅ Out-of-scope: "What is the budget forecast for 2025?" — correctly refused
✅ Unknown method → -32601 error returned

Did the budget forecast question return isError: true?

Yes — no chunk scored above 0.3 for this query. The refusal template was returned with isError: true and no LLM call was made.
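
The results above map onto a small JSON-RPC dispatcher, sketched here under assumptions. `handle_request`, the `answer_fn` callback and the placeholder tool entry are hypothetical. The `-32601` code is the standard JSON-RPC "Method not found" error, and `content` plus `isError` follow the MCP `tools/call` result shape.

```python
from typing import Callable

# Placeholder tool entry so the sketch is self-contained (hypothetical name).
TOOL_DEFINITION = {
    "name": "query_policy_documents",
    "description": "Answers questions about the indexed CMC policy documents.",
    "inputSchema": {
        "type": "object",
        "properties": {"question": {"type": "string"}},
        "required": ["question"],
    },
}


def handle_request(request: dict, answer_fn: Callable[[str], tuple[str, bool]]) -> dict:
    """Dispatch one JSON-RPC request.

    answer_fn returns (text, refused); a refusal is surfaced as isError: true.
    """
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": [TOOL_DEFINITION]}}
    if method == "tools/call":
        question = request["params"]["arguments"]["question"]
        text, refused = answer_fn(question)
        return {
            "jsonrpc": "2.0",
            "id": req_id,
            "result": {"content": [{"type": "text", "text": text}], "isError": refused},
        }
    # Any other method is unknown to this server.
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "error": {"code": -32601, "message": f"Method not found: {method}"},
    }
```

Note the distinction in this sketch: an out-of-scope question is a successful JSON-RPC response whose result carries `isError: true`, while an unknown method is a protocol-level `error` object.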

In one sentence — why is the tool description the enforcement?

The agent reads the tool description to decide when to call the tool, so a vague description grants implicit permission to call it for questions it cannot answer, wasting tool calls and producing empty or hallucinated responses.

Your commit message for UC-MCP:

UC-MCP Generated agents.md and skills.md from README, implemented MCP server

Verification checkpoints:

Tool description explicitly states document scope (which policies are covered)
Tool description states refusal behavior for out-of-scope queries
python test_client.py --run-all executes without connection error
Budget forecast question returns isError: true (out of scope)

CRAFT Reflection

Which step of the CRAFT loop was hardest across all three UCs?

Constrain — specifically calibrating the similarity threshold in UC-RAG. Writing the rule "refuse if score below 0.6" was easy; discovering that all-MiniLM-L6-v2 produces scores of 0.3–0.5 for semantically related but non-verbatim policy text required running the pipeline end-to-end and reading raw distance values. The rule looked correct on paper but failed silently at inference time until grounded in observed model behaviour.

What did you add to agents.md manually that the AI did not generate?

In UC-RAG agents.md, the explicit cross-document separation rule: "If the query spans two documents, retrieve from each document separately. Never merge retrieved chunks from different documents into a single blended answer." The AI generated a generic grounding rule but did not restrict retrieval to one document at a time, which is the specific enforcement needed to prevent IT and HR policy blending.
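
One way to enforce the separation rule in code is to keep retrieved chunks grouped by source file and build one context per document. This is a sketch under assumptions: the chunk dict keys (`source`, `text`) and both helper names are hypothetical.

```python
from collections import defaultdict


def group_by_document(chunks: list[dict]) -> dict[str, list[dict]]:
    """Group retrieved chunks by their source document, preserving order."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["source"]].append(chunk)
    return dict(grouped)


def build_per_document_contexts(chunks: list[dict]) -> list[dict]:
    """Build one context block per document so chunks are never blended."""
    return [
        {"source": source, "context": "\n".join(c["text"] for c in group)}
        for source, group in group_by_document(chunks).items()
    ]
```

Each context can then be answered and cited on its own, so a statement drawn from the IT policy is never merged with one drawn from the HR policy.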

One specific task in your real work where you will use R.I.C.E in the next 7 days:

Building an internal document Q&A bot for onboarding — new employees currently get inconsistent answers sourced from a mix of HR, IT, and Finance wikis. I will apply RICE to scope the agent strictly to indexed wiki pages and CRAFT to test whether naive retrieval blends documents before writing any enforcement rules.

github-actions · 2026-04-14T10:47:51Z

Hi there, participant! Thanks for joining our RAG-to-MCP Workshop!

We're reviewing your PR for the 3 Use Cases (UC-0A, UC-RAG, UC-MCP). Once your submission is validated and merged, you'll be awarded your completion badge!

Next Steps:

Make sure all 3 UCs are finished.
Ensure your commit messages match the required format.
Fill out every section of the PR template.
Good luck!

Yug-Mistry-STTL added 5 commits April 14, 2026 11:20

UC-0A Generated agents.md and skills.md from README, implemented classifier (ffa6916)
UC-RAG Generated agents.md and skills.md from README, implemented rag_server.py (8ea921a)
UC-RAG Fix query function missing in RAG server (aab778d)
UC-RAG Fix thresold issue in RAG server (fa43139)
UC-MCP Generated agents.md and skills.md from README, implemented MCP Server (4d674d0)

Yug-Mistry wants to merge 5 commits into nasscomAI:master from Yug-Mistry:participant/Yugkumar-Nadiad