Proposal: CCpedia - Canton Knowledge Infrastructure #439
Signed-off-by: Fearless Oleksii <143862878+0xFearless-1@users.noreply.github.com>
Champion identified: Digital Asset. The committee will verify this champion during review.
SIG labels auto-detected and applied: If this is incorrect, you can ask the reviewers to update the labels.
What ccpedia is: a single searchable layer over the 15 places Canton info lives in today (forum, mailing lists, both GitHub orgs, docs, YouTube, dev fund, whitepapers, ecosystem, canton.foundation). The mailing list archives, the dev fund history, and the YouTube transcripts aren't indexed anywhere else.

Who actually uses it: community devs onboarding to Canton, people writing about Canton (the Messari research report is one example, we just integrated it), Foundation DevRel who want to track what changes week to week. AI tooling is one consumer of the index, not the goal.

What's live right now at ccpedia.xyz (you can see all of this at /api/v1/sources): 15 sources, 33 MCP tools, around 39k RAG chunks, 124 CIPs, 4,147 forum topics, 2,885 doc pages, 880 mailing messages across 6 groups (cip-discuss, cip-vote, cip-announce, globalSyncForum, grants-discuss, validator-announce), 443 dev fund items, 165 YouTube videos with 51 transcribed (the other 114 are still being processed because YouTube rate-limits cloud IPs), 113 GitHub releases, 184 ecosystem projects, 41 canton.foundation pages, plus the 150 entries from Jatin's Build-on-Canton-MCP Curated KB that we integrated after he asked us to, with attribution preserved.

In the past week the MCP server grew from 22 to 33 tools. The new ones are list_videos, semantic_search, get_cip_votes, list_mailing_threads, get_mailing_thread, get_foundation_info, plus 5 tools wired to Jatin's KB. We also added the Daml YouTube channel (83 historical videos going back to 2019), the Messari research report, the cip-vote and cip-announce mailing lists, and canton.foundation as a doc source. Transcripts now come straight from YouTube's internal API, no yt-dlp and no cookies.
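A minimal sketch of how a reviewer might check the figures quoted above against the live endpoint. The response shape of /api/v1/sources is not described in this thread, so the `{"sources": [{"name": ..., "count": ...}]}` layout and the key names in `CLAIMED` are assumptions for illustration only; adjust them to whatever the endpoint actually returns.

```python
import json
from urllib.request import urlopen

SOURCES_URL = "https://ccpedia.xyz/api/v1/sources"

# Figures quoted in the comment. The key names are hypothetical labels,
# not identifiers confirmed by the ccpedia API.
CLAIMED = {
    "cips": 124,
    "forum_topics": 4147,
    "doc_pages": 2885,
    "mailing_messages": 880,
    "dev_fund_items": 443,
    "youtube_videos": 165,
    "github_releases": 113,
    "ecosystem_projects": 184,
    "foundation_pages": 41,
}


def fetch_counts(url=SOURCES_URL):
    """Fetch the endpoint and flatten it into a {name: count} dict.

    ASSUMPTION: the payload is a JSON object with a "sources" list whose
    items carry "name" and "count" fields. The real schema may differ.
    """
    with urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    return {item["name"]: item["count"] for item in payload["sources"]}


def diff_counts(claimed, reported):
    """Return {name: (claimed, reported)} for each figure that differs.

    A figure missing from `reported` shows up with None as its reported value.
    """
    return {
        name: (value, reported.get(name))
        for name, value in claimed.items()
        if reported.get(name) != value
    }
```

Calling `diff_counts(CLAIMED, fetch_counts())` would then list only the figures that no longer match. Since the sync keeps running, some drift from the numbers in this comment is expected.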
What the funding covers: keeping the sync running across all sources, embeddings and LLM inference, hosting, and the ongoing work of adding new sources as they show up, new MCP tools, and search improvements. Basically, keeping ccpedia aligned with how Canton itself keeps changing. The full open-source release under Apache 2.0 is the M1 deliverable once the grant is approved.

CC @waynecollier-da @hythloda @stas-sbi @tkatrichenko @isegall-da @Andrew-Pohl @LimKianAn @Denend @nycnewman @mziolekda @monsieurleberre @hrischuk-da @paulbrauner-da.

If anyone wants a walk-through of ccpedia.xyz, or there is a specific section you want more detail on, just let me know. Happy to point at concrete examples so you can form your own view.
Milestone 1 - scope update (live infrastructure has grown since submission)
Already live in production at ccpedia.xyz - the code is closed-source today; the MCP service runs in production now. Every number below is independently verifiable at https://ccpedia.xyz/api/v1/sources:
The 33 production tools (M1):
thoughts @0xFearless-1
The tool is valuable, yes, but free alternatives are available, and I find it hard to reconcile a grant to build it with the fact that it's already running. Suggestion: instead of a grant to build an already-running MCP service, a better approach would be to explore options for only maintaining it as a public good.
Proposal file: /proposals/2026-06-UnityNodes-ccpedia.md
Summary
ccpedia is Canton's live knowledge infrastructure: 50,000+ records from 15 sources, 37,000+ RAG chunks, and 22 MCP tools already running in production. Phase 2 expands to 60 tools across 17 sources, open-sources the infrastructure under Apache 2.0, and gates 65% of funding on verified adoption by independent Canton teams (tiered: 3 / 5 / 7+ teams).
Checklist
/proposals/