docs: added Agent Description Standard page in Technical Docs #17
pia-roettcher wants to merge 2 commits into main
Pull request overview
This PR adds a new documentation page for the Agent Description Standard to the technical documentation section. This standard provides guidelines for how AI agents should be described in the MASUMI Registry, helping developers create consistent, comprehensive, and discoverable agent descriptions.
- Adds a new documentation page defining the Agent Description Standard with templates and best practices
- Updates navigation metadata to include the new documentation page in the technical documentation section
- Provides both an annotated example template and a raw template for agent descriptions
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| content/docs/documentation/technical-documentation/meta.json | Adds "agent-description-standard" entry to the technical documentation navigation menu |
| content/docs/documentation/technical-documentation/agent-description-standard.mdx | New comprehensive documentation page with templates, examples, and best practices for describing AI agents |
| content/docs/documentation/meta.json | Adds "technical-documentation/agent-description-standard" entry to the main documentation navigation structure |
|
|
||
| ```jsx | ||
| ## Overview | ||
| Provide a clear, professional overview of your agents service. |
Grammar error: "agents" should be "agent's" (possessive form). The phrase should read "your agent's service" to indicate the service belongs to the agent.
| Provide a clear, professional overview of your agents service. | |
| Provide a clear, professional overview of your agent's service. |
|
|
||
| - **User Input:** [What data the user provides] | ||
| - **Data Collection:** [Where and how data is sourced] | ||
| - **Processing:** [How the agent analyses and extracts patterns] |
Spelling inconsistency: "analyses" should be "analyzes" to match American English spelling used elsewhere in the document (see line 41 which uses "analyzes").
| - **Processing:** [How the agent analyses and extracts patterns] | |
| - **Processing:** [How the agent analyzes and extracts patterns] |
|
|
||
| ## Informative Agent Description Standard with examples | ||
|
|
||
| ```jsx |
Incorrect code block language identifier. The content is Markdown text, not JSX. Consider using ```markdown (or no language identifier) instead of ```jsx so the fence reflects the content type. JSX is a syntax extension for JavaScript, not a format for Markdown templates.
| ```jsx | |
| ```markdown |
|
|
||
| ## Raw Agent Description Format | ||
|
|
||
| ```jsx |
Incorrect code block language identifier. The content is Markdown text, not JSX. Consider using ```markdown (or no language identifier) instead of ```jsx so the fence reflects the content type. JSX is a syntax extension for JavaScript, not a format for Markdown templates.
| ```jsx | |
| ```markdown |
| ## Overview | ||
| Provide a clear, professional overview of your agents service. | ||
|
|
||
| 1. The Problem | ||
| Clearly state the user’s challenge or need this agent solves (e.g., lack of real-time trend visibility, inefficient manual work, decision-making delays). | ||
|
|
||
| 2. The Solution | ||
| Explain how the agent solves the problem with AI. Focus on its core logic and outcomes, in one strong paragraph. | ||
|
|
||
| 3. Key Capabilities | ||
| **Capability 1:** [e.g. Detects trends across platforms in real-time] | ||
| **Capability 2:** [e.g. Identifies influencers and peak activity] | ||
| **Capability 3:** [e.g. Recommends strategic actions based on data] | ||
|
|
||
| 4. How It Works | ||
| **Input - Processing - Output**, broken down as: | ||
| - **User Input:** What data the user provides (e.g. keywords, industries) | ||
| - **Data Collection:** Where and how data is sourced (e.g. LinkedIn, blogs via API/scraper) | ||
| - **Processing:** How the agent analyzes and extracts patterns (model types, tools etc.) | ||
| - **Output:** What is returned (e.g. markdown, PDF report, JSON object, dashboard view) | ||
|
|
||
| 5. Transparency & Data Handling | ||
|
|
||
| | Field | Details | | ||
| |----------------------|-------------------------------------------------------------------------| | ||
| | Processing Location | Secure cloud infrastructure (servers based in [e.g., EU – Frankfurt]) | | ||
| | LLMs Used | [Specify model, e.g., GPT-4, Claude 3] for text understanding and summarization | | ||
| | Third-Party Tools | List Tools and APIs used by your agent. For example: <br> A) Unsplash API to fetch high-quality images for content generation or design tasks <br> B) Google Cloud Translation API for translating text between multiple languages <br> C) Scrapfly for thorough and targeted site scans | | ||
| | Data Usage | Website URLs and crawl data used temporarily for analysis; results may be cached briefly | | ||
| | Data Retention | Minimal; data processed in-memory or temporarily cached (no long-term storage) | | ||
| | Data Storage Location| Data processed and stored (temporarily) on EU-based infrastructure | | ||
| | Security Measures | TLS encryption, access-restricted storage, and internal audit logging | | ||
| | Privacy | Fully GDPR/DSGVO-compliant; only publicly accessible website data is analyzed | | ||
| | Legal Basis | Data processing based on legitimate interest (Art. 6(1)(f) GDPR) | | ||
| | User Rights | Users may request data access, correction, or deletion at any time via our support | | ||
| | Access Logs | No personal user tracking; minimal session logging for operational diagnostics only | | ||
| | Output Formats | - Structured SEO audit report <br> - Prioritized action plan <br> - Technical issue breakdown | | ||
|
|
||
| 6. Real-World Impact | ||
| Summarize tangible benefits: | ||
| - Saves X hours of manual research | ||
| - Boosts accuracy of trend recognition | ||
| - Enables faster strategic response | ||
|
|
||
| 7. Who It’s For | ||
| - **Enterprises:** Strategy, comms, marketing, foresight | ||
| - **Professionals:** Analysts, consultants, media planners | ||
| - **Consumers:** (If applicable) | ||
| - **General users** (if relevant) | ||
|
|
||
| 8. How to Use the Agent | ||
|
|
||
| **Input Examples** | ||
|
|
||
| | Input Field | Example | | ||
| |-----------------|-------------| | ||
| | Input field 1 | Example 1 | | ||
| | Input field 2 | Example 2 | | ||
| | Input field 3 | Example 3 | | ||
|
|
||
| **Prompt Tips and Limitations** | ||
| - “Be specific about topics, time frames, or platforms.” | ||
| - “You can request different output formats (e.g. charts, bullet lists, executive summary).” | ||
| - “Ask follow-ups to refine or extend the result.” | ||
|
|
||
| ``` | ||
|
|
||
| ## Raw Agent Description Format | ||
|
|
||
| ```jsx | ||
| # Overview |
Inconsistent heading level in template. Line 23 uses "## Overview" (h2) while line 93 uses "# Overview" (h1). The two templates should use consistent heading levels. Consider using the same heading level (preferably "# Overview") in both the example and raw template formats for consistency.
bd91dcc to 3881926
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| 5. Transparency & Data Handling | ||
|
|
||
| | Field | Details | | ||
| |----------------------|-------------------------------------------------------------------------| |
The table header alignment is inconsistent with the raw template. In the raw template (line 123), the header uses centered alignment (:-------------------: | :------:), but here the header uses left alignment without colons. For consistency and better formatting, consider using the same alignment style in both examples.
| |----------------------|-------------------------------------------------------------------------| | |
| |:--------------------:|:------------------------------------------------------------------------:| |
| | Data Retention | Minimal; data processed in-memory or temporarily cached (no long-term storage) | | ||
| | Data Storage Location| Data processed and stored (temporarily) on EU-based infrastructure | | ||
| | Security Measures | TLS encryption, access-restricted storage, and internal audit logging | | ||
| | Privacy | Fully GDPR/DSGVO-compliant; only publicly accessible website data is analyzed | |
The example text contains "GDPR/DSGVO-compliant" where DSGVO is the German acronym for GDPR. This redundancy may be confusing for international users. Consider using just "GDPR-compliant" or clarifying the relationship between the terms in documentation targeting a global audience.
| | Privacy | Fully GDPR/DSGVO-compliant; only publicly accessible website data is analyzed | | |
| | Privacy | Fully GDPR-compliant (DSGVO in German); only publicly accessible website data is analyzed | |
| | Field | Details | | ||
| | :-------------------: | :------: | |
The table header alignment in the raw template is centered (:-------------------: | :------:), but in the informative example (lines 46-47) the table uses left alignment without colons. For consistency within the documentation, both tables should use the same alignment style. Consider standardizing on one approach throughout the document.
content/docs/documentation/technical-documentation/agent-description-standard.mdx
Outdated
| 4. **Actionable Capabilities:** List key capabilities using active verbs, focusing on what the agent does for the user (e.g., "Detects trends", "Identifies influencers", "Recommends actions"). | ||
| 5. **Transparent "How It Works":** Detail the input, data collection, processing, and output steps. Be specific about data sources, models, and output formats. | ||
| 6. **Comprehensive Transparency & Data Handling:** Fill out all fields in this section thoroughly, ensuring compliance details, security measures, and data handling practices are clearly stated. Be precise about processing and storage locations (e.g., "EU – Frankfurt", not just “AWS Server”). | ||
| 7. **Quantify Impact:** Summarize tangible benefits using quantifiable metrics where possible (e.g., "Saves X hours," "Boosts accuracy", access to non-public knowledgebase). |
The phrase "access to non-public knowledgebase" appears to be an example that doesn't quite fit with the other quantifiable metrics mentioned in this sentence. It would be clearer to either provide a more concrete quantifiable metric or move this to a separate benefit category, as it describes a capability rather than a measurable impact.
| 7. **Quantify Impact:** Summarize tangible benefits using quantifiable metrics where possible (e.g., "Saves X hours," "Boosts accuracy", access to non-public knowledgebase). | |
| 7. **Quantify Impact:** Summarize tangible benefits using quantifiable metrics where possible (e.g., "Saves X hours per week," "Boosts accuracy by Y%", "Reduces costs by Z%"). |
3881926 to 3279b6e
47fc8b5 to 66184bc
Pull request overview
Copilot reviewed 5 out of 7 changed files in this pull request and generated 11 comments.
scripts/fetch-notion.mjs
Outdated
| return `${jsxKey}=\"${value}\"`; | ||
| }).join(' '); | ||
|
|
||
| return `<${Component} ${attrsForJsx} width={1200} height={800} className={\`${existingClass} w-full h-auto\`} />`; |
The same magic numbers (1200, 800) for image dimensions are duplicated here. This is a violation of the DRY principle and makes maintenance harder. These should be defined as constants and reused.
scripts/fetch-notion.mjs
Outdated
| const timeoutPromise = new Promise((_, reject) => { | ||
| setTimeout(() => reject(new Error('Request timeout after 30 seconds')), 30000); |
The timeout duration of 30 seconds is hardcoded as a magic number in the error message and the setTimeout call. This should be extracted as a named constant to ensure consistency and easier configuration.
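A minimal sketch of what the extraction could look like, assuming the timeout is raced against the request promise as in the snippet above. The names `REQUEST_TIMEOUT_MS` and `withTimeout` are hypothetical and do not appear in the reviewed script; only the 30-second value comes from it.

```javascript
// Hypothetical constant name; the 30-second value comes from the reviewed code.
const REQUEST_TIMEOUT_MS = 30_000;

// Rejects if `promise` does not settle within `timeoutMs` milliseconds.
function withTimeout(promise, timeoutMs = REQUEST_TIMEOUT_MS) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      // The message is derived from the same value, so the two cannot drift apart.
      () => reject(new Error(`Request timeout after ${timeoutMs / 1000} seconds`)),
      timeoutMs,
    );
  });
  // Clear the timer either way so a finished request does not leave it pending.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

With this shape, a call such as `withTimeout(fetchPage(id))` (where `fetchPage` stands in for whatever request the script makes) would use the default, and changing the duration touches one line.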
| // Configuration - REPLACE THESE WITH YOUR VALUES | ||
| const NOTION_PAGES = [ | ||
| { | ||
| pageId: '28cf3c5366cb80b5bc1bcb72e8749b84', // The Notion page ID (32 char hex string at the end of the page URL) |
The hardcoded page ID in the configuration array should be removed or replaced with a placeholder. Committing actual Notion page IDs to the repository could expose internal documentation structure. Consider moving this to environment variables or using a placeholder value with clear documentation on how to configure it.
| // Configuration - REPLACE THESE WITH YOUR VALUES | |
| const NOTION_PAGES = [ | |
| { | |
| pageId: '28cf3c5366cb80b5bc1bcb72e8749b84', // The Notion page ID (32 char hex string at the end of the page URL) | |
| // Configuration - REPLACE THESE WITH YOUR VALUES (prefer environment variables) | |
| const NOTION_PAGES = [ | |
| { | |
| pageId: process.env.NOTION_PAGE_ID || '[NOTION_PAGE_ID]', // The Notion page ID (32 char hex string at the end of the page URL) |
scripts/fetch-notion.mjs
Outdated
|
|
||
| updatedContent = parts.map((part, index) => { | ||
| // If this part is a code block (odd indices after split), preserve it | ||
| if (codeBlockRegex.test(part)) { |
The regex test is unreliable. After splitting by codeBlockRegex, the code tests each part with the same regex. A regex with the global flag ('g') keeps its lastIndex between test() calls, so consecutive calls on different parts can start matching mid-string and return inconsistent results. Instead, check whether the index is odd (when the regex has a single capture group, split() places the captured code blocks at odd indices), or use a separate regex without the global flag for the test.
| if (codeBlockRegex.test(part)) { | |
| if (index % 2 === 1) { |
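The behavior behind this suggestion can be sketched as follows. The regex below is a hypothetical stand-in for the script's `codeBlockRegex` (the real pattern is not shown in this review); the sketch assumes it has exactly one capture group, which is what makes the odd-index rule hold. The function name `escapeOutsideCodeBlocks` is also made up for illustration.

```javascript
// Hypothetical stand-in for codeBlockRegex: ONE capture group, global flag.
const FENCE = '`'.repeat(3);
const codeBlockRegex = new RegExp(`(${FENCE}[\\s\\S]*?${FENCE})`, 'g');

// Why test() on a global regex is unreliable: it remembers lastIndex.
const statefulRegex = /x/g;
const firstCall = statefulRegex.test('x');  // true; lastIndex moves to 1
const secondCall = statefulRegex.test('x'); // false; search starts at index 1, then lastIndex resets

function escapeOutsideCodeBlocks(content) {
  return content
    .split(codeBlockRegex)
    // With exactly one capture group, odd indices hold the captured code blocks.
    .map((part, index) => (index % 2 === 1 ? part : part.replace(/[{}]/g, '\\$&')))
    .join('');
}
```

If the real regex has zero or several capture groups, the odd-index rule no longer applies and the index arithmetic would need adjusting.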
scripts/fetch-notion.mjs
Outdated
| }); | ||
|
|
||
| // Initialize NotionToMarkdown | ||
| const n2m = new NotionToMarkdown({ notionClient: notion }); |
The variable name n2m is cryptic and doesn't follow clear naming conventions. Consider using a more descriptive name like notionToMd or markdownConverter to improve code readability and maintainability.
| const n2m = new NotionToMarkdown({ notionClient: notion }); | |
| const notionToMarkdown = new NotionToMarkdown({ notionClient: notion }); |
scripts/fetch-notion.mjs
Outdated
| <ImageZoom src=\"${lightSrc}\" alt=\"${altText}\" width={1200} height={800} className={\`${existingClass} w-full h-auto block dark:hidden\`} ${sharedAttrs} /> | ||
| <ImageZoom src=\"${darkSrc}\" alt=\"${altText}\" width={1200} height={800} className={\`${existingClass} w-full h-auto hidden dark:block\`} ${sharedAttrs} /> |
Magic numbers 1200 and 800 are hardcoded for image width and height. These values should be extracted as named constants at the top of the file to improve maintainability and make it easier to adjust image dimensions across all transformations.
.env.example
Outdated
| @@ -0,0 +1,3 @@ | |||
| # Notion API Token | |||
| # Get your token from: https://www.notion.so/my-integrations | |||
| NOTION_TOKEN=nmkr_notion_token_here | |||
The example token value "nmkr_notion_token_here" uses a custom placeholder format. Notion API tokens start with "secret_" (older tokens) or "ntn_" (newer tokens), not "nmkr_". An incorrect prefix in the example could confuse users about what a valid token looks like.
| NOTION_TOKEN=nmkr_notion_token_here | |
| NOTION_TOKEN=secret_notion_token_here |
scripts/fetch-notion.mjs
Outdated
|
|
||
| function parseAttributes(attributesString) { | ||
| const attributes = {}; | ||
| const attributeRegex = /([\w-]+)=["']([^"]*)["']/g; |
The character class in this pattern only excludes double quotes. With ["']([^"]*)["'], a single-quoted value is not terminated at its closing single quote: [^"]* keeps consuming characters, including later single quotes, until it reaches a double quote or the end of the string, so the capture can run across several attributes. Excluding both quote characters stops the value at the first quote of either kind.
| const attributeRegex = /([\w-]+)=["']([^"]*)["']/g; | |
| const attributeRegex = /([\w-]+)=["']([^"']*)["']/g; |
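As an alternative to the suggested character class, and not something the PR itself uses, a backreference can require the closing quote to match the opening one. This sketch assumes attribute values never contain their own delimiter; it also lets a double-quoted value contain an apostrophe, which `[^"']*` would reject.

```javascript
// The pattern from the reviewed code, kept here only to show the over-capture.
const originalRegex = /([\w-]+)=["']([^"]*)["']/g;
const overCaptured = [..."src='a.png' class='hero'".matchAll(originalRegex)][0][2];
// overCaptured spans both attributes because [^"]* never stops at a single quote.

function parseAttributes(attributesString) {
  const attributes = {};
  // Group 2 captures the opening quote; \2 requires the same character to close the value.
  const attributeRegex = /([\w-]+)=(["'])(.*?)\2/g;
  for (const [, name, , value] of attributesString.matchAll(attributeRegex)) {
    attributes[name] = value;
  }
  return attributes;
}
```

For example, `parseAttributes("src='a.png' class='hero'")` yields separate `src` and `class` entries instead of one merged value.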
| const imageRegexes = [ | ||
| /!\[[^\]]*\]\(([^)]+)\)/g, // markdown images | ||
| /<img[^>]+src=["']([^"']+)["'][^>]*>/g, // img tags | ||
| ]; |
The regex patterns are recreated on every iteration through the outer loop. For better performance, these regex patterns should be created once outside the loop since they don't change between iterations.
scripts/fetch-notion.mjs
Outdated
|
|
||
| if (success) { | ||
| // Replace the URL in content - use global replace to catch all occurrences | ||
| updatedContent = updatedContent.replaceAll(imageUrl, publicPath); |
The function uses replaceAll which requires Node.js 15.0.0 or higher. The package.json specifies engine requirements, but there's no explicit Node version constraint. If the project needs to support older Node versions, this could cause runtime errors. Consider using a polyfill or the global replace pattern with the 'g' flag instead.
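For reference, the two fallbacks mentioned here can be sketched as below. The function names are hypothetical. One caveat with the regex route: image URLs contain characters such as `.` and `?` that are special in a regular expression, so the search string has to be escaped first.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function replaceEvery(content, search, replacement) {
  // split/join treats `search` as a literal string and does not need replaceAll.
  return content.split(search).join(replacement);
}

function replaceEveryWithRegex(content, search, replacement) {
  // A replacer function keeps `$` sequences in `replacement` from being interpreted.
  return content.replace(new RegExp(escapeRegExp(search), 'g'), () => replacement);
}
```

Both return the same result for literal strings; split/join is the shorter of the two and avoids the escaping step entirely.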
- Changed static agent description file to use Notion API fetch
- Fixed code scanning alert no. 8: Incomplete URL substring sanitization
- Fixed code scanning alert no. 10: Incomplete string escaping or encoding
- Fixed code scanning alert no. 11: Replacement of a substring with itself
66184bc to 98cc88a
Pull request overview
Copilot reviewed 6 out of 8 changed files in this pull request and generated 11 comments.
| // Process each match | ||
| for (const match of matches) { | ||
| const imageUrl = match[1]; | ||
|
|
||
| // Skip if it's already a local path | ||
| if (imageUrl.startsWith('/synced-images/')) { | ||
| continue; | ||
| } | ||
|
|
||
| // Extract file extension or use default | ||
| try { | ||
| const urlObj = new URL(imageUrl); | ||
| const hostname = urlObj.hostname; | ||
|
|
||
| // Skip if it's not a Notion image URL (strict hostname check) | ||
| const isNotionHost = | ||
| hostname === 'notion.so' || | ||
| hostname === 'www.notion.so' || | ||
| hostname.endsWith('.notion.so') || | ||
| hostname === 'notion-static.com' || | ||
| hostname.endsWith('.notion-static.com'); | ||
|
|
||
| if (!isNotionHost) { | ||
| continue; | ||
| } | ||
|
|
||
| const urlPath = urlObj.pathname; | ||
| const fileName = path.basename(urlPath) || `image-${Date.now()}.png`; | ||
| const localPath = path.join(imagesDir, fileName); | ||
| const publicPath = `/synced-images/notion/${pageId}/${fileName}`; | ||
|
|
||
| // Fetch and save the image with auth token | ||
| console.log(`📸 Fetching image: ${imageUrl}`); | ||
| const success = await fetchImage(imageUrl, localPath, authToken); | ||
|
|
||
| if (success) { | ||
| // Replace the URL in content - replace all occurrences | ||
| // Use split/join pattern for compatibility (works in all Node versions) | ||
| updatedContent = updatedContent.split(imageUrl).join(publicPath); | ||
| console.log(`✅ Synced image: ${fileName}`); | ||
| } else { | ||
| console.log(`❌ Failed to sync image: ${fileName}`); | ||
| } | ||
| } catch (urlError) { | ||
| console.error(`❌ Invalid image URL: ${imageUrl}`, urlError.message); | ||
| continue; | ||
| } | ||
| } | ||
| } |
Images are fetched sequentially in the for loop, which could significantly slow down the sync process when there are multiple images. Consider using Promise.all() or Promise.allSettled() to fetch images in parallel, while maintaining a reasonable concurrency limit to avoid overwhelming the server.
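One way to get bounded parallelism without a dependency is a small worker pool. This is a sketch under assumptions, not the PR's code: `mapWithConcurrency` is a hypothetical helper, and its results mimic the `Promise.allSettled()` shape so that one failed image does not abort the rest.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight.
// Results follow the Promise.allSettled shape and keep input order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  async function runner() {
    // Each runner claims the next unprocessed index until none remain.
    while (nextIndex < items.length) {
      const index = nextIndex++;
      try {
        results[index] = { status: 'fulfilled', value: await worker(items[index], index) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

In the sync loop, the URL replacements on `updatedContent` would then be applied after all downloads settle, since the string is shared state and should not be rewritten from several in-flight tasks at once.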
| <ImageZoom src=\"${lightSrc}\" alt=\"${altText}\" width={${IMAGE_DEFAULT_WIDTH}} height={${IMAGE_DEFAULT_HEIGHT}} className={\`${existingClass} w-full h-auto block dark:hidden\`} ${sharedAttrs} /> | ||
| <ImageZoom src=\"${darkSrc}\" alt=\"${altText}\" width={${IMAGE_DEFAULT_WIDTH}} height={${IMAGE_DEFAULT_HEIGHT}} className={\`${existingClass} w-full h-auto hidden dark:block\`} ${sharedAttrs} /> | ||
| </div>`; | ||
| }); | ||
|
|
||
| // Convert standalone <img> tags | ||
| const imgRegex = /<img([^>]+)>/gi; | ||
| updatedContent = updatedContent.replace(imgRegex, (match, attributesString) => { | ||
| if (match.includes('dark:hidden') || match.includes('hidden dark:block')) { | ||
| return match; | ||
| } | ||
|
|
||
| const attributes = parseAttributes(attributesString); | ||
| const isGif = attributes.src && attributes.src.toLowerCase().endsWith('.gif'); | ||
| const Component = isGif ? 'img' : 'ImageZoom'; | ||
| const existingClass = attributes.class || ''; | ||
|
|
||
| const attrsForJsx = Object.entries(attributes) | ||
| .filter(([key]) => !['class', 'width', 'height'].includes(key)) | ||
| .map(([key, value]) => { | ||
| const jsxKey = key === 'class' ? 'className' : key; | ||
| return `${jsxKey}=\"${value}\"`; | ||
| }).join(' '); | ||
|
|
||
| return `<${Component} ${attrsForJsx} width={${IMAGE_DEFAULT_WIDTH}} height={${IMAGE_DEFAULT_HEIGHT}} className={\`${existingClass} w-full h-auto\`} />`; |
The hard-coded image dimensions (1200x800) appear in multiple locations in the conversion logic. While constants are defined at the top of the file, extracting the repeated template literal pattern into a helper function would reduce duplication and make the code more maintainable.
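A possible shape for such a helper, offered as a sketch only. `buildImageTag` and its option names are hypothetical; the constants reuse the names and values visible in the diff. Unlike the reviewed code, this version emits `className` as a plain string attribute rather than a JSX template literal, which is an assumption about what the MDX output needs.

```javascript
const IMAGE_DEFAULT_WIDTH = 1200;
const IMAGE_DEFAULT_HEIGHT = 800;

// Hypothetical helper: returns the JSX string for one image component.
function buildImageTag({ component = 'ImageZoom', attrs = '', existingClass = '', extraClasses = '' }) {
  // Drop empty pieces so the class list has no leading or doubled spaces.
  const className = [existingClass, 'w-full h-auto', extraClasses].filter(Boolean).join(' ');
  const attrPart = attrs ? ` ${attrs}` : '';
  return `<${component}${attrPart} width={${IMAGE_DEFAULT_WIDTH}} height={${IMAGE_DEFAULT_HEIGHT}} className="${className}" />`;
}
```

The light and dark variants would then differ only in `attrs` and `extraClasses` (for example `'block dark:hidden'` versus `'hidden dark:block'`), and the standalone `<img>` conversion could pass `component: 'img'` for GIFs.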
| } | ||
| // Otherwise, escape backslashes and curly braces | ||
| return part | ||
| .replace(/\\/g, '\\\\') |
The backslash escaping will double-escape backslashes that are already escaped in the content. This could lead to display issues with literal backslashes in the documentation. Consider only escaping backslashes that aren't already escaped, or handle this more carefully to avoid double-escaping.
| .replace(/\\/g, '\\\\') | |
| .replace(/(?<!\\)\\(?!\\)/g, '\\\\') |
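To make the effect of the suggested pattern concrete, here is a small sketch. The function name is hypothetical, and whether skipping adjacent backslashes is the right behavior depends on what the Notion export actually contains, which this review does not show.

```javascript
// Suggested pattern: a backslash with no backslash directly before or after it.
const loneBackslash = /(?<!\\)\\(?!\\)/g;

function escapeLoneBackslashes(text) {
  // '\\\\' is a two-character replacement string (two backslashes).
  return text.replace(loneBackslash, '\\\\');
}

// A single backslash is doubled.
const single = escapeLoneBackslashes('C:\\dir');     // C:\dir  becomes  C:\\dir
// A pair that is already adjacent is left untouched.
const paired = escapeLoneBackslashes('C:\\\\dir');   // C:\\dir stays    C:\\dir
```

The trade-off is that a run of two or more backslashes is never escaped, so content that intentionally contains a literal double backslash would pass through unchanged rather than being escaped for display.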
| if (match.includes('dark:hidden') || match.includes('hidden dark:block')) { | ||
| return match; | ||
| } | ||
|
|
||
| const attributes = parseAttributes(attributesString); |
This substring check matches "dark:hidden" or "hidden dark:block" anywhere in the tag (for example in alt text or a URL), not just in the class attribute, so an unprocessed image could be skipped when it shouldn't be. Consider checking for these values in the parsed class attribute instead.
| if (match.includes('dark:hidden') || match.includes('hidden dark:block')) { | |
| return match; | |
| } | |
| const attributes = parseAttributes(attributesString); | |
| const attributes = parseAttributes(attributesString); | |
| const classAttr = attributes.class || attributes.className || ''; | |
| if (classAttr.includes('dark:hidden') || classAttr.includes('hidden dark:block')) { | |
| return match; | |
| } |
|
ready to rebase in my opinion |
No description provided.