feat: Add automatic documentation generation #65

m-marinucci · 2025-08-01T11:52:47Z

Summary

Implements automatic documentation generation on every commit
Sets up GitHub Actions workflow for continuous documentation updates
Adds multi-tool documentation pipeline (Doxygen, MkDocs, custom scripts)

Features

📚 Automatic API Documentation - Extracts from C++ source code using Doxygen
📖 Language Reference - Generated from TOL test files and examples
📊 Documentation Metrics - Tracks documentation coverage
🚀 Auto-deployment - Deploys to GitHub Pages on master branch
💬 PR Comments - Build status comments on pull requests

Testing

This PR will test:

Documentation build on PR (should see a comment with build status)
No deployment to GitHub Pages (only happens on master)
Artifact generation with documentation preview

Next Steps

After merge:

Enable GitHub Pages in repository settings
Documentation will be available at: https://m-marinucci.github.io/Tol/

🤖 Generated with Claude Code (https://claude.ai/code)

Summary by Sourcery

Add a comprehensive automatic documentation generation pipeline using GitHub Actions that extracts and builds API and language reference docs, tracks metrics and changelog, and deploys to GitHub Pages with PR status notifications.

New Features:

Introduce a GitHub Actions workflow to analyze changes, build documentation on push and PR events, and deploy to GitHub Pages on the master branch
Integrate Doxygen for C++ API extraction, custom Python script for TOL language reference, and MkDocs for site generation
Automatically generate changelog from recent commits and documentation coverage metrics
Add PR build status comments for documentation previews

Enhancements:

Create stub pages for in-progress sections and timestamp/version metadata in docs

CI:

Add .github/workflows/documentation.yml to orchestrate documentation build, artifact upload, and deployment

Documentation:

Add docs/README.md detailing documentation structure and local build instructions

- Add GitHub Actions workflow for auto-generating docs on every commit
- Include Doxygen for C++ API documentation
- Add MkDocs for beautiful documentation site
- Create Python script to extract TOL language documentation
- Auto-deploy to GitHub Pages on master branch pushes
- Add PR preview comments for documentation builds

This ensures documentation stays in sync with code changes automatically.

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

sourcery-ai · 2025-08-01T11:52:53Z

Reviewer's Guide

This PR adds a full CI-driven documentation pipeline: a GitHub Actions workflow that builds, previews, and deploys docs on every commit or PR using Doxygen, MkDocs, and custom scripts.

Sequence diagram for documentation build and deployment workflow

sequenceDiagram
  actor Dev as Developer
  participant GH as GitHub
  participant CI as GitHub Actions
  participant Tools as Doc Tools (Doxygen/MkDocs/Scripts)
  participant Pages as GitHub Pages
  Dev->>GH: Push/PR to repo
  GH->>CI: Trigger workflow
  CI->>CI: Analyze changes
  CI->>Tools: Run doc generation (Doxygen, MkDocs, scripts)
  Tools-->>CI: Generated docs, metrics, changelog
  CI->>CI: Upload artifacts
  alt On PR
    CI->>GH: Post build status comment
  else On master
    CI->>Pages: Deploy docs to GitHub Pages
  end

Class diagram for the new TOLDocExtractor script

classDiagram
  class TOLDocExtractor {
    - output_dir: Path
    - tol_functions: list
    - tol_types: list
    - tol_examples: list
    + __init__(output_dir)
    + extract_from_file(filepath) Dict
    + generate_function_reference()
    + generate_type_reference()
    + generate_examples_page()
    + generate_index_page(files_processed)
    + process_directory(directory)
  }
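
For orientation, the interface in the class diagram could be sketched roughly as below. This is a minimal skeleton, not the script from this PR: the `doc_info` keys other than `description`, `types`, and `examples` are assumptions, `line_count` is a hypothetical field added only so the sketch does something observable, the list return type of `process_directory` is assumed, and the `generate_*` methods are left out.

```python
from pathlib import Path
from typing import Dict, List


class TOLDocExtractor:
    """Skeleton of the interface in the class diagram (parsing omitted)."""

    def __init__(self, output_dir):
        self.output_dir = Path(output_dir)
        self.tol_functions: List[dict] = []
        self.tol_types: List[dict] = []
        self.tol_examples: List[dict] = []

    def extract_from_file(self, filepath) -> Dict:
        # Read the file; the regex-based parsing discussed in the review
        # would fill the lists below. The key names are assumptions.
        content = Path(filepath).read_text(encoding="utf-8", errors="replace")
        return {
            "file": str(filepath),
            "description": "",
            "functions": [],
            "types": [],
            "examples": [],
            "line_count": len(content.splitlines()),  # hypothetical extra field
        }

    def process_directory(self, directory) -> List[Dict]:
        # Assumption: every *.tol file below `directory` is processed.
        tol_files = sorted(Path(directory).rglob("*.tol"))
        return [self.extract_from_file(path) for path in tol_files]
```

Calling `TOLDocExtractor("out").process_directory("tests")` would then return one dictionary per `.tol` file found under `tests`.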

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Introduce a GitHub Actions workflow to automate documentation generation | Configure triggers for pushes (master/develop), PRs, and manual dispatch; add analysis job to detect changes in docs, code, or config; define build-documentation job installing tools and running doc generators; integrate deployment steps to GitHub Pages on master and post PR status comments | `.github/workflows/documentation.yml` |
| Add custom Python script for TOL source documentation extraction | Implement TOLDocExtractor class parsing file-level, function, type, and example docs; generate markdown pages: function_reference, type_reference, examples, and index; provide CLI entrypoint to process directories of .tol files | `.github/scripts/extract_tol_docs.py` |
| Add top-level documentation overview in docs directory | Create README.md outlining docs structure and auto-generation workflow; include local build instructions and documentation standards | `docs/README.md` |

Possibly linked issues

#0: The PR implements automatic documentation generation with Doxygen and MkDocs, directly addressing the issue's goal to automate API documentation.

sourcery-ai

Hey @m-marinucci - I've reviewed your changes - here's some feedback:

Consider adding caching for apt and pip installs (e.g., actions/cache) or using a prebuilt container to significantly speed up repeated workflow runs.
This workflow is quite monolithic—splitting the documentation steps into reusable composite actions or separate jobs/workflows could improve readability and maintainability.
You might externalize tool versions and key paths (e.g., Doxygen, MkDocs, scripts) into workflow inputs or YAML anchors to make future updates simpler.

Prompt for AI Agents

Please address the comments from this code review:
## Overall Comments
- Consider adding caching for apt and pip installs (e.g., actions/cache) or using a prebuilt container to significantly speed up repeated workflow runs.
- This workflow is quite monolithic—splitting the documentation steps into reusable composite actions or separate jobs/workflows could improve readability and maintainability.
- You might externalize tool versions and key paths (e.g., Doxygen, MkDocs, scripts) into workflow inputs or YAML anchors to make future updates simpler.

## Individual Comments

### Comment 1
<location> `.github/scripts/extract_tol_docs.py:39` </location>
<code_context>
+        }
+        
+        # Extract file-level documentation
+        file_doc = re.search(r'^//!\s*(.+?)(?=\n(?!//!))', content, re.MULTILINE | re.DOTALL)
+        if file_doc:
+            doc_info['description'] = file_doc.group(1).strip()
</code_context>

<issue_to_address>
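File-level documentation extraction mishandles multi-line doc comments.

The regex does span consecutive `//!` lines, but the captured text keeps the `//!` prefix on every continuation line, and with `re.MULTILINE` it can match a `//!` block anywhere in the file rather than only at the top. Please update it to collect all consecutive `//!` lines at the top and strip the marker from each.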
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        # Extract file-level documentation
        file_doc = re.search(r'^//!\s*(.+?)(?=\n(?!//!))', content, re.MULTILINE | re.DOTALL)
        if file_doc:
            doc_info['description'] = file_doc.group(1).strip()
=======
        # Extract file-level documentation (all consecutive //! lines at the top)
        file_doc_lines = []
        for line in content.splitlines():
            if line.strip().startswith('//!'):
                file_doc_lines.append(line.strip()[3:].strip())
            elif file_doc_lines:
                break
        if file_doc_lines:
            doc_info['description'] = "\n".join(file_doc_lines).strip()
>>>>>>> REPLACE

</suggested_fix>
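
To make the behaviour of this suggestion easier to check, here is the same loop as a standalone function with a made-up sample. The function name `extract_file_description` and the sample text are illustrative assumptions, not part of the script.

```python
def extract_file_description(content: str) -> str:
    """Join the first run of consecutive //! lines, dropping the marker."""
    doc_lines = []
    for line in content.splitlines():
        stripped = line.strip()
        if stripped.startswith("//!"):
            doc_lines.append(stripped[3:].strip())
        elif doc_lines:
            break  # the run of //! lines has ended
    return "\n".join(doc_lines).strip()


sample = (
    "//! Time series helpers.\n"
    "//! Used by the forecasting examples.\n"
    "Real x = 1;\n"
    "//! not part of the header\n"
)
print(extract_file_description(sample))
```

Note that the loop takes the first run of `//!` lines wherever it starts, so a file with no header comment but a documented function further down would have that function's doc lines reported as the file description.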

### Comment 2
<location> `.github/scripts/extract_tol_docs.py:44` </location>
<code_context>
+            doc_info['description'] = file_doc.group(1).strip()
+        
+        # Extract function documentation
+        func_pattern = r'//!\s*@brief\s+(.+?)\n(?://!\s*@.+?\n)*\s*(\w+)\s+(\w+)\s*\(([^)]*)\)'
+        for match in re.finditer(func_pattern, content):
+            brief, return_type, name, params = match.groups()
</code_context>

<issue_to_address>
Function extraction regex may not handle all valid TOL signatures.

The current regex may miss functions with complex, multi-line, or qualified signatures. Please update the pattern for broader coverage or document its limitations.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        # Extract function documentation
        func_pattern = r'//!\s*@brief\s+(.+?)\n(?://!\s*@.+?\n)*\s*(\w+)\s+(\w+)\s*\(([^)]*)\)'
        for match in re.finditer(func_pattern, content):
            brief, return_type, name, params = match.groups()
=======
        # Extract function documentation
        # NOTE: This regex attempts to match TOL function signatures that may span multiple lines,
        # include qualified return types, and allow for complex parameter lists.
        # It may still miss some edge cases (e.g., function pointers, macros, or unusual formatting).
        # For full accuracy, consider using a proper parser.
        func_pattern = (
            r'//!\s*@brief\s+(.+?)\n'                # @brief line
            r'(?://!\s*@.+?\n)*'                     # any other doc lines
            r'\s*([a-zA-Z_][\w:\s\*&<>]*)\s+'        # return type (qualified, with spaces, *, &, etc.)
            r'([a-zA-Z_]\w*)\s*'                     # function name
            r'\(([^)]*)\)'                           # parameter list (not perfect for nested)
        )
        for match in re.finditer(func_pattern, content, re.MULTILINE | re.DOTALL):
            brief, return_type, name, params = match.groups()
>>>>>>> REPLACE

</suggested_fix>
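
A small harness along these lines could be used to try the pattern against sample input. It is only a sketch: the helper name `find_documented_functions`, the result keys, and the sample TOL snippet are assumptions. It also deliberately omits `re.DOTALL`, since with that flag the lazy `(.+?)` for the brief may extend across lines when the signature does not follow directly.

```python
import re

FUNC_PATTERN = re.compile(
    r'//!\s*@brief\s+(.+?)\n'            # @brief line (single line, no DOTALL)
    r'(?://!\s*@.+?\n)*'                 # any further //! @tag lines
    r'\s*([a-zA-Z_][\w:\s\*&<>]*)\s+'    # return type
    r'([a-zA-Z_]\w*)\s*'                 # function name
    r'\(([^)]*)\)'                       # parameter list (no nested parentheses)
)


def find_documented_functions(content: str) -> list:
    """Return one dict per @brief-documented signature found in content."""
    found = []
    for match in FUNC_PATTERN.finditer(content):
        brief, return_type, name, params = match.groups()
        found.append({
            "brief": brief.strip(),
            "return_type": return_type.strip(),
            "name": name,
            # Split on commas; empty parameter lists yield an empty list.
            "params": [p.strip() for p in params.split(",") if p.strip()],
            "line": content[:match.start()].count("\n") + 1,
        })
    return found


sample = (
    "//! @brief Adds two numbers\n"
    "//! @param a first operand\n"
    "Real add(Real a, Real b)\n"
    "{\n"
    "  a + b\n"
    "};\n"
)
print(find_documented_functions(sample))
```

Because `[^)]*` also matches newlines, a parameter list wrapped over several lines is still captured; a default value or type containing `)` is not.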

### Comment 3
<location> `.github/scripts/extract_tol_docs.py:59` </location>
<code_context>
+            self.tol_functions.append(func_doc)
+        
+        # Extract type definitions
+        type_pattern = r'(Real|Text|Matrix|Serie|NameBlock|Set|Code)\s+(\w+)\s*='
+        for match in re.finditer(type_pattern, content):
+            type_name, var_name = match.groups()
</code_context>

<issue_to_address>
Type extraction pattern may produce false positives or miss edge cases.

The current regex may miss type definitions with different assignment styles or modifiers, and could incorrectly match lines in comments. Please refine the pattern or add context checks for better accuracy.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        # Extract type definitions
        type_pattern = r'(Real|Text|Matrix|Serie|NameBlock|Set|Code)\s+(\w+)\s*='
        for match in re.finditer(type_pattern, content):
            type_name, var_name = match.groups()
            type_info = {
                'type': type_name,
                'name': var_name,
                'file': str(filepath),
                'line': content[:match.start()].count('\n') + 1
            }
            doc_info['types'].append(type_info)
            self.tol_types.append(type_info)
=======
        # Extract type definitions
        # Improved pattern: optional modifiers, allow colon or equals, avoid matching in comments
        type_pattern = r'^(?!\s*(//|#|\*|/\*)).*?\b(?:public|private|protected)?\s*(Real|Text|Matrix|Serie|NameBlock|Set|Code)\s+(\w+)\s*[:=]'
        for match in re.finditer(type_pattern, content, re.MULTILINE):
            # Skip matches inside multi-line comments
            line_start = content.rfind('\n', 0, match.start()) + 1
            line_end = content.find('\n', match.start())
            if line_end == -1:
                line_end = len(content)
            line_text = content[line_start:line_end]
            if (
                line_text.strip().startswith('//')
                or line_text.strip().startswith('#')
                or line_text.strip().startswith('*')
                or '/*' in line_text
                or '*/' in line_text
            ):
                continue
            # Extract type and variable name
            type_name, var_name = match.groups()[-2:]
            type_info = {
                'type': type_name,
                'name': var_name,
                'file': str(filepath),
                'line': content[:match.start()].count('\n') + 1
            }
            doc_info['types'].append(type_info)
            self.tol_types.append(type_info)
>>>>>>> REPLACE

</suggested_fix>
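
An alternative to filtering matches after the fact is to strip comments line by line and only then apply the simple pattern. The sketch below illustrates that idea; it is not the code in this PR. It assumes TOL sources use `//` and `/* ... */` comments, does not handle comment markers inside string literals, and the function name `find_typed_assignments` is hypothetical.

```python
import re

TYPE_PATTERN = re.compile(
    r'\b(Real|Text|Matrix|Serie|NameBlock|Set|Code)\s+(\w+)\s*='
)


def find_typed_assignments(content: str) -> list:
    """Find `Type name =` outside of // and /* */ comments."""
    results = []
    in_block_comment = False
    for lineno, line in enumerate(content.splitlines(), start=1):
        code = line
        if in_block_comment:
            end = code.find("*/")
            if end == -1:
                continue  # whole line is inside a block comment
            code = code[end + 2:]
            in_block_comment = False
        # Remove block comments that open and close on this line.
        code = re.sub(r"/\*.*?\*/", "", code)
        line_c = code.find("//")
        block_c = code.find("/*")
        if block_c != -1 and (line_c == -1 or block_c < line_c):
            code = code[:block_c]  # block comment continues on later lines
            in_block_comment = True
        elif line_c != -1:
            code = code[:line_c]   # drop trailing line comment
        for match in TYPE_PATTERN.finditer(code):
            results.append(
                {"type": match.group(1), "name": match.group(2), "line": lineno}
            )
    return results


sample = (
    "// Real ignored = 1;\n"
    "Real alpha = 0.5;\n"
    '/* Text hidden = "x";\n'
    "   Matrix alsoHidden = Rand(2,2,0,1); */\n"
    'Text label = "demo"; // Set trailing = [[1]];\n'
)
print(find_typed_assignments(sample))
```

Tracking the block-comment state across lines avoids the case where a declaration sits on a middle line of a `/* ... */` block and carries no comment marker of its own.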

### Comment 4
<location> `.github/scripts/extract_tol_docs.py:72` </location>
<code_context>
+            self.tol_types.append(type_info)
+        
+        # Extract code examples
+        example_pattern = r'//\s*Example:(.+?)(?=\n(?!//)|$)'
+        for match in re.finditer(example_pattern, content, re.DOTALL):
+            example = match.group(1).strip()
</code_context>

<issue_to_address>
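Example extraction may not handle multi-line or indented examples robustly.

With `re.DOTALL` the pattern does continue onto following lines, but only while each one starts with `//` in column 0, so an indented comment line ends the capture, and the captured text keeps the `//` prefixes. To support multi-line or indented examples, update the pattern to include all consecutive comment lines after the marker and strip the comment prefix from each.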
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        # Extract code examples
        example_pattern = r'//\s*Example:(.+?)(?=\n(?!//)|$)'
        for match in re.finditer(example_pattern, content, re.DOTALL):
            example = match.group(1).strip()
            example_info = {
                'code': example,
                'file': str(filepath),
                'line': content[:match.start()].count('\n') + 1
            }
            doc_info['examples'].append(example_info)
            self.tol_examples.append(example_info)
=======
        # Extract code examples
        example_pattern = r'//\s*Example:(.*(?:\n\s*//.*)*)'
        for match in re.finditer(example_pattern, content):
            raw_example = match.group(1)
            # Split into lines, remove leading // and whitespace
            example_lines = [
                re.sub(r'^\s*//\s?', '', line)
                for line in raw_example.splitlines()
            ]
            # Remove any empty lines at the start/end
            example = '\n'.join(line.rstrip() for line in example_lines).strip()
            example_info = {
                'code': example,
                'file': str(filepath),
                'line': content[:match.start()].count('\n') + 1
            }
            doc_info['examples'].append(example_info)
            self.tol_examples.append(example_info)
>>>>>>> REPLACE

</suggested_fix>
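
The suggestion can be exercised in isolation with a sketch like the following. Two details differ from the suggested fix and are assumptions of this sketch: the continuation uses `[ \t]*` instead of `\s*`, so a blank line ends the example, and `textwrap.dedent` removes the common indentation. The function name and sample are made up.

```python
import re
import textwrap

# Marker line, then any directly following comment lines (possibly indented).
EXAMPLE_PATTERN = re.compile(r'//\s*Example:(.*(?:\n[ \t]*//.*)*)')


def extract_examples(content: str) -> list:
    """Collect the comment block that follows each `// Example:` marker."""
    examples = []
    for match in EXAMPLE_PATTERN.finditer(content):
        # Drop the leading `//` and at most one following whitespace character.
        lines = [
            re.sub(r'^\s*//\s?', '', line).rstrip()
            for line in match.group(1).splitlines()
        ]
        code = textwrap.dedent("\n".join(lines)).strip()
        examples.append({
            "code": code,
            "line": content[:match.start()].count("\n") + 1,
        })
    return examples


sample = (
    "Real x = 1;\n"
    "// Example:\n"
    "//   Real y = x + 1;\n"
    "//   Real w = y * 2;\n"
    "Real z = 2;\n"
)
print(extract_examples(sample))
```

Any ordinary `//` comment that directly follows the example block is still swallowed into it, so example blocks need a blank or code line after them.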

.github/scripts/extract_tol_docs.py

- Temporarily remove path filters for testing
- Add feat/auto-documentation branch to triggers
- This will help diagnose workflow issues

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

- Add automatic PR comments for build status
- Clean up test workflows
- Documentation workflow is now fully functional

✅ Workflow successfully builds documentation
✅ PR comments show build status
✅ Deploys to GitHub Pages on master (when enabled)

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

github-actions · 2025-08-01T12:08:35Z

✅ Documentation build success for commit facabbe

View workflow run

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

github-actions · 2025-08-01T12:25:26Z

✅ Documentation build success for commit fa24b56

View workflow run

m-marinucci · 2025-08-01T12:28:16Z

reviewed

Copilot AI review requested due to automatic review settings August 1, 2025 11:52

sourcery-ai bot previously approved these changes Aug 1, 2025

View reviewed changes

m-marinucci dismissed sourcery-ai[bot]’s stale review via 6edf074 August 1, 2025 11:55

m-marinucci and others added 7 commits August 1, 2025 13:56

test: Add simple workflow to verify GitHub Actions

25010f3

🤖 Generated with Claude Code (https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

fix: Correct shell variable handling in workflow

809f1e1

- Fix heredoc syntax for metrics generation
- Use proper variable substitution
- Remove escaping where not needed

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

fix: Replace complex workflow with minimal version for debugging

57d84ee

- Remove documentation.yml temporarily
- Add minimal workflow to isolate issue
- Will rebuild incrementally

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

feat: Add working documentation workflow

13b4183

- Start with minimal working version
- Basic MkDocs setup
- Will add features incrementally

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

fix: Update to non-deprecated GitHub Actions

b299a5a

- Update upload-artifact to v4
- Add proper deployment steps
- Fix Pages deployment flow

🤖 Generated with Claude Code (https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

fix: Add pull-request write permission for PR comments

281b74c

🤖 Generated with Claude Code (https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

Copilot AI reviewed Aug 1, 2025

View reviewed changes

m-marinucci and others added 4 commits August 1, 2025 14:22

Update .github/scripts/extract_tol_docs.py

43c2654

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Update .github/scripts/extract_tol_docs.py

e334c31

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Update .github/scripts/extract_tol_docs.py

141ce40

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Update .github/scripts/extract_tol_docs.py

ea2fd92

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

m-marinucci mentioned this pull request Aug 1, 2025

**issue (code-quality):** We've found these issues: #70

Open

m-marinucci merged commit be79a97 into master Aug 1, 2025
3 checks passed
