PyCaret: Refactor section to dedicated page #303
Walkthrough
Adds a new PyCaret integration page under docs/integrate, links it from the integrations toctree, and removes the PyCaret content from the ML topic page, replacing it with a seealso pointer to the new integration page.
Sequence Diagram(s)
Not applicable: documentation-only changes; no control-flow or runtime behavior modified.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks
✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (8)
docs/integrate/pycaret/index.md (8)
6-6: Add descriptive alt text to the logo for a11y.
Empty alt text reduces accessibility.
-[{w=180px}](https://pycaret.org/)
+[{w=180px}](https://pycaret.org/)
18-21: Use consistent proper-noun casing for libraries.
Prefer “XGBoost”, “Ray”, “LightGBM”; keep “scikit-learn”.
-libraries like scikit-learn, xgboost, ray, lightgbm, and many more.
+libraries like scikit-learn, XGBoost, Ray, LightGBM, and many more.
26-30: Tighten wording and fix punctuation.
Streamline the AutoML concept sentence.
-The general concept of PyCaret - and for the matter of fact for AutoML in general -
-is rather simple: One takes the raw data, splits it into a training and a test set
-and then trains a number of different models on the training set. The models are
-then evaluated on the test set and the best performing model is selected.
+The general concept of PyCaret—and, in fact, of AutoML in general—is straightforward:
+take raw data, split it into training and test sets, train multiple models on the
+training set, evaluate on the test set, and select the best‑performing model.
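For readers of the page, the workflow in that sentence could be illustrated with a short example. The following is a minimal, dependency-free sketch and not PyCaret's API: the helper names (`split_rows`, `fit_majority`, `fit_nearest_centroid`, `fit_one_nearest_neighbor`, `select_best_model`) and the synthetic data are assumptions made up for illustration. It splits the data, trains several toy models on the training set, scores each on the test set, and picks the best one.

```python
import random
from collections import Counter


def split_rows(rows, test_ratio=0.25, seed=0):
    """Shuffle the rows and split them into a training and a test set."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def fit_majority(train):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_centroid(train):
    """Predict the label whose mean feature value is closest to x."""
    grouped = {}
    for x, y in train:
        grouped.setdefault(y, []).append(x)
    centroids = {y: sum(xs) / len(xs) for y, xs in grouped.items()}
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def fit_one_nearest_neighbor(train):
    """Predict the label of the closest training row."""
    return lambda x: min(train, key=lambda row: abs(row[0] - x))[1]


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def select_best_model(rows, candidates, seed=0):
    """Train every candidate on the training set and rank them on the test set."""
    train, test = split_rows(rows, seed=seed)
    scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores


# Synthetic, well-separated data (hypothetical): (feature, label) pairs.
rows = [(float(i), 0) for i in range(20)] + [(float(i), 1) for i in range(30, 50)]
candidates = {
    "majority": fit_majority,
    "nearest_centroid": fit_nearest_centroid,
    "one_nearest_neighbor": fit_one_nearest_neighbor,
}
best_name, scores = select_best_model(rows, candidates)
print(best_name, scores)
```

A real AutoML library adds preprocessing, cross-validation, and many more model families on top of this loop; the sketch only shows the split, train, evaluate, select skeleton.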
39-41: Name tuning methods consistently.
Use standard terms: “Grid Search”, “Random Search”, “Bayesian Optimization”; minor grammar fix.
-Modern algorithms for executing all these experiments are - amongst others -
-GridSearch, RandomSearch and BayesianSearch. For a quick introduction into
-these methods, see [Introduction to hyperparameter tuning].
+Common approaches include Grid Search, Random Search, and Bayesian Optimization.
+For a quick introduction to these methods, see [Introduction to hyperparameter tuning].
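To make the difference between the first two approaches concrete, here is a small standard-library sketch. It is an illustration under assumptions, not the API of PyCaret or any tuning library: `validation_score` is a hypothetical stand-in for "train a model with these hyperparameters and score it on held-out data", and the parameter names and values are invented. Bayesian Optimization is not shown, since it needs a surrogate model.

```python
import itertools
import random


def validation_score(params):
    """Hypothetical objective: peaks at learning_rate=0.1, max_depth=4."""
    return (
        -((params["learning_rate"] - 0.1) ** 2)
        - 0.01 * (params["max_depth"] - 4) ** 2
    )


def grid_search(grid, score):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    combos = itertools.product(*(grid[name] for name in names))
    candidates = [dict(zip(names, values)) for values in combos]
    return max(candidates, key=score)


def random_search(space, score, n_iter=20, seed=0):
    """Evaluate n_iter randomly sampled combinations and keep the best."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]
    return max(candidates, key=score)


grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 4, 8]}
best_grid = grid_search(grid, validation_score)
best_random = random_search(grid, validation_score, n_iter=5)
print(best_grid, best_random)
```

Grid search is exhaustive, so its cost grows with the product of the value counts; random search caps the budget at `n_iter` evaluations at the price of possibly missing the best combination.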
49-49: Spelling: “straightforward” (no hyphen).
-and provides a simple interface to execute all these experiments in a
-straight-forward way. The notebooks referenced below demonstrate how this works.
+and provides a simple interface to execute all these experiments in a
+straightforward way. The notebooks referenced below demonstrate how this works.
70-74: Align tags with the classification notebook.
Remove Time Series/Anomaly/Forecasting; add Classification.
-{tags-primary}`Fundamentals` \
-{tags-secondary}`Time Series` \
-{tags-secondary}`Anomaly Detection` \
-{tags-secondary}`Prediction / Forecasting`
+{tags-primary}`Fundamentals` \
+{tags-secondary}`Classification`
91-95: Remove “Classification” tag from the forecasting notebook block.
 {tags-primary}`Fundamentals` \
 {tags-secondary}`Time Series` \
 {tags-secondary}`Training` \
-{tags-secondary}`Classification` \
 {tags-secondary}`Forecasting`
51-56: Consider adding a “-learn” page if a step-by-step tutorial is planned.
Matches the repo’s integration pattern (index anchor + separate learn page).
Happy to scaffold docs/integrate/pycaret/learn.md with a toctree and placeholders.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/integrate/index.md (1 hunks)
docs/integrate/pycaret/index.md (1 hunks)
docs/topic/ml/index.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-09T16:19:43.966Z
Learnt from: amotl
PR: crate/cratedb-guide#238
File: docs/integrate/azure-functions/learn.rst:1-1
Timestamp: 2025-08-09T16:19:43.966Z
Learning: In the CrateDB Guide documentation, main integration anchors (e.g., `azure-functions`) are intentionally placed in the `index.md` files of their respective integration folders, while detailed tutorials use the `-learn` suffix (e.g., `azure-functions-learn`) in their `learn.rst` or `learn.md` files. This is a deliberate architectural pattern for the documentation restructuring.
Applied to files:
docs/integrate/pycaret/index.md
🔇 Additional comments (3)
docs/integrate/index.md (1)
61-61: LGTM: toctree entry added in the right place.
Ordering and slug look consistent. Please confirm the doc build passes on RTD.
Run “sphinx-build -nW …” or the project’s docs build workflow locally/CI to ensure no broken refs.
docs/topic/ml/index.md (1)
132-134: LGTM: replaced section with seealso to dedicated page.
Keeps the anchor and avoids content duplication.
docs/integrate/pycaret/index.md (1)
64-67: Resolved. Badges now point to .ipynb notebooks. GitHub and Colab links for both classification and timeseries return HTTP 200.
Actionable comments posted: 0
🧹 Nitpick comments (5)
docs/integrate/pycaret/index.md (5)
63-65: Fix badge/link mismatch: points to .py while labeled “Notebook on GitHub”.
Link the classification badge to the .ipynb on GitHub (it already exists for Colab).
-[automl-classify-github]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/pycaret/automl_classification_with_pycaret.py
+[automl-classify-github]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/pycaret/automl_classification_with_pycaret.ipynb
Also applies to: 96-97
68-71: Remove inaccurate tag “Time Series” from the classification card.
The classification example isn’t time series.
 {tags-primary}`Fundamentals` \
-{tags-secondary}`Time Series` \
 {tags-secondary}`Classification`
86-91: Validate tags for forecasting card.
“Anomaly Detection” seems unrelated to a forecasting tutorial; suggest dropping it (keep Time Series + Prediction/Forecasting).
 {tags-primary}`Fundamentals` \
 {tags-secondary}`Time Series` \
-{tags-secondary}`Anomaly Detection` \
 {tags-secondary}`Prediction / Forecasting`
38-40: Prefer vendor‑neutral docs over Medium (paywalls/walled gardens).
Recommend pointing “Introduction to hyperparameter tuning” to scikit‑learn’s model selection docs.
-[Introduction to hyperparameter tuning]: https://medium.com/analytics-vidhya/comparison-of-hyperparameter-tuning-algorithms-grid-search-random-search-bayesian-optimization-5326aaef1bd1
+[Introduction to hyperparameter tuning]: https://scikit-learn.org/stable/modules/grid_search.html
Also applies to: 100-100
4-7: Host the logo asset locally for stability.
Hotlinking a GitHub asset can break; consider vendoring the logo under docs/_static and referencing it relatively.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/integrate/pycaret/index.md
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
🔇 Additional comments (2)
docs/integrate/pycaret/index.md (2)
54-92: Confirm custom directives/roles render in RTD.
info-card, grid-item, tags-primary/tags-secondary look project-specific. Please confirm no Sphinx warnings and the cards render as expected in the preview.
1-52: Nice standalone integration page.
Good structure, anchor, and concise “About/Concept/Benefits/Learn” sections.
About
Details about PyCaret previously lived on a topic-specific page. Let's make it more generic, following the principle "each integration item gets a dedicated slot within the »integrations« backbone section, then link to it", so nobody is confused about where to add more of the same kind.
Preview
https://cratedb-guide--303.org.readthedocs.build/integrate/pycaret/
References
/cc @seut