crate · amotl · Sep 16, 2025
diff --git a/docs/_include/card/timeseries-intro.md b/docs/_include/card/timeseries-intro.md
@@ -22,11 +22,11 @@ for fast aggregations.
 Combine time series data with document data: CrateDB is all you need.
 ::::
 
-::::{grid-item-card} {material-outlined}`lightbulb;2em` Time Series: Advanced SQL
+::::{grid-item-card} {material-outlined}`lightbulb;2em` Time Series: Analyzing Weather Data
 :link: guide:timeseries-analysis-weather
 :link-type: ref
 :class-footer: text-smaller
-CrateDB provides enhanced features for querying time series data.
+CrateDB provides advanced SQL features for querying time series data.
 
 :::{rubric} What's Inside
 :::

diff --git a/docs/_include/links.md b/docs/_include/links.md
@@ -48,6 +48,7 @@
 [langchain-rag-sql-binder]: https://mybinder.org/v2/gh/crate/cratedb-examples/main?labpath=topic%2Fmachine-learning%2Flangchain%2Fcratedb-vectorstore-rag-openai-sql.ipynb
 [langchain-rag-sql-colab]: https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/langchain/cratedb-vectorstore-rag-openai-sql.ipynb
 [langchain-rag-sql-github]: https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/langchain/cratedb-vectorstore-rag-openai-sql.ipynb
+[MLOps]: https://en.wikipedia.org/wiki/MLOps
 [MongoDB]: https://www.mongodb.com/docs/manual/
 [MongoDB Atlas]: https://www.mongodb.com/docs/atlas/
 [MongoDB CDC Relay]: inv:ctk:*:label#mongodb-cdc-relay

diff --git a/docs/feature/query/index.md b/docs/feature/query/index.md
@@ -166,6 +166,9 @@ search, all based on vanilla SQL.
 - {ref}`vector-search`
 - {ref}`hybrid-search`
 
+## Text-to-SQL
+Convert natural language into SQL queries using adapters and frameworks.
+- {ref}`text-to-sql`
 
 ## Time Bucketing
 https://community.cratedb.com/t/resampling-time-series-data-with-date-bin/1009

diff --git a/docs/feature/search/vector/index.md b/docs/feature/search/vector/index.md
@@ -20,9 +20,13 @@
 :::{rubric} Overview
 :::
 CrateDB can be used as a [vector database] (VDBMS) for storing and retrieving
-vector embeddings based on the FLOAT_VECTOR data type and its accompanying
-KNN_MATCH and VECTOR_SIMILARITY functions, effectively conducting HNSW
-semantic similarity searches on them, also known as vector search.
+vector embeddings.
+
+CrateDB's FLOAT_VECTOR data type stores vector embeddings, and k-nearest
+neighbor (kNN) search finds the vectors most similar to a query vector.
+The accompanying KNN_MATCH and VECTOR_SIMILARITY functions
+perform HNSW-based semantic similarity search,
+also known as vector search.
 
 :::{rubric} About
 :::
@@ -35,6 +39,10 @@ search finds similar data using approximate nearest neighbor (ANN) algorithms.
 Compared to traditional keyword search, vector search yields more relevant
 results and executes faster.
 
+Feature vectors are computed from raw data via ML methods such as feature
+extraction, word embeddings, or deep neural networks.
+
+
 :::{rubric} Details
 :::
 CrateDB uses Lucene as a storage layer, so it inherits the implementation

diff --git a/docs/solution/index.md b/docs/solution/index.md
@@ -69,6 +69,19 @@ always have them ready for historical analysis.
 :::
 
 
+:::{grid-item-card} {material-outlined}`model_training;2em` Machine Learning
+:link: machine-learning
+:link-type: ref
+:link-alt: About CrateDB for machine learning applications
+
+Learn how to integrate CrateDB with machine learning frameworks and tools.
++++
+**What's inside:**
+Use CrateDB with LangChain, LlamaIndex, MLflow, PyCaret, scikit-learn,
+or TensorFlow.
+:::
+
+
 ::::
 
 
@@ -78,5 +91,6 @@ always have them ready for historical analysis.
 
 analytics/index
 industrial/index
+Machine learning <machine-learning/index>
 migrate/index
 ```
diff --git a/docs/solution/machine-learning/index.md b/docs/solution/machine-learning/index.md
@@ -0,0 +1,198 @@
+(ml)=
+(ml-tools)=
+(machine-learning)=
+# Machine learning with CrateDB
+
+:::{include} /_include/links.md
+:::
+
+:::{div} sd-text-muted
+CrateDB provides a native vector store and adapters for integrating
+with machine learning frameworks.
+:::
+
+## Vector store
+
+:::{div}
+[Vector databases][Vector Database] can be used for similarity search,
+multi-modal search, recommendation engines, large language models (LLMs),
+and other applications.
+
+These applications can answer questions about specific sources of information,
+for example using techniques like Retrieval Augmented Generation (RAG).
+RAG is a technique for augmenting LLM knowledge with additional data,
+often private or real-time.
+:::
+
+::::{grid} 2
+:gutter: 4
+
+:::{grid-item-card} Documentation: Vector search
+:link: vector-search
+:link-type: ref
+CrateDB's FLOAT_VECTOR data type implements a vector store and the k-nearest
+neighbor (kNN) search algorithm to find vectors that are similar to a query
+vector.
++++
+Vector search on machine learning embeddings: CrateDB is all you need.
+:::
+
+:::{grid-item-card} Documentation: Hybrid search
+:link: hybrid-search
+:link-type: ref
+Hybrid search combines traditional full-text search with semantic search
+algorithms to achieve better accuracy and relevancy than either algorithm
+would individually.
++++
+Combined BM25 term search and vector search based on Apache Lucene,
+using SQL: CrateDB is all you need.
+:::
+
+:::{grid-item-card} Integration: LangChain
+:link: langchain
+:link-type: ref
+LangChain is a Python framework for developing applications powered by
+language models, with a strong focus on composability.
+It supports retrieval-augmented generation (RAG).
++++
+The LangChain adapter lets you use CrateDB as a vector store database, load
+documents via document loaders, and use LangChain’s conversational memory.
+:::
+
+::::
+
+
+(text-to-sql)=
+## Text-to-SQL
+
+:::{div} sd-text-muted
+Integrate CrateDB with Text-to-SQL solutions,
+MCP, and enterprise AI data integrations.
+:::
+
+::::{grid} 2
+:gutter: 4
+
+:::{grid-item-card} Text-to-SQL with LlamaIndex
+:link: llamaindex
+:link-type: ref
+Text-to-SQL is a technique that converts natural language queries into SQL
+queries that can be executed by a database.
+:::
+
+:::{grid-item-card} All about MCP
+:link: mcp
+:link-type: ref
+The Model Context Protocol (MCP) is an open protocol that enables seamless
+integration between LLM applications and external data sources and tools.
+:::
+
+:::{grid-item-card} MindsDB
+:link: mindsdb
+:link-type: ref
+MindsDB is the platform for customizing AI from enterprise data.
+:::
+
+::::
+
+
+## Time series analysis
+
+:::{div} sd-text-muted
+Load and analyze data from database systems for
+time series anomaly detection and forecasting.
+:::
+
+::::{grid} 2
+:gutter: 4
+
+:::{grid-item-card} Statistical analysis and visualization on huge datasets
+:link: r-tutorial
+:link-type: ref
+Learn how to create a machine learning pipeline using R and CrateDB.
+:::
+
+:::{grid-item-card} Regression analysis with pandas and scikit-learn
+:link: scikit-learn
+:link-type: ref
+Use pandas and scikit-learn to run a regression analysis within a Jupyter Notebook.
+:::
+
+:::{grid-item-card} Build model for predictive maintenance with TensorFlow
+:link: tensorflow-tutorial
+:link-type: ref
+Learn how to build a machine learning model that will predict whether
+a machine will fail within a specified time window in the future.
+:::
+
+:::{grid-item-card} Advanced time series analysis with MLflow and PyCaret
+:link: ml-timeseries
+:link-type: ref
+Learn how to conduct advanced data analysis on large time series datasets
+with CrateDB, MLflow, and PyCaret:
+anomaly detection and forecasting, time series decomposition,
+and exploratory data analysis (EDA).
+:::
+
+::::
+
+
+## MLOps and model training
+
+:::{div} sd-text-muted
+CrateDB supports MLOps procedures through adapters to best-of-breed software
+frameworks.
+:::
+
+:::{div}
+Training a machine learning model, running it in production, and maintaining
+it require a significant amount of data processing and bookkeeping
+operations.
+
+Machine Learning Operations ([MLOps]) is a paradigm that aims to deploy and
+maintain machine learning models in production reliably and efficiently,
+including experiment tracking, in the spirit of continuous development
+and DevOps.
+:::
+
+::::{grid} 2
+:gutter: 4
+
+:::{grid-item-card} MLflow
+:link: mlflow
+:link-type: ref
+MLflow is an open-source platform to manage the whole ML lifecycle,
+including experimentation, reproducibility, deployment, and a central
+model registry.
++++
+CrateDB can be used as a storage database for the MLflow Tracking subsystem.
+:::
+
+:::{grid-item-card} PyCaret
+:link: pycaret
+:link-type: ref
+PyCaret is an open-source, low-code machine learning library for Python
+that automates machine learning workflows (AutoML).
++++
+CrateDB can be used as a storage database for training and production datasets.
+:::
+
+:::{grid-item-card} Advanced time series analysis with MLflow and PyCaret
+:link: ml-timeseries
+:link-type: ref
+:columns: 12
+Learn how to conduct advanced data analysis on large time series datasets
+with CrateDB, MLflow, and PyCaret.
++++
+**What's inside:** Anomaly detection and forecasting, time series decomposition,
+exploratory data analysis (EDA).
+:::
+
+::::
+
+
+:::{toctree}
+:maxdepth: 1
+:hidden:
+time-series
+:::