Commit 7395fb1

Machine learning: Refactor "advanced time series analysis" section
1 parent bf415e0 commit 7395fb1

5 files changed, +88 -130 lines changed

docs/_include/card/timeseries-intro.md

Lines changed: 2 additions & 2 deletions
@@ -22,11 +22,11 @@ for fast aggregations.
 Combine time series data with document data: CrateDB is all you need.
 ::::

-::::{grid-item-card} {material-outlined}`lightbulb;2em` Time Series: Advanced SQL
+::::{grid-item-card} {material-outlined}`lightbulb;2em` Time Series: Analyzing Weather Data
 :link: guide:timeseries-analysis-weather
 :link-type: ref
 :class-footer: text-smaller
-CrateDB provides enhanced features for querying time series data.
+CrateDB provides advanced SQL features for querying time series data.

 :::{rubric} What's Inside
 :::

docs/solution/machine-learning/index.md

Lines changed: 72 additions & 30 deletions
@@ -20,7 +20,8 @@ and other applications.

 These applications can answer questions about specific sources of information,
 for example using techniques like Retrieval Augmented Generation (RAG).
-RAG is a technique for augmenting LLM knowledge with additional data.
+RAG is a technique for augmenting LLM knowledge with additional data,
+often private or real-time.
 :::

 ::::{grid} 2
@@ -52,6 +53,7 @@ using SQL: CrateDB is all you need.
 :link-type: ref
 LangChain is a framework for developing applications powered by language models,
 written in Python, and with a strong focus on composability.
+It supports retrieval-augmented generation (RAG).
 +++
 The LangChain adapter for CrateDB provides support to use CrateDB as a vector
 store database, to load documents using LangChain’s DocumentLoader, and also
@@ -61,10 +63,13 @@ supports LangChain’s conversational memory subsystem.
 ::::


-## Text-to-SQL and MCP
+(text-to-sql)=
+## Text-to-SQL

-The adapters enumerated below integrate CrateDB for Text-to-SQL purposes,
+:::{div} sd-text-muted
+Integrate CrateDB with Text-to-SQL solutions,
 and provide MCP and AI enterprise data integrations.
+:::

 ::::{grid} 2
 :gutter: 4
@@ -76,7 +81,7 @@ Text-to-SQL is a technique that converts natural language queries into SQL
 queries that can be executed by a database.
 :::

-:::{grid-item-card} MCP
+:::{grid-item-card} All about MCP
 :link: mcp
 :link-type: ref
 The Model Context Protocol (MCP), is an open protocol that enables seamless
@@ -92,8 +97,54 @@ MindsDB is the platform for customizing AI from enterprise data.
 ::::


+## Time series analysis
+
+:::{div} sd-text-muted
+Load and analyze data from database systems for
+time series anomaly detection and forecasting.
+:::
+
+::::{grid} 2
+:gutter: 4
+
+:::{grid-item-card} Statistical analysis and visualization on huge datasets
+:link: r-tutorial
+:link-type: ref
+Learn how to create a machine learning pipeline using R and CrateDB.
+:::
+
+:::{grid-item-card} Regression analysis with pandas and scikit-learn
+:link: scikit-learn
+:link-type: ref
+Use pandas and scikit-learn to run a regression analysis within a Jupyter Notebook.
+:::
+
+:::{grid-item-card} Build model for predictive maintenance with TensorFlow
+:link: tensorflow-tutorial
+:link-type: ref
+Learn how to build a machine learning model that will predict whether
+a machine will fail within a specified time window in the future.
+:::
+
+:::{grid-item-card} Advanced time series analysis with MLflow and PyCaret
+:link: ml-timeseries
+:link-type: ref
+Learn how to conduct advanced data analysis on large time series datasets
+with CrateDB, MLflow, and PyCaret:
+Anomaly detection and forecasting, time series decomposition,
+Exploratory data analysis (EDA).
+:::
+
+::::
+
+
 ## MLOps and model training

+:::{div} sd-text-muted
+CrateDB supports MLOps procedures through adapters to best-of-breed software
+frameworks.
+:::
+
 :::{div}
 Training a machine learning model, running it in production, and maintaining
 it, requires a significant amount of data processing and bookkeeping
@@ -103,16 +154,14 @@ Machine Learning Operations [MLOps] is a paradigm that aims to deploy and
 maintain machine learning models in production reliably and efficiently,
 including experiment tracking, and in the spirit of continuous development
 and DevOps.
-
-CrateDB supports MLOps procedures through adapters to best-of-breed software
-frameworks.
 :::

 ::::{grid} 2
 :gutter: 4

 :::{grid-item-card} MLflow
 :link: mlflow
+:link-type: ref
 MLflow is an open-source platform to manage the whole ML lifecycle,
 including experimentation, reproducibility, deployment, and a central
 model registry.
@@ -122,36 +171,29 @@ CrateDB can be used as a storage database for the MLflow Tracking subsystem.

 :::{grid-item-card} PyCaret
 :link: pycaret
+:link-type: ref
 PyCaret is an open-source, low-code machine learning library for Python
 that automates machine learning workflows (AutoML).
 +++
 CrateDB can be used as a storage database for training and production datasets.
 :::

-::::
-
-
-## Time-series anomaly detection and forecasting
-
-Load and analyze data from database systems.
-
-::::{grid} 2
-:gutter: 4
-
-:::{grid-item-card} Statistical analysis and visualization on huge datasets
-:link: r-tutorial
-Learn how to create a Machine Learning pipeline using R and CrateDB.
+:::{grid-item-card} Advanced time series analysis with MLflow and PyCaret
+:link: ml-timeseries
+:link-type: ref
+:columns: 12
+Learn how to conduct advanced data analysis on large time series datasets
+with CrateDB, MLflow, and PyCaret.
++++
+**What's inside:** Anomaly detection and forecasting, time series decomposition,
+exploratory data analysis (EDA).
 :::

-:::{grid-item-card} Regression analysis with pandas and scikit-learn
-:link: scikit-learn-tutorial
-Use pandas and scikit-learn to run a regression analysis within a Jupyter Notebook.
-:::
+::::

-:::{grid-item-card} Use TensorFlow and CrateDB for predictive maintenance
-:link: tensorflow-tutorial
-Learn how to build a machine learning model that will predict whether
-a machine will fail within a specified time window in the future.
-:::

-::::
+:::{toctree}
+:maxdepth: 1
+:hidden:
+time-series
+:::

docs/topic/timeseries/advanced.md renamed to docs/solution/machine-learning/time-series.md

Lines changed: 3 additions & 97 deletions
@@ -1,16 +1,17 @@
+(ml-timeseries)=
 (timeseries-advanced)=
 (timeseries-analysis)=
-
 # Advanced Time Series Analysis

+:::{div} sd-text-muted
 Learn how to conduct advanced data analysis on large time series datasets
 with CrateDB.
+:::

 {tags-primary}`Anomaly detection`
 {tags-primary}`Forecasting / Prediction`
 {tags-primary}`Time series decomposition`
 {tags-primary}`Exploratory data analysis`
-{tags-primary}`Metadata integration`

 :::{include} /_include/links.md
 :::
@@ -169,102 +170,7 @@ identify patterns.
 ::::


-(timeseries-analysis-metadata)=
-## Metadata Integration
-
-CrateDB is particularly effective when you need to combine time series data
-with metadata, for instance, in scenarios where data like sensor readings
-or log entries, need to be augmented with additional context for more
-insightful analysis. See also [](#document).
-
-CrateDB supports effective time series analysis with fast aggregations, a
-rich set of built-in functions, and [JOIN](inv:crate-reference#sql_joins)
-operations.
-
-:::{rubric} Examples
-:::
-
-::::{info-card}
-
-:::{grid-item}
-:columns: auto 9 9 9
-**Analyzing Device Readings with Metadata Integration**
-
-This tutorial illustrates how to augment time series data with metadata, in
-order to enable more comprehensive analysis. It uses a time series dataset that
-captures various device readings, such as battery, CPU, and memory information.
-
-{{ '{}(#timeseries-objects)'.format(tutorial) }}
-:::
-
-:::{grid-item}
-:columns: 3
-
-{tags-primary}`Rich Time Series`
-{tags-primary}`Metadata`
-
-{tags-secondary}`SQL`
-:::
-
-::::
-
-
-(timeseries-analysis-sql)=
-## SQL Analysis
-
-CrateDB offers enhanced features for analysing time series data.
-
-**Examples**
-
-::::{info-card}
-
-:::{grid-item}
-:columns: 9
-**Analyzing Weather Data**
-
-Run aggregations with gap filling / interpolation, using common
-table expressions (CTEs) and LAG / LEAD window functions.
-
-Find maximum values using the MAX_BY aggregate function, returning
-the value from one column based on the maximum or minimum value of another
-column within a group.
-
-{{ '{}(#timeseries-analysis-weather)'.format(tutorial) }}
-:::
-
-:::{grid-item}
-:columns: 3
-
-{tags-primary}`Aggregations`
-{tags-primary}`Time series`
-
-{tags-secondary}`SQL`
-:::
-
-::::
-
-
-(timeseries-visualization)=
-## Visualization
-
-Similar to EDA, just applying [data and information visualization] can yield
-significant insights into the characteristics of your data. By using
-best-of-breed data visualization tools, initial data exploration is
-mostly your first encounter with the data.
-
-:::{rubric} Examples
-:::
-
-:::{include} /_include/card/timeseries-explore.md
-:::
-
-:::{include} /_include/card/timeseries-datashader.md
-:::
-
-
-
 [anomaly]: https://en.wikipedia.org/wiki/Anomaly_(natural_sciences)
-[Data and information visualization]: https://en.wikipedia.org/wiki/Data_and_information_visualization
 [Decomposition of time series]: https://en.wikipedia.org/wiki/Decomposition_of_time_series
 [Exploratory data analysis (EDA)]: https://en.wikipedia.org/wiki/Exploratory_data_analysis
 [forecasting]: https://en.wikipedia.org/wiki/Forecasting

docs/topic/timeseries/fundamentals.md

Lines changed: 11 additions & 0 deletions
@@ -2,6 +2,14 @@
 (timeseries-fundamentals)=
 # Time Series Fundamentals with CrateDB

+:::{div} sd-text-muted
+Learn how to conduct fundamental data analysis on large time series datasets
+with CrateDB.
+:::
+
+{tags-primary}`Metadata integration`
+{tags-primary}`Advanced SQL for time series`
+
 :::{include} /_include/links.md
 :::

@@ -19,6 +27,9 @@ for your own explorations.
 :::{include} /_include/card/timeseries-explore.md
 :::

+:::{include} /_include/card/timeseries-datashader.md
+:::
+
 :::{include} /_include/card/timeseries-dask.md
 :::

docs/topic/timeseries/index.md

Lines changed: 0 additions & 1 deletion
@@ -110,7 +110,6 @@ Optimizing storage for historic time series data.
 :hidden:

 Fundamentals <fundamentals>
-Advanced <advanced>
 video
 Long Term Store <longterm>
 :::
