Merged
8 changes: 7 additions & 1 deletion docs/ingest/telemetry/index.md
@@ -2,7 +2,7 @@
(metrics-store)=
(telemetry)=
(integrate-metrics)=
# Metrics and telemetry data
# Metrics, telemetry, and logs

:::::{grid}
:padding: 0
@@ -59,6 +59,12 @@ Prometheus is an open-source systems monitoring and alerting toolkit
for collecting metrics data from applications and infrastructures.
::::

::::{grid-item-card} rsyslog
:link: rsyslog
:link-type: ref
Send logs with rsyslog, a rocket‑fast system for log processing.
::::

::::{grid-item-card} Telegraf
:link: telegraf
:link-type: ref
1 change: 1 addition & 0 deletions docs/integrate/index.md
@@ -66,6 +66,7 @@ queryzen/index
r/index
rill/index
risingwave/index
rsyslog/index
scikit-learn/index
sql-server/index
streamlit/index
45 changes: 45 additions & 0 deletions docs/integrate/rsyslog/index.md
@@ -0,0 +1,45 @@
(rsyslog)=
# rsyslog

```{div} .float-right
[![rsyslog logo](https://www.rsyslog.com/files/2019/01/logo_neu_cropped.png){height=60px loading=lazy}][rsyslog]
```
```{div} .clearfix
```


:::{rubric} About
:::

[Rsyslog] is a rocket-fast system for log processing.

It offers high performance, advanced security features, and a modular design.
Originally a regular syslogd, rsyslog has evolved into a highly versatile
logging solution capable of ingesting data from numerous sources,
transforming it, and outputting it to a wide variety of destinations.

Rsyslog can deliver over one million messages per second to local
destinations under minimal processing load. Even with complex routing
and remote forwarding, performance remains excellent.

:::{rubric} Learn
:::

::::{grid} 2

:::{grid-item-card} Tutorial: Store server logs in CrateDB using rsyslog
:link: rsyslog-tutorial
:link-type: ref
Storing server logs in CrateDB delivers fast search and aggregations on them.
:::

::::

:::{toctree}
:maxdepth: 1
:hidden:
Tutorial <tutorial>
:::


[rsyslog]: https://www.rsyslog.com/
145 changes: 145 additions & 0 deletions docs/integrate/rsyslog/tutorial.md
@@ -0,0 +1,145 @@
(rsyslog-tutorial)=
# Store server logs in CrateDB for fast search and aggregations

## Introduction

CrateDB stores server logs efficiently and makes them easy to query.

Common pain points with traditional log stacks and SIEMs include:

* timeouts when searching across long time ranges
* proprietary, complex query syntaxes
* awkward integrations with application monitoring dashboards

CrateDB addresses these issues: query logs with standard SQL from any
PostgreSQL‑compatible tool, and use full‑text search and aggregations
backed by efficient indexes. The sections below walk through a minimal
setup.

## Setup

### CrateDB

First, start CrateDB. For production, use a dedicated cluster. For this demo, run a single‑node container:

```bash
sudo docker run -d --name cratedb \
-p 4200:4200 -p 5432:5432 \
-e CRATE_HEAP_SIZE=1g \
crate:latest -Cdiscovery.type=single-node
```
Comment on lines +25 to +30

@amotl (Member, Author), Sep 17, 2025:
@kneth: What do you think about omitting the sudo all around, so the tutorial commands can easily be used more universally, e.g. on macOS, without much ado?

@amotl (Member, Author), Sep 17, 2025:
Correcting myself: rsyslog is not meant to be used on macOS, so nevermind: Versatile users can easily omit the sudo prefix on their own.
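
Before creating the table, it can help to confirm that the container answers on its HTTP port. The following is a minimal sketch, not part of the original tutorial: `wait_for_http` is a hypothetical helper name, and it assumes `curl` is installed and that port 4200 is mapped as in the `docker run` command above.

```bash
# Hypothetical helper: poll an HTTP URL until it answers or attempts run out.
wait_for_http() {
  url="$1"
  attempts="${2:-30}"   # number of tries, roughly one second apart
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP error responses as well.
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage (assumes the port mapping from the container command above):
# wait_for_http http://localhost:4200/ 60 && echo "CrateDB is up"
```

The function only defines the polling loop; nothing runs until it is called, so it can be pasted into a provisioning script without side effects.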


Next, create a table for logs. Open `http://localhost:4200/#!/console` or invoke `crash` and run:

```sql
CREATE TABLE doc.systemevents (
message TEXT,
INDEX message_ft USING FULLTEXT(message) WITH (analyzer = 'english'),
facility INTEGER,
fromhost TEXT,
priority INTEGER,
DeviceReportedTime TIMESTAMP,
ReceivedAt TIMESTAMP,
InfoUnitID INTEGER,
SysLogTag TEXT
);
```
Tip: On headless systems, run queries with the {ref}`command-line tools <connect-cli>`.

Then we need an account for the logging system:

```sql
-- Use a strong secret; e.g. from a secret manager or env var.
CREATE USER rsyslog WITH (PASSWORD='pwd123');
```

Next, grant the account DML permissions on the table created above:

```sql
GRANT DML ON TABLE doc.systemevents TO rsyslog;
```
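
The `pwd123` password above is only a placeholder. As a hedged sketch of one way to produce a stronger secret, the snippet below generates a random password and prints the matching `CREATE USER` statement. The helper name `generate_password` and the variable `RSYSLOG_PASSWORD` are assumptions made for illustration. The password is restricted to letters and digits so that it needs no escaping inside SQL quotes or inside the connection URL used later.

```bash
# Hypothetical helper: print a 32-character password of letters and digits,
# read from the kernel's random source.
generate_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32
}

RSYSLOG_PASSWORD="$(generate_password)"

# Print the statement so it can be pasted into the console or piped to a client.
printf "CREATE USER rsyslog WITH (password = '%s');\n" "$RSYSLOG_PASSWORD"
```

Keep the generated value around, for example in a secret manager, because the rsyslog configuration needs the same password.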

### rsyslog

We will use [rsyslog](https://github.com/rsyslog/rsyslog) to send the logs to CrateDB. This setup needs `rsyslog` v8.2202 or higher and the `ompgsql` module:

```bash
sudo DEBIAN_FRONTEND=noninteractive apt install --yes software-properties-common
sudo add-apt-repository -y ppa:adiscon/v8-stable
sudo apt update --yes
sudo debconf-set-selections <<< 'rsyslog-pgsql rsyslog-pgsql/dbconfig-install string false'
sudo apt install --yes rsyslog rsyslog-pgsql
```

Review comments on the `add-apt-repository` line:

Member, Author:
I think add-apt-repository has also been phased out on modern Debian/Ubuntu? Shall we exercise and verify this explicitly? Did you?

Member, Author:
I've just verified the command add-apt-repository does not exist on any of the Docker-based default installations of Ubuntu, probing the three most recent LTS releases 20, 22, and 24.

    docker run --rm -it ubuntu:20.04 bash
    root@33d8b5a2784f:/# add-apt-repository
    bash: add-apt-repository: command not found

It always needed the installation of the venerable software-properties-common package. While many people may have it already, it creates significant installation overhead for those who don't.

    # apt install -y software-properties-common
    Reading package lists... Done
    Building dependency tree... Done
    Reading state information... Done
    [...]
    0 upgraded, 123 newly installed, 0 to remove and 1 not upgraded.
    Need to get 43.1 MB of archives.
    After this operation, 157 MB of additional disk space will be used.
    [...]

It needs to install zillions of dependency packages and also prompts asking to configure tzdata.

    Configuring tzdata
    ------------------

    Please select the geographic area in which you live. Subsequent configuration questions will narrow this down by presenting a list of cities, representing the
    time zones in which they are located.

      1. Africa  2. America  3. Antarctica  4. Arctic  5. Asia  6. Atlantic  7. Australia  8. Europe  9. Indian  10. Pacific  11. Etc  12. Legacy
    Geographic area:

Member, Author:
I am now adding this command to install the software-properties-common package to the procedure. It's the quickest path to success, even if it adds drag by the amount of dependencies it pulls in.

    docker run --rm -it ubuntu:24.04 bash
    sudo apt update --yes
    sudo DEBIAN_FRONTEND=noninteractive apt install --yes software-properties-common

Member, Author:
Added with 63d9a30.

Member:
Yeah, it has become too complicated 😢

Let's now configure it to use the account we created earlier:

```bash
echo 'module(load="ompgsql")' | sudo tee /etc/rsyslog.d/pgsql.conf
echo '*.* action(type="ompgsql" conninfo="postgresql://rsyslog:pwd123@localhost/doc")' | sudo tee -a /etc/rsyslog.d/pgsql.conf
sudo chmod 640 /etc/rsyslog.d/pgsql.conf
sudo systemctl restart rsyslog
```
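
The same two configuration lines can also be generated from variables, which avoids typing the password into the shell history. This is a sketch under stated assumptions: `write_rsyslog_pgsql_conf` is a hypothetical helper, not something rsyslog ships, and the password must not contain characters that need percent-encoding in a URL.

```bash
# Hypothetical helper: write the ompgsql configuration to a target file.
# Arguments: target path, user, password, optional host (default: localhost).
write_rsyslog_pgsql_conf() {
  target="$1"
  user="$2"
  password="$3"
  host="${4:-localhost}"
  # umask 027 creates the file with mode 640, matching the chmod step above.
  ( umask 027 && cat > "$target" <<EOF
module(load="ompgsql")
*.* action(type="ompgsql" conninfo="postgresql://${user}:${password}@${host}/doc")
EOF
  )
}

# Usage, writing to a scratch file first and installing it afterwards:
# write_rsyslog_pgsql_conf ./pgsql.conf rsyslog "$RSYSLOG_PASSWORD"
# sudo install -m 640 ./pgsql.conf /etc/rsyslog.d/pgsql.conf
# sudo rsyslogd -N1    # configuration check only, does not start the daemon
```

Running the configuration check before `systemctl restart rsyslog` surfaces syntax errors without interrupting logging.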

For production setups that need queuing for additional reliability, see the available settings in the [rsyslog documentation](https://www.rsyslog.com/doc/v8-stable/tutorials/high_database_rate.html).

### MediaWiki

To generate logs, run a [MediaWiki](https://www.mediawiki.org/wiki/MediaWiki) container and forward its logs to rsyslog:

```bash
sudo docker run --name mediawiki \
-p 80:80 -d \
--log-driver syslog \
--log-opt syslog-address=unixgram:///dev/log \
mediawiki
```

Open `http://localhost/` to see the MediaWiki setup page.
Click “set up the wiki”, then “Continue” to generate log entries.
CrateDB now stores new rows in `doc.systemevents`, with `syslogtag` matching the container ID.
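
To confirm from the shell that rows are arriving, one option is CrateDB's HTTP endpoint at `/_sql`. The sketch below is illustrative only: `sql_payload`, `cratedb_sql`, and the `CRATEDB_URL` variable are hypothetical names, and it assumes that `curl` is installed and that the endpoint accepts unauthenticated requests from your host, as the Admin UI step earlier implies.

```bash
# Hypothetical helper: wrap one SQL statement in the JSON body expected by
# the /_sql endpoint. No escaping is done, so the statement must not contain
# double quotes or backslashes.
sql_payload() {
  printf '{"stmt": "%s"}' "$1"
}

# Hypothetical helper: POST a statement to /_sql and print the JSON response.
cratedb_sql() {
  curl --silent --show-error \
    --header 'Content-Type: application/json' \
    --data "$(sql_payload "$1")" \
    "${CRATEDB_URL:-http://localhost:4200}/_sql"
}

# Usage:
# cratedb_sql 'SELECT count(*) FROM doc.systemevents'
```

A count that grows while you click through the MediaWiki setup pages indicates that the whole pipeline is working.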


## Explore

Use {ref}`crate-reference:predicates_match` to find specific error messages:

```sql
SELECT devicereportedtime,message
FROM doc.systemevents
WHERE MATCH(message_ft, 'Could not reliably determine') USING PHRASE
ORDER BY 1 DESC;
```

```text
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| devicereportedtime | message |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1691510710000 | AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message |
| 1691510710000 | AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
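
The `devicereportedtime` values in the result above are epoch timestamps in milliseconds. As a small sketch for reading them on the command line, the hypothetical helper below converts such a value to UTC. It assumes GNU `date`, which accepts `@<seconds>` as input.

```bash
# Hypothetical helper: convert epoch milliseconds to an ISO 8601 UTC timestamp.
# Integer division drops the millisecond part.
epoch_ms_to_utc() {
  date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%dT%H:%M:%SZ'
}

epoch_ms_to_utc 1691510710000
```

For the value shown in the table, this prints `2023-08-08T16:05:10Z`.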

Show the top log sources by event count:

```sql
SELECT syslogtag,count(*)
FROM doc.systemevents
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
```

```text
+----------------------+----------+
| syslogtag            | count(*) |
+----------------------+----------+
| kernel:              |       23 |
| 083053ae8ea3[52134]: |       20 |
| systemd[1]:          |       15 |
| sudo:                |       10 |
| rsyslogd:            |        5 |
+----------------------+----------+
```

We hope this was useful. Share feedback and questions in the
[CrateDB Community](https://community.cratedb.com/).