docs: Query History export #9362

Merged 2 commits on Apr 23, 2025
1 change: 1 addition & 0 deletions docs/pages/guides/recipes.mdx
@@ -77,6 +77,7 @@ These recipes will show you the best practices of using Cube.

- [Building UI with drilldowns](/guides/recipes/data-exploration/drilldowns)
- [Retrieving numeric values on the front-end](/guides/recipes/data-exploration/cast-numerics)
- [Analyzing data from Query History export](/guides/recipes/data-exploration/query-history-export)

### Upgrading Cube

3 changes: 2 additions & 1 deletion docs/pages/guides/recipes/data-exploration/_meta.js
@@ -1,4 +1,5 @@
module.exports = {
  "drilldowns": "Building UI with drilldowns",
  "cast-numerics": "Retrieving numeric values on the front-end"
  "cast-numerics": "Retrieving numeric values on the front-end",
  "query-history-export": "Analyzing data from Query History export"
}
199 changes: 199 additions & 0 deletions docs/pages/guides/recipes/data-exploration/query-history-export.mdx
@@ -0,0 +1,199 @@
# Analyzing data from Query History export

You can use [Query History export][ref-query-history-export] to bring [Query
History][ref-query-history] data to an external monitoring solution for further
analysis.

In this recipe, we will show you how to export Query History data to Amazon S3 and then
analyze it with Cube by reading the data from S3 using DuckDB.

## Configuration

Here's a [Vector configuration][ref-vector-configuration] for exporting Query History to Amazon S3
while also outputting it to the console of the Vector agent in your Cube Cloud deployment.

In the example below, we are using the `aws_s3` sink to export data to the `cube-query-history-export-demo`
bucket in Amazon S3, but you can use any other storage solution that Vector supports.

```toml
[sinks.aws_s3]
type = "aws_s3"
inputs = [
  "query-history"
]
bucket = "cube-query-history-export-demo"
region = "us-east-2"
compression = "gzip"

[sinks.aws_s3.auth]
access_key_id = "$CUBE_CLOUD_MONITORING_AWS_ACCESS_KEY_ID"
secret_access_key = "$CUBE_CLOUD_MONITORING_AWS_SECRET_ACCESS_KEY"

[sinks.aws_s3.encoding]
codec = "json"

[sinks.aws_s3.healthcheck]
enabled = false


[sinks.my_console]
type = "console"
inputs = [
  "query-history"
]
target = "stdout"
encoding = { codec = "json" }
```
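
Optionally, you can export only a subset of events by placing Vector's `filter` transform
between the `query-history` input and the sink. Here's a minimal sketch that keeps only
failed requests; the `errors_only` transform name is arbitrary, and `status` is one of the
exported fields:

```toml
# A sketch, not part of the original recipe: keep only failed requests
[transforms.errors_only]
type = "filter"
inputs = [
  "query-history"
]
condition = '.status == "error"'

# Then point the sink at the transform instead of the raw input:
# inputs = [
#   "errors_only"
# ]
```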

You also need to set the following environment variables on the <Btn>Settings → Environment
variables</Btn> page of your Cube Cloud deployment:

```bash
CUBE_CLOUD_MONITORING_AWS_ACCESS_KEY_ID=your-access-key-id
CUBE_CLOUD_MONITORING_AWS_SECRET_ACCESS_KEY=your-secret-access-key

CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=your-access-key-id
CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=your-secret-access-key
CUBEJS_DB_DUCKDB_S3_REGION=us-east-2
```

## Data modeling

Here's an example data model for analyzing data from Query History export that is brought to a
bucket in Amazon S3. The data is read directly from S3 using DuckDB.

With this data model, you can run queries that aggregate data by dimensions such as
`status`, `environment_name`, `api_type`, etc., and also calculate metrics like
`count`, `total_duration`, or `avg_duration`:

```yaml
cubes:
  - name: requests
    sql: >
      SELECT
        *,
        api_response_duration_ms / 1000 AS api_response_duration,
        EPOCH_MS(start_time_unix_ms) AS start_time,
        EPOCH_MS(end_time_unix_ms) AS end_time
      FROM read_json_auto('s3://cube-query-history-export-demo/**/*.log.gz')

    dimensions:
      - name: trace_id
        sql: trace_id
        type: string
        primary_key: true

      - name: deployment_id
        sql: deployment_id
        type: number

      - name: environment_name
        sql: environment_name
        type: string

      - name: api_type
        sql: api_type
        type: string

      - name: api_query
        sql: api_query
        type: string

      - name: security_context
        sql: security_context
        type: string

      - name: cache_type
        sql: cache_type
        type: string

      - name: start_time
        sql: start_time
        type: time

      - name: end_time
        sql: end_time
        type: time

      - name: duration
        sql: api_response_duration
        type: number

      - name: status
        sql: status
        type: string

      - name: error_message
        sql: error_message
        type: string

      # Extract the user name from the JSON-encoded security context,
      # stripping the first two and last two characters (presumably escaped quotes)
      - name: user_name
        sql: "SUBSTRING(security_context::JSON ->> 'user', 3, LENGTH(security_context::JSON ->> 'user') - 4)"
        type: string

    segments:
      # The environment name is NULL for requests to the production environment
      - name: production_environment
        sql: "{environment_name} IS NULL"

      - name: errors
        sql: "{status} <> 'success'"

    measures:
      - name: count
        type: count

      - name: count_non_production
        description: >
          Counts requests from all non-production environments.
          For details, see https://cube.dev/docs/product/workspace/environments
        type: count
        filters:
          - sql: "{environment_name} IS NOT NULL"

      - name: total_duration
        type: sum
        sql: "{duration}"

      - name: avg_duration
        type: number
        sql: "{total_duration} / {count}"

      - name: median_duration
        type: number
        sql: "MEDIAN({duration})"

      - name: min_duration
        type: min
        sql: "{duration}"

      - name: max_duration
        type: max
        sql: "{duration}"

    pre_aggregations:
      - name: count_and_durations_by_status_and_start_date
        measures:
          - count
          - min_duration
          - max_duration
          - total_duration
        dimensions:
          - status
        time_dimension: start_time
        granularity: hour
        # Refresh when new events arrive, checking every 10 minutes
        refresh_key:
          sql: SELECT MAX(end_time) FROM {requests.sql()}
          every: 10 minutes
```
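
For example, here's a minimal sketch of a REST API query against this cube. The member
names come from the data model above; the granularity choice is illustrative. Since `count`
and `total_duration` are both included in the pre-aggregation, hourly queries like this one
should be able to hit it:

```json
{
  "measures": ["requests.count", "requests.avg_duration"],
  "dimensions": ["requests.status"],
  "timeDimensions": [
    {
      "dimension": "requests.start_time",
      "granularity": "hour"
    }
  ]
}
```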

## Result

Example query in Playground:

<Screenshot src="https://ucarecdn.com/327373f0-217b-4a91-8ac7-fd2d97c79513/" />


[ref-query-history-export]: /product/workspace/monitoring#query-history-export
[ref-query-history]: /product/workspace/query-history
[ref-vector-configuration]: /product/workspace/monitoring#configuration
2 changes: 1 addition & 1 deletion docs/pages/product/caching/using-pre-aggregations.mdx
@@ -614,7 +614,7 @@ simple queries using a familiar SQL syntax. You can connect using the MySQL CLI
client, for example:

```bash
mysql -h <CUBESTORE_IP> --user=cubestore -pcubestore
mysql -h <CUBESTORE_IP> --user=cubestore -pcubestore --protocol=TCP
```

<WarningBox>
13 changes: 7 additions & 6 deletions docs/pages/product/deployment/cloud/pricing.mdx
@@ -191,11 +191,11 @@ You can upgrade to a chosen tier in the

[Monitoring Integrations][ref-monitoring-integrations] feature has the following tiers:

| Tier | CCUs per hour | Exported data |
| ---- | :-----------: | -------------- |
| XS | 1 | Up to 10 GB/mo |
| S | 2 | Up to 25 GB/mo |
| M | 4 | Up to 50 GB/mo |
| Tier | CCUs per hour | Exported data | Dependent features |
| ---- | :-----------: | -------------- | --- |
| XS | 1 | Up to 10 GB/mo | — |
| S | 2 | Up to 25 GB/mo | — |
| M | 4 | Up to 50 GB/mo | [Query History export][ref-query-history-export] |

You can [upgrade][ref-monitoring-integrations-config] to a chosen tier in the
<Btn>Settings</Btn> of your deployment.
@@ -367,4 +367,5 @@ product tier level. Payments are non-refundable.
[ref-customer-managed-keys]: /product/workspace/encryption-keys
[ref-semantic-catalog]: /product/workspace/semantic-catalog
[ref-ai-api]: /product/apis-integrations/ai-api
[ref-ai-assistant]: /product/workspace/ai-assistant
[ref-query-history-export]: /product/workspace/monitoring#query-history-export
65 changes: 64 additions & 1 deletion docs/pages/product/workspace/monitoring.mdx
@@ -107,6 +107,7 @@ Cube Cloud deployment should export their logs:
| `refresh-scheduler` | Logs of the refresh worker |
| `warmup-job` | Logs of the [pre-aggregation warm-up][ref-preagg-warmup] |
| `cubestore` | Logs of Cube Store |
| `query-history` | [Query History export](#query-history-export) |

Example configuration for exporting logs to
[Datadog][vector-docs-sinks-datadog]:
@@ -274,6 +275,61 @@ You can also customize the user name and password for `prometheus_exporter` by
setting `CUBE_CLOUD_MONITORING_METRICS_USER` and
`CUBE_CLOUD_MONITORING_METRICS_PASSWORD` environment variables, respectively.

## Query History export

With Query History export, you can bring [Query History][ref-query-history] data to an
external monitoring solution for further analysis. For example, you can:
* Detect queries that do not hit pre-aggregations.
* Set up alerts for queries that exceed a certain duration.
* Attribute usage to specific users and implement chargebacks.

<SuccessBox>

Query History export requires the [M tier](/product/deployment/cloud/pricing#monitoring-integrations-tiers)
of Monitoring Integrations.

</SuccessBox>

To configure Query History export, add the `query-history` input to the `inputs`
option of the sink configuration. Example configuration for exporting Query History data
to the standard output of the Vector agent:

```toml
[sinks.my_console]
type = "console"
inputs = [
  "query-history"
]
target = "stdout"
encoding = { codec = "json" }
```

Exported data includes the following fields:

| Field | Description |
| --- | --- |
| `trace_id` | Unique identifier of the API request. |
| `account_name` | Name of the Cube Cloud account. |
| `deployment_id` | Identifier of the [deployment][ref-deployments]. |
| `environment_name` | Name of the [environment][ref-environments], `NULL` for production. |
| `api_type` | Type of [data API][ref-apis] used (`rest`, `sql`, etc.), `NULL` for errors. |
| `api_query` | Query executed by the API, represented as a string. |
| `security_context` | [Security context][ref-security-context] of the request, represented as a string. |
| `status` | Status of the request: `success` or `error`. |
| `error_message` | Error message, if any. |
| `start_time_unix_ms` | Start time of the execution, Unix timestamp in milliseconds. |
| `end_time_unix_ms` | End time of the execution, Unix timestamp in milliseconds. |
| `api_response_duration_ms` | Duration of the execution in milliseconds. |
| `cache_type` | [Cache type][ref-cache-type]: `no_cache`, `pre_aggregations_in_cube_store`, etc. |
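
For illustration, a single exported event might look like the following. This is a sketch
only: field names follow the table above, while all values are made up:

```json
{
  "trace_id": "6a7c43c5-e45a-4487-8100-1a620b65c2bb",
  "account_name": "examplecorp",
  "deployment_id": 123,
  "environment_name": null,
  "api_type": "rest",
  "api_query": "{\"measures\":[\"orders.count\"]}",
  "security_context": "{\"user\":\"alice\"}",
  "status": "success",
  "error_message": null,
  "start_time_unix_ms": 1745366400000,
  "end_time_unix_ms": 1745366400250,
  "api_response_duration_ms": 250,
  "cache_type": "pre_aggregations_in_cube_store"
}
```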

<ReferenceBox>

See [this recipe][ref-query-history-export-recipe] for an example of analyzing data from
Query History export.

</ReferenceBox>


[ref-autosuspend]: /product/deployment/cloud/auto-suspension#effects-on-experience
[self-sinks-for-metrics]: #configuration-sinks-for-metrics
[ref-dedicated-infra]: /product/deployment/cloud/infrastructure#dedicated-infrastructure
@@ -302,4 +358,11 @@ setting `CUBE_CLOUD_MONITORING_METRICS_USER` and
[mimir]: https://grafana.com/oss/mimir/
[grafana-cloud]: https://grafana.com/products/cloud/
[ref-prod-env]: /product/workspace/environments#production-environment
[ref-preagg-warmup]: /product/deployment/cloud/warm-up#pre-aggregation-warm-up
[ref-query-history]: /product/workspace/query-history
[ref-deployments]: /product/deployment/cloud/deployments
[ref-environments]: /product/workspace/environments
[ref-apis]: /product/apis-integrations
[ref-security-context]: /product/auth/context
[ref-cache-type]: /product/caching#cache-type
[ref-query-history-export-recipe]: /guides/recipes/data-exploration/query-history-export
23 changes: 17 additions & 6 deletions docs/pages/product/workspace/query-history.mdx
@@ -5,11 +5,13 @@ redirect_from:

# Query History

The Query History screen in Cube Cloud is a one-stop shop for all performance
and diagnostic information about queries issued for a deployment. It is kept
up-to-date in real time and provides a quick way to check whether queries are
being accelerated with [pre-aggregations][ref-caching-gs-preaggs], how long they
took to execute, and if they failed.
The Query History feature in Cube Cloud is a one-stop shop for all performance
and diagnostic information about queries issued for a deployment.

It provides a real-time and historic view of requests to [data APIs][ref-apis] of your
Cube Cloud deployment, so you can check whether queries were accelerated with
[pre-aggregations][ref-caching-gs-preaggs], how long they took to execute, and if they
failed.

<SuccessBox>

@@ -19,6 +21,13 @@ You can also choose a [Query History tier](/product/deployment/cloud/pricing#que

</SuccessBox>

You can set the [time range](#setting-the-time-range), [explore queries](#exploring-queries)
and filter them, and drill down on specific queries to [see more details](#inspecting-api-queries).

You can also use [Query History export][ref-query-history-export] to bring Query History
data to an external monitoring solution for further analysis.

<br/>
<video width="100%" controls>
<source
src="https://ucarecdn.com/ceba0bdc-298b-44e6-8491-2b0e39985465/video.mp4"
@@ -220,4 +229,6 @@ while the query is in the query execution queue:
[ref-query-format]: /product/apis-integrations/rest-api/query-format
[ref-cache-types]: /product/caching#cache-type
[ref-security-context]: /product/auth/context
[ref-multitenancy]: /product/configuration/advanced/multitenancy
[ref-apis]: /product/apis-integrations
[ref-query-history-export]: /product/workspace/monitoring#query-history-export