diff --git a/docs/pages/guides/recipes.mdx b/docs/pages/guides/recipes.mdx
index db0c8b92d9679..c7bac95261abd 100644
--- a/docs/pages/guides/recipes.mdx
+++ b/docs/pages/guides/recipes.mdx
@@ -77,6 +77,7 @@ These recipes will show you the best practices of using Cube.
 - [Building UI with drilldowns](/guides/recipes/data-exploration/drilldowns)
 - [Retrieving numeric values on the front-end](/guides/recipes/data-exploration/cast-numerics)
+- [Analyzing data from Query History export](/guides/recipes/data-exploration/query-history-export)
 
 ### Upgrading Cube
diff --git a/docs/pages/guides/recipes/data-exploration/_meta.js b/docs/pages/guides/recipes/data-exploration/_meta.js
index b616aa1b7c9be..d29114c8c7a4c 100644
--- a/docs/pages/guides/recipes/data-exploration/_meta.js
+++ b/docs/pages/guides/recipes/data-exploration/_meta.js
@@ -1,4 +1,5 @@
 module.exports = {
   "drilldowns": "Building UI with drilldowns",
-  "cast-numerics": "Retrieving numeric values on the front-end"
+  "cast-numerics": "Retrieving numeric values on the front-end",
+  "query-history-export": "Analyzing data from Query History export"
 }
\ No newline at end of file
diff --git a/docs/pages/guides/recipes/data-exploration/query-history-export.mdx b/docs/pages/guides/recipes/data-exploration/query-history-export.mdx
new file mode 100644
index 0000000000000..2b975a4591c88
--- /dev/null
+++ b/docs/pages/guides/recipes/data-exploration/query-history-export.mdx
@@ -0,0 +1,199 @@
+# Analyzing data from Query History export
+
+You can use [Query History export][ref-query-history-export] to bring [Query
+History][ref-query-history] data to an external monitoring solution for further
+analysis.
+
+In this recipe, we will show you how to export Query History data to Amazon S3, and then
+analyze it with Cube, reading the data from S3 via DuckDB.
+
+## Configuration
+
+Below is the [Vector configuration][ref-vector-configuration] for exporting Query History
+to Amazon S3; it also outputs the data to the console of the Vector agent in your Cube
+Cloud deployment.
+
+In the example below, we are using the `aws_s3` sink to export to the `cube-query-history-export-demo`
+bucket in Amazon S3, but you can use any other storage solution that Vector supports.
+
+```toml
+[sinks.aws_s3]
+type = "aws_s3"
+inputs = [
+  "query-history"
+]
+bucket = "cube-query-history-export-demo"
+region = "us-east-2"
+compression = "gzip"
+
+[sinks.aws_s3.auth]
+access_key_id = "$CUBE_CLOUD_MONITORING_AWS_ACCESS_KEY_ID"
+secret_access_key = "$CUBE_CLOUD_MONITORING_AWS_SECRET_ACCESS_KEY"
+
+[sinks.aws_s3.encoding]
+codec = "json"
+
+[sinks.aws_s3.healthcheck]
+enabled = false
+
+[sinks.my_console]
+type = "console"
+inputs = [
+  "query-history"
+]
+target = "stdout"
+encoding = { codec = "json" }
+```
+
+You also need to set the following environment variables in the Settings → Environment
+variables page of your Cube Cloud deployment:
+
+```bash
+CUBE_CLOUD_MONITORING_AWS_ACCESS_KEY_ID=your-access-key-id
+CUBE_CLOUD_MONITORING_AWS_SECRET_ACCESS_KEY=your-secret-access-key
+
+CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=your-access-key-id
+CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=your-secret-access-key
+CUBEJS_DB_DUCKDB_S3_REGION=us-east-2
+```
+
+## Data modeling
+
+Below is an example data model for analyzing data from Query History export that is
+stored in a bucket in Amazon S3. The data is accessed directly from S3 using DuckDB.
+
+With this data model, you can run queries that aggregate data by dimensions such as
+`status`, `environment_name`, `api_type`, etc.
+and also calculate metrics like `count`, `total_duration`, or `avg_duration`:
+
+```yaml
+cubes:
+  - name: requests
+    sql: >
+      SELECT
+        *,
+        api_response_duration_ms / 1000 AS api_response_duration,
+        EPOCH_MS(start_time_unix_ms) AS start_time,
+        EPOCH_MS(end_time_unix_ms) AS end_time
+      FROM read_json_auto('s3://cube-query-history-export-demo/**/*.log.gz')
+
+    dimensions:
+      - name: trace_id
+        sql: trace_id
+        type: string
+        primary_key: true
+
+      - name: deployment_id
+        sql: deployment_id
+        type: number
+
+      - name: environment_name
+        sql: environment_name
+        type: string
+
+      - name: api_type
+        sql: api_type
+        type: string
+
+      - name: api_query
+        sql: api_query
+        type: string
+
+      - name: security_context
+        sql: security_context
+        type: string
+
+      - name: cache_type
+        sql: cache_type
+        type: string
+
+      - name: start_time
+        sql: start_time
+        type: time
+
+      - name: end_time
+        sql: end_time
+        type: time
+
+      - name: duration
+        sql: api_response_duration
+        type: number
+
+      - name: status
+        sql: status
+        type: string
+
+      - name: error_message
+        sql: error_message
+        type: string
+
+      - name: user_name
+        sql: "SUBSTRING(security_context::JSON ->> 'user', 3, LENGTH(security_context::JSON ->> 'user') - 4)"
+        type: string
+
+    segments:
+      - name: production_environment
+        sql: "{environment_name} IS NULL"
+
+      - name: errors
+        sql: "{status} <> 'success'"
+
+    measures:
+      - name: count
+        type: count
+
+      - name: count_non_production
+        description: >
+          Counts requests from all non-production environments.
+          See https://cube.dev/docs/product/workspace/environments for details.
+        type: count
+        filters:
+          - sql: "{environment_name} IS NOT NULL"
+
+      - name: total_duration
+        type: sum
+        sql: "{duration}"
+
+      - name: avg_duration
+        type: number
+        sql: "{total_duration} / {count}"
+
+      - name: median_duration
+        type: number
+        sql: "MEDIAN({duration})"
+
+      - name: min_duration
+        type: min
+        sql: "{duration}"
+
+      - name: max_duration
+        type: max
+        sql: "{duration}"
+
+    pre_aggregations:
+      - name: count_and_durations_by_status_and_start_date
+        measures:
+          - count
+          - min_duration
+          - max_duration
+          - total_duration
+        dimensions:
+          - status
+        time_dimension: start_time
+        granularity: hour
+        refresh_key:
+          sql: SELECT MAX(end_time) FROM {requests.sql()}
+          every: 10 minutes
+
+```
+
+## Result
+
+Here is an example query in Playground:
+
+
+
+[ref-query-history-export]: /product/workspace/monitoring#query-history-export
+[ref-query-history]: /product/workspace/query-history
+[ref-vector-configuration]: /product/workspace/monitoring#configuration
\ No newline at end of file
diff --git a/docs/pages/product/caching/using-pre-aggregations.mdx b/docs/pages/product/caching/using-pre-aggregations.mdx
index 7f584a4cb550d..2d694e5025aa7 100644
--- a/docs/pages/product/caching/using-pre-aggregations.mdx
+++ b/docs/pages/product/caching/using-pre-aggregations.mdx
@@ -614,7 +614,7 @@ simple queries using a familiar SQL syntax.
 
 You can connect using the MySQL CLI client, for example:
 
 ```bash
-mysql -h --user=cubestore -pcubestore
+mysql -h --user=cubestore -pcubestore --protocol=TCP
 ```
diff --git a/docs/pages/product/deployment/cloud/pricing.mdx b/docs/pages/product/deployment/cloud/pricing.mdx
index 20de77c7dd58c..6f2838ffc2ecc 100644
--- a/docs/pages/product/deployment/cloud/pricing.mdx
+++ b/docs/pages/product/deployment/cloud/pricing.mdx
@@ -191,11 +191,11 @@ You can upgrade to a chosen tier in the
 [Monitoring Integrations][ref-monitoring-integrations] feature has the following tiers:
 
-| Tier | CCUs per hour | Exported data  |
-| ---- | :-----------: | -------------- |
-| XS   | 1             | Up to 10 GB/mo |
-| S    | 2             | Up to 25 GB/mo |
-| M    | 4             | Up to 50 GB/mo |
+| Tier | CCUs per hour | Exported data  | Dependent features |
+| ---- | :-----------: | -------------- | ------------------ |
+| XS   | 1             | Up to 10 GB/mo | —                  |
+| S    | 2             | Up to 25 GB/mo | —                  |
+| M    | 4             | Up to 50 GB/mo | [Query History export][ref-query-history-export] |
 
 You can [upgrade][ref-monitoring-integrations-config] to a chosen tier in the
 Settings of your deployment.
@@ -367,4 +367,5 @@ product tier level. Payments are non-refundable.
 [ref-customer-managed-keys]: /product/workspace/encryption-keys
 [ref-semantic-catalog]: /product/workspace/semantic-catalog
 [ref-ai-api]: /product/apis-integrations/ai-api
-[ref-ai-assistant]: /product/workspace/ai-assistant
\ No newline at end of file
+[ref-ai-assistant]: /product/workspace/ai-assistant
+[ref-query-history-export]: /product/workspace/monitoring#query-history-export
\ No newline at end of file
diff --git a/docs/pages/product/workspace/monitoring.mdx b/docs/pages/product/workspace/monitoring.mdx
index 692b3759d4638..ef1de605b5ee1 100644
--- a/docs/pages/product/workspace/monitoring.mdx
+++ b/docs/pages/product/workspace/monitoring.mdx
@@ -107,6 +107,7 @@ Cube Cloud deployment should export their logs:
 | `refresh-scheduler` | Logs of the refresh worker |
 | `warmup-job`        | Logs of the [pre-aggregation warm-up][ref-preagg-warmup] |
 | `cubestore`         | Logs of Cube Store |
+| `query-history`     | [Query History export](#query-history-export) |
 
 Example configuration for exporting logs to [Datadog][vector-docs-sinks-datadog]:
@@ -274,6 +275,61 @@ You can also customize the user name and password for
 `prometheus_exporter` by setting `CUBE_CLOUD_MONITORING_METRICS_USER` and
 `CUBE_CLOUD_MONITORING_METRICS_PASSWORD` environment variables, respectively.
 
+## Query History export
+
+With Query History export, you can bring [Query History][ref-query-history] data to an
+external monitoring solution for further analysis, for example:
+* Detect queries that do not hit pre-aggregations.
+* Set up alerts for queries that exceed a certain duration.
+* Attribute usage to specific users and implement chargebacks.
+
+Query History export requires the [M tier](/product/deployment/cloud/pricing#monitoring-integrations-tiers)
+of Monitoring Integrations.
+
+To configure Query History export, add the `query-history` input to the `inputs`
+option of the sink configuration.
+Here is an example configuration for exporting Query History data
+to the standard output of the Vector agent:
+
+```toml
+[sinks.my_console]
+type = "console"
+inputs = [
+  "query-history"
+]
+target = "stdout"
+encoding = { codec = "json" }
+```
+
+Exported data includes the following fields:
+
+| Field | Description |
+| --- | --- |
+| `trace_id` | Unique identifier of the API request. |
+| `account_name` | Name of the Cube Cloud account. |
+| `deployment_id` | Identifier of the [deployment][ref-deployments]. |
+| `environment_name` | Name of the [environment][ref-environments], `NULL` for production. |
+| `api_type` | Type of [data API][ref-apis] used (`rest`, `sql`, etc.), `NULL` for errors. |
+| `api_query` | Query executed by the API, represented as a string. |
+| `security_context` | [Security context][ref-security-context] of the request, represented as a string. |
+| `status` | Status of the request: `success` or `error`. |
+| `error_message` | Error message, if any. |
+| `start_time_unix_ms` | Start time of the execution, as a Unix timestamp in milliseconds. |
+| `end_time_unix_ms` | End time of the execution, as a Unix timestamp in milliseconds. |
+| `api_response_duration_ms` | Duration of the execution in milliseconds. |
+| `cache_type` | [Cache type][ref-cache-type]: `no_cache`, `pre_aggregations_in_cube_store`, etc. |
+
+See [this recipe][ref-query-history-export-recipe] for an example of analyzing data from
+Query History export.
+
+
 [ref-autosuspend]: /product/deployment/cloud/auto-suspension#effects-on-experience
 [self-sinks-for-metrics]: #configuration-sinks-for-metrics
 [ref-dedicated-infra]: /product/deployment/cloud/infrastructure#dedicated-infrastructure
@@ -302,4 +358,11 @@ setting `CUBE_CLOUD_MONITORING_METRICS_USER` and
 [mimir]: https://grafana.com/oss/mimir/
 [grafana-cloud]: https://grafana.com/products/cloud/
 [ref-prod-env]: /product/workspace/environments#production-environment
-[ref-preagg-warmup]: /product/deployment/cloud/warm-up#pre-aggregation-warm-up
\ No newline at end of file
+[ref-preagg-warmup]: /product/deployment/cloud/warm-up#pre-aggregation-warm-up
+[ref-query-history]: /product/workspace/query-history
+[ref-deployments]: /product/deployment/cloud/deployments
+[ref-environments]: /product/workspace/environments
+[ref-apis]: /product/apis-integrations
+[ref-security-context]: /product/auth/context
+[ref-cache-type]: /product/caching#cache-type
+[ref-query-history-export-recipe]: /guides/recipes/data-exploration/query-history-export
\ No newline at end of file
diff --git a/docs/pages/product/workspace/query-history.mdx b/docs/pages/product/workspace/query-history.mdx
index 2476b5683ff43..0690cb92d5d9f 100644
--- a/docs/pages/product/workspace/query-history.mdx
+++ b/docs/pages/product/workspace/query-history.mdx
@@ -5,11 +5,13 @@ redirect_from:
 
 # Query History
 
-The Query History screen in Cube Cloud is a one-stop shop for all performance
-and diagnostic information about queries issued for a deployment. It is kept
-up-to-date in real time and provides a quick way to check whether queries are
-being accelerated with [pre-aggregations][ref-caching-gs-preaggs], how long they
-took to execute, and if they failed.
+The Query History feature in Cube Cloud is a one-stop shop for all performance
+and diagnostic information about queries issued for a deployment.
+
+It provides a real-time and historic view of requests to [data APIs][ref-apis] of your
+Cube Cloud deployment, so you can check whether queries were accelerated with
+[pre-aggregations][ref-caching-gs-preaggs], how long they took to execute, and if they
+failed.
@@ -19,6 +21,13 @@ You can also choose a [Query History tier](/product/deployment/cloud/pricing#que
 
+You can set the [time range](#setting-the-time-range), [explore queries](#exploring-queries)
+and filter them, and drill down on specific queries to [see more details](#inspecting-api-queries).
+
+You can also use [Query History export][ref-query-history-export] to bring Query History
+data to an external monitoring solution for further analysis.
+
+
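A note on the recipe added in this diff: since the exported records are newline-delimited JSON, the aggregations the data model defines (error counts, average duration, queries missing pre-aggregations) can be sanity-checked outside Cube with a few lines of Python. This is a minimal sketch; the sample records are invented for illustration, and only the field names come from the fields table above:

```python
import json

# Invented sample records shaped like the documented Query History export
# fields (trace_id, status, cache_type, timestamps in Unix milliseconds).
records = [
    {"trace_id": "a1", "status": "success",
     "cache_type": "pre_aggregations_in_cube_store",
     "start_time_unix_ms": 1700000000000, "end_time_unix_ms": 1700000000250,
     "api_response_duration_ms": 250},
    {"trace_id": "b2", "status": "error",
     "cache_type": "no_cache",
     "start_time_unix_ms": 1700000001000, "end_time_unix_ms": 1700000002000,
     "api_response_duration_ms": 1000},
]

# Stand-in for the contents of one exported .log file (one JSON object per line).
lines = "\n".join(json.dumps(r) for r in records)
parsed = [json.loads(line) for line in lines.splitlines()]

# Mirror the recipe's SQL: api_response_duration_ms / 1000 AS api_response_duration.
durations = [r["api_response_duration_ms"] / 1000 for r in parsed]

# Mirror the `errors` segment and the "queries that do not hit pre-aggregations" check.
errors = [r for r in parsed if r["status"] != "success"]
no_preagg = [r for r in parsed if r["cache_type"] == "no_cache"]

avg_duration = sum(durations) / len(durations)
print(avg_duration, len(errors), len(no_preagg))
```

The same numbers should match what the `avg_duration` measure and `errors` segment report once the data model queries the bucket.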
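On the pricing table change: Monitoring Integrations tiers are billed in CCUs per hour, so a rough monthly consumption figure is just the tier rate times hours. A small sketch, assuming continuous operation and a 30-day month (actual billing follows Cube Cloud's metering):

```python
# CCUs per hour for each Monitoring Integrations tier, from the pricing table.
TIER_CCUS_PER_HOUR = {"XS": 1, "S": 2, "M": 4}

def monthly_ccus(tier: str, days: int = 30) -> int:
    # Continuous operation assumed: hourly rate * 24 hours * number of days.
    return TIER_CCUS_PER_HOUR[tier] * 24 * days

# The M tier (required for Query History export) consumes roughly:
print(monthly_ccus("M"))  # 2880 CCUs per 30-day month
```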