
Spike: Plan how to send telemetry data as a content analytics event #31320

Closed

john-thomas-dotcms opened this issue Feb 6, 2025 · 1 comment

@john-thomas-dotcms (Contributor)

Parent Issue

No response

Task

We currently send static telemetry data on a schedule, inserting it into the telemetry Postgres database. Analytics activity data, meanwhile, is recorded in a separate ClickHouse database.

We need to merge the static telemetry data with the analytics activity database. To do this, every time we send the telemetry data packet to the telemetry DB, we will also send the same data to the analytics database as an analytics event. The telemetry data will stay on the same schedule; each time the schedule fires, the data will be sent to both the telemetry DB and the analytics DB in parallel.

Note: We will eventually remove the telemetry DB. However, to avoid breaking existing telemetry reports, we'll send the data in parallel to both DBs until the existing reports can be modified to pull the data from the analytics DB.
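The dual-write described above can be sketched as follows. This is a minimal illustration only; `TelemetrySink`, `DualWriteDispatcher`, and the payload shape are hypothetical stand-ins, not existing dotCMS APIs:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sink abstraction: anything that can receive a telemetry payload.
interface TelemetrySink {
    void accept(Map<String, Object> payload);
}

// Fans the same payload out to every registered sink in parallel.
final class DualWriteDispatcher {
    private final List<TelemetrySink> sinks;

    DualWriteDispatcher(List<TelemetrySink> sinks) {
        this.sinks = sinks;
    }

    void dispatch(Map<String, Object> payload) {
        CompletableFuture.allOf(
            sinks.stream()
                 .map(s -> CompletableFuture.runAsync(() -> s.accept(payload)))
                 .toArray(CompletableFuture[]::new)
        ).join(); // wait until both writes complete
    }
}

public class DualWriteDemo {
    public static void main(String[] args) {
        List<Map<String, Object>> telemetryDb = new CopyOnWriteArrayList<>();
        List<Map<String, Object>> analyticsDb = new CopyOnWriteArrayList<>();

        DualWriteDispatcher dispatcher = new DualWriteDispatcher(List.of(
            telemetryDb::add,   // stands in for the Postgres telemetry DB insert
            analyticsDb::add    // stands in for the ClickHouse analytics event
        ));

        dispatcher.dispatch(Map.of("metric", "SITE_COUNT", "value", 42));

        System.out.println(telemetryDb.size() + " " + analyticsDb.size());
    }
}
```

In the real job, the two sinks would wrap the Postgres insert and the analytics-event dispatch; the `join()` could be replaced with per-sink error handling if one write must not block or fail the other.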

Proposed Objective

Same as Parent Issue

Proposed Priority

Priority 2 - Important

Acceptance Criteria

  1. All data in the telemetry DB is mirrored exactly in the analytics DB.

Note: There is no requirement for which telemetry data is stored together in a single analytics event. So, the telemetry data can be stored in a single analytics event, or broken into multiple analytics events.

External Links... Slack Conversations, Support Tickets, Figma Designs, etc.

No response

Assumptions & Initiation Needs

No response

Quality Assurance Notes & Workarounds

No response

Sub-Tasks & Estimates

No response

@john-thomas-dotcms john-thomas-dotcms moved this from New to Next 1-3 Sprints in dotCMS - Product Planning Feb 6, 2025
@john-thomas-dotcms john-thomas-dotcms changed the title Send telemetry data as a content analytics event Spike: Plan how to send telemetry data as a content analytics event Mar 5, 2025
@john-thomas-dotcms john-thomas-dotcms moved this from Next 1-3 Sprints to Current Sprint Backlog in dotCMS - Product Planning Mar 5, 2025
@jcastro-dotcms jcastro-dotcms self-assigned this Mar 7, 2025
@jcastro-dotcms jcastro-dotcms moved this from Current Sprint Backlog to In Progress in dotCMS - Product Planning Mar 7, 2025
@jcastro-dotcms (Contributor)

jcastro-dotcms commented Mar 13, 2025

SUMMARY

There are several approaches we might follow regarding adding the Telemetry Report to our CA infrastructure. So far, these appear to be our options:

  • Separate columns for each metric.
  • Metrics saved as separate events.
  • Metrics saved in a single event record.

I've put together a list of pros and cons for each of them, which must be discussed with the team:

Separate columns for each metric

Pros:

  • Fast queries with direct columnar storage benefits.
  • Efficient filtering, aggregation, and indexing on specific metrics.
  • Takes advantage of ClickHouse’s column-oriented nature for analytics.

Cons:

  • We need to update the CubeJS and ClickHouse schemas every time a new metric is introduced.
  • A growing number of columns could impact performance and manageability. We might need an additional spike that adds a large number of columns and records to ClickHouse to see how it performs.
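As a rough illustration of the separate-columns shape, each telemetry report becomes one wide row with a fixed field per metric; the metric names and counts below are invented for the example:

```java
// Hypothetical wide-row shape for the "separate columns" option:
// one ClickHouse row per telemetry report, one column per metric.
record TelemetryRow(
    java.time.Instant reportedAt,
    long siteCount,
    long contentTypeCount,
    long workflowCount   // adding a new metric means adding a field here
                         // AND migrating the ClickHouse and CubeJS schemas
) {}

public class WideRowDemo {
    public static void main(String[] args) {
        TelemetryRow row = new TelemetryRow(java.time.Instant.now(), 12, 340, 7);
        // Columnar storage makes per-metric filters/aggregations cheap,
        // e.g. avg(siteCount) scans only that column.
        System.out.println(row.siteCount());
    }
}
```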

Metrics saved as separate events

Pros:

  • Schema remains stable; adding new metrics doesn't require schema changes, making it more scalable.
  • Queries for specific metrics are simple and efficient.
  • Works well with ClickHouse’s strength in handling large numbers of records.

Cons:

  • We'll store many small records per telemetry report, which might lead to increased storage overhead.
  • Querying a full telemetry report (reconstructing the JSON) requires more joins or aggregations.
  • Write amplification may become a concern (many inserts per report) -- maybe?
  • Not all metrics have single values. For instance, the Metric that retrieves the total size of files under all Themes in all Sites is multi-value, so we need to take that into consideration. We can handle this by adding two columns instead: one holding a numeric value, and the other a String. This way, we can query ranges for numeric values more easily.
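A minimal sketch of the per-metric event shape, including the numeric/String column pair suggested above; all names and values are illustrative, not actual dotCMS metrics:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical per-metric event for the "separate events" option.
// Multi-value or non-numeric metrics use only valueText; numeric ones also
// fill valueNumeric so range queries avoid string parsing.
record MetricEvent(String reportId, String metricName,
                   Optional<Double> valueNumeric, String valueText) {}

public class MetricEventDemo {
    public static void main(String[] args) {
        List<MetricEvent> report = List.of(
            new MetricEvent("r1", "SITE_COUNT", Optional.of(12.0), "12"),
            new MetricEvent("r1", "THEME_FILE_SIZES",   // multi-value metric
                            Optional.empty(),
                            "{\"demo.dotcms.com\":104857600}")
        );
        // Count how many events in this report are range-queryable.
        long numeric = report.stream()
                             .filter(e -> e.valueNumeric().isPresent())
                             .count();
        System.out.println(numeric);
    }
}
```

Reconstructing the full report then means grouping all events by `reportId`, which is the join/aggregation cost noted in the cons above.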

Metrics saved in a single event record

NOTES:

  1. What if we only use ClickHouse for this, and NOT expose it via CubeJS?
  2. Can we handle all JSON-related operations in the base SQL, and then expose that data as simple data to CubeJS?

We'll need a POC for both options.

Pros:

  • Single point for retrieving ALL Telemetry data.
  • We're saving a single value, even if we add or remove Metrics along the way.

Cons:

  • Querying individual metrics may be inefficient.
  • No native JSON value support in CubeJS (yet), so filtering and aggregation require expensive string operations.
  • Harder to index and optimize queries -- maybe?
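For the single-record option, here is a sketch of what reading one metric back out of the JSON blob involves. The regex stands in for ClickHouse's JSON string functions, and the metric names are invented:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical single-event shape: the entire telemetry report is one JSON
// string in one row. Extracting a single metric requires string/JSON parsing
// on every read, which is the inefficiency listed in the cons above.
public class SingleEventDemo {
    static final String REPORT_JSON =
        "{\"SITE_COUNT\":12,\"CONTENT_TYPE_COUNT\":340}";

    static long extractLong(String json, String metric) {
        Matcher m = Pattern.compile("\"" + metric + "\":(\\d+)").matcher(json);
        if (!m.find()) throw new IllegalArgumentException("missing " + metric);
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(extractLong(REPORT_JSON, "SITE_COUNT"));
    }
}
```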

Aside from choosing the appropriate way to store the Telemetry data, we need to define what part of the code is going to save it and how it's going to do that. For instance:

  1. A new method can be added to the TelemetryResource REST Endpoint to trigger the event generation process. The new Event Type can be based on the existing Custom Event we already have. The advantage of this is that we can separate this logic from the existing Telemetry Quartz Job from the beginning. NOTE: We'll keep using the Telemetry Job; we're just getting rid of the dedicated Telemetry DB.
  2. Or, if we don't need any request-related data, we can mock some objects just like we do in our Integration Tests and call the Collector from the Telemetry Quartz Job directly. This way, we'll persist the Telemetry data to the analytics DB at the same time it is persisted, for now, to the "old" Telemetry service.
  3. We might need to update the WebEventsCollectorService class to NOT require the request and response objects because those are not important for saving Telemetry data.
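Option 2 above, calling the collector directly from the Quartz Job with no request/response objects, might look roughly like this. Every interface and class name here is a hypothetical stand-in for the dotCMS classes mentioned, not their real signatures:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical collector interface: no HttpServletRequest/Response required,
// per point 3 about relaxing WebEventsCollectorService's parameters.
interface AnalyticsCollector {
    void collect(String eventType, Map<String, Object> payload);
}

// Stand-in for the Telemetry Quartz Job handing metrics straight to analytics.
class TelemetryJob {
    private final AnalyticsCollector collector;

    TelemetryJob(AnalyticsCollector collector) {
        this.collector = collector;
    }

    // Invoked by the scheduler on the existing telemetry schedule.
    void execute() {
        Map<String, Object> metrics = Map.of("SITE_COUNT", 12);
        collector.collect("TELEMETRY_REPORT", metrics);
    }
}

public class JobDemo {
    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        new TelemetryJob((type, payload) -> events.add(type)).execute();
        System.out.println(events.get(0));
    }
}
```

Whether the real wiring goes through a new TelemetryResource method (option 1) or direct collector calls (option 2), the event payload itself can be identical.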

@github-project-automation github-project-automation bot moved this from In Progress to Done in dotCMS - Product Planning Mar 13, 2025