Proposal: Inclusion of Trace Object & Profile #1243

pladamgregory · 2024-11-05T20:34:28Z

Proposal: Inclusion of Trace Object & Profile

Description: This proposal introduces the concept of traces to the OCSF schema. The goal is to enhance OCSF to cover observability data for distributed traces.

Traces contain vital information about the flow of requests through a distributed system, including a unique trace ID, individual span IDs, timestamps for start and end, duration, and metadata such as service names and error details. This data provides a comprehensive view of how requests are processed, revealing performance metrics and service dependencies. Traces are useful for performance monitoring, as they help identify bottlenecks and slow operations. They also facilitate root cause analysis by allowing developers to pinpoint issues and optimize the overall system for improved reliability and user experience.

To support the proposal, here's how the modeled example would look when applied to a purchase transaction trace. This illustrates how each span and event would be structured, using OCSF:

Example Trace: Purchase Transaction Trace

1. User Service Span

Span Name: User Authentication
Service: User Service
Duration: 10ms
Events:
- start_auth: Marks when authentication started
- db_query: Records time spent querying the user database
- auth_success: Indicates successful authentication

2. Order Service Span

Span Name: Create Order
Service: Order Service
Parent Span: User Authentication (order creation requires user authentication)
Duration: 50ms
Events:
- validate_cart: Checks if all items in the cart are available
- calculate_total: Calculates the total price
- order_created: Confirms that the order was created in the system

3. Payment Service Span

Span Name: Process Payment
Service: Payment Service
Parent Span: Create Order
Duration: 100ms
Events:
- start_payment: Marks the initiation of the payment process
- payment_gateway_call: Time spent calling an external payment gateway
- payment_success: Confirms successful payment processing

4. Inventory Service Span

Span Name: Update Inventory
Service: Inventory Service
Parent Span: Create Order
Duration: 30ms
Events:
- inventory_lock: Temporarily locks inventory items
- update_db: Updates inventory database to reflect items sold
- inventory_release: Releases inventory lock

5. Notification Service Span

Span Name: Send Confirmation Email
Service: Notification Service
Parent Span: Create Order
Duration: 20ms
Events:
- email_generated: Generates the email content
- email_sent: Confirms the email was sent to the user

Summary of Trace

Trace: Purchase Item
Flow: User Authentication → Create Order → Process Payment → Update Inventory → Send Confirmation Email
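
The parent/child relationships in the example above can be sketched as a small data structure. This is a minimal Python illustration, not part of the proposed schema: the `Span` dataclass and the `children_of` and `print_tree` helpers are hypothetical names, and the span names, services, and durations are copied from the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    name: str
    service: str
    duration_ms: int
    parent: Optional[str] = None  # name of the parent span; None for the root


# The five spans of the "Purchase Item" trace, as listed in the example.
SPANS = [
    Span("User Authentication", "User Service", 10),
    Span("Create Order", "Order Service", 50, parent="User Authentication"),
    Span("Process Payment", "Payment Service", 100, parent="Create Order"),
    Span("Update Inventory", "Inventory Service", 30, parent="Create Order"),
    Span("Send Confirmation Email", "Notification Service", 20, parent="Create Order"),
]


def children_of(parent_name: Optional[str]) -> list[str]:
    """Return the names of spans whose parent is `parent_name`."""
    return [s.name for s in SPANS if s.parent == parent_name]


def print_tree(parent_name: Optional[str] = None, depth: int = 0) -> None:
    """Print the span hierarchy, indenting each level by two spaces."""
    for name in children_of(parent_name):
        print("  " * depth + name)
        print_tree(name, depth + 1)


if __name__ == "__main__":
    print_tree()
```

Read this way, the trace is a tree rather than a strict chain: the payment, inventory, and notification spans are siblings under Create Order.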

OCSF Model (Table)

Action	Description	Event Class	Profile Type	Trace ID	Span ID
start_auth	Marks when authentication started	3002	Trace Profile	Trace_001	Span_001
db_query	Records time spent querying the user database	6005	Trace Profile	Trace_001	Span_002
auth_success	Indicates successful authentication	3002	Trace Profile	Trace_001	Span_003
validate_cart	Checks if all items in the cart are available	6009 (New Application Execution Activity)	Trace Profile	Trace_001	Span_004
calculate_total	Calculates the total price	6009 (New Application Execution Activity)	Trace Profile	Trace_001	Span_005
order_created	Confirms that the order was created in the system	6009 (New Application Execution Activity)	Trace Profile	Trace_001	Span_006
start_payment	Marks the initiation of the payment process	6009 (New Application Execution Activity)	Trace Profile	Trace_001	Span_007
payment_gateway_call	Time spent calling an external payment gateway	6003	Trace Profile	Trace_001	Span_008
payment_success	Confirms successful payment processing	6003	Trace Profile	Trace_001	Span_009
inventory_lock	Temporarily locks inventory items	6009 (New Application Execution Activity)	Trace Profile	Trace_001	Span_010
update_db	Updates inventory database to reflect items sold	6005	Trace Profile	Trace_001	Span_011
inventory_release	Releases inventory lock	6009 (New Application Execution Activity)	Trace Profile	Trace_001	Span_012
email_generated	Generates the email content	4009	Trace Profile	Trace_001	Span_013
email_sent	Confirms the email was sent to the user	4009	Trace Profile	Trace_001	Span_014

New `trace` Object & Profile

Trace Object: Defines key application trace information for trace events (included via the Traces profile).

{
  "caption": "Trace",
  "description": "The trace object contains information about distributed traces, which are critical to observability and describe how requests move through a system, capturing each step's timing and status.",
  "extends": "object",
  "name": "trace",
  "attributes": {
    "uid": {
      "description": "The unique identifier of the trace used in distributed systems and microservices architecture to track and correlate requests across various components of an application.",
      "requirement": "required"
    },
    "span": {
      "description": "The attributes associated with a span within a distributed trace.",
      "requirement": "optional"
    },
    "service": {
      "description": "Identifies the service or component generating the trace.",
      "requirement": "optional"
    },
    "status_code": {
      "description": "Indicates whether the operations in the trace were successful, failed, or had an error, aiding in pinpointing issues.",
      "requirement": "optional"
    },
    "start_time": {
      "description": "The start timestamp of the trace, essential for identifying latency and performance bottlenecks.",
      "requirement": "optional"
    },
    "end_time": {
      "description": "The end timestamp of the trace, essential for identifying latency and performance bottlenecks.",
      "requirement": "optional"
    },
    "duration": {
      "description": "The trace duration, the amount of time the trace covers from <code>start_time</code> to <code>end_time</code> in milliseconds.",
      "requirement": "optional"
    }
  }
}
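
The `duration` description above ties the value to `start_time` and `end_time`, in milliseconds. A minimal Python sketch of that arithmetic follows. It assumes, purely for illustration, that timestamps arrive as ISO 8601 strings with a trailing `Z`; the helper names `parse_utc` and `duration_ms` are hypothetical. If timestamps are carried as integer epoch milliseconds instead, the duration is a plain subtraction.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def duration_ms(start_time: str, end_time: str) -> int:
    """Milliseconds elapsed from start_time to end_time."""
    delta = parse_utc(end_time) - parse_utc(start_time)
    return round(delta.total_seconds() * 1000)


if __name__ == "__main__":
    # A ten-second window is 10000 milliseconds.
    print(duration_ms("2024-11-12T12:00:00Z", "2024-11-12T12:00:10Z"))
```

Emitting `duration` as an integer number of milliseconds, rather than a free-form string, keeps it consistent with the two timestamps it is derived from.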

New Trace Attributes: Dictionary entries for the `trace` and `span` attributes used by trace events.

"trace": {
  "caption": "Trace",
  "description": "The attributes associated with an event containing trace data.",
  "type": "trace"
},
"span": {
  "caption": "Span",
  "description": "The attributes associated with an event containing span data.",
  "type": "span"
},

New Span Object: Defines key span information for trace events.

{
  "caption": "Span",
  "description": "The attributes associated with an event containing span data.",
  "extends": "object",
  "name": "span",
  "attributes": {
    "uid": {
      "description": "The unique identifier of the span used in distributed systems and microservices architecture to track and correlate requests across various components of an application.",
      "requirement": "required"
    },
    "service": {
      "description": "Identifies the service or component creating the span, which helps track its path through a distributed system.",
      "requirement": "optional"
    },
    "operation": {
      "description": "Describes the actions performed in a span, such as API requests, database queries, or computations.",
      "requirement": "optional",
      "is_array": true
    },
    "parent_span": {
      "description": "The parent span of this span object. It is recommended to only populate this field for the first span object, to prevent deep nesting.",
      "requirement": "optional"
    },
    "start_time": {
      "description": "The start timestamp of the span, essential for identifying latency and performance bottlenecks.",
      "requirement": "optional"
    },
    "end_time": {
      "description": "The end timestamp of the span, essential for identifying latency and performance bottlenecks.",
      "requirement": "optional"
    },
    "duration": {
      "description": "The span duration, the amount of time the span covers from <code>start_time</code> to <code>end_time</code> in milliseconds.",
      "requirement": "optional"
    },
    "status_code": {
      "description": "Indicates whether the operations in the span were successful, failed, or had an error, aiding in pinpointing issues.",
      "requirement": "optional"
    }
  }
}
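
Both object definitions above mark only `uid` as required. A minimal Python sketch of a check for that rule follows, assuming events are handled as plain dictionaries. The `REQUIRED` table and the `missing_required` helper are hypothetical and are not part of any OCSF tooling.

```python
from typing import Any

# Required attribute names, taken from the trace and span definitions above.
REQUIRED = {"trace": ["uid"], "span": ["uid"]}


def missing_required(trace: dict[str, Any]) -> list[str]:
    """Return dotted paths of required attributes absent from a trace object."""
    missing = [f"trace.{key}" for key in REQUIRED["trace"] if key not in trace]
    span = trace.get("span")
    if isinstance(span, dict):
        # The span itself is optional, but once present it needs its own uid.
        missing += [f"trace.span.{key}" for key in REQUIRED["span"] if key not in span]
    return missing


if __name__ == "__main__":
    print(missing_required({"uid": "Trace_001", "span": {"uid": "Span_001"}}))
    print(missing_required({"span": {"service": "User Service"}}))
```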

Traces profile

{
  "description": "The Traces Profile extends the OCSF framework to capture and standardize observability events, specifically targeting trace-level data. This profile enables integration and normalization of distributed tracing information, allowing OCSF events to retain essential trace context such as trace IDs, span relationships, and service dependencies.",
  "meta": "profile",
  "caption": "Traces",
  "name": "traces",
  "annotations": {
    "group": "primary"
  },
  "attributes": {
    "trace": {
      "description": "The trace object contains information about distributed traces, which are critical to observability and describe how requests move through a system, capturing each step's timing and status.",
      "requirement": "recommended"
    }
  }
}

Reverting span operation back to string in span.json Signed-off-by: Adam Gregory <[email protected]>

objects/span.json

davemcatcisco · 2024-11-07T14:21:20Z

objects/trace.json

+      "description": "The unique identifier of the trace used in distributed systems and microservices architecture to track and correlate requests across various components of an application.",
+      "requirement": "required"
+    },
+    "span": {


Does every trace have just a single top-level span? In your example, your Purchase Item trace seems to contain five spans. Is it assumed that there will be a single top-level root span to which all of these will refer as their parent_span? What I'm asking, I guess, is whether this should be an array of spans.

I'm a little bit confused too about how and where child spans would be represented. If a trace decomposes into one or more spans, and if each span can be further decomposed, then does it not make sense for the whole thing to be a sort of recursive structure? e.g.

{
  "spans": [
    {
      "service": {},
      "operation": {},
      "spans": [
        { "service": {}, "operation": {} },
        {
          "service": {},
          "operation": {},
          "spans": [
            { "service": {}, "operation": {} }
          ]
        }
      ]
    },
    {
      "service": {},
      "operation": {},
      "spans": [
        { "service": {}, "operation": {} }
      ]
    }
  ]
}
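
For comparison, the recursive shape sketched above can be derived from a flat list of spans that each point at a parent. The following Python sketch is only an illustration of that relationship and takes no position on which shape the schema should adopt. The `parent_uid` key and the `nest_spans` helper are hypothetical; the PR itself models the parent as a `parent_span` object.

```python
from typing import Any


def nest_spans(flat: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Convert spans with a `parent_uid` reference into a nested `spans` tree."""
    nodes = {s["uid"]: {"uid": s["uid"], "spans": []} for s in flat}
    roots = []
    for s in flat:
        parent = s.get("parent_uid")
        if parent in nodes:
            nodes[parent]["spans"].append(nodes[s["uid"]])
        else:
            # No parent, or a parent outside this list: treat as top-level.
            roots.append(nodes[s["uid"]])
    return roots


if __name__ == "__main__":
    flat = [
        {"uid": "a"},
        {"uid": "b", "parent_uid": "a"},
        {"uid": "c", "parent_uid": "b"},
    ]
    print(nest_spans(flat))
```

The two shapes carry the same information. The flat form keeps each record small, while the nested form makes the hierarchy visible at the cost of deep nesting.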

It's possible I'm completely missing the point here!

I think within each span, there might be several events which should be represented by the entire OCSF class. For example, a span might contain a change freeze event, a database update event, and a change unfreeze event. Each of these will be represented as its own OCSF record, hence the correlation of 1 span to 1 event in the context of OCSF.

Please let me know if this explanation is satisfactory.

I think within each span, there might be several events which should be represented by the entire ocsf class. For example a span might contain [multiple events]

Sorry, I'm not really understanding. According to the way you're proposing to set it up in the schema, a span would fall within a trace, and a trace would fall within a HTTP Activity or API Activity. So I don't understand how you're saying that a span contains events.

I think contextually you need to think about spans and traces differently from many of the other OCSF attributes. You can think of a trace as a transaction of transactions and a span as a transaction of related events.

I think the issue with the trace/span paradigm is that we may be viewing traces as events. Traces are not events; they are essentially a recording of metadata associated with related events.

The trace and span objects are essentially a way of “tagging” the event with the span and trace information with which it is associated.

Because of how the ontology of OCSF is designed, the event cannot represent a span or trace; it carries forward the relevant metadata associated with it. Traces and spans relate events to one another, but they are not events in and of themselves and therefore cannot be represented as an OCSF class, for example.

By using the trace profile we can ensure that the metadata of the relevant associated traces and spans is preserved; that is the goal here. Spans are within traces and events are within spans, but the trace and span objects represent that metadata for the event in which the objects exist.

I appreciate the effort that has gone into that explanation but I'm afraid it's simply too abstract for somebody who isn't familiar with the trace & span terminology. I really don't want to take up any more of your time (or mine) than is necessary. So unless you can exemplify the above with a concrete example that would help me to understand the concepts, I'm going to have to drop out of reviewing this and leave it to folks with a better grasp of this area.

But in your Trace example in the PR desc, it appears each trace will have multiple spans? Is that not the case?

Frankly, I am in the same boat as @davemcatcisco here. What would truly help is an example OCSF event that uses the proposed updates. The end goal is to show how an OCSF API Activity event can be augmented with observability/trace information using these new structures, and what it all means.

I am missing this contextual info, which I need to properly review the modeling aspects in the PR.

Thanks, @floydtree. I'm a little relieved to know it's not just me. I have a tendency sometimes in situations like this to wonder, "am I just too stupid to understand this?"

So, @pladamgregory, there are now two reviewers suggesting that a worked example might be the best way to explain these concepts clearly. Is that something you could do?

The following is a simple implementation of how this would work using OCSF auth + database query over 3 spans within a trace. @davemcatcisco @floydtree Please see below:

[
  {
    "time": 1731414896,
    "activity_id": 6,
    "activity": "Preauth",
    "user": { "username": "john.doe" },
    "type_uid": 300201,
    "category_uid": 3,
    "trace": {
      "uid": "Trace_001",
      "span": {
        "uid": "Span_001",
        "service": "User Service",
        "operation": [ "User Authentication" ],
        "start_time": "2024-11-12T12:00:00Z",
        "end_time": "2024-11-12T12:00:10Z",
        "duration": "10ms"
      },
      "service": "User Service",
      "status_code": "Success",
      "start_time": "2024-11-12T12:00:00Z",
      "end_time": "2024-11-12T12:01:00Z",
      "duration": "1min"
    }
  },
  {
    "time": 1731414896,
    "activity_id": 1,
    "type_uid": 600501,
    "category_uid": 6,
    "actor": { "name": "JohnDoe", "role": "User", "process": "example_process" },
    "database": {
      "name": "SQLdatabase",
      "uid": "bc6e9d20-a125-11ef-91f2-0242ac110007",
      "type_id": 1
    },
    "query_info": { "query_string": "GET user.name" },
    "src_endpoint": { "ip": "192.168.1.1", "port": 443 },
    "trace": {
      "uid": "Trace_001",
      "span": {
        "uid": "Span_002",
        "service": "User Service",
        "operation": [ "User Authentication", "DB Query" ],
        "start_time": "2024-11-12T12:00:10Z",
        "end_time": "2024-11-12T12:00:30Z",
        "duration": "20ms"
      },
      "service": "User Service",
      "status_code": "Success",
      "start_time": "2024-11-12T12:00:00Z",
      "end_time": "2024-11-12T12:01:00Z",
      "duration": "1min"
    }
  },
  {
    "time": 1731414896,
    "activity_id": 1,
    "activity": "Logon",
    "user": { "username": "john.doe" },
    "type_uid": 300201,
    "category_uid": 3,
    "trace": {
      "uid": "Trace_003",
      "span": {
        "uid": "Span_002",
        "service": "User Service",
        "operation": [ "Auth Success" ],
        "start_time": "2024-11-12T12:00:10Z",
        "end_time": "2024-11-12T12:00:30Z",
        "duration": "20ms"
      },
      "service": "User Service",
      "status_code": "Success",
      "start_time": "2024-11-12T12:00:00Z",
      "end_time": "2024-11-12T12:01:00Z",
      "duration": "1min"
    }
  }
]
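
One way to read the example above is that a consumer can reassemble a trace by grouping events on the identifiers they carry. The following is a minimal Python sketch of that correlation, assuming events are plain dictionaries shaped like the ones posted. The `group_by_trace` helper is a hypothetical name, and the sample events are trimmed to the fields the grouping needs.

```python
from collections import defaultdict
from typing import Any


def group_by_trace(events: list[dict[str, Any]]) -> dict[str, dict[str, list[int]]]:
    """Map trace uid -> span uid -> list of event type_uid values."""
    grouped: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))
    for event in events:
        trace = event.get("trace")
        if not trace:
            continue  # the event carries no trace metadata
        span_uid = trace.get("span", {}).get("uid", "")
        grouped[trace["uid"]][span_uid].append(event["type_uid"])
    return {trace_uid: dict(spans) for trace_uid, spans in grouped.items()}


if __name__ == "__main__":
    # Identifiers copied from the three events in the example.
    events = [
        {"type_uid": 300201, "trace": {"uid": "Trace_001", "span": {"uid": "Span_001"}}},
        {"type_uid": 600501, "trace": {"uid": "Trace_001", "span": {"uid": "Span_002"}}},
        {"type_uid": 300201, "trace": {"uid": "Trace_003", "span": {"uid": "Span_002"}}},
    ]
    print(group_by_trace(events))
```

With the identifiers exactly as posted, the third event groups under `Trace_003` rather than `Trace_001`, so only two of the three events correlate into the same trace.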

dictionary.json

objects/span.json

Signed-off-by: Adam Gregory <[email protected]>

complete

floydtree · 2024-11-12T17:51:52Z

profiles/trace.json

Would it make more sense to call this the "Observability" profile instead?

I think trace is ideal since the metric component of observability will likely have to be a class in and of itself, likely within the discovery category.

Signed-off-by: Adam Gregory <[email protected]>

pladamgregory requested review from floydtree, pagbabian-splunk, Aniak5, mikeradka, zschmerber, jonrau-at-queryai and davemcatcisco as code owners November 5, 2024 20:34

pladamgregory force-pushed the main branch from 62d18d7 to a60f9f0 Compare November 6, 2024 18:36

Adding trace profile with trace objects and dependencies.

d1a640c

pladamgregory force-pushed the main branch from a60f9f0 to d1a640c Compare November 6, 2024 18:40

Reverting span operation back to string in span.json

c6286e6

Signed-off-by: Adam Gregory <[email protected]>

davemcatcisco previously requested changes Nov 7, 2024

pladamgregory added 3 commits November 7, 2024 10:20

Updating dictionary alphabetical ordering.

740ed9a

Signed-off-by: Adam Gregory <[email protected]>

Update alphabetical sorting on span.json

e83bbba

Signed-off-by: Adam Gregory <[email protected]>

Update span.json

4b4de4b

Signed-off-by: Adam Gregory <[email protected]>

pladamgregory requested a review from davemcatcisco November 7, 2024 15:27

pladamgregory added 2 commits November 7, 2024 10:28

Update description on parent_span

4b866e4

Signed-off-by: Adam Gregory <[email protected]>

Update span.json

a86db90

Signed-off-by: Adam Gregory <[email protected]>

floydtree reviewed Nov 12, 2024

pladamgregory requested a review from floydtree November 12, 2024 18:51

Update trace.json

7939636

Signed-off-by: Adam Gregory <[email protected]>

pladamgregory commented Nov 5, 2024

davemcatcisco Nov 7, 2024

pladamgregory Nov 7, 2024

pladamgregory Nov 7, 2024

davemcatcisco Nov 7, 2024

pladamgregory Nov 8, 2024

davemcatcisco Nov 8, 2024

floydtree Nov 11, 2024

davemcatcisco Nov 12, 2024

pladamgregory Nov 12, 2024

floydtree Nov 12, 2024

pladamgregory Nov 12, 2024
