Proposal: Inclusion of Trace Object & Profile #1243
Reverting span operation back to string in span.json Signed-off-by: Adam Gregory <[email protected]>
    "description": "The unique identifier of the trace used in distributed systems and microservices architecture to track and correlate requests across various components of an application.",
    "requirement": "required"
  },
  "span": {
Does every trace have just a single top-level span? In your example, your `Purchase Item` trace seems to contain five spans. Is it assumed that there will be a single top-level root span to which all of these will refer as their `parent_span`? What I'm asking, I guess, is if this should be an array of spans?
I'm a little bit confused too about how and where child spans would be represented. If a trace decomposes into one or more spans, and if each span can be further decomposed, then does it not make sense for the whole thing to be a sort of recursive structure? e.g.
{
  "spans": [
    {
      "service": {},
      "operation": {},
      "spans": [
        {
          "service": {},
          "operation": {}
        },
        {
          "service": {},
          "operation": {},
          "spans": [
            {
              "service": {},
              "operation": {}
            }
          ]
        }
      ]
    },
    {
      "service": {},
      "operation": {},
      "spans": [
        {
          "service": {},
          "operation": {}
        }
      ]
    }
  ]
}
It's possible I'm completely missing the point here!
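The nested shape above and a flat list of spans with parent references carry the same information. A minimal sketch, assuming each span is a flat record with a `uid` and an optional `parent_span` reference (the field name used in this thread); the record layout and the `nest_spans` helper are hypothetical and not part of the PR:

```python
from collections import defaultdict

def nest_spans(flat_spans):
    """Fold flat span records into a nested tree via their parent references."""
    children = defaultdict(list)
    for span in flat_spans:
        # Spans without a parent_span key are filed under None (top level).
        children[span.get("parent_span")].append(span)

    def build(span):
        node = {"uid": span["uid"], "service": span["service"]}
        kids = [build(child) for child in children[span["uid"]]]
        if kids:
            node["spans"] = kids
        return node

    return {"spans": [build(root) for root in children[None]]}

# Hypothetical sample data: A -> B -> C, plus a second root span D.
flat = [
    {"uid": "A", "service": "User Service"},
    {"uid": "B", "service": "Order Service", "parent_span": "A"},
    {"uid": "C", "service": "Payment Service", "parent_span": "B"},
    {"uid": "D", "service": "Notification Service"},
]
tree = nest_spans(flat)
```

Under these assumptions, whether a trace has one root span or several is a property of the data rather than of the structure: any span without a parent reference becomes a top-level entry.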
I think within each span there might be several events, each of which should be represented by an entire OCSF class. For example, a span might contain a change freeze event, a database update event, and a change unfreeze event. Each of these will be represented as its own OCSF record, hence the correlation of one span to one event in the context of OCSF.
Please let me know if this explanation is satisfactory.
I think within each span, there might be several events which should be represented by the entire ocsf class. For example a span might contain [multiple events]
Sorry, I'm not really understanding. According to the way you're proposing to set it up in the schema, a `span` would fall within a `trace`, and a `trace` would fall within a `HTTP Activity` or `API Activity`. So I don't understand how you're saying that a `span` contains events.
I think you need to think about spans and traces differently from most other OCSF attributes. You can think of a trace as a transaction of transactions, and a span as a transaction of related events.
I think the issue with the trace/span paradigm is that we may be viewing traces as events. Traces are not events; they are essentially a recording of metadata associated with related events.
The trace and span objects are essentially a way of “tagging” an event with the trace and span information it is associated with.
Because of how the ontology of OCSF is designed, an event cannot represent a span or trace; it carries forward the relevant metadata associated with them. Traces and spans relate events to one another, but they are not events in and of themselves and therefore cannot be represented as an OCSF class, for example.
By using the trace profile, we can ensure that the metadata of the associated traces and spans is preserved; that is the goal here. Spans are within traces and events are within spans, but the trace and span objects represent that metadata for the event in which the objects exist.
I appreciate the effort that has gone into that explanation but I'm afraid it's simply too abstract for somebody who isn't familiar with the trace & span terminology. I really don't want to take up any more of your time (or mine) than is necessary. So unless you can exemplify the above with a concrete example that would help me to understand the concepts, I'm going to have to drop out of reviewing this and leave it to folks with a better grasp of this area.
But in your Trace example in the PR desc, it appears each trace will have multiple spans? Is that not the case?
Frankly, I am in the same boat as @davemcatcisco here. What would truly help is an example OCSF event that utilizes the proposed updates. The end goal being: this is how an OCSF API Activity event can be augmented with observability/trace information using these new structures, and this is what it all means.
I am missing this contextual info, which I need to truly help review the modeling aspects of the PR.
Thanks, @floydtree. I'm a little relieved to know it's not just me. I have a tendency sometimes in situations like this to wonder, "am I just too stupid to understand this?"
So, @pladamgregory, there are now two reviewers suggesting that a worked example might be the best way to explain these concepts clearly. Is that something you could do?
The following is a simple implementation of how this would work using OCSF auth + database query over 3 spans within a trace. @davemcatcisco @floydtree Please see below:
[
  {
    "time": 1731414896,
    "activity_id": 6,
    "activity": "Preauth",
    "user": {
      "username": "john.doe"
    },
    "type_uid": 300201,
    "category_uid": 3,
    "trace": {
      "uid": "Trace_001",
      "span": {
        "uid": "Span_001",
        "service": "User Service",
        "operation": [
          "User Authentication"
        ],
        "start_time": "2024-11-12T12:00:00Z",
        "end_time": "2024-11-12T12:00:10Z",
        "duration": "10ms"
      },
      "service": "User Service",
      "status_code": "Success",
      "start_time": "2024-11-12T12:00:00Z",
      "end_time": "2024-11-12T12:01:00Z",
      "duration": "1min"
    }
  },
  {
    "time": 1731414896,
    "activity_id": 1,
    "type_uid": 600501,
    "category_uid": 6,
    "actor": {
      "name": "JohnDoe",
      "role": "User",
      "process": "example_process"
    },
    "database": {
      "name": "SQLdatabase",
      "uid": "bc6e9d20-a125-11ef-91f2-0242ac110007",
      "type_id": 1
    },
    "query_info": {
      "query_string": "GET user.name"
    },
    "src_endpoint": {
      "ip": "192.168.1.1",
      "port": 443
    },
    "trace": {
      "uid": "Trace_001",
      "span": {
        "uid": "Span_002",
        "service": "User Service",
        "operation": [
          "User Authentication",
          "DB Query"
        ],
        "start_time": "2024-11-12T12:00:10Z",
        "end_time": "2024-11-12T12:00:30Z",
        "duration": "20ms"
      },
      "service": "User Service",
      "status_code": "Success",
      "start_time": "2024-11-12T12:00:00Z",
      "end_time": "2024-11-12T12:01:00Z",
      "duration": "1min"
    }
  },
  {
    "time": 1731414896,
    "activity_id": 1,
    "activity": "Logon",
    "user": {
      "username": "john.doe"
    },
    "type_uid": 300201,
    "category_uid": 3,
    "trace": {
      "uid": "Trace_001",
      "span": {
        "uid": "Span_003",
        "service": "User Service",
        "operation": [
          "Auth Success"
        ],
        "start_time": "2024-11-12T12:00:10Z",
        "end_time": "2024-11-12T12:00:30Z",
        "duration": "20ms"
      },
      "service": "User Service",
      "status_code": "Success",
      "start_time": "2024-11-12T12:00:00Z",
      "end_time": "2024-11-12T12:01:00Z",
      "duration": "1min"
    }
  }
]
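One way to read this example: every record is an ordinary OCSF event, and the `trace` object only tells a consumer which trace and span that event belongs to. A minimal sketch of how a consumer could correlate such events, assuming the `trace.uid` and `trace.span.uid` layout shown above; the `group_by_trace` helper and the trimmed sample records are illustrative and not part of the PR:

```python
from collections import defaultdict

def group_by_trace(events):
    """Map trace uid -> span uid -> list of event type_uids."""
    grouped = defaultdict(lambda: defaultdict(list))
    for event in events:
        trace = event.get("trace")
        if not trace:
            continue  # event carries no trace profile data
        span_uid = trace.get("span", {}).get("uid")
        grouped[trace["uid"]][span_uid].append(event["type_uid"])
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {trace_uid: dict(spans) for trace_uid, spans in grouped.items()}

# Trimmed, hypothetical records; only the fields needed for correlation are kept.
events = [
    {"type_uid": 300201, "trace": {"uid": "Trace_001", "span": {"uid": "Span_001"}}},
    {"type_uid": 600501, "trace": {"uid": "Trace_001", "span": {"uid": "Span_002"}}},
    {"type_uid": 300201},  # an event emitted without the trace profile
]
grouped = group_by_trace(events)
```

Events of different classes (here an authentication event and a datastore event) end up side by side under the same trace, which is the correlation the profile is meant to preserve.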
Signed-off-by: Adam Gregory <[email protected]>
profiles/trace.json
Outdated
Would it make more sense to call this the "Observability" profile instead?
I think trace is ideal since the metric component of observability will likely have to be a class in and of itself, likely within the discovery category.
Proposal: Inclusion of Trace Object & Profile
Description: This proposal introduces the concept of traces to the OCSF schema. The goal is to enhance OCSF to cover observability data for distributed traces.
Traces contain vital information about the flow of requests through a distributed system, including a unique trace ID, individual span IDs, timestamps for start and end, duration, and metadata such as service names and error details. This data provides a comprehensive view of how requests are processed, revealing performance metrics and service dependencies. Traces are useful for performance monitoring, as they help identify bottlenecks and slow operations. They also facilitate root cause analysis by allowing developers to pinpoint issues and optimize the overall system for improved reliability and user experience.
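As an illustration of the bottleneck analysis described above, here is a minimal sketch that derives span durations from start and end timestamps and picks the slowest span. It assumes spans carry ISO 8601 `start_time` and `end_time` strings; the sample values and the `span_seconds` and `slowest_span` helpers are hypothetical:

```python
from datetime import datetime

def parse_utc(timestamp):
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))

def span_seconds(span):
    """Duration of a span in seconds, derived from its timestamps."""
    delta = parse_utc(span["end_time"]) - parse_utc(span["start_time"])
    return delta.total_seconds()

def slowest_span(spans):
    """Return the span with the longest duration."""
    return max(spans, key=span_seconds)

# Hypothetical spans belonging to a single trace.
spans = [
    {"uid": "span-1", "start_time": "2024-11-12T12:00:00Z", "end_time": "2024-11-12T12:00:10Z"},
    {"uid": "span-2", "start_time": "2024-11-12T12:00:10Z", "end_time": "2024-11-12T12:00:30Z"},
]
bottleneck = slowest_span(spans)
```

Deriving the duration from the timestamps, rather than trusting a separately reported duration field, avoids the two values drifting apart.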
To support the proposal, here's how the modeled example would look when applied to a purchase transaction trace. This illustrates how each span and event would be structured, using OCSF:
Example Trace: Purchase Transaction Trace
1. User Service Span
- `start_auth`: Marks when authentication started
- `db_query`: Records time spent querying the user database
- `auth_success`: Indicates successful authentication
2. Order Service Span
- `validate_cart`: Checks if all items in the cart are available
- `calculate_total`: Calculates the total price
- `order_created`: Confirms that the order was created in the system
3. Payment Service Span
- `start_payment`: Marks the initiation of the payment process
- `payment_gateway_call`: Time spent calling an external payment gateway
- `payment_success`: Confirms successful payment processing
4. Inventory Service Span
- `inventory_lock`: Temporarily locks inventory items
- `update_db`: Updates inventory database to reflect items sold
- `inventory_release`: Releases inventory lock
5. Notification Service Span
- `email_generated`: Generates the email content
- `email_sent`: Confirms the email was sent to the user
Summary of Trace
User Authentication → Create Order → Process Payment → Update Inventory → Send Confirmation Email
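A minimal sketch of how the trace above could be carried on individual events, assuming one record per span event with the trace and span identifiers attached under a `trace` key. The `tag_events` helper, the `name` field, and the identifier values are hypothetical, and only two of the five spans are shown:

```python
def tag_events(trace_uid, spans):
    """Emit one record per span event, each tagged with its trace and span."""
    records = []
    for span in spans:
        for event_name in span["events"]:
            records.append({
                "name": event_name,  # placeholder for the real OCSF event body
                "trace": {
                    "uid": trace_uid,
                    "span": {"uid": span["uid"], "service": span["service"]},
                },
            })
    return records

# Hypothetical identifiers; span and event names are taken from the list above.
purchase_spans = [
    {"uid": "span-1", "service": "User Service",
     "events": ["start_auth", "db_query", "auth_success"]},
    {"uid": "span-2", "service": "Order Service",
     "events": ["validate_cart", "calculate_total", "order_created"]},
]
records = tag_events("trace-1", purchase_spans)
```

Each resulting record stands alone as an event, while the shared trace identifier lets a consumer reassemble the full purchase flow.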
OCSF Model (Table)
New `trace_info` Object & Profile
Trace Object: Defines key application Trace Information for trace events. (Included via the `trace` profile)
New Trace Attributes: Enum of key application Trace Information for trace events.
New Span Object Attributes: Enum of key application Trace Information for trace events.
Traces profile