-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Streams 🌊] Enrichment state management improvements #211686
base: main
Are you sure you want to change the base?
[Streams 🌊] Enrichment state management improvements #211686
Conversation
…/kibana into tonyghiani-102-refactor-state-management
…ctor-state-management
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Just moved it into this shared package from https://github.com/elastic/kibana/pull/211686/files#diff-588ef3f03c3164d4bc7833514b6f1d8fe7605c998d730517547a2fcc4332f2de, it gives a typed placeholder for required services.
const dynamicImageImport = colorMode === 'LIGHT' ? light() : dark(); | ||
|
||
dynamicImageImport.then((module) => setImageSrc(module.default)); | ||
dynamicImageImport.then((module) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Fixes an async state update warning for unmounted images.
@@ -101,7 +101,9 @@ export const ProcessorErrors = ({ metrics }: { metrics: ProcessorMetrics }) => { | |||
const shouldDisplayErrorToggle = remainingCount > 0; | |||
|
|||
const getCalloutProps = (type: ProcessorMetrics['errors'][number]['type']): EuiCallOutProps => { | |||
const isWarningError = type === 'generic_processor_failure' && success_rate > 0; | |||
const isWarningError = | |||
type === 'non_additive_processor_failure' || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Make non-additive changes a warning, as we'll now allow them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
This is a small re-usable actor to invoke when we need a confirmation prompt. A couple of presets used in the simulation machine are also defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
This small actor is in charge of the samples data fetching and mapping to flatten records.
It will handle request abortion with the built-in signal in case the fetching state is left before resolving.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
These are a bunch of utilities moved from ...lic/components/data_management/stream_detail_enrichment/hooks/use_processing_simulator.ts, there is no new logic to review so reviewers can skip this part.
|
||
const consoleInspector = createConsoleInspector(); | ||
|
||
const StreamEnrichmentContext = createActorContext(streamEnrichmentMachine); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Using the createActorContext
utility from xstate, we get a context already set with strong typing and the utility we need exposed from the context.
const ListenForDefinitionChanges = ({ | ||
children, | ||
definition, | ||
}: React.PropsWithChildren<StreamEnrichmentInput>) => { | ||
const service = StreamEnrichmentContext.useActorRef(); | ||
|
||
useEffect(() => { | ||
service.send({ type: 'stream.received', definition }); | ||
}, [definition, service]); | ||
|
||
return children; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
This is (not very elegant) workaround to make the machine react to definition changes until I will connect them to react by event, which will be a clean-up task.
@@ -16,7 +16,6 @@ import { | |||
export type WithUIAttributes<T extends ProcessorDefinition> = T & { | |||
id: string; | |||
type: ProcessorTypeOf<T>; | |||
status: 'draft' | 'saved' | 'updated'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Not necessary anymore, the processorMachine
will track exactly in which status a processor is.
const Content = ({ definition, refreshDefinition }: Required<SchemaEditorProps>) => { | ||
const isLoadingDefinition = useStreamDetailSelector((state) => | ||
state.matches({ initialized: 'loadingDefinition' }) | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
Removes prop drilling for the loading state of a definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note
This is a simple and re-usable state machine that replicates the behaviour of useDateRange
.
To implement it a part of the enrichment flow and communicate with its parent machine, I put together this machine, that can be used with useActor
to replicate the same behaviour the other hook, while the other way around was not possible to re-use the logic in the simulation machine.
/ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, this is a massive body of work! I really like how we don't need opaque files with tons of hooks in them to define state transitions, but have a clear model of how transitions work.
I left a couple comments, but they all go into the same direction - I think we shouldn't put all logic into xstate, only where we can actually benefit from it.
In a lot of cases, xstate strikes me as overly complex and the clean modeling of state transitions is not actually as valuable as the development speed that's lost with it.
My high level opinion: Let's use xstate in central places that get real messy without, but let's try to split out local state into simpler alternatives where possible. I think the ideal of having all state within an xstate machine is the wrong one to pursue.
Basically the way to approach a feature would be trying to solve it with local state. Once
- that becomes hard to maintain introduce a sub machine for it to manage state transitions better.
- you need to lift state up, add it to the higher level xstate machine
I really worry we lose a lot of development speed by putting too much into this setup. Maybe a rule of thumb could be "if you can solve it with less or equal three hooks, do it locally, otherwise turn it into xstate".
I know that there is a learning curve aspect to it - I didn't work a lot with xstate before, so I need to get used to the abstractions. But it's not just that - the modeling approach of xstate is a very verbose one. Adding something equivalent to a useState hook requires changes in 5 different places instead of 1. Also when reading the code it's not easy to follow the control flow because things are in multiple places - actor definition in one file, event types in another file, guards in one place and all is assembled in yet another place. There is value in this for some things, especially when local state updates and side effects are mixed in a single control flow, but I think the hurdle for moving into this model should be really high.
Refactor Grok AI suggestion with a smaller machine
I'm bit on the edge whether a state machine makes sense for the suggestions or not. They seem relatively simple, so not sure. Maybe it's worth it if we lift the state up so the state is shared across different processors (e.g. for the block list that would be nice)
input: ({ context }) => ({ | ||
condition: context.samplingCondition, | ||
streamName: context.streamName, | ||
absoluteTimeRange: context.dateRangeRef.getSnapshot().context.absoluteTimeRange, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we are doing a ton of work to integrate the timefilter service with xstate by wrapping it in a sub machine with multiple actors, just to get the time range here.
I don't think this is necessary, we should just pass data
to createSamplesFetchActor
so this can be an implementation detail of the actor instead of modelling it in xstate.
If we one day need to react to time updates, we can add a simple fromObservable
actor for it - no need to have a whole state machine just to wrap some other state somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put this logic into a smaller state machine (which can be re-used easily as useDateRange, but it's not coupled to the react interface and can be spawned like here if necessary) because we already need to programmatically react to events related to time changes.
We don't only need to get the absolute time range, but we want this transition to happen only when a time change occurs and we are in the simulation state.
If I keep it separate, I would still need to use the date range hook, which is not synced with the state of this machine, leaving room to undesired side effects. The dateRange machine is a pretty small and re-usable piece, I don't see much need to maintain it once it's done and integrates well not only with other machines but can be used also independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want this transition to happen only when a time change occurs and we are in the simulation state
Isn't the update coming from here in practice?
Line 74 in 520f287
dateRangeRef?.send({ type: 'dateRange.update', range: dateRange }); |
And even if we do need it, why not hooking the actor directly into the place that needs it?
The dateRange machine is a pretty small and re-usable piece, I don't see much need to maintain it once it's done and integrates well not only with other machines but can be used also independently.
It's still code the next dev needs to read and understand, which comes at a cost. Let's not put abstractions in place because we might need them later, this is a classic death-by-a-thousand-cuts issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And even if we do need it, why not hooking the actor directly into the place that needs it?
I left a note here regarding this here.
It's still code the next dev needs to read and understand.
Forgive me here, but I don't see any difference with next dev having to read and understands the useTimefilter hook, expect here the state changes can be visualized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this relates, this comment is about changes of the definition
, not the time range, right?
I don't see any difference with next dev having to read and understands the useTimefilter hook, expect here the state changes can be visualized
Why do we need useTimefilter
, we would just need a fromObservable(data.search.timefilter.getTimefilter$())
as an actor in the state machine that needs access, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we would just need a fromObservable(data.search.timefilter.getTimefilter$()) as an actor in the state machine that needs access
Correct, adding a context value to hold the observable output, an initial value for this context value and an action to set it.
It reduces to what the date range machine does, but I don't see the need to couple it all together it we could re-use this, for example, for the routing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do it twice, then generalize out - we only do it once here, it's too early for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, it would be too early if we were not aware of use cases where we need it. We have those use cases already implemented, where this 30 lines machine would be very handy.
But mostly the main reason I split it is to keep simpler the simulation machine, so that it doesn't have to worry about knowing how the timefilter work.
I don't see that as too early given the point we are at with the stream project and what is coming next.
import { createLoadDefinitionActor } from './load_definition_actor'; | ||
import { StreamDetailContext, StreamDetailEvent, StreamDetailInput } from './types'; | ||
|
||
export const streamDetailMachine = setup({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mentioned this in your description, but this state machine is basically doing nothing and a simple hook could do the same job. Things might get more complex later, but this smells a lot like YAGNI. Let's introduce the complexity we need right now, not the one we might need at some point.
For example, I'm not sure whether we even want to introduce more complex URL state syncing and I still have high hope we can keep the context-awareness logic much simpler.
I think we are going too far here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair concerns Joe, and agree it might be premature, a couple of reasons why I went this way:
- Props drilling: passing down and down the definition and the refresh function has been the default pattern so far. It works well sure, but it makes every component much less maintainable, and trigger a rerender of the whole tree on every change, which is probably unnecessary as it breaks memoization in many places.
Consuming this from the top-level context makes it easier to access it and provide updates. I didn't change any usage site here, but there is good margin of usage for this context in the whole client app. useAbortableSync
, which is used behind the scene, is purely side-effect based, which limits in reacting sequentially when we update the value. We want to deterministically trigger an update and know when this is done, using that hook doesn't help with that. Simply said, we cannot justawait refreshDefinition()
and then do something, we need to put anotheruseEffect
on the definition changes, which is unlikely what we want to do for some cases + it adds more side effects burden to the app.
The other points about the initialization flow are probably preventive yeah, but I feel the above already gives some good reasons to have a stricter root level machine. Happy to discuss more and pull in more folks to discuss the trade-off here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the props drilling - the definition doesn't change all the time, so I'm not too worried about the rerenders. We could also add a normal context for this, that might be helpful anyway because the definition is such a central thing to access (especially for the name).
I'm not sure I follow your point about useAbortableSync
, how does the state machine change that we need to await a new definition? Where in the existing code would we need an additional hook?
We anyway have this here, right?
Line 88 in 28eef97
service.send({ type: 'stream.received', definition }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, is about the note here regarding this #211686 (comment).
The issue with useAbortableSync is that it treats the refetch not as an event, but as a side-effect, hence it makes difficult to wait and know when the refetch is done. With this instead we can notified by the stream.update anyway, and we can wire it directly to the machine by event as I mentioned in the note.
storeSimulation: assign((_, params: { simulation: Simulation }) => ({ | ||
simulation: params.simulation, | ||
})), | ||
derivePreviewColumns: assign( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an expert in xstate, but storing derived state in the context like this strikes me as pretty weird - why not using a simple useMemo
in the component where they are needed?
This way it's possible to forget to derive them at some state condition, and I don't see an upside. You could argue business logic and UI are separated this way, but we can still keep the logic in a separate function that's easily unit-testable in isolation.
Xstate has its strong suits, but derived data is not one of them, I think we should keep it out of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable, we don't strictly need to keep the derive columns here, I initially put them here because a memoization with useMemo()
won't check for the content and we'll need to use something like a useEffect that checks deeply for changes, but it shouldn't be a big change, I'll give it a try.
...ream_detail_enrichment/state_management/simulation_state_machine/simulation_state_machine.ts
Show resolved
Hide resolved
...stream_detail_enrichment/state_management/processor_state_machine/processor_state_machine.ts
Show resolved
Hide resolved
…/kibana into tonyghiani-102-refactor-state-management
…ctor-state-management
@flash1293 thanks for the in-depth review, much appreciated! I addressed all of you comments, happy to jump in a call if you feel something needs more discussion and we can pull in the team for a wider audience. To give a wider view of where I decided there was a need for a machine, it reduces to having a state that reflects what the UI allows the user to do. The complex states we manage here have subscriptions to time changes, sequential data fetching and particularly, reactivity to recurrent events. All of this need to occur under strict conditions, I added machines only where I thought it would bring benefits. I also agree with you that we shouldn't go with Xstate for anything has a state, and in the enrichment section there is still plenty of local states that don't need to update, it's about finding the balance.
WRT this statement, IMO it's not about the amount of logic or custom hooks that we have, but what they do and they interact with each other. If we have a page with 100s of independent toggles, there is no need for a machine.
There is no urgency in updating this part yet, although as we mentioned offline I spotted a couple of bugs (not addressed here) due to the several conditions we have to decide when suggestions should be available or not. We can post-pone this if there are doubts and evaluate the change if it gets more complex. |
Yeah, it's just a rule of thumb, I tried to make it quantifiable somehow :) Maybe it's fine to say "it needs to be decided on a case-by-case basis" - IT, Depends ;) Added some follow-up comments to the threads above. |
💛 Build succeeded, but was flaky
Failed CI StepsMetrics [docs]Module Count
Public APIs missing comments
Async chunks
History
|
📓 Summary
Closes https://github.com/elastic/streams-program/issues/102
This re-work of the enrichment state management introduces XState as state library to prepare scaling the enrichment part for more processors and improve performance reducing unnecessary side effects.
🤓 Reviewers note
There is a lot to digest on this PR, I'm open to any suggestion and I left some notes around to guide the review.
This is also far from perfect as there is margin for other minor DX improvements for consuming the state machines, but it will all come in follow-up work after we resolve prioritized work such as integrating the Schema Editor.
Most of the changes on this PR are about the state management for the stream enrichment, but it touches also some other areas to integrate the event-based flow.
Stream Detail machine
This machine is responsible for the whole stream detail view initialization.
At the moment, it only handle the fetching of the stream definition we are navigating to, but it'll work as the entry point for restoring url state or handle more complex initialization flows, such as determining the stream type flow.
Stream enrichment machine
This machine handles the complexity around updating/promoting/deleting processors, and the available simulation states.
It's a root level machine that spawns and manages its children machine, one for the simulation behaviour and one for each processor instantiated.
Simulation machine
This machine handle the flow around sampling -> simulating, handling debouncing and determining once a simulation can run or should refresh. It also spawn a child date range machine to react to the observable time changes and reloads.
It also derives all the required table configurations (columns, filters, documents) centralizing the parsing and reducing the cases for re-computing, since we don't rely anymore on the previous live processors copy.
Processor machine
A processor can be in different states depending on the changes, not this tracks each of them independently and send events to the parent machine to react accordingly. It provide a boost in performance compared to the previous approach, as we don't have to rerender the whole page tree since the changes are encapsulated in the machine state.
🔜 Follow Ups