ChannelFlow instances do not prefer conflation over buffering when fused #4352
A simplified answer: so, yes, this is not a bug but unclear documentation. To elaborate, the mental model of operator chaining is that of a pipeline through which data moves. Each buffering operator places a basket along the pipeline, watched over by a strict person who enforces its capacity: `conflate()` is a small basket whose operator throws away the old contents to make room, while `buffer(n)` is a larger basket whose operator, by default, holds up the pipeline when the basket is full.

What happens when the large basket is put right behind the small basket? If there is space in the large basket, the strict person moves the data over from the small basket; if not, the strict person forbids new data from entering the small basket, so the operator of the small basket throws away the incoming data when both baskets are full. So, what we observe is: the oldest element gets dropped, and 4 elements fit into the two baskets.

What fusion does is allow throwing away all baskets but the last one while effectively preserving the basket operators: if you are given a flow with a basket of size 64, you can demand that it's replaced by a small basket. If your flow pipeline ends with a basket, a consumer of your flow can replace your basket with a larger or a smaller one, but you get to decide what the operator of the basket does to your flow.
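To make the fused behavior concrete, here is a minimal sketch (the names and the exact output are illustrative, not taken from this thread): `conflate()` requests capacity 0 with `DROP_OLDEST`, and a subsequent `buffer(3)` only raises the capacity, so the fused channel ends up with capacity 3 and `DROP_OLDEST` — exactly the configuration reported in this issue.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking {
    // A producer that bursts out values synchronously, with no suspension points.
    val fused = channelFlow {
        repeat(10) { trySend(it) } // trySend never suspends
    }.conflate().buffer(3) // fuses into one channel: capacity 3, DROP_OLDEST

    // The oldest elements are dropped; only a few trailing values survive
    // (plus possibly one value handed directly to the already-waiting collector).
    println(fused.toList())
}
```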
I think on a philosophical level that line of reasoning makes some sense; however, what are some real-world use cases for this behavior? There's at least one scenario I can think of that the following statement doesn't apply to: […]

I'm currently using a `callbackFlow` in my library. I'm not trying to be a jerk and nitpick your reasoning, either. This is a real scenario I ran into when writing some unit tests for my library, which then uncovered this behavior in the first place.
Only if you're in a single-threaded scenario.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.channels.*

fun main() {
    runBlocking {
        channelFlow {
            repeat(50) {
                trySend(it)
                Thread.sleep(50) // not a suspension, just making the producer slow
            }
        }.conflate().flowOn(Dispatchers.IO).collect {
            println(it)
        }
    }
}
```

prints every number. So, semantically, conflation alone does not guarantee that any element is ever dropped.
I'm sorry, I don't really see how your example emulates a producer quickly (and synchronously) sending many values to the underlying channel. If I stop making the producer artificially slow, which isn't reflective of the example I was attempting to make, then I end up with the following code, which does exhibit conflation.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.*
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.test.runTest
import kotlin.test.Test

@Test
fun foo() = runTest {
    // Multithreaded, fast-as-possible consumer.
    val list = channelFlow<Int> {
        repeat(50) { trySend(it) }
    }.conflate().flowOn(Dispatchers.Default).toList()
    println(list) // [31, 33, 34, 36, 37, 49]

    // Single-threaded buffer.
    val list2 = channelFlow<Int> {
        repeat(50) { trySend(it) }
    }.conflate().buffer(50).toList()
    println(list2) // [0, ..., 49]
}
```

This example really emphasizes my core point: applying a buffer after `conflate()` discards the conflation behavior.
It's not what the example emulates. It emulates the consumer being much quicker than the producer.
Okay, so to summarize:

```kotlin
val original = flow {
    val channel = Channel<Int>(Channel.CONFLATED)
    // ... register some callback that sends into `channel`
    try {
        for (item in channel) emit(item)
    } finally {
        // ... unregister the callback
    }
}
val buffered = original.buffer(3)
```

What are some example use cases of this behavior? More specifically, when would it be useful for a flow which was previously conflated to be buffered later?
A coroutine doesn't always need to go through a dispatch to consume an element. It may just keep spinning if, after being done with one element, it notices that another one is available already.
Most of the time, sure, probably. You can't rely on that, though, as it's subject to race conditions if the callback gets executed in another thread.
For example, we use […]. Another example: diagnostics, to see how many values fly through the system.
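To illustrate the diagnostics case, here is a hypothetical sketch (the `libraryStates` source is an assumption, not code from this thread): the consumer widens the buffer of a conflated flow, and since fusion keeps `DROP_OLDEST` but an unlimited capacity never actually drops anything, every value can be counted.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.*

// Hypothetical library API: a conflated flow of state updates.
fun libraryStates(): Flow<Int> = channelFlow {
    repeat(100) { trySend(it) }
}.conflate()

fun main() = runBlocking {
    // Diagnostics on the consumer side: the fused channel keeps DROP_OLDEST,
    // but with unlimited capacity no element is ever dropped, so we can count
    // how many values fly through the system.
    val seen = libraryStates().buffer(Channel.UNLIMITED).count()
    println("values observed: $seen") // expected: 100
}
```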
Okay, I think you've convinced me.
Describe the bug
The `Flow.conflate()` documentation states that: […]

This is not the behavior I've observed when using a `callbackFlow` within my library.
What happened? What should have happened instead?
While working on a library I publish, I wanted to ensure that a flow I return from my public API is always conflated. It does not make sense for events to be buffered because, even though I was using a `callbackFlow`, the items within that flow describe state and not events. I thought I had achieved this behavior by applying the `Flow.conflate` operator to the flow I return; however, in some unit tests I was able to disprove that this unconditionally configures the underlying channel to be conflated. In my experience, adjacent applications of the `Flow.buffer` operator would reconfigure the underlying channel's capacity, but would not reconfigure the buffer overflow strategy.

Instead, I would expect calls to `Flow.conflate` to always take precedence over other fused operators, as the documentation leads me to believe.

I am hoping that this isn't just treated as unclear documentation. From my perspective, I don't see much utility in allowing Flow consumers to buffer a flow which has previously been conflated.
Provide a Reproducer
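A sketch of the kind of snippet described below (the callback wiring is hypothetical and stands in for the library's real event source):

```kotlin
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.*

// Hypothetical event source standing in for the library's real callback API.
fun states(): Flow<Int> = callbackFlow {
    val callback: (Int) -> Unit = { trySend(it) }
    // ... register `callback` with the event source
    awaitClose { /* ... unregister `callback` */ }
}

// conflate() requests capacity 0 + DROP_OLDEST; buffer(3) then fuses with it,
// yielding a single channel with capacity 3 and DROP_OLDEST.
val reproducer = states().conflate().buffer(3)
```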
The above snippet will configure the underlying channel to have a capacity of three and a buffer overflow strategy of `DROP_OLDEST`. It seems the conflated policy is only partially retained, since `DROP_OLDEST` survives; however, the final configured capacity is inconsistent with what's documented.