You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Since Bluefin is adding concurrency primitives #34, it might be worth considering adding parallel primitives also.
I can implement this, but I'd rather wait for concurrency API to stabilize. In the meantime, my prototype is here.
The idea is to create primitives for programming with threads, that guarantee by construction, that programs written with them are deterministic (or pure, or from now on: parallel). This API would be to concurrent API, what runPureEff is to runEff. That is, sacrificing power for safety.
Main source of inspiration is monad-par which has many examples and very simple API.
The main motivation for this is that Bluefin is an effect system, which programs are supposed to be structured around. Dropping to monad-par for some computations and losing access to every effect is not desirable. Deterministic parallelism by construction, with access to runPureEff goodness, seems attractive. That being said, monad-par doesn't look very popular those days. I'll ask on Haskell Discourse about reasons why (link).
How does it to work?
We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).
runNewEff:: (forallfreshEs.EfffreshEsr) ->Effesr
runNewEff _ =undefined
f = runPureEff do
evalState ""\s ->do
put s "you shall not pass"
runNewEff do
get s -- compilation error
This is important, as we can't distinguish effects and can't trust them to be pure. The very same idea extends to shielding threads from sharing effects.
With our shields up, it is very easy to recreate spawn from monad-par.
(I'm still unsure about getput and IVar abstractions. put has a very nasty restriction that you can use it on given IVar only once, or it throws. I don't know what power do they give over spawn.)
This is sound, our main thread is pure, and our threads are pure, and they share no state and cannot communicate other than by their results. In some sense, our threads are just lazy values!
With just this one simple trick, we can have monad-par in Bluefin!
If our threads are just pure lazy values of type a, then they can also be pure lazy values of type [a]. So instead of waiting for them to compute the whole list, we can allow them to publish results as they go. Then other threads can access those values as they are created, instead of waiting for the whole thing. This is sound as long as every access to such result is a separate stream, from the very beginning.
This is doable, but more complicated. Roughly, we can accumulate results in a list, and coordinate with the consumer thread's index of the last inserted value.
While streaming is cool, it usually involves IO. This is also fake streaming, as we have to keep the whole list in memory, in case some other thread wishes to access it too. But there might be problems that are easier to express this way. This has to be investigated. Plus, I didn't try to integrate it with Coroutine effect, would be great if it can be done.
Checked exceptions/early return/jump.
Now, this is a banger. The idea is mine (but I'm sure somebody did it already, if it is doable), so I hope I won't feel embarrassed about it, if it doesn't pan out. The problem with exceptions is that they easily create race conditions.
Let's look at this code, it is a pure function.
validateList:: [Int] ->EitherString [Int]
validateList xs = runPureEff do
try \e ->do
when (1`elem` xs) do throw e "1"
when (2`elem` xs) do throw e "2"
when (3`elem` xs) do throw e "3"return xs
But what about this?
validateListPar:: [Int] ->EitherString [Int]
validateListPar xs = runPureEff do
try \(e ::ExceptionStringe) ->do
runParallelEx e \par ex ->do
t2 <- pureForkEx par ex \p1 ex1 ->do
when (2`elem` xs) do pureThrow p1 ex1 "2"
t3 <- pureForkEx par ex \p1 ex1 ->do
when (3`elem` xs) do pureThrow p1 ex1 "3"
when (1`elem` xs) do pureThrow par ex "1"
pureGet par t2
pureGet par t3
return xs
With Bluefin exceptions as they are, this is bad. We don't know which thread will throw first.
The idea is as follows. There is a special thread in our parallel computation, let's call it main. Main is special in two ways, every thread originates from it, and it is the thread that returns the final value. All we have to do is to track when a value from a thread that has thrown an exception is requested by main! validateListPar [3,2] will always return Left "2". It is a bit more complicated when threads create other threads, but a simple state machine can do it. I'll have to give it a go in TLA+, but it looks dauntingly doable.
Exceptions in pure code.
monad-par docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic, unless we apply checked exceptions trick. But that might be not worth the effort.
Schedulers.
monad-par offers a few schedulers. As parallelism is all about performance, it might be important. I know nothing about schedulers, nor I'm planing to implement any. My plan for this is to cross fingers that GHC runtime will do (and benchmark).
API considerations.
I'll leave those for now. Other that guaranteing that only special effects can cross threads boundaries and slapping NFData here and there, I'm not that concerned.
The text was updated successfully, but these errors were encountered:
We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).
This looks fine. Isn't runNewEff = pure . runPureEff?
monad-par docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic
Imprecise exceptions are not deterministic anyway (i.e., the semantics of Haskell don't distinguish between "different" bottoms) so this is fine.
This all sounds very promising! Thanks for writing it up.
My next action is to implement at least a very basic concurrency API that I'm content won't need to change. I think I'll start by wrapping withAsync from the async library.
Since Bluefin is adding concurrency primitives #34, it might be worth considering adding parallel primitives also.
I can implement this, but I'd rather wait for concurrency API to stabilize. In the meantime, my prototype is here.
The idea is to create primitives for programming with threads, that guarantee by construction, that programs written with them are deterministic (or pure, or from now on: parallel). This API would be to concurrent API, what
runPureEff
is torunEff
. That is, sacrificing power for safety.Main source of inspiration is monad-par which has many examples and very simple API.
The main motivation for this is that Bluefin is an effect system, which programs are supposed to be structured around. Dropping to
monad-par
for some computations and losing access to every effect is not desirable. Deterministic parallelism by construction, with access torunPureEff
goodness, seems attractive. That being said,monad-par
doesn't look very popular those days. I'll ask on Haskell Discourse about reasons why (link).How does it to work?
We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).
This is important, as we can't distinguish effects and can't trust them to be pure. The very same idea extends to shielding threads from sharing effects.
With our shields up, it is very easy to recreate
spawn
frommonad-par
.(I'm still unsure about
get
put
andIVar
abstractions.put
has a very nasty restriction that you can use it on givenIVar
only once, or it throws. I don't know what power do they give overspawn
.)This is sound, our main thread is pure, and our threads are pure, and they share no state and cannot communicate other than by their results. In some sense, our threads are just lazy values!
With just this one simple trick, we can have
monad-par
in Bluefin!But I believe we can do more.
Streaming/Channels/Generators?. (idea from here)
If our threads are just pure lazy values of type
a
, then they can also be pure lazy values of type[a]
. So instead of waiting for them to compute the whole list, we can allow them to publish results as they go. Then other threads can access those values as they are created, instead of waiting for the whole thing. This is sound as long as every access to such result is a separate stream, from the very beginning.This is doable, but more complicated. Roughly, we can accumulate results in a list, and coordinate with the consumer thread's index of the last inserted value.
While streaming is cool, it usually involves IO. This is also fake streaming, as we have to keep the whole list in memory, in case some other thread wishes to access it too. But there might be problems that are easier to express this way. This has to be investigated. Plus, I didn't try to integrate it with Coroutine effect, would be great if it can be done.
Checked exceptions/early return/jump.
Now, this is a banger. The idea is mine (but I'm sure somebody did it already, if it is doable), so I hope I won't feel embarrassed about it, if it doesn't pan out. The problem with exceptions is that they easily create race conditions.
Let's look at this code, it is a pure function.
But what about this?
With Bluefin exceptions as they are, this is bad. We don't know which thread will throw first.
The idea is as follows. There is a special thread in our parallel computation, let's call it main. Main is special in two ways, every thread originates from it, and it is the thread that returns the final value. All we have to do is to track when a value from a thread that has thrown an exception is requested by main!
validateListPar
[3,2] will always returnLeft "2"
. It is a bit more complicated when threads create other threads, but a simple state machine can do it. I'll have to give it a go in TLA+, but it looks dauntingly doable.Exceptions in pure code.
monad-par
docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic, unless we apply checked exceptions trick. But that might be not worth the effort.Schedulers.
monad-par
offers a few schedulers. As parallelism is all about performance, it might be important. I know nothing about schedulers, nor I'm planing to implement any. My plan for this is to cross fingers that GHC runtime will do (and benchmark).API considerations.
I'll leave those for now. Other that guaranteing that only special effects can cross threads boundaries and slapping
NFData
here and there, I'm not that concerned.The text was updated successfully, but these errors were encountered: