Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding parallel primitives #37

Open
jeukshi opened this issue Jan 18, 2025 · 1 comment
Open

Adding parallel primitives #37

jeukshi opened this issue Jan 18, 2025 · 1 comment

Comments

@jeukshi
Copy link

jeukshi commented Jan 18, 2025

Since Bluefin is adding concurrency primitives #34, it might be worth considering adding parallel primitives also.

I can implement this, but I'd rather wait for concurrency API to stabilize. In the meantime, my prototype is here.

The idea is to create primitives for programming with threads, that guarantee by construction, that programs written with them are deterministic (or pure, or from now on: parallel). This API would be to concurrent API, what runPureEff is to runEff. That is, sacrificing power for safety.

Main source of inspiration is monad-par which has many examples and very simple API.

The main motivation for this is that Bluefin is an effect system, which programs are supposed to be structured around. Dropping to monad-par for some computations and losing access to every effect is not desirable. Deterministic parallelism by construction, with access to runPureEff goodness, seems attractive. That being said, monad-par doesn't look very popular those days. I'll ask on Haskell Discourse about reasons why (link).

How does it to work?

We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).

runNewEff :: (forall freshEs. Eff freshEs r) -> Eff es r
runNewEff _ = undefined

f = runPureEff do
    evalState "" \s -> do
        put s "you shall not pass"
        runNewEff do
            get s -- compilation error

This is important, as we can't distinguish effects and can't trust them to be pure. The very same idea extends to shielding threads from sharing effects.

With our shields up, it is very easy to recreate spawn from monad-par.

spawn
    :: (e :> es)
    => Scope sE e
    -> Eff forkEs r
    -> Eff es (Thread r sE es)

(I'm still unsure about get put and IVar abstractions. put has a very nasty restriction that you can use it on given IVar only once, or it throws. I don't know what power do they give over spawn.)

This is sound, our main thread is pure, and our threads are pure, and they share no state and cannot communicate other than by their results. In some sense, our threads are just lazy values!

With just this one simple trick, we can have monad-par in Bluefin!

But I believe we can do more.

Streaming/Channels/Generators?. (idea from here)

If our threads are just pure lazy values of type a, then they can also be pure lazy values of type [a]. So instead of waiting for them to compute the whole list, we can allow them to publish results as they go. Then other threads can access those values as they are created, instead of waiting for the whole thing. This is sound as long as every access to such result is a separate stream, from the very beginning.

This is doable, but more complicated. Roughly, we can accumulate results in a list, and coordinate with the consumer thread's index of the last inserted value.

While streaming is cool, it usually involves IO. This is also fake streaming, as we have to keep the whole list in memory, in case some other thread wishes to access it too. But there might be problems that are easier to express this way. This has to be investigated. Plus, I didn't try to integrate it with Coroutine effect, would be great if it can be done.

Checked exceptions/early return/jump.

Now, this is a banger. The idea is mine (but I'm sure somebody did it already, if it is doable), so I hope I won't feel embarrassed about it, if it doesn't pan out. The problem with exceptions is that they easily create race conditions.

Let's look at this code, it is a pure function.

validateList :: [Int] -> Either String [Int]
validateList xs = runPureEff do
    try \e -> do
        when (1 `elem` xs) do throw e "1"
        when (2 `elem` xs) do throw e "2"
        when (3 `elem` xs) do throw e "3"
        return xs

But what about this?

validateListPar :: [Int] -> Either String [Int]
validateListPar xs = runPureEff do
    try \(e :: Exception String e) -> do
        runParallelEx e \par ex -> do
            t2 <- pureForkEx par ex \p1 ex1 -> do
                when (2 `elem` xs) do pureThrow p1 ex1 "2"
            t3 <- pureForkEx par ex \p1 ex1 -> do
                when (3 `elem` xs) do pureThrow p1 ex1 "3"
            when (1 `elem` xs) do pureThrow par ex "1"
            pureGet par t2
            pureGet par t3
            return xs

With Bluefin exceptions as they are, this is bad. We don't know which thread will throw first.

The idea is as follows. There is a special thread in our parallel computation, let's call it main. Main is special in two ways, every thread originates from it, and it is the thread that returns the final value. All we have to do is to track when a value from a thread that has thrown an exception is requested by main! validateListPar [3,2] will always return Left "2". It is a bit more complicated when threads create other threads, but a simple state machine can do it. I'll have to give it a go in TLA+, but it looks dauntingly doable.

Exceptions in pure code.

monad-par docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic, unless we apply checked exceptions trick. But that might be not worth the effort.

Schedulers.

monad-par offers a few schedulers. As parallelism is all about performance, it might be important. I know nothing about schedulers, nor I'm planing to implement any. My plan for this is to cross fingers that GHC runtime will do (and benchmark).

API considerations.

I'll leave those for now. Other that guaranteing that only special effects can cross threads boundaries and slapping NFData here and there, I'm not that concerned.

@tomjaguarpaw
Copy link
Owner

We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).

This looks fine. Isn't runNewEff = pure . runPureEff?

monad-par docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic

Imprecise exceptions are not deterministic anyway (i.e., the semantics of Haskell don't distinguish between "different" bottoms) so this is fine.

This all sounds very promising! Thanks for writing it up.

My next action is to implement at least a very basic concurrency API that I'm content won't need to change. I think I'll start by wrapping withAsync from the async library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants