@sfc-gh-tteixeira commented Sep 24, 2025

Summary

Make it possible for st.fragment functions to run in a parallel thread.

Problem statement

Dashboards are one of the most common classes of apps in Streamlit. In a dashboard, data is typically loaded, then transformed (sometimes after some user input), then finally displayed as charts and other widgets.

It's very common for the load-transform code path of any given chart to be completely distinct from the code paths of other charts. However, these code paths are typically executed sequentially, which leads to a slow loading pattern for the app, where each section only loads once the previous one has finished.

Toy example:

import time

import numpy as np
import streamlit as st

def load_user_growth():
    time.sleep(1)
    return np.random.randn(100, 2)

def load_revenue_growth():
    time.sleep(1)
    return np.random.randn(100, 2)

def load_expenses_growth():
    time.sleep(1)
    return np.random.randn(100, 2)

def transform_user_growth(arr, x):
    time.sleep(1)
    return arr + x

def transform_revenue_growth(arr, x):
    time.sleep(1)
    return arr - x

def transform_expenses_growth(arr, x):
    time.sleep(1)
    return arr * x

slider1 = st.slider("Pick a number", 0, 123)
slider2 = st.slider("Pick a second number", 0, 456)

arr1 = load_user_growth()
arr1 = transform_user_growth(arr1, slider1)
st.line_chart(arr1)

arr2 = load_revenue_growth()
arr2 = transform_revenue_growth(arr2, slider2)
st.line_chart(arr2)

arr3 = load_expenses_growth()
arr3 = transform_expenses_growth(arr3, slider2)
st.line_chart(arr3)

In this app, each step runs sequentially after the previous one is done, so the whole thing takes 6s to draw:

flowchart
	s1@{ label: "slider1" }
	s2@{ label: "slider2" }
	l1@{ label: "load_user_growth (1s)" }
	l2@{ label: "load_revenue_growth (1s)" }
	l3@{ label: "load_expenses_growth (1s)" }
	t4@{ label: "transform_user_growth (1s)" }
	t5@{ label: "transform_revenue_growth (1s)" }
	t6@{ label: "transform_expenses_growth (1s)" }
	d7@{ label: "st.line_chart" }
	d8@{ label: "st.line_chart" }
	d9@{ label: "st.line_chart" }
	startCircle@{ shape: "circle", label: "Start" }
	endCircle@{ shape: "circle", label: "End" }
	startCircle --> s1
	s1 --> s2
	s2 --> l1
	l1 --> t4
	l2 --> t5
	l3 --> t6
	t4 --> d7
	t5 --> d8
	t6 --> d9
	d7 --> l2
	d8 --> l3
	d9 --> endCircle
	style s1 fill:#e0f2fe
	style s2 fill:#e0f2fe
	style l1 fill:#fce7f3
	style l2 fill:#fce7f3
	style l3 fill:#fce7f3
	style t4 fill:#ecfccb
	style t5 fill:#ecfccb
	style t6 fill:#ecfccb
	style d7 fill:#fef9c3
	style d8 fill:#fef9c3
	style d9 fill:#fef9c3
	style startCircle fill:#eee
	style endCircle fill:#eee

Question 1: Given that these code paths are so distinct, it would make a lot more sense to load them in parallel instead. What would be a simple, Streamlit-y API that is powerful enough to cover the most common patterns for this?

Question 2: When a user moves the sliders, the entire app reruns. How can we make sure only the fragments that depend on that slider rerun instead? For now we'll leave this unanswered, as it will be the subject of a separate StEP. But keep this question in mind as you think through this StEP, since we don't want the solution to Question 1 to preclude a great solution for Question 2.

Goals

  1. Make it possible to run @st.fragment functions in a separate thread.
  2. Keep it very easy to use.
  3. Cover the major use cases.
  4. Don't break existing apps.

Non-goals

  1. Covering every possible scenario.

Proposed solution

To address Question 1, let's extend the fragments primitive to support parallel execution, so the example above looks more like this:

(NOTE: Ignore the exact API right now)

import time

import numpy as np
import streamlit as st

def load_user_growth():
    time.sleep(1)
    return np.random.randn(100, 2)

def load_revenue_growth():
    time.sleep(1)
    return np.random.randn(100, 2)

def load_expenses_growth():
    time.sleep(1)
    return np.random.randn(100, 2)

def transform_user_growth(arr, x):
    time.sleep(1)
    return arr + x

def transform_revenue_growth(arr, x):
    time.sleep(1)
    return arr - x

def transform_expenses_growth(arr, x):
    time.sleep(1)
    return arr * x

slider1 = st.slider("Pick a number", 0, 123)
slider2 = st.slider("Pick a second number", 0, 456)

@st.fragment(parallelize=True)
def chart1():
    arr1 = load_user_growth()
    arr1 = transform_user_growth(arr1, slider1)
    st.line_chart(arr1)

@st.fragment(parallelize=True)
def chart2():
    arr2 = load_revenue_growth()
    arr2 = transform_revenue_growth(arr2, slider2)
    st.line_chart(arr2)

@st.fragment(parallelize=True)
def chart3():
    arr3 = load_expenses_growth()
    arr3 = transform_expenses_growth(arr3, slider2)
    st.line_chart(arr3)

chart1()
chart2()
chart3()

With parallel fragments, the app takes 2s to load, and its execution flow looks like this:

flowchart
	s1@{ label: "slider1" }
	s2@{ label: "slider2" }
	l1@{ label: "load_user_growth (1s)" }
	l2@{ label: "load_revenue_growth (1s)" }
	l3@{ label: "load_expenses_growth (1s)" }
	t4@{ label: "transform_user_growth (1s)" }
	t5@{ label: "transform_revenue_growth (1s)" }
	t6@{ label: "transform_expenses_growth (1s)" }
	d7@{ label: "st.line_chart" }
	d8@{ label: "st.line_chart" }
	d9@{ label: "st.line_chart" }
	startCircle@{ shape: "circle", label: "Start" }
	endCircle@{ shape: "circle", label: "End" }
	startCircle --> s1
	s1 --> s2
	s2 --> l1
	s2 --> l2
	s2 --> l3
	l1 --> t4
	l2 --> t5
	l3 --> t6
	t4 --> d7
	t5 --> d8
	t6 --> d9
	d7 --> endCircle
	d8 --> endCircle
	d9 --> endCircle
	style s1 fill:#e0f2fe
	style s2 fill:#e0f2fe
	style l1 fill:#fce7f3
	style l2 fill:#fce7f3
	style l3 fill:#fce7f3
	style t4 fill:#ecfccb
	style t5 fill:#ecfccb
	style t6 fill:#ecfccb
	style d7 fill:#fef9c3
	style d8 fill:#fef9c3
	style d9 fill:#fef9c3
	style startCircle fill:#eee
	style endCircle fill:#eee

API

How should we declare that a given fragment can be executed in a parallel thread?

Option 1: New keyword argument

Signature

st.fragment(func=None, *, run_every=None, parallelize=False)

Usage

@st.fragment(parallelize=True, ...)
def my_fragment():
  ...
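
As an aside, the func=None, * part of the signature is the standard convention that lets a decorator work both bare and with keyword arguments. A minimal sketch of that dispatch (fragment internals elided):

import functools

def fragment(func=None, *, run_every=None, parallelize=False):
    # Sketch of the decorator-dispatch convention only, not the real implementation.
    def decorator(f):
        @functools.wraps(f)
        def wrapped(*args, **kwargs):
            # ...set up the fragment scope, optionally hand off to a worker thread...
            return f(*args, **kwargs)
        return wrapped
    if func is None:
        return decorator    # called with arguments: @st.fragment(parallelize=True)
    return decorator(func)  # called bare: @st.fragment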

Pros

  • Doesn't introduce a new primitive in Streamlit
  • Very discoverable
  • ?

Cons

  • A bit wordy
  • ?

Naming

  1. parallelize
  2. thread
  3. background
  4. bg
  5. async (a reserved keyword, so async=True wouldn't actually parse)
  6. task
  7. daemon
  8. background_task
  9. run_in_thread
  10. run_in_parallel
  11. run_in_background
  12. run_in_bg
  13. run_async

Option 2: New decorator

Signature

st.parallel_fragment(func=None, *, run_every=None)

Usage

@st.parallel_fragment
def my_fragment():
  ...

Pros

  • Very discoverable
  • ?

Cons

  • Introduces a new flow control primitive in Streamlit.

    People tend to be confused by the primitives we already support (cache_resource, cache_data, fragment, form), so I'd rather not make things more complicated for them.

  • ?

Naming

  1. @st.parallel_fragment
  2. @st.threaded_fragment
  3. @st.async_fragment
  4. @st.thread
  5. @st.fragment_thread
  6. @st.daemon
  7. @st.task
  8. @st.async (a reserved keyword, so this wouldn't actually parse)

Option 3: Async def ✅ CURRENT FAVORITE

The idea of Option 3 is that you declare a parallel fragment using async def instead of def.

Signature

With this option, there would be no change to the @st.fragment signature:

st.fragment(func=None, *, run_every=None)

Usage

@st.fragment
async def my_fragment():
   ...

Pros

  • Doesn't introduce a new primitive in Streamlit
  • [Opinion] Feels really natural
  • ?

Cons

  • Harder to discover
  • This somewhat stretches the definition of async in Python (see the illustration after this list)
  • ?
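
To make that last point concrete: async def normally signals cooperative concurrency on an event loop, but a fragment body that does blocking work (like the time.sleep calls in the toy example) gains nothing from an event loop. Streamlit would have to run these coroutines in worker threads instead, which is not what async def usually implies:

import asyncio
import time

async def blocking_fragment():
    time.sleep(1)  # blocking call: never yields control to the event loop

async def main():
    start = time.monotonic()
    # Despite gather(), these run back to back, because time.sleep blocks
    # the event loop. Total: ~2s, not ~1s.
    await asyncio.gather(blocking_fragment(), blocking_fragment())
    print(f"elapsed: {time.monotonic() - start:.1f}s")

asyncio.run(main())  # prints "elapsed: 2.0s"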

Design

This is a Python-only feature. No impact on design.

Behavior

The return value of an async fragment is ignored.

Another option would be to return a Future or to somehow stuff the return value into Session State, but it's unclear whether any of that is needed. So let's leave this out for now and add it later if there's demand.
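
For intuition only, here's one way a runtime could schedule these: submit each async fragment body to a thread pool and drop the result. This is a hypothetical sketch, not Streamlit's actual implementation:

import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical scheduler sketch; the names and pool size are made up.
_executor = ThreadPoolExecutor(max_workers=8)

def run_parallel_fragment(coro_func):
    """Run an async-def fragment body in a worker thread, discarding its result."""
    def _run():
        # Give each fragment its own event loop so `await` still works inside it.
        asyncio.run(coro_func())
    # The returned Future is dropped on purpose: per the behavior above, return
    # values are ignored. (Exceptions would still need to be surfaced somehow.)
    _executor.submit(_run)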

Other solutions considered

Just use threads

Today, if you use a Thread in Streamlit, you need to do some magic with the script run context before st.* calls work inside that thread. We plan on fixing that soon.
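
For reference, the "magic" looks roughly like this today. Treat it as a sketch: the exact import path for these helpers has moved between Streamlit releases.

import threading

from streamlit.runtime.scriptrunner import add_script_run_ctx, get_script_run_ctx

ctx = get_script_run_ctx()  # capture the main thread's script run context

thread = threading.Thread(target=chart1)  # chart1 as defined in the example below
add_script_run_ctx(thread, ctx)  # attach it so st.* calls work inside the thread
thread.start()
thread.join()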

Once that's fixed, you'll be able to solve Question 1 with pure Python, as shown below. So why add another Streamlit primitive?

import threading

# load_*, transform_*, slider1, and slider2 are as defined in the toy example above.

def chart1():
    arr1 = load_user_growth()
    arr1 = transform_user_growth(arr1, slider1)
    st.line_chart(arr1)

def chart2():
    arr2 = load_revenue_growth()
    arr2 = transform_revenue_growth(arr2, slider2)
    st.line_chart(arr2)

def chart3():
    arr3 = load_expenses_growth()
    arr3 = transform_expenses_growth(arr3, slider2)
    st.line_chart(arr3)

threads = [
    threading.Thread(target=chart1),
    threading.Thread(target=chart2),
    threading.Thread(target=chart3),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all charts, so the script run doesn't end before they render

Pros

  1. It's just Python!
  2. ?

Cons

  1. The syntax is a little contrived
  2. Arguably, solutions that go through an st.command are better at nudging developers
    to actually use them. But it's possible that this is just a matter of documentation.
  3. ?

Major difference

In the end, the thing that's inserted into the app is not a fragment, which means that when the user interacts with widgets inside that block, they cause a full rerun of the script. This may be desired in some situations, but my hypothesis is that in most cases it would be better to rerun just that "block" of the app.

In this scenario, you could turn on fragment behavior by using @st.fragment:

import threading

# load_*, transform_*, slider1, and slider2 are as defined in the toy example above.

@st.fragment
def chart1():
    arr1 = load_user_growth()
    arr1 = transform_user_growth(arr1, slider1)
    st.line_chart(arr1)

@st.fragment
def chart2():
    arr2 = load_revenue_growth()
    arr2 = transform_revenue_growth(arr2, slider2)
    st.line_chart(arr2)

@st.fragment
def chart3():
    arr3 = load_expenses_growth()
    arr3 = transform_expenses_growth(arr3, slider2)
    st.line_chart(arr3)

threads = [
    threading.Thread(target=chart1),
    threading.Thread(target=chart2),
    threading.Thread(target=chart3),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all charts, so the script run doesn't end before they render

Note: I don't know if this would actually work! Needs to be verified.

Metrics

Impact on metrics:

The hope is that this would make a certain class of apps faster. However, it may be hard to measure, since we'd need to compare performance metrics from before and after the change.

Requires new metrics:

If going with Option 3, we'll need to add some telemetry logic to be able to tell how much usage this feature is getting.

Otherwise, Options 1 and 2 should get automatically tracked with the current telemetry logic.
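
Detecting Option 3 usage inside st.fragment could be as simple as checking whether the decorated function is a coroutine function. A hypothetical sketch (the telemetry hook and metric name are made up):

import inspect

def record_metric(name):
    """Stand-in for Streamlit's internal telemetry call (hypothetical)."""

def fragment(func=None, *, run_every=None):
    def decorator(f):
        if inspect.iscoroutinefunction(f):
            record_metric("fragment.async_def")  # made-up metric name
        return f  # real fragment wiring elided
    return decorator if func is None else decorator(func)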

Implementation

Once there's a prototype implementation, we'll link the GitHub branch for it here.
