Commit c81c5e9

prelim: add triage and the checklist
1 parent 9c2cdf5 commit c81c5e9

9 files changed

+151
-19
lines changed

index.rst

+5
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,11 @@ by `Input Output Global <https://iohk.io/>`_.
2626
Release History
2727
---------------
2828

29+
* February, 2025
30+
31+
* :ref:`The Checklist <The Checklist>` first draft finished.
32+
* :ref:`Triage <Triage>` first draft finished.
33+
2934
* September, 2024
3035

3136
* :ref:`Memory Footprints of Data Types <Memory Footprint of Data Types Chapter>` first draft finished.

src/Measurement_Observation/Stg_RTS_Profiling/reduced_stack.rst

+1-1
@@ -1,4 +1,4 @@
1-
.. Reduced Stack
1+
.. _Reduced Stack:
22

33
:lightgrey:`The Reduced Stack Method`
44
=====================================

src/Optimizations/Code_Changes/unroll_stacks.rst

+1-1
@@ -1,4 +1,4 @@
1-
.. _Unroll Stacks:
1+
.. _Unroll Monad Transformers Chapter:
22

33
:lightgrey:`Unroll Monad Transformer Stacks`
44
============================================

src/Preliminaries/how_to_use.rst

+17-13
@@ -32,12 +32,12 @@ picking your favorite Haskell library and attempting to optimize that!
3232

3333
The book assumes you are using GHC |ghcVersion| and a Linux distribution (kernel
3434
version ``5.8`` and higher). Should you be using an older compiler then some
35-
sections, such as :ref:`Using EventLog
35+
sections, such as :ref:`using EventLog
3636
<EventLog Chapter>`; which arrived in ``GHC 8.8``
37-
may still be useful, while others such as :ref:`Using Cachegrind
37+
may still be useful, while others such as :ref:`using Cachegrind
3838
<Cachegrind Chapter>`; which relies on
3939
:term:`DWARF` symbols (added in ``GHC 8.10.x``) may not be applicable.
40-
Similarly, some chapters, such as :ref:`Using perf
40+
Similarly, some chapters, such as :ref:`using perf
4141
<Perf Chapter>` will only be
4242
applicable to Linux and Linux-based operating systems.
4343

@@ -61,21 +61,25 @@ Where to Begin
6161
--------------
6262

6363
The book is structured into discrete independent parts to better serve as a
64-
handbook. Thus, the book is not meant to be read in a linear order. Instead, one
65-
should pick and choose which chapter to read next based on their needs because
66-
*the book assumes you have a problem that needs solving*.
67-
68-
There are two parts: Part 1, focuses on measurement, profiling and observation
69-
of Haskell programs. This part is ordered from the bottom-up; it begins with
70-
tools and probes that are language agnostic and close to the machine, such as
71-
:ref:`Perf <Perf Chapter>` and :ref:`Cachegrind <Cachegrind Chapter>`, then
72-
proceeds through each `intermediate representation
64+
handbook and is not meant to be read in a linear order. Instead, one should pick
65+
and choose which chapter to read next based on their needs because *the book
66+
assumes you have a problem that needs solving*.
67+
68+
The best place to start is :ref:`triage <Triage>`, which should help you
69+
narrow down your next steps. If you are short on time, or just have a problem to
70+
solve, then skip to the :ref:`checklist <The Checklist>`.
71+
72+
The book is roughly divided into two parts. Part 1 focuses on measurement,
73+
profiling and observation of Haskell programs. This part is ordered from the
74+
bottom-up; it begins with tools and probes that are language agnostic and close
75+
to the machine, such as :ref:`perf <Perf Chapter>` and :ref:`cachegrind
76+
<Cachegrind Chapter>`, then proceeds through each `intermediate representation
7377
<https://en.wikipedia.org/wiki/Intermediate_representation#:~:text=An%20intermediate%20representation%20(IR)%20is,such%20as%20optimization%20and%20translation.>`_
7478
(IR) describing the tools, probes, and information available at each IR.
7579

7680
Part 2 provides an ordered sequence of techniques to optimize code. It is
7781
ordered from the easiest methods, such as choosing the right libraries, to the
78-
hardest and more invasive methods, such as exploiting :ref:`Backpack <Backpack
82+
hardest and more invasive methods, such as exploiting :ref:`backpack <Backpack
7983
Chapter>` for fine-grained :term:`Unboxed` data types or exploiting
8084
:term:`Levity Polymorphism` to control the runtime representation of a data
8185
type.

src/Preliminaries/index.rst

+2
@@ -6,6 +6,8 @@ Preliminaries
66
:name: Preliminaries
77

88
how_to_use
9+
triage
10+
the_checklist
911
what_makes_fast_hs
1012
philosophies_of_optimization
1113
golden_rules

src/Preliminaries/philosophies_of_optimization.rst

+1-1
@@ -172,7 +172,7 @@ statements for they will waste your time and iteration cycles.
172172

173173

174174
References and Footnotes
175-
========================
175+
------------------------
176176

177177
.. [#] See `this <https://youtu.be/pgoetgxecw8?si=0csotFBkya5gGDvJ>`__ series by
178178
Casey Muratori. We thank him for his labor.

src/Preliminaries/the_checklist.rst

+49
@@ -0,0 +1,49 @@
1+
.. _The Checklist:
2+
3+
The Checklist
4+
=============
5+
6+
Here is a checklist of things you might try to improve the performance of your
7+
program.
8+
9+
- [ ] Are you compiling with ``-O2`` and ``-Wall``?
10+
- [ ] Have you checked for :ref:`memory leaks <Eventlog Chapter>` on the heap?
11+
- [ ] Have you checked for :ref:`stack leaks <Reduced Stack>`?
12+
- [ ] Have you :ref:`weighed <Weigh Chapter>` your data structures?
13+
- [ ] Do you understand the :ref:`memory footprints <Memory Footprint of Data Types Chapter>` of your data types?
14+
- [ ] Can you reduce the memory footprint of your data types?
15+
- [ ] Are you using data structures that have a :ref:`low impedance <canonical-domain-modeling>` to your problem?
16+
- [ ] Have you set up benchmarks? Are they realistic? Do they exercise the full data space?
17+
- [ ] Are your data types strict? Can you unpack them?
18+
- [ ] Have you removed excessive polymorphism?
19+
- [ ] Are you using ``Text`` or ``ByteString`` instead of ``String``?
20+
- [ ] Can you inline and monomorphise critical functions, especially in hot loops?
21+
- [ ] Are you :ref:`accidentally allocating <excessive-closure-allocation>` in a hot loop?
22+
- [ ] Are any functions in a hot loop taking more than five arguments? Can you reduce the number of arguments?
23+
- [ ] Can you :ref:`defunctionalize <Defunctionalization Chapter>` critical functions? Is GHC defunctionalizing for you?
24+
- [ ] Are you using a :ref:`left fold over a list <canonical-pointer-chasing>`?
25+
- [ ] Are your data type definitions ordered such that the most common constructor is first?
26+
- [ ] Are you using explicit export lists?
27+
- [ ] Have you checked for :userGuide:`missed specializations </using-warnings.html#ghc-flag--Wmissed-specialisations>`?
28+
- [ ] Have you checked the ratio of known to unknown function calls?
29+
- [ ] Have you inspected the Core?
30+
- [ ] Have you inspected the STG?
31+
- [ ] Would your program benefit from compiling with ``LLVM``?
32+
- [ ] Are you :ref:`shotgun parsing <shotgun-parsing>`? Can you lift information
33+
into the type system to avoid subsequent checks over the same data?
34+
- [ ] Are you grouping things that need the same processing together?
35+
- [ ] Could your program benefit from concurrency or parallelism?
36+
- [ ] Could your program benefit from the :ref:`one-shot monad trick <OneShot Monad Chapter>`?
37+
- [ ] Have you :ref:`unrolled <Unroll Monad Transformers Chapter>` your monad transformers?
38+
- [ ] Have you inspected the :ref:`cache behavior <Cachegrind Chapter>`?
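
As a rough illustration of two of the items above (strict, unpacked data types and a strict left fold), here is a minimal sketch. The ``Stats`` type and the functions ``step``, ``summarize`` and ``mean`` are hypothetical names invented for this example and are not taken from the book; only ``foldl'`` from ``Data.List`` and the ``UNPACK`` pragma are real GHC features.

```haskell
import Data.List (foldl')

-- | Hypothetical accumulator (not from the book). Both fields are strict
-- (the bang) and carry an UNPACK pragma, which asks GHC to store the raw
-- machine values in the constructor instead of pointers to boxed values.
data Stats = Stats
  { statsCount :: {-# UNPACK #-} !Int
  , statsSum   :: {-# UNPACK #-} !Double
  }

-- | One step of the fold: count the element and add it to the running sum.
step :: Stats -> Double -> Stats
step (Stats n s) x = Stats (n + 1) (s + x)

-- | foldl' forces the accumulator to weak head normal form at every step.
-- Because the fields are strict, forcing the constructor also forces the
-- count and the sum, so no chain of thunks builds up.
summarize :: [Double] -> Stats
summarize = foldl' step (Stats 0 0)

-- | Mean of a list, or Nothing for the empty list.
mean :: [Double] -> Maybe Double
mean xs = case summarize xs of
  Stats 0 _ -> Nothing
  Stats n s -> Just (s / fromIntegral n)
```

Loading this in GHCi and evaluating ``mean [1, 2, 3]`` should give ``Just 2.0``. Whether the unpacking pays off in a real program is something to confirm by measurement, for example by inspecting the Core or weighing the data structure.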
39+
40+
..
41+
The grouping item should be about data-oriented design and using things like Zig's ArrayList
42+
43+
.. todo::
44+
Each item should have a concomitant link.
45+
46+
See also
47+
--------
48+
49+
- `This older checklist <https://github.com/haskell-perf/checklist>`__.

src/Preliminaries/triage.rst

+70
@@ -0,0 +1,70 @@
1+
.. _Triage:
2+
3+
========
4+
Triage
5+
========
6+
7+
This is a triage; it is the signpost that marks the start of your journey and
8+
should give you enough direction to take your next steps.
9+
10+
Symptoms
11+
--------
12+
13+
You do not have a problem, but want to learn performance-oriented Haskell
14+
Begin with the :ref:`Philosophies of Optimization <Philosophies of
15+
Optimization Chapter>`. Then read :ref:`the programs of consistent
16+
lethargy <What Makes Fast HS Chapter>`, and some of the case studies. This
17+
should give you enough to decide your next steps. If you decide to begin
18+
doing some optimizations, see the :ref:`checklist <The Checklist>` for more
19+
ideas.
20+
21+
You have a performance regression that you want to understand and fix
22+
You need to diagnose the regression and begin thinking in terms of an
23+
investigation. Read :ref:`how to debug <How to Debug Chapter>` to make sure
24+
you know how to make progress. Since you have observed a regression, try to
25+
find a commit or state in your project where you *do not* observe the
26+
regression. This will let you bisect your project to narrow down the space
27+
of changes that introduced the regression. You may also consider other forms of profiling and
28+
observation, such as:
29+
30+
- Running a :ref:`ticky-ticky <Ticky Chapter>` profile.
31+
- :ref:`Checking the heap <Eventlog Chapter>`.
32+
- Inspecting the :ref:`Core <Reading Core>`.
33+
- Inspecting the :ref:`STG <Reading STG>`.
34+
- Observing the :ref:`cache behavior <Cachegrind Chapter>`.
35+
- Observing the :ref:`CPU's Performance Counters <Perf Chapter>`.
36+
37+
You have a program that you want to begin optimizing
38+
If you are short on time, begin with the :ref:`checklist <The Checklist>`
39+
and then check for :ref:`memory leaks <Eventlog Chapter>`. If not, begin with
40+
the easy changes:
41+
42+
- Use better data structures.
43+
- Carry checks in the type system so that the program is not always checking
44+
the same predicates.
45+
- Filter before you enter a hot loop.
46+
- Remove niceties in the hot loops, such as logging.
47+
- :ref:`Check the heap <Eventlog Chapter>`. The :ref:`klister <Klister Case
48+
Study>` case study is a good example of this kind of optimization.
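
The item above about carrying checks in the type system can be sketched as follows. This is only an illustration under our own assumptions: ``parseBatch``, ``largest``, ``smallest`` and ``spread`` are hypothetical names, while ``NonEmpty`` and ``nonEmpty`` come from ``Data.List.NonEmpty`` in ``base``. The emptiness check happens once at the boundary, and every later function relies on the type instead of checking again.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)

-- | Check the predicate "the batch is not empty" exactly once, at the
-- boundary of the program. The result type records that the check passed.
parseBatch :: [Int] -> Maybe (NonEmpty Int)
parseBatch = nonEmpty

-- | No emptiness check is needed here: a NonEmpty always has an element,
-- so maximum and minimum (via the Foldable instance) are total.
largest :: NonEmpty Int -> Int
largest = maximum

smallest :: NonEmpty Int -> Int
smallest = minimum

-- | Downstream code composes the functions above without re-validating.
spread :: NonEmpty Int -> Int
spread xs = largest xs - smallest xs
```

For example, ``fmap spread (parseBatch [3, 1, 4])`` evaluates to ``Just 3``, and an empty input is rejected once with ``Nothing`` rather than being tested for in each function.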
49+
50+
Then move into the more invasive changes such as:
51+
52+
- :ref:`Unrolling <Unroll Monad Transformers Chapter>` your monad transformers.
53+
- Using the :ref:`one-shot monad trick <OneShot Monad Chapter>`.
54+
- Selectively :ref:`defunctionalizing <Defunctionalization Chapter>` critical functions.
55+
- Critically analyzing your architecture from a performance perspective.
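
To make the defunctionalization item above more concrete, here is a minimal sketch of the general technique. It is not the example from the defunctionalization chapter. The ``Transform`` type and the functions ``apply`` and ``runPipeline`` are hypothetical and exist only for illustration. Instead of passing closures of type ``Int -> Int`` around, each closure is represented by a constructor, and a single first-order function interprets it.

```haskell
import Data.List (foldl')

-- | Hypothetical first-order description of the transformations a program
-- needs. Each constructor stands for what would otherwise be a closure,
-- and its fields hold what that closure would have captured.
data Transform
  = AddConst !Int
  | Scale !Int
  | Clamp !Int !Int -- lower bound, then upper bound

-- | The single interpreter for 'Transform'. Callers invoke this one
-- statically known function instead of an arbitrary function value.
apply :: Transform -> Int -> Int
apply (AddConst k)  x = x + k
apply (Scale k)     x = x * k
apply (Clamp lo hi) x = max lo (min hi x)

-- | Run a pipeline of transformations from left to right.
runPipeline :: [Transform] -> Int -> Int
runPipeline ts x = foldl' (flip apply) x ts
```

With these definitions, ``runPipeline [AddConst 3, Scale 2, Clamp 0 10] 4`` evaluates to ``10``. Whether this is faster than the higher-order version depends on the program, so treat it as a candidate change to benchmark and not as a guaranteed win.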
56+
57+
You have a program that you've optimized, but want to optimize more
58+
If you have already harvested the low hanging fruit then you have likely
59+
driven the program into a local maximum. Therefore, if you still need more
60+
speed then you must make more invasive changes, such as those listed
61+
above. However, the best changes you can make will exploit properties of the
62+
problem domain to reduce the work your program must do to arrive at a
63+
result. Oftentimes these will be architectural changes.
64+
65+
.. todo::
66+
67+
In lieu of having links for you to continue in this case, you can search for
68+
data-oriented design to begin refactoring your system in this manner. I
69+
highly recommend `this talk
70+
<https://youtu.be/IroPQ150F6c?si=mD486UkpWquFygjr>`__ by Andrew Kelley.

src/Preliminaries/what_makes_fast_hs.rst

+5-3
@@ -1,4 +1,4 @@
1-
.. _sec-lethargy:
1+
.. _What Makes Fast HS Chapter:
22

33
The Programs of Consistent Lethargy
44
===================================
@@ -148,7 +148,6 @@ without thinking about their memory representation; and especially around
148148
laziness. As such, most of these instances are well known and have floated
149149
around the community for some time.
150150

151-
152151
How does Excessive Pointer Chasing Slow Down Runtime Performance?
153152
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
154153

@@ -256,7 +255,7 @@ quality); second, in order to observe it, the programmer must track the memory
256255
allocation of their program across many functions, modules and packages, which
257256
is not a common experience when writing Haskell. For our purposes, we'll
258257
inspect examples that GHC should have no problem finding and optimizing. See the
259-
:ref:`Impact of seq Removal on SBV's cache <SBV572>` case study for an example of excessive memory allocation in a widely used library.
258+
:ref:`Impact of seq Removal on SBV's cache <SBV572>` case study for an example of excessive memory allocation in a widely used library.
260259

261260
.. todo::
262261
Not yet written, see `#18 <https://github.com/input-output-hk/hs-opt-handbook.github.io/issues/18>`_
@@ -267,6 +266,7 @@ transformations is beneficial; it trains you to start thinking in terms of
267266
memory allocation when reading or writing Haskell code, and teaches you to
268267
perform these optimizations manually when GHC fails to optimize.
269268

269+
.. _excessive-closure-allocation:
270270

271271
How does Excessive Closure Allocation Slow Down Runtime Performance
272272
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -442,6 +442,8 @@ expressing itself in the implementation.
442442
Need example as case study see `#20 <https://github.com/input-output-hk/hs-opt-handbook.github.io/issues/20>`_
443443

444444

445+
.. _shotgun-parsing:
446+
445447
Problem Domain Invariants are Difficult to Express
446448
""""""""""""""""""""""""""""""""""""""""""""""""""
447449
