# Low-memory-footprint mutexes for pthreads
The main kind of lock provided by the pthreads API is the mutex
(`pthread_mutex_t`). These have a lot of features (enabled through the
attributes set in `pthread_mutexattr_t`), integrate with condition
variables, and handle contention gracefully.
But a drawback is their size. On Linux, a `pthread_mutex_t` occupies
24 bytes on 32-bit machines and 40 bytes on 64-bit machines. If the
mutex is protecting a small data structure, this can lead to unwelcome
overheads in memory usage, and reduce the effectiveness of
caches.
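
To check the overhead on a particular platform, a small program along these lines can help. It is only a sketch: `struct locked_counter` and the function names are made up for illustration, and the sizes reported depend on the platform and C library.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical small structure: one counter guarded by its own mutex. */
struct locked_counter {
    pthread_mutex_t lock;
    int value;
};

/* Compile-time sanity check: the lock is larger than the data it protects. */
static_assert(sizeof(pthread_mutex_t) > sizeof(int),
              "expected the mutex to be larger than the payload");

/* Bytes of the structure that are not payload (the lock plus any padding). */
size_t locked_counter_overhead(void)
{
    return sizeof(struct locked_counter) - sizeof(int);
}

/* Print the sizes so they can be compared with the figures quoted above. */
void print_mutex_sizes(void)
{
    printf("pthread_mutex_t:       %zu bytes\n", sizeof(pthread_mutex_t));
    printf("pointer-sized word:    %zu bytes\n", sizeof(void *));
    printf("struct locked_counter: %zu bytes (%zu bytes of overhead)\n",
           sizeof(struct locked_counter), locked_counter_overhead());
}
```

Calling `print_mutex_sizes()` from `main` shows how much of each structure instance is taken up by the lock rather than by the data.
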
Some pthreads implementations also have spinlocks
(`pthread_spinlock_t`). These are smaller (4 bytes on Linux). But
they don't handle contention gracefully, so they are best used for
critical sections containing small amounts of code that can be
verified to have a short bounded running time.
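
For comparison, a spinlock-protected critical section of the kind described above might look like the following. The `hit_counter` type and its functions are hypothetical, and the sketch assumes a platform that provides `pthread_spinlock_t` (it is an optional part of POSIX).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose pthread_spin_* in strict modes */
#endif
#include <assert.h>
#include <pthread.h>

/* Hypothetical example type: a counter guarded by a spinlock. */
struct hit_counter {
    pthread_spinlock_t lock;
    unsigned long hits;
};

void hit_counter_init(struct hit_counter *c)
{
    int rc = pthread_spin_init(&c->lock, PTHREAD_PROCESS_PRIVATE);
    assert(rc == 0);
    (void)rc;
    c->hits = 0;
}

/* The critical section is a single increment: short and clearly bounded. */
unsigned long hit_counter_bump(struct hit_counter *c)
{
    unsigned long now;

    pthread_spin_lock(&c->lock);
    now = ++c->hits;
    pthread_spin_unlock(&c->lock);
    return now;
}

void hit_counter_destroy(struct hit_counter *c)
{
    pthread_spin_destroy(&c->lock);
}
```

Anything that might block or run for an unbounded time (I/O, memory allocation, waiting on another lock) does not belong inside such a section, because other threads burn CPU while they wait.
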
Hence skinny mutexes: This library provides mutexes that occupy one
pointer-sized word. Like pthreads mutexes, they integrate with
condition variables and handle contention gracefully, so code using
pthreads mutexes can be easily converted to use skinny mutexes
instead.
Skinny mutexes use atomic operations when possible (e.g. when
locking or unlocking an uncontended skinny mutex), and fall back to
the pthreads primitives when necessary (e.g. when a lock is contended
causing a thread to block). So you will still need to compile with
`-pthread`. Performance should generally be similar to pthreads
mutexes, and it might even be better in some cases.
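
The general idea of an atomic fast path with a blocking fallback can be illustrated with a toy lock. This is not the algorithm used by skinny mutexes: the `toy_*` names are invented, it uses C11 `<stdatomic.h>` rather than gcc's built-ins, and it cheats by sharing one global pthreads mutex and condition variable between all toy locks. It only shows how an uncontended lock or unlock can be a single atomic operation on one word, with the pthreads primitives used when a thread has to block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Toy word-sized lock. States: 0 = free, 1 = held, 2 = held, maybe waiters. */
typedef struct { atomic_uintptr_t word; } toy_lock_t;

/* Simplification: one global mutex/condvar pair serves every toy lock. */
static pthread_mutex_t slow_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t slow_cond = PTHREAD_COND_INITIALIZER;

void toy_lock(toy_lock_t *l)
{
    uintptr_t expected = 0;

    /* Fast path: an uncontended lock is one compare-and-swap. */
    if (atomic_compare_exchange_strong(&l->word, &expected, 1))
        return;

    /* Slow path: flag contention, then block until the word was free. */
    pthread_mutex_lock(&slow_mutex);
    while (atomic_exchange(&l->word, 2) != 0)
        pthread_cond_wait(&slow_cond, &slow_mutex);
    pthread_mutex_unlock(&slow_mutex);
}

void toy_unlock(toy_lock_t *l)
{
    /* Fast path: if nobody flagged contention, one exchange suffices. */
    if (atomic_exchange(&l->word, 0) != 2)
        return;

    /* Slow path: wake blocked threads so that they can retry. */
    pthread_mutex_lock(&slow_mutex);
    pthread_cond_broadcast(&slow_cond);
    pthread_mutex_unlock(&slow_mutex);
}

/* Demo: several threads bump a shared counter under one toy lock. */
struct toy_job { toy_lock_t lock; long counter; int iterations; };

static void *toy_worker(void *arg)
{
    struct toy_job *job = arg;

    for (int i = 0; i < job->iterations; i++) {
        toy_lock(&job->lock);
        job->counter++;
        toy_unlock(&job->lock);
    }
    return NULL;
}

long toy_demo(int nthreads, int iterations)
{
    pthread_t tids[16];
    struct toy_job job = { .counter = 0, .iterations = iterations };

    assert(nthreads > 0 && nthreads <= 16);
    atomic_init(&job.lock.word, 0);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, toy_worker, &job);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return job.counter;
}
```

Because the slow path calls into pthreads, even this toy has to be built with `-pthread`.
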
## How to use
Copy `skinny_mutex.h` and `skinny_mutex.c` into your project (or
include them as a git submodule or however you prefer to do it). Then
`#include "skinny_mutex.h"` and convert from pthread mutexes with the
following substitutions:
Pthread | Skinny mutex
----------------------------|-----------------
`pthread_mutex_t` | `skinny_mutex_t`
`pthread_mutex_init` | `skinny_mutex_init`
`pthread_mutex_destroy` | `skinny_mutex_destroy`
`pthread_mutex_lock` | `skinny_mutex_lock`
`pthread_mutex_unlock` | `skinny_mutex_unlock`
`pthread_mutex_trylock` | `skinny_mutex_trylock`
`pthread_cond_wait` | `skinny_mutex_cond_wait`
`pthread_cond_timedwait` | `skinny_mutex_cond_timedwait`
`PTHREAD_MUTEX_INITIALIZER` | `SKINNY_MUTEX_INITIALIZER`
Note that `skinny_mutex_init` does not take an attributes argument
(see below for more details). Other than that, all the arguments of
the functions mentioned correspond to the pthreads ones, and their
specifications and return values are intended to correspond exactly.
In particular, `skinny_mutex_lock` is not a thread cancellation point,
whereas `skinny_mutex_cond_wait` is.
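
As an illustration of the substitutions, here is a sketch of a one-element mailbox that can be built either way. `USE_SKINNY_MUTEX`, the `slot_*` names and the wrapper macros are all hypothetical and exist only so that the two variants can be compared side by side; in a real conversion you would just replace the calls directly. Note that the condition variable remains a `pthread_cond_t` in both variants.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#ifdef USE_SKINNY_MUTEX              /* hypothetical build switch */
#include "skinny_mutex.h"
typedef skinny_mutex_t slot_mutex_t;
#define slot_mutex_init(m)      skinny_mutex_init(m)  /* no attributes */
#define slot_mutex_destroy(m)   skinny_mutex_destroy(m)
#define slot_mutex_lock(m)      skinny_mutex_lock(m)
#define slot_mutex_unlock(m)    skinny_mutex_unlock(m)
#define slot_cond_wait(c, m)    skinny_mutex_cond_wait((c), (m))
#else                                /* plain pthreads, for comparison */
typedef pthread_mutex_t slot_mutex_t;
#define slot_mutex_init(m)      pthread_mutex_init((m), NULL)
#define slot_mutex_destroy(m)   pthread_mutex_destroy(m)
#define slot_mutex_lock(m)      pthread_mutex_lock(m)
#define slot_mutex_unlock(m)    pthread_mutex_unlock(m)
#define slot_cond_wait(c, m)    pthread_cond_wait((c), (m))
#endif

struct slot {
    slot_mutex_t lock;
    pthread_cond_t changed;   /* signalled whenever `full` changes */
    bool full;
    int value;
};

void slot_init(struct slot *s)
{
    int rc = slot_mutex_init(&s->lock);
    assert(rc == 0);
    (void)rc;
    pthread_cond_init(&s->changed, NULL);
    s->full = false;
    s->value = 0;
}

void slot_destroy(struct slot *s)
{
    pthread_cond_destroy(&s->changed);
    slot_mutex_destroy(&s->lock);
}

/* Block until the slot is empty, then store a value in it. */
void slot_put(struct slot *s, int v)
{
    slot_mutex_lock(&s->lock);
    while (s->full)
        slot_cond_wait(&s->changed, &s->lock);
    s->value = v;
    s->full = true;
    pthread_cond_broadcast(&s->changed);
    slot_mutex_unlock(&s->lock);
}

/* Block until the slot is full, then remove and return its value. */
int slot_take(struct slot *s)
{
    int v;

    slot_mutex_lock(&s->lock);
    while (!s->full)
        slot_cond_wait(&s->changed, &s->lock);
    v = s->value;
    s->full = false;
    pthread_cond_broadcast(&s->changed);
    slot_mutex_unlock(&s->lock);
    return v;
}

struct slot_job { struct slot *slot; int count; };

static void *slot_producer(void *arg)
{
    struct slot_job *job = arg;

    for (int i = 1; i <= job->count; i++)
        slot_put(job->slot, i);
    return NULL;
}

/* Send 1..count through the slot from another thread and sum the values. */
long slot_demo(int count)
{
    struct slot s;
    struct slot_job job = { &s, count };
    pthread_t producer;
    long sum = 0;

    slot_init(&s);
    pthread_create(&producer, NULL, slot_producer, &job);
    for (int i = 0; i < count; i++)
        sum += slot_take(&s);
    pthread_join(producer, NULL);
    slot_destroy(&s);
    return sum;
}
```

Without `USE_SKINNY_MUTEX` defined, this compiles against plain pthreads, which makes it easy to confirm that behaviour is unchanged after switching.
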
## Limitations compared to pthreads mutexes
Unlike pthreads mutexes, skinny mutexes do not currently support
any mutex attributes. Their behaviour corresponds to the default
pthread mutex attributes (i.e. with `NULL` passed as the second
argument to `pthread_mutex_init`).
I have a plan for how to add support for error checking corresponding
to the `PTHREAD_MUTEX_ERRORCHECK` type attribute (from
`pthread_mutexattr_settype`). This will probably be a compile-time
option.
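
For reference, this is the pthreads behaviour that such an option would correspond to, shown using plain pthreads calls only. The function name is made up for the example, and nothing here reflects how skinny mutexes would expose the feature.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose PTHREAD_MUTEX_ERRORCHECK in strict modes */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Lock an error-checking pthreads mutex twice from the same thread and
 * return the result of the second attempt. */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);   /* reports EDEADLK instead of deadlocking */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

With a default mutex, and therefore with a skinny mutex as it stands today, the second lock attempt is simply a bug in the calling code and will typically deadlock.
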
It seems feasible to add support for the protocol attribute
(`PTHREAD_PRIO_INHERIT` and `PTHREAD_PRIO_PROTECT` from
`pthread_mutexattr_setprotocol`). I don't currently have much use for
this myself, but if anyone is interested, ask me about it.
The `PTHREAD_MUTEX_RECURSIVE` type attribute will not be supported, as
it would require `skinny_mutex_t` to grow, and you can rewrite code to
avoid the need for recursive mutexes.
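
One common way to do that rewrite is to split each operation into a public function that takes the lock and an internal helper that assumes the lock is already held. The sketch below uses a hypothetical `struct account` and plain pthreads calls; the same structure applies unchanged with `skinny_mutex_t`.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical example type; the lock could equally be a skinny_mutex_t. */
struct account {
    pthread_mutex_t lock;
    long balance;
};

void account_init(struct account *a)
{
    int rc = pthread_mutex_init(&a->lock, NULL);
    assert(rc == 0);
    (void)rc;
    a->balance = 0;
}

/* Internal helper: the caller must already hold a->lock. */
static void account_add_locked(struct account *a, long amount)
{
    a->balance += amount;
}

/* Public entry point: takes the lock, then delegates to the helper. */
void account_deposit(struct account *a, long amount)
{
    pthread_mutex_lock(&a->lock);
    account_add_locked(a, amount);
    pthread_mutex_unlock(&a->lock);
}

/* Compound operation under a single acquisition. With a recursive mutex
 * this might have called account_deposit() twice while holding the lock. */
void account_deposit_with_bonus(struct account *a, long amount, long bonus)
{
    pthread_mutex_lock(&a->lock);
    account_add_locked(a, amount);
    account_add_locked(a, bonus);
    pthread_mutex_unlock(&a->lock);
}
```

The `_locked` suffix is just a naming convention that documents which functions expect the caller to hold the lock.
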
Support for the process-shared and priority ceiling attributes
(`pthread_mutexattr_setpshared` and
`pthread_mutexattr_setprioceiling`) is also unlikely, as they seem to
be of marginal usefulness and/or hard to implement.
## Portability
The code uses gcc's atomic built-ins, so it should be widely portable.
There are a few uses of x86 inline assembly where this results in
better code, but it will fall back to the built-ins for other targets.
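
The general shape of such a fallback is sketched below. The helper `word_exchange` is hypothetical and is not taken from the library's source; the choice of the `xchg` instruction and of the `__atomic_exchange_n` built-in are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomically store `desired` and return the old value. */
static inline uintptr_t word_exchange(uintptr_t *word, uintptr_t desired)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* On x86, xchg with a memory operand is implicitly locked. */
    __asm__ __volatile__("xchg %0, %1"
                         : "+r"(desired), "+m"(*word)
                         :
                         : "memory");
    return desired;
#else
    /* On other targets, rely on the compiler's atomic built-in. */
    return __atomic_exchange_n(word, desired, __ATOMIC_SEQ_CST);
#endif
}

/* Minimal usage check: the old value comes back, the new one is stored. */
void word_exchange_selftest(void)
{
    uintptr_t w = 1;

    assert(word_exchange(&w, 2) == 1);
    assert(w == 2);
}
```

Keeping the architecture-specific path behind a preprocessor check means that a new target needs no extra work, as long as the compiler provides the built-ins.
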