| 1 | +# Artifact: exoTM/STMCAS Mechanisms, Policies, and Data Structures |
| 2 | + |
| 3 | +## Abstract |
| 4 | + |
| 5 | +This artifact supports evaluating the performance of the exoTM |
| 6 | +synchronization mechanism and the STMCAS synchronization policy. |
| 7 | +It consists of synchronization libraries, data |
| 8 | +structure implementations, and microbenchmarks for stress-testing those data |
| 9 | +structures. The code requires an Intel CPU with support for the `rdtscp` |
| 10 | +instruction, which has been available on most Intel CPUs for more than 10 years. |
| 11 | +For the most meaningful evaluation, a system with a large number of cores is |
| 12 | +recommended. The provided Dockerfile handles all of the necessary software |
| 13 | +dependencies. |
| 14 | + |
| 15 | +## Description |
| 16 | + |
| 17 | +This repository consists of the following components: |
| 18 | + |
| 19 | +* Synchronization Policies (`artifact/policies`) |
| 20 | +* Data Structures (`artifact/ds`) |
| 21 | +* Microbenchmarks (`artifact/ubench`) |
| 22 | +* Evaluation Scripts (`artifact/scripts`) |
| 23 | +* Build Environment (`Docker`) |
| 24 | + |
| 25 | +### Synchronization Policies |
| 26 | + |
| 27 | +This artifact considers five synchronization policies: |
| 28 | + |
| 29 | +* Compiler-based STM (xSTM) |
| 30 | +* Hand-instrumented STM (handSTM) |
| 31 | +* Software Transactional Multiword Compare and Swap (STMCAS) |
| 32 | +* handSTM+STMCAS (hybrid) |
| 33 | +* Traditional blocking/nonblocking approaches (baseline) |
| 34 | + |
| 35 | +Each synchronization policy can be found in a subfolder of `artifact/policies`. |
| 36 | +Most policies are "header-only" C++ files, which do not require special |
| 37 | +compilation. The exception is xSTM, for which we provide a version of the |
| 38 | +llvm-transmem TM plugin for C++. |
| 39 | + |
| 40 | +### Data Structures |
| 41 | + |
| 42 | +This artifact includes several data structures implemented with STMCAS |
| 43 | +(doubly-linked list, skip list, singly-linked list, closed addressing resizable |
| 44 | +unordered map, binary search tree, red/black tree). As appropriate, these data |
| 45 | +structures are also provided for other synchronization policies. The `ds` |
| 46 | +folder holds all data structures. The subfolders of `ds` correspond to the |
| 47 | +different synchronization policies. |
| 48 | + |
| 49 | +### Microbenchmarks |
| 50 | + |
| 51 | +The artifact's microbenchmark harness runs a stress test microbenchmark. The |
| 52 | +microbenchmark has a variety of configuration options, some related to the data |
| 53 | +structure's configuration (e.g., initial size of the unordered map), others |
| 54 | +related to the experiment's configuration (e.g., operation mix, number of |
| 55 | +threads). |
| 56 | + |
| 57 | +### Build Environment |
| 58 | + |
| 59 | +The easiest way to set up an appropriate build environment is to build a Docker |
| 60 | +container. The included `Dockerfile` has instructions for building an |
| 61 | +appropriate container. The dependencies are relatively minimal: |
| 62 | + |
| 63 | +* Ubuntu 22.04 |
| 64 | +* Clang++ 15 |
| 65 | +* CMake (only needed for xSTM) |
| 66 | +* Standard Linux build tools |
| 67 | +* Standard Python3 charting tools |
| 68 | + |
| 69 | +## Hardware Dependencies |
| 70 | + |
| 71 | +This artifact has been tested on a system with 192 GB of RAM and two Intel Xeon |
| 72 | +Platinum 8160 CPUs (48 cores / 96 threads), running Ubuntu 22.04. In general, |
| 73 | +any modern x86 CPU should work. The exoTM/STMCAS codes do not require many |
| 74 | +advanced x86 features. The most noteworthy is the `rdtscp` instruction, which |
| 75 | +has been available in most Intel processors for over a decade. |
| 76 | + |
| 77 | +Please note that the baseline data structures based on the PathCAS |
| 78 | +synchronization methodology require support for Intel TSX. If you do not have a |
| 79 | +machine with TSX support, you will need to comment out lines 112/113 and 138/139 |
| 80 | +in `artifact/scripts/Targets.py`. Otherwise the automated testing/charting |
| 81 | +scripts will fail. |
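
Before deciding whether to edit `Targets.py`, it can help to confirm which of these CPU features the machine reports. The following is a minimal sketch, not part of the artifact: it assumes a Linux host that exposes `/proc/cpuinfo`, and it assumes that the `rtm` flag is the relevant indicator of TSX support for the PathCAS baselines. The function names `cpu_flags` and `check_hardware` are hypothetical.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g., non-x86 or non-Linux input)


def check_hardware(cpuinfo_text: str) -> dict[str, bool]:
    """Report whether rdtscp and TSX (RTM) appear among the CPU flags."""
    flags = cpu_flags(cpuinfo_text)
    return {
        "rdtscp": "rdtscp" in flags,  # required by the exoTM/STMCAS code
        "tsx_rtm": "rtm" in flags,    # assumption: what the PathCAS baselines need
    }


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux-specific location
    if cpuinfo.exists():
        for feature, present in check_hardware(cpuinfo.read_text()).items():
            print(f"{feature}: {'yes' if present else 'no'}")
    else:
        print("/proc/cpuinfo not found; cannot check CPU flags")
```

If `tsx_rtm` is reported as missing, the PathCAS-related lines in `artifact/scripts/Targets.py` mentioned above are the ones to comment out.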
| 82 | + |
| 83 | +## Software Dependencies |
| 84 | + |
| 85 | +This artifact was developed and tested on Linux systems, running a variety of |
| 86 | +kernel versions. The xSTM policy that we compare against requires Clang 15, so |
| 87 | +we have opted to use Clang throughout the artifact. Our build configuration |
| 88 | +uses the `-std=c++20` flag, but we do not require any particularly advanced |
| 89 | +features (e.g., no concepts or coroutines). For exoTM/STMCAS, any modern C++ |
| 90 | +compiler should be satisfactory. |
| 91 | + |
| 92 | +## Data Sets |
| 93 | + |
| 94 | +The artifact does not require any special data sets. |
| 95 | + |
| 96 | +## Instructions for Repeating the Experiments in the Paper |
| 97 | + |
| 98 | +If you wish to repeat the experiments from our paper, follow these instructions: |
| 99 | + |
| 100 | +1. Check out this repository (`git clone [email protected]:exotm/pact23.git`) |
| 101 | +2. Build the Docker image (`cd Docker && sudo docker build -t exotm_ae . && cd ..`) |
| 102 | +3. Launch a container (`sudo docker run --privileged --rm -v $(pwd):/root -it exotm_ae`) |
| 103 | +4. Build and run (`make`) |
| 104 | + |
| 105 | +Please note that the Docker image will require roughly 1.7 GB of disk space. |
| 106 | +Checking out and building the source code will require another 60 MB. |
| 107 | + |
| 108 | +Also note that you will probably want to run a parallel make command in step 4 |
| 109 | +(e.g., `make -j 16`). |
| 110 | + |
| 111 | +### Experiment Workflow |
| 112 | + |
| 113 | +The top-level Makefile first builds all necessary executable files. Please see |
| 114 | +the README.md files in subfolders for more details. In general, each data |
| 115 | +structure will produce its own executable. |
| 116 | + |
| 117 | +Once all executables are built, the Makefile will invoke `scripts/Runner.py` to |
| 118 | +collect data and plot charts. For the charts in the paper, this script took |
| 119 | +about 6 hours to run, and required about 1 GB of space to store the charts and |
| 120 | +data files. |
| 121 | + |
| 122 | +When the script completes, the `scripts/data` folder will hold all results. The |
| 123 | +charts can be found in the `scripts/charts` folder. A second set of charts, |
| 124 | +with error bars, can be found in `scripts/variance`. |
| 125 | + |
| 126 | +Note that typing `make clean` will remove all build artifacts and also all |
| 127 | +experiment results and charts. |
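
Because a full run takes hours and `make clean` discards its output, it may be worth copying the results elsewhere first. The sketch below is one way to do that and is not part of the artifact. The helper name `backup_results` and the `results-<timestamp>` folder naming are hypothetical, and `root` is assumed to be whichever folder contains `scripts/`.

```python
import shutil
import time
from pathlib import Path

# Result folders named in this README, relative to the folder holding `scripts/`.
RESULT_DIRS = ("scripts/data", "scripts/charts", "scripts/variance")


def backup_results(root: Path, dest_parent: Path) -> Path:
    """Copy the existing result folders under `root` into a new timestamped folder."""
    dest = dest_parent / time.strftime("results-%Y%m%d-%H%M%S")
    # Refuse to overwrite an earlier backup made within the same second.
    dest.mkdir(parents=True, exist_ok=False)
    for rel in RESULT_DIRS:
        src = root / rel
        if src.is_dir():  # skip folders that the run has not produced
            shutil.copytree(src, dest / Path(rel).name)
    return dest
```

For example, `backup_results(Path("."), Path.home() / "exotm-backups")` run from the folder that contains `scripts/` would copy `data`, `charts`, and `variance` (whichever exist) before a `make clean`.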
| 128 | + |
| 129 | +## Instructions for Reusing the Artifact (Adding New Data Structures) |
| 130 | + |
| 131 | +To add a new data structure to the artifact, follow these steps: |
| 132 | + |
| 133 | +1. Create a new `.h` file with the implementation of the data structure. This |
| 134 | + should go in the appropriate sub-folder of `artifact/ds`, based on the |
| 135 | + synchronization policy used by the data structure. |
| 136 | +2. Create a new `.cc` file in the appropriate sub-folder of `artifact/ubench`, |
| 137 | + depending on the synchronization policy used by the data structure. Note |
| 138 | + that these files are typically quite small (~7 lines), as they only include |
| 139 | + other files, define some types, and invoke a policy's initializer. |
| 140 | +3. In the same folder as the `.cc` file, add the `.cc` file's name (without an |
| 141 | + extension) to the `DS` variable in the `common.mk` file. Typing `make` |
| 142 | + should now build a version of the microbenchmark for testing the new data |
| 143 | + structure. Under rare circumstances, the Makefile might issue a warning |
| 144 | + about duplicate rules in the generated `rules.mk` file. Should this happen, |
| 145 | + type `make clean` and then `make` (or `make -j 16`, for a parallel build). |
| 146 | +4. To integrate the new data structure into the test scripts for an existing |
| 147 | + chart, first add it to the `exeNames` listing in |
| 148 | + `artifact/scripts/ExpCfg.py`. Then locate the chart(s) to augment in |
| 149 | + `artifact/scripts/Targets.py` and add a new `Curve` with a matching |
| 150 | + `exeName`. |
| 151 | + |