DatAFLow is a fuzzer built on top of AFL++. However, instead of a control-flow-based feedback mechanism (e.g., based on control-flow edge coverage), datAFLow uses a data-flow-based feedback mechanism; specifically, data flows based on def-use associations.
To enable performant fuzzing, datAFLow uses a custom low-fat pointer memory
allocator for efficiently tracking data flows at runtime. This is achieved via
two mechanisms: a runtime replacement for malloc and friends, libfuzzalloc, and
a set of LLVM passes to transform your target to use libfuzzalloc.
More details are available in our registered report, published at the 1st International Fuzzing Workshop (FUZZING) 2022. You can read our report here.
The datAFLow fuzzer requires a custom version of clang. Once this is built, the
fuzzalloc toolchain can be built. The FUZZALLOC_SRC variable refers to this
directory.
fuzzalloc requires a patch to the clang compiler to disable turning constant
arrays into packed constant structs.
To build the custom clang:
# Get the LLVM source code and update the clang source code
mkdir llvm
cd llvm
$FUZZALLOC_SRC/llvm-scripts/get_llvm_src.sh
$FUZZALLOC_SRC/llvm-scripts/update_clang_src.sh
# Build and install LLVM/clang/etc.
mkdir build
mkdir install
cd build
# If debugging you can also add -DCMAKE_BUILD_TYPE=Debug -DCOMPILER_RT_DEBUG=On
# Note that if you're going to use gclang, things seem to work better if you use
# the gold linker (https://llvm.org/docs/GoldPlugin.html)
cmake ../llvm -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" \
-DLLVM_BUILD_EXAMPLES=Off -DLLVM_INCLUDE_EXAMPLES=Off \
-DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_INSTALL_PREFIX=$(realpath ../install)
cmake --build .
cmake --build . --target install
# Add the install's bin directory to your path so that you use the correct clang
export PATH=$(realpath ../install/bin):$PATH
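Before building fuzzalloc, it can help to confirm that the clang found on your PATH is the custom build rather than a system copy. The following is a minimal sketch; the `INSTALL_DIR` default and the `clang_from_prefix` helper are hypothetical names introduced for illustration and are not part of the fuzzalloc scripts.

```shell
# Hypothetical install prefix; set INSTALL_DIR to the absolute path that was
# passed to -DCMAKE_INSTALL_PREFIX.
INSTALL_DIR="${INSTALL_DIR:-/path/to/llvm/install}"

# Return 0 if the first clang on PATH lives somewhere under the given prefix.
clang_from_prefix() {
    prefix="$1"
    [ -n "$prefix" ] || return 1
    found="$(command -v clang 2>/dev/null)" || return 1
    case "$found" in
        "$prefix"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if clang_from_prefix "$INSTALL_DIR"; then
    echo "using custom clang: $(command -v clang)"
else
    echo "warning: clang on PATH is not from $INSTALL_DIR" >&2
fi
```

If the warning is printed, check the order of the entries in PATH before continuing.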
Fuzzing is typically performed in conjunction with a sanitizer so that "silent"
bugs can be uncovered. Sanitizers such as ASan typically hook and replace
dynamic memory allocation routines such as malloc/free so that they can detect
buffer overflows/underflows, use-after-frees, etc.
Unfortunately, this means that we lose the ability to track dataflow (as we
rely on the memory allocator to do this). Therefore, we must use a custom
version of ASan in order to (a) detect bugs and (b) track dataflow.
To build the custom ASan, run the following after running get_llvm_src.sh and
update_clang_src.sh above:
cd llvm
$FUZZALLOC_SRC/llvm-scripts/update_compiler_rt_src.sh
$FUZZALLOC_SRC/llvm-scripts/update_llvm_src.sh
# Build and install LLVM/clang/etc.
cd build
# If debugging you can also add -DCMAKE_BUILD_TYPE=Debug -DCOMPILER_RT_DEBUG=On
cmake ../llvm -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" \
-DFUZZALLOC_ASAN=On -DLIBFUZZALLOC_PATH=/path/to/libfuzzalloc.so \
-DLLVM_BUILD_EXAMPLES=Off -DLLVM_INCLUDE_EXAMPLES=Off \
-DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_INSTALL_PREFIX=$(realpath ../install)
cmake --build .
cmake --build . --target install
# Make sure the install path is available in $PATH
Note that after building LLVM with the custom ASan, you will have to rebuild
fuzzalloc with the new clang/clang++ (found under install/bin).
mkdir build
cd build
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DAFL_PATH=/path/to/afl++/source $FUZZALLOC_SRC
make -j
libfuzzalloc is a drop-in replacement for malloc and friends. When using gcc,
it's safest to pass in the flags:
-fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
All you have to do is link your target with -lfuzzalloc.
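The link step described above can be sketched as follows. The library directory, source file name, and output name are hypothetical placeholders; the location of libfuzzalloc.so inside your build tree is an assumption you will need to adjust. The `-L` and `-Wl,-rpath` options are standard gcc/linker options, added here so the library can be found at link time and at run time.

```shell
# Flags from the text above: stop gcc from treating allocator calls as builtins.
NO_BUILTIN_FLAGS="-fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free"

# Hypothetical locations; adjust to your fuzzalloc build directory and target.
FUZZALLOC_LIB_DIR="${FUZZALLOC_LIB_DIR:-/path/to/fuzzalloc/build}"
TARGET_SRC="${TARGET_SRC:-target.c}"

# Assemble the compile-and-link command (paths must not contain spaces).
LINK_CMD="gcc $NO_BUILTIN_FLAGS -o target $TARGET_SRC -L$FUZZALLOC_LIB_DIR -Wl,-rpath,$FUZZALLOC_LIB_DIR -lfuzzalloc"
echo "$LINK_CMD"

# Only run the command when the library and the source file are present.
if [ -f "$FUZZALLOC_LIB_DIR/libfuzzalloc.so" ] && [ -f "$TARGET_SRC" ]; then
    $LINK_CMD
fi
```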
The dataflow-cc (and dataflow-cc++) tools can be used as drop-in replacements
for clang (and clang++).
Note that this typically requires running dataflow-preprocess before running
dataflow-cc to collect the allocation sites to tag.
If the target uses custom memory allocation routines (i.e., wrapping malloc,
calloc, etc.), then a special case list containing a list of these routines
should be provided to dataflow-preprocess. Doing so ensures
dynamically-allocated variable def sites are appropriately tagged. The list is
provided via the FUZZALLOC_MEM_FUNCS environment variable; i.e.,
FUZZALLOC_MEM_FUNCS=/path/to/special/case/list. The special case list must be
formatted as:
[fuzzalloc]
fun:malloc_wrapper
fun:calloc_wrapper
fun:realloc_wrapper
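As a minimal sketch of the setup described above, the list can be written to a file and exported before running dataflow-preprocess. The file path is a hypothetical placeholder, and the wrapper names are the example names from the list above; replace them with the allocation wrappers your target actually defines. The invocation of dataflow-preprocess itself is not shown.

```shell
# Hypothetical path for the list; any writable location works.
MEM_FUNCS_LIST="${MEM_FUNCS_LIST:-$PWD/fuzzalloc-mem-funcs.txt}"

# Write the [fuzzalloc] section header followed by one "fun:" entry per wrapper.
cat > "$MEM_FUNCS_LIST" <<'EOF'
[fuzzalloc]
fun:malloc_wrapper
fun:calloc_wrapper
fun:realloc_wrapper
EOF

# Make the list visible to dataflow-preprocess via the environment.
export FUZZALLOC_MEM_FUNCS="$MEM_FUNCS_LIST"
echo "FUZZALLOC_MEM_FUNCS=$FUZZALLOC_MEM_FUNCS"
```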
The locations of variable tag sites are stored in a file specified by the
FUZZALLOC_TAG_LOG environment variable.
dataflow-cc is a drop-in replacement for clang. To use the tag list generated
by dataflow-preprocess, set it in the FUZZALLOC_TAG_LOG environment variable
(e.g., FUZZALLOC_TAG_LOG=/path/to/tags).
Other useful environment variables include:
- FUZZALLOC_FUZZER: Sets the fuzzer instrumentation to use. Valid fuzzers
  include: debug-log (logs to stderr; requires fuzzalloc to be built in debug
  mode, i.e., with -DCMAKE_BUILD_TYPE=Debug), AFL, and libfuzzer.
- FUZZALLOC_SENSITIVITY: Sets the use site sensitivity. Valid sensitivities
  are: mem-read, mem-write, mem-access, mem-read-offset, mem-write-offset, and
  mem-access-offset.
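Since a mistyped value in one of these variables is easy to miss, one option is to validate the values before exporting them. The sketch below assumes nothing about how dataflow-cc itself handles invalid values; the `is_one_of` helper and the `FUZZER` and `SENSITIVITY` shell variables are hypothetical names introduced for illustration.

```shell
# Return 0 if the first argument equals one of the remaining arguments.
# Usage: is_one_of VALUE CHOICE...
is_one_of() {
    value="$1"; shift
    for choice in "$@"; do
        [ "$value" = "$choice" ] && return 0
    done
    return 1
}

# Example choices, both taken from the lists above.
FUZZER="${FUZZER:-AFL}"
SENSITIVITY="${SENSITIVITY:-mem-access}"

is_one_of "$FUZZER" debug-log AFL libfuzzer \
    || { echo "invalid FUZZALLOC_FUZZER: $FUZZER" >&2; exit 1; }
is_one_of "$SENSITIVITY" mem-read mem-write mem-access \
        mem-read-offset mem-write-offset mem-access-offset \
    || { echo "invalid FUZZALLOC_SENSITIVITY: $SENSITIVITY" >&2; exit 1; }

export FUZZALLOC_FUZZER="$FUZZER"
export FUZZALLOC_SENSITIVITY="$SENSITIVITY"
# Placeholder path for the tag list generated by dataflow-preprocess.
export FUZZALLOC_TAG_LOG="${FUZZALLOC_TAG_LOG:-/path/to/tags}"
```

With these exported, dataflow-cc and dataflow-cc++ can be used in place of clang and clang++ in the target's build.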
The following flags are added to libFuzzer:
- use_dataflow: Enable dataflow-based coverage
- print_dataflows: Print out covered def/use chains
- job_prefix: fuzz-JOB.log prefix
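libFuzzer flags are passed on the command line in the form -name=value, so the flags above could be used roughly as follows. This is a sketch under assumptions: the fuzzer binary name and corpus directory are hypothetical, and the values `=1` follow the usual libFuzzer convention for boolean flags rather than anything stated here. job_prefix is left out because its expected value format is not described above.

```shell
# Hypothetical fuzzer binary and corpus directory.
FUZZ_TARGET="${FUZZ_TARGET:-./target_fuzzer}"
CORPUS_DIR="${CORPUS_DIR:-corpus}"

# Assumed values: enable dataflow coverage and print covered def/use chains.
FUZZ_ARGS="-use_dataflow=1 -print_dataflows=1"

echo "$FUZZ_TARGET $FUZZ_ARGS $CORPUS_DIR"

# Only launch the fuzzer when the binary actually exists.
if [ -x "$FUZZ_TARGET" ]; then
    mkdir -p "$CORPUS_DIR"
    # Word splitting of FUZZ_ARGS is intended here.
    "$FUZZ_TARGET" $FUZZ_ARGS "$CORPUS_DIR"
fi
```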