HexHive / fuzzing-seed-selection

Licence: other
"Seed Selection for Successful Fuzzing" artifact (at ISSTA 2021)

Seed Selection for Successful Fuzzing

The artifact associated with our ISSTA 2021 paper "Seed Selection for Successful Fuzzing". While our primary artifact is the OptiMin corpus minimizer, we also provide the necessary infrastructure to reproduce our fuzzing experiments.

Getting Started

Set up your environment

Set up your environment (assumes a modern Ubuntu OS, >= 18.04 && <= 20.04, and Python, >= 3.6 && <= 3.8):

# Install prerequisites
sudo apt update
sudo apt install -y git docker.io python3-venv 

# Add yourself to the docker group (don't forget to log out and log back in so
# that the group changes take effect)
sudo usermod -aG docker $USER

# Set up a virtualenv
python3 -m venv seed_selection
source seed_selection/bin/activate
pip3 install wheel

# Get this repo
git clone https://github.com/HexHive/fuzzing-seed-selection
pip3 install fuzzing-seed-selection/scripts

Build OptiMin

OptiMin is our SAT-based corpus minimization tool. It supports coverage generated by both AFL and llvm-cov (only AFL is used in the paper). Similarly, OptiMin can use either Z3 or EvalMaxSAT as its solver backend (only EvalMaxSAT is used in the paper). To build:

docker build -t seed-selection/optimin fuzzing-seed-selection/optimin

Run OptiMin

OptiMin takes a large "collection corpus" and selects a subset of seeds to use for fuzzing. The selection is based on the code coverage of each seed in the collection corpus.
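To make the selection problem concrete, here is a minimal sketch of coverage-preserving minimization on a toy corpus. It uses exhaustive search rather than a MaxSAT solver, so it is not how OptiMin is implemented; the seed names and edge IDs are hypothetical.

```python
from itertools import combinations


def minimal_cover(coverage):
    """Return a smallest list of seeds whose combined coverage equals the
    coverage of the whole corpus (exhaustive search, toy-sized inputs only)."""
    seeds = sorted(coverage)
    universe = set().union(*coverage.values()) if coverage else set()
    # Try subsets in order of increasing size; the first full cover is minimal.
    for size in range(len(seeds) + 1):
        for subset in combinations(seeds, size):
            covered = set()
            for seed in subset:
                covered |= coverage[seed]
            if covered == universe:
                return list(subset)
    return seeds


# Hypothetical collection corpus: seed name -> set of edge IDs it exercises.
corpus = {
    "a.ttf": {1, 2, 3},
    "b.ttf": {3, 4},
    "c.ttf": {1, 2},
    "d.ttf": {4, 5},
}
selected = minimal_cover(corpus)
print(selected)  # ['a.ttf', 'd.ttf']
```

Exhaustive search is exponential in the corpus size, which is why a real tool hands the problem to a solver.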

While we provide tools to generate code coverage information for a given corpus (based on afl-showmap), this can be time-consuming, depending on the size of the corpus. We therefore provide precomputed seed traces in HDF5 archives.

For example, to perform a corpus minimization based on Google FTS FreeType2 coverage:

  1. Download the coverage HDF5 archive

    wget https://datacommons.anu.edu.au/DataCommons/rest/records/anudc:6106/data/afl-showmap-coverage/fts/freetype2.hdf5
  2. Expand the HDF5 using the expand_hdf5_coverage.py script

    expand_hdf5_coverage.py -i freetype2.hdf5 -o /tmp/freetype2
    
    # Expected output:
    #
    # 466 seeds to extract
    # Expanding freetype2.hdf5: 100%
  3. Perform an unweighted minimization based on edges only (not hit counts)

    docker run -v /tmp/freetype2:/tmp/freetype2   \
      seed-selection/optimin -e /tmp/freetype2
    
    # Expected output:
    #
    # afl-showmap corpus minimization
    #
    # [############################################################] 100% Reading seed coverage
    # [############################################################] 100% Generating clauses
    # [*] Running Optimin on /tmp/freetype2
    # [*] Running EvalMaxSAT on WCNF
    # [+] EvalMaxSAT completed
    # [*] Parsing EvalMaxSAT output
    # [+] Solution found for /tmp/freetype2
    # 
    # [+] Total time: 0.01 sec
    # [+] Num. seeds: 37
    #
    # ...
  4. Perform an unweighted minimization including edge hit counts

    docker run -v /tmp/freetype2:/tmp/freetype2  \
      seed-selection/optimin /tmp/freetype2
    
    # Expected output:
    #
    # afl-showmap corpus minimization
    #
    # [############################################################] 100% Reading seed coverage
    # [############################################################] 100% Generating clauses
    # [*] Running Optimin on /tmp/freetype2
    # [*] Running EvalMaxSAT on WCNF
    # [+] EvalMaxSAT completed
    # [*] Parsing EvalMaxSAT output
    # [+] Solution found for /tmp/freetype2
    #
    # [+] Total time: 0.01 sec
    # [+] Num. seeds: 53
    #
    # ...
  5. Download the file weights (i.e., sizes)

    wget https://datacommons.anu.edu.au/DataCommons/rest/records/anudc:6106/data/weights/ttf.csv
  6. Perform a weighted minimization based on file size and edges only

    docker run -v /tmp/freetype2:/tmp/freetype2 -v $(pwd):/tmp   \
      seed-selection/optimin -e -w /tmp/ttf.csv /tmp/freetype2
    
    # Expected output:
    #
    # afl-showmap corpus minimization
    #
    # [*] Reading weights from `/tmp/ttf.csv`... 0s
    # [############################################################] 100% Calculating top
    # [############################################################] 100% Reading seed coverage
    # [############################################################] 100% Generating clauses
    # [*] Running Optimin on /tmp/freetype2
    # [*] Running EvalMaxSAT on WCNF
    # [+] EvalMaxSAT completed
    # [*] Parsing EvalMaxSAT output
    # [+] Solution found for /tmp/freetype2
    #
    # [+] Total time: 0.01 sec
    # [+] Num. seeds: 37
    #
    # ...

Detailed Description

Additional Files

The sizes of our collection corpora mean that we cannot store them in a Git repo. Instead, we store ancillary data in ANU's DataCommons repository (record anudc:6106, the source of the download URLs above).

Tracing Code Coverage

Corpus minimization is typically based on some notion of "code coverage". To ensure a fair and uniform comparison across the three corpus minimization tools (afl-cmin, MinSet, and OptiMin), we use AFL's notion of edge coverage. This coverage information can be generated as follows:

  1. Compile your target with AFL instrumentation. See the AFL documentation for instructions on how to do this.
  2. Run replay_seeds.py with your target program and your collection corpus. This will generate an HDF5 archive containing coverage information that can then be minimized.
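As a rough illustration of what a coverage trace contains, the sketch below parses text in the `edge_id:hit_count` line format that afl-showmap writes. The function name `parse_showmap` and the sample traces are hypothetical, and the sketch does not read the HDF5 archives produced by replay_seeds.py. It also shows one way to picture the difference between "edges only" and "edges plus hit counts": two seeds can look identical under the first view and distinct under the second.

```python
def parse_showmap(text, edges_only=False):
    """Parse afl-showmap style text ("edge_id:hit_count" per line) into a set
    of (edge, count) coverage elements. With edges_only=True every count is
    normalized to 1, so only the set of edges distinguishes two traces."""
    elements = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        edge, _, count = line.partition(":")
        elements.add((int(edge), 1 if edges_only else int(count)))
    return elements


# Two hypothetical traces that hit the same edges with different counts.
trace_a = "000012:1\n000107:3\n"
trace_b = "000012:2\n000107:3\n"

same_edges = parse_showmap(trace_a, True) == parse_showmap(trace_b, True)
same_counts = parse_showmap(trace_a) == parse_showmap(trace_b)
print(same_edges, same_counts)  # True False
```

Because hit counts create more distinct coverage elements, a minimization that includes them generally has to keep more seeds than one based on edges alone.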

Corpus Minimization

Our paper surveys a number of corpus minimization tools: OptiMin, afl-cmin, and MinSet. A more detailed explanation of how to use these tools and reproduce our results is given below.

OptiMin

Instructions for running OptiMin are given above. As described previously, a weighted minimization can be performed by supplying a weights CSV file to OptiMin's -w option. This weights file has the following format:

FILE_1,WEIGHT
FILE_2,WEIGHT
FILE_3,WEIGHT
FILE_4,WEIGHT
FILE_5,WEIGHT

Where FILE_1, FILE_2, ... correspond to the names of files within the corpus directory (provide only the filename, not the corpus directory path), and WEIGHT is an unsigned integer >= 1. We provide weights for our collection corpora here.
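If you need weights for your own corpus, a file in this format could be generated along the following lines. This is a minimal sketch, not a script shipped with the artifact: the helper name `write_size_weights` and the demo file names are hypothetical, and it assumes file size in bytes is the desired weight. Empty files are clamped to 1 so that every weight satisfies the >= 1 requirement.

```python
import csv
import os
import tempfile


def write_size_weights(corpus_dir, csv_path):
    """Write one FILE,WEIGHT row per regular file in corpus_dir, using the
    file size in bytes (clamped to at least 1) as the weight."""
    rows = []
    for name in sorted(os.listdir(corpus_dir)):
        path = os.path.join(corpus_dir, name)
        if os.path.isfile(path):
            rows.append((name, max(1, os.path.getsize(path))))
    with open(csv_path, "w", newline="") as f:
        # Plain "\n" line endings; only the filename is written, not the path.
        csv.writer(f, lineterminator="\n").writerows(rows)
    return rows


# Demo on a throwaway corpus with one 10-byte seed and one empty seed.
with tempfile.TemporaryDirectory() as tmp:
    corpus = os.path.join(tmp, "corpus")
    os.mkdir(corpus)
    for name, data in (("seed_a.ttf", b"\x00" * 10), ("seed_b.ttf", b"")):
        with open(os.path.join(corpus, name), "wb") as f:
            f.write(data)
    weights_csv = os.path.join(tmp, "weights.csv")
    rows = write_size_weights(corpus, weights_csv)
    with open(weights_csv) as f:
        csv_text = f.read()

print(csv_text)
```

The filenames in the CSV must match the names in the directory that OptiMin reads, so run this over seed files whose names correspond to the expanded coverage traces.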

afl-cmin

afl-cmin is AFL's inbuilt corpus minimization tool. afl_cmin.py wraps afl-cmin so that it outputs the names of the seeds in the minimized corpus (rather than copying the seeds and wasting storage).
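For intuition about heuristic minimizers, the sketch below implements a generic greedy set-cover heuristic: repeatedly keep the seed that covers the most not-yet-covered edges. This is an illustrative stand-in and not afl-cmin's actual algorithm; the seed names and edge IDs are hypothetical.

```python
def greedy_cover(coverage):
    """Greedy set cover: repeatedly pick the seed that adds the most
    uncovered edges until the whole corpus coverage is reached."""
    uncovered = set().union(*coverage.values()) if coverage else set()
    chosen = []
    while uncovered:
        # Ties are broken by seed name because max() keeps the first maximum.
        best = max(sorted(coverage), key=lambda s: len(coverage[s] & uncovered))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


# Hypothetical corpus where the greedy choice is not optimal:
# "left.ttf" and "right.ttf" together already cover every edge.
corpus = {
    "big.ttf": {1, 2, 3, 4},
    "left.ttf": {1, 2, 5},
    "right.ttf": {3, 4, 6},
}
greedy = greedy_cover(corpus)
print(greedy)  # ['big.ttf', 'left.ttf', 'right.ttf']
```

Here the heuristic keeps three seeds although two suffice, which illustrates the gap between an approximate minimizer and an optimal one.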

MinSet

MinSet is the tool developed by Rebert et al. in their paper Optimizing Seed Selection for Fuzzing. While we were able to obtain the tool from the authors, it is not open source and thus we are unable to provide it here. Please contact the authors if you would like to obtain the source code.

If you have access to the source code, you can perform a MinSet minimization as follows:

  1. Generate code coverage as described in Tracing Code Coverage
  2. Expand the generated HDF5 archive using expand_hdf5_coverage.py
  3. Convert the expanded coverage to a set of bitvector traces using MoonBeam
  4. Run the qminset.py wrapper on the bitvector traces

Fuzzing Experiments

In addition to the OptiMin tool, we also provide the necessary infrastructure to reproduce our fuzzing experiments. Detailed instructions are provided here.