Licence: apache-2.0

mishegos

A differential fuzzer for x86 decoders.

Usage

Start with a clone, including submodules:

git clone --recurse-submodules https://github.com/trailofbits/mishegos

Building

mishegos is most easily built within Docker:

docker build -t mishegos .

Alternatively, you can try building it directly.

Make sure you have binutils-dev (or however your system provides libopcodes) installed, then run:

make
# or
make debug

Running

Run the fuzzer for a bit:

./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

mishegos checks for four environment variables:

V=1 enables verbose output on stderr
D=1 enables the "dummy" mutation mode for debugging purposes
M=1 enables the "manual" mutation mode (i.e., read from stdin)
MODE=mode can be used to configure the mutation mode in the absence of D and M
- Valid mutation modes are sliding (default), havoc, and structured
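
The precedence described above can be sketched in Python. This is only an illustration of the documented behavior, not mishegos's actual implementation: the function name resolve_mutation_mode is hypothetical, the order in which D and M are checked is an assumption (the list above does not say which wins when both are set), and rejecting an unknown MODE value is also an assumption.

```python
# Documented values for MODE; "sliding" is the documented default.
VALID_MODES = ("sliding", "havoc", "structured")


def resolve_mutation_mode(env):
    """Return the mutation mode implied by an environment mapping.

    Hypothetical helper that mirrors the README's description only.
    """
    # Assumption: D is checked before M. The README does not specify
    # which of the two takes priority if both are set.
    if env.get("D") == "1":
        return "dummy"
    if env.get("M") == "1":
        return "manual"

    # MODE is only consulted in the absence of D and M.
    mode = env.get("MODE", "sliding")
    if mode not in VALID_MODES:
        # Assumption: the real tool may handle bad values differently.
        raise ValueError(f"unknown mutation mode: {mode!r}")
    return mode


if __name__ == "__main__":
    import os

    print(resolve_mutation_mode(os.environ))
```

Passing a plain dict instead of os.environ makes the helper easy to exercise without touching the real environment.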

Convert mishegos's raw output into JSONL suitable for analysis:

./src/mish2jsonl/mish2jsonl /tmp/mishegos > /tmp/mishegos.jsonl

mish2jsonl checks for V=1 to enable verbose output on stderr.
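
JSONL stores one JSON value per line, so the converted file can be consumed a record at a time. The following is a minimal sketch of such a reader in Python. The load_jsonl helper is hypothetical, and the field names in the sample ("input", "outputs") are assumptions made for illustration; inspect the real output of mish2jsonl for the actual schema.

```python
import io
import json


def load_jsonl(stream):
    """Yield one parsed JSON value per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample data standing in for /tmp/mishegos.jsonl.
sample = io.StringIO(
    '{"input": "90", "outputs": []}\n'
    '{"input": "c3", "outputs": []}\n'
)

records = list(load_jsonl(sample))
print(len(records), sorted(records[0].keys()))
```

To read a real file, replace the StringIO object with open("/tmp/mishegos.jsonl").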

Run an analysis/filter pass group on the results:

./src/analysis/analysis -p same-size-different-decodings < /tmp/mishegos.jsonl > /tmp/mishegos.interesting
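
To illustrate the kind of check that a pass named same-size-different-decodings suggests, here is a rough Python sketch. It is a guess based on the pass name alone and is not the project's analysis code. The record layout and the field names "outputs", "worker", "ndecoded", and "result" are all assumptions.

```python
def same_size_different_decodings(cohort):
    """Return True if every decoder consumed the same number of bytes
    but at least two of them produced different text.

    Hypothetical predicate; the field names are illustrative only.
    """
    outputs = cohort["outputs"]
    if len(outputs) < 2:
        return False  # nothing to compare against
    sizes = {o["ndecoded"] for o in outputs}
    texts = {o["result"] for o in outputs}
    return len(sizes) == 1 and len(texts) > 1


# Hypothetical cohorts: one input, the outputs of two decoders.
disagree = {"input": "00", "outputs": [
    {"worker": "a", "ndecoded": 2, "result": "add byte ptr [rax], al"},
    {"worker": "b", "ndecoded": 2, "result": "add [rax], al"},
]}
agree = {"input": "90", "outputs": [
    {"worker": "a", "ndecoded": 1, "result": "nop"},
    {"worker": "b", "ndecoded": 1, "result": "nop"},
]}

interesting = [c for c in (disagree, agree)
               if same_size_different_decodings(c)]
```

Note that a purely textual comparison like this flags formatting differences as well as real disagreements, which is why the sample pair above counts as interesting.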

Generate an ~~ugly~~ pretty visualization of the filtered results:

./src/mishmat/mishmat < /tmp/mishegos.interesting > /tmp/mishegos.html
open /tmp/mishegos.html

Contributing

We welcome contributors to mishegos!

A guide for adding new disassembler workers can be found in the project repository.

Performance notes

All numbers below correspond to the following run:

V=1 timeout 60s ./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

Outside Docker:

On a Linux desktop (Ubuntu 20.04, Ryzen 5 3600, 32GB DDR4):
- Commit d80063a
- 8 workers (no udis86) + 1 mishegos fuzzer process
- 8.7M outputs/minute
- 9 cores pinned
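
For a rough sense of scale, the headline figure can be converted to other units. The sketch below is plain arithmetic on the rounded 8.7M number. The per-worker figure assumes that outputs are spread evenly across the 8 workers, which the notes above do not state.

```python
OUTPUTS_PER_MINUTE = 8_700_000  # rounded figure from the run above
WORKERS = 8

# Overall rate per second.
per_second = OUTPUTS_PER_MINUTE / 60

# Assumption: each worker contributes an equal share of the outputs.
per_worker_per_minute = OUTPUTS_PER_MINUTE / WORKERS

print(per_second)             # 145000.0
print(per_worker_per_minute)  # 1087500.0
```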

TODO

Performance improvements
- Break cohort collection out into a separate process (requires re-addition of semaphores)
- Maybe use a better data structure for input/output/cohort slots
Add a scaling factor for workers, e.g. spawn N of each worker
Pre-analysis normalization (whitespace, immediate representation, prefixes)
Analysis strategies:
- Filter by length, decode status discrepancies
- Easy: lexical comparison
- Easy: reassembly + effects modeling (maybe with microx?)
Scoring ideas:
- Low value: Flag/prefix discrepancies
- Medium value: Decode success/failure/crash discrepancies
- High value: Decode discrepancies with differing control flow, operands, maybe some immediates
Visualization ideas:
- Basic but not really basic: some kind of mouse-over differential visualization
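
As an illustration of the pre-analysis normalization item above (whitespace and immediate representation), here is a minimal Python sketch. It is not part of mishegos. The normalize function and its rules are assumptions about what such a step could look like, and it does not handle prefixes or hex immediates printed without a 0x prefix or h suffix.

```python
import re

# Hex immediates written as 0x10 or as 10h (text is lowercased first).
_HEX_0X = re.compile(r"\b0x([0-9a-f]+)\b")
_HEX_H = re.compile(r"\b([0-9][0-9a-f]*)h\b")


def _to_decimal(match):
    """Rewrite a matched hex immediate as a decimal string."""
    return str(int(match.group(1), 16))


def normalize(disasm):
    """Canonicalize case, spacing, and hex immediates in one
    disassembly string so that lexical comparison is less noisy."""
    text = disasm.strip().lower()
    text = re.sub(r"\s+", " ", text)       # collapse runs of whitespace
    text = re.sub(r"\s*,\s*", ", ", text)  # uniform operand separators
    text = _HEX_0X.sub(_to_decimal, text)
    text = _HEX_H.sub(_to_decimal, text)
    return text


print(normalize("MOV  eax,0x10"))  # mov eax, 16
print(normalize("mov eax, 10h"))   # mov eax, 16
```

Requiring a leading digit in the h-suffixed form keeps register names such as ah and bh from being mistaken for immediates.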
