fireeye / Stringsifter

Licence: apache-2.0
A machine learning tool that ranks strings based on their relevance for malware analysis.

Programming Languages

python

Projects that are alternatives of or similar to Stringsifter

Simpleator
Simpleator ("Simple-ator") is an innovative Windows-centric x64 user-mode application emulator that leverages several new features that were added in Windows 10 Spring Update (1803), also called "Redstone 4", with additional improvements that were made in Windows 10 October Update (1809), aka "Redstone 5".
Stars: ✭ 260 (-44.56%)
Mutual labels:  malware-analysis, reverse-engineering
Polichombr
Collaborative malware analysis framework
Stars: ✭ 307 (-34.54%)
Mutual labels:  malware-analysis, reverse-engineering
Drltrace
Drltrace is a library calls tracer for Windows and Linux applications.
Stars: ✭ 282 (-39.87%)
Mutual labels:  malware-analysis, reverse-engineering
Flare Vm
No description or website provided.
Stars: ✭ 3,201 (+582.52%)
Mutual labels:  malware-analysis, reverse-engineering
Simplify
Android virtual machine and deobfuscator
Stars: ✭ 3,865 (+724.09%)
Mutual labels:  malware-analysis, reverse-engineering
stringsifter
A machine learning tool that ranks strings based on their relevance for malware analysis.
Stars: ✭ 567 (+20.9%)
Mutual labels:  strings, malware-analysis
Macbook
Source code accompanying the book "macOS Software Security and Reverse Engineering Analysis" (《macOS软件安全与逆向分析》)
Stars: ✭ 302 (-35.61%)
Mutual labels:  malware-analysis, reverse-engineering
Xapkdetector
APK/DEX detector for Windows, Linux and MacOS.
Stars: ✭ 208 (-55.65%)
Mutual labels:  malware-analysis, reverse-engineering
Gef
GEF (GDB Enhanced Features) - a modern experience for GDB with advanced debugging features for exploit developers & reverse engineers ☢
Stars: ✭ 4,197 (+794.88%)
Mutual labels:  malware-analysis, reverse-engineering
Pwndbg
Exploit Development and Reverse Engineering with GDB Made Easy
Stars: ✭ 4,178 (+790.83%)
Mutual labels:  malware-analysis, reverse-engineering
Drsemu
DrSemu - Sandboxed Malware Detection and Classification Tool Based on Dynamic Behavior
Stars: ✭ 237 (-49.47%)
Mutual labels:  malware-analysis, reverse-engineering
Dex Oracle
A pattern based Dalvik deobfuscator which uses limited execution to improve semantic analysis
Stars: ✭ 398 (-15.14%)
Mutual labels:  malware-analysis, reverse-engineering
Shed
.NET runtime inspector
Stars: ✭ 229 (-51.17%)
Mutual labels:  malware-analysis, reverse-engineering
Pev
The PE file analysis toolkit
Stars: ✭ 422 (-10.02%)
Mutual labels:  malware-analysis, reverse-engineering
Radare2
UNIX-like reverse engineering framework and command-line toolset
Stars: ✭ 15,412 (+3186.14%)
Mutual labels:  malware-analysis, reverse-engineering
Freki
🐺 Malware analysis platform
Stars: ✭ 285 (-39.23%)
Mutual labels:  malware-analysis, reverse-engineering
Cmulator
Cmulator is ( x86 - x64 ) Scriptable Reverse Engineering Sandbox Emulator for shellcode and PE binaries . Based on Unicorn & Zydis Engine & javascript
Stars: ✭ 197 (-58%)
Mutual labels:  malware-analysis, reverse-engineering
Lief
Authors
Stars: ✭ 2,730 (+482.09%)
Mutual labels:  malware-analysis, reverse-engineering
Idenlib
idenLib - Library Function Identification [This project is not maintained anymore]
Stars: ✭ 322 (-31.34%)
Mutual labels:  malware-analysis, reverse-engineering
Drakvuf Sandbox
DRAKVUF Sandbox - automated hypervisor-level malware analysis system
Stars: ✭ 384 (-18.12%)
Mutual labels:  malware-analysis, reverse-engineering

StringSifter is a machine learning tool that automatically ranks strings based on their relevance for malware analysis.

Quick Links

Usage

StringSifter requires Python version 3.6 or newer. Run the following commands to get the code, run unit tests, and use the tool:

Installation

Use pip to get running immediately. Choose the StringSifter major version that corresponds to your version of Python:

Python Version | StringSifter Version | Branch    | Example Pip Command
3.8+           | 2.x                  | master    | pip install stringsifter~=2.0
3.6, 3.7       | 1.x                  | python3.7 | pip install stringsifter~=1.0

For development, check out the correct branch for your Python version or stay on master for the latest supported version. Then use pipenv:

git clone https://github.com/fireeye/stringsifter.git
cd stringsifter
git checkout python3.7  # optional: only needed for Python 3.6 or 3.7
pipenv install --dev

Running Unit Tests

To run unit tests from the StringSifter installation directory:

pipenv run tests

Running from the Command Line

The pip install command installs two runnable scripts, flarestrings and rank_strings, into your Python environment. When developing from source, use pipenv run flarestrings and pipenv run rank_strings instead.

flarestrings mimics features of GNU binutils' strings, and rank_strings accepts piped input, for example:

flarestrings <my_sample> | rank_strings
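
The same pipeline can be driven from a script. The following is a minimal sketch, assuming both commands are on the PATH; the helper names pipe_commands and rank_sample are hypothetical and not part of StringSifter:

```python
import subprocess


def pipe_commands(producer, consumer):
    """Run `producer | consumer` and return the consumer's stdout as text."""
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(
        consumer, stdin=first.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits early.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    if first.returncode != 0 or second.returncode != 0:
        raise RuntimeError("pipeline failed")
    return output


def rank_sample(sample_path):
    """Hypothetical wrapper for: flarestrings <sample_path> | rank_strings"""
    ranked = pipe_commands(["flarestrings", sample_path], ["rank_strings"])
    return ranked.splitlines()
```

Calling rank_sample("/path/to/sample") would return the ranked strings as a list, in the order rank_strings printed them.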

rank_strings supports a number of command line arguments. The positional argument input_strings specifies a file of strings to rank. The optional arguments are:

Option           | Meaning
--scores (-s)    | Include the rank scores in the output
--limit (-l)     | Limit output to the top limit ranked strings
--min-score (-m) | Limit output to strings with score >= min-score
--batch (-b)     | Specify a folder of strings outputs for batch processing

Ranked strings are written to standard output unless the --batch option is specified, in which case the ranked outputs are written to files named <input_file>.ranked_strings.
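
To make the semantics of --limit and --min-score concrete, here is a small sketch of the filtering they describe. It is not rank_strings' actual code: the (score, string) pairs and the function name filter_ranked are assumptions made for illustration.

```python
def filter_ranked(scored_strings, limit=None, min_score=None):
    """Filter (score, string) pairs the way --limit and --min-score describe.

    The pair layout is a hypothetical stand-in for the tool's internal data.
    """
    # Highest score first, as in a ranked listing.
    ranked = sorted(scored_strings, key=lambda pair: pair[0], reverse=True)
    if min_score is not None:
        # --min-score: keep strings with score >= min_score.
        ranked = [pair for pair in ranked if pair[0] >= min_score]
    if limit is not None:
        # --limit: keep only the top `limit` strings.
        ranked = ranked[:limit]
    return ranked
```

Because the list is sorted by descending score, applying the two filters in either order gives the same result.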

flarestrings supports an option -n (or --min-len) to print sequences of characters that are at least min-len characters long, instead of the default 4. For example:

flarestrings -n 8 <my_sample> | rank_strings

will print and rank only strings of length 8 or greater.
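
For intuition about what a minimum length does, the sketch below extracts runs of printable ASCII characters from raw bytes. It is a simplified approximation of strings-style extraction, not the flarestrings implementation; the function name ascii_strings and the exact character range are assumptions.

```python
import re


def ascii_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in data."""
    # 0x20-0x7e covers space through tilde; real tools may accept more
    # characters (for example, tab).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

With min_len=8, shorter runs that the default of 4 would keep are dropped, which mirrors the effect of -n 8 in the command above.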

Running from a Docker container

  • After cloning the repo, build the container. From the package's top level directory:
docker build -t stringsifter -f docker/Dockerfile .
  • Run the container with the flarestrings or rank_strings argument to use the respective command. The containerized commands can be used in pipelines:
cat <my_sample> | docker run -i stringsifter flarestrings | docker run -i stringsifter rank_strings
  • Or, run the container without arguments to get a shell prompt, using the -v flag to expose a host directory to the container:
docker run -v <my_malware>:/samples -it stringsifter

where <my_malware> contains samples for analysis, for example:

docker run -v $HOME/malware/binaries:/samples -it stringsifter
  • At the container prompt:
flarestrings /samples/<my_sample> | rank_strings <options>

All command line arguments are supported in the containerized scripts.

Running on FLOSS Output

StringSifter can be applied to arbitrary lists of strings, making it useful for practitioners looking to glean insights from alternative intelligence-gathering sources such as live memory dumps, sandbox runs, or binaries that contain obfuscated strings. For example, FireEye Labs Obfuscated Strings Solver (FLOSS) extracts printable strings just as Strings does, but additionally reveals obfuscated strings that have been encoded, packed, or manually constructed on the stack. It can be used as an in-line replacement for Strings, meaning that StringSifter can be similarly invoked on FLOSS output using the following command:

$PY2_VENV/bin/floss -q <options> <my_sample> | rank_strings <options>

Notes:

  1. The -q argument suppresses headers and formatting to show only extracted strings. To learn more about additional FLOSS options, please see its Usage Docs.
  2. FLOSS requires Python 2, while StringSifter requires Python 3. In the example command, at least one of floss or rank_strings must include a relative path referencing a Python virtual environment.
  3. FLOSS can be downloaded as a standalone executable. In this case it is not required to specify a Python environment because the executable does not rely on a Python interpreter.

Notes on running strings

This distribution includes the flarestrings program to ensure predictable output across platforms. If you choose to run your system's installed strings instead, note that its options are not consistent across versions and platforms:

Linux

Most Linux distributions include the strings program from GNU Binutils. To extract both "wide" and "narrow" strings the program must be run twice, piping to an output file:

strings <my_sample>       > strs.txt   # narrow strings
strings -el <my_sample>  >> strs.txt   # wide strings.  note the ">>"
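
The difference between the two passes can be illustrated in Python. This is a rough sketch, assuming "narrow" means runs of printable ASCII bytes and "wide" means the same characters encoded as UTF-16LE (each followed by a zero byte); the function names are hypothetical, and GNU strings itself may treat edge cases differently.

```python
import re


def narrow_strings(data, min_len=4):
    """Runs of printable ASCII bytes, one byte per character."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def wide_strings(data, min_len=4):
    """Runs of printable ASCII characters stored as UTF-16LE."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


def narrow_then_wide(data, min_len=4):
    """Narrow strings first, then wide, like the two appended passes above."""
    return narrow_strings(data, min_len) + wide_strings(data, min_len)
```

A single narrow pass misses wide strings because the interleaved zero bytes break every run of printable characters, which is why two passes are needed.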

MacOS

Some versions of BSD strings packaged with MacOS do not support wide strings. Also note that the -a option, which makes strings scan the whole file, may be disabled in the default configuration. Without -a, informative strings may be lost. We recommend installing GNU Binutils via Homebrew or MacPorts to get a version of strings that supports wide characters. Take care to invoke the correct version of strings.

Windows

strings is not installed by default on Windows. We recommend installing Windows Sysinternals, Cygwin, or Malcode Analyst Pack to get a working strings.

Discussion

This version of StringSifter was trained using Strings outputs from sampled malware binaries associated with the first EMBER dataset. Ordinal labels were generated using weak supervision procedures, and supervised learning is performed by Gradient Boosted Decision Trees with a learning-to-rank objective function. See Quick Links for further technical details. Please note that neither labeled data nor training code is currently available, though we may reconsider this approach in future releases.

Issues

We use GitHub Issues for posting bugs and feature requests.

Acknowledgements

  • Thanks to the FireEye Data Science (FDS) and FireEye Labs Reverse Engineering (FLARE) teams for review and feedback.
  • StringSifter was designed and developed by Philip Tully (FDS), Matthew Haigh (FLARE), Jay Gibble (FLARE), and Michael Sikorski (FLARE).
  • The StringSifter logo was designed by Josh Langner (FLARE).
  • flarestrings is derived from the excellent tool FLOSS.