blackthorne / Codetective

a tool to determine the crypto/encoding algorithm used according to traces from its representation

Codetective

Sometimes we run into hashes and other artefacts and can't figure out where they came from or how they were generated. This tool recognises the output format of many different algorithms in many possible encodings for analysis purposes. It also infers a level of certainty for each finding based on traces of its representation.

This may be useful, for example, when you are testing systems from a security perspective and are able to grab a password file with hashed contents, perhaps from an exposed backup file or by dumping memory. It may also be useful as part of a fingerprinting process or simply to verify valid implementations of different algorithms. You may also try running this tool against network traffic captures or large source code repositories to look for interesting findings.

You can use it either as a generic standalone script or as a plugin for the Volatility framework. The usage is similar.

Changelog

Version 0.8.2

  • Added detection for JWT tokens
  • Added generic secrets detection

Version 0.8.1

Finally, a new version of Codetective is ready. This is close to a complete rewrite with new features and many, many bug fixes. The code is still not something to look at, but it’s now a lot more object-oriented and easier to maintain. Codetective can now report the exact location of a finding, and the ‘certainty’ feature is now numeric so that it can combine multiple factors with different weights and be more precise. New entropy checks, added to better detect cryptographic findings, also feed into this value, and you can limit the displayed results to findings with a specified minimum level of certainty (-m number). Data is also loaded into memory differently: Codetective now breaks it into slices and analyses them in turn, which prevents it from filling the memory when reading large files. To avoid losing results, an overlapping window makes sure no findings are lost at the slice boundaries (*). A verbose mode was also added, so it’s now possible to see the progress status, which can be useful for longer runs.
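The changelog does not say how the entropy checks are computed. A common choice for this kind of check is Shannon entropy over the characters of a candidate string, sketched below in Python 3 (the tool itself targets Python 2). The function name `shannon_entropy` is hypothetical and this is not Codetective's own code.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of `text` in bits per character.

    Hypothetical helper: an assumption about what an entropy check
    could look like, not taken from Codetective's source.
    """
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A hex digest spreads its characters over many symbols, so it scores
# far higher than a repetitive string of the same length.
digest = "79b61b093c3c063fd45f03d55493902f"
repetitive = "a" * len(digest)
print(shannon_entropy(digest) > shannon_entropy(repetitive))
```

A score like this could be one of the weighted factors behind a numeric certainty value, although the actual weights used by the tool are not documented here.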

Directory mode (-d rootPath) tells Codetective to look at folders rather than files, and you can use it in recursive mode (-r) as well. If applied to large amounts of data, it might be useful to filter the output by minimum certainty level, e.g. -m 70. Codetective also supports reading data from standard input (-s).
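As a rough illustration of what directory mode has to do, the Python 3 sketch below walks a root folder, optionally recursing, and keeps only the file names that match a pattern (compare the -d, -r and -fp options). The function name `iter_files` and its parameters are hypothetical, not Codetective's internals.

```python
import fnmatch
import os


def iter_files(root, pattern="*", recursive=False):
    """Yield paths under `root` whose file name matches `pattern`.

    Hypothetical helper written for illustration only.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)
        if not recursive:
            # Emptying dirnames in place stops os.walk from descending
            # into subdirectories.
            dirnames[:] = []


if __name__ == "__main__":
    for path in iter_files(".", pattern="*.txt", recursive=True):
        print(path)
```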

A new filter called ‘personal’ was added for findings that may include personal data such as phone numbers and credit card numbers, which are now supported as well. Other newly supported formats include web cookies and URLs. With the ‘generator’ option, Codetective will try to load all encodings supported by your Python environment (the ‘aliases’ module) and apply them in order to find something meaningful. This can be used to try multiple encodings (-g encode), decodings (-g decode) or both (-g both). Preprocessors (-p) now allow binary structures to be converted into strings first, using C struct packing format strings. Finally, the ‘validators’ function now provides a powerful way to filter outputs. You may use up to 3 validators per run using the -v1, -v2 and -v3 parameters respectively. Each validator is a combination of a predicate, ‘ALL’ or ‘HAS’, with a matching function such as UPPER for findings in upper case. Functions currently supported are:

  • NUMERIC
  • ALPHA
  • LOWER
  • UPPER
  • ALPHANUMERIC
  • SYMBOL

For a custom validator you can also provide your own regular expression using the predicate SEARCH followed by the desired regular expression.
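The validator idea can be sketched in Python 3 as follows. The semantics are an assumption: ‘ALL’ is read here as "every character satisfies the function" and ‘HAS’ as "at least one character does". The names `CHAR_TESTS` and `make_validator` are hypothetical, and the tool's real matching rules may differ.

```python
import re

# Per-character tests for the documented function names (assumed semantics).
CHAR_TESTS = {
    "NUMERIC": str.isdigit,
    "ALPHA": str.isalpha,
    "LOWER": str.islower,
    "UPPER": str.isupper,
    "ALPHANUMERIC": str.isalnum,
    "SYMBOL": lambda ch: not ch.isalnum() and not ch.isspace(),
}


def make_validator(predicate, argument):
    """Build a function that accepts (True) or rejects (False) a finding."""
    if predicate == "SEARCH":
        # For SEARCH, the argument is a user-supplied regular expression.
        regex = re.compile(argument)
        return lambda finding: regex.search(finding) is not None
    test = CHAR_TESTS[argument]
    if predicate == "ALL":
        return lambda finding: bool(finding) and all(test(ch) for ch in finding)
    if predicate == "HAS":
        return lambda finding: any(test(ch) for ch in finding)
    raise ValueError("unknown predicate: %r" % predicate)


has_digit = make_validator("HAS", "NUMERIC")
looks_like_md5 = make_validator("SEARCH", r"^[0-9a-f]{32}$")
print(has_digit("abc1"), looks_like_md5("79b61b093c3c063fd45f03d55493902f"))
```

Under this reading, a finding passes a run only if every configured validator returns True for it.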

(*) You may get duplicate results because of this; this will be addressed in a future version.
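To see why an overlapping window produces duplicates, consider the Python 3 sketch below. It slices the input so that consecutive slices share `overlap` characters, scans each slice with a stand-in pattern (32 hex characters), and removes repeats by keying each match on its absolute offset. All names, the slice sizes and the de-duplication step are assumptions made for illustration; they are not Codetective's implementation or the author's planned fix.

```python
import re

HEX32 = re.compile(r"[0-9a-fA-F]{32}")  # stand-in pattern for the sketch


def iter_slices(data, size, overlap):
    """Yield (offset, chunk) pairs; consecutive chunks share `overlap` items."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    for start in range(0, max(len(data) - overlap, 1), step):
        yield start, data[start:start + size]


def find_raw(data, size=64, overlap=32):
    """Return every match per slice; matches in the overlap appear twice."""
    return [
        (offset + m.start(), m.group())
        for offset, chunk in iter_slices(data, size, overlap)
        for m in HEX32.finditer(chunk)
    ]


def find_unique(data, size=64, overlap=32):
    """Same scan, but drop repeats that share an absolute offset and value."""
    seen = set()
    unique = []
    for key in find_raw(data, size, overlap):
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


sample = "x" * 32 + "79b61b093c3c063fd45f03d55493902f" + "y" * 32
print(len(find_raw(sample)), len(find_unique(sample)))
```

The overlap has to be at least as long as the longest finding you expect, otherwise a finding that straddles a boundary can still be missed.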

Supported filters are: win, web, unix, db, personal, crypto and other. Supported algorithms and formats are:

  • web-cookie
  • mssql2000
  • md5
  • URL
  • md4
  • phone number
  • credit cards
  • mssql2005
  • lm hash
  • ntlm hash
  • MySQL4+
  • MySQL323
  • base64
  • SAM(*:ntlm)
  • SAM(lm:*)
  • SAM(lm:ntlm)
  • RipeMD320
  • sha1
  • sha224
  • sha256
  • sha384
  • sha512
  • whirlpool
  • CRC
  • des-salt-unix
  • sha256-salt-django
  • sha256-django
  • sha384-salt-django
  • sha384-django
  • sha256-salt-unix
  • sha512-salt-unix
  • apr1-salt-unix
  • md5-salt-unix
  • md5-wordpress
  • md5-phpBB3
  • md5-joomla2
  • md5-salt-joomla2
  • md5-joomla1
  • md5-salt-joomla1
  • blowfish-salt-unix
  • uuid
  • JWT
  • secrets in code

Examples

$ python codetective.py -r -d mypath/ -m 80

Examples (old)

$ python codetective.py '79b61b093c3c063fd45f03d55493902f'
confident: ['md5']
likely: ['lm', 'ntlm', 'md5-joomla2', 'md5-joomla1']
possible: ['md4', 'base64']

$ python codetective.py '79B61B093C3C063FD45F03D55493902F'
confident: ['md5', 'lm', 'ntlm']
possible: ['md4', 'base64']

$ python codetective.py '79B61B093C3C063FD45F03D55493902F:*'
confident: ['md5', 'SAM(lm:*)']
likely: ['lm', 'ntlm']
possible: ['md4']

$ python codetective.py -t win '79B61B093C3C063FD45F03D55493902F:*'
confident: ['SAM(lm:*)']
likely: ['md5', 'lm', 'ntlm']
possible: ['md4']

$ python codetective.py -a -t win '79B61B093C3C063FD45F03D55493902F:*'
confident: ['SAM(lm:*)']
    	hashes in SAM file - LM:79B61B093C3C063FD45F03D55493902F        NTLM:not defined
likely: ['md5', 'lm', 'ntlm']
possible: ['md4']


$ python vol.py codetective -n notepad -v -f JOHN-2CF071298B-20120318-024807.raw 
Volatile Systems Volatility Framework 2.0

Found 29 tasks
kernel mapping...
Calculating task mappings...
Process: wuauclt.exe	PPID: 1120	Pid: 1908
Process: vmtoolsd.exe	PPID: 688	Pid: 348
Process: vmacthlp.exe	PPID: 688	Pid: 912
Process: svchost.exe	PPID: 688	Pid: 976
Process: smss.exe	PPID: 4	Pid: 384
Process: explorer.exe	PPID: 1680	Pid: 1696
Process: cmd.exe	PPID: 1696	Pid: 1520
Process: svchost.exe	PPID: 688	Pid: 160
Process: vmtoolsd.exe	PPID: 1696	Pid: 1820
Process: lsass.exe	PPID: 644	Pid: 700
Process: services.exe	PPID: 644	Pid: 688
Process: alg.exe	PPID: 688	Pid: 1936
Process: svchost.exe	PPID: 688	Pid: 924
Process: csrss.exe	PPID: 384	Pid: 620
Process: svchost.exe	PPID: 688	Pid: 1208
Process: TPAutoConnSvc.e	PPID: 688	Pid: 1220
Process: spoolsv.exe	PPID: 688	Pid: 1572
Process: svchost.exe	PPID: 688	Pid: 1172
Process: svchost.exe	PPID: 688	Pid: 1120
Process: winlogon.exe	PPID: 384	Pid: 644
Process: rundll32.exe	PPID: 1696	Pid: 1788
Process: TPAutoConnect.e	PPID: 1220	Pid: 1256
Process: notepad.exe	PPID: 1696	Pid: 1896

=> at offset Virtual: 0x8012e000  	Physical: 0x12e000    	 Size: 0x1000        
 Found md5 (likely) 	MD5 hash: 0A5AE0AB474FF954BA5FB5CC22691599
 Found md4 (possible) 	MD4 hash: 0A5AE0AB474FF954BA5FB5CC22691599

=> at offset Virtual: 0x8013d000  	Physical: 0x13d000    	 Size: 0x1000        
 Found md5 (likely) 	MD5 hash: 4BE2C18D9154D0240B36AEF861085FEC
 Found md4 (possible) 	MD4 hash: 4BE2C18D9154D0240B36AEF861085FEC

=> at offset Virtual: 0x80153000  	Physical: 0x153000    	 Size: 0x1000        
 Found md5 (likely) 	MD5 hash: 0B79C053C7D38EE4AB9A00CB3B5D2472
 Found md4 (possible) 	MD4 hash: 0B79C053C7D38EE4AB9A00CB3B5D2472

=> at offset Virtual: 0x80171000  	Physical: 0x171000    	 Size: 0x1000        
 Found md5 (likely) 	MD5 hash: 10F84F9347B42F6428155C59A743D317
 Found md4 (possible) 	MD4 hash: 10F84F9347B42F6428155C59A743D317

=> at offset Virtual: 0x8018a000  	Physical: 0x18a000    	 Size: 0x1000        
 Found md5 (likely) 	MD5 hash: 5046ab8cb6b1ce11920c00aa006c4972
 Found md4 (possible) 	MD4 hash: 5046ab8cb6b1ce11920c00aa006c4972

=> at offset Virtual: 0x801a2000  	Physical: 0x1a2000    	 Size: 0x1000        
 Found md5 (likely) 	MD5 hash: C25F308FAE39B3A4D9E1561F679CD8AA
 Found md4 (possible) 	MD4 hash: C25F308FAE39B3A4D9E1561F679CD8AA
...	



$ python codetective.py -a -f test.txt 
Administrator:500:CC5E9ACBAD1B25C9AAD3B435B51404EE:996E6760CDDD8815A2C24A110CF040FB::: : {'confident': ['md5', 'SAM(lm:ntlm)'], 'likely': ['lm', 'ntlm'], 'possible': ['md4', 'des-salt-unix']}
     hashes in SAM file - LM:CC5E9ACBAD1B25C9AAD3B435B51404EE        NTLM:996E6760CDDD8815A2C24A110CF040FB
    	UNIX shadow file using salted DES - salt:Ad     hash:ministrator
ibrahim:$1$hanhd/cF$3lzrzB14HceT7uc3oTmog1:14323:0:99999:7::: : {'confident': ['md5-salt-unix'], 'likely': [], 'possible': []}
    	UNIX shadow file using salted MD5 - salt:hanhd/cF       hash:3lzrzB14HceT7uc3oTmog1
563DE3D2F07D0747BBE4BA2697AE33AA : {'confident': ['md5'], 'likely': ['lm', 'ntlm'], 'possible': ['md4', 'base64']}
	base64 decoded string: ??p?N?Ӿ;8
463C8A7593A8A79078CB5C119424E62A : {'confident': ['md5'], 'likely': ['lm', 'ntlm'], 'possible': ['md4', 'base64']}
    	base64 decoded string: ?????p<?t????-u?????
E852191079EA08B654CCF4C2F38A162E3E84EE04 : {'confident': [], 'likely': ['sha1'], 'possible': ['base64']}
    	base64 decoded string: ?v??t????z瀂??׭??O8M8
94F94C9C97BFA92BD267F70E2ABD266B069428C282F30AD521D486A069918925 : {'confident': [], 'likely': ['sha256'], 'possible': ['base64']}
    	base64 decoded string: ??}?/B??E݁n???Cۮ?ӯx????aw???P??4??u?ݹ
sha384$12345678$c0be393a500c7d42b1bd03a1a0a76302f7f472fc132f11ea6373659d0bd8675d04e12d8016d83001c327f0ab70843dd5 : {'confident': [], 'likely': ['sha384', 'sha384-salt-django'], 'possible': []}
    	Django shadow file using salted SHA384 - salt:12345678  hash:c0be393a500c7d42b1bd03a1a0a76302f7f472fc132f11ea6373659d0bd8675d04e12d8016d83001c327f0ab70843dd5
5850478A34D818CE : {'confident': [], 'likely': ['mysql323'], 'possible': ['base64']}
    	base64 decoded string: ??t?߀????
    	MySQL v3.23 or previous hash: ['5850478A34D818CE']
08EE13E9A295641BE6158366C0651B84A1AD9E47 : {'confident': [], 'likely': ['sha1'], 'possible': ['base64']}
    	base64 decoded string: ???q=oy?A?y?~?
                                         N??8P?N;
****:7db9d24c238b77af11b99f0a67e99abe  : {'confident': ['md5'], 'likely': ['lm', 'ntlm', 'md5-joomla1'], 'possible': ['md4']}
    	Joomla v1 MD5 - hash:7db9d24c238b77af11b99f0a67e99abe
****:d2f46e7173b1d88c9d7b2f52271cd8af:YEfafQuaj58ExG3V  : {'confident': ['md5', 'md5-salt-joomla1'], 'likely': ['lm', 'ntlm'], 'possible': ['md4']}
    	Joomla v1 salted MD5 - hash:d2f46e7173b1d88c9d7b2f52271cd8af    salt:YEfafQuaj58ExG3V
****:4aad84c0929c72f1c72a9c884e5c0f18:tNT52oL0I8ClmMjO  : {'confident': ['md5', 'md5-salt-joomla1'], 'likely': ['lm', 'ntlm'], 'possible': ['md4']}
    	Joomla v1 salted MD5 - hash:4aad84c0929c72f1c72a9c884e5c0f18    salt:tNT52oL0I8ClmMjO
****:1ad6692b7e3b2deb36606603ced0c8b6:LhiqX4pL3s8xy0qd  : {'confident': ['md5', 'md5-salt-joomla1'], 'likely': ['lm', 'ntlm'], 'possible': ['md4']}
    	Joomla v1 salted MD5 - hash:1ad6692b7e3b2deb36606603ced0c8b6    salt:LhiqX4pL3s8xy0qd
dGVzdGUK : {'confident': [], 'likely': [], 'possible': ['base64']}
    	base64 decoded string: teste
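
The analyse output above splits a salted MD5 shadow entry into salt and hash. A minimal Python 3 sketch of that kind of field expansion is shown below, assuming the standard md5-crypt layout `$1$<salt>$<hash>` in the second colon-separated field. The names `MD5_CRYPT` and `split_md5_crypt` are hypothetical and this is not the tool's own parser.

```python
import re

# md5-crypt in the modular crypt format: $1$<salt up to 8 chars>$<22-char hash>
MD5_CRYPT = re.compile(
    r"^\$1\$(?P<salt>[./0-9A-Za-z]{1,8})\$(?P<hash>[./0-9A-Za-z]{22})$"
)


def split_md5_crypt(shadow_line):
    """Return (salt, hash) for an md5-crypt shadow line, or None."""
    fields = shadow_line.split(":")
    if len(fields) < 2:
        return None
    match = MD5_CRYPT.match(fields[1])
    if match is None:
        return None
    return match.group("salt"), match.group("hash")


line = "ibrahim:$1$hanhd/cF$3lzrzB14HceT7uc3oTmog1:14323:0:99999:7:::"
print(split_md5_crypt(line))
# prints: ('hanhd/cF', '3lzrzB14HceT7uc3oTmog1')
```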

Usage

Generic version:

usage: codetective.py [-h] [-t filters] [-a] [-v] [-m MIN_CERTAINTY] [-p PREPROCESSOR] [-g GENERATOR] [-v1 VALIDATOR1] [-v2 VALIDATOR2] [-v3 VALIDATOR3] [-r] [-f FILENAME] [-d DIRECTORY] [-fp FILE_PATTERN] [-l] [-s] [-ver] [string]

a tool to determine the crypto/encoding algorithm used according to traces of
its representation

positional arguments:
  string                determine algorithm used for <string> according to its
		    data representation

optional arguments:
  -h, --help            show this help message and exit
  -t filters            filter by source of your string. can be: win, web, db,
		    unix or other
  -a, -analyze          show more details whenever possible (expands shadow
		    files fields,...)
  -v, -verbose          verbose mode shows progress status (useful for large
		    files) and time taken
  -m MIN_CERTAINTY, -minimum-certainty MIN_CERTAINTY
		    specify the minimum acceptable certainty level for
		    displayed results (0 - 100)
  -p PREPROCESSOR, --preprocessor PREPROCESSOR
		    <struct format string> interpret bytes as packed
		    binary data. Unpacks contents from different data and
		    endianess types according to format strings patterns
		    as specified on:
		    https://docs.python.org/2/library/struct.html
  -g GENERATOR, -generator GENERATOR
		    find encoding/decoding algorithm that exposes
		    interesting artifacts (choose: 'encode', 'decode',
		    'both')
  -v1 VALIDATOR1, -validator1 VALIDATOR1
		    applies validator 1
  -v2 VALIDATOR2, -validator2 VALIDATOR2
		    applies validator 2
  -v3 VALIDATOR3, -validator3 VALIDATOR3
		    applies validator 3
  -r, -recursive        sets recursive mode upon specified directory (current
		    workdir by default). Consider using it with
		    min_certainty option
  -f FILENAME, -file FILENAME
		    load a specified file
  -d DIRECTORY, -directory DIRECTORY
		    load a specified directory
  -fp FILE_PATTERN, -file-pattern FILE_PATTERN
		    specified which file pattern to be used with directory
		    (default: '*')
  -l, -list             lists supported algorithms
  -s, -stdin            read data from standard input
  -ver, -version        displays software version

use filters for more accurate results. Report bugs, ideas, feedback to:
[email protected]
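
The -p option in the help text above takes a struct format string. The Python 3 sketch below shows what unpacking a binary record with such a format string and turning the fields into text might look like. The function name `unpack_to_text` and the way fields are joined are assumptions, not the tool's actual preprocessing code.

```python
import struct


def unpack_to_text(raw, fmt):
    """Unpack `raw` with a struct format string and render the fields as text.

    Hypothetical helper: byte fields are decoded as latin-1, other fields
    are converted with str().
    """
    size = struct.calcsize(fmt)
    fields = struct.unpack(fmt, raw[:size])
    return " ".join(
        f.decode("latin-1") if isinstance(f, bytes) else str(f) for f in fields
    )


# "<I16s" means: little-endian, one unsigned 32-bit int, then 16 raw bytes.
record = struct.pack("<I16s", 500, b"0123456789abcdef")
print(unpack_to_text(record, "<I16s"))
# prints: 500 0123456789abcdef
```

Once the binary structure has been rendered as text like this, the usual pattern matching can run on the resulting string.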

As a Volatility v2.0 plugin:

$ python vol.py codetective -h
Volatile Systems Volatility Framework 2.0
Usage: Volatility - A memory forensics analysis platform.

Options:
  -h, --help            list all available options and their default values.
                        Default values may be set in the configuration file
                        (/etc/volatilityrc)
  --conf-file=/your/home/.volatilityrc
                        User based configuration file
  -d, --debug           Debug volatility
  --info                Print information about all registered objects
  --plugins=PLUGINS     Additional plugin directories to use (colon separated)
  --cache-directory=/Users/blackthorne/.cache/volatility
                        Directory where cache files are stored
  --no-cache            Disable caching
  --tz=TZ               Sets the timezone for displaying timestamps
  -f FILENAME, --filename=FILENAME
                        Filename to use when opening an image
  --output=text         Output in this format (format support is module
                        specific)
  --output-file=OUTPUT_FILE
                        write output in this file
  -v, --verbose         Verbose information
  -k KPCR, --kpcr=KPCR  Specify a specific KPCR address
  -g KDBG, --kdbg=KDBG  Specify a specific KDBG virtual address
  --dtb=DTB             DTB Address
  --cache-dtb           Cache virtual to physical mappings
  --use-old-as          Use the legacy address spaces
  -w, --write           Enable write support
  --profile=WinXPSP2x86
                        Name of the profile to load
  -l LOCATION, --location=LOCATION
                        A URN location from which to load an address space
  -n PNAME, --pname=PNAME
                        define target Process name
  -p PID, --pid=PID     define target Process ID
  -t FILTERS, --filters=FILTERS
                        apply filters, can be: win, web, unix, db and other
                        (default: none)
  -u, --uuids           include UUIDS in search (default: No)

---------------------------------
Module Codetective
---------------------------------
determine the crypto/encoding algorithm used according to traces from its representation

Requirements

python v2.4 - 2.7

Discussion

This script relies heavily on regular expressions, written with the aim of rejecting as many alternatives as possible for each submitted hash/code while never rejecting valid choices. This code still requires proper testing, so you're welcome to contribute more algorithms and send me feedback. Note that this tool also infers a confidence level for each guess and can give you a preliminary analysis (-a).
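The regular-expression approach can be illustrated with a small Python 3 sketch: each candidate format gets a strict pattern, and a value is reported only for the formats it does not contradict. The patterns below are simplified stand-ins chosen for this example and the names `PATTERNS` and `candidates` are hypothetical. The tool's real expressions and its certainty scoring are not reproduced here.

```python
import re

# Simplified stand-in patterns, not Codetective's actual expressions.
PATTERNS = {
    "mysql323": re.compile(r"^[0-9a-fA-F]{16}$"),
    "md5": re.compile(r"^[0-9a-fA-F]{32}$"),
    "sha1": re.compile(r"^[0-9a-fA-F]{40}$"),
    "sha256": re.compile(r"^[0-9a-fA-F]{64}$"),
    "md5-salt-unix": re.compile(r"^\$1\$[./0-9A-Za-z]{1,8}\$[./0-9A-Za-z]{22}$"),
}


def candidates(value):
    """Return the names of all formats whose pattern accepts `value`."""
    return sorted(name for name, rx in PATTERNS.items() if rx.match(value))


print(candidates("79b61b093c3c063fd45f03d55493902f"))
# prints: ['md5']
print(candidates("$1$hanhd/cF$3lzrzB14HceT7uc3oTmog1"))
# prints: ['md5-salt-unix']
```

Length and alphabet alone cannot separate formats such as md5, md4, lm and ntlm, which is why the tool reports several candidates with different levels of confidence rather than a single answer.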

Note that results improve when filters (-t) are specified: if you know that the source of the file is related to the web, Codetective will have more confidence when considering web application frameworks such as Joomla or Django. If you find this tool useful, there is another tool called Hash Identifier (http://code.google.com/p/hash-identifier/) that works in a different way and may also be helpful to you.

When using Codetective as a plugin for Volatility, remember to use the UUID search option (-u) if you need it, since it is disabled by default. The most relevant options are -u (show UUIDs), -v (verbose mode), -t (filters), -p (search by process ID) and -n (search by process name). If neither -p nor -n is defined, it will search in all processes.

Finally, when passing strings on the command line, always wrap your input string in single quotes '' (at least if you're using bash) so that special characters such as '!' don't interfere with the input before it gets processed by Codetective.
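If you build the command line from another script, Python 3's standard `shlex.quote` applies the same single-quote wrapping for you. The sketch below only constructs the command string. The sample value is made up for illustration.

```python
import shlex

# A made-up input containing characters that bash would otherwise interpret.
value = "pa!ss$word"

command = "python codetective.py " + shlex.quote(value)
print(command)
# prints: python codetective.py 'pa!ss$word'
```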
