ksharinarayanan / SourceWolf

Licence: MIT license
Amazingly fast response crawler to find juicy stuff in the source code! 😎🔥

SourceWolf

Tested environments: Windows, macOS, Linux, and Windows Subsystem for Linux (WSL)

Sponsors

Support this project by becoming a sponsor.

What can SourceWolf do?

  • Crawl through responses to find hidden endpoints, either by sending requests, or from the local response files (if any).

  • Create a list of javascript variables found in the source

  • Extract all the social media links from the websites to identify potentially broken links

  • Brute force a host using a wordlist.

  • Get the status codes for a list of URLs, or filter out the live domains from a list of hosts.

All the features mentioned above execute with great speed.

  • SourceWolf uses the Session object from the requests library, so it reuses the underlying TCP connection across requests, which makes it fast.

  • SourceWolf gives you the option to crawl response files locally, so that you do not send requests again to an endpoint whose response you already have a copy of.

  • The final endpoints are in a complete form with a host, like https://example.com/api/admin, and not as /api/admin. This comes in useful when you are scanning a list of hosts.
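
The last point, reporting endpoints in complete form, can be illustrated with a minimal sketch. It assumes the standard library's urljoin is an acceptable way to resolve a relative endpoint against the host it was found on; the function name complete_endpoints is hypothetical and this is not SourceWolf's own code.

```python
from urllib.parse import urljoin


def complete_endpoints(host, endpoints):
    """Resolve each endpoint against the host it was found on (hypothetical helper)."""
    # urljoin resolves relative paths against the host and leaves
    # already-absolute URLs untouched.
    return [urljoin(host, endpoint) for endpoint in endpoints]


if __name__ == "__main__":
    found = ["/api/admin", "https://cdn.example.com/app.js"]
    for url in complete_endpoints("https://example.com/", found):
        print(url)
```

With a list of hosts, each host's relative endpoints stay attributable to that host, which is the benefit the point above describes.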


Installation


Usage

> python3 sourcewolf.py -h

-l LIST, --list LIST  List of javascript URLs
-u URL, --url URL     Single URL
-t THREADS, --threads THREADS
                      Number of concurrent threads to use (default 5)
-o OUTPUT_DIR, --output directory-name OUTPUT_DIR
                      Store URL response text in a directory for further analysis
-s STATUS_CODE_FILE, --store-status-code STATUS_CODE_FILE
                      Store the status code in a file
-b BRUTE, --brute BRUTE
                      Brute force URL with FUZZ keyword (--wordlist must also be used along with this)
-w WORDLIST, --wordlist WORDLIST
                      Wordlist for brute forcing URL
-v, --verbose         Verbose mode (displays all the requests that are being sent)
-c CRAWL_OUTPUT, --crawl-output CRAWL_OUTPUT
                      Output directory to store the crawled output
-d DELAY, --delay DELAY
                      Delay in the requests (in seconds)
--timeout TIMEOUT     Maximum time to wait for connection timing out (in seconds)
--headers HEADERS     Add custom headers (Must be passed in as {'Token': 'YOUR-TOKEN-HERE'}) --> Dictionary format
--cookies COOKIES     Add cookies (Must be passed in as {'Cookie': 'YOUR-COOKIE-HERE'}) --> Dictionary format
--only-success        Only print 2XX responses
--local LOCAL         Directory with local response files to crawl for
--no-colors           Remove colors from the output
--update-info         Check for the latest version, and update if required

SourceWolf has 3 modes, which correspond to its 3 core features.
  • Crawl response mode:

Complete usage:

  python3 sourcewolf.py -l domains -o output/ -c crawl_output

domains is a file listing the URLs you want to crawl, one per line, in the format:

https://example.com/
https://exisiting.example.com/
https://exisiting.example.com/dashboard
https://example.com/hitme

output/ is the directory where the response text files of the input file are stored.

They are stored in the format output/2XX, output/3XX, output/4XX, and output/5XX.
output/2XX stores 2XX status code responses, and so on!
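
As a rough illustration of this layout, the sketch below maps a status code to its bucket directory. The helper names bucket_for and response_path, and the file_name argument, are assumptions made for the example; the source does not show how SourceWolf names the stored files.

```python
from pathlib import Path


def bucket_for(status_code):
    """Map a status code such as 404 to its bucket name, e.g. '4XX'."""
    return f"{status_code // 100}XX"


def response_path(output_dir, status_code, file_name):
    """Build the path a response would be stored under (hypothetical helper)."""
    return Path(output_dir) / bucket_for(status_code) / file_name


if __name__ == "__main__":
    print(response_path("output", 200, "response.txt"))
    print(response_path("output", 503, "response.txt"))
```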


crawl_output, specified using the -c flag, is the directory in which SourceWolf stores the results it produces by crawling the HTTP response files saved in the output/ directory (currently only endpoints).

The crawl_output/ directory contains:

endpoints - All the endpoints found
jsvars - All the javascript variables

The directory will contain more files as more modules and features are integrated into SourceWolf.
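
To give a feel for what a jsvars-style list might contain, here is a deliberately simple sketch that pulls declared variable names out of JavaScript source with a regular expression. The pattern and the function name js_variables are assumptions for illustration only; SourceWolf's actual extraction logic may differ.

```python
import re

# Matches the first name after var/let/const. Later names in a
# comma-separated declaration (var a, b) are not captured.
JS_VAR = re.compile(r"\b(?:var|let|const)\s+([A-Za-z_$][\w$]*)")


def js_variables(source):
    """Return declared variable names in order of first appearance."""
    seen = []
    for name in JS_VAR.findall(source):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    sample = "var apiKey = 'x'; let user; const apiKey2 = 1; var apiKey;"
    print(js_variables(sample))
```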


(OR)

For a single URL,

  python3 sourcewolf.py -u example.com/api/endpoint -o output/ -c crawl_output

Only the -l flag is replaced by -u; everything else remains the same.


  • Brute force mode

python3 sourcewolf.py -b https://hackerone.com/FUZZ -w /path/to/wordlist -s status

The -w flag is optional. If it is not specified, SourceWolf uses a default wordlist with 6124 words.

SourceWolf replaces the FUZZ keyword in the -b value with each word from the wordlist and sends the requests. This also lets you brute force GET parameter values.

The -s flag stores the output in a file called status.
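
The FUZZ substitution described above can be sketched in a few lines. This is an illustrative version under the assumption that every occurrence of FUZZ is replaced by the word and that blank wordlist lines are skipped; the function name fuzz_urls is hypothetical, and no requests are sent here.

```python
def fuzz_urls(template, words):
    """Yield one URL per word by substituting the FUZZ keyword (hypothetical helper)."""
    if "FUZZ" not in template:
        raise ValueError("template must contain the FUZZ keyword")
    for word in words:
        word = word.strip()  # drop newlines left over from reading a file
        if word:             # skip blank lines in the wordlist
            yield template.replace("FUZZ", word)


if __name__ == "__main__":
    # Path brute forcing
    print(list(fuzz_urls("https://example.com/FUZZ", ["admin", "login"])))
    # GET parameter value brute forcing
    print(list(fuzz_urls("https://example.com/search?q=FUZZ", ["test"])))
```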

  • Probing mode

Screenshot not included as the output looks similar to crawl response mode.

python3 sourcewolf.py -l domains -s live

The domains file can contain anything: subdomains, endpoints, or JS files.

The -s flag writes the response to the live file.

Both the brute force and probing modes print all status codes except 404 by default. To print only 2XX responses, use the --only-success flag.
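
The printing rule for these two modes amounts to a small predicate. The sketch below is one way to express it; the function name should_print and its only_success parameter are assumptions that mirror the --only-success flag, not names taken from SourceWolf.

```python
def should_print(status_code, only_success=False):
    """Decide whether a status code is shown (hypothetical helper)."""
    if only_success:
        # --only-success: show 2XX responses only
        return 200 <= status_code <= 299
    # default: show everything except 404
    return status_code != 404


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(code, should_print(code), should_print(code, only_success=True))
```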

SourceWolf also makes use of multithreading.
The default number of threads for all modes is 5. You can increase the number of threads using the -t flag.
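
For readers unfamiliar with the idea, the sketch below shows one common way to run probes on a fixed number of threads in Python, using concurrent.futures. It is an assumption that a thread pool is a fair stand-in for what the -t flag controls; probe_all and fake_probe are hypothetical names, and fake_probe only simulates a status code so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def probe_all(urls, probe, threads=5):
    """Run probe(url) for every URL on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order as the input URLs
        return list(zip(urls, pool.map(probe, urls)))


def fake_probe(url):
    """Stand-in for an HTTP request: pretend URLs ending in '/' are live."""
    return 200 if url.endswith("/") else 404


if __name__ == "__main__":
    targets = ["https://example.com/", "https://example.com/hitme"]
    for url, status in probe_all(targets, fake_probe, threads=5):
        print(status, url)
```

Because HTTP probing is dominated by waiting on the network, adding threads usually helps until the target or the local connection becomes the bottleneck.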

In addition to the three modes above, there is an option to crawl response files locally, provided you have them on disk and they follow SourceWolf-compatible naming conventions.

Store all the responses in a directory, say responses/

python3 sourcewolf.py --local responses/

This will crawl the local directory, and give you the results.


How can this be integrated into your workflow?


Subdomain enumeration
|
|
SourceWolf
|
|
Filter out live subdomains
|
|
Store responses and find hidden endpoints / Directory brute forcing

At this point, you will have a lot of endpoints from the target, extracted in real time from the web pages at the time of the scan.


SourceWolf is built with a broader vision: to crawl through responses not just to discover hidden endpoints, but also to automate the tasks that are otherwise done by manually searching through the response files.

One such example would be manually searching for any leaked keys in the source.

This core purpose explains the modular way in which the files are written.

To do

  • Generate a custom wordlist for a target from the words obtained in the source.
  • Automate finding any leaked keys.

Updates

It is possible to update SourceWolf right from the terminal, without you having to clone the repository again.
SourceWolf checks for updates every time it runs, and notifies the user if any updates are available, along with a summary of them.

Running

python3 sourcewolf.py --update-info

provides more details on the update

When updates are available, move the update.py file outside of the SourceWolf directory, and run

python3 update.py /path/to/SourceWolf

Warning: This deletes all the files and folders inside your SourceWolf directory. It removes the directory and clones the repo again.

Contributions

Currently, SourceWolf supports only finding hidden endpoints from the source, but you can expect other features to be integrated in the future.

Where can you contribute?
Contributions are mainly needed for integrating more modules with SourceWolf, though feel free to open a PR even for a typo.

Before sending a pull request, ensure that you are on the latest version.
> Open an issue first if you are going to add a new feature, to confirm that it is required! You should not waste time coding a feature which is not required.

Issues

Feel free to open any issues you face.
When opening an issue, include your operating system, the command which was run, and screenshots if possible, which makes it easier for me to reproduce the issue.
You can also request new features, or enhancements to existing features, by opening an issue.

Naming conventions

To crawl the files locally, you must follow some naming conventions. These conventions are in place for SourceWolf to directly identify the host name, and thereby parse all the endpoints, including the relative ones.

Consider a URL https://example.com/api/

  • Remove the protocol and the trailing slash (if any) from the URL --> example.com/api
  • Replace '/' with '@' --> example.com@api
  • Save the response as a txt file with the file name obtained above.

So the file finally looks like example.com@api.txt
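
The three steps above can be sketched as a pair of small functions: one builds the file name from a URL, the other recovers the host and path from a file name. Both function names are hypothetical, and the reverse direction is an assumption about how a host could be identified from the name; SourceWolf's own parsing is not shown in this document.

```python
def url_to_file_name(url):
    """Apply the naming convention: strip protocol and trailing slash, '/' -> '@', add .txt."""
    without_protocol = url.split("://", 1)[-1]
    return without_protocol.rstrip("/").replace("/", "@") + ".txt"


def file_name_to_host_and_path(file_name):
    """Recover (host, path) from a file name that follows the convention."""
    stem = file_name[:-len(".txt")] if file_name.endswith(".txt") else file_name
    host, _, rest = stem.partition("@")
    path = ("/" + rest.replace("@", "/")) if rest else ""
    return host, path


if __name__ == "__main__":
    name = url_to_file_name("https://example.com/api/")
    print(name)
    print(file_name_to_host_and_path(name))
```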

Credits

Logo designed by Murugan artworks

License

SourceWolf uses the MIT license