saeeddhqan / evine

License: GPL-3.0
Interactive CLI Web Crawler

Evine

Interactive CLI Web Crawler.

Evine is a simple, fast, and interactive web crawler and web scraper written in Golang. Evine is useful for a wide range of purposes such as metadata and data extraction, data mining, reconnaissance and testing.

If you like the project, give it a star. It motivates me to keep developing it!

Install

From Binary

Pre-built binary releases are also available (recommended).

From source

go get github.com/saeeddhqan/evine
"$GOPATH/bin/evine" -h

From GitHub

git clone https://github.com/saeeddhqan/evine.git
cd evine
go build .
mv evine /usr/local/bin
evine --help

Note: Go 1.13.x is required.

Commands & Usage

Keybinding    Description
Enter         Run crawler (from URL view)
Enter         Display response (from Keys and Regex views)
Tab           Next view
Ctrl+Space    Run crawler
Ctrl+S        Save response
Ctrl+Z        Quit
Ctrl+R        Restore to default values (from Options and Headers views)
Ctrl+Q        Close response save view (from Save view)

evine -h

It will display help for the tool:

Flag                    Example                                    Description
-url                    evine -url toscrape.com                    URL to crawl
-url-exclude string     evine -url-exclude ?id=                    Exclude URLs matching this regex (default ".*")
-domain-exclude string  evine -domain-exclude host1.tld,host2.tld  Exclude in-scope domains from the crawl, comma-separated (default: root domain)
-code-exclude string    evine -code-exclude 200,201                Exclude responses with these HTTP status codes, separated by '|' (default ".*")
-delay int              evine -delay 300                           Delay between requests, in milliseconds
-depth                  evine -depth 2                             Crawl depth level (default 1)
-thread int             evine -thread 10                           Number of concurrent goroutines for resolving (default 5)
-header                 evine -header KEY: VALUE\nKEY1: VALUE1     HTTP headers to send with each request; separate fields with \n
-proxy string           evine -proxy http://1.1.1.1:8080           Proxy, given as scheme://ip:port
-scheme string          evine -scheme http                         Scheme for the requests (default "https")
-timeout int            evine -timeout 15                          Seconds to wait before timing out (default 10)
-query string           evine -query url,pdf,txt                   Query expression: a file extension (pdf), a key (url,script,css,..) or a jQuery selector ($("a[class='hdr']").attr("hdr"))
-regex string           evine -regex 'User.+'                      Regular expression to search for in the page contents
-logger string          evine -logger log.txt                      Log errors to a file
-max-regex int          evine -max-regex -1                        Maximum number of results for a regex search (default 1000)
-robots                 evine -robots                              Scrape robots.txt for URLs and use them as seeds
-sitemap                evine -sitemap                             Scrape sitemap.xml for URLs and use them as seeds
-wayback                evine -wayback                             Scrape the Wayback Machine (web.archive.org) for URLs and use them as seeds
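
The -header flag takes several fields in one string, separated by \n. As a rough sketch of how such a value could be turned into request headers (this is not Evine's actual code; parseHeaders is a hypothetical helper, and accepting both a literal backslash-n and a real newline is an assumption):

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// parseHeaders is a hypothetical helper, not part of Evine.
// It splits a "KEY: VALUE\nKEY1: VALUE1" string into an http.Header.
func parseHeaders(raw string) http.Header {
	// Assumption: a shell usually passes \n as two literal characters,
	// so normalise that to a real newline first.
	raw = strings.ReplaceAll(raw, `\n`, "\n")

	headers := http.Header{}
	for _, line := range strings.Split(raw, "\n") {
		parts := strings.SplitN(line, ":", 2)
		if len(parts) != 2 {
			continue // skip lines that are not "KEY: VALUE"
		}
		key := strings.TrimSpace(parts[0])
		if key == "" {
			continue
		}
		headers.Add(key, strings.TrimSpace(parts[1]))
	}
	return headers
}

func main() {
	h := parseHeaders(`User-Agent: evine-demo\nAccept: text/html`)
	fmt.Println(h.Get("User-Agent"))
	fmt.Println(h.Get("Accept"))
}
```

Splitting on the first colon only keeps values that themselves contain colons (for example URLs) intact.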

VIEWS

  • URL, enter the URL to crawl in this view.
  • Options, this view is for setting options.
  • Headers, this view is for setting the HTTP headers.
  • Query, this view is used after crawling. It extracts data (docs, URLs, etc.) from the web pages that have been crawled.
  • Regex, this view searches the crawled web pages with a regular expression. Write your regex in this view and press Enter.
  • Response, all results are written to this view.
  • Search, this view searches the content of the Response view with a regular expression.
Extract methods

From Keys

Keys are predefined keywords that select a kind of data, such as in-scope URLs, out-of-scope URLs, emails, etc. List of all keys:

  • url, to extract in-scope URLs. The URLs are fully sanitized.
  • email, to extract in-scope and out-of-scope emails.
  • query_urls, to extract in-scope URLs that contain a GET query: ?foo=bar.
  • all_urls, to extract out-of-scope URLs.
  • phone, to extract a[href]s that contain a phone number.
  • media, to extract addresses of files that are not web-executable, such as .exe, .bat, .tar.xz, .zip, etc.
  • css, to extract CSS files.
  • script, to extract JavaScript files.
  • cdn, to extract Content Delivery Network (CDN) addresses, like //api.foo.bar/jquery.min.js
  • comment, to extract HTML comments, <!-- .* -->
  • dns, to extract subdomains that belong to the website.
  • network, to extract social network IDs, like facebook, twitter, etc.
  • all, to extract the results of all keys (url,query_urls,..).

Keys are case-sensitive. You can also combine two or three keys, separated by commas.
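
Several keys depend on whether a URL is in scope and whether it carries a GET query. The sketch below shows one way such checks could be written in Go. It is only an illustration under stated assumptions, not Evine's implementation: inScope and hasQuery are hypothetical helpers, "in scope" is assumed to mean the root domain or any of its subdomains, and the inputs are assumed to be absolute URLs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// inScope is a hypothetical helper, not part of Evine.
// It reports whether rawURL points at rootDomain or one of its subdomains.
// It expects absolute URLs; relative links have no host and return false.
func inScope(rawURL, rootDomain string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	root := strings.ToLower(rootDomain)
	// The leading dot prevents "nottoscrape.com" from matching "toscrape.com".
	return host == root || strings.HasSuffix(host, "."+root)
}

// hasQuery reports whether rawURL carries a GET query such as ?foo=bar.
func hasQuery(rawURL string) bool {
	u, err := url.Parse(rawURL)
	return err == nil && u.RawQuery != ""
}

func main() {
	links := []string{
		"https://toscrape.com/",
		"https://books.toscrape.com/catalogue?page=2",
		"https://nottoscrape.com/",
	}
	for _, l := range links {
		fmt.Printf("%s inScope=%t hasQuery=%t\n", l, inScope(l, "toscrape.com"), hasQuery(l))
	}
}
```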

From Extensions

If you want a file type that is not covered by the keys, simply write the file extension in the Query view, e.g. png, xml, txt, docx, xlsx, a, mp3, etc.

From JQuery selector

If you have basic jQuery skills, this feature is easy to use, and it is not difficult to pick up if you do not. For a quick overview of selectors, w3schools is a great resource.
Examples:

$("source").attr("src") // To find all of source[src] urls
$("h1").text() // To find h1 values

Template:

$("SELECTOR").METHOD_NAME("arg")

Queries like the following (single quotes, or extra spaces around the parentheses) are not supported:

$('SELECTOR').METHOD("arg")
$('SELECTOR').METHOD('arg')
$("SELECTOR"  ).METHOD("arg" )

Methods are described below:

  • text(), returns the content of the SELECTOR without HTML tags.
  • html(), returns the content of the SELECTOR with HTML tags.
  • attr("ATTR"), returns the given attribute of the SELECTOR, e.g. $("a").attr("href")
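
The strict template above can be pictured as a single anchored pattern. The Go sketch below is one possible way to recognise such queries and shows why single-quoted or space-padded variants are rejected. It is not Evine's parser: parseQuery and the regular expression are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// queryRe mirrors the template $("SELECTOR").METHOD_NAME("arg"):
// double quotes only, no extra spaces, and one of the three listed methods.
var queryRe = regexp.MustCompile(`^\$\("([^"]+)"\)\.(text|html|attr)\((?:"([^"]*)")?\)$`)

// parseQuery is a hypothetical helper, not part of Evine.
// It splits a query into selector, method and optional argument.
func parseQuery(q string) (selector, method, arg string, err error) {
	m := queryRe.FindStringSubmatch(q)
	if m == nil {
		return "", "", "", errors.New("query does not match the template")
	}
	if m[2] == "attr" && m[3] == "" {
		return "", "", "", errors.New("attr needs an attribute name")
	}
	return m[1], m[2], m[3], nil
}

func main() {
	queries := []string{
		`$("source").attr("src")`,
		`$("h1").text()`,
		`$('a').attr('href')`,
		`$("a"  ).attr("href" )`,
	}
	for _, q := range queries {
		sel, method, arg, err := parseQuery(q)
		if err != nil {
			fmt.Printf("%s -> rejected: %v\n", q, err)
			continue
		}
		fmt.Printf("%s -> selector=%q method=%q arg=%q\n", q, sel, method, arg)
	}
}
```

Single quotes inside the selector, as in $("a[class='hdr']").attr("hdr"), still pass, because only the outer quotes are required to be double quotes.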

Bugs or Suggestions

To report bugs or suggestions, create an issue.

Evine is heavily inspired by wuzz.