SeleniumcrawlerAn example using Selenium WebDriver for Python and the Scrapy framework to build a web scraper that crawls an ASP site
Stars: ✭ 117 (-44.55%)
WhispersIdentify hardcoded secrets and dangerous behaviours
Stars: ✭ 66 (-68.72%)
Freshonions TorscraperFresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
Stars: ✭ 348 (+64.93%)
NightmareA high-level browser automation library.
Stars: ✭ 19,067 (+8936.49%)
Undetected ChromedriverCustom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Stars: ✭ 365 (+72.99%)
ExiferA lightweight Exif meta-data decipher.
Stars: ✭ 290 (+37.44%)
Thepiratebay💀 The Pirate Bay node.js client
Stars: ✭ 191 (-9.48%)
DataflowkitExtract structured data from websites. Website scraping.
Stars: ✭ 456 (+116.11%)
Anime DlAnime-dl is a command-line program to download anime from CrunchyRoll and Funimation.
Stars: ✭ 190 (-9.95%)
Awesome CrawlerA collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+2171.56%)
Google Play ScraperGoogle Play scraper for Python, inspired by <facundoolano/google-play-scraper>
Stars: ✭ 143 (-32.23%)
Youtube ProjectsThis repository contains all the code I use in my YouTube tutorials.
Stars: ✭ 144 (-31.75%)
PhpscraperPHP Scraper - a highly opinionated web-interface for PHP
Stars: ✭ 148 (-29.86%)
Goribot[Crawler/Scraper for Golang]🕷A lightweight, distributed-friendly Golang crawler framework.
Stars: ✭ 190 (-9.95%)
NewcrawlerFree Web Scraping Tool with Java
Stars: ✭ 589 (+179.15%)
SpidrA versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+210.9%)
SurgeonDeclarative DOM extraction expression evaluator. 👨⚕️
Stars: ✭ 653 (+209.48%)
CheerioFast, flexible, and lean implementation of core jQuery designed specifically for the server.
Stars: ✭ 24,616 (+11566.35%)
AbotCross-platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
Stars: ✭ 1,961 (+829.38%)
ScrapitScraping scripts for various websites.
Stars: ✭ 25 (-88.15%)
JktSimple helper to parse JSON based on independent schema
Stars: ✭ 22 (-89.57%)
Onion CrawlerTor website crawler (specific to AlphaBay at the time)
Stars: ✭ 15 (-92.89%)
FuziA fast & lightweight XML & HTML parser in Swift with XPath & CSS support
Stars: ✭ 894 (+323.7%)
Parse XmlA fast, safe, compliant XML parser for Node.js and browsers.
Stars: ✭ 184 (-12.8%)
LogosCreate ridiculously fast Lexers
Stars: ✭ 1,001 (+374.41%)
Social ScraperA collection of scripts for crawling data from social networks and Vietnamese-language websites
Stars: ✭ 47 (-77.73%)
Api StoreContains all the public APIs listed in Phantombuster's API store. Pull requests welcome!
Stars: ✭ 69 (-67.3%)
Php Svg LibSVG file parsing / rendering library
Stars: ✭ 1,146 (+443.13%)
GoscraperGolang pkg to quickly return a preview of a webpage (title/description/images)
Stars: ✭ 72 (-65.88%)
SerpscrapSEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet, and type from search results for given keywords. Detect ads or take automated screenshots. You can also fetch the text content of URLs found in search results or supplied on your own. It's useful for SEO and business-related research tasks.
Stars: ✭ 153 (-27.49%)
RatsMovie Ratings Synchronization with Python
Stars: ✭ 156 (-26.07%)
ArpeggioParser interpreter based on PEG grammars written in Python http://textx.github.io/Arpeggio/
Stars: ✭ 204 (-3.32%)
Lodestone NodejsCharacter tracking and parser library for Node.js
Stars: ✭ 81 (-61.61%)
Mini YamlSingle header YAML 1.0 C++11 serializer/deserializer.
Stars: ✭ 79 (-62.56%)
Formula ParserParsing and evaluating mathematical formulas given as strings.
Stars: ✭ 62 (-70.62%)
DotnetcrawlerDotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on dotnet core. This library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable for your custom requirements. Medium link : https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-52.61%)
Graphql Go ToolsTools to write high performance GraphQL applications using Go/Golang.
Stars: ✭ 96 (-54.5%)
WebmagicA scalable web crawler framework for Java.
Stars: ✭ 10,186 (+4727.49%)
ScrapoxyScrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+526.54%)
Instagram Profilecrawl💻 Quickly crawl the information (e.g. followers, tags, etc.) of an Instagram profile. No login required!
Stars: ✭ 110 (-47.87%)
Whois ParserGo (Golang) module for parsing domain whois information.
Stars: ✭ 123 (-41.71%)
Boj AutocommitWhen you solve a problem on Baekjoon Online Judge, it automatically commits and pushes the solution to the remote repository.
Stars: ✭ 60 (-71.56%)
OnegramThis repository is no longer maintained.
Stars: ✭ 137 (-35.07%)
UdemycoursegrabberYour will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you back the download link of the latest one, with the option to see the others as well!
Stars: ✭ 137 (-35.07%)
ParjsJavaScript parser-combinator library
Stars: ✭ 145 (-31.28%)
NewspaperNews, full-text, and article metadata extraction in Python 3.
Stars: ✭ 11,545 (+5371.56%)
Instagram ScraperScrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper, bot
Stars: ✭ 2,209 (+946.92%)
Secret AgentThe web browser that's built for scraping.
Stars: ✭ 151 (-28.44%)
Instagram CrawlerCrawl Instagram photos, posts, and videos for download.
Stars: ✭ 178 (-15.64%)
Dan Jurafsky Chris Manning NlpMy solution to the Natural Language Processing course taught by Dan Jurafsky and Chris Manning in Winter 2012.
Stars: ✭ 124 (-41.23%)
Command Line ApiCommand line parsing, invocation, and rendering of terminal output.
Stars: ✭ 2,418 (+1045.97%)
JssoupJavaScript + BeautifulSoup = JSSoup
Stars: ✭ 203 (-3.79%)
JvppeteerHeadless Chrome for Java (Java crawler)
Stars: ✭ 193 (-8.53%)
RcrawlerAn R web crawler and scraper
Stars: ✭ 274 (+29.86%)
Gopa[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+31.28%)
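Awesome Python PrimerAn index of high-quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, suited to web crawling, web development, data analysis, and machine learning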
Stars: ✭ 57 (-72.99%)