FbcrawlA Facebook crawler
Stars: ✭ 536 (-15.86%)
DotnetcrawlerDotnetCrawler is a straightforward, lightweight web crawling/scrapying library for Entity Framework Core output based on dotnet core. This library designed like other strong crawler libraries like WebMagic and Scrapy but for enabling extandable your custom requirements. Medium link : https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-84.3%)
ScrapoxyScrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+107.54%)
CrawlyCrawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (-30.93%)
Lulu[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+23.86%)
NewspaperNews, full-text, and article metadata extraction in Python 3. Advanced docs:
Stars: ✭ 11,545 (+1712.4%)
FerretDeclarative web scraping
Stars: ✭ 4,837 (+659.34%)
bots-zooNo description or website provided.
Stars: ✭ 59 (-90.74%)
CollyElegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+2338.78%)
Linkedin Profile Scraper🕵️♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-73.16%)
Goribot[Crawler/Scraper for Golang]🕷A lightweight distributed friendly Golang crawler framework.一个轻量的分布式友好的 Golang 爬虫框架。
Stars: ✭ 190 (-70.17%)
Ruiji.netcrawler framework, distributed crawler extractor
Stars: ✭ 220 (-65.46%)
double-agentA test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-80.69%)
diffbot-php-client[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-91.68%)
aioScrapy基于asyncio与aiohttp的异步协程爬虫框架 欢迎Star
Stars: ✭ 34 (-94.66%)
Wechatsogou基于搜狗微信搜索的微信公众号爬虫接口
Stars: ✭ 5,220 (+719.47%)
proxycrawl-pythonProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-91.99%)
OpenScraperAn open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (-87.44%)
scrapy-distributedA series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-94.03%)
papercutPapercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-97.65%)
img-cliAn interactive Command-Line Interface Build in NodeJS for downloading a single or multiple images to disk from URL
Stars: ✭ 15 (-97.65%)
Scrapy SeleniumScrapy middleware to handle javascript pages using selenium
Stars: ✭ 550 (-13.66%)
crawlkitA crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.
Stars: ✭ 23 (-96.39%)
flink-crawlerContinuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (-92.46%)
DataflowkitExtract structured data from web sites. Web sites scraping.
Stars: ✭ 456 (-28.41%)
ScrapedinLinkedIn Scraper (currently working 2020)
Stars: ✭ 453 (-28.89%)
Haipproxy💖 High available distributed ip proxy pool, powerd by Scrapy and Redis
Stars: ✭ 4,993 (+683.83%)
scrapy-LBCAraignée LeBonCoin avec Scrapy et ElasticSearch
Stars: ✭ 14 (-97.8%)
scrapy-fieldstatsA Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-97.33%)
PoliteBe nice on the web
Stars: ✭ 253 (-60.28%)
OLX Scraper📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted regarding requested product and dumps to NOSQL MONGODB.
Stars: ✭ 15 (-97.65%)
wget-luaWget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-91.84%)
Skrape.itA Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (-63.74%)
Awesome CrawlerA collection of awesome web crawler,spider in different languages
Stars: ✭ 4,793 (+652.43%)
scrapy facebookerCollection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-96.55%)
Mimo-CrawlerA web crawler that uses Firefox and js injection to interact with webpages and crawl their content, written in nodejs.
Stars: ✭ 22 (-96.55%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (-27.16%)
arachnodHigh performance crawler for Nodejs
Stars: ✭ 17 (-97.33%)
Ecommercecrawlers码云仓库链接:AJay13/ECommerceCrawlers
Github 仓库链接:DropsDevopsOrg/ECommerceCrawlers
项目展示平台链接:http://wechat.doonsec.com
Stars: ✭ 3,073 (+382.42%)
ARGUSARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-89.32%)
Scrapy RedisRedis-based components for Scrapy.
Stars: ✭ 4,998 (+684.62%)
lightnovel epub🍭 epub generator for (light)novels (轻) 小说 epub 生成器,支持站点:轻之国度、轻小说文库
Stars: ✭ 89 (-86.03%)
SpidyThe simple, easy to use command line web crawler.
Stars: ✭ 257 (-59.65%)
Weibo terminator workflowUpdate Version of weibo_terminator, This is Workflow Version aim at Get Job Done!
Stars: ✭ 259 (-59.34%)
Gopa[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-56.51%)
dijnet-botAz összes számlád még egy helyen :)
Stars: ✭ 17 (-97.33%)
RcrawlerAn R web crawler and scraper
Stars: ✭ 274 (-56.99%)
AutoscraperA Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+540.03%)
LinkedinLinkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (-51.49%)
Xcrawler快速、简洁且强大的PHP爬虫框架
Stars: ✭ 344 (-46%)
Freshonions TorscraperFresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
Stars: ✭ 348 (-45.37%)
Hquery.phpAn extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.
Stars: ✭ 295 (-53.69%)
Vaultswiss army knife for hackers
Stars: ✭ 346 (-45.68%)