ProxyGrab: Asynchronous library made using Python and aiohttp to get proxies from multiple services!
Stars: ✭ 17 (-96.95%)
scrapy-wayback-machine: A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (-83.48%)
Vault: Swiss army knife for hackers
Stars: ✭ 346 (-37.88%)
Fp Server: Free proxy server, continuously crawling and providing proxies, based on Tornado and Scrapy. Build your own proxy pool locally.
Stars: ✭ 154 (-72.35%)
scrapyr: A simple & tiny Scrapy clustering solution, considered a drop-in replacement for scrapyd
Stars: ✭ 50 (-91.02%)
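Python3 Spider: Python crawlers in practice. Simulated login to major websites, including but not limited to: slider verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, Taobao. If you like it, please star ❤️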
Stars: ✭ 2,129 (+282.23%)
dotnet-security-unit-tests: A web application that contains several unit tests for the purpose of .NET security
Stars: ✭ 25 (-95.51%)
Scrapy Redis: Redis-based components for Scrapy.
Stars: ✭ 4,998 (+797.31%)
Taobaoscrapy: 😩 Tool for Taobao/Tmall | A childhood toy, now outdated
Stars: ✭ 146 (-73.79%)
scrapy.dart: Scrapy, a fast high-level web crawling & scraping framework for Dart and Flutter
Stars: ✭ 50 (-91.02%)
Jobspiders: Scrapy-based crawlers for 51job (scrapy.Spider), Zhilian Zhaopin (scraping its API), and Lagou (CrawlSpider)
Stars: ✭ 144 (-74.15%)
policy-data-analyzer: Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-96.05%)
Scrapy demo: All kinds of Scrapy demos
Stars: ✭ 128 (-77.02%)
cappy: ☕🗄 CAching Proxy in Python – Simple file-based Python HTTP proxy
Stars: ✭ 15 (-97.31%)
Xidel: Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Stars: ✭ 335 (-39.86%)
Crawlab Lite: Lite version of Crawlab, a crawler management platform.
Stars: ✭ 122 (-78.1%)
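Qqmusicspider: A Scrapy-based QQ Music spider that crawls song information, lyrics, top comments, and more. It also shares a music corpus of 490,000+ items from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese artists on QQ Music.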
Stars: ✭ 120 (-78.46%)
HeWeather: HomeAssistant HeWeather Plugin
Stars: ✭ 66 (-88.15%)
Copybook: Crawls all novels from a novel website, stores them in a database, and uses the crawled data to build your own novel website
Stars: ✭ 117 (-78.99%)
Requests Respectful: Minimalist Requests wrapper to work within the rate limits of any number of services simultaneously. Parallel processing friendly.
Stars: ✭ 417 (-25.13%)
Maria Quiteria: Backend for collecting and publishing the data 📜
Stars: ✭ 115 (-79.35%)
163Music: 163music spider built with Scrapy.
Stars: ✭ 60 (-89.23%)
pyinrail: A Python wrapper for Indian Railways Enquiry API!
Stars: ✭ 40 (-92.82%)
Hive: Lots of spiders
Stars: ✭ 110 (-80.25%)
Htmlquery: htmlquery is a Go XPath package for HTML queries.
Stars: ✭ 338 (-39.32%)
Scrapyd Cluster On Heroku: Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.
Stars: ✭ 106 (-80.97%)
htmx-talk-2021: Code examples and slides from my 2021 talk "Server-Side is Dead! Long Live Server-Side (+ HTMX)", presented at DjangoCon and Code Code Code
Stars: ✭ 18 (-96.77%)
Dotnetcrawler: DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, while remaining extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-82.05%)
Proxy server crawler: An awesome public proxy server crawler based on the Scrapy framework
Stars: ✭ 94 (-83.12%)
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (-94.25%)
Pycookiecheat: Borrow cookies from your browser's authenticated session for use in Python scripts.
Stars: ✭ 465 (-16.52%)
Email Extractor: The main functionality is to extract all the emails from one or several URLs
Stars: ✭ 81 (-85.46%)
wc18-cli: An easy command line interface for the 2018 World Cup
Stars: ✭ 15 (-97.31%)
Capturer: Capture pictures from websites like sina, lofter, huaban and so on
Stars: ✭ 76 (-86.36%)
rigor: HTTP-based DSL for validating RESTful APIs
Stars: ✭ 65 (-88.33%)
Image Downloader: Download images from Google, Bing, Baidu.
Stars: ✭ 1,173 (+110.59%)
animecenter: The source code for animecenter
Stars: ✭ 16 (-97.13%)
Node Request Retry: 💂 Wrap the NodeJS request module to retry HTTP requests in case of errors
Stars: ✭ 330 (-40.75%)
ArticleSpider: Crawling zhihu, jobbole, and lagou with Scrapy, and using Elasticsearch + Django to build a search engine website. See README_zh.md (including: implementation roadmap, distributed crawler, and coping with anti-crawling strategies).
Stars: ✭ 34 (-93.9%)
image-crawler: An image scraper that scrapes images from unsplash.com
Stars: ✭ 12 (-97.85%)
Jawbreaker: A Python obfuscator using HTTP Requests and Hastebin.
Stars: ✭ 50 (-91.02%)
requestsR: R interface to the Python requests module
Stars: ✭ 12 (-97.85%)
Scrapy Selenium: Scrapy middleware to handle JavaScript pages using Selenium
Stars: ✭ 550 (-1.26%)
Fbcrawl: A Facebook crawler
Stars: ✭ 536 (-3.77%)
Guzzle: Guzzle, an extensible PHP HTTP client
Stars: ✭ 21,384 (+3739.14%)