DotnetcrawlerDotnetCrawler is a straightforward, lightweight web crawling/scrapying library for Entity Framework Core output based on dotnet core. This library designed like other strong crawler libraries like WebMagic and Scrapy but for enabling extandable your custom requirements. Medium link : https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-92.44%)
Docs《数据采集从入门到放弃》源码。内容简介:爬虫介绍、就业情况、爬虫工程师面试题 ;HTTP协议介绍; Requests使用 ;解析器Xpath介绍; MongoDB与MySQL; 多线程爬虫; Scrapy介绍 ;Scrapy-redis介绍; 使用docker部署; 使用nomad管理docker集群; 使用EFK查询docker日志
Stars: ✭ 118 (-91.07%)
NewspaperNews, full-text, and article metadata extraction in Python 3. Advanced docs:
Stars: ✭ 11,545 (+773.3%)
ProxyscrapePython library for retrieving free proxies (HTTP, HTTPS, SOCKS4, SOCKS5).
Stars: ✭ 134 (-89.86%)
Python3 SpiderPython爬虫实战 - 模拟登陆各大网站 包含但不限于:滑块验证、拼多多、美团、百度、bilibili、大众点评、淘宝,如果喜欢请start ❤️
Stars: ✭ 2,129 (+61.04%)
NgmetaDynamic meta tags in your AngularJS single page application
Stars: ✭ 152 (-88.5%)
Youtube ProjectsThis repository contains all the code I use in my YouTube tutorials.
Stars: ✭ 144 (-89.11%)
Spoon🥄 A package for building specific Proxy Pool for different Sites.
Stars: ✭ 173 (-86.91%)
SeleniumcrawlerAn example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site
Stars: ✭ 117 (-91.15%)
Media ScraperScrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
Stars: ✭ 206 (-84.42%)
Tianyanchapip安装的天眼查爬虫API,指定的单个/多个企业工商信息一键保存为Excel/JSON格式。A Battery-included Scraper API of Tianyancha, the best Chinese business data and investigation platform.
Stars: ✭ 206 (-84.42%)
CollyElegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+1075.11%)
Social ScraperTổng hợp script crawl dữ liệu từ các mạng xã hội & website tiếng Việt
Stars: ✭ 47 (-96.44%)
Skrape.itA Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (-82.53%)
Ecommercecrawlers码云仓库链接:AJay13/ECommerceCrawlers
Github 仓库链接:DropsDevopsOrg/ECommerceCrawlers
项目展示平台链接:http://wechat.doonsec.com
Stars: ✭ 3,073 (+132.45%)
Gobetween☁️ Modern & minimalistic load balancer for the Сloud era
Stars: ✭ 1,631 (+23.37%)
Querylist🕷️ The progressive PHP crawler framework! 优雅的渐进式PHP采集框架。
Stars: ✭ 2,392 (+80.94%)
OLX Scraper📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted regarding requested product and dumps to NOSQL MONGODB.
Stars: ✭ 15 (-98.87%)
scrapy-LBCAraignée LeBonCoin avec Scrapy et ElasticSearch
Stars: ✭ 14 (-98.94%)
scrapy facebookerCollection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-98.34%)
SkipperAn HTTP router and reverse proxy for service composition, including use cases like Kubernetes Ingress
Stars: ✭ 2,606 (+97.13%)
dijnet-botAz összes számlád még egy helyen :)
Stars: ✭ 17 (-98.71%)
Fp ServerFree proxy server, continuously crawling and providing proxies, based on Tornado and Scrapy. 免费代理服务器,基于Tornado和Scrapy,在本地搭建属于自己的代理池
Stars: ✭ 154 (-88.35%)
AutoscraperA Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+208.4%)
LinkedinLinkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (-76.63%)
Vaultswiss army knife for hackers
Stars: ✭ 346 (-73.83%)
Hquery.phpAn extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.
Stars: ✭ 295 (-77.69%)
Warta ScrapIndonesia Index News Crawler, including 10 online media
Stars: ✭ 57 (-95.69%)
FerretDeclarative web scraping
Stars: ✭ 4,837 (+265.89%)
Haipproxy💖 High available distributed ip proxy pool, powerd by Scrapy and Redis
Stars: ✭ 4,993 (+277.69%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (-64.9%)
GoscraperGolang pkg to quickly return a preview of a webpage (title/description/images)
Stars: ✭ 72 (-94.55%)
RcrawlerAn R web crawler and scraper
Stars: ✭ 274 (-79.27%)
CrawlerA high performance web crawler in Elixir.
Stars: ✭ 781 (-40.92%)
SpidrA versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (-50.38%)
Voyages Sncf ApiA scrapy spider that scraps times and prices from Voyages Sncf. It uses scrapyrt to provide an API interface.
Stars: ✭ 7 (-99.47%)
ScrapitScraping scripts for various websites.
Stars: ✭ 25 (-98.11%)
CrawlabDistributed web crawler admin platform for spiders management regardless of languages and frameworks. 分布式爬虫管理平台,支持任何语言和框架
Stars: ✭ 8,392 (+534.8%)
IcrawlerA multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (-52.42%)
Weibo terminator workflowUpdate Version of weibo_terminator, This is Workflow Version aim at Get Job Done!
Stars: ✭ 259 (-80.41%)
PacbotPacBot (Policy as Code Bot)
Stars: ✭ 1,017 (-23.07%)
Hproxyhproxy - Asynchronous IP proxy pool, aims to make getting proxy as convenient as possible.(异步爬虫代理池)
Stars: ✭ 62 (-95.31%)
Email ExtractorThe main functionality is to extract all the emails from one or several URLs - La funcionalidad principal es extraer todos los correos electrónicos de una o varias Url
Stars: ✭ 81 (-93.87%)
GomplateA flexible commandline tool for template rendering. Supports lots of local and remote datasources.
Stars: ✭ 1,270 (-3.93%)
EnseadaA Cloud native multi-package registry
Stars: ✭ 80 (-93.95%)