Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+2544%)

Mutual labels: crawler, scraper

Pypergrabber

Fetches PubMed article IDs (PMIDs) from an email inbox, then crawls PubMed, Google Scholar, and Sci-Hub for the respective PDF files.

Stars: ✭ 14 (-72%)

Mutual labels: crawler, scraper

Youtube Projects

This repository contains all the code I use in my YouTube tutorials.

Stars: ✭ 144 (+188%)

Mutual labels: crawler, scraper

Weibo wordcloud

Scrapes Weibo data by keyword, then generates a word cloud.

Stars: ✭ 154 (+208%)

Mutual labels: crawler, weibo

arachnod

High performance crawler for Nodejs

Stars: ✭ 17 (-66%)

Mutual labels: crawler, scraper

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (+242%)

Mutual labels: crawler, scraper

Skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Stars: ✭ 231 (+362%)

Mutual labels: crawler, scraper

Annie

👾 Fast and simple video download library and CLI tool written in Go

Stars: ✭ 16,369 (+32638%)

Mutual labels: crawler, scraper

Polite

Be nice on the web

Stars: ✭ 253 (+406%)

Mutual labels: crawler, scraper

Scrapedin

LinkedIn Scraper (currently working 2020)

Stars: ✭ 453 (+806%)

Mutual labels: crawler, scraper

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (+786%)

Mutual labels: crawler, scraper

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (+9486%)

Mutual labels: crawler, scraper

Crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Stars: ✭ 440 (+780%)

Mutual labels: crawler, scraper

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+1212%)

Mutual labels: crawler, scraper

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+1174%)

Mutual labels: crawler, scraper

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+1478%)

Mutual labels: crawler, scraper

Weibo Analyst

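A toolbox for analyzing social media (Weibo) comments in Chinese. Features: 1. Weibo comment scraping; 2. word segmentation and keyword extraction; 3. word clouds and word-frequency statistics; 4. sentiment analysis; 5. topic clustering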

Stars: ✭ 430 (+760%)

Mutual labels: crawler, weibo

Social Scraper

A collection of scripts for crawling data from Vietnamese-language social networks and websites

Stars: ✭ 47 (-6%)

Mutual labels: crawler, scraper

Weibo Crawler

Sina Weibo crawler that uses Python to scrape Sina Weibo data and download Weibo images and videos

Stars: ✭ 1,019 (+1938%)

Mutual labels: crawler, weibo

Jd Autobuy

Python crawler for JD.com with automatic login and online flash-sale purchasing

Stars: ✭ 1,174 (+2248%)

Mutual labels: crawler, scraper

Avbook

Adult video (AV) management system with crawlers for avmoo, javbus, and javlibrary; an online Japanese adult video library and magnet-link database

Stars: ✭ 8,133 (+16166%)

Mutual labels: crawler, scraper

Weibo Album Crawler

Multithreaded crawler for full-size images from Sina Weibo albums.

Stars: ✭ 83 (+66%)

Mutual labels: crawler, weibo

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

Stars: ✭ 1,246 (+2392%)

Mutual labels: crawler, scraper

Google Play Scraper

Node.js scraper to get data from Google Play

Stars: ✭ 1,606 (+3112%)

Mutual labels: crawler, scraper

Gosint

OSINT Swiss Army Knife

Stars: ✭ 401 (+702%)

Mutual labels: crawler, scraper

Google Play Scraper

Google Play scraper for Python inspired by <facundoolano/google-play-scraper>

Stars: ✭ 143 (+186%)

Mutual labels: crawler, scraper

Onegram

This repository is no longer maintained.

Stars: ✭ 137 (+174%)

Mutual labels: crawler, scraper

Instagram Scraper

Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper,bot

Stars: ✭ 2,209 (+4318%)

Mutual labels: crawler, scraper

Newspaper

News, full-text, and article metadata extraction in Python 3.

Stars: ✭ 11,545 (+22990%)

Mutual labels: crawler, scraper

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (+280%)

Mutual labels: crawler, scraper

Instagram Crawler

Crawl Instagram photos, posts, and videos for download.

Stars: ✭ 178 (+256%)

Mutual labels: crawler, scraper

Querylist

🕷️ The progressive PHP crawler framework! 优雅的渐进式PHP采集框架。

Stars: ✭ 2,392 (+4684%)

Mutual labels: crawler, scraper

Weibo Topic Spider

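Weibo super-topic crawler with word-frequency statistics, sentiment analysis, and simple classification; newly added data scraped from the pneumonia super topic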

Stars: ✭ 128 (+156%)

Mutual labels: crawler, weibo

Ruiji.net

Crawler framework and distributed crawler extractor

Stars: ✭ 220 (+340%)

Mutual labels: crawler, scraper

Goose Parser

Universal scraping tool that lets you extract data using multiple environments

Stars: ✭ 211 (+322%)

Mutual labels: crawler, scraper

Weibopicdownloader

Crawler that downloads Weibo images without logging in

Stars: ✭ 247 (+394%)

Mutual labels: crawler, weibo

Media Scraper

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok

Stars: ✭ 206 (+312%)

Mutual labels: crawler, scraper

Xcrawler

A fast, concise, and powerful PHP crawler framework

Stars: ✭ 344 (+588%)

Mutual labels: crawler, scraper

Freshonions Torscraper

Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion

Stars: ✭ 348 (+596%)

Mutual labels: crawler, scraper

Sina Weibo Album Downloader

Multithreaded download of all HD photos from a user's Sina Weibo album.

Stars: ✭ 125 (+150%)

Mutual labels: crawler, weibo

Tianyancha

A pip-installable, batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform. Saves the business registration information of one or more specified companies to Excel or JSON in one step.

Stars: ✭ 206 (+312%)

Mutual labels: crawler, scraper

zeekEye

A Fast and Powerful Scraping and Web Crawling Framework.

Stars: ✭ 36 (-28%)

Mutual labels: weibo, weibo-spider

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-70%)

Mutual labels: crawler, scraper
