Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Stars: ✭ 335 (-2.62%)

Mutual labels: scraper

Federal-Parliament-Scraper

A scraper for obtaining information on the workings of the Belgian Federal Parliament.

Stars: ✭ 18 (-94.77%)

Mutual labels: scraper

feaplat

Crawler management system with cluster support and elastic scaling. Runs feapder, scrapy, selenium, playwright, and other frameworks and scripts.

Stars: ✭ 42 (-87.79%)

Mutual labels: spider

fb-page-chat-download

Python script to download messages from a Facebook page to a CSV file

Stars: ✭ 51 (-85.17%)

Mutual labels: scraper

facebook-discussion-tk

A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.

Stars: ✭ 33 (-90.41%)

Mutual labels: scraper

zhihu

Search your Zhihu favorites: browse the contents of all your collections at a glance and run full-text searches.

Stars: ✭ 39 (-88.66%)

Mutual labels: spider

DeadPool

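A crawler application built on celery as its core framework. Crawler tasks can be added flexibly, and crawls of multiple sites can run at the same time. All components natively support large-scale concurrency and distribution, which, combined with celery's native distributed invocation, enables large-scale concurrent crawling.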

Stars: ✭ 38 (-88.95%)

Mutual labels: spider

dorkscout

DorkScout - Golang tool to automate Google dork scans against the entire internet or specific targets

Stars: ✭ 189 (-45.06%)

Mutual labels: scraper

scrapy-admin

A django admin site for scrapy

Stars: ✭ 44 (-87.21%)

Mutual labels: spider

LinkedIn-Scraper

A LinkedIn Scraper to scrape up to 10k LinkedIn profiles from company profile links and save their e-mail addresses if available!

Stars: ✭ 62 (-81.98%)

Mutual labels: scraper

Dotnetspider

DotnetSpider, a .NET Standard web crawling library. It is a lightweight, efficient, and fast high-level web crawling & scraping framework

Stars: ✭ 3,233 (+839.83%)

Mutual labels: crawler

4scanner

Continuously search imageboard threads for images/webms and download them

Stars: ✭ 103 (-70.06%)

Mutual labels: scraper

crawlerdetect

Golang module to detect bots and crawlers via the user agent

Stars: ✭ 22 (-93.6%)

Mutual labels: spider

trainline-python

Non-official Python wrapper and CLI tool for Trainline

Stars: ✭ 41 (-88.08%)

Mutual labels: scraper

evine

Interactive CLI Web Crawler

Stars: ✭ 140 (-59.3%)

Mutual labels: scraper

sede

Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data

Stars: ✭ 83 (-75.87%)

Mutual labels: spider

Supercrawler

A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.

Stars: ✭ 306 (-11.05%)

Mutual labels: crawler

TwEater

A Python Bot for Scraping Conversations from Twitter

Stars: ✭ 16 (-95.35%)

Mutual labels: spider

covid-19

Current and historical coronavirus covid-19 confirmed, recovered, deaths and active case counts segmented by country and region. Includes csv, json and sqlite data along with an interactive website explorer.

Stars: ✭ 15 (-95.64%)

Mutual labels: scraper

python web scraping

Web scraping using python, requests and selenium

Stars: ✭ 40 (-88.37%)

Mutual labels: scraper

spider

Python crawler (amazon, confluence ...)

Stars: ✭ 21 (-93.9%)

Mutual labels: spider

kick-off-web-scraping-python-selenium-beautifulsoup

A tutorial-based introduction to web scraping with Python.

Stars: ✭ 18 (-94.77%)

Mutual labels: scraper

avbot-charts

Aviation charts

Stars: ✭ 20 (-94.19%)

Mutual labels: scraper

PttImageSpider

PTT image downloader (fetches the images of an entire board and uses article titles as folder names) (uses Scrapy)

Stars: ✭ 16 (-95.35%)

Mutual labels: spider

documentDownloader

Download documents from book118 for free

Stars: ✭ 72 (-79.07%)

Mutual labels: spider

bet365-websocket-crawler

bet365 bot: live match score data and live odds from bet365

Stars: ✭ 67 (-80.52%)

Mutual labels: spider

Alltheplaces

A set of spiders and scrapers to extract location information from places that post their location on the internet.

Stars: ✭ 277 (-19.48%)

Mutual labels: spider

Novel-crawler

A novel crawler written in Python

Stars: ✭ 75 (-78.2%)

Mutual labels: spider

V2EX Spider

V2EX crawler

Stars: ✭ 21 (-93.9%)

Mutual labels: spider

html-query

A fluent and functional approach to querying HTML

Stars: ✭ 48 (-86.05%)

Mutual labels: crawler

pyitau

Unofficial client to access your Itaú bank data

Stars: ✭ 28 (-91.86%)

Mutual labels: scraper

Scraper-Projects

🕸 List of mini projects that involve web scraping 🕸

Stars: ✭ 25 (-92.73%)

Mutual labels: scraper

spider-mzitu

Meizitu (妹子图)

Stars: ✭ 13 (-96.22%)

Mutual labels: spider

wishlist

Read an Amazon wishlist programmatically with Python

Stars: ✭ 44 (-87.21%)

Mutual labels: scraper

pinterest-web-scraper

Scraping Visually Similar Images from Pinterest

Stars: ✭ 26 (-92.44%)

Mutual labels: scraper

metacritic api

PHP Metacritic API - Mirrored by my GitLab

Stars: ✭ 31 (-90.99%)

Mutual labels: scraper

SpiderCard

Spider Solitaire for Mac

Stars: ✭ 29 (-91.57%)

Mutual labels: spider

snapcrawl

Crawl a website and take screenshots

Stars: ✭ 37 (-89.24%)

Mutual labels: crawler

glyphhanger

Your web font utility belt. It can subset web fonts. It can find unicode-ranges for you automatically. It makes julienne fries.

Stars: ✭ 422 (+22.67%)

Mutual labels: spider

araneid

A site-group system (spider pool system) developed in Golang

Stars: ✭ 25 (-92.73%)

Mutual labels: spider

trawler

scraper for facebook, gab, google and tiktok

Stars: ✭ 20 (-94.19%)

Mutual labels: scraper

Coinsta

A Python package for acquiring both historical and current data of cryptocurrencies

Stars: ✭ 47 (-86.34%)

Mutual labels: scraper

Cryptocmd

Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.

Stars: ✭ 299 (-13.08%)

Mutual labels: scraper

Sitemap Generator

Easily create XML sitemaps for your website.

Stars: ✭ 273 (-20.64%)

Mutual labels: crawler

TumblTwo

TumblTwo, an Improved Fork of TumblOne, a Tumblr Downloader.

Stars: ✭ 57 (-83.43%)

Mutual labels: crawler

ammobin-client

client for https://ammobin.ca

Stars: ✭ 18 (-94.77%)

Mutual labels: scraper

imdb-scraper

🎬 An attempt at the most complete IMDb API

Stars: ✭ 24 (-93.02%)

Mutual labels: scraper

Z-Spider

Tips and examples for crawler development

Stars: ✭ 33 (-90.41%)

Mutual labels: spider

douyin-api

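Douyin interface, Douyin API, Douyin data crawler, Douyin live-stream data, Douyin live-stream API, Douyin video API, Douyin crawler, Douyin watermark removal, Douyin video download, Douyin video parsing, Douyin live-stream monitoring, Douyin data collection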

Stars: ✭ 41 (-88.08%)

Mutual labels: spider

naver news search scraper

Python code that collects Naver News articles and comments by search keyword

Stars: ✭ 38 (-88.95%)

Mutual labels: scraper

AzurLaneWikiScrapers

A console application that can scrape the Azur Lane wiki and export the data to JSON files

Stars: ✭ 12 (-96.51%)

Mutual labels: scraper
