1249 open source projects that are alternatives to or similar to robotstxt

Mailinglistscraper
A Python web scraper for public email lists.
Stars: ✭ 19 (-70.77%)
Mutual labels:  scraper, spider, webscraping
Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+1475.38%)
Mutual labels:  scraper, spider, webscraping
Polite
Be nice on the web
Stars: ✭ 253 (+289.23%)
Mutual labels:  scraper, r-package, webscraping
newspaperjs
News extraction and scraping. Article Parsing
Stars: ✭ 59 (-9.23%)
Mutual labels:  scraper, webscraping
getCRUCLdata
CRU CL v. 2.0 Climatology Client for R
Stars: ✭ 17 (-73.85%)
Mutual labels:  r-package, peer-reviewed
Spydan
A web spider for shodan.io without using the Developer API.
Stars: ✭ 30 (-53.85%)
Mutual labels:  scraper, spider
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+6172.31%)
Mutual labels:  scraper, webscraping
Java Spider
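A hands-on Java crawler framework built on top of the webmagic framework. It can crawl news content from Tencent, Sohu, Toutiao (integrated as a separate feature), and other sources; combined with Elasticsearch, it implements automated crawling and is already in production use.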
Stars: ✭ 276 (+324.62%)
Mutual labels:  scraper, spider
Gosint
OSINT Swiss Army Knife
Stars: ✭ 401 (+516.92%)
Mutual labels:  scraper, spider
Awesome Crawler
A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+7273.85%)
Mutual labels:  scraper, spider
Geziyor
Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+1816.92%)
Mutual labels:  scraper, spider
riem
✈️ ☀️ R package for accessing ASOS data via the Iowa Environment Mesonet ☁️ ✈️
Stars: ✭ 38 (-41.54%)
Mutual labels:  r-package, peer-reviewed
cyphr
Humane encryption
Stars: ✭ 91 (+40%)
Mutual labels:  r-package, peer-reviewed
aliexscrape
Get Aliexpress product details in JSON
Stars: ✭ 80 (+23.08%)
Mutual labels:  scraper, spider
roadoi
Use Unpaywall with R
Stars: ✭ 60 (-7.69%)
Mutual labels:  r-package, peer-reviewed
arachnod
High-performance crawler for Node.js
Stars: ✭ 17 (-73.85%)
Mutual labels:  scraper, spider
Instagram-Scraper-2021
Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).
Stars: ✭ 57 (-12.31%)
Mutual labels:  scraper, webscraping
Freshonions Torscraper
Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
Stars: ✭ 348 (+435.38%)
Mutual labels:  scraper, spider
newsemble
API for fetching data from news websites.
Stars: ✭ 42 (-35.38%)
Mutual labels:  scraper, webscraping
Huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
Stars: ✭ 33,694 (+51736.92%)
Mutual labels:  scraper, webscraping
Avbook
AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database; Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Stars: ✭ 8,133 (+12412.31%)
Mutual labels:  scraper, spider
Not Your Average Web Crawler
A web crawler (for bug hunting) that gathers more than you can imagine.
Stars: ✭ 107 (+64.62%)
Mutual labels:  scraper, spider
Goribot
[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Stars: ✭ 190 (+192.31%)
Mutual labels:  scraper, spider
Querylist
🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
Stars: ✭ 2,392 (+3580%)
Mutual labels:  scraper, spider
Colly
Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+23800%)
Mutual labels:  scraper, spider
rdflib
📦 High level wrapper around the redland package for common rdf applications
Stars: ✭ 47 (-27.69%)
Mutual labels:  r-package, peer-reviewed
geoparser
⛔ ARCHIVED ⛔ R package for the Geoparser.io API
Stars: ✭ 38 (-41.54%)
Mutual labels:  r-package, peer-reviewed
bittrex
An R client for the Bittrex cryptocurrency exchange
Stars: ✭ 26 (-60%)
Mutual labels:  r-package, peer-reviewed
NLMR
📦 R package to simulate neutral landscape models 🏔
Stars: ✭ 57 (-12.31%)
Mutual labels:  r-package, peer-reviewed
rrlite
R interface to rlite https://github.com/seppo0010/rlite
Stars: ✭ 16 (-75.38%)
Mutual labels:  r-package, peer-reviewed
OpenScraper
An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (+23.08%)
Mutual labels:  scraper, spider
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-20%)
Mutual labels:  scraper, spider
Mimo-Crawler
A web crawler that uses Firefox and JS injection to interact with webpages and crawl their content, written in Node.js.
Stars: ✭ 22 (-66.15%)
Mutual labels:  scraper, webscraping
ropenaq
⛔ ARCHIVED ⛔ Accesses Air Quality Data from the Open Data Platform OpenAQ
Stars: ✭ 69 (+6.15%)
Mutual labels:  r-package, peer-reviewed
bing-ip2hosts
bingip2hosts is a Bing.com web scraper that discovers websites by IP address
Stars: ✭ 99 (+52.31%)
Mutual labels:  scraper, webscraping
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-66.15%)
Mutual labels:  scraper, spider
Rcrawler
An R web crawler and scraper
Stars: ✭ 274 (+321.54%)
Mutual labels:  scraper, webscraping
metacritic api
PHP Metacritic API - Mirrored by my GitLab
Stars: ✭ 31 (-52.31%)
Mutual labels:  scraper, webscraping
Xcrawler
A fast, concise, and powerful PHP crawler framework
Stars: ✭ 344 (+429.23%)
Mutual labels:  scraper, spider
Xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Stars: ✭ 335 (+415.38%)
Mutual labels:  scraper, webscraping
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+576.92%)
Mutual labels:  scraper, spider
opencage
🌐 R package for the OpenCage API -- both forward and reverse geocoding 🌐
Stars: ✭ 82 (+26.15%)
Mutual labels:  r-package, peer-reviewed
Scrapit
Scraping scripts for various websites.
Stars: ✭ 25 (-61.54%)
Mutual labels:  scraper, spider
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (+4.62%)
Mutual labels:  scraper, webscraping
ant
A web crawler for Go
Stars: ✭ 264 (+306.15%)
Mutual labels:  scraper, spider
Crawler
A high performance web crawler in Elixir.
Stars: ✭ 781 (+1101.54%)
Mutual labels:  scraper, spider
Linkedin Profile Scraper
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (+163.08%)
Mutual labels:  scraper, spider
Youtube Projects
This repository contains all the code I use in my YouTube tutorials.
Stars: ✭ 144 (+121.54%)
Mutual labels:  scraper, webscraping
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+909.23%)
Mutual labels:  scraper, spider
scraper
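An image crawling and download tool that quickly crawls and downloads images, photos, and illustrations uploaded by designers and users on ZCOOL (https://www.zcool.com.cn/) and CNU (http://www.cnu.cc/).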
Stars: ✭ 64 (-1.54%)
Mutual labels:  scraper, spider
TikTokDownloader PyWebIO
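🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin/TikTok data scraping tool that supports API calls and online batch parsing and downloading.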
Stars: ✭ 919 (+1313.85%)
Mutual labels:  scraper, spider
nlrx
nlrx NetLogo R
Stars: ✭ 66 (+1.54%)
Mutual labels:  r-package, peer-reviewed
rdefra
rdefra: Interact with the UK AIR Pollution Database from DEFRA
Stars: ✭ 14 (-78.46%)
Mutual labels:  r-package, peer-reviewed
suppdata
Grabbing SUPPlementary DATA in R
Stars: ✭ 31 (-52.31%)
Mutual labels:  r-package, peer-reviewed
PostcodesioR
API wrapper around postcodes.io - free UK postcode lookup and geocoder
Stars: ✭ 36 (-44.62%)
Mutual labels:  r-package, peer-reviewed
weathercan
R package for downloading weather data from Environment and Climate Change Canada
Stars: ✭ 83 (+27.69%)
Mutual labels:  r-package, peer-reviewed
Fbcrawl
A Facebook crawler
Stars: ✭ 536 (+724.62%)
Mutual labels:  scraper, spider
crawler-chrome-extensions
Chrome extensions commonly used by crawler developers
Stars: ✭ 53 (-18.46%)
Mutual labels:  scraper, spider
blinkist-m4a-downloader
Grabs all of the audio files from all of the Blinkist books
Stars: ✭ 100 (+53.85%)
Mutual labels:  scraper, spider
medrxivr
Access and search medRxiv and bioRxiv preprint data
Stars: ✭ 34 (-47.69%)
Mutual labels:  r-package, peer-reviewed