355 open source projects that are alternatives to or similar to scrapy-wayback-machine

Scrapy Fake Useragent
Random User-Agent middleware based on fake-useragent
Stars: ✭ 520 (+465.22%)
Mutual labels:  web-scraping, scrapy
City Scrapers
Scrape, standardize and share public meetings from local government websites
Stars: ✭ 220 (+139.13%)
Mutual labels:  web-scraping, scrapy
Scrapyd Cluster On Heroku
Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.
Stars: ✭ 106 (+15.22%)
Mutual labels:  web-scraping, scrapy
Scrapy Training
Scrapy Training companion code
Stars: ✭ 157 (+70.65%)
Mutual labels:  web-scraping, scrapy
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-81.52%)
Mutual labels:  scrapy, scrapy-extension
scraping-ebay
Scraping Ebay's products using Scrapy Web Crawling Framework
Stars: ✭ 79 (-14.13%)
Mutual labels:  web-scraping, scrapy
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+404.35%)
Mutual labels:  web-scraping, scrapy
IMDB-Scraper
Scrapy project for scraping data from IMDB with Movie Dataset including 58,623 movies' data.
Stars: ✭ 37 (-59.78%)
Mutual labels:  web-scraping, scrapy
Juno crawler
Scrapy crawler to collect data on the back catalog of songs listed for sale.
Stars: ✭ 150 (+63.04%)
Mutual labels:  web-scraping, scrapy
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for the requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-83.7%)
Mutual labels:  web-scraping, scrapy
Netflix Clone
Netflix-like full-stack application with an SPA client and a backend implemented in a service-oriented architecture
Stars: ✭ 156 (+69.57%)
Mutual labels:  web-scraping, scrapy
Faster Than Requests
Faster requests on Python 3
Stars: ✭ 639 (+594.57%)
Mutual labels:  web-scraping, scrapy
restaurant-finder-featureReviews
Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Stars: ✭ 21 (-77.17%)
Mutual labels:  web-scraping, scrapy
scrapy plus
Essential toolkit of commonly used utilities for crawling with Scrapy
Stars: ✭ 18 (-80.43%)
Mutual labels:  scrapy, scrapy-extension
Scrapy Craigslist
Web Scraping Craigslist's Engineering Jobs in NY with Scrapy
Stars: ✭ 54 (-41.3%)
Mutual labels:  web-scraping, scrapy
wayback
⏪ Tools to Work with the Various Internet Archive Wayback Machine APIs
Stars: ✭ 52 (-43.48%)
Mutual labels:  web-scraping, wayback-machine
Quora Api
An unofficial API for Quora.
Stars: ✭ 250 (+171.74%)
Mutual labels:  web-scraping
scrapy helper
Dynamically configurable crawler
Stars: ✭ 84 (-8.7%)
Mutual labels:  scrapy
Wayback Machine Scraper
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 230 (+150%)
Mutual labels:  web-scraping
Docbao
Tool for scanning and analyzing keywords from Vietnamese online news sites
Stars: ✭ 230 (+150%)
Mutual labels:  web-scraping
vietnam-ecommerce-crawler
Crawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs
Stars: ✭ 28 (-69.57%)
Mutual labels:  scrapy
lopez
Crawling and scraping the Web for fun and profit
Stars: ✭ 20 (-78.26%)
Mutual labels:  web-scraping
Short Jokes Dataset
Python scripts for building 'Short Jokes' dataset, featured on Kaggle
Stars: ✭ 215 (+133.7%)
Mutual labels:  web-scraping
Trump Lies
Tutorial: Web scraping in Python with Beautiful Soup
Stars: ✭ 201 (+118.48%)
Mutual labels:  web-scraping
PythonScrapyBasicSetup
Basic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (-38.04%)
Mutual labels:  web-scraping
Twitter Intelligence
Twitter Intelligence is an OSINT project that performs tracking and analysis of Twitter
Stars: ✭ 179 (+94.57%)
Mutual labels:  web-scraping
UofT-Timetable-Generator
A web application that generates timetables for university students at the University of Toronto
Stars: ✭ 34 (-63.04%)
Mutual labels:  web-scraping
crawlzone
Crawlzone is a fast asynchronous internet crawling framework for PHP.
Stars: ✭ 70 (-23.91%)
Mutual labels:  web-scraping
Scrape Linkedin Selenium
`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (+159.78%)
Mutual labels:  web-scraping
scrapy-LBC
LeBonCoin spider with Scrapy and ElasticSearch
Stars: ✭ 14 (-84.78%)
Mutual labels:  scrapy
2017-summer-workshop
Exercises, data, and more for our 2017 summer workshop (funded by the Estes Fund and in partnership with Project Jupyter and Berkeley's D-Lab)
Stars: ✭ 33 (-64.13%)
Mutual labels:  web-scraping
Web Database Analytics
Web scraping and related analytics using Python tools
Stars: ✭ 175 (+90.22%)
Mutual labels:  web-scraping
Selenium Python Helium
Selenium-python but lighter: Helium is the best Python library for web automation.
Stars: ✭ 2,732 (+2869.57%)
Mutual labels:  web-scraping
cinedantan
🎥 🍿 Streaming Public domain movies
Stars: ✭ 52 (-43.48%)
Mutual labels:  archive-dot-org
R Web Scraping Cheat Sheet
Guide, reference and cheatsheet on web scraping using rvest, httr and Rselenium.
Stars: ✭ 207 (+125%)
Mutual labels:  web-scraping
scrapy-rotated-proxy
A scrapy middleware to use rotated proxy ip list.
Stars: ✭ 22 (-76.09%)
Mutual labels:  scrapy
Bet On Sibyl
Machine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)
Stars: ✭ 190 (+106.52%)
Mutual labels:  web-scraping
crawler
Collection of Python crawler projects
Stars: ✭ 29 (-68.48%)
Mutual labels:  scrapy
Grab
Web Scraping Framework
Stars: ✭ 2,147 (+2233.7%)
Mutual labels:  web-scraping
Hi
A Programming language for Web Scraping
Stars: ✭ 14 (-84.78%)
Mutual labels:  web-scraping
ArticleSpider
Crawling zhihu, jobbole, lagou by Scrapy, and using Elasticsearch+Django to build a Search Engine website --- README_zh.md (including: implementation roadmap, distributed-crawler and coping with anti-crawling strategies).
Stars: ✭ 34 (-63.04%)
Mutual labels:  scrapy
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-80.43%)
Mutual labels:  web-scraping
codepen-puppeteer
Use Puppeteer to download pens from Codepen.io as single html pages
Stars: ✭ 22 (-76.09%)
Mutual labels:  web-scraping
vandal
Navigator for Web Archive
Stars: ✭ 146 (+58.7%)
Mutual labels:  wayback-machine
Learnpythonforresearch
This repository provides everything you need to get started with Python for (social science) research.
Stars: ✭ 163 (+77.17%)
Mutual labels:  web-scraping
Scrapy-tripadvisor-reviews
Using scrapy to scrape tripadvisor in order to get users' reviews.
Stars: ✭ 24 (-73.91%)
Mutual labels:  scrapy
Web Scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, SHFE and news data crawlers on BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Stars: ✭ 153 (+66.3%)
Mutual labels:  web-scraping
asyncpy
A lightweight asynchronous coroutine web crawler framework built with asyncio and aiohttp
Stars: ✭ 86 (-6.52%)
Mutual labels:  scrapy
arche
Analyze scraped data
Stars: ✭ 49 (-46.74%)
Mutual labels:  scrapy
Helena
A Chrome extension for writing custom web scraping programs and web automation programs. Just demonstrate how to collect the first row of data, then let the extension write the program for collecting all rows.
Stars: ✭ 151 (+64.13%)
Mutual labels:  web-scraping
Phpscraper
PHP Scraper - a highly opinionated web interface for PHP
Stars: ✭ 148 (+60.87%)
Mutual labels:  web-scraping
lgcrawl
Crawl job listings across the entire Lagou site with Python + Scrapy + Splash
Stars: ✭ 22 (-76.09%)
Mutual labels:  scrapy
Sqrape
Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (+56.52%)
Mutual labels:  web-scraping
Zillow
Zillow Scraper for Python using Selenium
Stars: ✭ 141 (+53.26%)
Mutual labels:  web-scraping
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (+33.7%)
Mutual labels:  scrapy
web-poet
Web scraping Page Objects core library
Stars: ✭ 67 (-27.17%)
Mutual labels:  web-scraping
pagser
Pagser is a simple, extensible, configurable library that parses and deserializes HTML pages into structs, based on goquery and struct tags, for Golang crawlers
Stars: ✭ 82 (-10.87%)
Mutual labels:  scrapy
Html Metadata
MetaData html scraper and parser for Node.js (supports Promises and callback style)
Stars: ✭ 129 (+40.22%)
Mutual labels:  web-scraping
Actor Page Analyzer
Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
Stars: ✭ 124 (+34.78%)
Mutual labels:  web-scraping
concurrent-web-scraping
Building a Concurrent Web Scraper with Python and Selenium
Stars: ✭ 28 (-69.57%)
Mutual labels:  web-scraping
1-60 of 355 similar projects