1958 open source projects that are alternatives to or similar to Scrapple

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (+25.65%)

Mutual labels: crawler, scrapy, scraping, beautifulsoup

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).

Stars: ✭ 15 (-96.77%)

Mutual labels: web-scraper, web-scraping, scrapy

Detect Cms

PHP Library for detecting CMS

Stars: ✭ 78 (-83.19%)

Mutual labels: scraping, web-scraping, web-scraper

Scrape Linkedin Selenium

`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.

Stars: ✭ 239 (-48.49%)

Mutual labels: scraping, web-scraping, web-scraper

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-96.77%)

Mutual labels: crawler, scraping, web-scraping

Faster Than Requests

Faster requests on Python 3

Stars: ✭ 639 (+37.72%)

Mutual labels: scrapy, web-scraping, web-scraper

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library based on .NET Core, with Entity Framework Core output. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-78.45%)

Mutual labels: crawler, scrapy, scraping

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+41.38%)

Mutual labels: crawler, web-scraping, web-scraper

Daftlistings

A library that enables programmatic interaction with daft.ie. Daft.ie has nationwide coverage and contains about 80% of the total available properties in Ireland.

Stars: ✭ 86 (-81.47%)

Mutual labels: web-scraping, web-scraper, beautifulsoup

Scrapy Craigslist

Web Scraping Craigslist's Engineering Jobs in NY with Scrapy

Stars: ✭ 54 (-88.36%)

Mutual labels: scrapy, web-scraping, web-scraper

Sqrape

Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)

Stars: ✭ 144 (-68.97%)

Mutual labels: scraping, web-scraping, css-selector

Scrapy Crawlera

Crawlera middleware for Scrapy

Stars: ✭ 281 (-39.44%)

Mutual labels: crawler, scrapy, scraping

top-github-scraper

Scrape top GitHub repositories and users based on keywords

Stars: ✭ 40 (-91.38%)

Mutual labels: scraping, web-scraper, web-scraping

Cascadia

Command-line CSS selector built on the Go cascadia package

Stars: ✭ 67 (-85.56%)

Mutual labels: web-scraping, web-scraper, css-selector

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+778.66%)

Mutual labels: crawler, scraping, web-scraping

Phpscraper

PHP Scraper - a highly opinionated web-interface for PHP

Stars: ✭ 148 (-68.1%)

Mutual labels: scraping, web-scraping, web-scraper

Arachnid

Powerful web scraping framework for Crystal

Stars: ✭ 68 (-85.34%)

Mutual labels: crawler, web-scraping, web-scraper

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (-40.3%)

Mutual labels: crawler, scraping, web-scraping

chopper

Chopper is a tool to extract elements from HTML by preserving ancestors and CSS rules

Stars: ✭ 22 (-95.26%)

Mutual labels: scraping, beautifulsoup

linkedin-scraper

Tool to scrape LinkedIn

Stars: ✭ 74 (-84.05%)

Mutual labels: scraping, beautifulsoup

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Stars: ✭ 711 (+53.23%)

Mutual labels: scraping, web-scraping

scrapy-fieldstats

A Scrapy extension to log items coverage when the spider shuts down

Stars: ✭ 17 (-96.34%)

Mutual labels: scraping, scrapy

BookingScraper

🌎 🏨 Scrape Booking.com 🏨 🌎

Stars: ✭ 68 (-85.34%)

Mutual labels: web-scraping, beautifulsoup

ioweb

Web Scraping Framework

Stars: ✭ 31 (-93.32%)

Mutual labels: scraping, web-scraping

html-table-extractor

Extract data from HTML tables

Stars: ✭ 74 (-84.05%)

Mutual labels: scraping, beautifulsoup

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (-84.7%)

Mutual labels: scraping, web-scraping

Euro2016 TerminalApp

⚽ Instantly find 🏆EURO 2016 live-streams & highlights, now a Web App!

Stars: ✭ 54 (-88.36%)

Mutual labels: scraping, beautifulsoup

PythonScrapyBasicSetup

Basic setup with random user agents and IP addresses for Python Scrapy Framework.

Stars: ✭ 57 (-87.72%)

Mutual labels: scraping, web-scraping

double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

Stars: ✭ 123 (-73.49%)

Mutual labels: scraping, scrapy

Ecommercecrawlers

Gitee repository: AJay13/ECommerceCrawlers. GitHub repository: DropsDevopsOrg/ECommerceCrawlers. Project demo platform: http://wechat.doonsec.com

Stars: ✭ 3,073 (+562.28%)

Mutual labels: crawler, scrapy

RARBG-scraper

With Selenium headless browsing and CAPTCHA solving

Stars: ✭ 38 (-91.81%)

Mutual labels: scraping, scrapy

scrapy-wayback-machine

A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

Stars: ✭ 92 (-80.17%)

Mutual labels: web-scraping, scrapy

selectorlib

A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them

Stars: ✭ 53 (-88.58%)

Mutual labels: scraping, web-scraping

Filesensor

Crawler-based dynamic detection tool for sensitive files

Stars: ✭ 227 (-51.08%)

Mutual labels: crawler, scrapy

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.

Stars: ✭ 32 (-93.1%)

Mutual labels: scraping, scrapy

Linkedin-Client

Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)

Stars: ✭ 42 (-90.95%)

Mutual labels: web-scraper, web-scraping

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor instances

Stars: ✭ 32 (-93.1%)

Mutual labels: scraping, scrapy

IMDB-Scraper

Scrapy project for scraping data from IMDB, with a movie dataset covering 58,623 movies.

Stars: ✭ 37 (-92.03%)

Mutual labels: web-scraping, scrapy

grailer

web scraping tool for grailed.com

Stars: ✭ 30 (-93.53%)

Mutual labels: web-scraping, beautifulsoup

scraping-ebay

Scraping Ebay's products using Scrapy Web Crawling Framework

Stars: ✭ 79 (-82.97%)

Mutual labels: web-scraping, scrapy

proxi

Proxy pool. Finds and checks proxies with rest api for querying results. Can find over 25k proxies in under 5 minutes.

Stars: ✭ 32 (-93.1%)

Mutual labels: scraping, scrapy

Scraper-Projects

🕸 List of mini projects that involve web scraping 🕸

Stars: ✭ 25 (-94.61%)

Mutual labels: scraping, beautifulsoup

Arachnid

Crawl all unique internal links found on a given website, and extract SEO related information - supports javascript based sites

Stars: ✭ 224 (-51.72%)

Mutual labels: crawler, scraping

Data-Wrangling-with-Python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Stars: ✭ 90 (-80.6%)

Mutual labels: web-scraping, beautifulsoup

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Stars: ✭ 38 (-91.81%)

Mutual labels: scraping, scrapy

restaurant-finder-featureReviews

Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).

Stars: ✭ 21 (-95.47%)

Mutual labels: web-scraping, scrapy

Crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Stars: ✭ 440 (-5.17%)

Mutual labels: crawler, scraping

MediumScraper

Scrapes Medium articles and provides audio versions (📑 to 🔊) using Django

Stars: ✭ 12 (-97.41%)

Mutual labels: web-scraper, beautifulsoup

TorScrapper

A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)

Stars: ✭ 24 (-94.83%)

Mutual labels: scraping, beautifulsoup

lostark-wait-notifier

🐤️ Lost Ark wait notifier

Stars: ✭ 38 (-91.81%)

Mutual labels: crawler, beautifulsoup

pythonSpider

🕷️ Some Python spiders built with BeautifulSoup or Scrapy

Stars: ✭ 28 (-93.97%)

Mutual labels: scrapy, beautifulsoup

ptt-web-crawler

Crawler for the web version of PTT

Stars: ✭ 20 (-95.69%)

Mutual labels: crawler, scrapy

memes-api

API for scraping common meme sites

Stars: ✭ 17 (-96.34%)

Mutual labels: scraping, scrapy

bots-zoo

No description or website provided.

Stars: ✭ 59 (-87.28%)

Mutual labels: crawler, scraping

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (-95.26%)

Mutual labels: scraping, scrapy

ARGUS

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

Stars: ✭ 68 (-85.34%)

Mutual labels: scraping, scrapy

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+579.74%)

Mutual labels: scraping, web-scraping

scrapy-zyte-smartproxy

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

Stars: ✭ 317 (-31.68%)

Mutual labels: scraping, scrapy

raspagem-de-dados-fatec

📓 Short course on web data scraping with Python, taught at the FATEC Jundiaí Technology Week

Stars: ✭ 22 (-95.26%)

Mutual labels: scraping, web-scraping

Php Curl Class

PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs

Stars: ✭ 2,903 (+525.65%)

Mutual labels: web-scraping, web-scraper
