455 open source projects that are alternatives to or similar to ioweb

Apify Js
Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+10074.19%)
Mutual labels:  scraping, web-scraping, web-crawling
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (+119.35%)
Mutual labels:  scraping, webscraping, webcrawling
zcrawl
An open source web crawling platform
Stars: ✭ 21 (-32.26%)
Mutual labels:  scraping, web-crawling, webcrawling
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+13051.61%)
Mutual labels:  scraping, web-scraping, webscraping
chesf
CHeSF is the Chrome Headless Scraping Framework, very early alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-41.94%)
Mutual labels:  scraping, webscraping
gotor
This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.
Stars: ✭ 97 (+212.9%)
Mutual labels:  webscraping, webcrawling
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (+119.35%)
Mutual labels:  web-scraping, webscraping
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+2193.55%)
Mutual labels:  scraping, web-scraping
Humanoid
Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
Stars: ✭ 88 (+183.87%)
Mutual labels:  scraping, web-scraping
extractnet
A Dragnet that also extracts author, headline, date, and keywords from context
Stars: ✭ 52 (+67.74%)
Mutual labels:  web-scraping, webscraping
Stock-Fundamental-data-scraping-and-analysis
Project on building a web crawler to collect the fundamentals of the stock and review their performance in one go
Stars: ✭ 40 (+29.03%)
Mutual labels:  web-scraping, webcrawling
raspagem-de-dados-fatec
📓 Short course on web data scraping with Python, taught at the Technology Week of FATEC Jundiaí
Stars: ✭ 22 (-29.03%)
Mutual labels:  scraping, web-scraping
Gazpacho
🥫 The simple, fast, and modern web scraping library
Stars: ✭ 525 (+1593.55%)
Mutual labels:  scraping, webscraping
browser-automation-api
Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other things, such as capture a screenshot, generate a PDF, extract content, or execute custom Puppeteer or Playwright functions.
Stars: ✭ 24 (-22.58%)
Mutual labels:  scraping, webscraping
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-51.61%)
Mutual labels:  web-scraping, web-crawling
R Web Scraping Cheat Sheet
Guide, reference and cheatsheet on web scraping using rvest, httr and RSelenium.
Stars: ✭ 207 (+567.74%)
Mutual labels:  web-scraping, webscraping
Configs
Public, free to use, repository with diggers configs for scraping / extracting data from various e-commerce websites and online stores
Stars: ✭ 37 (+19.35%)
Mutual labels:  scraping, webscraping
newspaperjs
News extraction and scraping. Article Parsing
Stars: ✭ 59 (+90.32%)
Mutual labels:  webscraping, webcrawling
selectorlib
A library to read a YML file with XPath or CSS selectors and extract data from HTML pages using them
Stars: ✭ 53 (+70.97%)
Mutual labels:  scraping, web-scraping
browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+129.03%)
Mutual labels:  scraping, web-scraping
anime-scraper
[partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (-32.26%)
Mutual labels:  scraping, webscraping
top-github-scraper
Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (+29.03%)
Mutual labels:  scraping, web-scraping
PythonScrapyBasicSetup
Basic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (+83.87%)
Mutual labels:  scraping, web-scraping
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-51.61%)
Mutual labels:  scraping, web-scraping
schedule-tweet
Schedules tweets using TweetDeck
Stars: ✭ 14 (-54.84%)
Mutual labels:  scraping, webscraping
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+793.55%)
Mutual labels:  scraping, web-scraping
Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+3203.23%)
Mutual labels:  scraping, webscraping
Detect Cms
PHP Library for detecting CMS
Stars: ✭ 78 (+151.61%)
Mutual labels:  scraping, web-scraping
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library based on .NET Core, with Entity Framework Core output. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (+222.58%)
Mutual labels:  scraping, webscraping
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+1396.77%)
Mutual labels:  scraping, web-scraping
Instago
Download/access photos, videos, stories, story highlights, postlives, following and followers of Instagram
Stars: ✭ 59 (+90.32%)
Mutual labels:  web-scraping, webscraping
Phpscraper
PHP Scraper - a highly opinionated web-interface for PHP
Stars: ✭ 148 (+377.42%)
Mutual labels:  scraping, web-scraping
Sqrape
Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (+364.52%)
Mutual labels:  scraping, web-scraping
Scrape Linkedin Selenium
`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
Stars: ✭ 239 (+670.97%)
Mutual labels:  scraping, web-scraping
codepen-puppeteer
Use Puppeteer to download pens from Codepen.io as single html pages
Stars: ✭ 22 (-29.03%)
Mutual labels:  web-scraping
medium-scrapper
Scrape Medium articles using tags.
Stars: ✭ 34 (+9.68%)
Mutual labels:  webscraping
web-poet
Web scraping Page Objects core library
Stars: ✭ 67 (+116.13%)
Mutual labels:  web-scraping
crawler-chrome-extensions
Chrome extensions commonly used by crawler developers
Stars: ✭ 53 (+70.97%)
Mutual labels:  scraping
Crypto-Webminer
Use Crypto Webminer JavaScript miner on various Cryptonight | CN-Lite | CN-Fast | CN-Fast2 | CN-Pico | CN-RWZ | CN-UPX2 | CN-Half | CN-Heavy | CN-Saber (BitTube) | Argon2id - Chukwa Stratum Pools
Stars: ✭ 166 (+435.48%)
Mutual labels:  webmining
scrapy-wayback-machine
A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (+196.77%)
Mutual labels:  web-scraping
core
The complete web scraping toolkit for PHP.
Stars: ✭ 1,110 (+3480.65%)
Mutual labels:  web-scraping
scrape-github-trending
Tutorial for web scraping / crawling with Node.js.
Stars: ✭ 42 (+35.48%)
Mutual labels:  scraping
Architeuthis
MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.
Stars: ✭ 35 (+12.9%)
Mutual labels:  scraping
google scraper live view
Application for extracting large amounts of data from the Google search results page
Stars: ✭ 17 (-45.16%)
Mutual labels:  webscraping
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+70.97%)
Mutual labels:  scraping
4cat
The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.
Stars: ✭ 144 (+364.52%)
Mutual labels:  scraping
info-bot
🤖 A Versatile Telegram Bot
Stars: ✭ 37 (+19.35%)
Mutual labels:  scraping
super-anime-downloader
A program which takes an Anime name or URL and downloads the specified range of episodes.
Stars: ✭ 26 (-16.13%)
Mutual labels:  webscraping
crawlzone
Crawlzone is a fast asynchronous internet crawling framework for PHP.
Stars: ✭ 70 (+125.81%)
Mutual labels:  web-scraping
socials
👨‍👩‍👦 Social account detection and extraction in Python, e.g. for crawling/scraping.
Stars: ✭ 37 (+19.35%)
Mutual labels:  scraping
Goirate
Pillaging the seven seas for torrents, pieces of eight and other bounty.
Stars: ✭ 20 (-35.48%)
Mutual labels:  scraping
fBrowser
Helpful Selenium functions to make web-scraping easier and faster
Stars: ✭ 16 (-48.39%)
Mutual labels:  webscraping
2017-summer-workshop
Exercises, data, and more for our 2017 summer workshop (funded by the Estes Fund and in partnership with Project Jupyter and Berkeley's D-Lab)
Stars: ✭ 33 (+6.45%)
Mutual labels:  web-scraping
linkedin-scraper
Tool to scrape LinkedIn
Stars: ✭ 74 (+138.71%)
Mutual labels:  scraping
chopper
Chopper is a tool to extract elements from HTML by preserving ancestors and CSS rules
Stars: ✭ 22 (-29.03%)
Mutual labels:  scraping
scrapers
scrapers for building your own image databases
Stars: ✭ 46 (+48.39%)
Mutual labels:  scraping
google-search-results-nodejs
SerpApi client library for Node.js. Previously: Google Search Results Node.js.
Stars: ✭ 46 (+48.39%)
Mutual labels:  webscraping
readability-cli
A CLI for Mozilla Readability. Get clean, uncluttered, ready-to-read HTML from any webpage!
Stars: ✭ 41 (+32.26%)
Mutual labels:  scraping
shorter.recipes
A website dedicated to making recipes from any website easy to read.
Stars: ✭ 27 (-12.9%)
Mutual labels:  scraping
Raspagem-de-dados-para-iniciantes
Data scraping for beginners using Scrapy and other basic libraries
Stars: ✭ 113 (+264.52%)
Mutual labels:  webcrawling
1-60 of 455 similar projects