275 open source projects that are alternatives to or similar to Memorious

zcrawl
An open source web crawling platform
Stars: ✭ 21 (-91.53%)
Mutual labels:  scraping, crawling
Linkedin Profile Scraper
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-31.05%)
Mutual labels:  scraping, crawling
scrape-github-trending
Tutorial for web scraping / crawling with Node.js.
Stars: ✭ 42 (-83.06%)
Mutual labels:  scraping, crawling
Antch
Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Stars: ✭ 198 (-20.16%)
Mutual labels:  scraping, crawling
Apify Js
Apify SDK: The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+1171.77%)
Mutual labels:  scraping, crawling
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-50.4%)
Mutual labels:  scraping, crawling
Colly
Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+6164.11%)
Mutual labels:  scraping, crawling
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+77.42%)
Mutual labels:  scraping, crawling
go-scrapy
Web crawling and scraping framework for Golang
Stars: ✭ 17 (-93.15%)
Mutual labels:  scraping, crawling
bots-zoo
No description or website provided.
Stars: ✭ 59 (-76.21%)
Mutual labels:  scraping, crawling
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-78.63%)
Mutual labels:  scraping, crawling
Grawler
Grawler is a tool written in PHP which comes with a web interface that automates the task of using google dorks, scrapes the results, and stores them in a file.
Stars: ✭ 98 (-60.48%)
Mutual labels:  scraping, crawling
Scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+16973.79%)
Mutual labels:  scraping, crawling
Spidermon
Scrapy extension for monitoring spider execution.
Stars: ✭ 309 (+24.6%)
Mutual labels:  scraping, crawling
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but can be extended to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-59.68%)
Mutual labels:  scraping, crawling
feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
Stars: ✭ 23 (-90.73%)
Mutual labels:  scraping, crawling
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-72.58%)
Mutual labels:  scraping, crawling
Sasila
A flexible, friendly crawler framework
Stars: ✭ 286 (+15.32%)
Mutual labels:  scraping, crawling
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-93.15%)
Mutual labels:  scraping, crawling
Awesome Puppeteer
A curated list of awesome puppeteer resources.
Stars: ✭ 1,728 (+596.77%)
Mutual labels:  scraping, crawling
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-79.44%)
Mutual labels:  scraping, crawling
pomp
Screen scraping and web crawling framework
Stars: ✭ 61 (-75.4%)
Mutual labels:  scraping, crawling
scrapy-distributed
A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (-84.68%)
Mutual labels:  scraping, crawling
Dataflowkit
Extract structured data from web sites. Web sites scraping.
Stars: ✭ 456 (+83.87%)
Mutual labels:  scraping, crawling
Ferret
Declarative web scraping
Stars: ✭ 4,837 (+1850.4%)
Mutual labels:  scraping, crawling
crawling-framework
Easily crawl news portals or blog sites using Storm Crawler.
Stars: ✭ 22 (-91.13%)
Mutual labels:  scraping, crawling
Easy Scraping Tutorial
Simple but useful Python web scraping tutorial code.
Stars: ✭ 583 (+135.08%)
Mutual labels:  scraping, crawling
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-79.03%)
Mutual labels:  scraping, crawling
socials
👨‍👩‍👦 Social account detection and extraction in Python, e.g. for crawling/scraping.
Stars: ✭ 37 (-85.08%)
Mutual labels:  scraping, crawling
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+11.69%)
Mutual labels:  scraping, crawling
Headless Chrome Crawler
Distributed crawler powered by Headless Chrome
Stars: ✭ 5,129 (+1968.15%)
Mutual labels:  scraping, crawling
Lulu
[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+218.15%)
Mutual labels:  scraping, crawling
Crawler
Go process used to crawl websites
Stars: ✭ 147 (-40.73%)
Mutual labels:  crawling
Idt
Image Dataset Tool (idt) is a cli tool designed to make the otherwise repetitive and slow task of creating image datasets into a fast and intuitive process.
Stars: ✭ 202 (-18.55%)
Mutual labels:  scraping
Fantasy Basketball
Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for Draft Kings with genetic algorithm. Capstone Project for Machine Learning Engineer Nanodegree by Udacity.
Stars: ✭ 146 (-41.13%)
Mutual labels:  scraping
Search Engine Parser
Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Stars: ✭ 216 (-12.9%)
Mutual labels:  scraping
Sqrape
Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (-41.94%)
Mutual labels:  scraping
Embed
Get info from any web service or page
Stars: ✭ 1,808 (+629.03%)
Mutual labels:  scraping
Massivedl
Download a large list of files concurrently
Stars: ✭ 141 (-43.15%)
Mutual labels:  crawling
Jsonframe Cheerio
simple multi-level scraper json input/output for Cheerio
Stars: ✭ 196 (-20.97%)
Mutual labels:  scraping
Instagram Bot
An Instagram bot developed using the Selenium Framework
Stars: ✭ 138 (-44.35%)
Mutual labels:  crawling
Educative.io Downloader
📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.
Stars: ✭ 139 (-43.95%)
Mutual labels:  scraping
Scrape Linkedin Selenium
`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (-3.63%)
Mutual labels:  scraping
Goose Parser
Universal scraping tool, which allows you to extract data using multiple environments
Stars: ✭ 211 (-14.92%)
Mutual labels:  scraping
Juriscraper
An API to scrape American court websites for metadata.
Stars: ✭ 194 (-21.77%)
Mutual labels:  scraping
Search Engine Google
🕷 Google client for SERPS
Stars: ✭ 138 (-44.35%)
Mutual labels:  scraping
Udemycoursegrabber
Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you the download link of the latest one, while still letting you see the other ones as well!
Stars: ✭ 137 (-44.76%)
Mutual labels:  scraping
Nutch
Apache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+818.15%)
Mutual labels:  crawling
Newspaper
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Stars: ✭ 11,545 (+4555.24%)
Mutual labels:  crawling
Torchbear
🔥🐻 The Speakeasy Scripting Engine Which Combines Speed, Safety, and Simplicity
Stars: ✭ 128 (-48.39%)
Mutual labels:  scraping
Anime Dl
Anime-dl is a command-line program to download anime from CrunchyRoll and Funimation.
Stars: ✭ 190 (-23.39%)
Mutual labels:  scraping
Bhban rpa
Example code for the book "6개월 치 업무를 하루 만에 끝내는 업무 자동화" (roughly, "Work automation that finishes six months of work in a day"; 생능출판사, 2020). The examples are written for people who have never learned Python and cover a range of work-automation topics, from Excel to design, macros, and crawling.
Stars: ✭ 124 (-50%)
Mutual labels:  crawling
Scan For Webcams
scan for webcams on the internet
Stars: ✭ 128 (-48.39%)
Mutual labels:  scraping
N2h4
A tool for collecting Naver news
Stars: ✭ 177 (-28.63%)
Mutual labels:  crawling
Corpuscrawler
Crawler for linguistic corpora
Stars: ✭ 127 (-48.79%)
Mutual labels:  crawling
Jsoup Annotations
Jsoup Annotations POJO
Stars: ✭ 242 (-2.42%)
Mutual labels:  scraping
Cdp4j
cdp4j - Chrome DevTools Protocol for Java
Stars: ✭ 232 (-6.45%)
Mutual labels:  crawling
Thal
Getting started with Puppeteer and Chrome Headless for Web Scraping
Stars: ✭ 2,345 (+845.56%)
Mutual labels:  scraping
Squidwarc
Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
Stars: ✭ 125 (-49.6%)
Mutual labels:  crawling
Htmlsql
htmlSQL is an experimental PHP library that lets you access HTML values with an SQL-like syntax.
Stars: ✭ 120 (-51.61%)
Mutual labels:  scraping
1-60 of 275 similar projects