Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+2045.79%)
Linkedin: Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (+62.63%)
Captcha-Tools: All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!
Stars: ✭ 23 (-87.89%)
Katana: A Python tool for Google hacking
Stars: ✭ 355 (+86.84%)
Instagram-to-discord: Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (-40.53%)
wget-lua: Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-72.63%)
bots-zoo: No description or website provided.
Stars: ✭ 59 (-68.95%)
facebook-discussion-tk: A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
Stars: ✭ 33 (-82.63%)
Pypatent: Search for and retrieve US Patent and Trademark Office Patent Data
Stars: ✭ 31 (-83.68%)
Api Store: Contains all the public APIs listed in Phantombuster's API store. Pull requests welcome!
Stars: ✭ 69 (-63.68%)
MalScraper: Scrape everything you can from MyAnimeList.net
Stars: ✭ 132 (-30.53%)
scrapman: Retrieve real (with JavaScript executed) HTML code from a URL, ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (-88.95%)
proxycrawl-python: ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-73.16%)
TorScrapper: A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (-87.37%)
Zeiver: A Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-92.63%)
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+1560%)
Scraper-Projects: 🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (-86.84%)
Nickjs: Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Stars: ✭ 494 (+160%)
Oj: Tools for various online judges. Downloading sample cases, generating additional test cases, testing your code, and submitting it.
Stars: ✭ 517 (+172.11%)
Anitop: Anitop is an unofficial simple API for the https://anitrendz.net/ site
Stars: ✭ 30 (-84.21%)
Imagescraper: ✂️ High performance, multi-threaded image scraper
Stars: ✭ 630 (+231.58%)
Rod: A Devtools driver for web automation and scraping
Stars: ✭ 1,392 (+632.63%)
Sillynium: Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
Stars: ✭ 100 (-47.37%)
Awesome Puppeteer: A curated list of awesome puppeteer resources.
Stars: ✭ 1,728 (+809.47%)
diffbot-php-client: [Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-72.11%)
scrapers: Scrapers for building your own image databases
Stars: ✭ 46 (-75.79%)
ha-multiscrape: Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functionality.
Stars: ✭ 103 (-45.79%)
gochanges: **[ARCHIVED]** website changes tracker 🔍
Stars: ✭ 12 (-93.68%)
anime-scraper: [partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (-88.95%)
copycat: A PHP Scraping Class
Stars: ✭ 70 (-63.16%)
document-dl: Command line program to download documents from web portals.
Stars: ✭ 14 (-92.63%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-92.11%)
scraper: Nodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.
Stars: ✭ 37 (-80.53%)
scrapy facebooker: Collection of Scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-88.42%)
Linkedin Profile Scraper: 🕵️♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-10%)
kaa.si-cli: Stream anime from kaa.si and sync with anilist
Stars: ✭ 12 (-93.68%)
Phpscraper: PHP Scraper - a highly opinionated web-interface for PHP
Stars: ✭ 148 (-22.11%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+2445.79%)
Dataflowkit: Extract structured data from web sites. Web site scraping.
Stars: ✭ 456 (+140%)
Jikan: Unofficial MyAnimeList PHP+REST API which provides functions other than the official API
Stars: ✭ 531 (+179.47%)
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+131.58%)
Huginn: Create agents that monitor and act on your behalf. Your agents are standing by!
Stars: ✭ 33,694 (+17633.68%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+315.26%)
Undetected Chromedriver: Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Stars: ✭ 365 (+92.11%)
Grawler: Grawler is a tool written in PHP which comes with a web interface that automates the task of using Google dorks, scrapes the results, and stores them in a file.
Stars: ✭ 98 (-48.42%)
Geziyor: Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+555.79%)
Seleniumcrawler: An example using Selenium webdrivers for Python and the Scrapy framework to create a web scraper to crawl an ASP site
Stars: ✭ 117 (-38.42%)
Email Extractor: The main functionality is to extract all the emails from one or several URLs
Stars: ✭ 81 (-57.37%)
Scrape Linkedin Selenium: `scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages, turning the data into structured JSON.
Stars: ✭ 239 (+25.79%)
google-scraper: This class can retrieve search results from Google.
Stars: ✭ 33 (-82.63%)
Comic Dl: Comic-dl is a command line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, comic naver and many more.
Stars: ✭ 365 (+92.11%)
Spam Bot 3000: Social media research and promotion, semi-autonomous CLI bot
Stars: ✭ 79 (-58.42%)
Udemycoursegrabber: Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course in more than [insert big number here] websites, compares the last updated date, and gives you the download link of the latest one back, but you also have the choice to see the other ones as well!
Stars: ✭ 137 (-27.89%)
Serpscrap: SEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet and the type from search results for given keywords. Detect ads or make automated screenshots. You can also fetch text content of URLs provided in search results or your own URLs. It's useful for SEO and business-related research tasks.
Stars: ✭ 153 (-19.47%)