Search Engine Parser: Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Stars: ✭ 216 (+764%)
Api Store: Contains all the public APIs listed in Phantombuster's API store. Pull requests welcome!
Stars: ✭ 69 (+176%)
Serpscrap: SEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet and the type from search results for given keywords. Detect ads or make automated screenshots. You can also fetch the text content of URLs provided in search results or supplied by you. It's useful for SEO and business-related research tasks.
Stars: ✭ 153 (+512%)
Mechaml: OCaml functional web scraping library
Stars: ✭ 60 (+140%)
Musoq: Use SQL on various data sources
Stars: ✭ 252 (+908%)
Mtnt: Code for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (+92%)
Phpscraper: PHP Scraper - a highly opinionated web interface for PHP
Stars: ✭ 148 (+492%)
Colly: Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+62040%)
Configs: Public, free-to-use repository with diggers configs for scraping / extracting data from various e-commerce websites and online stores
Stars: ✭ 37 (+48%)
Sqrape: Simple Query Scraping with CSS and Go Reflection (MOVED to GitLab)
Stars: ✭ 144 (+476%)
Scrapy Cluster: This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
Stars: ✭ 921 (+3584%)
google-scraper: This class can retrieve search results from Google.
Stars: ✭ 33 (+32%)
Webhere: HTML scraping for Objective-C.
Stars: ✭ 16 (-36%)
Educative.io Downloader: 📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.
Stars: ✭ 139 (+456%)
Imagescraper: ✂️ High performance, multi-threaded image scraper
Stars: ✭ 630 (+2420%)
Transistor: Transistor, a Python web scraping framework for intelligent use cases.
Stars: ✭ 205 (+720%)
Newcrawler: Free Web Scraping Tool with Java
Stars: ✭ 589 (+2256%)
Udemycoursegrabber: Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you the download link of the latest one, with the option to see the other ones as well!
Stars: ✭ 137 (+448%)
Tabula: Tabula is a tool for liberating data tables trapped inside PDF files
Stars: ✭ 5,420 (+21580%)
Loconotion: 📄 Python tool to turn Notion.so pages into lightweight, customizable static websites
Stars: ✭ 237 (+848%)
Gazpacho: 🥫 The simple, fast, and modern web scraping library
Stars: ✭ 525 (+2000%)
Facebook data analyzer: Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friend rankings by message, vocabulary, contacts, friends-added statistics and more
Stars: ✭ 515 (+1960%)
Googlescraper: A Python module to scrape several search engines (like Google, Yandex, Bing, DuckDuckGo, ...), including asynchronous networking support.
Stars: ✭ 2,363 (+9352%)
Nickjs: Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Stars: ✭ 494 (+1876%)
Od Database: Distributed crawler, database and web frontend for public directories indexing
Stars: ✭ 121 (+384%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+19248%)
tvseries: TV Series is a tool that scrapes episode synopses of popular TV series from websites like Wikipedia and IMDb and shows them in one place with a user-friendly navigation UI.
Stars: ✭ 37 (+48%)
Dataflowkit: Extract structured data from websites. Website scraping.
Stars: ✭ 456 (+1724%)
Souqscraper: Simple scripts to level up your scraping skills, and source code for the Level UP playlist on YouTube
Stars: ✭ 118 (+372%)
Jekyll: Jekyll-based static site for The Programming Historian
Stars: ✭ 387 (+1448%)
Idt: Image Dataset Tool (idt) is a CLI tool designed to make the otherwise repetitive and slow task of creating image datasets into a fast and intuitive process.
Stars: ✭ 202 (+708%)
Undetected Chromedriver: Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Stars: ✭ 365 (+1360%)
Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+169272%)
Coronadatascraper: COVID-19 Coronavirus data scraped from government and curated data sources.
Stars: ✭ 372 (+1388%)
Reaper: Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Stars: ✭ 240 (+860%)
Comic Dl: Comic-dl is a command-line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, comic naver and many more.
Stars: ✭ 365 (+1360%)
Laravel Bank Statements: Laravel package to collect your bank statement history. Currently supports parsing statement history from the BCA, Mandiri, BNI, and MUAMALAT e-banking websites.
Stars: ✭ 105 (+320%)
Socialreaper: Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Stars: ✭ 338 (+1252%)
Jsonframe Cheerio: Simple multi-level scraper with JSON input/output for Cheerio
Stars: ✭ 196 (+684%)
Tinking: 🧶 Extract data from any website without code, just clicks.
Stars: ✭ 331 (+1224%)
Languagepod101 Scraper: Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
Stars: ✭ 104 (+316%)
Spidermon: Scrapy extension for monitoring spider execution.
Stars: ✭ 309 (+1136%)
Champ: A Telegram bot combined with Python to serve some basic functions like weather, music charts, cricket scores and much more.
Stars: ✭ 22 (-12%)
Elixir Scrape: Scrape any website, article or RSS/Atom Feed with ease!
Stars: ✭ 306 (+1124%)
Grawler: Grawler is a tool written in PHP which comes with a web interface that automates the task of using Google dorks, scrapes the results, and stores them in a file.
Stars: ✭ 98 (+292%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (+1044%)
Anime Dl: Anime-dl is a command-line program to download anime from CrunchyRoll and Funimation.
Stars: ✭ 190 (+660%)
Humanoid: Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
Stars: ✭ 88 (+252%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+1008%)
Scrapysharp: Reborn of https://bitbucket.org/rflechner/scrapysharp
Stars: ✭ 226 (+804%)
github-languages: Tiny little Ruby on Rails website that crawls through your public GitHub repos to find out what your favourite languages are.
Stars: ✭ 23 (-8%)
Whatsapp-Net: Generate a network graph of connections from your WhatsApp groups data
Stars: ✭ 75 (+200%)
List Of User Agents: List of major web + mobile browser user agent strings. +1 Bonus script to scrape :)
Stars: ✭ 247 (+888%)
Arachnid: Crawl all unique internal links found on a given website, and extract SEO-related information - supports JavaScript-based sites
Stars: ✭ 224 (+796%)
Requests Html: Pythonic HTML Parsing for Humans™
Stars: ✭ 12,268 (+48972%)