Linkedin Profile Scraper
🕵️♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-96.46%)
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (-90.9%)
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-98.95%)
bots-zoo
No description or website provided.
Stars: ✭ 59 (-98.78%)
Scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+775.4%)
Lulu
[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (-83.69%)
Colly
Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+221.17%)
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (-15.71%)
Skrape.it
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (-95.22%)
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-98.92%)
Bat
A cat(1) clone with wings.
Stars: ✭ 30,833 (+537.44%)
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-98.9%)
Phpinsights
🔰 Instant PHP quality checks from your console
Stars: ✭ 4,442 (-8.17%)
Newspaper
News, full-text, and article metadata extraction in Python 3.
Stars: ✭ 11,545 (+138.68%)
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on dotnet core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-97.93%)
D4n155
OWASP D4N155 - Intelligent and dynamic wordlist using OSINT
Stars: ✭ 105 (-97.83%)
Antch
Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Stars: ✭ 198 (-95.91%)
Goose Parser
Universal scraping tool, which allows you to extract data using multiple environments
Stars: ✭ 211 (-95.64%)
Lambdacd
a library to define a continuous delivery pipeline in code
Stars: ✭ 655 (-86.46%)
Headlesschrome
A Go package for working with headless Chrome. Run interactive JavaScript commands on web pages with Go and Chrome.
Stars: ✭ 112 (-97.68%)
Fselect
Find files with SQL-like queries
Stars: ✭ 3,103 (-35.85%)
Search Engine Parser
Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Stars: ✭ 216 (-95.53%)
scrapman
Retrieve real HTML (with JavaScript executed) from a URL. Ultra fast, and supports loading multiple pages in parallel
Stars: ✭ 21 (-99.57%)
Musoq
Use SQL on various data sources
Stars: ✭ 252 (-94.79%)
LeetCode
At present contains scraped data from around 1500 problems on the site. More to follow...
Stars: ✭ 45 (-99.07%)
Npkill
List any node_modules directories in your system, as well as the space they take up. You can then select which ones you want to erase to free up space.
Stars: ✭ 5,325 (+10.09%)
Emuto
manipulate JSON files
Stars: ✭ 180 (-96.28%)
Mod Pbxproj
A Python module to manipulate Xcode projects
Stars: ✭ 959 (-80.17%)
gochanges
**[ARCHIVED]** website changes tracker 🔍
Stars: ✭ 12 (-99.75%)
Instascrape
🚀 A fast and lightweight utility and Python library for downloading posts, stories, and highlights from Instagram.
Stars: ✭ 76 (-98.43%)
document-dl
Command line program to download documents from web portals.
Stars: ✭ 14 (-99.71%)
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-99.69%)
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-94.27%)
Geziyor
Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (-74.24%)
Squidwarc
Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
Stars: ✭ 125 (-97.42%)
Scrapyrt
HTTP API for Scrapy spiders
Stars: ✭ 637 (-86.83%)
Jvppeteer
Headless Chrome For Java (Java crawler)
Stars: ✭ 193 (-96.01%)
Spidermon
Scrapy extension for monitoring spider execution.
Stars: ✭ 309 (-93.61%)
Cosmos
Hacktoberfest 2021 | World's largest Contributor driven code dataset | Algorithms that run our universe | Your personal library of every algorithm and data structure code that you will ever encounter
Stars: ✭ 12,936 (+167.44%)
Instagram-to-discord
Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (-97.66%)
Sasila
A flexible, friendly crawler framework
Stars: ✭ 286 (-94.09%)
Dataflowkit
Extract structured data from web sites. Web site scraping.
Stars: ✭ 456 (-90.57%)
Teachcode
A tool to develop and improve a student’s programming skills by introducing the earliest lessons of coding.
Stars: ✭ 325 (-93.28%)
Kiimagepager
The KIImagePager is inspired by foursquare's ImageSlideshow; the user may scroll through images loaded from the Web
Stars: ✭ 324 (-93.3%)
Launchpad
An open-source game launcher for your games
Stars: ✭ 322 (-93.34%)
Graphback
Graphback - Out of the box GraphQL server and client
Stars: ✭ 323 (-93.32%)
Fd
A simple, fast and user-friendly alternative to 'find'
Stars: ✭ 19,851 (+310.4%)
Ack3
ack is a grep-like search tool optimized for source code.
Stars: ✭ 330 (-93.18%)
Super Productivity
To-do list & time tracker for programmers and other digital workers with Jira, GitHub, and GitLab integration
Stars: ✭ 4,505 (-6.86%)
Xcrawler
A fast, concise, and powerful PHP crawler framework
Stars: ✭ 344 (-92.89%)
Askql
AskQL is a query language that can express any data request
Stars: ✭ 352 (-92.72%)
Horusec
Horusec is an open source tool that improves identification of vulnerabilities in your project with just one command.
Stars: ✭ 311 (-93.57%)
Xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Stars: ✭ 335 (-93.07%)
Freshonions Torscraper
Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
Stars: ✭ 348 (-92.81%)
Katana
A Python Tool For Google Hacking
Stars: ✭ 355 (-92.66%)
Undetected Chromedriver
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / Datadome / CloudFlare IUAM)
Stars: ✭ 365 (-92.45%)
Jql
A JSON Query Language CLI tool
Stars: ✭ 368 (-92.39%)