IdtImage Dataset Tool (idt) is a CLI tool designed to turn the otherwise repetitive and slow task of creating image datasets into a fast and intuitive process.
Stars: ✭ 202 (-18.55%)
Email ExtractorExtracts all email addresses from one or more URLs
Stars: ✭ 81 (-67.34%)
SqrapeSimple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)
Stars: ✭ 144 (-41.94%)
Search Engine ParserLightweight package to query popular search engines and scrape for result titles, links and descriptions
Stars: ✭ 216 (-12.9%)
NewcrawlerFree Web Scraping Tool with Java
Stars: ✭ 589 (+137.5%)
Api StoreContains all the public APIs listed in Phantombuster's API store. Pull requests welcome!
Stars: ✭ 69 (-72.18%)
MassivedlDownload a large list of files concurrently
Stars: ✭ 141 (-43.15%)
TorrengoTorrengo is a CLI (command line) program written in Go which concurrently searches torrents from various sources.
Stars: ✭ 67 (-72.98%)
Jsonframe CheerioSimple multi-level scraper with JSON input/output for Cheerio
Stars: ✭ 196 (-20.97%)
Educative.io Downloader📖 Downloads courses from educative.io for offline use, using your login credentials.
Stars: ✭ 139 (-43.95%)
Scrape Linkedin Selenium`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (-3.63%)
Artooartoo.js - the client-side scraping companion.
Stars: ✭ 1,029 (+314.92%)
UdemycoursegrabberYour will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course on more than [insert big number here] websites, compares the last-updated dates, and gives you the download link of the latest one, with the option to see the others as well!
Stars: ✭ 137 (-44.76%)
NutchApache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+818.15%)
PypatentSearch for and retrieve US Patent and Trademark Office Patent Data
Stars: ✭ 31 (-87.5%)
Torchbear🔥🐻 The Speakeasy Scripting Engine Which Combines Speed, Safety, and Simplicity
Stars: ✭ 128 (-48.39%)
Instagram ScraperScrape the Instagram frontend. Inspired from twitter-scraper by @kennethreitz.
Stars: ✭ 903 (+264.11%)
Cdp4jcdp4j - Chrome DevTools Protocol for Java
Stars: ✭ 232 (-6.45%)
ThalGetting started with Puppeteer and Chrome Headless for Web Scraping
Stars: ✭ 2,345 (+845.56%)
HtmlsqlhtmlSQL is an experimental PHP library that lets you access HTML values with an SQL-like syntax.
Stars: ✭ 120 (-51.61%)
TabulaTabula is a tool for liberating data tables trapped inside PDF files
Stars: ✭ 5,420 (+2085.48%)
ScrapyrtHTTP API for Scrapy spiders
Stars: ✭ 637 (+156.85%)
N2h4A tool for collecting Naver News articles
Stars: ✭ 177 (-28.63%)
ParselParsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Stars: ✭ 628 (+153.23%)
SquidwarcSquidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
Stars: ✭ 125 (-49.6%)
Od DatabaseDistributed crawler, database and web frontend for public directories indexing
Stars: ✭ 121 (-51.21%)
Scrapy SeleniumScrapy middleware to handle javascript pages using selenium
Stars: ✭ 550 (+121.77%)
Gazpacho🥫 The simple, fast, and modern web scraping library
Stars: ✭ 525 (+111.69%)
OjTools for various online judges. Downloading sample cases, generating additional test cases, testing your code, and submitting it.
Stars: ✭ 517 (+108.47%)
Facebook data analyzerAnalyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friend rankings by message count, vocabulary, contacts, friends-added statistics, and more
Stars: ✭ 515 (+107.66%)
SouqscraperSimple scripts to level up your scraping skills, plus source code for the Level UP playlist on YouTube
Stars: ✭ 118 (-52.42%)
Facebook ScraperScrape Facebook public pages without an API key
Stars: ✭ 499 (+101.21%)
NickjsWeb scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Stars: ✭ 494 (+99.19%)
TransistorTransistor, a Python web scraping framework for intelligent use cases.
Stars: ✭ 205 (-17.34%)
Requests HtmlPythonic HTML Parsing for Humans™
Stars: ✭ 12,268 (+4846.77%)
SeleniumcrawlerAn example that uses Selenium webdrivers for Python and the Scrapy framework to build a web scraper that crawls an ASP site
Stars: ✭ 117 (-52.82%)
Geeksforgeeks.pdfTopic-wise PDFs of Geeks for Geeks articles. (Last updated in October 2018)
Stars: ✭ 489 (+97.18%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+87.1%)
Holiday Cn📅🇨🇳 Chinese statutory public holiday data, scraped automatically every day from State Council announcements
Stars: ✭ 157 (-36.69%)
WebmagicA scalable web crawler framework for Java.
Stars: ✭ 10,186 (+4007.26%)
MechanizeMechanize is a ruby library that makes automated web interaction easy.
Stars: ✭ 4,158 (+1576.61%)
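SkycaijiSkycaiji (蓝天采集器) is a free crawler for collecting and publishing data, built with PHP and MySQL. It can be deployed on a cloud server, can collect almost any type of web page, integrates seamlessly with various CMS site builders, publishes data in real time without logging in, and runs fully automatically with no manual intervention. It is a fully cross-platform, cloud-based crawler system for large-scale web data collection.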
Stars: ✭ 1,514 (+510.48%)
Isp Data PollutionISP Data Pollution to Protect Private Browsing History with Obfuscation
Stars: ✭ 425 (+71.37%)
JekyllJekyll-based static site for The Programming Historian
Stars: ✭ 387 (+56.05%)
ScrapysharpRevival of https://bitbucket.org/rflechner/scrapysharp
Stars: ✭ 226 (-8.87%)
PantherA browser testing and web crawling library for PHP and Symfony
Stars: ✭ 2,480 (+900%)
Secret AgentThe web browser that's built for scraping.
Stars: ✭ 151 (-39.11%)
Laravel Bank StatementsLaravel package to collect your bank statement history. Currently supports parsing statement history from the BCA, Mandiri, BNI, and MUAMALAT e-banking websites.
Stars: ✭ 105 (-57.66%)
LookylooLookyloo is a web interface that allows users to capture a website page and then display a tree of domains that call each other.
Stars: ✭ 381 (+53.63%)
Undetected ChromedriverCustom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Stars: ✭ 365 (+47.18%)
D4n155OWASP D4N155 - Intelligent and dynamic wordlist using OSINT
Stars: ✭ 105 (-57.66%)
Data ScienceCollection of useful data science topics along with code and articles
Stars: ✭ 315 (+27.02%)
CoronadatascraperCOVID-19 Coronavirus data scraped from government and curated data sources.
Stars: ✭ 372 (+50%)
XqueryExtract data or evaluate value from HTML/XML documents using XPath
Stars: ✭ 155 (-37.5%)