scrapman: Retrieve real HTML (with JavaScript executed) from a URL. Ultra fast, with support for loading multiple pages in parallel.
Stars: ✭ 21 (+50%)
anime-scraper: [partially working] Scrape anime episode stream URLs and add them to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (+50%)
browser-automation-api: Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content, capture screenshots, generate PDFs, extract content, or execute custom Puppeteer or Playwright functions.
Stars: ✭ 24 (+71.43%)
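video-subtitle-extractor: A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction.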
Stars: ✭ 1,763 (+12492.86%)
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+128.57%)
yttrex: YouTube & TikTok analysis + youchoose recommendation customizer. Backend, extensions, and tooling
Stars: ✭ 31 (+121.43%)
chesf: CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (+28.57%)
covid19br-pub: Project for monitoring official publications related to COVID-19 in Brazil.
Stars: ✭ 12 (-14.29%)
scrap: Scraping Facebook with JavaScript.
Stars: ✭ 25 (+78.57%)
htmltab: Command-line utility to convert HTML tables into CSV files
Stars: ✭ 13 (-7.14%)
youtube tool: Tool for extracting comments or subtitles from YouTube videos
Stars: ✭ 89 (+535.71%)
sg-food-ml: This script scrapes images from the Internet to classify 5 common noodle ("mee") dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee, and Mee Siam.
Stars: ✭ 18 (+28.57%)
libgosubs: Golang library to read and write various subtitle formats
Stars: ✭ 20 (+42.86%)
gunaydin: Your good mornings ☀️
Stars: ✭ 16 (+14.29%)
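subox: Subox is an Electron-based desktop application for searching subtitles based on media files. It indexes all playable media files according to the configured search directories and ignored paths, and uses file names to index subtitle files or to help search for and download subtitle files.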
Stars: ✭ 17 (+21.43%)
scavenger: Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (+0%)
selectorlib: A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages
Stars: ✭ 53 (+278.57%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (+128.57%)
subed: Subtitle editor for Emacs
Stars: ✭ 143 (+921.43%)
PersianSubtitleFixer: Fix Arabic and Persian subtitles by converting them into UTF-8
Stars: ✭ 25 (+78.57%)
oversmash: Overwatch API library for player details and career stats
Stars: ✭ 42 (+200%)
Elgindy-VTT-to-SRT-Subtitle-Converter: A tool for converting Web Video Text Tracks Format (WebVTT) subtitles to srt. Most video players support srt subtitles but can't open vtt subtitles, so vtt files need to be converted to srt or sub, but doing so is not easy.
Stars: ✭ 68 (+385.71%)
ioweb: Web Scraping Framework
Stars: ✭ 31 (+121.43%)
srtmerger: Subtitle merger is a tool for merging two or more subtitles for videos.
Stars: ✭ 35 (+150%)
TV4Dialog: No description or website provided.
Stars: ✭ 33 (+135.71%)
scrapy-fieldstats: A Scrapy extension to log item coverage when the spider shuts down
Stars: ✭ 17 (+21.43%)
ogpParser: Open Graph Protocol Parser for Node.js
Stars: ✭ 43 (+207.14%)
ha-multiscrape: Home Assistant custom component for scraping multiple values (HTML, XML, or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Stars: ✭ 103 (+635.71%)
scrapy-distributed: A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (+171.43%)
zcrawl: An open source web crawling platform
Stars: ✭ 21 (+50%)
ksoup: Kotlin Wrapper for Jsoup
Stars: ✭ 59 (+321.43%)
ferenda: Transform unstructured document collections to structured Linked Data
Stars: ✭ 22 (+57.14%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+407.14%)
puppeteer-botcheck: 🕵♂ Bot detection tests for Puppeteer. Hide and seek!
Stars: ✭ 42 (+200%)
document-dl: Command line program to download documents from web portals.
Stars: ✭ 14 (+0%)
Scrapping: Mastering the art of scraping 🎓
Stars: ✭ 24 (+71.43%)
crawling-framework: Easily crawl news portals or blog sites using Storm Crawler.
Stars: ✭ 22 (+57.14%)
Captcha-Tools: All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha, and Anticaptcha APIs!
Stars: ✭ 23 (+64.29%)
copycat: A PHP Scraping Class
Stars: ✭ 70 (+400%)
docker-selenium-lambda: The simplest demo of Chrome automation with Python and Selenium in AWS Lambda
Stars: ✭ 172 (+1128.57%)
go-scrapy: Web crawling and scraping framework for Golang
Stars: ✭ 17 (+21.43%)
node-red-contrib-nbrowser: Provides a virtual web browser (a.k.a. "headless browser") appearing as a node.
Stars: ✭ 31 (+121.43%)
InstaBot: Simple and friendly bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (+128.57%)
internet-affordability: 🌍 Dataset that shows Internet affordability by country (a shocking reality!)
Stars: ✭ 13 (-7.14%)
asyncio-hn: Python (asyncio) wrapper for the Hacker News API
Stars: ✭ 27 (+92.86%)
shorter.recipes: A website dedicated to making recipes from any website easy to read.
Stars: ✭ 27 (+92.86%)
rubium: Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (+364.29%)
proxycrawl-python: ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+264.29%)
koustMoviePlayer: koustMoviePlayer is similar to the Netflix player. Almost all of its features are available in koustMoviePlayer.
Stars: ✭ 17 (+21.43%)
wget-lua: Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+271.43%)
Instagram-to-discord: Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+707.14%)