aws-pdf-textract-pipeline🔍 Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript
Stars: ✭ 141 (+642.11%)
anime-scraper[partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (+10.53%)
browser-automation-apiBrowser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content, capture a screenshot, generate a PDF, extract content, or execute custom Puppeteer or Playwright functions.
Stars: ✭ 24 (+26.32%)
SoupWeb Scraper in Go, similar to BeautifulSoup
Stars: ✭ 1,685 (+8768.42%)
Mimo-CrawlerA web crawler that uses Firefox and JavaScript injection to interact with webpages and crawl their content, written in Node.js.
Stars: ✭ 22 (+15.79%)
CoWin-Vaccine-NotifierAutomated Python script that checks vaccine slot availability and notifies you when a slot is available.
Stars: ✭ 102 (+436.84%)
super-anime-downloaderA program which takes an Anime name or URL and downloads the specified range of episodes.
Stars: ✭ 26 (+36.84%)
fBrowserHelpful Selenium functions to make web-scraping easier and faster
Stars: ✭ 16 (-15.79%)
gotorThis program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API.
Stars: ✭ 97 (+410.53%)
VideoRecognition-realtime-autotrainer-alertsState-of-the-art real-time object detection using the YOLOv3 algorithm. Augmented with a process that allows easy training of the classifier as a plug-and-play solution. Provides an alert if an item in an alert list is detected.
Stars: ✭ 36 (+89.47%)
supervised-machine-learningThis repo contains regression and classification projects. Examples: development of predictive models for comments on social media websites; building classifiers to predict outcomes in sports competitions; churn analysis; prediction of clicks on online ads; analysis of the opioids crisis and an analysis of retail store expansion strategies using…
Stars: ✭ 34 (+78.95%)
opensea-scraperScrapes NFT floor prices and additional information from OpenSea. Used for https://nftfloorprice.info
Stars: ✭ 129 (+578.95%)
gcf-packsLibrary packs for Google Cloud Functions
Stars: ✭ 48 (+152.63%)
Cezanne🎣 Serverless Image Generation Function
Stars: ✭ 29 (+52.63%)
newspaperjsNews extraction and scraping. Article Parsing
Stars: ✭ 59 (+210.53%)
phantom-lordHandy API for Headless Chromium
Stars: ✭ 24 (+26.32%)
konadlMultithreaded Konachan / Yandere (moebooru-based site) bulk image downloader
Stars: ✭ 64 (+236.84%)
puppeteer-reportConvert HTML to PDF with Puppeteer, with support for adding a custom header, footer, and page numbers
Stars: ✭ 90 (+373.68%)
puppeteer-instaquoteUse Puppeteer to create snazzy Instagram-like quote images and memes
Stars: ✭ 20 (+5.26%)
PikachuYummy Recipe Crawler and Search
Stars: ✭ 50 (+163.16%)
apolloA Unix-style personal search engine and web crawler for your digital footprint.
Stars: ✭ 1,270 (+6584.21%)
Android-Web-ScraperAndroid Web Scraper is a simple library for Android web automation. You can perform web tasks in the background to fetch website data programmatically.
Stars: ✭ 38 (+100%)
webringThis "webring" was created to encourage Thai artists, designers, and developers to build their own websites and share traffic with one another.
Stars: ✭ 125 (+557.89%)
nest-puppeteerPuppeteer (Headless Chrome) provider for Nest.js
Stars: ✭ 68 (+257.89%)
pappetA command-line tool to crawl websites using puppeteer.
Stars: ✭ 95 (+400%)
Puppeteer-IEHeadless Internet Explorer NodeJS API inspired by Puppeteer
Stars: ✭ 72 (+278.95%)
xstate-marionettistModel based testing with Jest, XState and Puppeteer or Playwright made easy
Stars: ✭ 23 (+21.05%)
site-audit-seoWeb service and CLI tool for SEO site audit: crawl site, lighthouse all pages, view public reports in browser. Also output to console, json, csv, xlsx, Google Drive.
Stars: ✭ 91 (+378.95%)
repository.colossusColossus Repository for Kodi Addons. Kodi is a registered trademark of the XBMC Foundation. We are not connected to or in any other way affiliated with Kodi.
Stars: ✭ 13 (-31.58%)
RecorderA browser extension that generates Cypress, Playwright and Puppeteer test scripts from your interactions 🖱 ⌨
Stars: ✭ 277 (+1357.89%)
scrapisma work-in-progress guide to web scraping as an artistic and critical practice
Stars: ✭ 43 (+126.32%)
servicesHolder of multiple npm packages
Stars: ✭ 31 (+63.16%)
php-puppeteerPHP Wrapper of Google Chrome Puppeteer for PDF Generation
Stars: ✭ 24 (+26.32%)
browser-poolA Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+273.68%)
spiderA web spider framework
Stars: ✭ 25 (+31.58%)
naos📉 Uptime and error monitoring CLI
Stars: ✭ 30 (+57.89%)
screenie-serverA Node server with a pool of Puppeteer (Chrome headless) instances for scalable screenshot generation.
Stars: ✭ 19 (+0%)
extractnetA Dragnet that also extracts author, headline, date, and keywords from context
Stars: ✭ 52 (+173.68%)
image-crawlerAn image scraper that scrapes images from unsplash.com
Stars: ✭ 12 (-36.84%)
teesUniversal test framework for front-end with WebDriver, Puppeteer and Enzyme
Stars: ✭ 23 (+21.05%)
gafanhotoBot for monitoring promotions on the Hardmob forum http://www.hardmob.com.br/promocoes/
Stars: ✭ 48 (+152.63%)
puppeteer-assetsMeasure and monitor asset metrics using Puppeteer and Prometheus
Stars: ✭ 29 (+52.63%)
lab-assistantA tool to measure performance deltas between two versions of a site
Stars: ✭ 20 (+5.26%)
irProject for automatically calculating income tax (Imposto de Renda) on Bovespa trades. Tags: canal eletronico do investidor, CEI, selenium, bovespa, IRPF, IR, imposto de renda, finance, yahoo finance, acao, fii, etf, python, crawler, webscraping, calculadora ir
Stars: ✭ 120 (+531.58%)
requestsRR interface to Python requests module
Stars: ✭ 12 (-36.84%)
chesfCHeSF is the Chrome Headless Scraping Framework, very early alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-5.26%)
Email-Crawler-Lead-GeneratorThis email crawler visits every page of a provided website, then parses and saves the email addresses it finds to a CSV file.
Stars: ✭ 47 (+147.37%)
PacPawPawn package manager for SA-MP
Stars: ✭ 14 (-26.32%)
newsembleAPI for fetching data from news websites.
Stars: ✭ 42 (+121.05%)