document-dl: Command line program to download documents from web portals.
Stars: ✭ 14 (-33.33%)
my blog: A personal blog built in Issues
Stars: ✭ 28 (+33.33%)
kuwala: Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+2157.14%)
internet-affordability: 🌍 Dataset that shows the Internet affordability by country (a shocking reality!)
Stars: ✭ 13 (-38.1%)
ogpParser: Open Graph Protocol Parser for Node.js
Stars: ✭ 43 (+104.76%)
image-collector: Download images from Google Image Search
Stars: ✭ 38 (+80.95%)
theano-recurrence: Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano
Stars: ✭ 40 (+90.48%)
flying-apple: Just to keep track of nice content and new announcements related to Apple products and Swift
Stars: ✭ 45 (+114.29%)
arcreactor: open-source intelligence gathering for SIEMs <3
Stars: ✭ 36 (+71.43%)
Hi-Blogs: Hi Blog (嗨博客), built with ASP.NET Core 2.0 + CentOS 7.3 + MySQL 5.6.37 + Redis + nginx 1.12.1
Stars: ✭ 86 (+309.52%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+238.1%)
shup: A POSIX shell script to parse HTML
Stars: ✭ 28 (+33.33%)
scrapy-distributed: A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (+80.95%)
scrapy-zyte-smartproxy: Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
Stars: ✭ 317 (+1409.52%)
classifai: 🔥 One of the most comprehensive open-source data annotation platforms.
Stars: ✭ 99 (+371.43%)
naos: 📉 Uptime and error monitoring CLI
Stars: ✭ 30 (+42.86%)
go-scrapy: Web crawling and scraping framework for Golang
Stars: ✭ 17 (-19.05%)
GitBlogs: A personal blog based on GitHub
Stars: ✭ 20 (-4.76%)
rubium: Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (+209.52%)
top-github-scraper: Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (+90.48%)
codeprep: A toolkit for pre-processing large source code corpora
Stars: ✭ 39 (+85.71%)
LNEx: 📍 🏢 🏦 🏣 🏪 🏬 Location Name Extractor
Stars: ✭ 21 (+0%)
Captcha-Tools: All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha, and Anticaptcha APIs!
Stars: ✭ 23 (+9.52%)
mozolm: MozoLM: A language model (LM) serving library
Stars: ✭ 32 (+52.38%)
scavenger: Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (-33.33%)
dmi-instascraper: A GUI for Instaloader to scrape users and hashtags on Instagram
Stars: ✭ 21 (+0%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (+52.38%)
humanparser: Parse a human name string into salutation, first name, middle name, last name, suffix.
Stars: ✭ 78 (+271.43%)
xforms-spec: The XForms-derived specification used in the ODK ecosystem. If you are interested in building a tool that is compliant with the forms rendered by ODK tools, this is the place to start. ✨⚒✨
Stars: ✭ 27 (+28.57%)
chirps: Twitter bot powering @arichduvet
Stars: ✭ 35 (+66.67%)
gunaydin: Your good mornings ☀️
Stars: ✭ 16 (-23.81%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-28.57%)
chesf: CHeSF is the Chrome Headless Scraping Framework, very, very alpha code to scrape JavaScript-intensive web pages
Stars: ✭ 18 (-14.29%)
Scraper-Projects: 🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (+19.05%)
dust: Archive web pages with all relevant assets or save them as single-file HTML
Stars: ✭ 19 (-9.52%)
spring-async: Asynchronous REST call with DeferredResult
Stars: ✭ 50 (+138.1%)
web-clipper: Easily download the main content of a web page in HTML, Markdown, and/or EPUB format from the command line.
Stars: ✭ 15 (-28.57%)
rnn darts fastai: Implement Differentiable Architecture Search (DARTS) for RNN with fastai
Stars: ✭ 21 (+0%)
TorScrapper: A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (+14.29%)
lingua-go: 👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+3157.14%)
AngleParse: HTML parsing and processing tool for PowerShell.
Stars: ✭ 35 (+66.67%)
pomp: Screen scraping and web crawling framework
Stars: ✭ 61 (+190.48%)
wget-lua: Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+147.62%)
subscene scraper: Library to download subtitles from subscene.com
Stars: ✭ 14 (-33.33%)
deepblast: Neural Networks for Protein Sequence Alignment
Stars: ✭ 29 (+38.1%)
blog3.0: Blog V3.0. Technologies currently used: Nuxtjs + Nestjs + Vue + Element ui + vuetify; storage: MongoDB + Redis + COS
Stars: ✭ 37 (+76.19%)
sg-food-ml: This script is used to scrape images from the Internet to classify 5 common noodle ("mee") dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee and Mee Siam.
Stars: ✭ 18 (-14.29%)
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+52.38%)
agouti: A platform for collective blogs, social media, forums, and question-and-answer services. Catalog of sites (programs), site navigation, and directories (facets). A community based on the PHP HLEB micro-framework.
Stars: ✭ 36 (+71.43%)
Zeiver: A Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-33.33%)
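android-amap-track-collect: Recently, a project required me to collect users' movement track data from their phones. This kind of feature is common by now: apps such as Codoon (咕咚) and Yuedongquan (悦动圈) record running tracks, while WeChat Sports (微信运动) and DingTalk Sports (钉钉运动) count the steps you take each day. Recording a track depends on phone positioning, and counting steps depends on the gyroscope (angular velocity sensor). I spent a little over a day implementing real-time collection of location data.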
Stars: ✭ 50 (+138.1%)
scrapy facebooker: Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (+4.76%)
ferenda: Transform unstructured document collections to structured Linked Data
Stars: ✭ 22 (+4.76%)