340 open-source projects that are alternatives to or similar to Babler

document-dl
Command line program to download documents from web portals.
Stars: ✭ 14 (-33.33%)
Mutual labels:  scraping
my blog
A personal blog built in Issues
Stars: ✭ 28 (+33.33%)
Mutual labels:  blogs
kuwala
Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+2157.14%)
Mutual labels:  scraping
internet-affordability
🌍 Dataset that shows the Internet affordability by country (a shocking reality!)
Stars: ✭ 13 (-38.1%)
Mutual labels:  scraping
ogpParser
Open Graph Protocol Parser for Node.js
Stars: ✭ 43 (+104.76%)
Mutual labels:  scraping
image-collector
Download images from Google Image Search
Stars: ✭ 38 (+80.95%)
Mutual labels:  scraping
theano-recurrence
Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano
Stars: ✭ 40 (+90.48%)
Mutual labels:  language-modeling
data-collection-ios
Mobile data collection app using the iOS Runtime SDK.
Stars: ✭ 24 (+14.29%)
Mutual labels:  data-collection
flying-apple
Just to keep track of nice content and new announcements related to Apple products and Swift
Stars: ✭ 45 (+114.29%)
Mutual labels:  blogs
arcreactor
open-source intelligence gathering for SIEMs <3
Stars: ✭ 36 (+71.43%)
Mutual labels:  data-collection
Hi-Blogs
Hi Blog: ASP.NET Core2.0 + CentOS7.3 + MySql5.6.37 + Redis + nginx1.12.1
Stars: ✭ 86 (+309.52%)
Mutual labels:  blogs
browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+238.1%)
Mutual labels:  scraping
shup
A POSIX shell script to parse HTML
Stars: ✭ 28 (+33.33%)
Mutual labels:  scraping
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (+80.95%)
Mutual labels:  scraping
scrapy-zyte-smartproxy
Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
Stars: ✭ 317 (+1409.52%)
Mutual labels:  scraping
classifai
🔥 One of the most comprehensive open-source data annotation platforms.
Stars: ✭ 99 (+371.43%)
Mutual labels:  data-collection
naos
📉 Uptime and error monitoring CLI
Stars: ✭ 30 (+42.86%)
Mutual labels:  scraping
go-scrapy
Web crawling and scraping framework for Golang
Stars: ✭ 17 (-19.05%)
Mutual labels:  scraping
GitBlogs
A personal blog based on GitHub
Stars: ✭ 20 (-4.76%)
Mutual labels:  blogs
rubium
Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (+209.52%)
Mutual labels:  scraping
top-github-scraper
Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (+90.48%)
Mutual labels:  scraping
codeprep
A toolkit for pre-processing large source code corpora
Stars: ✭ 39 (+85.71%)
Mutual labels:  language-modeling
LNEx
📍 🏢 🏦 🏣 🏪 🏬 LNEx: Location Name Extractor
Stars: ✭ 21 (+0%)
Mutual labels:  language-modeling
angel.co-companies-list-scraping
No description or website provided.
Stars: ✭ 54 (+157.14%)
Mutual labels:  scraping
Captcha-Tools
All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!
Stars: ✭ 23 (+9.52%)
Mutual labels:  scraping
mozolm
MozoLM: A language model (LM) serving library
Stars: ✭ 32 (+52.38%)
Mutual labels:  language-modeling
scavenger
Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (-33.33%)
Mutual labels:  scraping
dmi-instascraper
A GUI for Instaloader to scrape users and hashtags on Instagram
Stars: ✭ 21 (+0%)
Mutual labels:  scraping
proxi
Proxy pool. Finds and checks proxies with rest api for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (+52.38%)
Mutual labels:  scraping
humanparser
Parse a human name string into salutation, first name, middle name, last name, suffix.
Stars: ✭ 78 (+271.43%)
Mutual labels:  scraping
xforms-spec
The XForms-derived specification used in the ODK ecosystem. If you are interested in building a tool that is compliant with the forms rendered by ODK tools, this is the place to start. ✨⚒✨
Stars: ✭ 27 (+28.57%)
Mutual labels:  data-collection
chirps
Twitter bot powering @arichduvet
Stars: ✭ 35 (+66.67%)
Mutual labels:  scraping
gunaydin
Your good mornings ☀️
Stars: ✭ 16 (-23.81%)
Mutual labels:  scraping
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-28.57%)
Mutual labels:  scraping
chesf
CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-14.29%)
Mutual labels:  scraping
Scraper-Projects
🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (+19.05%)
Mutual labels:  scraping
Pentest-Bookmarkz
A collection of useful links for Pentesters
Stars: ✭ 118 (+461.9%)
Mutual labels:  forums
dust
Archive web pages with all relevant assets or save as a single file HTML
Stars: ✭ 19 (-9.52%)
Mutual labels:  scraping
spring-async
Asynchronous REST call with DeferredResult
Stars: ✭ 50 (+138.1%)
Mutual labels:  blogs
web-clipper
Easily download the main content of a web page in html, markdown, and/or epub format from command line.
Stars: ✭ 15 (-28.57%)
Mutual labels:  scraping
rnn darts fastai
Implement Differentiable Architecture Search (DARTS) for RNN with fastai
Stars: ✭ 21 (+0%)
Mutual labels:  language-modeling
TorScrapper
A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (+14.29%)
Mutual labels:  scraping
lingua-go
👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+3157.14%)
Mutual labels:  language-modeling
AngleParse
HTML parsing and processing tool for PowerShell.
Stars: ✭ 35 (+66.67%)
Mutual labels:  scraping
akvo-flow-mobile
Akvo Flow app
Stars: ✭ 18 (-14.29%)
Mutual labels:  data-collection
pomp
Screen scraping and web crawling framework
Stars: ✭ 61 (+190.48%)
Mutual labels:  scraping
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+147.62%)
Mutual labels:  scraping
subscene scraper
Library to download subtitles from subscene.com
Stars: ✭ 14 (-33.33%)
Mutual labels:  scraping
deepblast
Neural Networks for Protein Sequence Alignment
Stars: ✭ 29 (+38.1%)
Mutual labels:  language-modeling
blog3.0
Blog V3.0. Technologies currently used: Nuxtjs + Nestjs + Vue + Element ui + vuetify; storage: MongoDB + Redis + COS
Stars: ✭ 37 (+76.19%)
Mutual labels:  blogs
sg-food-ml
This script is used to scrape images from the Internet to classify 5 common noodle ("mee") dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee and Mee Siam.
Stars: ✭ 18 (-14.29%)
Mutual labels:  scraping
feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
Stars: ✭ 23 (+9.52%)
Mutual labels:  scraping
torchestrator
Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+52.38%)
Mutual labels:  scraping
Data-Science-and-Machine-Learning-Resources
List of Data Science and Machine Learning Resources that I frequently use
Stars: ✭ 19 (-9.52%)
Mutual labels:  blogs
agouti
A platform for collective blogs, social media, forums, and question-and-answer services. Catalog of sites (programs), site navigation and directories (facets). A community based on the PHP HLEB micro-framework.
Stars: ✭ 36 (+71.43%)
Mutual labels:  blogs
Zeiver
A Scraper, Downloader, & Recorder for static open directories.
Stars: ✭ 14 (-33.33%)
Mutual labels:  scraping
android-amap-track-collect
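A recent project required collecting users' movement track data from their phones. This kind of feature is common: Codoon and Yuedongquan collect running track data, while WeChat Sports and DingTalk Sports count the steps you take each day. Recording a track depends on phone positioning, and counting steps depends on the gyroscope (angular velocity sensor). I spent a little over a day implementing real-time collection of location data.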
Stars: ✭ 50 (+138.1%)
Mutual labels:  data-collection
whatsapp-tracking
Scraping the status of WhatsApp contacts
Stars: ✭ 49 (+133.33%)
Mutual labels:  scraping
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (+4.76%)
Mutual labels:  scraping
ferenda
Transform unstructured document collections to structured Linked Data
Stars: ✭ 22 (+4.76%)
Mutual labels:  scraping
1-60 of 340 similar projects