340 open-source projects that are alternatives to or similar to Babler

document-dl

Command line program to download documents from web portals.

Stars: ✭ 14 (-33.33%)

Mutual labels: scraping

my blog

A personal blog built in Issues

Stars: ✭ 28 (+33.33%)

Mutual labels: blogs

kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…

Stars: ✭ 474 (+2157.14%)

Mutual labels: scraping

internet-affordability

🌍 Dataset that shows the Internet affordability by country (a shocking reality!)

Stars: ✭ 13 (-38.1%)

Mutual labels: scraping

ogpParser

Open Graph Protocol Parser for Node.js

Stars: ✭ 43 (+104.76%)

Mutual labels: scraping

image-collector

Download images from Google Image Search

Stars: ✭ 38 (+80.95%)

Mutual labels: scraping

theano-recurrence

Recurrent Neural Networks (RNN, GRU, LSTM) and their Bidirectional versions (BiRNN, BiGRU, BiLSTM) for word & character level language modelling in Theano

Stars: ✭ 40 (+90.48%)

Mutual labels: language-modeling

data-collection-ios

Mobile data collection app using the iOS Runtime SDK.

Stars: ✭ 24 (+14.29%)

Mutual labels: data-collection

flying-apple

Just to keep track of nice content and new announcements related to Apple products and Swift

Stars: ✭ 45 (+114.29%)

Mutual labels: blogs

arcreactor

open-source intelligence gathering for SIEMs <3

Stars: ✭ 36 (+71.43%)

Mutual labels: data-collection

Hi-Blogs

Hi Blog: ASP.NET Core 2.0 + CentOS 7.3 + MySQL 5.6.37 + Redis + nginx 1.12.1

Stars: ✭ 86 (+309.52%)

Mutual labels: blogs

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (+238.1%)

Mutual labels: scraping

shup

A POSIX shell script to parse HTML

Stars: ✭ 28 (+33.33%)

Mutual labels: scraping

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Stars: ✭ 38 (+80.95%)

Mutual labels: scraping

scrapy-zyte-smartproxy

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

Stars: ✭ 317 (+1409.52%)

Mutual labels: scraping

classifai

🔥 One of the most comprehensive open-source data annotation platforms.

Stars: ✭ 99 (+371.43%)

Mutual labels: data-collection

naos

📉 Uptime and error monitoring CLI

Stars: ✭ 30 (+42.86%)

Mutual labels: scraping

go-scrapy

Web crawling and scraping framework for Golang

Stars: ✭ 17 (-19.05%)

Mutual labels: scraping

GitBlogs

A personal blog based on GitHub

Stars: ✭ 20 (-4.76%)

Mutual labels: blogs

rubium

Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby

Stars: ✭ 65 (+209.52%)

Mutual labels: scraping

top-github-scraper

Scrape top GitHub repositories and users based on keywords

Stars: ✭ 40 (+90.48%)

Mutual labels: scraping

codeprep

A toolkit for pre-processing large source code corpora

Stars: ✭ 39 (+85.71%)

Mutual labels: language-modeling

LNEx

📍 🏢 🏦 🏣 🏪 🏬 LNEx: Location Name Extractor

Stars: ✭ 21 (+0%)

Mutual labels: language-modeling

angel.co-companies-list-scraping

No description or website provided.

Stars: ✭ 54 (+157.14%)

Mutual labels: scraping

Captcha-Tools

All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!

Stars: ✭ 23 (+9.52%)

Mutual labels: scraping

mozolm

MozoLM: A language model (LM) serving library

Stars: ✭ 32 (+52.38%)

Mutual labels: language-modeling

scavenger

Scrape and take screenshots of dynamic and static webpages

Stars: ✭ 14 (-33.33%)

Mutual labels: scraping

dmi-instascraper

A GUI for Instaloader to scrape users and hashtags on Instagram

Stars: ✭ 21 (+0%)

Mutual labels: scraping

proxi

Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.

Stars: ✭ 32 (+52.38%)

Mutual labels: scraping

humanparser

Parse a human name string into salutation, first name, middle name, last name, suffix.

Stars: ✭ 78 (+271.43%)

Mutual labels: scraping

xforms-spec

The XForms-derived specification used in the ODK ecosystem. If you are interested in building a tool that is compliant with the forms rendered by ODK tools, this is the place to start. ✨⚒✨

Stars: ✭ 27 (+28.57%)

Mutual labels: data-collection

chirps

Twitter bot powering @arichduvet

Stars: ✭ 35 (+66.67%)

Mutual labels: scraping

gunaydin

Your good mornings ☀️

Stars: ✭ 16 (-23.81%)

Mutual labels: scraping

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-28.57%)

Mutual labels: scraping

chesf

CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages

Stars: ✭ 18 (-14.29%)

Mutual labels: scraping

Scraper-Projects

🕸 List of mini projects that involve web scraping 🕸

Stars: ✭ 25 (+19.05%)

Mutual labels: scraping

Pentest-Bookmarkz

A collection of useful links for Pentesters

Stars: ✭ 118 (+461.9%)

Mutual labels: forums

dust

Archive web pages with all relevant assets or save as a single file HTML

Stars: ✭ 19 (-9.52%)

Mutual labels: scraping

spring-async

Asynchronous REST call with DeferredResult

Stars: ✭ 50 (+138.1%)

Mutual labels: blogs

web-clipper

Easily download the main content of a web page in html, markdown, and/or epub format from command line.

Stars: ✭ 15 (-28.57%)

Mutual labels: scraping

rnn darts fastai

Implement Differentiable Architecture Search (DARTS) for RNN with fastai

Stars: ✭ 21 (+0%)

Mutual labels: language-modeling

TorScrapper

A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)

Stars: ✭ 24 (+14.29%)

Mutual labels: scraping

lingua-go

👄 The most accurate natural language detection library for Go, suitable for long and short text alike

Stars: ✭ 684 (+3157.14%)

Mutual labels: language-modeling

AngleParse

HTML parsing and processing tool for PowerShell.

Stars: ✭ 35 (+66.67%)

Mutual labels: scraping

akvo-flow-mobile

Akvo Flow app

Stars: ✭ 18 (-14.29%)

Mutual labels: data-collection

pomp

Screen scraping and web crawling framework

Stars: ✭ 61 (+190.48%)

Mutual labels: scraping

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (+147.62%)

Mutual labels: scraping

subscene scraper

Library to download subtitles from subscene.com

Stars: ✭ 14 (-33.33%)

Mutual labels: scraping

deepblast

Neural Networks for Protein Sequence Alignment

Stars: ✭ 29 (+38.1%)

Mutual labels: language-modeling

blog3.0

Blog V3.0. Current tech stack (Nuxtjs + Nestjs + Vue + Element ui + vuetify), storage (MongoDB + Redis + COS)

Stars: ✭ 37 (+76.19%)

Mutual labels: blogs

sg-food-ml

This script is used to scrape images from the Internet to classify 5 common noodle "mee" dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee and Mee Siam.

Stars: ✭ 18 (-14.29%)

Mutual labels: scraping

feedsearch-crawler

Crawl sites for RSS, Atom, and JSON feeds.

Stars: ✭ 23 (+9.52%)

Mutual labels: scraping

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor instances

Stars: ✭ 32 (+52.38%)

Mutual labels: scraping

Data-Science-and-Machine-Learning-Resources

List of Data Science and Machine Learning Resources that I frequently use

Stars: ✭ 19 (-9.52%)

Mutual labels: blogs

agouti

A platform for collective blogs, social media, forums, and question-and-answer services. Includes a catalog of sites (programs), site navigation, and directories (facets). A community based on the PHP HLEB micro-framework.

Stars: ✭ 36 (+71.43%)

Mutual labels: blogs

Zeiver

A Scraper, Downloader, & Recorder for static open directories.

Stars: ✭ 14 (-33.33%)

Mutual labels: scraping

android-amap-track-collect

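Recently a project required collecting users' movement track data from their phones. Features like this are already common: apps such as 咕咚 and 悦动圈 collect running track data, while WeChat Sports (微信运动) and DingTalk Sports (钉钉运动) count the steps you walk each day. Recording a track depends on phone positioning, and counting steps depends on the gyroscope (angular velocity sensor). I spent a little over a day implementing real-time collection of location data.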

Stars: ✭ 50 (+138.1%)

Mutual labels: data-collection

whatsapp-tracking

Scraping the status of WhatsApp contacts

Stars: ✭ 49 (+133.33%)

Mutual labels: scraping

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (+4.76%)

Mutual labels: scraping

ferenda

Transform unstructured document collections to structured Linked Data

Stars: ✭ 22 (+4.76%)

Mutual labels: scraping

1-60 of 340 similar projects
