fastdata-cluster: Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-84.37%)
leaflet heatmap: A simple visualization of Huzhou call-record data. Assuming the data volume is too large for the browser to render a heatmap directly, the rendering step is moved offline: the data is computed in parallel with Apache Spark, the heatmap is then also rendered with Apache Spark, and leafletjs loads an OpenStreetMap layer plus the heatmap layer for smooth interaction. With the current Spark implementation the parallel computation is slower than a single machine, possibly because Spark is not well suited to this kind of computation or because the algorithm was poorly designed. The Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .
Stars: ✭ 13 (-89.84%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), with code examples
Stars: ✭ 150 (+17.19%)
Elasticluster: Create clusters of VMs in the cloud and configure them with Ansible.
Stars: ✭ 298 (+132.81%)
Akkeeper: An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-76.56%)
fsbrowser: Fast desktop client for the Hadoop Distributed File System
Stars: ✭ 27 (-78.91%)
ros hadoop: Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.
Stars: ✭ 92 (-28.12%)
Jsr203 Hadoop: A Java NIO file system provider for HDFS
Stars: ✭ 35 (-72.66%)
teraslice: Scalable data processing pipelines in JavaScript
Stars: ✭ 48 (-62.5%)
Bigdata: 💎🔥 Big data study notes
Stars: ✭ 488 (+281.25%)
Ibis: A pandas-like deferred expression system with first-class SQL support
Stars: ✭ 1,630 (+1173.44%)
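The "deferred expression" idea behind Ibis can be illustrated with a toy sketch. This is not Ibis's actual API; all class and method names below are hypothetical. The point is that expressions are built lazily as a tree and only turned into SQL (or executed) on demand:

```python
# Toy deferred-expression sketch (hypothetical, NOT Ibis's real API):
# expressions are built lazily, then compiled to SQL text on demand.
class Column:
    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        # Comparing a column builds a predicate node, not a boolean.
        return Predicate(f"{self.name} > {value!r}")


class Predicate:
    def __init__(self, sql):
        self.sql = sql


class Table:
    def __init__(self, name, columns):
        self.name = name
        self._filters = []
        for c in columns:
            setattr(self, c, Column(c))

    def filter(self, pred):
        # Return a new expression; nothing is executed here.
        t = Table(self.name, [])
        t.__dict__.update(self.__dict__)
        t._filters = self._filters + [pred]
        return t

    def compile(self):
        where = " AND ".join(p.sql for p in self._filters)
        return f"SELECT * FROM {self.name}" + (f" WHERE {where}" if where else "")


events = Table("events", ["user_id", "duration"])
expr = events.filter(events.duration > 60)   # still just an expression tree
print(expr.compile())  # SELECT * FROM events WHERE duration > 60
```

Real Ibis works the same way at a high level: the expression carries enough structure to be compiled to the SQL dialect of whatever backend executes it.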
Devops Python Tools: 80+ DevOps & Data CLI tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark data converters & validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr, etc.
Stars: ✭ 406 (+217.19%)
wasp: WASP is a framework for building complex real-time big data applications. It relies on a Kappa/Lambda-style architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-85.16%)
datasqueeze: Hadoop utility to compact small files
Stars: ✭ 18 (-85.94%)
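The small-files problem datasqueeze addresses is that every HDFS file costs NameNode memory, so millions of tiny files hurt cluster health; compaction merges them into fewer large files. The idea can be sketched locally with plain files (a hypothetical illustration of the concept, not datasqueeze's code, which operates on HDFS):

```python
# Hypothetical sketch of small-file compaction: merge many small
# record-per-line files into one large file, mirroring what a Hadoop
# compactor does on HDFS to reduce NameNode metadata pressure.
import os
import tempfile

def compact(paths, out_path):
    """Concatenate small files (in sorted order) into one large file."""
    with open(out_path, "w") as out:
        for p in sorted(paths):
            with open(p) as f:
                out.write(f.read())
    return out_path

# Create three tiny input files in a temp directory.
tmp = tempfile.mkdtemp()
small = []
for i in range(3):
    p = os.path.join(tmp, f"part-{i}.txt")
    with open(p, "w") as f:
        f.write(f"record {i}\n")
    small.append(p)

compacted = compact(small, os.path.join(tmp, "compacted.txt"))
with open(compacted) as f:
    print(f.read())
```

On a real cluster the compactor would also preserve the file format (e.g. Avro or Parquet) and delete the originals once the merged file is durable.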
aaocp: A big data project for analyzing user behavior logs
Stars: ✭ 53 (-58.59%)
Hadoop For Geoevent: ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-96.09%)
Bigdata Interview: 🎯 🌟 [Big data interview questions] Big data interview questions collected from around the web, together with my own answer summaries. Currently covers the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.
Stars: ✭ 857 (+569.53%)
Repository: A personal knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-28.12%)
docker-hadoop: Docker image for the main Apache Hadoop components (YARN/HDFS)
Stars: ✭ 59 (-53.91%)
bigdata-fun: A complete (distributed) big data stack, running in containers
Stars: ✭ 14 (-89.06%)
py-hdfs-mount: Mount HDFS with FUSE; works with Kerberos!
Stars: ✭ 13 (-89.84%)
God Of Bigdata: Focused on big data study and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+4593.75%)
hive to es: A small tool for syncing Hive data warehouse data to Elasticsearch
Stars: ✭ 21 (-83.59%)
Camus: Mirror of LinkedIn's Camus
Stars: ✭ 81 (-36.72%)
Wifi: A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-27.34%)
Dynamometer: A tool for scale and performance testing of HDFS, with a specific focus on the NameNode.
Stars: ✭ 122 (-4.69%)
Swarmlet: A self-hosted, open-source Platform as a Service that enables easy swarm deployments, load balancing, automatic SSL, metrics, analytics, and more.
Stars: ✭ 373 (+191.41%)
init ec2: Initialize an EC2 cluster: password-free login (ubuntu and root), hostname setup, and hosts file configuration.
Stars: ✭ 11 (-91.41%)
Cloudbreak: A tool for provisioning and managing Apache Hadoop clusters in the cloud. As part of the Hortonworks Data Platform, Cloudbreak makes it easy to provision, configure, and elastically grow HDP clusters on cloud infrastructure, including AWS, Azure, GCP, and OpenStack.
Stars: ✭ 301 (+135.16%)
Hdfs Shell: HDFS Shell is an HDFS manipulation tool for working with the functions integrated in Hadoop DFS
Stars: ✭ 117 (-8.59%)
Temps: λ A self-hostable serverless function runtime, inspired by zeit now.
Stars: ✭ 15 (-88.28%)
pmml4s: PMML scoring library for Scala
Stars: ✭ 49 (-61.72%)
corc: An ORC file scheme for the Cascading data processing platform.
Stars: ✭ 14 (-89.06%)
big-data-exploration: [Archive] Intern project - Big Data Exploration using MongoDB. This repository is NOT a supported MongoDB product.
Stars: ✭ 43 (-66.41%)
push-package-action: | Public | GitHub Action to push a package to Octopus Deploy
Stars: ✭ 23 (-82.03%)
easy qsub: Easily submit multiple PBS jobs or run local jobs in parallel. Multiple input files supported.
Stars: ✭ 26 (-79.69%)
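Running one job per input file in parallel on the local machine, as easy qsub's local mode does, can be sketched with the Python standard library. This is a generic illustration of the pattern, not easy qsub's implementation; `run_job` stands in for the command template that would otherwise be rendered and submitted to PBS:

```python
# Generic sketch of running local "jobs" in parallel over multiple
# input files, in the spirit of easy qsub's local mode.
from concurrent.futures import ThreadPoolExecutor

def run_job(input_file):
    # Placeholder job: in a real tool this would render a command
    # template with the input file and run it as a subprocess.
    return f"processed {input_file}"

inputs = ["a.fasta", "b.fasta", "c.fasta"]

# map() preserves input order even though jobs run concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_job, inputs))

print(results)
```

For CPU-bound work a process pool (or separate subprocesses, as a qsub-style tool would spawn) is the better fit; threads suffice when each job is itself an external command.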
LogAnalyzeHelper: Cleaning program for a forum log analysis system (includes an IP rule library, UDF development, MapReduce programs, and log data)
Stars: ✭ 33 (-74.22%)
create-release-action: | Public | GitHub Action to create a release in Octopus Deploy
Stars: ✭ 68 (-46.87%)
deploy shard mongodb: A set of scripts and resources for deploying a MongoDB replicated sharded cluster.
Stars: ✭ 17 (-86.72%)
Librarian: Easily host your iOS and Android builds locally!
Stars: ✭ 35 (-72.66%)
manager: The API endpoint that manages nebula orchestrator clusters
Stars: ✭ 28 (-78.12%)
testnet deploy: Deployment scripts and monitoring configuration for a Cosmos validator setup
Stars: ✭ 19 (-85.16%)
ML-CaPsule: ML-CaPsule is a project for beginners and experienced data science enthusiasts who don't have a mentor or guidance and wish to learn machine learning. Using this repo they can learn ML, DL, and related technologies through real-world projects and become interview-ready.
Stars: ✭ 177 (+38.28%)
laniakea: Laniakea is a utility for managing instances at various cloud providers; it aids in setting up a fuzzing cluster.
Stars: ✭ 28 (-78.12%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (-69.53%)
Meteor-Mailer: 📮 Bulletproof email queue on top of NodeMailer, with support for multiple-cluster and multiple-server setups
Stars: ✭ 21 (-83.59%)
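What makes a send queue "bulletproof" is retrying failed deliveries instead of dropping them. The core pattern, retry with exponential backoff, can be sketched in a few lines (a toy illustration of the idea, not Meteor-Mailer's actual code, which is JavaScript and persists its queue):

```python
# Toy sketch of a retrying send queue: back off exponentially on
# failure, in the spirit of a "bulletproof" mail queue.
import time

def send_with_retry(send, message, attempts=3, base_delay=0.01):
    """Try send(message); on failure, sleep base_delay * 2**attempt and retry."""
    for attempt in range(attempts):
        try:
            return send(message)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt))

calls = []
def flaky_send(msg):
    # Simulated SMTP transport that fails the first two attempts.
    calls.append(msg)
    if len(calls) < 3:
        raise RuntimeError("SMTP temporarily unavailable")
    return "sent"

result = send_with_retry(flaky_send, "hello")
print(result, len(calls))  # sent 3
```

A production queue would additionally persist pending messages so retries survive a process restart, which is what the cluster/server support in the entry above is about.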