571 Open source projects that are alternatives to or similar to bigdata-doc

Flink Boot
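The Lazy Squirrel (懒松鼠) Flink-Boot scaffold brings Flink fully into the Spring ecosystem, so developers can write distributed stream-processing programs in the style of Java web development. Lazy Squirrel aims to let developers write business code quickly at a lower onboarding cost, without needing to understand distributed computing theory or the details of the Flink framework. To further improve agility when building large projects with the scaffold, it integrates the Spring framework for bean management by default, along with frameworks commonly used in microservice and web development, such as the MyBatis ORM framework, the Hibernate Validator validation framework, and the Spring Retry framework. See the scaffold's feature list for details.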
Stars: ✭ 209 (+464.86%)
Mutual labels:  bigdata, flink
HDFS-Netdisc
A distributed cloud storage system based on Hadoop 🌴
Stars: ✭ 56 (+51.35%)
Mutual labels:  hadoop, hdfs
Flinkx
Based on Apache Flink. Supports data synchronization/integration and streaming SQL computation.
Stars: ✭ 2,651 (+7064.86%)
Mutual labels:  bigdata, flink
learning-spark
Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (-24.32%)
Mutual labels:  hadoop, bigdata
Ecommercerecommendsystem
A real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark.
Stars: ✭ 139 (+275.68%)
Mutual labels:  bigdata, flink
ros hadoop
Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.
Stars: ✭ 92 (+148.65%)
Mutual labels:  hadoop, hdfs
flokkr
Documentation placeholder and utilities for all the other containers.
Stars: ✭ 30 (-18.92%)
Mutual labels:  hadoop, bigdata
Hdfs Shell
HDFS Shell is an HDFS manipulation tool that works with the functions integrated in Hadoop DFS
Stars: ✭ 117 (+216.22%)
Mutual labels:  hadoop, hdfs
Dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (+229.73%)
Mutual labels:  hadoop, hdfs
Big Whale
Scheduling of offline jobs (Spark, Flink, etc.) and monitoring of real-time jobs
Stars: ✭ 163 (+340.54%)
Mutual labels:  hadoop, flink
Hive
Apache Hive
Stars: ✭ 4,031 (+10794.59%)
Mutual labels:  hive, hadoop
Awesome Learning
Hands-on source code repository: https://github.com/jast90/bigdata . Search for Jast on WeChat and follow the official account for the latest technical posts 😯.
Stars: ✭ 197 (+432.43%)
Mutual labels:  hadoop, bigdata
Shifu
An end-to-end machine learning and data mining framework on Hadoop
Stars: ✭ 207 (+459.46%)
Mutual labels:  hadoop, bigdata
Hive Jdbc Uber Jar
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (+408.11%)
Mutual labels:  hive, hadoop
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (+127.03%)
Mutual labels:  hive, flink
kafka-connect-fs
Kafka Connect FileSystem Connector
Stars: ✭ 107 (+189.19%)
Mutual labels:  hadoop, hdfs
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+12281.08%)
Mutual labels:  hive, hadoop
Hive Funnel Udf
Hive UDFs for funnel analysis
Stars: ✭ 72 (+94.59%)
Mutual labels:  hive, hadoop
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (+981.08%)
Mutual labels:  hadoop, bigdata
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+59489.19%)
Mutual labels:  hadoop, mapreduce
Behemoth
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Stars: ✭ 286 (+672.97%)
Mutual labels:  hadoop, mapreduce
Jsr203 Hadoop
A Java NIO file system provider for HDFS
Stars: ✭ 35 (-5.41%)
Mutual labels:  hadoop, hdfs
Data Algorithms Book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+2464.86%)
Mutual labels:  hadoop, mapreduce
logparser
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+275.68%)
Mutual labels:  hive, flink
Drill
Apache Drill is a distributed MPP query layer for self describing data
Stars: ✭ 1,619 (+4275.68%)
Mutual labels:  hive, hadoop
BigData-News
A real-time big data system project for a news site, based on Spark 2.2
Stars: ✭ 36 (-2.7%)
Mutual labels:  hive, hadoop
Flink Notes
Flink study notes
Stars: ✭ 106 (+186.49%)
Mutual labels:  bigdata, flink
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+4916.22%)
Mutual labels:  hadoop, flink
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+481.08%)
Mutual labels:  hadoop, bigdata
hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
Stars: ✭ 16 (-56.76%)
Mutual labels:  hive, hadoop
Addax
Addax is an open source universal ETL tool that supports most RDBMS and NoSQL databases, helping you transfer data from one place to another.
Stars: ✭ 615 (+1562.16%)
Mutual labels:  hive, hadoop
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+305.41%)
Mutual labels:  hadoop, hdfs
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+4305.41%)
Mutual labels:  hadoop, hdfs
Facebook Hive Udfs
Facebook's Hive UDFs
Stars: ✭ 213 (+475.68%)
Mutual labels:  hive, hadoop
Javaorbigdata Interview
A collection of interview knowledge points for Java developers and big data developers
Stars: ✭ 203 (+448.65%)
Mutual labels:  hadoop, bigdata
Wedatasphere
WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+905.41%)
Mutual labels:  hive, hadoop
Asakusafw
Asakusa Framework
Stars: ✭ 114 (+208.11%)
Mutual labels:  hadoop, mapreduce
Datafaker
Datafaker is a large-scale test data and flow test data generation tool. Datafaker fakes data and inserts it into varied data sources. A test data generation tool.
Stars: ✭ 327 (+783.78%)
Mutual labels:  hive, bigdata
smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+113.51%)
Mutual labels:  hive, hadoop
Hadoop Attack Library
A collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+516.22%)
Mutual labels:  hadoop, bigdata
hive-jdbc-driver
An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (-16.22%)
Mutual labels:  hive, hadoop
Bigdata practice
Big data analysis and visualization in practice
Stars: ✭ 166 (+348.65%)
Mutual labels:  hive, bigdata
Presto
The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+34918.92%)
Mutual labels:  hive, hadoop
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-5.41%)
Mutual labels:  hive, hadoop
Movie recommend
A Spark-based movie recommendation system, including a crawler project, a website, an admin back-end system, and the Spark recommendation system
Stars: ✭ 2,092 (+5554.05%)
Mutual labels:  hive, hadoop
liquibase-impala
Liquibase extension to add Impala Database support
Stars: ✭ 23 (-37.84%)
Mutual labels:  hive, hadoop
hadoop-etl-udfs
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-54.05%)
Mutual labels:  hive, hadoop
DaFlow
Apache Spark based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-35.14%)
Mutual labels:  hive, hadoop
TIL
Today I Learned
Stars: ✭ 43 (+16.22%)
Mutual labels:  hive, hadoop
cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-56.76%)
Mutual labels:  hive, hadoop
litemall-dw
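A big data project built on the open-source Litemall e-commerce project, covering front-end event tracking (openresty + lua) and back-end event tracking, a five-layer data warehouse, real-time computation, and user profiling. The big data platform uses CDH 6.3.2 (scripted with vagrant + ansible) and also includes Azkaban workflows.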
Stars: ✭ 36 (-2.7%)
Mutual labels:  hive, flink
xxhadoop
Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes/code/demos. Don't fork, just star!
Stars: ✭ 37 (+0%)
Mutual labels:  hive, hadoop
EngineeringTeam
A place for organizing the materials of the YBIGTA (와이빅타) engineering team.
Stars: ✭ 41 (+10.81%)
Mutual labels:  hive, hadoop
cloud
Cloud computing: environment setup and configuration files for hadoop, hive, hue, oozie, sqoop, hbase, and zookeeper
Stars: ✭ 48 (+29.73%)
Mutual labels:  hive, hadoop
TitanDataOperationSystem
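The best big data project. The Titan Data Operation System is a full-stack, closed-loop system: a web system for data visualization, flume-kafka-flume for log ingestion, a data warehouse designed in Hive, Spark code for transforming between warehouse tables and migrating ADS-layer tables to MySQL, and Azkaban for scheduling timed jobs. Technologies used: Java/Scala, Hadoop, Spark, Hive, Kafka, Flume, Azkaban, SpringBoot, Bootstrap, Echart, and others.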
Stars: ✭ 62 (+67.57%)
Mutual labels:  hive, hadoop
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (+132.43%)
Mutual labels:  bigdata, hdfs
BigInsights-on-Apache-Hadoop
Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix
Stars: ✭ 21 (-43.24%)
Mutual labels:  hive, hadoop
ETL-Starter-Kit
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.
Stars: ✭ 21 (-43.24%)
Mutual labels:  hive, bigdata
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+278.38%)
Mutual labels:  hive, hadoop
Quicksql
A Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+4821.62%)
Mutual labels:  hive, flink
61-120 of 571 similar projects