Docker Spark: 🚢 Docker image for Apache Spark
Stars: ✭ 78 (+81.4%)
Eel Sdk: Big Data Toolkit for the JVM
Stars: ✭ 140 (+225.58%)
Tf Yarn: Train TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (+76.74%)
Docker Hadoop: Apache Hadoop Docker image
Stars: ✭ 1,190 (+2667.44%)
Src: A lightweight distributed stream computing framework for Golang
Stars: ✭ 67 (+55.81%)
Waimak: Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (+39.53%)
phoenix: Apache Phoenix / HBase Spring Boot microservices
Stars: ✭ 23 (-46.51%)
Airflow Pipeline: An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+197.67%)
Hadoop Solr: Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (+18.6%)
Awesome Learning: Practice source code repository: https://github.com/jast90/bigdata . Search for Jast on WeChat and follow the official account to get the latest technical posts 😯.
Stars: ✭ 197 (+358.14%)
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+2283.72%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (+197.67%)
Devops Bash Tools: 550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...
Stars: ✭ 226 (+425.58%)
Jsr203 Hadoop: A Java NIO file system provider for HDFS
Stars: ✭ 35 (-18.6%)
Parquet4s: Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (+190.7%)
Data Algorithms Book: MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+2106.98%)
Hive Jdbc Uber Jar: Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (+337.21%)
Hdfs Shell: HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (+172.09%)
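Bigdata Interview: 🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with the author's own answers and summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.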
Stars: ✭ 857 (+1893.02%)
ambari-hdp-docker: Dockerfiles and Docker Compose for HDP 2.6 with Blueprints
Stars: ✭ 23 (-46.51%)
Hadoop Pot: A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-81.4%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (+169.77%)
Kylo: Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+2030.23%)
Deeplearning4j: Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…
Stars: ✭ 12,277 (+28451.16%)
Bigdataguide: Big data study guide for learning big data from scratch, including learning videos and interview materials for each stage of study
Stars: ✭ 817 (+1800%)
Luigi: Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Stars: ✭ 15,226 (+35309.3%)
Xlearning Xdml: extremely distributed machine learning
Stars: ✭ 113 (+162.79%)
Javapdf: 🍣 100 Java e-books and technical book PDFs (take pride in downloading and reading, shame in merely liking and bookmarking)
Stars: ✭ 609 (+1316.28%)
Dist Keras: Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+1325.58%)
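Hadoop study: Regularly updated documentation for big data components commonly used in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code. Continuously updated!)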
Stars: ✭ 567 (+1218.6%)
LR-GCCF: Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach, AAAI2020
Stars: ✭ 99 (+130.23%)
Gis Tools For Hadoop: The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
Stars: ✭ 485 (+1027.91%)
Haproxy Configs: 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Stars: ✭ 106 (+146.51%)
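Pdf: Programming e-books and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithm series, computer science, design patterns, software testing, refactoring and optimization, and more categories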
Stars: ✭ 12,009 (+27827.91%)
Hadoop Common: Mirror of Apache Hadoop common
Stars: ✭ 155 (+260.47%)
God Of Bigdata: Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+13872.09%)
Marmaray: Generic Data Ingestion & Dispersal Library for Hadoop
Stars: ✭ 414 (+862.79%)
Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+844.19%)
Sparkrdma: RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+400%)
Movie recommend: A Spark-based movie recommendation system, including a crawler project, a website, an admin back-end system, and a Spark recommendation system
Stars: ✭ 2,092 (+4765.12%)
Antsdb: AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase
Stars: ✭ 99 (+130.23%)
Hadoop Hdfs: Mirror of Apache Hadoop HDFS
Stars: ✭ 152 (+253.49%)
Wifi: A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (+116.28%)