550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...

Stars: ✭ 226 (+425.58%)

Mutual labels: hadoop

Jsr203 Hadoop

A Java NIO file system provider for HDFS

Stars: ✭ 35 (-18.6%)

Mutual labels: hadoop

Parquet4s

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

Stars: ✭ 125 (+190.7%)

Mutual labels: hadoop

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (+2106.98%)

Mutual labels: hadoop

Hive Jdbc Uber Jar

Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version

Stars: ✭ 188 (+337.21%)

Mutual labels: hadoop

Interview Questions Collection

Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more

Stars: ✭ 21 (-51.16%)

Mutual labels: hadoop

Hdfs Shell

HDFS Shell is an HDFS manipulation tool that works with the functions integrated in Hadoop DFS

Stars: ✭ 117 (+172.09%)

Mutual labels: hadoop

Bigdata Interview

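🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered online, together with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks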

Stars: ✭ 857 (+1893.02%)

Mutual labels: hadoop

ambari-hdp-docker

Dockerfiles and Docker Compose for HDP 2.6 with Blueprints

Stars: ✭ 23 (-46.51%)

Mutual labels: hadoop

Hadoop Pot

A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.

Stars: ✭ 8 (-81.4%)

Mutual labels: hadoop

Datax

DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto(Trino), PostgreSQL, SQL Server

Stars: ✭ 116 (+169.77%)

Mutual labels: hadoop

Kylo

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

Stars: ✭ 916 (+2030.23%)

Mutual labels: hadoop

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…

Stars: ✭ 12,277 (+28451.16%)

Mutual labels: hadoop

Szt Bigdata

Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟

Stars: ✭ 826 (+1820.93%)

Mutual labels: hadoop

Tensorflowonyarn

Support TensorFlow on YARN

Stars: ✭ 114 (+165.12%)

Mutual labels: hadoop

Bigdataguide

Big data learning: learn big data from scratch, including study videos for each learning stage and interview materials

Stars: ✭ 817 (+1800%)

Mutual labels: hadoop

Luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

Stars: ✭ 15,226 (+35309.3%)

Mutual labels: hadoop

Useractionanalyzeplatform

Big data platform for e-commerce user behavior analysis

Stars: ✭ 645 (+1400%)

Mutual labels: hadoop

Xlearning Xdml

Extremely distributed machine learning

Stars: ✭ 113 (+162.79%)

Mutual labels: hadoop

Javapdf

🍣 100 Java e-books and technical books in PDF (take pride in downloading and reading; be ashamed of merely liking and bookmarking)

Stars: ✭ 609 (+1316.28%)

Mutual labels: hadoop

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (+274.42%)

Mutual labels: hadoop

Dist Keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

Stars: ✭ 613 (+1325.58%)

Mutual labels: hadoop

Introtohadoopandmr udacity course

🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"

Stars: ✭ 110 (+155.81%)

Mutual labels: hadoop

Hadoop study

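Regularly updated documentation for commonly used big data components in the Hadoop ecosystem, with focus in this order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos (the project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code; continuously updated!)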

Stars: ✭ 567 (+1218.6%)

Mutual labels: hadoop

LR-GCCF

Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach, AAAI2020

Stars: ✭ 99 (+130.23%)

Mutual labels: collaborative-filtering

Gis Tools For Hadoop

The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.

Stars: ✭ 485 (+1027.91%)

Mutual labels: hadoop

Haproxy Configs

80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.

Stars: ✭ 106 (+146.51%)

Mutual labels: hadoop

Pdf

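Programming e-books, e-books, and programming books, including C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithms, computer science, design patterns, software testing, refactoring and optimization, and more categories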

Stars: ✭ 12,009 (+27827.91%)

Mutual labels: hadoop

Hadoop Common

Mirror of Apache Hadoop common

Stars: ✭ 155 (+260.47%)

Mutual labels: hadoop

Bigdata Notes

Beginner's guide to big data ⭐

Stars: ✭ 10,991 (+25460.47%)

Mutual labels: hadoop

God Of Bigdata

Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+13872.09%)

Mutual labels: hadoop

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (+862.79%)

Mutual labels: hadoop

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+844.19%)

Mutual labels: hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (+400%)

Mutual labels: hadoop

Movie recommend

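A Spark-based movie recommendation system, including a crawler project, a website, a back-office management system, and the Spark recommendation system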

Stars: ✭ 2,092 (+4765.12%)

Mutual labels: hadoop

Antsdb

AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase