God Of Bigdata: Focused on big data learning and interview preparation; start your path to big data mastery. Flink/Spark/Hadoop/HBase/Hive...
Stars: ✭ 6,008 (+28509.52%)
the-apache-ignite-book: All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above.
Stars: ✭ 65 (+209.52%)
DGFraud-TF2: A Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X
Stars: ✭ 84 (+300%)
common-datax: A general-purpose data synchronization microservice based on DataX; a single RESTful interface handles all common data synchronization
Stars: ✭ 51 (+142.86%)
TiBigData: TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+814.29%)
cloud: Environment setup and configuration files for cloud computing with Hadoop, Hive, Hue, Oozie, Sqoop, HBase, and ZooKeeper
Stars: ✭ 48 (+128.57%)
Bigdataguide: Big data learning from scratch, including learning videos and interview materials for each stage of big data study
Stars: ✭ 817 (+3790.48%)
qwery: A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (+33.33%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (+252.38%)
logparser: Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+561.9%)
DaFlow: Apache Spark based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (+14.29%)
Datafaker: A large-scale test data and flow test data generation tool. Datafaker fakes data and inserts it into various data sources.
Stars: ✭ 327 (+1457.14%)
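litemall-dw: A big data project based on the open-source Litemall e-commerce project, including front-end event tracking (openresty+lua) and back-end event tracking; a five-layer data warehouse, real-time computing, and user profiling. The big data platform uses CDH6.3.2 (scripted with vagrant+ansible) and also includes Azkaban workflows.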
Stars: ✭ 36 (+71.43%)
Pyetl: Python ETL framework
Stars: ✭ 33 (+57.14%)
dockerfiles: Multiple Docker container images for the main Big Data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (+38.1%)
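TitanDataOperationSystem: The best big data project. "Titan Data Operation System" is a full-stack, closed-loop system: a web system for data visualization, flume-kafka-flume for reading logs, a data warehouse designed in Hive, Spark code for transforming between warehouse tables and migrating ADS-layer tables to MySQL, and Azkaban for scheduling timed jobs. Technologies used: Java/Scala, Hadoop, Spark, Hive, Kafka, Flume, Azkaban, SpringBoot, Bootstrap, Echart, etc.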
Stars: ✭ 62 (+195.24%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (+500%)
gan deeplearning4j: Automatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-9.52%)
hadoopoffice: HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (+166.67%)
SparkTwitterAnalysis: An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) for building the project.
Stars: ✭ 29 (+38.1%)
UnROOT.jl: Native Julia I/O package to work with CERN ROOT files
Stars: ✭ 52 (+147.62%)
Anomaly Detection: Anomaly detection with anomalize and Google Trends data
Stars: ✭ 38 (+80.95%)
waggle-dance: Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
Stars: ✭ 194 (+823.81%)
hivemind: Hive API server (offloads most API calls from hived) implemented using Python+SQL
Stars: ✭ 46 (+119.05%)
cubetl: CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (+0%)
etlflow: EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various tasks and jobs on GCP and AWS.
Stars: ✭ 38 (+80.95%)
bqv: The simplest tool to manage views of BigQuery.
Stars: ✭ 22 (+4.76%)
vite-primevue-starter: Vue 3 starter project for using PrimeVue 3 with Vite 2 - Pages, Layouts, Validation
Stars: ✭ 37 (+76.19%)
cds: Data syncing in golang for ClickHouse.
Stars: ✭ 839 (+3895.24%)
awesome-open-mlops: The Fuzzy Labs guide to the universe of open source MLOps
Stars: ✭ 304 (+1347.62%)
ts-detox-example: Example TypeScript + React-Native + Jest project that integrates Detox for writing end-to-end tests
Stars: ✭ 54 (+157.14%)
ga-fetcher: Fetch Google Analytics data with Google APIs in Node.js 🚠
Stars: ✭ 14 (-33.33%)
awesome-bigdata: A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 11,093 (+52723.81%)
apiary: Apiary provides modules which can be combined to create a federated cloud data lake
Stars: ✭ 30 (+42.86%)
nl4dv: A Python toolkit to create Visualizations (Vis) using natural language (NL) or add an NL interface to existing Vis.
Stars: ✭ 63 (+200%)
aaocp: A big data project for analyzing user behavior logs
Stars: ✭ 53 (+152.38%)
meetups-archivos: Slides, code and videos from the meetups, data science days, video calls and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops and Semilleros …
Stars: ✭ 60 (+185.71%)
cobra-policytool: Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-23.81%)
genero-nomes: Classifies names by gender according to the IBGE API
Stars: ✭ 33 (+57.14%)
d20datascience: Data science investigations into the mechanics of the world's greatest role playing game
Stars: ✭ 50 (+138.1%)
enlite-starter: Enlite Starter - React Dashboard Starter Template with Firebase Auth
Stars: ✭ 28 (+33.33%)
real-estate-neighborhood-prediction: Code to reproduce the experiments of "The economic value of neighborhoods: Predicting real estate prices from the urban environment"
Stars: ✭ 53 (+152.38%)
hive-jdbc-driver: An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (+47.62%)
vulkn: Love your Data. Love the Environment. Love VULKИ.
Stars: ✭ 43 (+104.76%)
symfony-lts-docker-starter: 🐳 Dockerize your Symfony project using a complete stack (Makefile, Docker-Compose, CI, a bunch of quality assurance tools, tests ...) with a base built on up-to-date components and best practices.
Stars: ✭ 39 (+85.71%)