Horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+8024.49%)
Spark Ffm: FFM (Field-Aware Factorization Machine) on Spark
Stars: ✭ 101 (-31.29%)
Scala Samples: Pieces of Scala code that explain Scala syntax and related topics, such as what you can do with it
Stars: ✭ 125 (-14.97%)
Data science blogs: A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-5.44%)
Pyspark Stubs: Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (-33.33%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-17.01%)
Repo 2019: BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics
Stars: ✭ 133 (-9.52%)
Flint: Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-42.18%)
Cube.js: 📊 Cube, an Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+8051.7%)
Spark States: Custom state store providers for Apache Spark
Stars: ✭ 83 (-43.54%)
Zparkio: Boilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (-17.69%)
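Technology Talk: A compilation of knowledge on the Java ecosystem: common technical frameworks, open-source middleware, system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, production troubleshooting, personal growth, and reflections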
Stars: ✭ 12,136 (+8155.78%)
Bitcoin Value Predictor: [NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-38.1%)
Abris: Avro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-11.56%)
Ammonite Spark: Run Spark calculations from Ammonite
Stars: ✭ 88 (-40.14%)
Kinesis Sql: Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-18.37%)
Cuesheet: A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-41.5%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+1008.84%)
Hops Examples: Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-42.86%)
Spylon Kernel: Jupyter kernel for Scala and Spark
Stars: ✭ 129 (-12.24%)
Hadoop cookbook: Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-44.22%)
Spark Lucenerdd: Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-22.45%)
Datacompy: Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (+0%)
MLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+738.1%)
Nd4j: Fast, Scientific and Numerical Computing for the JVM (NDArrays)
Stars: ✭ 1,742 (+1085.03%)
Sparkling Graph: SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-5.44%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1070.75%)
Spring Shiro Spark: Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...
Stars: ✭ 114 (-22.45%)
Lehar: Visualize data using relative ordering
Stars: ✭ 81 (-44.9%)
Spark Gbtlr: Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-44.9%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-46.26%)
Docker Spark: 🚢 Docker image for Apache Spark
Stars: ✭ 78 (-46.94%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+1017.01%)
Xlearning Xdml: Extremely distributed machine learning
Stars: ✭ 113 (-23.13%)
Home: ApacheCN open-source organization: announcements, introduction, members, activities, and contact channels
Stars: ✭ 1,199 (+715.65%)
Python Bigdata: Data science and Big Data with Python
Stars: ✭ 112 (-23.81%)
Cleanframes: Type-class based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-48.98%)
Ds Cheatsheets: List of Data Science Cheatsheets to rule the world
Stars: ✭ 9,452 (+6329.93%)
Isolation Forest: A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
Stars: ✭ 139 (-5.44%)
Airflow Pipeline: An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-12.93%)
Archivespark: An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
Stars: ✭ 111 (-24.49%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+712.93%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-49.66%)
Elephas: Distributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+934.69%)
Lpa Detector: Optimize and improve the label propagation algorithm
Stars: ✭ 75 (-48.98%)
Feast: Feature Store for Machine Learning
Stars: ✭ 2,576 (+1652.38%)
Lambda Arch: Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-24.49%)
Labs: Research on distributed systems
Stars: ✭ 73 (-50.34%)
Kamu Cli: Next generation tool for decentralized exchange and transformation of semi-structured data
Stars: ✭ 69 (-53.06%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1162.59%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (-51.02%)