Alluxio: Alluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+13692.31%)
Goodreads etl pipeline: An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+1933.33%)
Mare: MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-71.79%)
Sparklyr: R interface for Apache Spark
Stars: ✭ 775 (+1887.18%)
Grid: A Lightning Component grid implementation that expects a server-side data store.
Stars: ✭ 35 (-10.26%)
Angel: A Flexible and Powerful Parameter Server for large-scale machine learning
Stars: ✭ 6,458 (+16458.97%)
Spark: Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+80971.79%)
Mobius: C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+2282.05%)
Sparklearning: Learning Apache Spark, including code and data. Most parts can run locally.
Stars: ✭ 558 (+1330.77%)
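Coding Now: Study notes, along with e-books I have read, video resources, and blogs, websites, and tools I have collected and found useful. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.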
Stars: ✭ 750 (+1823.08%)
Data Algorithms Book: MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+2333.33%)
Sparkctr: CTR prediction model based on Spark (LR, GBDT, DNN)
Stars: ✭ 740 (+1797.44%)
Hail: Scalable genomic data analysis.
Stars: ✭ 706 (+1710.26%)
Scriptis: Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+1684.62%)
Sfdc Debug Logs: Browser extension for Salesforce logs management
Stars: ✭ 28 (-28.21%)
Cogstack Pipeline: Distributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning
Stars: ✭ 26 (-33.33%)
Freestyle: A cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (+1507.69%)
Sendgrid Apex: SendGrid (http://sendgrid.com) Apex helper library.
Stars: ✭ 33 (-15.38%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+14402.56%)
Stormtweetssentimentd3viz: Computes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.
Stars: ✭ 25 (-35.9%)
Heracles: High performance HBase / Spark SQL engine
Stars: ✭ 27 (-30.77%)
Spark Swagger: Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-35.9%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+2428.21%)
Spark Daria: Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+1317.95%)
Chronicler: Scala toolchain for InfluxDB
Stars: ✭ 24 (-38.46%)
Justenoughscalaforspark: A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+1279.49%)
Fflib Apex Common: Common Apex Library supporting Apex Enterprise Patterns and much more!
Stars: ✭ 536 (+1274.36%)
Awesome Flink: 😎 A curated list of amazingly awesome Flink and Flink ecosystem resources
Stars: ✭ 530 (+1258.97%)
Digitrecognizer: Java Convolutional Neural Network example for Hand Writing Digit Recognition
Stars: ✭ 23 (-41.03%)
Lopq: Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+1258.97%)
Sparta: Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+1215.38%)
Df17 Ant To Sfdx: Metadata repository demonstrating move from Ant Migration Tools to the Salesforce CLI
Stars: ✭ 20 (-48.72%)
Salesforce Apex Templates: Looking for a possibility to use Email Templates within APEX code? Here's the answer!
Stars: ✭ 22 (-43.59%)
Cdap: An open source framework for building data analytic applications.
Stars: ✭ 509 (+1205.13%)
Magellan: Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (+1200%)
Kylo: Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+2248.72%)
Gimp Plugin Bimp: BIMP. Batch Image Manipulation Plugin for GIMP.
Stars: ✭ 500 (+1182.05%)
Apex Mdapi: Apex Wrapper for the Salesforce Metadata API
Stars: ✭ 493 (+1164.1%)
Flint: A Time Series Library for Apache Spark
Stars: ✭ 878 (+2151.28%)
Up Node8: The way this project is packaging the Node 8 app isn't the best. Try the official example of Apex Up that uses the Node binary!
Stars: ✭ 22 (-43.59%)
Easy Batch: The simple, stupid batch framework for Java
Stars: ✭ 493 (+1164.1%)
Npsp: The current version of the Salesforce.org Nonprofit Success Pack
Stars: ✭ 487 (+1148.72%)
Pytorch Auto Drive: Segmentation models (ERFNet, ENet, DeepLab, FCN...) and Lane detection models (SCNN, SAD, PRNet, RESA, LSTR...) based on PyTorch 1.6 with mixed precision training
Stars: ✭ 32 (-17.95%)
Tedsds: Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-64.1%)