Pulsar Spark: When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-34.52%)
incubator-linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+2827.38%)
TiBigData: TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+128.57%)
Data science blogs: A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (+65.48%)
Justenoughscalaforspark: A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+540.48%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1073.81%)
Wedatasphere: WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. Currently the source code of Scriptis and Linkis has been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+342.86%)
Tedsds: Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-83.33%)
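Flink Learning: Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source-code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of ten billion records, monitoring and alerting). Readers are welcome to support my column 《大数据实时计算引擎 Flink 实战与性能优化》 ("Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization").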
Stars: ✭ 11,378 (+13445.24%)
W2v: Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-23.81%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (-14.29%)
Linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+2665.48%)
Spark Authorizer: A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (+67.86%)
Xsql: Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (+109.52%)
Spark Movie Lens: An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+786.9%)
Kamu Cli: Next generation tool for decentralized exchange and transformation of semi-structured data
Stars: ✭ 69 (-17.86%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-11.9%)
Dareblopy: Data Reading Blocks for Python
Stars: ✭ 82 (-2.38%)
Graphlog: API for accessing the GraphLog dataset
Stars: ✭ 82 (-2.38%)
Econml: ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
Stars: ✭ 1,238 (+1373.81%)
Ergo: A Python library for integrating model-based and judgmental forecasting
Stars: ✭ 82 (-2.38%)
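Mlcourse: Introduction to machine learning for biological information (intermediate workshop materials for the Scientific Research on Innovative Areas program 先進ゲノム支援, "Advanced Genome Support")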
Stars: ✭ 83 (-1.19%)
Jupyter to medium: Python package for publishing Jupyter Notebooks as Medium blogposts
Stars: ✭ 82 (-2.38%)
Ml: Machine learning projects, often on audio datasets
Stars: ✭ 83 (-1.19%)
Amazon Sagemaker Script Mode: Amazon SageMaker examples for prebuilt framework mode containers, a.k.a. Script Mode, and more (BYO containers and models etc.)
Stars: ✭ 82 (-2.38%)
Openml R: R package to interface with OpenML
Stars: ✭ 81 (-3.57%)
Juliaopt Notebooks: A collection of IJulia notebooks related to optimization
Stars: ✭ 81 (-3.57%)
Rsn: Learning to Exploit Long-term Relational Dependencies in Knowledge Graphs, ICML 2019
Stars: ✭ 83 (-1.19%)
Mapidoc: Public repo for Materials API documentation
Stars: ✭ 81 (-3.57%)
Dviz Course: Data visualization course material
Stars: ✭ 81 (-3.57%)
Pylbm: Numerical simulations using flexible Lattice Boltzmann solvers
Stars: ✭ 83 (-1.19%)
Expo Mf: Exposure Matrix Factorization: modeling user exposure in recommendation
Stars: ✭ 81 (-3.57%)
Continuous analysis: Computational reproducibility using Continuous Integration to produce verifiable end-to-end runs of scientific analysis.
Stars: ✭ 81 (-3.57%)
Video2gif code: Video2GIF neural network model from our paper at CVPR2016
Stars: ✭ 80 (-4.76%)
Chainer Handson: CAUTION: This is not maintained anymore. Visit https://github.com/chainer-community/chainer-colab-notebook/
Stars: ✭ 84 (+0%)
Airflow project: Scaffold of Apache Airflow executing Docker containers
Stars: ✭ 84 (+0%)
Coronabr: Historical time series of COVID-19 data, based on information from the Ministry of Health (Ministério da Saúde)
Stars: ✭ 83 (-1.19%)
Mleap: MLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+1366.67%)
Mad: Code for "Online and Linear Time Attention by Enforcing Monotonic Alignments"
Stars: ✭ 81 (-3.57%)