Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops and Semilleros …

Stars: ✭ 60 (+42.86%)

Mutual labels: bigdata

jupyterlab-sparkmonitor

JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook

Stars: ✭ 78 (+85.71%)

Mutual labels: pyspark

check-engine

Data validation library for PySpark 3.0.0

Stars: ✭ 29 (-30.95%)

Mutual labels: pyspark

v6.dooring.public

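A large-screen data visualization solution that provides a visual editing engine, helping individuals and companies easily build their own large-screen visualization applications.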

Stars: ✭ 323 (+669.05%)

Mutual labels: bigdata

spark-utils

Basic framework utilities to quickly start writing production ready Apache Spark applications

Stars: ✭ 25 (-40.48%)

Mutual labels: spark-streaming

learning notes

Study notes

Stars: ✭ 18 (-57.14%)

Mutual labels: bigdata

BigDataTools

tools for bigData

Stars: ✭ 36 (-14.29%)

Mutual labels: bigdata

kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data into data sc…

Stars: ✭ 474 (+1028.57%)

Mutual labels: pyspark

cassandra.realtime

Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink

Stars: ✭ 25 (-40.48%)

Mutual labels: spark-streaming

ai-deployment

Focuses on taking AI models to production and on model deployment.

Stars: ✭ 149 (+254.76%)

Mutual labels: pyspark

SparkTwitterAnalysis

An Apache Spark standalone application using the Spark API in Scala. The application is built with Simple Build Tool (SBT).

Stars: ✭ 29 (-30.95%)

Mutual labels: bigdata

Springboard-Data-Science-Immersive

No description or website provided.

Stars: ✭ 52 (+23.81%)

Mutual labels: pyspark

awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

Stars: ✭ 11,093 (+26311.9%)

Mutual labels: bigdata

pyspark-cheatsheet

PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster

Stars: ✭ 115 (+173.81%)

Mutual labels: pyspark

architect big data solutions with spark

code, labs and lectures for the course

Stars: ✭ 40 (-4.76%)

Mutual labels: spark-streaming

sparklanes

A lightweight data processing framework for Apache Spark

Stars: ✭ 17 (-59.52%)

Mutual labels: pyspark

vor

The new IoT Office Experience.

Stars: ✭ 44 (+4.76%)

Mutual labels: iot-sensors

Azure-Databricks-NYC-Taxi-Workshop

An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset

Stars: ✭ 71 (+69.05%)

Mutual labels: pyspark

phrase-at-scale

Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary size. Can be used with languages other than English.

Stars: ✭ 115 (+173.81%)

Mutual labels: pyspark

room-renting

Scrapes housing listings from Anjuke with Python and visualizes them with Amap (Gaode Maps).

Stars: ✭ 16 (-61.9%)

Mutual labels: bigdata

pyspark-k8s-boilerplate

Boilerplate for PySpark on Cloud Kubernetes

Stars: ✭ 24 (-42.86%)

Mutual labels: pyspark

wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-54.76%)

Mutual labels: spark-streaming

ETL-Starter-Kit

📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage, especially in data warehousing. This repository contains a starter kit featuring ETL-related work.

Stars: ✭ 21 (-50%)

Mutual labels: bigdata

flink-learn

Learning Flink: Flink CEP, Flink Core, Flink SQL

Stars: ✭ 70 (+66.67%)

Mutual labels: bigdata

databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Stars: ✭ 19 (-54.76%)

Mutual labels: pyspark

dlsa

Distributed least squares approximation (dlsa) implemented with Apache Spark