oliveiraJessica / workshop-spark

Licence: other
Code for Spark workshops with a Docker-based development environment

Workshop Spark

This repository contains a few notebooks used in PySpark workshops, with theoretical explanations and an exploration of the Spark architecture using the Spark UI.

The development environment uses a container with PySpark and Jupyter Notebook.
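As a rough illustration of how a notebook in such an environment might reach the Spark UI, here is a minimal sketch. It is not taken from the workshop notebooks: the function name `start_demo_session`, the application name, and the toy aggregation are all hypothetical, and whether the UI port is published by the container is an assumption that depends on the repository's docker-compose configuration.

```python
def start_demo_session(app_name="workshop-spark-demo"):
    """Start (or reuse) a SparkSession, run a tiny job, and return (session, ui_url).

    Hypothetical helper for illustration only; not part of the repository.
    """
    # Imported here so the function can be defined even where PySpark is absent.
    from pyspark.sql import SparkSession

    # No master is set explicitly: the session uses whatever the environment configures.
    spark = SparkSession.builder.appName(app_name).getOrCreate()

    # A small job so that the Spark UI has at least one job and stage to show.
    df = spark.range(1_000_000)
    df.groupBy((df.id % 10).alias("bucket")).count().collect()

    # uiWebUrl is the address of the Spark UI for this application.
    return spark, spark.sparkContext.uiWebUrl


# Example use inside a notebook cell:
# spark, ui_url = start_demo_session()
# print(ui_url)
# spark.stop()  # the UI goes away once the session is stopped
```

The session is returned rather than stopped inside the function because the Spark UI is only served while the application is running.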

Prerequisite

docker-compose must be installed to use the environment.

Usage

1 - To start the container: docker-compose up -d

2 - Open localhost:8888

3 - To stop the container: docker-compose down
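Because the container is started in detached mode, it can be handy to confirm that something is listening on port 8888 before opening the browser. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and the check only shows that a TCP port accepts connections, not that Jupyter itself has finished starting.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open("localhost", 8888):
        print("Port 8888 is accepting connections; try opening localhost:8888.")
    else:
        print("Nothing is listening on port 8888 yet.")
```

If the port stays closed, checking the container status with docker-compose ps is a reasonable next step.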

Notebooks

The repository contains 3 notebooks with the same analysis at different levels of detail.

└──notebooks
    ├──semana_dados
    ├──guilda_dados
    └──TDC

References

Some references to help with studying PySpark:

Books:

Blogs:
