LearningJournal / SparkProgrammingInScala

Licence: MIT license

Apache Spark Course Material

Projects that are alternatives to, or similar to, SparkProgrammingInScala

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Stars: ✭ 39 (-31.58%)

Mutual labels: big-data, apache-spark, datalake, spark-sql

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-40.35%)

Mutual labels: big-data, bigdata, spark-sql

Movies-Analytics-in-Spark-and-Scala

Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.

Stars: ✭ 47 (-17.54%)

Mutual labels: big-data, spark-sql, spark-scala

gan deeplearning4j

Automatic feature engineering using Generative Adversarial Networks using Deeplearning4j and Apache Spark.

Stars: ✭ 19 (-66.67%)

Mutual labels: big-data, apache-spark, bigdata

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+2919.3%)

Mutual labels: apache-spark, bigdata, spark-sql

leaflet heatmap

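A simple visualization of call data from Huzhou. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved offline for computation and analysis. After the data is processed in parallel with Apache Spark, the heatmap is also drawn with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The rendering is currently implemented with Apache Spark; perhaps because Apache Spark is not well suited to this kind of computation, or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .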

Stars: ✭ 13 (-77.19%)

Mutual labels: big-data, apache-spark, bigdata

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (+277.19%)

Mutual labels: big-data, apache-spark, bigdata

spark-twitter-sentiment-analysis

Sentiment Analysis of a Twitter Topic with Spark Structured Streaming

Stars: ✭ 55 (-3.51%)

Mutual labels: apache-spark, spark-sql

awesome-coder-resources

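A refueling station on the road of programming! [Continuously updated; stars are welcome, and do come back often] [Contents: programming, learning, and reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]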

Stars: ✭ 54 (-5.26%)

Mutual labels: big-data, bigdata

geospark

bring sf to spark in production

Stars: ✭ 53 (-7.02%)

Mutual labels: apache-spark, spark-sql

Detecting-Malicious-URL-Machine-Learning

No description or website provided.

Stars: ✭ 47 (-17.54%)

Mutual labels: big-data, apache-spark

dt-sql-parser

SQL Parsers for BigData, built with antlr4.

Stars: ✭ 135 (+136.84%)

Mutual labels: bigdata, spark-sql

spark-records

Bulletproof Apache Spark jobs with fast root cause analysis of failures.

Stars: ✭ 67 (+17.54%)

Mutual labels: big-data, apache-spark

mmtf-spark

Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.

Stars: ✭ 20 (-64.91%)

Mutual labels: big-data, apache-spark

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Stars: ✭ 126 (+121.05%)

Mutual labels: big-data, bigdata

awesome-tools

curated list of awesome tools and libraries for specific domains

Stars: ✭ 31 (-45.61%)

Mutual labels: big-data, apache-spark

twitter-archive-reader

Full featured TypeScript Twitter archive reader and browser

Stars: ✭ 43 (-24.56%)

Mutual labels: big-data, bigdata

meetups-archivos

Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to share, decentralize, and spread knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …

Stars: ✭ 60 (+5.26%)

Mutual labels: big-data, bigdata

Real-time-Data-Warehouse

Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi

Stars: ✭ 52 (-8.77%)

Mutual labels: datalake, spark-sql

SynapseML

Simple and Distributed Machine Learning

Stars: ✭ 3,355 (+5785.96%)

Mutual labels: big-data, apache-spark


Apache Spark 3 - Spark Programming in Scala for Beginners

This is the central repository for all materials related to the Apache Spark 3 - Spark Programming in Scala for Beginners course by Prashant Pandey.
You can get the full course at Apache Spark Course @ Udemy.

Description

I am creating the Apache Spark 3 - Spark Programming in Scala for Beginners course to help you understand Spark programming and apply that knowledge to build data engineering solutions. The course is example-driven and follows a working-session approach: we code live and explain each concept as it comes up.

Who should take this Course?

I designed this course for software engineers who want to develop data engineering pipelines and applications using Apache Spark. It is also for data architects and data engineers who are responsible for designing and building their organization's data-centric infrastructure. A third audience is managers and architects who do not work on Spark implementation directly but work with the people who implement Apache Spark at the ground level.

Spark and source code version

This course uses Apache Spark 3.x. I have tested all the source code and examples used in this course on the Apache Spark 3.0.0 open-source distribution.
