518 open source projects that are alternatives to or similar to isarn-sketches-spark

aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+296.43%)
Mutual labels:  apache-spark, pyspark, dataframe
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+435.71%)
Mutual labels:  apache-spark, pyspark, dataframe
dlsa
Distributed least squares approximation (dlsa) implemented with Apache Spark
Stars: ✭ 25 (-10.71%)
Mutual labels:  pyspark, spark-ml
dh-core
Functional data science
Stars: ✭ 123 (+339.29%)
Mutual labels:  datasets, dataframes
Awesome Cybersecurity Datasets
A curated list of amazingly awesome Cybersecurity datasets
Stars: ✭ 380 (+1257.14%)
Mutual labels:  datasets, dataframe
woodwork
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Stars: ✭ 97 (+246.43%)
Mutual labels:  dataframe, dataframes
ai-deployment
Focused on bringing AI models into production and on model deployment
Stars: ✭ 149 (+432.14%)
Mutual labels:  pyspark, spark-ml
Quinn
pyspark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (+675%)
Mutual labels:  apache-spark, pyspark
pyspark-asyncactions
Asynchronous actions for PySpark
Stars: ✭ 30 (+7.14%)
Mutual labels:  apache-spark, pyspark
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+39.29%)
Mutual labels:  apache-spark, pyspark
Spark-for-data-engineers
Apache Spark for data engineers
Stars: ✭ 22 (-21.43%)
Mutual labels:  apache-spark, pyspark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+489.29%)
Mutual labels:  apache-spark, pyspark
machine-learning-course
Machine Learning Course @ Santa Clara University
Stars: ✭ 17 (-39.29%)
Mutual labels:  pyspark, spark-ml
Pyspark Boilerplate
A boilerplate for writing PySpark Jobs
Stars: ✭ 318 (+1035.71%)
Mutual labels:  apache-spark, pyspark
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (+21.43%)
Mutual labels:  pyspark, spark-ml
Pyspark Stubs
Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (+250%)
Mutual labels:  apache-spark, pyspark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+10253.57%)
Mutual labels:  apache-spark, pyspark
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (+78.57%)
Mutual labels:  apache-spark, pyspark
polars
Fast multi-threaded DataFrame library in Rust | Python | Node.js
Stars: ✭ 6,368 (+22642.86%)
Mutual labels:  dataframe, dataframes
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+11882.14%)
Mutual labels:  apache-spark, pyspark
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+8892.86%)
Mutual labels:  pyspark, spark-ml
learn-by-examples
Real-world Spark pipelines examples
Stars: ✭ 84 (+200%)
Mutual labels:  apache-spark, pyspark
Awesome Spark
A curated list of awesome Apache Spark packages and resources.
Stars: ✭ 1,061 (+3689.29%)
Mutual labels:  apache-spark, pyspark
Sparkora
Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟
Stars: ✭ 51 (+82.14%)
Mutual labels:  apache-spark, pyspark
Sparkflow
Easy to use library to bring Tensorflow on Apache Spark
Stars: ✭ 282 (+907.14%)
Mutual labels:  apache-spark, dataframe
Spark Gotchas
Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks
Stars: ✭ 308 (+1000%)
Mutual labels:  apache-spark, pyspark
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+3217.86%)
Mutual labels:  apache-spark, dataframe
heidi
heidi : tidy data in Haskell
Stars: ✭ 24 (-14.29%)
Mutual labels:  dataframe, dataframes
jupyterlab-sparkmonitor
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (+178.57%)
Mutual labels:  apache-spark, pyspark
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (+96.43%)
Mutual labels:  apache-spark, pyspark
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+157.14%)
Mutual labels:  pyspark, dataframe
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+310.71%)
Mutual labels:  apache-spark, pyspark
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-50%)
Mutual labels:  apache-spark, pyspark
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (-17.86%)
Mutual labels:  apache-spark, pyspark
Albedo
A recommender system for discovering GitHub repos, built with Apache Spark
Stars: ✭ 149 (+432.14%)
Mutual labels:  apache-spark
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+782.14%)
Mutual labels:  apache-spark
Parquetviewer
Simple windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+417.86%)
Mutual labels:  apache-spark
Oryx
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Stars: ✭ 1,785 (+6275%)
Mutual labels:  apache-spark
workshop-spark
Code for Spark workshops with a Docker-based development environment
Stars: ✭ 27 (-3.57%)
Mutual labels:  pyspark
Mastering Spark Sql Book
The Internals of Spark SQL
Stars: ✭ 234 (+735.71%)
Mutual labels:  apache-spark
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+414.29%)
Mutual labels:  apache-spark
Scalable Data Science
Scalable Data Science: course sets in big data using Apache Spark on Databricks, with their mathematical, statistical and computational foundations using SageMath.
Stars: ✭ 142 (+407.14%)
Mutual labels:  apache-spark
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Stars: ✭ 231 (+725%)
Mutual labels:  apache-spark
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+400%)
Mutual labels:  apache-spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (+389.29%)
Mutual labels:  apache-spark
laravel-quasar
⏰📊✨Laravel Time Series - Provides an API to create and maintain data projections (statistics, aggregates, etc.) from your Eloquent models, and convert them to time series.
Stars: ✭ 78 (+178.57%)
Mutual labels:  aggregator
data.world-r
R library for data.world
Stars: ✭ 59 (+110.71%)
Mutual labels:  datasets
Awesome Ai Infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Stars: ✭ 223 (+696.43%)
Mutual labels:  apache-spark
Spark Tpc Ds Performance Test
Use the TPC-DS benchmark to test Spark SQL performance
Stars: ✭ 133 (+375%)
Mutual labels:  apache-spark
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+6046.43%)
Mutual labels:  apache-spark
Spark Workshop
Apache Spark™ and Scala Workshops
Stars: ✭ 224 (+700%)
Mutual labels:  apache-spark
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (+357.14%)
Mutual labels:  apache-spark
Detecting-Malicious-URL-Machine-Learning
No description or website provided.
Stars: ✭ 47 (+67.86%)
Mutual labels:  apache-spark
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (+332.14%)
Mutual labels:  apache-spark
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+6257.14%)
Mutual labels:  apache-spark
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (+275%)
Mutual labels:  apache-spark
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+667.86%)
Mutual labels:  apache-spark
Docker Spark
Apache Spark docker image
Stars: ✭ 1,396 (+4885.71%)
Mutual labels:  apache-spark
pyspark-cassandra
pyspark-cassandra is a Python port of the awesome @datastax Spark Cassandra connector. Compatible w/ Spark 2.0, 2.1, 2.2, 2.3 and 2.4
Stars: ✭ 70 (+150%)
Mutual labels:  pyspark
Scene-Text-Recognition-Recommendations
Papers, datasets, algorithms, and SOTA results for STR. Maintained long-term.
Stars: ✭ 215 (+667.86%)
Mutual labels:  datasets