996 open source projects that are alternatives to or similar to Sparkora

Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+194.12%)
Mutual labels:  apache-spark, apache, pyspark
olliePy
OlliePy is a python package which can help data scientists in exploring their data and evaluating and analysing their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.
Stars: ✭ 46 (-9.8%)
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-23.53%)
Mutual labels:  apache-spark, pyspark
Scattertext
Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+3276.47%)
Mutual labels:  exploratory-data-analysis, eda
typed-prelude
Reliable, standards-oriented software for browsers & Node.
Stars: ✭ 48 (-5.88%)
Mutual labels:  toolkit, easy-to-use
Ditching Excel For Python
Functionalities in Excel translated to Python
Stars: ✭ 172 (+237.25%)
Mutual labels:  exploratory-data-analysis, eda
Springboard-Data-Science-Immersive
No description or website provided.
Stars: ✭ 52 (+1.96%)
Mutual labels:  eda, pyspark
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+11288.24%)
Mutual labels:  exploratory-data-analysis, eda
Dataprep
DataPrep — The easiest way to prepare data in Python
Stars: ✭ 639 (+1152.94%)
Mutual labels:  exploratory-data-analysis, eda
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+117.65%)
Mutual labels:  apache-spark, pyspark
Pyspark Boilerplate
A boilerplate for writing PySpark Jobs
Stars: ✭ 318 (+523.53%)
Mutual labels:  apache-spark, pyspark
Awesome Spark
A curated list of awesome Apache Spark packages and resources.
Stars: ✭ 1,061 (+1980.39%)
Mutual labels:  apache-spark, pyspark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+5584.31%)
Mutual labels:  apache-spark, pyspark
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-1.96%)
Mutual labels:  apache-spark, pyspark
Pyspark Stubs
Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (+92.16%)
Mutual labels:  apache-spark, pyspark
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+83496.08%)
Mutual labels:  apache, data-analytics
data-analysis-using-python
Data Analysis Using Python: A Beginner’s Guide Featuring NYC Open Data
Stars: ✭ 81 (+58.82%)
Data Describe
data⎰describe: Pythonic EDA Accelerator for Data Science
Stars: ✭ 269 (+427.45%)
Mutual labels:  exploratory-data-analysis, eda
Sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
Stars: ✭ 1,851 (+3529.41%)
Mutual labels:  exploratory-data-analysis, eda
Complete Life Cycle Of A Data Science Project
Complete-Life-Cycle-of-a-Data-Science-Project
Stars: ✭ 140 (+174.51%)
Mutual labels:  exploratory-data-analysis, eda
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+6478.43%)
Mutual labels:  apache-spark, pyspark
Spark-for-data-engineers
Apache Spark for data engineers
Stars: ✭ 22 (-56.86%)
Mutual labels:  apache-spark, pyspark
Spark Gotchas
Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks
Stars: ✭ 308 (+503.92%)
Mutual labels:  apache-spark, pyspark
pyspark-asyncactions
Asynchronous actions for PySpark
Stars: ✭ 30 (-41.18%)
Mutual labels:  apache-spark, pyspark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+223.53%)
Mutual labels:  apache-spark, pyspark
Quinn
pyspark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (+325.49%)
Mutual labels:  apache-spark, pyspark
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-45.1%)
Mutual labels:  apache-spark, pyspark
100 Days Of Ml Code
A day-to-day plan for this challenge. Covers both theoretical and practical aspects
Stars: ✭ 172 (+237.25%)
Mutual labels:  exploratory-data-analysis, eda
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+125.49%)
Mutual labels:  apache-spark, pyspark
Live log analyzer spark
Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-72.55%)
Mutual labels:  apache-spark, pyspark
Inspectdf
🛠️ 📊 Tools for Exploring and Comparing Data Frames
Stars: ✭ 195 (+282.35%)
Mutual labels:  exploratory-data-analysis, eda
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (-54.9%)
Mutual labels:  apache-spark, pyspark
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+174.51%)
Mutual labels:  apache-spark, apache
jupyterlab-sparkmonitor
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (+52.94%)
Mutual labels:  apache-spark, pyspark
streamsx.kafka
Repository for integration with Apache Kafka
Stars: ✭ 13 (-74.51%)
Mutual labels:  apache-spark, toolkit
Exploratory Data Analysis Visualization Python
Data analysis and visualization with the PyData ecosystem: Pandas, Matplotlib, NumPy, and Seaborn
Stars: ✭ 78 (+52.94%)
Mutual labels:  exploratory-data-analysis, eda
Autoeda Resources
A list of software and papers related to automatic and fast Exploratory Data Analysis
Stars: ✭ 268 (+425.49%)
Mutual labels:  exploratory-data-analysis, eda
skimpy
skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.
Stars: ✭ 236 (+362.75%)
Mutual labels:  exploratory-data-analysis, eda
Hn so analysis
Is there a relationship between popularity of a given technology on Stack Overflow (SO) and Hacker News (HN)? And a few words about causality
Stars: ✭ 94 (+84.31%)
Mutual labels:  exploratory-data-analysis, eda
Pandas Profiling
Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+16231.37%)
Mutual labels:  exploratory-data-analysis, eda
leila
Library for data quality assessment and for interacting with the datos.gov.co data portal
Stars: ✭ 56 (+9.8%)
Mutual labels:  exploratory-data-analysis, eda
Handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+209.8%)
Spark States
Custom state store providers for Apache Spark
Stars: ✭ 83 (+62.75%)
Mutual labels:  apache-spark, apache
learn-by-examples
Real-world Spark pipeline examples
Stars: ✭ 84 (+64.71%)
Mutual labels:  apache-spark, pyspark
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (+7.84%)
Mutual labels:  apache-spark, pyspark
dqlab-career-track
A collection of scripts written to complete DQLab Data Analyst Career Track 📊
Stars: ✭ 53 (+3.92%)
Mutual labels:  exploratory-data-analysis
datart
Datart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+1943.14%)
Mutual labels:  data-analytics
alc-site
The web site of the ALC Beijing (Apache Local Community Beijing)
Stars: ✭ 75 (+47.06%)
Mutual labels:  apache
OSCI
Open Source Contributor Index
Stars: ✭ 107 (+109.8%)
Mutual labels:  pyspark
oshinko-s2i
This is a place to put s2i images and utilities for spark application builders for openshift
Stars: ✭ 16 (-68.63%)
Mutual labels:  pyspark
Easy-HotSpot
Easy HotSpot is a super easy WiFi hotspot user management utility for Mikrotik RouterOS based router devices. Voucher printing is available in 6 ready-made templates. Can be installed on any PHP/MySQL-enabled server, locally or on Internet web servers. Uses the PHP PEAR2 API Client by boenrobot.
Stars: ✭ 45 (-11.76%)
Mutual labels:  easy-to-use
Raunaksingh-hacktober-2020
Welcome to Hacktoberfest 2020
Stars: ✭ 7 (-86.27%)
Mutual labels:  easy-to-use
hack-cs-tools
client side (C-S) penetration toolkit
Stars: ✭ 111 (+117.65%)
Mutual labels:  toolkit
Data-Analyst-Nanodegree
This repo consists of the projects that I completed as part of Udacity's Data Analyst Nanodegree curriculum.
Stars: ✭ 13 (-74.51%)
Mutual labels:  exploratory-data-analysis
Breast-cancer-risk-prediction
Classification of Breast Cancer diagnosis Using Support Vector Machines
Stars: ✭ 143 (+180.39%)
Mutual labels:  exploratory-data-analysis
spydrnet
A flexible framework for analyzing and transforming FPGA netlists. Official repository.
Stars: ✭ 49 (-3.92%)
Mutual labels:  eda
Effortless-SPIFFS
A class designed to make reading and storing data on the ESP8266 and ESP32 effortless
Stars: ✭ 27 (-47.06%)
Mutual labels:  easy-to-use
UnityCommon
A collection of common frameworks and tools for Unity-based projects
Stars: ✭ 61 (+19.61%)
Mutual labels:  toolkit
eyy-indexer
An image and video friendly directory indexer for web directories.
Stars: ✭ 53 (+3.92%)
Mutual labels:  apache
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (+103.92%)
Mutual labels:  data-analytics