
syedhassaanahmed / databricks-notebooks

License: MIT
Collection of Databricks and Jupyter Notebooks

Programming Languages

Jupyter Notebook
11667 projects
HTML
75241 projects
python
139335 projects - #7 most used programming language
scala
5932 projects

Projects that are alternatives of or similar to databricks-notebooks

Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+2036.84%)
Mutual labels:  pyspark, parquet
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+4921.05%)
Mutual labels:  pandas-dataframe, pyspark
Petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Stars: ✭ 1,108 (+5731.58%)
Mutual labels:  pyspark, parquet
Skale
High performance distributed data processing engine
Stars: ✭ 390 (+1952.63%)
Mutual labels:  parquet, azure-storage
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+278.95%)
Mutual labels:  pyspark, graphframes
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+78.95%)
Mutual labels:  pyspark, spark-sql
data-analysis-using-python
Data Analysis Using Python: A Beginner’s Guide Featuring NYC Open Data
Stars: ✭ 81 (+326.32%)
Mutual labels:  pandas-dataframe, matplotlib
MCW-Big-data-analytics-and-visualization
MCW Big data analytics and visualization
Stars: ✭ 172 (+805.26%)
Mutual labels:  power-bi, spark-sql
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (+189.47%)
Mutual labels:  pyspark, spark-sql
CosmicClone
Cosmic Clone is a utility that can back up, clone, and restore an Azure Cosmos DB collection. It can also anonymize Cosmos documents and help hide personally identifiable data.
Stars: ✭ 113 (+494.74%)
Mutual labels:  azure-storage, cosmos-db
Spark
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.
Stars: ✭ 55 (+189.47%)
Mutual labels:  parquet, spark-sql
Azure-Certification-DP-200
Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution
Stars: ✭ 54 (+184.21%)
Mutual labels:  azure-storage, azure-databricks
Goofys
A high-performance, POSIX-ish Amazon S3 file system written in Go
Stars: ✭ 3,932 (+20594.74%)
Mutual labels:  azure-storage, azure-data-lake
Azure-Databricks-NYC-Taxi-Workshop
An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset
Stars: ✭ 71 (+273.68%)
Mutual labels:  pyspark, azure-databricks
AzureStor
R interface to Azure storage accounts
Stars: ✭ 51 (+168.42%)
Mutual labels:  azure-storage, azure-data-lake
Py
Repository to store sample python programs for python learning
Stars: ✭ 4,154 (+21763.16%)
Mutual labels:  pandas-dataframe, jupyter-notebooks
albis
Albis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (+5.26%)
Mutual labels:  parquet, spark-sql
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+105.26%)
Mutual labels:  pyspark, spark-sql
SimpleSQLite
SimpleSQLite is a Python library to simplify SQLite database operations: table creation, data insertion, and fetching data in other data formats. Simple ORM functionality for SQLite.
Stars: ✭ 116 (+510.53%)
Mutual labels:  pandas-dataframe
spark-vcf
Spark VCF data source implementation for Dataframes
Stars: ✭ 15 (-21.05%)
Mutual labels:  spark-sql

