syedhassaanahmed / databricks-notebooks

License: MIT

Collection of Databricks and Jupyter Notebooks

Programming Languages

Jupyter Notebook

11667 projects

HTML

75241 projects

Python

139335 projects - #7 most used programming language

Scala

5932 projects

Projects that are alternatives of or similar to databricks-notebooks

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+2036.84%)

Mutual labels: pyspark, parquet

Sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Stars: ✭ 954 (+4921.05%)

Mutual labels: pandas-dataframe, pyspark

Petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.

Stars: ✭ 1,108 (+5731.58%)

Mutual labels: pyspark, parquet

Skale

High performance distributed data processing engine

Stars: ✭ 390 (+1952.63%)

Mutual labels: parquet, azure-storage

pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Stars: ✭ 72 (+278.95%)

Mutual labels: pyspark, graphframes

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (+78.95%)

Mutual labels: pyspark, spark-sql

data-analysis-using-python

Data Analysis Using Python: A Beginner’s Guide Featuring NYC Open Data

Stars: ✭ 81 (+326.32%)

Mutual labels: pandas-dataframe, matplotlib

MCW-Big-data-analytics-and-visualization

MCW Big data analytics and visualization

Stars: ✭ 172 (+805.26%)

Mutual labels: power-bi, spark-sql

spark-twitter-sentiment-analysis

Sentiment Analysis of a Twitter Topic with Spark Structured Streaming

Stars: ✭ 55 (+189.47%)

Mutual labels: pyspark, spark-sql

CosmicClone

Cosmic Clone is a utility that can back up, clone, or restore an Azure Cosmos DB collection. It can also anonymize Cosmos DB documents to help hide personally identifiable data.

Stars: ✭ 113 (+494.74%)

Mutual labels: azure-storage, cosmos-db

Spark

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project contains sample programs for Spark written in Scala.

Stars: ✭ 55 (+189.47%)

Mutual labels: parquet, spark-sql

Azure-Certification-DP-200

Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution

Stars: ✭ 54 (+184.21%)

Mutual labels: azure-storage, azure-databricks

Goofys

A high-performance, POSIX-ish Amazon S3 file system written in Go

Stars: ✭ 3,932 (+20594.74%)

Mutual labels: azure-storage, azure-data-lake

Azure-Databricks-NYC-Taxi-Workshop

An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset

Stars: ✭ 71 (+273.68%)

Mutual labels: pyspark, azure-databricks

AzureStor

R interface to Azure storage accounts

Stars: ✭ 51 (+168.42%)

Mutual labels: azure-storage, azure-data-lake

Repository of sample Python programs for learning Python

Stars: ✭ 4,154 (+21763.16%)

Mutual labels: pandas-dataframe, jupyter-notebooks

albis

Albis: High-Performance File Format for Big Data Systems

Stars: ✭ 20 (+5.26%)

Mutual labels: parquet, spark-sql

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Stars: ✭ 39 (+105.26%)

Mutual labels: pyspark, spark-sql

SimpleSQLite

SimpleSQLite is a Python library that simplifies SQLite database operations: table creation, data insertion, and retrieving data in other data formats. It also provides simple ORM functionality for SQLite.

Stars: ✭ 116 (+510.53%)

Mutual labels: pandas-dataframe

spark-vcf

Spark VCF data source implementation for Dataframes

Stars: ✭ 15 (-21.05%)

Mutual labels: spark-sql
