spark-extensionA library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-3.85%)
Ds CheatsheetsList of Data Science Cheatsheets to rule the world
Stars: ✭ 9,452 (+36253.85%)
Pysparkgeoanalysis🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (+142.31%)
Ai Series📚 [.md & .ipynb] Series of Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Application, etc. 💫 人工智能与深度学习实战,数理统计篇 | 机器学习篇 | 深度学习篇 | 自然语言处理篇 | 工具实践 Scikit & Tensoflow & PyTorch 篇 | 行业应用 & 课程笔记
Stars: ✭ 702 (+2600%)
incubator-linkisLinkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+9357.69%)
Sparkling TitanicTraining models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-53.85%)
Data science blogsA repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (+434.62%)
HandysparkHandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+507.69%)
GimelBig Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+730.77%)
awesome-conformal-predictionA professionally curated list of awesome Conformal Prediction videos, tutorials, books, papers, PhD and MSc theses, articles and open-source libraries.
Stars: ✭ 998 (+3738.46%)
Deeplearning4jSuite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…
Stars: ✭ 12,277 (+47119.23%)
kafka-compose🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (+23.08%)
My Awesome Ai BookmarksCurated list of my reads, implementations and core concepts of Artificial Intelligence, Deep Learning, Machine Learning by best folk in the world.
Stars: ✭ 223 (+757.69%)
Devops Python Tools80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+1461.54%)
Pyspark Example ProjectExample project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+2334.62%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+3692.31%)
OpenubaA robust, and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (+388.46%)
Data-Scientist-In-PythonThis repository contains notes and projects of Data scientist track from dataquest course work.
Stars: ✭ 23 (-11.54%)
MmlsparkSimple and Distributed Machine Learning
Stars: ✭ 2,899 (+11050%)
Spark NlpState of the Art Natural Language Processing
Stars: ✭ 2,518 (+9584.62%)
Spark PracticeApache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+669.23%)
LudwigData-centric declarative deep learning framework
Stars: ✭ 8,018 (+30738.46%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+5046.15%)
DatascienceCurated list of Python resources for data science.
Stars: ✭ 3,051 (+11634.62%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+326.92%)
LinkisLinkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+8834.62%)
ScalnetA Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs
Stars: ✭ 342 (+1215.38%)
ScriptisScriptis is for interactive data analysis with script development(SQL, Pyspark, HiveQL), task submission(Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+2576.92%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-3.85%)
SparkmagicJupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+3569.23%)
Live log analyzer sparkSpark Application for analysis of Apache Access logs and detect anamolies! Along with Medium Article.
Stars: ✭ 14 (-46.15%)
W2vWord2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (+146.15%)
Cc PysparkProcess Common Crawl data with Python and Spark
Stars: ✭ 147 (+465.38%)
Pyspark Cheatsheet🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (+315.38%)
HnswlibJava library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (+315.38%)
HorovodDistributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+45834.62%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+476.92%)
gan deeplearning4jAutomatic feature engineering using Generative Adversarial Networks using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-26.92%)
dstyet another custom data science template via cookiecutter
Stars: ✭ 59 (+126.92%)
QmapCompressionOfficial implementation of "Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform", ICCV 2021
Stars: ✭ 27 (+3.85%)
ChatbotA Deep-Learning multi-purpose chatbot made using Python3
Stars: ✭ 36 (+38.46%)
SentimentAnalysisSentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (+23.08%)
fastdata-clusterFast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-23.08%)
pyspark-cheatsheetPySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+342.31%)
deeplearning딥러닝 학습 자료를 모아두고 여러 사람이 참고하여 학습할 수 있도록 돕습니다.
Stars: ✭ 44 (+69.23%)
genstarGeneration of Synthetic Populations Library
Stars: ✭ 17 (-34.62%)
big dataA collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+30.77%)