Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+366.67%)
Data Science Ipython Notebooks: Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+25242.53%)
Spark Practice: Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+129.89%)
Gimel: Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+148.28%)
spark-extension: A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-71.26%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+627.59%)
Handyspark: HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (+81.61%)
Linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+2570.11%)
incubator-linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+2726.44%)
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+27.59%)
Spark Jupyter Aws: A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+197.7%)
Sparkmagic: Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+996.55%)
Spark Py Notebooks: Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1437.93%)
Cc Pyspark: Process Common Crawl data with Python and Spark
Stars: ✭ 147 (+68.97%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+72.41%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+2794.25%)
Mmlspark: Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+3232.18%)
ODSC India 2018: My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-70.11%)
basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
Stars: ✭ 25 (-71.26%)
Seldon Server: Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+1549.43%)
Sparkling Titanic: Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-86.21%)
Live log analyzer spark: Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-83.91%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1033.33%)
kafka-compose: 🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-63.22%)
Sagemaker Spark: A Spark library for Amazon SageMaker.
Stars: ✭ 219 (+151.72%)
Artificial Intelligence Deep Learning Machine Learning Tutorials: A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI / Deep Learning / Machine Vision / NLP and industry-specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.
Stars: ✭ 2,966 (+3309.2%)
Scriptis: Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+700%)
Hnswlib: Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (+24.14%)
W2v: Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-26.44%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (+24.14%)
Dev Setup: macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+6325.29%)
Pysparkgeoanalysis: 🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-27.59%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (-17.24%)
Perun: A command-line validation tool for AWS CloudFormation that lets you conquer the cloud faster!
Stars: ✭ 82 (-5.75%)
Terraform Aws Elb: Terraform module which creates ELB resources on AWS
Stars: ✭ 85 (-2.3%)
Facial Expression Recognition: Classify each facial image into one of seven facial emotion categories using a CNN, based on https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge
Stars: ✭ 82 (-5.75%)
Aws Lambda Go Proxy: ⚡️ ☁️ Pass Lambda events to the application running on your machine | Debug real traffic locally | Forget about redeployments
Stars: ✭ 85 (-2.3%)
Mleap: MLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+1316.09%)
Lehar: Visualize data using relative ordering
Stars: ✭ 81 (-6.9%)
Ecs Pipeline: ☁️ 🐳 ⚡️ 🚀 Create environment and deployment pipelines to ECS Fargate with CodePipeline, CodeBuild and GitHub using Terraform
Stars: ✭ 85 (-2.3%)
Spark Gbtlr: Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-6.9%)
Aws Automation: AWS automation scripts and Lambda functions
Stars: ✭ 81 (-6.9%)
This Or That: This or that - Real-time atomic voting app built with AWS Amplify
Stars: ✭ 87 (+0%)
Kaggle Competitions: There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template to solve other real problems.
Stars: ✭ 86 (-1.15%)
Write With Me: Real-time Collaborative Markdown Editor
Stars: ✭ 81 (-6.9%)
Metasearch: Search aggregator for Slack, Google Docs, GitHub, and more 🔍
Stars: ✭ 81 (-6.9%)