basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
Stars: ✭ 25 (-82.99%)
Spark Movie Lens: An online movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+406.8%)
Bitcoin Value Predictor: [NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-38.1%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-26.53%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1162.59%)
Hnswlib: Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-26.53%)
Spark States: Custom state store providers for Apache Spark
Stars: ✭ 83 (-43.54%)
incubator-linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+1572.79%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+1612.93%)
Linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+1480.27%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1070.75%)
Azure Event Hubs Spark: Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-4.76%)
Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+176.19%)
Kinesis Sql: Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-18.37%)
Repo 2019: BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics
Stars: ✭ 133 (-9.52%)
Data science blogs: A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-5.44%)
Cs231n: Homework for CS231n 2017
Stars: ✭ 144 (-2.04%)
Python camp: Python code for practice
Stars: ✭ 144 (-2.04%)
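Technology Talk: A collection of knowledge on commonly used technical frameworks in the Java ecosystem, open-source middleware, system architecture, databases, architecture case studies from large companies, commonly used third-party libraries, project management, production troubleshooting, personal growth, reflections, and more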
Stars: ✭ 12,136 (+8155.78%)
Multihead Siamese Nets: Implementation of Siamese Neural Networks built upon a multihead attention mechanism for the text semantic similarity task.
Stars: ✭ 144 (-2.04%)
Unet: U-Net Biomedical Image Segmentation
Stars: ✭ 144 (-2.04%)
Alphatrading: A workflow for factor-based equity trading, including factor analysis and factor modeling. For well-established factor models, I implement the APT model, BARRA's risk model, and a dynamic multi-factor model in this project.
Stars: ✭ 144 (-2.04%)
Datacompy: Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (+0%)
Deepschool.io: Deep Learning tutorials in Jupyter notebooks.
Stars: ✭ 1,780 (+1110.88%)
Face Recognition: Face recognition and its application as an attendance system
Stars: ✭ 143 (-2.72%)
Elmo Tutorial: A short tutorial on Elmo training (pre-trained, training on new data, incremental training)
Stars: ✭ 145 (-1.36%)
Digital video introduction: A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding).
Stars: ✭ 12,184 (+8188.44%)
Chess Alpha Zero: Chess reinforcement learning by AlphaGo Zero methods.
Stars: ✭ 1,868 (+1170.75%)
Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Stars: ✭ 1,985 (+1250.34%)
Alta: The Art of Literary Text Analysis
Stars: ✭ 145 (-1.36%)
Pytorch tutorial: A set of Jupyter notebooks on PyTorch functions with examples
Stars: ✭ 142 (-3.4%)
Python4ds: Jupyter Notebooks used in my data science projects
Stars: ✭ 147 (+0%)
Dpca: An implementation of demixed Principal Component Analysis (a supervised linear dimensionality reduction technique)
Stars: ✭ 146 (-0.68%)
Deep Deep: Adaptive crawler that uses Reinforcement Learning methods
Stars: ✭ 145 (-1.36%)
Diy Alexa: Command recognition research
Stars: ✭ 143 (-2.72%)
Sqlcell: SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Python values as parameters and assign output data to Python variables while concurrently running Python code. And *much* more.
Stars: ✭ 145 (-1.36%)