Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-69.13%)
Gimel: Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-83.86%)
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+127.5%)
Sparkrdma: RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-83.93%)
Seaborn Tutorial: This repository is my attempt to help Data Science aspirants gain the Data Visualization skills required to progress in their careers. It includes all the plot types offered by Seaborn, applied to random datasets.
Stars: ✭ 114 (-91.48%)
Pandas Videos: Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+28.25%)
Show ast: An IPython notebook plugin for visualizing ASTs.
Stars: ✭ 76 (-94.32%)
Griffon Vm: Griffon Data Science Virtual Machine
Stars: ✭ 128 (-90.43%)
Mmlspark: Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+116.67%)
Handyspark: HandySpark - bringing pandas-like capabilities to Spark dataframes
Stars: ✭ 158 (-88.19%)
Data Science Resources: 👨🏽‍🏫 Learn what data science is and why it's important in today's world. Are you interested in data science? 🔋
Stars: ✭ 171 (-87.22%)
Data Science Your Way: Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (-60.39%)
Amazing Feature Engineering: Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-83.71%)
Spark Practice: Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (-85.05%)
Ipyexperiments: jupyter/ipython experiment containers for GPU and general RAM re-use
Stars: ✭ 128 (-90.43%)
Pachyderm: Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+296.49%)
Ipywebrtc: WebRTC for Jupyter notebook/lab
Stars: ✭ 171 (-87.22%)
Pysparkgeoanalysis: 🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-95.29%)
Signals And Systems Lecture: Continuous- and Discrete-Time Signals and Systems - Theory and Computational Examples
Stars: ✭ 166 (-87.59%)
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-91.7%)
Cortx: CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.
Stars: ✭ 426 (-68.16%)
Tutorials: CatBoost tutorials repository
Stars: ✭ 563 (-57.92%)
Notebooks: A collection of Jupyter/IPython notebooks
Stars: ✭ 78 (-94.17%)
Datacamp: 🍧 A repository that contains courses I have taken on DataCamp
Stars: ✭ 69 (-94.84%)
Zat: Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (-77.35%)
Nbstripout: strip output from Jupyter and IPython notebooks
Stars: ✭ 738 (-44.84%)
Vscodejupyter: Jupyter for Visual Studio Code
Stars: ✭ 337 (-74.81%)
Spark Notebook: Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+130.27%)
Hyperlearn: 50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster
Stars: ✭ 1,204 (-10.01%)
Data Science Ipython Notebooks: Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+1547.83%)
Jupyter pivottablejs: Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
Stars: ✭ 428 (-68.01%)
Jupyterlab Lsp: Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol
Stars: ✭ 796 (-40.51%)
Jupytemplate: Templates for Jupyter notebooks
Stars: ✭ 85 (-93.65%)
Skdata: Python tools for data analysis
Stars: ✭ 16 (-98.8%)
Pyspark Setup Demo: Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-98.21%)
Dataflowjavasdk: Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (-36.17%)
Ipython Dashboard: A standalone, lightweight web server for building and sharing graphs created in IPython. Built for data science and data analysis users. Aims to provide interactive visualization, collaborative dashboards, and real-time streaming graphs.
Stars: ✭ 664 (-50.37%)
Pandas Profiling: Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+522.5%)
Ansible Jupyterhub: Ansible role to set up a JupyterHub server (deprecated)
Stars: ✭ 14 (-98.95%)
Lambdaschooldatascience: Completed assignments and coding challenges from the Lambda School Data Science program.
Stars: ✭ 22 (-98.36%)
Resources: PyMC3 educational resources
Stars: ✭ 930 (-30.49%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (-52.69%)
Data Science On Gcp: Source code accompanying the book Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (-35.43%)
Bitcoin Value Predictor: [NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-93.2%)
Drugs Recommendation Using Reviews: Analyzing drug descriptions, conditions, and reviews, then recommending drugs for each patient health condition using deep learning models.
Stars: ✭ 35 (-97.38%)
Vagrant Projects: Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-97.46%)