Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+1440%)
Nlp
📝 This repository records my NLP journey.
Stars: ✭ 820 (+1390.91%)
Data Science Portfolio
Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.
Stars: ✭ 559 (+916.36%)
Bubbly
A Python package for plotting animated and interactive bubble charts using Plotly
Stars: ✭ 37 (-32.73%)
Computervision Recipes
Best Practices, code samples, and documentation for Computer Vision.
Stars: ✭ 8,214 (+14834.55%)
Streamsx.messaging
This toolkit is focused on interacting with popular messaging systems such as Kafka, JMS, XMS, and MQTT. After release v5.4.2 the complete toolkit will be deprecated. See the README.md file for hints to alternative toolkits.
Stars: ✭ 31 (-43.64%)
Machine Learning With Python
Small-scale machine learning projects to understand the core concepts. Give a star 🌟 if it helps you. BONUS: Interview Bank coming up!
Stars: ✭ 821 (+1392.73%)
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+905.45%)
Socrat
A Dynamic Web Toolbox for Interactive Data Processing, Analysis, and Visualization
Stars: ✭ 26 (-52.73%)
Tiledb
The Universal Storage Engine
Stars: ✭ 1,072 (+1849.09%)
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+878.18%)
Openscoring
REST web service for the true real-time scoring (<1 ms) of Scikit-Learn, R and Apache Spark models
Stars: ✭ 536 (+874.55%)
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+1685.45%)
Feature Selection
Feature selector based on a self-selected algorithm, loss function, and validation method
Stars: ✭ 534 (+870.91%)
Data Science Your Way
Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (+863.64%)
Interpretable machine learning with python
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Stars: ✭ 530 (+863.64%)
Datacleaner
A Python tool that automatically cleans data sets and readies them for analysis.
Stars: ✭ 933 (+1596.36%)
Lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+863.64%)
Dataconfs
A list of conferences connected with data worldwide.
Stars: ✭ 36 (-34.55%)
Moderndive book
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Stars: ✭ 527 (+858.18%)
Spark Swagger
Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-54.55%)
Dapy
Easy-to-use data analysis / manipulation framework for humans
Stars: ✭ 523 (+850.91%)
Glue
Linked Data Visualizations Across Multiple Files
Stars: ✭ 518 (+841.82%)
Saber
Window-Based Hybrid CPU/GPU Stream Processing Engine
Stars: ✭ 35 (-36.36%)
Lean Batch Launcher
Unofficial alternative launcher for QuantConnect's LEAN allowing for parallel execution and looping/batching with customizable parameters and ranges.
Stars: ✭ 30 (-45.45%)
Heamy
A set of useful tools for competitive data science.
Stars: ✭ 511 (+829.09%)
Chronicler
Scala toolchain for InfluxDB
Stars: ✭ 24 (-56.36%)
Spacy Stanza
💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
Stars: ✭ 508 (+823.64%)
Diffgram
Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning
Stars: ✭ 43 (-21.82%)
Pandera
A light-weight, flexible, and expressive pandas data validation library
Stars: ✭ 506 (+820%)
Edward
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Stars: ✭ 4,674 (+8398.18%)
Mldm
Lecture course "Machine Learning and Data Mining" at the Faculty of Computational Mathematics and Cybernetics (CMC), Lomonosov Moscow State University
Stars: ✭ 35 (-36.36%)
Gimp Plugin Bimp
BIMP. Batch Image Manipulation Plugin for GIMP.
Stars: ✭ 500 (+809.09%)
Awesome R
A curated list of awesome R packages, frameworks and software.
Stars: ✭ 4,858 (+8732.73%)
25daysinmachinelearning
I will update this repository with content and materials for learning machine learning and statistics with Python
Stars: ✭ 53 (-3.64%)
Dataframe Go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Stars: ✭ 487 (+785.45%)
Digitrecognizer
Java Convolutional Neural Network example for Handwritten Digit Recognition
Stars: ✭ 23 (-58.18%)
Dvc
🦉 Data Version Control | Git for Data & Models | ML Experiments Management
Stars: ✭ 9,004 (+16270.91%)
Machine Learning Roadmap
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Stars: ✭ 5,277 (+9494.55%)
Boltzmannclean
Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Stars: ✭ 23 (-58.18%)
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+772.73%)
Tidyverse
Easily install and load packages from the tidyverse
Stars: ✭ 1,015 (+1745.45%)
Threatpursuit Vm
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
Stars: ✭ 814 (+1380%)
Ml Template Azure
Template for getting started with automated ML Ops on Azure Machine Learning
Stars: ✭ 52 (-5.45%)
Skoot
A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn friendly interface in an effort to expedite the modeling process.
Stars: ✭ 50 (-9.09%)
Susi
SuSi: Python package for unsupervised, supervised and semi-supervised self-organizing maps (SOM)
Stars: ✭ 42 (-23.64%)
Tensorflow object counting api
🚀 The TensorFlow Object Counting API is an open source framework built on top of TensorFlow and Keras that makes it easy to develop object counting systems!
Stars: ✭ 956 (+1638.18%)
Datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Stars: ✭ 814 (+1380%)
Osint collection
Maintained collection of OSINT related resources. (All Free & Actionable)
Stars: ✭ 809 (+1370.91%)
Page clustering
A simple algorithm for clustering web pages, suitable for crawlers
Stars: ✭ 30 (-45.45%)