1595 open source projects that are alternatives to or similar to Pulsar Spark

Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+1440%)
Mutual labels:  spark
Data Analysis And Machine Learning Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
Stars: ✭ 5,166 (+9292.73%)
Mutual labels:  data-science
Nlp
📝 This repository recorded my NLP journey.
Stars: ✭ 820 (+1390.91%)
Mutual labels:  data-science
Data Science Portfolio
Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.
Stars: ✭ 559 (+916.36%)
Mutual labels:  data-science
Awesome Fraud Detection Papers
A curated list of data mining papers about fraud detection.
Stars: ✭ 843 (+1432.73%)
Mutual labels:  data-science
Bubbly
A python package for plotting animated and interactive bubble charts using Plotly
Stars: ✭ 37 (-32.73%)
Mutual labels:  data-science
Computervision Recipes
Best Practices, code samples, and documentation for Computer Vision.
Stars: ✭ 8,214 (+14834.55%)
Mutual labels:  data-science
Streamsx.messaging
This toolkit is focused on interacting with popular messaging systems such as Kafka, JMS, XMS, and MQTT. After release v5.4.2 the complete toolkit will be deprecated. See the README.md file for hints to alternative toolkits.
Stars: ✭ 31 (-43.64%)
Mutual labels:  stream-processing
Machine Learning With Python
Small-scale machine learning projects to understand the core concepts. Give a star 🌟 if it helps you. BONUS: Interview Bank coming up!
Stars: ✭ 821 (+1392.73%)
Mutual labels:  data-science
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+905.45%)
Mutual labels:  spark
Socrat
A Dynamic Web Toolbox for Interactive Data Processing, Analysis, and Visualization
Stars: ✭ 26 (-52.73%)
Mutual labels:  data-science
Probabilistic Programming And Bayesian Methods For Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Stars: ✭ 23,912 (+43376.36%)
Mutual labels:  data-science
Tiledb
The Universal Storage Engine
Stars: ✭ 1,072 (+1849.09%)
Mutual labels:  data-science
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+878.18%)
Mutual labels:  spark
Rmarkdown Website Tutorial
Tutorial for creating websites w/ R Markdown
Stars: ✭ 26 (-52.73%)
Mutual labels:  data-science
Openscoring
REST web service for the true real-time scoring (<1 ms) of Scikit-Learn, R and Apache Spark models
Stars: ✭ 536 (+874.55%)
Mutual labels:  apache-spark
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+1685.45%)
Mutual labels:  data-science
Feature Selection
Feature selector based on a self-selected algorithm, loss function, and validation method
Stars: ✭ 534 (+870.91%)
Mutual labels:  data-science
Data Science Your Way
Ways of doing Data Science Engineering and Machine Learning in R and Python
Stars: ✭ 530 (+863.64%)
Mutual labels:  data-science
10 Simple Hacks To Speed Up Your Data Analysis In Python
Some useful Tips and Tricks to speed up the data analysis process in Python.
Stars: ✭ 45 (-18.18%)
Mutual labels:  data-science
Interpretable machine learning with python
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Stars: ✭ 530 (+863.64%)
Mutual labels:  data-science
Datacleaner
A Python tool that automatically cleans data sets and readies them for analysis.
Stars: ✭ 933 (+1596.36%)
Mutual labels:  data-science
Lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+863.64%)
Mutual labels:  spark
Dataconfs
A list of conferences connected with data worldwide.
Stars: ✭ 36 (-34.55%)
Mutual labels:  data-science
Moderndive book
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Stars: ✭ 527 (+858.18%)
Mutual labels:  data-science
Spark Swagger
Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-54.55%)
Mutual labels:  spark
Dapy
Easy-to-use data analysis / manipulation framework for humans
Stars: ✭ 523 (+850.91%)
Mutual labels:  data-science
Glue
Linked Data Visualizations Across Multiple Files
Stars: ✭ 518 (+841.82%)
Mutual labels:  data-science
Kubeflow Data Science On Steroids
The blog post about Kubeflow, including all materials
Stars: ✭ 25 (-54.55%)
Mutual labels:  data-science
Saber
Window-Based Hybrid CPU/GPU Stream Processing Engine
Stars: ✭ 35 (-36.36%)
Mutual labels:  stream-processing
Lean Batch Launcher
Unofficial alternative launcher for QuantConnect's LEAN allowing for parallel execution and looping/batching with customizable parameters and ranges.
Stars: ✭ 30 (-45.45%)
Mutual labels:  batch-processing
Awesome Python Data Science
Probably the best curated list of data science software in Python.
Stars: ✭ 812 (+1376.36%)
Mutual labels:  data-science
Heamy
A set of useful tools for competitive data science.
Stars: ✭ 511 (+829.09%)
Mutual labels:  data-science
Chronicler
Scala toolchain for InfluxDB
Stars: ✭ 24 (-56.36%)
Mutual labels:  spark
Spacy Stanza
💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
Stars: ✭ 508 (+823.64%)
Mutual labels:  data-science
Diffgram
Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning
Stars: ✭ 43 (-21.82%)
Mutual labels:  data-science
Pandera
A light-weight, flexible, and expressive pandas data validation library
Stars: ✭ 506 (+820%)
Mutual labels:  data-processing
Redis Stream Demo
Demo for Redis Streams
Stars: ✭ 24 (-56.36%)
Mutual labels:  stream-processing
Edward
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Stars: ✭ 4,674 (+8398.18%)
Mutual labels:  data-science
Mldm
Lecture course "Machine Learning and Data Mining" at the Faculty of Computational Mathematics and Cybernetics (CMC), Lomonosov Moscow State University
Stars: ✭ 35 (-36.36%)
Mutual labels:  data-science
Gimp Plugin Bimp
BIMP. Batch Image Manipulation Plugin for GIMP.
Stars: ✭ 500 (+809.09%)
Mutual labels:  batch-processing
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-58.18%)
Mutual labels:  spark
Awesome R
A curated list of awesome R packages, frameworks and software.
Stars: ✭ 4,858 (+8732.73%)
Mutual labels:  data-science
25daysinmachinelearning
A repository, updated over time, for learning machine learning with Python, including statistics content and materials
Stars: ✭ 53 (-3.64%)
Mutual labels:  data-science
Dataframe Go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Stars: ✭ 487 (+785.45%)
Mutual labels:  data-science
Digitrecognizer
Java Convolutional Neural Network example for Hand Writing Digit Recognition
Stars: ✭ 23 (-58.18%)
Mutual labels:  spark
Apache Flink Docs Zh Translation
Chinese translation project for the official Apache Flink documentation
Stars: ✭ 485 (+781.82%)
Mutual labels:  flink
Dvc
🦉Data Version Control | Git for Data & Models | ML Experiments Management
Stars: ✭ 9,004 (+16270.91%)
Mutual labels:  data-science
Machine Learning Roadmap
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Stars: ✭ 5,277 (+9494.55%)
Mutual labels:  data-science
Boltzmannclean
Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Stars: ✭ 23 (-58.18%)
Mutual labels:  data-science
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+772.73%)
Mutual labels:  spark
Tidyverse
Easily install and load packages from the tidyverse
Stars: ✭ 1,015 (+1745.45%)
Mutual labels:  data-science
Threatpursuit Vm
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
Stars: ✭ 814 (+1380%)
Mutual labels:  data-science
Ml Template Azure
Template for getting started with automated ML Ops on Azure Machine Learning
Stars: ✭ 52 (-5.45%)
Mutual labels:  data-science
Skoot
A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn friendly interface in an effort to expedite the modeling process.
Stars: ✭ 50 (-9.09%)
Mutual labels:  data-science
Susi
SuSi: Python package for unsupervised, supervised and semi-supervised self-organizing maps (SOM)
Stars: ✭ 42 (-23.64%)
Mutual labels:  data-science
Tensorflow object counting api
🚀 The TensorFlow Object Counting API is an open source framework built on top of TensorFlow and Keras that makes it easy to develop object counting systems!
Stars: ✭ 956 (+1638.18%)
Mutual labels:  data-science
Datastream.io
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Stars: ✭ 814 (+1380%)
Mutual labels:  data-science
Osint collection
Maintained collection of OSINT related resources. (All Free & Actionable)
Stars: ✭ 809 (+1370.91%)
Mutual labels:  data-science
Page clustering
A simple algorithm for clustering web pages, suitable for crawlers
Stars: ✭ 30 (-45.45%)
Mutual labels:  data-science