464 open source projects that are alternatives to or similar to anovos

data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-35.06%)
Mutual labels:  bigdata, pyspark
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1637.66%)
Mutual labels:  bigdata, pyspark
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-55.84%)
Mutual labels:  bigdata, pyspark
Spark-and-Kafka IoT-Data-Processing-and-Analytics
Final project for the IoT: Big Data Processing and Analytics class. Analyzes U.S. nationwide temperature data from IoT sensors in real time
Stars: ✭ 42 (-45.45%)
Mutual labels:  bigdata, pyspark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1180.52%)
Mutual labels:  bigdata, pyspark
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+1654.55%)
Mutual labels:  bigdata, pyspark
AutoTS
Automated Time Series Forecasting
Stars: ✭ 665 (+763.64%)
Mutual labels:  feature-engineering
featuretoolsOnSpark
A simplified version of featuretools for Spark
Stars: ✭ 24 (-68.83%)
Mutual labels:  feature-engineering
chatnoir-resiliparse
A robust web archive analytics toolkit
Stars: ✭ 26 (-66.23%)
Mutual labels:  bigdata
PersonNotes
A personal notes hub: technical notes recorded in a quick, rough, and intense style .. 📚☕️⌨️🎧
Stars: ✭ 61 (-20.78%)
Mutual labels:  bigdata
Temps
λ A selfhostable serverless function runtime. Inspired by zeit now.
Stars: ✭ 15 (-80.52%)
Mutual labels:  scale
EvolutionaryForest
An open source python library for automated feature engineering based on Genetic Programming
Stars: ✭ 56 (-27.27%)
Mutual labels:  feature-engineering
twitter-archive-reader
Full featured TypeScript Twitter archive reader and browser
Stars: ✭ 43 (-44.16%)
Mutual labels:  bigdata
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (-28.57%)
Mutual labels:  pyspark
msda
Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and prototype of explainable AI for anomaly detector
Stars: ✭ 80 (+3.9%)
Mutual labels:  feature-engineering
kaggle-berlin
Material of the Kaggle Berlin meetup group!
Stars: ✭ 36 (-53.25%)
Mutual labels:  feature-engineering
zdh web
Big data collection and extraction platform
Stars: ✭ 292 (+279.22%)
Mutual labels:  bigdata
soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
Stars: ✭ 58 (-24.68%)
Mutual labels:  pyspark
traefik-ondemand-service
Traefik ondemand service for the traefik ondemand plugin
Stars: ✭ 35 (-54.55%)
Mutual labels:  scale
intersect
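Thoughts on an interview question: computing the intersection of 60 million data packets and 3 million data packets within a 50 MB memory limit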
Stars: ✭ 54 (-29.87%)
Mutual labels:  bigdata
StreamBench
Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark
Stars: ✭ 52 (-32.47%)
Mutual labels:  bigdata
clink
Clink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators that can be used in both C++ and Java runtime.
Stars: ✭ 24 (-68.83%)
Mutual labels:  feature-engineering
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (-6.49%)
Mutual labels:  pyspark
feng
feng - feature engineering for machine-learning champions
Stars: ✭ 27 (-64.94%)
Mutual labels:  feature-engineering
hayabusa
Hayabusa: Simple and Fast Full-Text Search Engine for Massive System Log Data
Stars: ✭ 43 (-44.16%)
Mutual labels:  bigdata
jhdf
A pure Java HDF5 library
Stars: ✭ 83 (+7.79%)
Mutual labels:  bigdata
dot
distributed data sync with operational transformation/transforms
Stars: ✭ 73 (-5.19%)
Mutual labels:  transformation
jgit-spark-connector
jgit-spark-connector is a library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.
Stars: ✭ 71 (-7.79%)
Mutual labels:  pyspark
awesome-coder-resources
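A refueling station on your programming journey! [Continuously updated... stars welcome, come back often] [Contents: programming/learning/reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]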
Stars: ✭ 54 (-29.87%)
Mutual labels:  bigdata
dt-sql-parser
SQL Parsers for BigData, built with antlr4.
Stars: ✭ 135 (+75.32%)
Mutual labels:  bigdata
einet
Uncertainty and causal emergence in complex networks
Stars: ✭ 77 (+0%)
Mutual labels:  scale
gintonic
A declarative transformation language for GraphQL 🍸
Stars: ✭ 27 (-64.94%)
Mutual labels:  transformation
stargan2
StarGAN2 for practice
Stars: ✭ 89 (+15.58%)
Mutual labels:  transformation
163-bigdate-note
bigdata note
Stars: ✭ 38 (-50.65%)
Mutual labels:  bigdata
learn-by-examples
Real-world Spark pipelines examples
Stars: ✭ 84 (+9.09%)
Mutual labels:  pyspark
flask-spark-docker
Just a boilerplate for PySpark and Flask
Stars: ✭ 32 (-58.44%)
Mutual labels:  pyspark
hedgedhttp
Hedged HTTP client which helps to reduce tail latency at scale.
Stars: ✭ 103 (+33.77%)
Mutual labels:  scale
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (+35.06%)
Mutual labels:  bigdata
Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Stars: ✭ 126 (+63.64%)
Mutual labels:  bigdata
ReinforcementLearning Sutton-Barto Solutions
Solutions and figures for problems from Reinforcement Learning: An Introduction by Sutton & Barto
Stars: ✭ 20 (-74.03%)
Mutual labels:  feature-engineering
bigquery-data-lineage
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Stars: ✭ 112 (+45.45%)
Mutual labels:  bigdata
2019 egu workshop jupyter notebooks
Short course on interactive analysis of Big Earth Data with Jupyter Notebooks
Stars: ✭ 29 (-62.34%)
Mutual labels:  bigdata
young-examples
Sample code for typical application scenarios encountered in Java learning and projects
Stars: ✭ 21 (-72.73%)
Mutual labels:  bigdata
PubMed-Best-Match
Machine-learning based pipeline relying on LambdaMART currently used in PubMed for relevance (Best Match) searches
Stars: ✭ 36 (-53.25%)
Mutual labels:  feature-engineering
ASV
[CVPR16] Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales
Stars: ✭ 26 (-66.23%)
Mutual labels:  scale
go-hx711
Golang HX711 interface using periph.io driver
Stars: ✭ 15 (-80.52%)
Mutual labels:  scale
GEAN
This toolkit deals with GEnomic sequence and genome structure ANnotation files between inbreeding lines and species.
Stars: ✭ 36 (-53.25%)
Mutual labels:  transformation
exemplary-ml-pipeline
Exemplary, annotated machine learning pipeline for any tabular data problem.
Stars: ✭ 23 (-70.13%)
Mutual labels:  feature-engineering
scale
📦 Toolkit for mapping abstract data into visual representation.
Stars: ✭ 53 (-31.17%)
Mutual labels:  scale
kafka-twitter-spark-streaming
Counting Tweets Per User in Real-Time
Stars: ✭ 38 (-50.65%)
Mutual labels:  pyspark
FIFA-2019-Analysis
A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related aspects using data analysis and data visualization
Stars: ✭ 28 (-63.64%)
Mutual labels:  feature-engineering
the-apache-ignite-book
All code samples, scripts, and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Stars: ✭ 65 (-15.58%)
Mutual labels:  bigdata
nitroml
NitroML is a modular, portable, and scalable model-quality benchmarking framework for Machine Learning and Automated Machine Learning (AutoML) pipelines.
Stars: ✭ 40 (-48.05%)
Mutual labels:  scale
skrobot
skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.
Stars: ✭ 22 (-71.43%)
Mutual labels:  feature-engineering
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-63.64%)
Mutual labels:  pyspark
Spark-MLlib-Tutorial
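A comprehensive walkthrough of the basic algorithms in Spark MLlib, the machine learning library of the Spark big data framework, with a complete set of test files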
Stars: ✭ 32 (-58.44%)
Mutual labels:  bigdata
amas
Amas is a recursive acronym for “Amas, monitor alert system”.
Stars: ✭ 77 (+0%)
Mutual labels:  bigdata
lectures-hse-spark
Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-74.03%)
Mutual labels:  bigdata
BetterDummy
Unlock your displays on your Mac! Smooth scaling, HiDPI unlock, XDR/HDR extra brightness upscale, DDC, brightness and dimming, dummy displays, PIP and lots more!
Stars: ✭ 9,601 (+12368.83%)
Mutual labels:  scale
pyspark-cassandra
pyspark-cassandra is a Python port of the awesome @datastax Spark Cassandra connector. Compatible w/ Spark 2.0, 2.1, 2.2, 2.3 and 2.4
Stars: ✭ 70 (-9.09%)
Mutual labels:  pyspark