1035 open source projects that are alternatives to or similar to Sparkrdma

Fili
Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL Databases.
Stars: ✭ 151 (-29.77%)
Mutual labels:  big-data
Hazelcast Cpp Client
Hazelcast IMDG C++ Client
Stars: ✭ 67 (-68.84%)
Mutual labels:  big-data
Mobydq
🐳 Tool to automate data quality checks on data pipelines
Stars: ✭ 123 (-42.79%)
Mutual labels:  big-data
Flink Shaded
Apache Flink shaded artifacts repository
Stars: ✭ 67 (-68.84%)
Mutual labels:  big-data
Dvid
Distributed, Versioned, Image-oriented Dataservice
Stars: ✭ 174 (-19.07%)
Mutual labels:  big-data
Spark Infotheoretic Feature Selection
This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (-42.79%)
Mutual labels:  spark
Spark Bigquery
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-69.77%)
Mutual labels:  spark
Spark Tsne
Distributed t-SNE via Apache Spark
Stars: ✭ 151 (-29.77%)
Mutual labels:  spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-70.23%)
Mutual labels:  spark
Dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-43.26%)
Mutual labels:  hadoop
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-70.7%)
Mutual labels:  spark
Facebook Hive Udfs
Facebook's Hive UDFs
Stars: ✭ 213 (-0.93%)
Mutual labels:  hadoop
Roffildlibrary
Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-70.7%)
Mutual labels:  spark
Benchm Ml
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
Stars: ✭ 1,835 (+753.49%)
Mutual labels:  spark
Warp
Convert and analyze large data sets at light speed, on Mac and iOS.
Stars: ✭ 62 (-71.16%)
Mutual labels:  big-data
Deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Stars: ✭ 2,020 (+839.53%)
Mutual labels:  spark
Nabhash
An extremely fast Non-crypto-safe AES Based Hash algorithm for Big Data
Stars: ✭ 62 (-71.16%)
Mutual labels:  big-data
Zparkio
Boilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (-43.72%)
Mutual labels:  spark
Silex
something to help you spark
Stars: ✭ 61 (-71.63%)
Mutual labels:  spark
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-72.09%)
Mutual labels:  spark
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+1071.16%)
Mutual labels:  spark
Hudi
Upserts, Deletes And Incremental Processing on Big Data.
Stars: ✭ 2,586 (+1102.79%)
Mutual labels:  bigdata
Eat pyspark in 10 days
pyspark 🍒🥭 is delicious, just eat it! 😋😋
Stars: ✭ 116 (-46.05%)
Mutual labels:  spark
Zemberek Nlp Server
REST Docker server built on the Zemberek Turkish NLP Java library
Stars: ✭ 60 (-72.09%)
Mutual labels:  spark
Verticapy
VerticaPy is a Python library that exposes scikit-like functionality to conduct data science projects on data stored in Vertica, thus taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.
Stars: ✭ 59 (-72.56%)
Mutual labels:  big-data
Example Spark Kafka
Apache Spark and Apache Kafka integration example
Stars: ✭ 120 (-44.19%)
Mutual labels:  spark
Likelike
An implementation of locality sensitive hashing with Hadoop
Stars: ✭ 58 (-73.02%)
Mutual labels:  hadoop
Attic Lens
Mirror of Apache Lens
Stars: ✭ 58 (-73.02%)
Mutual labels:  big-data
Aztk
AZTK powered by Azure Batch: On-demand, Dockerized, Spark Jobs on Azure
Stars: ✭ 152 (-29.3%)
Mutual labels:  spark
Sigmf
The Signal Metadata Format Specification
Stars: ✭ 120 (-44.19%)
Mutual labels:  big-data
Ymcache
YMCache is a lightweight object caching solution for iOS and Mac OS X that is designed for highly parallel access scenarios.
Stars: ✭ 58 (-73.02%)
Mutual labels:  big-data
Pyspark Examples
Code examples on Apache Spark using python
Stars: ✭ 58 (-73.02%)
Mutual labels:  spark
Teddy
Spark Streaming monitoring platform with support for job deployment, alerting, and automatic startup
Stars: ✭ 120 (-44.19%)
Mutual labels:  spark
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-73.02%)
Mutual labels:  spark
Data Science Live Book
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-10.23%)
Mutual labels:  big-data
Keyvi
Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 171 (-20.47%)
Mutual labels:  big-data
Kinesis Sql
Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-44.19%)
Mutual labels:  spark
Model Serving Tutorial
Code and presentation for Strata Model Serving tutorial
Stars: ✭ 57 (-73.49%)
Mutual labels:  spark
Athenacli
AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.
Stars: ✭ 151 (-29.77%)
Mutual labels:  bigdata
Elassandra
Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+648.84%)
Mutual labels:  spark
Net.jgp.labs.spark
Apache Spark examples exclusively in Java
Stars: ✭ 55 (-74.42%)
Mutual labels:  spark
Sparkit Learn
PySpark + Scikit-learn = Sparkit-learn
Stars: ✭ 1,073 (+399.07%)
Mutual labels:  apache-spark
Kibble 1
Apache Kibble - a tool to collect, aggregate and visualize data about any software project
Stars: ✭ 54 (-74.88%)
Mutual labels:  big-data
Attic Predictionio
PredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,522 (+5724.19%)
Mutual labels:  big-data
Albedo
A recommender system for discovering GitHub repos, built with Apache Spark
Stars: ✭ 149 (-30.7%)
Mutual labels:  apache-spark
Lifion Kinesis
A native Node.js producer and consumer library for Amazon Kinesis Data Streams
Stars: ✭ 54 (-74.88%)
Mutual labels:  big-data
Utils4s
A collection of test cases and related materials gathered while using Scala and Spark
Stars: ✭ 1,070 (+397.67%)
Mutual labels:  spark
Spark Submit Ui
A Play Framework-based app for submitting Spark applications
Stars: ✭ 53 (-75.35%)
Mutual labels:  spark
Cube.js
📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+5473.49%)
Mutual labels:  spark
Macro ml
Course Website on Macroeconomic Analysis with Machine Learning and Big Data
Stars: ✭ 53 (-75.35%)
Mutual labels:  big-data
Oodt
Mirror of Apache OODT
Stars: ✭ 52 (-75.81%)
Mutual labels:  big-data
Cc Pyspark
Process Common Crawl data with Python and Spark
Stars: ✭ 147 (-31.63%)
Mutual labels:  spark
Cmak
CMAK is a tool for managing Apache Kafka clusters
Stars: ✭ 10,544 (+4804.19%)
Mutual labels:  big-data
Awesome Spark
A curated list of awesome Apache Spark packages and resources.
Stars: ✭ 1,061 (+393.49%)
Mutual labels:  apache-spark
Datumbox Framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (+394.42%)
Mutual labels:  big-data
Datax
DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, and SQL Server
Stars: ✭ 116 (-46.05%)
Mutual labels:  hadoop
Play Spark Scala
Stars: ✭ 51 (-76.28%)
Mutual labels:  spark
Attic Predictionio Sdk Ruby
PredictionIO Ruby SDK
Stars: ✭ 192 (-10.7%)
Mutual labels:  big-data
Avro
Apache Avro is a data serialization system.
Stars: ✭ 2,005 (+832.56%)
Mutual labels:  bigdata
Amazon S3 Find And Forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Stars: ✭ 115 (-46.51%)
Mutual labels:  big-data