Fast Mrmr: An improved implementation of the classical feature selection method, minimum Redundancy and Maximum Relevance (mRMR).
Stars: ✭ 67 (-96.31%)
Roffildlibrary: Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-96.53%)
Seldon Server: Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (-20.98%)
Silex: Something to help you spark
Stars: ✭ 61 (-96.64%)
Scala Samples: Pieces of Scala code that explain Scala syntax and related topics, such as what you can do with it
Stars: ✭ 125 (-93.12%)
Spark On K8s Operator: Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (-1.98%)
Zemberek Nlp Server: REST Docker server built on the Zemberek Turkish NLP Java library
Stars: ✭ 60 (-96.7%)
Pyspark Examples: Code examples on Apache Spark using Python
Stars: ✭ 58 (-96.81%)
Splash: A flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-94.22%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-93.28%)
Awesome Pulsar: A curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-96.86%)
Pulsar Spark: When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-96.97%)
Azure Event Hubs Spark: Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-92.29%)
Nodejs Driver: DataStax Node.js Driver for Apache Cassandra
Stars: ✭ 1,074 (-40.86%)
Spark Submit Ui: A Play Framework-based app for submitting Spark applications
Stars: ✭ 53 (-97.08%)
Zparkio: Boilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (-93.34%)
Almond: A Scala kernel for Jupyter
Stars: ✭ 1,354 (-25.44%)
Kontextfrei: Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-96.31%)
Awesome Recommendation Engine: A tiny project that puts together the know-how I learned from the Big Data Expert course at formacionhadoop.com. The idea is to show how to work with Apache Spark Streaming, Kafka, MongoDB, and Spark machine learning algorithms.
Stars: ✭ 47 (-97.41%)
Schemer: Schema registry for CSV, TSV, JSON, AVRO and Parquet schemas. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-94.66%)
Spark Tda: SparkTDA is a package for Apache Spark providing Topological Data Analysis functionality.
Stars: ✭ 45 (-97.52%)
Spark Py Notebooks: Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-26.32%)
Gatk: Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (-44.82%)
Spark Authorizer: A Spark SQL extension that provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (-92.24%)
Nagios Plugins: 450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (-44.93%)
Cqlkit: CLI tool to export Cassandra query results in CSV and JSON format.
Stars: ✭ 94 (-94.82%)
Kinesis Sql: Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-93.39%)
Optimus: 🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-45.7%)
Opaque: An encrypted data analytics platform
Stars: ✭ 129 (-92.9%)
Haproxy Configs: 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Stars: ✭ 106 (-94.16%)
Thingsboard: Open-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+479.63%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (-10.24%)
Spark Flamegraph: Easy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (-98.35%)
Rsparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-96.42%)
Pucket: Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-98.4%)
Sparkling Graph: SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-92.35%)
Heracles: High performance HBase / Spark SQL engine
Stars: ✭ 27 (-98.51%)
Spark Nlp Models: Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-95.15%)
Spark: Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+1641.08%)
Dcos Cassandra Service: DEPRECATED. Open source Apache Cassandra running on DC/OS is now replaced by mesosphere/dcos-commons/frameworks/cassandra. This repository will be deleted at the end of 2017.
Stars: ✭ 116 (-93.61%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-93.06%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-94.05%)
Spark Bigquery: Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-96.42%)
W2v: Word2Vec models with Twitter data using Spark.
Stars: ✭ 64 (-96.48%)
Hnswlib: Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-94.05%)
Pysparkgeoanalysis: 🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-96.53%)