Spark Tda
SparkTDA is a package for Apache Spark providing Topological Data Analysis functionalities.
Stars: ✭ 45 (-97.52%)
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-83.59%)
Android Nosql
Lightweight, simple structured NoSQL database for Android
Stars: ✭ 284 (-84.36%)
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-26.32%)
Korio
Korio: Kotlin cORoutines I/O: Virtual File System + Async/Sync Streams + Async TCP Client/Server + WebSockets for Multiplatform Kotlin 1.3
Stars: ✭ 282 (-84.47%)
Gatk
Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (-44.82%)
Hbase Rdd
Spark RDD to read, write and delete from HBase
Stars: ✭ 277 (-84.75%)
Spark Authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (-92.24%)
Helk
The Hunting ELK
Stars: ✭ 3,097 (+70.54%)
Nagios Plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (-44.93%)
Cqlkit
CLI tool to export Cassandra query results in CSV and JSON formats.
Stars: ✭ 94 (-94.82%)
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-85.74%)
Big Data Rosetta Code
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (-86.01%)
Kinesis Sql
Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-93.39%)
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-45.7%)
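Book
This project collects some good books I have read or heard of over the years. I came across them while tidying up my files and found that deleting them would be a pity while keeping them wasted space, so in the spirit of sharing I am making them available, both to provide these books to readers who need them and as a kind of knowledge-base accumulation.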
Stars: ✭ 47 (-97.41%)
spark-http-stream
spark structured streaming via HTTP communication
Stars: ✭ 17 (-99.06%)
Opaque
An encrypted data analytics platform
Stars: ✭ 129 (-92.9%)
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (-74.78%)
marina
High-Performance Erlang Cassandra CQL Client
Stars: ✭ 50 (-97.25%)
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-94.05%)
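Bdp Dataplatform
Data platform for a big data ecosystem solution: a big data solution built on big data, data platform, microservices, machine learning, e-commerce mall, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.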
Stars: ✭ 456 (-74.89%)
spark-data-sources
Developing Spark External Data Sources using the V2 API
Stars: ✭ 36 (-98.02%)
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-99.12%)
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (-10.24%)
Covid19Tracker
A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-96.42%)
Spark Flamegraph
Easy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (-98.35%)
diesel
No description or website provided.
Stars: ✭ 30 (-98.35%)
blog
blog entries
Stars: ✭ 39 (-97.85%)
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-98.4%)
Sparkling Graph
SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-92.35%)
spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-98.62%)
Heracles
High performance HBase / Spark SQL engine
Stars: ✭ 27 (-98.51%)
Casper
A compiler for automatically re-targeting sequential Java code to Apache Spark.
Stars: ✭ 45 (-97.52%)
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-95.15%)
smolder
HL7 Apache Spark Datasource
Stars: ✭ 33 (-98.18%)
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+1641.08%)
spark-demos
Collection of different demo applications using Apache Spark
Stars: ✭ 15 (-99.17%)
Dcos Cassandra Service
DEPRECATED: Open source Apache Cassandra running on DC/OS is now replaced by mesosphere/dcos-commons/frameworks/cassandra. This repository will be deleted at the end of 2017.
Stars: ✭ 116 (-93.61%)
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+1114.1%)
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-96.53%)
Bigdataie
A collection of big data blogs, written-test questions, tutorials, projects, and interview experiences
Stars: ✭ 445 (-75.5%)
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+230.84%)
Flint
A Time Series Library for Apache Spark
Stars: ✭ 878 (-51.65%)
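Technology Talk
A compilation of knowledge on commonly used technical frameworks in the Java ecosystem, open-source middleware, system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, production troubleshooting, personal growth, and reflections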
Stars: ✭ 12,136 (+568.28%)
Nd4j
Fast, Scientific and Numerical Computing for the JVM (NDArrays)
Stars: ✭ 1,742 (-4.07%)
Data science blogs
A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-92.35%)
Horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+557.65%)