Coding Now学习记录的一些笔记,以及所看得一些电子书eBooks、视频资源和平常收纳的一些自己认为比较好的博客、网站、工具。涉及大数据几大组件、Python机器学习和数据分析、Linux、操作系统、算法、网络等
Stars: ✭ 750 (+435.71%)
Hadoop For GeoeventArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-96.43%)
Azure OpenshiftRedHat Openshift Origin cluster on Azure
Stars: ✭ 17 (-87.86%)
Spark Movie LensAn on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+432.14%)
AbrisAvro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-7.14%)
Spark TdaSparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.
Stars: ✭ 45 (-67.86%)
GokaGoka is a compact yet powerful distributed stream processing library for Apache Kafka written in Go.
Stars: ✭ 1,862 (+1230%)
Dockerfiles50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+505%)
Spark FlamegraphEasy CPU Profiling for Apache Spark applications
Stars: ✭ 30 (-78.57%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-60.71%)
Reddit sse streamA Server Side Event stream to deliver Reddit comments and submissions in near real-time to a client.
Stars: ✭ 39 (-72.14%)
Delta ArchitectureStreaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Stars: ✭ 43 (-69.29%)
SnappydataProject SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
Stars: ✭ 995 (+610.71%)
ExamplesDemo applications and code examples for Confluent Platform and Apache Kafka
Stars: ✭ 571 (+307.86%)
Awesome PulsarA curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-59.29%)
Rumble⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-58.57%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+604.29%)
Utils4sscala、spark使用过程中,各种测试用例以及相关资料整理
Stars: ✭ 1,070 (+664.29%)
Spark NkpNatural Korean Processor for Apache Spark
Stars: ✭ 50 (-64.29%)
Private Aks ClusterThis sample shows how to create a private AKS cluster in a virtual network along with a jumpbox virtual machine.
Stars: ✭ 63 (-55%)
Cypher StreamNeo4j Cypher queries as Node.js object streams
Stars: ✭ 58 (-58.57%)
Kafka Connect JdbcKafka Connect connector for JDBC-compatible databases
Stars: ✭ 698 (+398.57%)
Awesome Recommendation EngineThe purpose of this tiny project is to put things together with the know how that i learned from the course big data expert from formacionhadoop.com The idea is to show how to play with apache spark streaming, kafka,mongo, spark machine learning algorithms.
Stars: ✭ 47 (-66.43%)
OblectoOblecto is a media server, which streams media you already own, and is designed to be at the heart of your entertainment experience. It runs on your home server to index and analyze your media such as Movies and TV Shows and presents them in an interface tailored for your media consupmtion needs.
Stars: ✭ 67 (-52.14%)
ThingsboardOpen-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+7418.57%)
Webrtc CliWebRTC command-line peer.
Stars: ✭ 135 (-3.57%)
AthenaxSQL-based streaming analytics platform at scale
Stars: ✭ 1,178 (+741.43%)
Apache Spark Hands OnEducational notes,Hands on problems w/ solutions for hadoop ecosystem
Stars: ✭ 74 (-47.14%)
Sec Apisec.gov EDGAR API | search & filter SEC filings | over 150 form types supported | 10-Q, 10-K, 8, 4, 13, S-11, ... | insider trading
Stars: ✭ 71 (-49.29%)
Cleanframestype-class based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-46.43%)
Fs2 KafkaKafka client for functional streams for scala (fs2)
Stars: ✭ 75 (-46.43%)
AzureAzure-related repository
Stars: ✭ 78 (-44.29%)
CamusMirror of Linkedin's Camus
Stars: ✭ 81 (-42.14%)
CuesheetA framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-38.57%)
MnemonicApache Mnemonic - A non-volatile hybrid memory storage oriented library
Stars: ✭ 91 (-35%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+855.71%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-30.71%)
Kafka Connect Mongodb**Unofficial / Community** Kafka Connect MongoDB Sink Connector - Find the official MongoDB Kafka Connector here: https://www.mongodb.com/kafka-connector
Stars: ✭ 137 (-2.14%)
StormkafkamonDumps state of Storm Kafka consumers
Stars: ✭ 99 (-29.29%)
Hls.jsHLS.js is a JavaScript library that plays HLS in browsers with support for MSE.
Stars: ✭ 10,791 (+7607.86%)
BurrowuiThis is a NodeJS/Angular 2 frontend UI for Kafka cluster monitoring with Burrow
Stars: ✭ 69 (-50.71%)
Repository个人学习知识库涉及到数据仓库建模、实时计算、大数据、Java、算法等。
Stars: ✭ 92 (-34.29%)
RecommendersBest Practices on Recommendation Systems
Stars: ✭ 11,818 (+8341.43%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-10%)