flink-connector-kudu: A flink-connector-kudu based on the Apache Bahir Kudu connector; supports Flink 1.11.x DynamicTableSource/Sink, Range partitioning, and more
Stars: ✭ 40 (-34.43%)
gr-eventstream: gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and either inserting them into, or extracting them from, normal data streams. It allows the definition of high-speed, time-synchronous C++ burst event handlers, as well as easy bridging to standard GNU Radio Async PDU messages with precise timing.
Stars: ✭ 38 (-37.7%)
hadoopoffice: HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-8.2%)
tf-idf-python: Term frequency–inverse document frequency for Chinese novels/documents, implemented in Python.
Stars: ✭ 98 (+60.66%)
Recommender-Systems: Implementing content-based and collaborative filtering (with KNN, matrix factorization, and neural networks) in Python
Stars: ✭ 46 (-24.59%)
dockerfiles: Multi Docker container images for the main Big Data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...)
Stars: ✭ 29 (-52.46%)
extract-colors-py: Extract colors from an image. Colors are grouped based on visual similarities using the CIE76 formula.
Stars: ✭ 48 (-21.31%)
pygrams: Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
Stars: ✭ 52 (-14.75%)
TextAudit: Implementation approach and demo of a text moderation module for a short-video app
Stars: ✭ 63 (+3.28%)
logparser: Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+127.87%)
Insider-Trading: This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
Stars: ✭ 43 (-29.51%)
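dpkb: A collection of big data material, covering distributed storage engines, distributed computing engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse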
Stars: ✭ 123 (+101.64%)
SentimentAnalysis: (BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on Tensorflow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset
Stars: ✭ 40 (-34.43%)
clusterix: Visual exploration of clustered data.
Stars: ✭ 44 (-27.87%)
ResumeRise: An NLP tool that classifies and summarizes resumes
Stars: ✭ 29 (-52.46%)
bns-short-text-similarity: 📖 Use Bi-normal Separation to find document vectors, which are used to compute similarity for shorter sentences.
Stars: ✭ 24 (-60.66%)
Websockets-Vertx-Flink-Kafka: A simple request-response cycle using WebSockets, an Eclipse Vert.x server, Apache Kafka, and Apache Flink.
Stars: ✭ 14 (-77.05%)
html2data: Library and CLI for extracting data from HTML via CSS selectors
Stars: ✭ 62 (+1.64%)
Lidea: A real-time monitoring platform for large-scale distributed systems
Stars: ✭ 28 (-54.1%)
devsearch: A web search engine built with Python which uses TF-IDF and PageRank to sort search results.
Stars: ✭ 52 (-14.75%)
emma: A quotation-based Scala DSL for scalable data analysis.
Stars: ✭ 61 (+0%)
dlink: Dinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks, including OLAP and Data Lake.
Stars: ✭ 1,535 (+2416.39%)
coolplayflink: Flink: Stateful Computations over Data Streams
Stars: ✭ 14 (-77.05%)
SANSA-Stack: Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
Stars: ✭ 130 (+113.11%)
TiBigData: TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+214.75%)
review-notes: Shared team study and retrospective notes. Java, Scala, Flink...
Stars: ✭ 27 (-55.74%)
flink-deployer: A tool that helps automate deployment to an Apache Flink cluster
Stars: ✭ 143 (+134.43%)
Content-based-Recommender-System: A content-based recommender system that uses TF-IDF and cosine similarity to find the N most similar items in a dataset
Stars: ✭ 64 (+4.92%)
flink-client: Java library for managing Apache Flink via the Monitoring REST API
Stars: ✭ 48 (-21.31%)
piglet: A compiler for Pig Latin to Spark and Flink.
Stars: ✭ 23 (-62.3%)
fdp-modelserver: An umbrella project for multiple implementations of model serving
Stars: ✭ 47 (-22.95%)
Keyword-Extracter: Problem statement: given a particular PDF/text document, how to extract keywords and arrange them in order of their weightage using Python?
Stars: ✭ 17 (-72.13%)
KeywordExtraction: Implementation of keyword extraction algorithms, including TextRank, TF-IDF, and a combination of both
Stars: ✭ 95 (+55.74%)
flink-learn: Learning Flink: Flink CEP, Flink Core, Flink SQL
Stars: ✭ 70 (+14.75%)
ArkSavegameToolkitNet: Library for reading ARK Survival Evolved savegame files using C#.
Stars: ✭ 19 (-68.85%)
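FlinkTutorial: FlinkTutorial focuses on Flink stream processing for big data. It covers getting-started basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Written in Java, with some core code in Scala. Feel free to follow my blog and GitHub.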
Stars: ✭ 46 (-24.59%)
soan: Social analysis based on WhatsApp data
Stars: ✭ 106 (+73.77%)
cassandra.realtime: Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink
Stars: ✭ 25 (-59.02%)
fb-post-screenshot: Firefox Web Extension to save Facebook posts as images
Stars: ✭ 18 (-70.49%)
Nepali-News-Classifier: Text classification of Nepali-language documents. This mini project was done in partial fulfillment of the NLP course COMP 473.
Stars: ✭ 13 (-78.69%)