Hayabusa

Hayabusa: A Simple and Fast Full-Text Search Engine for Massive System Log Data

Concept

- Pure Python implementation
- Parallel SQLite processing engine
- SQLite3 FTS (Full Text Search)
- Core-scale architecture

Architecture

Design of the directory structure
- Each database file lives at a path of the form "directory path + yyyy + mm + dd + hh + min.db". By specifying the search time range as such a path, the search program can select the databases to search systematically.
```
/targetdir/yyyy/mm/dd/hh/min.db
```
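
As an illustration of this layout, the sketch below maps timestamps to database paths. The helper names `db_path` and `db_paths_between` are hypothetical, and zero-padded path components are an assumption based on the `yyyy/mm/dd/hh/min.db` pattern; this is not taken from the Hayabusa source.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import List


def db_path(target_dir: str, ts: datetime) -> Path:
    """Return the per-minute database path for a timestamp (assumed zero-padded)."""
    return Path(target_dir) / ts.strftime("%Y/%m/%d/%H") / (ts.strftime("%M") + ".db")


def db_paths_between(target_dir: str, start: datetime, end: datetime) -> List[Path]:
    """List the per-minute database paths from start to end, inclusive."""
    current = start.replace(second=0, microsecond=0)
    paths = []
    while current <= end:
        paths.append(db_path(target_dir, current))
        current += timedelta(minutes=1)
    return paths


if __name__ == "__main__":
    for p in db_paths_between(
        "/targetdir", datetime(2017, 5, 11, 13, 58), datetime(2017, 5, 11, 14, 1)
    ):
        print(p)
```

Because the time is encoded in the path, narrowing a search to a time window only requires choosing which files to open; no index lookup is needed.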

StoreEngine

sample code

import os.path
import sqlite3

db_file = 'test.db'
log_file = '1m.log'

# Create the full-text search table the first time the database is used.
if not os.path.exists(db_file):
    conn = sqlite3.connect(db_file)
    conn.execute("CREATE VIRTUAL TABLE SYSLOG USING FTS3(LOGS)")
    conn.close()

conn = sqlite3.connect(db_file)

# Insert every log line as one row and commit them in a single transaction.
with open(log_file) as fh:
    lines = [[line] for line in fh]
    conn.executemany('INSERT INTO SYSLOG VALUES ( ? )', lines)
    conn.commit()

conn.close()

SearchEngine

sample command

$ python search_engine.py -h
usage: search_engine.py [-h] [--time TIME] [--match MATCH] [-c] [-s] [-v]

optional arguments:
  -h, --help     show this help message and exit
  --time TIME    time explain regexp(YYYY/MM/DD/HH/MIN). eg: 2017/04/27/10/*
  --match MATCH  matching keyword. eg: noc or 'noc Login'
  -e             exact match
  -c             count
  -s             sum
  -v             verbose
 
$ python search_engine.py --time 2017/05/11/13/* --match 'keyword' -c
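
To make the idea behind a count query over a time pattern concrete, here is a minimal sketch. It assumes the per-minute database layout and the `SYSLOG` FTS table shown earlier. The function names `count_matches` and `search_count` are hypothetical, and a thread pool stands in for GNU Parallel, which is what the project lists as a dependency. It is not the actual `search_engine.py`.

```python
import glob
import os
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def count_matches(db_file, keyword):
    """Count rows in one database whose LOGS column matches the keyword."""
    conn = sqlite3.connect(db_file)
    try:
        row = conn.execute(
            "SELECT count(*) FROM SYSLOG WHERE LOGS MATCH ?", (keyword,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


def search_count(target_dir, time_pattern, keyword, workers=4):
    """Sum match counts over every database selected by the time pattern."""
    # A pattern such as "2017/05/11/13/*" selects all minute files of that hour.
    db_files = sorted(glob.glob(os.path.join(target_dir, time_pattern + ".db")))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda f: count_matches(f, keyword), db_files))
```

Each worker opens its own connection, so the small per-minute files can be queried independently and the partial counts simply added up.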

Architecture image

Search condition

Case-insensitive
- Uppercase and lowercase letters are not distinguished.
Exact match
```
-e --match '192.168.0.1'
```
AND
```
--match 'Hello World'
```
OR
```
--match 'Hello OR World'
```
NOT
```
--match 'Hello World -Wide'
```

PHRASE

--match '"Hello World"'
--match '\"192.168.0.1\"' <- IP address case (same as the -e flag)
--match '\"192.168.0.1\" src sent' <- PHRASE + AND search

Asterisk (*)
```
--match 'H* World'
```
HAT (^)
```
--match '^Hello World'
```
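
The following sketch shows how several of these conditions behave when passed to an SQLite FTS3 table like the `SYSLOG` table from the StoreEngine sample. The sample rows and the `search` helper are made up for illustration. The NOT form (`-Wide`) is left out because, per the SQLite documentation, the NOT syntax depends on whether SQLite was compiled with the standard or the enhanced query syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE SYSLOG USING FTS3(LOGS)")
conn.executemany(
    "INSERT INTO SYSLOG VALUES (?)",
    [
        ["Hello World"],
        ["hello wide world"],
        ["World says Hello"],
        ["Hi World"],
    ],
)


def search(match):
    """Return the log lines that satisfy an FTS MATCH expression."""
    rows = conn.execute(
        "SELECT LOGS FROM SYSLOG WHERE LOGS MATCH ? ORDER BY rowid", (match,)
    )
    return [row[0] for row in rows]


and_hits = search("Hello World")       # both terms, any order, any case
or_hits = search("Hi OR Wide")         # either term
phrase_hits = search('"Hello World"')  # adjacent terms in this order
prefix_hits = search("H* World")       # first term matched by prefix

print(and_hits)     # ['Hello World', 'hello wide world', 'World says Hello']
print(or_hits)      # ['hello wide world', 'Hi World']
print(phrase_hits)  # ['Hello World']
print(prefix_hits)  # all four rows
```

The phrase query is the only one that rejects "hello wide world", which is why quoting matters when searching for values such as IP addresses that the tokenizer splits into several terms.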

Development environment

- CentOS 7.3
- Python 3.5.1 (Anaconda packages)
- SQLite3 (version 3.9.2)

Software dependencies

- Python 3
- SQLite3
- GNU Parallel

Benchmark

Compare with Apache Spark

Hayabusa and Spark time comparison
Comparison of a distributed Spark environment and stand-alone Hayabusa

hirolovesbeer / hayabusa
