livelace / gosquito

License: MIT

gosquito

gosquito ("go" + "mosquito") is a pluggable tool for gathering data, processing it and transmitting it to various destinations. The main goal is to replace various in-house automated tasks with a single tool and to move those tasks to the edge. See the docs and examples for additional information.

graph TD;
    I1(kafka)-->P1(process plugins);
    I2(resty)-->P1;
    I3(rss)-->P1;
    I4(telegram)-->P1;
    I5(twitter)-->P1;
    P1-->O1(kafka);
    P1-->O2(mattermost);
    P1-->O3(resty);
    P1-->O4(slack);
    P1-->O5(smtp);
    P1-->O6(telegram);

Main features:

  • Pluggable architecture. Data processing is organized as chains of plugins.
  • Flow approach. A flow consists of an input plugin (receive), process plugins (filter/transform) and an output plugin (send).
  • Declarative YAML configurations with template support.
  • Data is considered new based on a configurable signature (the data timestamp by default). Forced fetching is supported.
  • Dependencies between process plugins. Plugin "B" processes data only if plugin "A" produced some data.
  • Limits on parallel execution.
  • Metrics export to Prometheus.
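The flow approach and the dependency rule between process plugins can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not gosquito's code: the types `Datum` and `ProcessPlugin`, the `Require` field, and the functions `RunFlow` and `demoChain` are hypothetical names, and the rule that a skipped step yields no data is a simplification chosen for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Datum is a hypothetical unit of data moving through a flow.
type Datum struct{ Text string }

// ProcessPlugin is a hypothetical process step. Require lists the indexes of
// earlier steps that must have produced data before this step may run.
type ProcessPlugin struct {
	Name    string
	Require []int
	Run     func(in []Datum) []Datum
}

// satisfied reports whether every required earlier step produced some data.
func satisfied(require []int, results [][]Datum, current int) bool {
	for _, r := range require {
		if r < 0 || r >= current || len(results[r]) == 0 {
			return false
		}
	}
	return true
}

// RunFlow feeds the input through the chain, each step consuming the output
// of the previous one, and returns what an output plugin would receive.
func RunFlow(input []Datum, chain []ProcessPlugin) []Datum {
	results := make([][]Datum, len(chain))
	data := input
	for i, p := range chain {
		if satisfied(p.Require, results, i) {
			data = p.Run(data)
		} else {
			data = nil // assumption: a skipped step yields nothing
		}
		results[i] = data
	}
	return data
}

// demoChain builds a two-step chain: "match" keeps texts containing "go",
// and "upper" runs only if "match" (step 0) produced data.
func demoChain() []ProcessPlugin {
	return []ProcessPlugin{
		{
			Name: "match",
			Run: func(in []Datum) []Datum {
				var out []Datum
				for _, d := range in {
					if strings.Contains(d.Text, "go") {
						out = append(out, d)
					}
				}
				return out
			},
		},
		{
			Name:    "upper",
			Require: []int{0},
			Run: func(in []Datum) []Datum {
				out := make([]Datum, 0, len(in))
				for _, d := range in {
					out = append(out, Datum{Text: strings.ToUpper(d.Text)})
				}
				return out
			},
		},
	}
}

func main() {
	input := []Datum{{Text: "go news"}, {Text: "weather"}}
	for _, d := range RunFlow(input, demoChain()) {
		fmt.Println(d.Text) // prints "GO NEWS"
	}
}
```

When the input contains nothing that the first step matches, the second step is skipped and the flow has nothing to send.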

Input plugins:

| Plugin | Description |
|---|---|
| io | Use text and files as data source. |
| kafka | Kafka topic as data source. |
| resty | REST endpoint as data source. |
| rss | RSS/Atom feed as data source. |
| telegram | Telegram chat as data source. |
| twitter | Twitter channel as data source. |
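Data arriving from a source is treated as new according to a configurable signature, with the data timestamp as the default. One way such a check could work is sketched below. Everything here is an assumption made for illustration: the `Item` fields, the field names `"title"`, `"link"` and `"time"`, the hashing scheme and the in-memory `Seen` store are hypothetical and do not describe how gosquito actually keeps its state.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// Item is a hypothetical piece of data received by an input plugin.
type Item struct {
	Title     string
	Link      string
	Timestamp int64 // Unix seconds
}

// Signature hashes the selected fields of an item. With no fields selected
// it falls back to the timestamp alone. Unknown field names are ignored.
func Signature(it Item, fields []string) string {
	if len(fields) == 0 {
		fields = []string{"time"}
	}
	parts := make([]string, 0, len(fields))
	for _, f := range fields {
		switch f {
		case "title":
			parts = append(parts, it.Title)
		case "link":
			parts = append(parts, it.Link)
		case "time":
			parts = append(parts, strconv.FormatInt(it.Timestamp, 10))
		}
	}
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:])
}

// Seen remembers the signatures that have already been observed.
type Seen struct {
	fields []string
	keys   map[string]struct{}
}

// NewSeen creates a store that builds signatures from the given fields.
func NewSeen(fields ...string) *Seen {
	return &Seen{fields: fields, keys: make(map[string]struct{})}
}

// IsNew records the item's signature and reports whether it was unseen.
// With force set, the item is always reported as new.
func (s *Seen) IsNew(it Item, force bool) bool {
	key := Signature(it, s.fields)
	_, known := s.keys[key]
	s.keys[key] = struct{}{}
	return force || !known
}

func main() {
	seen := NewSeen() // default signature: timestamp only
	item := Item{Title: "Release", Link: "https://example.com/1", Timestamp: 1700000000}
	fmt.Println(seen.IsNew(item, false)) // true: first time this signature appears
	fmt.Println(seen.IsNew(item, false)) // false: signature already recorded
	fmt.Println(seen.IsNew(item, true))  // true: forced fetch ignores the record
}
```

Choosing which fields go into the signature decides what counts as a change: with the timestamp alone, an edited title with an unchanged timestamp is not reported as new, whereas a title-based signature would report it.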

Process plugins:

| Plugin | Description |
|---|---|
| dedup | Deduplicate datums by UUID. |
| echo | Echo the data being processed. |
| expandurl | Expand short URLs. |
| fetch | Fetch remote data. |
| io | Read/write text and files. |
| jq | Extract JSON elements. |
| minio | Get/put data from/to an S3 bucket. |
| regexpfind | Find patterns in data. |
| regexpmatch | Match data by patterns. |
| regexpreplace | Replace patterns in data. |
| resty | Perform REST queries. |
| same | Match data similarity. |
| split | Split a single datum into multiple datums. |
| unique | Get unique values from data. |
| webchela | Process web pages with full browser support (Chrome, Firefox). |
| xpath | Extract HTML/XML nodes. |
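To make the pattern-oriented steps more concrete, here is a rough sketch of what finding, matching, replacing and de-duplicating values might look like using only Go's standard `regexp` package. The function names `Find`, `Match`, `Replace` and `Unique` are hypothetical and merely echo the plugin names; the plugins' real options and behaviour are described in the project docs, not here.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAll compiles every pattern or returns the first compile error.
func compileAll(patterns []string) ([]*regexp.Regexp, error) {
	res := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("compile %q: %w", p, err)
		}
		res = append(res, re)
	}
	return res, nil
}

// Find returns all matches of all patterns in text, in pattern order.
func Find(patterns []string, text string) ([]string, error) {
	res, err := compileAll(patterns)
	if err != nil {
		return nil, err
	}
	var found []string
	for _, re := range res {
		found = append(found, re.FindAllString(text, -1)...)
	}
	return found, nil
}

// Match reports whether text matches at least one of the patterns.
func Match(patterns []string, text string) (bool, error) {
	res, err := compileAll(patterns)
	if err != nil {
		return false, err
	}
	for _, re := range res {
		if re.MatchString(text) {
			return true, nil
		}
	}
	return false, nil
}

// Replace substitutes every match of pattern in text with repl.
func Replace(pattern, text, repl string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	return re.ReplaceAllString(text, repl), nil
}

// Unique keeps the first occurrence of each value and preserves order.
func Unique(values []string) []string {
	seen := make(map[string]struct{}, len(values))
	var out []string
	for _, v := range values {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	text := "Contact a@example.com, b@example.com or a@example.com"
	found, err := Find([]string{`[a-z]+@example\.com`}, text)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(Unique(found)) // prints [a@example.com b@example.com]
}
```

Chaining `Find` and `Unique` in this way mirrors the idea of placing a find step before a unique step in a process chain.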

Output plugins:

| Plugin | Description |
|---|---|
| kafka | Send data to Kafka topic. |
| mattermost | Send data to Mattermost channel/user. |
| resty | Send data to REST endpoint. |
| slack | Send data to Slack channel/user. |
| smtp | Send data as email. |
| telegram | Send data to Telegram chat. |

ToDo:

  1. prometheus: last message received timestamp.
  2. prometheus: extend metrics with invalid flows.
  3. core: cron mode.
  4. regexp/ml: storage of interests.
  5. input: match_ignore_case option for match_signature.
  6. docs: add complex examples, docker compose environments.
  7. telegram: add careful (api limits) support for download/sending unread files/messages.
  8. plugins: new (echo plugin more suitable?) file plugin for saving text to files.
  9. core/ml: auto learning interval.
  10. process: ocr plugin.
  11. process: lang detect plugin.
  12. core: file deduplication cache.
  13. process: exec plugin.
  14. core: top content ratio.
  15. core: flow schedule.
  16. kafka: warn about connection problems.
  17. core: log level for flow.
  18. core: flow enable/disable regexp support.