
Keep-Current / Data Engineering

Licence: MIT
Wraps the database by exposing a REST API for storing and retrieving document info and recommendations

Programming Languages

python

Projects that are alternatives of or similar to Data Engineering

Gravity
A Data Replication Center
Stars: ✭ 635 (+6955.56%)
Mutual labels:  storage
Ipfs Deploy
Zero-Config CLI to Deploy Static Websites to IPFS
Stars: ✭ 740 (+8122.22%)
Mutual labels:  storage
Spec
Container Storage Interface (CSI) Specification.
Stars: ✭ 799 (+8777.78%)
Mutual labels:  storage
Ffdl
Fabric for Deep Learning (FfDL, pronounced fiddle) is a Deep Learning Platform offering TensorFlow, Caffe, PyTorch etc. as a Service on Kubernetes
Stars: ✭ 640 (+7011.11%)
Mutual labels:  storage
Conf
Simple config handling for your app or module
Stars: ✭ 707 (+7755.56%)
Mutual labels:  storage
Libaums
Open source library to access USB Mass Storage devices on Android without rooting your device
Stars: ✭ 769 (+8444.44%)
Mutual labels:  storage
Nodejs Storage
Node.js client for Google Cloud Storage: unified object storage for developers and enterprises, from live data serving to data analytics/ML to data archiving.
Stars: ✭ 605 (+6622.22%)
Mutual labels:  storage
Peergos
A p2p, secure file storage, social network and application protocol
Stars: ✭ 895 (+9844.44%)
Mutual labels:  storage
Defaults
Swifty and modern UserDefaults
Stars: ✭ 734 (+8055.56%)
Mutual labels:  storage
Dexie.js
A Minimalistic Wrapper for IndexedDB
Stars: ✭ 7,337 (+81422.22%)
Mutual labels:  storage
Redux Storage
Persistence layer for redux with flexible backends
Stars: ✭ 681 (+7466.67%)
Mutual labels:  storage
Minio
High Performance, Kubernetes Native Object Storage
Stars: ✭ 30,698 (+340988.89%)
Mutual labels:  storage
Api
SODA API is an open source implementation of SODA API Standards for Data and Storage Management.
Stars: ✭ 795 (+8733.33%)
Mutual labels:  storage
Sirix
SirixDB is a temporal, evolutionary database system, which uses an accumulate only approach. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach called sliding snapshot.
Stars: ✭ 638 (+6988.89%)
Mutual labels:  storage
Gluster Kubernetes
GlusterFS Native Storage Service for Kubernetes
Stars: ✭ 822 (+9033.33%)
Mutual labels:  storage
Weibo Picture Store
🖼 Sina Weibo image-hosting extension for Chrome/Firefox, with support for syncing to Weibo micro-albums
Stars: ✭ 624 (+6833.33%)
Mutual labels:  storage
Jsftp
Light and complete FTP client implementation for Node.js
Stars: ✭ 766 (+8411.11%)
Mutual labels:  storage
Openebs
Leading Open Source Container Attached Storage, built using Cloud Native Architecture, simplifies running Stateful Applications on Kubernetes.
Stars: ✭ 7,277 (+80755.56%)
Mutual labels:  storage
Sheepdog
Distributed Storage System for QEMU
Stars: ✭ 896 (+9855.56%)
Mutual labels:  storage
Kingbus
A distributed MySQL binlog storage system built on Raft
Stars: ✭ 798 (+8766.67%)
Mutual labels:  storage

Keep-Current-Storage - Data Engineering

This module handles the database and the storage of document info, users, the relations between the two, and the recommendations.
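As a rough illustration of the data this module manages, the following is a minimal sketch of a possible schema. It assumes SQLite as a stand-in database, and every table, column, and function name is hypothetical; the project's actual database and model may differ.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the project's actual data model.
SCHEMA = """
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL, url TEXT UNIQUE);
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
CREATE TABLE recommendations (
    user_id INTEGER NOT NULL REFERENCES users(id),
    document_id INTEGER NOT NULL REFERENCES documents(id),
    score REAL NOT NULL,
    PRIMARY KEY (user_id, document_id)
);
"""


def open_store():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, title, url):
    """Store a document and return its generated id."""
    cur = conn.execute(
        "INSERT INTO documents (title, url) VALUES (?, ?)", (title, url)
    )
    return cur.lastrowid


def add_user(conn, email):
    """Store a user and return its generated id."""
    cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    return cur.lastrowid


def recommend(conn, user_id, document_id, score):
    """Link a user to a document with a relevance score (upsert)."""
    conn.execute(
        "INSERT OR REPLACE INTO recommendations (user_id, document_id, score) "
        "VALUES (?, ?, ?)",
        (user_id, document_id, score),
    )


def recommendations_for(conn, user_id):
    """Return (title, score) pairs for a user, best score first."""
    rows = conn.execute(
        "SELECT d.title, r.score FROM recommendations r "
        "JOIN documents d ON d.id = r.document_id "
        "WHERE r.user_id = ? ORDER BY r.score DESC",
        (user_id,),
    )
    return rows.fetchall()
```

In this sketch the recommendations table doubles as the relation between users and documents; a real deployment would likely track further relations, such as user feedback.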


After studying a topic, keeping current with the news, published papers, new technologies and the like proves to be hard work. One must attend conventions, subscribe to various websites and newsletters, and go through emails and alerts, all while filtering the relevant data out of these sources.

In this project, we aspire to create a platform for students, researchers, professionals and enthusiasts to discover news on relevant topics. Users are encouraged to give continuous feedback on the suggestions, so that future results can be adapted and personalized.

The goal is to create an automated system that scans the web through a list of trusted sources, classifies and categorizes the documents it finds, and matches them to the different users according to their interests. It then presents them to the user as a timely, summarized digest, either by email or within a site.

Who are we?

This project is intended as a shared work of the Vienna Data Science Cafe Meet-Up members. Beyond the obvious result, it is also meant to serve as a learning platform and to advance the Natural Language Processing / Machine Learning field by exploring, comparing and hacking different models.

Please feel free to contribute.

The project board is on Trello, and we use Slack as our communication channel. If you're new, you can join using this link.

I want to help

We welcome anyone who would like to join and contribute. We meet every month in Vienna at the Data Science Cafe meetup of the VDSG, where we show our progress and discuss the next steps.

Data Engineering

This component exposes an API that lets the other components save and retrieve the data they need in a secure way.
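To make the idea concrete, here is a minimal sketch of how such an API could dispatch save and retrieve requests behind a token check. The `/documents` route, the shared-token scheme, and all names are assumptions made for illustration, and an in-memory dictionary stands in for the database; this is not the project's actual interface.

```python
import hmac
import json

API_TOKEN = "change-me"  # hypothetical shared secret, for illustration only
_documents = {}  # in-memory stand-in for the real database


def _error(status, message):
    """Build an error response as (status_code, json_text)."""
    return status, json.dumps({"error": message})


def handle(method, path, token, body=None):
    """Dispatch a REST-style request and return (status_code, json_text)."""
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(token.encode(), API_TOKEN.encode()):
        return _error(401, "unauthorized")

    parts = path.strip("/").split("/")
    if parts[0] != "documents" or len(parts) > 2:
        return _error(404, "not found")

    # POST /documents stores a new document and returns its id.
    if method == "POST" and len(parts) == 1:
        doc_id = str(len(_documents) + 1)
        _documents[doc_id] = json.loads(body)
        return 201, json.dumps({"id": doc_id})

    # GET /documents/<id> retrieves a stored document.
    if method == "GET" and len(parts) == 2:
        doc = _documents.get(parts[1])
        if doc is None:
            return _error(404, "not found")
        return 200, json.dumps(doc)

    return _error(405, "method not allowed")
```

A caller would store a document with `handle("POST", "/documents", token, '{"title": "..."}')` and read it back with a `GET` on the returned id. A real service would sit behind an HTTP server and would likely use per-client credentials rather than a single shared token.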

The repository

This repository is for data engineering. If you wish to help with other aspects (machine learning / web development / web crawling / DevOps), we have divided the project into several additional repositories focusing on these topics:

  • The machine-learning engine can be found in our Main repository
  • Web Development & UI/UX experiments can be found in our App repository
  • Website crawling and spider tasks are concentrated in our Web Crawler repository
  • DevOps tasks are spread across the whole project. We are trying to develop this project in a serverless architecture, and are currently looking into Docker and Kubernetes as well as different hosting providers and plans. Feel free to join the discussion and provide your input!