scality / backbeat

License: Apache-2.0
Zenko Backbeat is the core engine for asynchronous replication, optimized for queuing metadata updates and dispatching work to long-running tasks in the background.

Zenko Backbeat

OVERVIEW

Backbeat is an engine with a messaging system at its heart. It is part of Zenko, Scality’s open source multi-cloud data controller. Learn more about Zenko at Zenko.io.

Backbeat is optimized for queuing metadata updates and dispatching work to long-running tasks in the background. The core engine can be extended for many use cases, which are called extensions, as listed below.

EXTENSIONS

Asynchronous Replication

This extension replicates objects from one S3 bucket to another S3 bucket in a different geographical region. It uses the local Metadata journal as the source of truth and replicates object updates in FIFO order.

DESIGN

QUICKSTART

This guide assumes the following:

  • You are using macOS
  • brew is installed (Homebrew, available at https://brew.sh)
  • node is installed (version 6.9.5)
  • yarn is installed (version 3.10.10)
  • aws is installed (version 1.11.1)

Kafka and Zookeeper

Download Kafka (which bundles ZooKeeper) and write a Backbeat-specific copy of the server configuration:

mkdir ~/kafka && \
cd ~/kafka && \
curl https://archive.apache.org/dist/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz | \
tar xvz && \
sed 's/zookeeper.connect=.*/zookeeper.connect=localhost:2181\/backbeat/' \
kafka_2.11-0.11.0.0/config/server.properties > \
kafka_2.11-0.11.0.0/config/server.properties.backbeat
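
The sed expression above rewrites only the zookeeper.connect line, appending the /backbeat chroot so that Kafka keeps its state under that ZooKeeper path. As a quick illustration, the sketch below applies the same expression to two sample lines. The sample lines are an assumption about what a stock server.properties looks like, not a copy of the shipped file:

```shell
# Two sample lines, assumed to resemble a stock server.properties.
sample='zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000'

# Same sed expression as in the install step above.
rewritten=$(printf '%s\n' "$sample" |
    sed 's/zookeeper.connect=.*/zookeeper.connect=localhost:2181\/backbeat/')

# The connect string gains the /backbeat suffix; the timeout line is untouched.
printf '%s\n' "$rewritten"
```

The second line is left alone because the pattern requires an "=" right after "connect".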

Start the zookeeper server:

zookeeper-server-start kafka_2.11-0.11.0.0/config/zookeeper.properties

In a different shell, start the Kafka server:

kafka-server-start kafka_2.11-0.11.0.0/config/server.properties.backbeat

Install and run Redis on port 6379 (default port)

The "Failed Entry Processor" (started by the Queue Populator) stores failed replication entries in a Redis sorted set.

Create log offset Zookeeper node

zkCli -server localhost:2181/backbeat

create /queue-populator

Create the Kafka topics

backbeat-replication:

kafka-topics --create \
--zookeeper localhost:2181/backbeat \
--replication-factor 1 \
--partitions 1 \
--topic backbeat-replication

backbeat-replication-status:

kafka-topics --create \
--zookeeper localhost:2181/backbeat \
--replication-factor 1 \
--partitions 1 \
--topic backbeat-replication-status

backbeat-replication-failed:

kafka-topics --create \
--zookeeper localhost:2181/backbeat \
--replication-factor 1 \
--partitions 1 \
--topic backbeat-replication-failed
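
The three commands above differ only in the topic name, so they can be folded into a loop. This is a minimal sketch, assuming the same kafka-topics CLI and ZooKeeper chroot as above. The DRY_RUN variable and the create_topic helper are hypothetical names introduced here, and by default the script only prints the commands:

```shell
ZK_CONNECT="localhost:2181/backbeat"
TOPICS="backbeat-replication backbeat-replication-status backbeat-replication-failed"

# create_topic NAME: print the kafka-topics command in dry-run mode
# (the default), or execute it when DRY_RUN=0.
create_topic() {
    set -- kafka-topics --create \
        --zookeeper "$ZK_CONNECT" \
        --replication-factor 1 \
        --partitions 1 \
        --topic "$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

for name in $TOPICS; do
    create_topic "$name"
done
```

Run it once as-is to review the printed commands, then again with DRY_RUN=0 to create the topics.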

Vault

git clone https://github.com/scality/Vault ~/replication/vault && \
cd ~/replication/vault && \
yarn install

Run Vault with the "in-memory backend":

chmod 400 ./tests/utils/keyfile

VAULT_DB_BACKEND="MEMORY" yarn start

⚠️ With the "in-memory backend", all data is lost when you stop the process.

Alternatively, run Vault with the "mongodb backend":

chmod 400 ./tests/utils/keyfile

VAULT_DB_BACKEND="MONGODB" yarn start

MongoDB can be installed by following the official MongoDB installation instructions.

CloudServer

git clone https://github.com/scality/cloudserver ~/replication/cloudserver && \
cd ~/replication/cloudserver && \
yarn install

Generate an AWS access/secret key pair with full IAM and S3 access.

Add your keys with aws configure --profile aws-account

Create a versioning-enabled destination bucket on AWS:

aws s3api create-bucket --bucket <DESTINATION_BUCKET_NAME> --profile aws-account

aws s3api put-bucket-versioning \
--bucket <DESTINATION_BUCKET_NAME> \
--versioning-configuration Status=Enabled \
--profile aws-account

Replace existing ./locationConfig.json with:

{
    "us-east-1": {
        "details": {
            "supportsVersioning": true
        },
        "isTransient": false,
        "legacyAwsBehavior": false,
        "objectId": "0b1d9226-a694-11eb-bc21-baec55d199cd",
        "type": "file"
    },
    "aws-location": {
        "type": "aws_s3",
        "legacyAwsBehavior": true,
        "details": {
            "awsEndpoint": "s3.amazonaws.com",
            "bucketName": "<DESTINATION_BUCKET_NAME>",
            "bucketMatch": false,
            "credentialsProfile": "aws-account",
            "serverSideEncryption": true
        }
    }
}

Update ./config.json with:

"replicationEndpoints": [{ "site": "aws-location", "type": "aws_s3" }],

In ./config.json, make sure recordLog.enabled is set to true

"recordLog": {
        "enabled": true,
        ...
}

Run CloudServer

S3DATA=multiple S3METADATA=mongodb REMOTE_MANAGEMENT_DISABLE=true \
S3VAULT=scality yarn start

Vaultclient

Create a "Zenko" account and generate keys

bin/vaultclient create-account --name bart --email dev@backbeat --port 8600
bin/vaultclient generate-account-access-key --name bart --port 8600
aws configure --profile bart

Backbeat

git clone https://github.com/scality/backbeat ~/replication/backbeat && \
cd ~/replication/backbeat && \
yarn install

Update conf/authdata.json with the bart account's information and keys:

{
    "accounts": [{
        "name": "bart",
        "arn": "aws:iam::331457510670:/bart",
        "canonicalID": "2083781e15384e30f48c651a948ec2dc1e1801c4af24c2750a166823e28ca570",
        "displayName": "bart",
        "keys": {
            "access": "20TNCD06HOCSLQSABFZP",
            "secret": "1P43SL0ekJjXnQvliV0KgMibZ=N2lKZO4dpnWzbF"
        }
    }
    ]
}

Update the extensions.replication.source.auth section of conf/config.json:

"auth": {
    "type": "account",
    "account": "bart",
    "vault": {
        "host": "127.0.0.1",
        "port": 8500,
        "adminPort": 8600
    }
}

Make sure conf/config.json section extensions.replication.destination.bootstrapList includes:

{ "site": "aws-location", "type": "aws_s3" }
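
The site name aws-location now has to appear in three files: CloudServer's locationConfig.json and config.json, and Backbeat's conf/config.json. A mismatch between them is easy to miss, so a rough consistency check can help. The sketch below is a plain text match, not a JSON parse, and the helper names has_site and check_site_everywhere are hypothetical:

```shell
# has_site FILE SITE: succeed if FILE contains the quoted site name.
# This is a text match only; it does not validate the JSON structure.
has_site() {
    grep -q "\"$2\"" "$1"
}

# check_site_everywhere SITE FILE...: report every file that is absent
# or does not mention SITE; return non-zero if any file was reported.
check_site_everywhere() {
    site="$1"
    shift
    missing=0
    for file in "$@"; do
        if [ ! -f "$file" ]; then
            echo "not found: $file"
            missing=1
        elif ! has_site "$file" "$site"; then
            echo "missing \"$site\": $file"
            missing=1
        fi
    done
    return "$missing"
}
```

With the directory layout used in this guide, it could be called as: check_site_everywhere aws-location ~/replication/cloudserver/locationConfig.json ~/replication/cloudserver/config.json ~/replication/backbeat/conf/config.json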

Queue populator

S3_REPLICATION_METRICS_PROBE=true REMOTE_MANAGEMENT_DISABLE=true \
yarn run queue_populator

Queue processor

S3_REPLICATION_METRICS_PROBE=true REMOTE_MANAGEMENT_DISABLE=true \
yarn run queue_processor aws-location

Replication status processor

S3_REPLICATION_METRICS_PROBE=true REMOTE_MANAGEMENT_DISABLE=true \
yarn run replication_status_processor

AWS S3 CLI

Create a source bucket with versioning enabled:

aws s3api create-bucket \
--bucket sourcebucket \
--endpoint-url http://127.0.0.1:8000 \
--profile bart

aws s3api put-bucket-versioning \
--bucket sourcebucket \
--versioning-configuration Status=Enabled \
--endpoint-url=http://127.0.0.1:8000 \
--profile bart

Set up replication

Create replication.json

{
    "Role": "arn:aws:iam::root:role/s3-replication-role",
    "Rules": [
        {
            "Status": "Enabled",
            "Prefix": "",
            "Destination": {
                "Bucket": "arn:aws:s3:::sourcebucket",
                "StorageClass": "aws-location"
            }
        }
    ]
}

aws s3api put-bucket-replication \
--bucket sourcebucket \
--replication-configuration file://replication.json \
--endpoint-url=http://127.0.0.1:8000 \
--profile bart
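
If you use a different source bucket or site name, the bucket ARN and StorageClass in replication.json must change with it. One way to keep them in step is to generate the file from those two values. This is a minimal sketch that reproduces the configuration above; write_replication_config is a hypothetical helper name:

```shell
# write_replication_config OUTFILE BUCKET SITE
# Writes a replication configuration with a single enabled rule whose
# empty prefix matches every key in the bucket.
write_replication_config() {
    cat > "$1" <<EOF
{
    "Role": "arn:aws:iam::root:role/s3-replication-role",
    "Rules": [
        {
            "Status": "Enabled",
            "Prefix": "",
            "Destination": {
                "Bucket": "arn:aws:s3:::$2",
                "StorageClass": "$3"
            }
        }
    ]
}
EOF
}
```

For the values used in this guide: write_replication_config replication.json sourcebucket aws-location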

Put an object to be replicated:

aws s3api put-object \
--key key0 \
--body file \
--bucket sourcebucket \
--endpoint-url=http://127.0.0.1:8000 \
--profile bart

Check that the object has been replicated. Because "bucketMatch" is false, the destination key is prefixed with the source bucket name:

aws s3api head-object \
--key sourcebucket/key0 \
--bucket <DESTINATION_BUCKET_NAME> \
--profile aws-account
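
Replication is asynchronous, so a head-object issued right after the put may fail before the object arrives. A small retry loop avoids checking by hand. This is a sketch under two assumptions: that the destination key is the source bucket name plus "/" plus the object key (as "bucketMatch": false suggests), and that a handful of attempts is enough for a local setup. The dest_key and retry helpers and the MAX_TRIES and SLEEP_SECS variables are hypothetical names:

```shell
# dest_key BUCKET KEY: key expected in the destination bucket, assuming
# the source bucket name is prepended as a prefix.
dest_key() {
    printf '%s/%s\n' "$1" "$2"
}

# retry CMD...: run CMD until it succeeds, for at most MAX_TRIES attempts
# (default 10), sleeping SLEEP_SECS seconds (default 3) between attempts.
retry() {
    attempt=1
    while ! "$@" >/dev/null 2>&1; do
        if [ "$attempt" -ge "${MAX_TRIES:-10}" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${SLEEP_SECS:-3}"
    done
    return 0
}
```

For example: retry aws s3api head-object --key "$(dest_key sourcebucket key0)" --bucket <DESTINATION_BUCKET_NAME> --profile aws-account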

Structure

In our $HOME directory, we now have the following directories:

$HOME
├── kafka
│   └── kafka_2.11-0.11.0.0
└── replication
    ├── backbeat
    ├── cloudserver
    └── vault