Licence: apache-2.0
Record, rewind and replay live Kinesis streams

kinesis-vcr

Rewind and (later) replay live Kinesis streams

The VCR uses the Amazon Kinesis Connectors and Amazon Kinesis Client Library internally, meaning it automatically uses a DynamoDB table for lease management.

Running locally?

Build

./gradlew installDist

Run

Record a stream:

VCR_SOURCE_STREAM_NAME=my-important-data \
VCR_BUCKET_NAME=dev-kinesis-backups \
./build/install/kinesis-vcr/bin/kinesis-vcr record

This will write to files like s3://dev-kinesis-backups/my-important-data/2015-05-12/49545259625339426540861503851695364890964474172851355682-49545259625339426540861503967844121936259603873071628322, where the date is the date that the chunk is written to S3 and the long numbers are the start and end sequence numbers of this chunk of stream data.
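
To make the key layout concrete, here is a minimal sketch that splits such a key back into its parts. ChunkKey and its fields are hypothetical names used only for illustration and are not part of the VCR's code. It assumes the layout is exactly stream name, date, then start and end sequence numbers joined by a hyphen, as in the example above. Sequence numbers are far too long for a long, so the sketch uses BigInteger.

```java
import java.math.BigInteger;
import java.time.LocalDate;

// Hypothetical helper (not part of kinesis-vcr): parses a recorded chunk's S3 key.
final class ChunkKey {
    final String streamName;
    final LocalDate writtenOn;          // date the chunk was written to S3
    final BigInteger startSequence;     // first sequence number in the chunk
    final BigInteger endSequence;       // last sequence number in the chunk

    private ChunkKey(String streamName, LocalDate writtenOn,
                     BigInteger startSequence, BigInteger endSequence) {
        this.streamName = streamName;
        this.writtenOn = writtenOn;
        this.startSequence = startSequence;
        this.endSequence = endSequence;
    }

    // Assumed layout: "<stream>/<yyyy-MM-dd>/<startSeq>-<endSeq>" (bucket name excluded).
    static ChunkKey parse(String key) {
        String[] parts = key.split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Unexpected key layout: " + key);
        }
        String[] range = parts[2].split("-");
        if (range.length != 2) {
            throw new IllegalArgumentException("Unexpected sequence range: " + parts[2]);
        }
        return new ChunkKey(parts[0], LocalDate.parse(parts[1]),
                new BigInteger(range[0]), new BigInteger(range[1]));
    }
}
```

Because the date is its own path segment, listing a single day of a recording only needs the prefix my-important-data/2015-05-12/.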

Replay a stream:

VCR_SOURCE_STREAM_NAME=my-important-data \
VCR_TARGET_STREAM_NAME=recovery-stream \
VCR_BUCKET_NAME=dev-kinesis-backups \
./build/install/kinesis-vcr/bin/kinesis-vcr play 2014-05-01 2014-05-10

This will take a recording of the stream my-important-data stored in the bucket dev-kinesis-backups and replay records for the date range 2014-05-01 to 2014-05-10 (exclusive of the end) onto the stream recovery-stream. To replay only one day of data, use:

./build/install/kinesis-vcr/bin/kinesis-vcr play 2014-05-01

Replay can also specify times, for a narrow playback range:

./build/install/kinesis-vcr/bin/kinesis-vcr play 2014-05-01T12:00:00 2014-05-03T13:45:00

Specified times are always interpreted as UTC.
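
As an illustration of these rules, the sketch below turns play arguments into a UTC range with an exclusive end. It is not the VCR's actual parser. ReplayRange, parseUtc and of are hypothetical names, and treating a single argument as "start plus one day" is an assumption based on the one-day example above.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch (not the VCR's parser): play arguments -> UTC range.
final class ReplayRange {
    final Instant start;         // inclusive
    final Instant endExclusive;  // exclusive, as described for date ranges

    private ReplayRange(Instant start, Instant endExclusive) {
        this.start = start;
        this.endExclusive = endExclusive;
    }

    // Accepts "yyyy-MM-dd" or "yyyy-MM-ddTHH:mm:ss"; both are read as UTC.
    static Instant parseUtc(String arg) {
        if (arg.contains("T")) {
            return LocalDateTime.parse(arg).toInstant(ZoneOffset.UTC);
        }
        return LocalDate.parse(arg).atStartOfDay(ZoneOffset.UTC).toInstant();
    }

    // endArg may be null; the range then covers one day from the start (assumption).
    static ReplayRange of(String startArg, String endArg) {
        Instant start = parseUtc(startArg);
        Instant end = (endArg != null) ? parseUtc(endArg) : start.plus(1, ChronoUnit.DAYS);
        if (!end.isAfter(start)) {
            throw new IllegalArgumentException("End must be after start");
        }
        return new ReplayRange(start, end);
    }
}
```

With this reading, ReplayRange.of("2014-05-01", "2014-05-10") covers midnight UTC on 1 May up to, but not including, midnight UTC on 10 May.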

Prior to replaying a stream, the VCR can provide an estimated playing time, which is simply the size of the input dataset divided by the write throughput possible for the target stream. kinesis-vcr estimate takes the same parameters as kinesis-vcr play:

$ kinesis-vcr estimate 2014-05-01T12:00:00 2014-05-03T13:45:00
[main] INFO com.scopely.infrastructure.kinesis.KinesisVcr - Target stream (kinesis-playback-test) has 2 shards
[main] INFO com.scopely.infrastructure.kinesis.KinesisVcr - It would take around 50 mins to replay the data in the provided range, which has 341 files and a total size of 6038 MB
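
The arithmetic behind this output can be reproduced with a short sketch. It assumes the write throughput is the standard Kinesis limit of 1 MB per second per shard and ignores the per-shard records-per-second limit. ReplayEstimate and estimateMinutes are hypothetical names, not the VCR's own code.

```java
// Hypothetical sketch of the estimate: dataset size / target stream write throughput.
final class ReplayEstimate {
    // Kinesis accepts up to 1 MB of writes per second per shard.
    static final double WRITE_MB_PER_SECOND_PER_SHARD = 1.0;

    static long estimateMinutes(long totalSizeMb, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("Target stream needs at least one shard");
        }
        double seconds = totalSizeMb / (shardCount * WRITE_MB_PER_SECOND_PER_SHARD);
        return Math.round(seconds / 60.0);
    }

    public static void main(String[] args) {
        // Figures from the log above: 6038 MB onto a 2-shard stream.
        System.out.println(estimateMinutes(6038, 2) + " mins");
    }
}
```

For the figures in the log, 6038 MB over 2 shards is 3019 seconds, or about 50 minutes, which matches the reported estimate. Adding shards to the target stream shortens the replay proportionally.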

Running somewhere else?

JAR distribution

Build

Make a fat JAR:

./gradlew shadowJar

Your JAR will be in build/libs/kinesis-vcr-1.0.4-SNAPSHOT-all.jar.

Run

java -jar kinesis-vcr-1.0.4-SNAPSHOT-all.jar record

Debian

Build

Make a Debian package:

./gradlew buildDeb

Your package will be in build/distributions/kinesis-vcr_1.0.0_all.deb.

Run

Install the package. The script kinesis-vcr will be installed to /opt/scopely/kinesis-vcr/bin/kinesis-vcr.

Invoke it like you would locally, e.g.:

VCR_SOURCE_STREAM_NAME=my-important-data \
VCR_BUCKET_NAME=dev-kinesis-backups \
/opt/scopely/kinesis-vcr/bin/kinesis-vcr record

Configuration and options

The VCR assumes that it can find AWS credentials in the default locations, according to the rules of DefaultAWSCredentialsProviderChain:

  • Environment Variables - AWS_ACCESS_KEY_ID and AWS_SECRET_KEY
  • Java System Properties - aws.accessKeyId and aws.secretKey
  • Credential profiles file at the default location (~/.aws/credentials) shared by all AWS SDKs and the AWS CLI
  • Instance profile credentials delivered through the Amazon EC2 metadata service

In addition to the required parameters VCR_SOURCE_STREAM_NAME and VCR_BUCKET_NAME, the VCR also respects VCR_BUFFER_SIZE_BYTES, controlling how much data to buffer before writing to S3, and VCR_BUFFER_TIME_MILLIS, controlling how long to buffer data before writing to S3.
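
One way to picture how the two buffer settings interact is a "flush on size or age" rule, sketched below. This is an assumption about the behaviour, not a description of the VCR's internals. FlushPolicy is a hypothetical name, and the fallback values are supplied by the caller because the real defaults are not stated here.

```java
import java.util.Map;

// Hypothetical sketch of a "flush on size or age" buffering rule.
final class FlushPolicy {
    final long maxBytes;   // from VCR_BUFFER_SIZE_BYTES
    final long maxMillis;  // from VCR_BUFFER_TIME_MILLIS

    FlushPolicy(long maxBytes, long maxMillis) {
        this.maxBytes = maxBytes;
        this.maxMillis = maxMillis;
    }

    // Reads the two optional settings from an environment map such as System.getenv().
    // Fallbacks are caller-supplied placeholders, not the VCR's actual defaults.
    static FlushPolicy fromEnv(Map<String, String> env, long fallbackBytes, long fallbackMillis) {
        String bytes = env.get("VCR_BUFFER_SIZE_BYTES");
        String millis = env.get("VCR_BUFFER_TIME_MILLIS");
        return new FlushPolicy(
                bytes == null ? fallbackBytes : Long.parseLong(bytes),
                millis == null ? fallbackMillis : Long.parseLong(millis));
    }

    // Flush when the buffer is non-empty and either threshold has been reached.
    boolean shouldFlush(long bufferedBytes, long millisSinceLastFlush) {
        if (bufferedBytes == 0) {
            return false;
        }
        return bufferedBytes >= maxBytes || millisSinceLastFlush >= maxMillis;
    }
}
```

Under this reading, a larger byte limit produces fewer, bigger S3 objects, while the time limit caps how long a record on a quiet stream can wait before it is written.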

When playing from S3, VCR_TARGET_STREAM_NAME must also be specified.

More bells and whistles are sure to come.

Format

The VCR writes records to S3 as newline-delimited Base64 -- each record on the input stream is written to a line in the output file. On playback, it just Base64-decodes each line and emits it on the target stream. As such, this tool is completely agnostic to the format of records on the wire.
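
The format is simple enough to sketch with the JDK alone. The snippet below is an illustration rather than the VCR's code, and Base64Lines is a hypothetical name. It writes one Base64 line per record and decodes the lines back, skipping empty lines.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical sketch of the newline-delimited Base64 format described above.
final class Base64Lines {
    // One Base64 line per record; the basic encoder never emits line breaks itself.
    static String encode(List<byte[]> records) {
        StringBuilder out = new StringBuilder();
        for (byte[] record : records) {
            out.append(Base64.getEncoder().encodeToString(record)).append('\n');
        }
        return out.toString();
    }

    // Decodes each non-empty line back into the original record bytes.
    static List<byte[]> decode(String fileContents) {
        List<byte[]> records = new ArrayList<>();
        for (String line : fileContents.split("\n")) {
            if (!line.isEmpty()) {
                records.add(Base64.getDecoder().decode(line));
            }
        }
        return records;
    }
}
```

Because Base64 output never contains a newline, records that themselves contain newlines or arbitrary binary data survive the round trip unchanged.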
