
larrabee / S3sync

License: GPL-3.0
Really fast sync tool for S3

Programming Languages

Go

Projects that are alternatives of or similar to S3sync

Aws
A collection of bash shell scripts for automating various tasks with Amazon Web Services using the AWS CLI and jq.
Stars: ✭ 493 (+120.09%)
Mutual labels:  s3, amazon
Akubra
Simple solution to keep independent S3 storages in sync
Stars: ✭ 79 (-64.73%)
Mutual labels:  s3, sync
Rclone
"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Yandex Files
Stars: ✭ 30,541 (+13534.38%)
Mutual labels:  s3, sync
benji
📁 This library is a Scala reactive DSL for object storage (e.g. S3/Amazon, S3/CEPH, Google Cloud Storage).
Stars: ✭ 18 (-91.96%)
Mutual labels:  amazon, s3
Cash
HTTP response caching for Koa. Supports Redis, in-memory store, and more!
Stars: ✭ 122 (-45.54%)
Mutual labels:  s3, amazon
Aws.s3
Amazon Simple Storage Service (S3) API Client
Stars: ✭ 302 (+34.82%)
Mutual labels:  s3, amazon
Objstore
A Multi-Master Distributed Caching Layer for Amazon S3.
Stars: ✭ 69 (-69.2%)
Mutual labels:  s3, amazon
punic
Punic is a remote cache CLI built for Carthage and Apple .xcframework
Stars: ✭ 25 (-88.84%)
Mutual labels:  amazon, s3
S3fs
Amazon S3 filesystem for PyFilesystem2
Stars: ✭ 111 (-50.45%)
Mutual labels:  s3, amazon
Aws Workflows On Github
Workflows for automation of AWS services setup from Github CI/CD
Stars: ✭ 95 (-57.59%)
Mutual labels:  s3, amazon
docker base images
Vlad's Base Images for Docker
Stars: ✭ 61 (-72.77%)
Mutual labels:  sync, s3
Docker S3 Volume
Docker container with a data volume from s3.
Stars: ✭ 166 (-25.89%)
Mutual labels:  s3, sync
docker-aws-s3-sync
Docker container to sync a folder to Amazon S3
Stars: ✭ 21 (-90.62%)
Mutual labels:  sync, s3
S3rver
A fake S3 server written in Node.js
Stars: ✭ 410 (+83.04%)
Mutual labels:  s3, amazon
S4
🔄 Fast and cheap synchronisation of files using Amazon S3
Stars: ✭ 69 (-69.2%)
Mutual labels:  sync, s3
Aws Toolkit Vscode
AWS Toolkit for Visual Studio Code, an extension for working with AWS services including AWS Lambda.
Stars: ✭ 823 (+267.41%)
Mutual labels:  s3, amazon
CloudHunter
Find unreferenced AWS S3 buckets which have CloudFront CNAME records pointing to them
Stars: ✭ 31 (-86.16%)
Mutual labels:  amazon, s3
serverless-s3bucket-sync
Serverless Plugin to sync local folders with an S3 bucket
Stars: ✭ 24 (-89.29%)
Mutual labels:  sync, s3
S3scanner
Scan for open AWS S3 buckets and dump the contents
Stars: ✭ 1,319 (+488.84%)
Mutual labels:  s3, amazon
Aws Sdk Perl
A community AWS SDK for Perl Programmers
Stars: ✭ 153 (-31.7%)
Mutual labels:  s3, amazon

S3Sync

Really fast sync tool for S3


Features

  • Multi-threaded file downloading/uploading
  • Can sync in several directions:
    • S3 to local FS
    • Local FS to S3
    • S3 to S3
  • Retrying on errors
  • Live statistics
  • Rate limiting by objects
  • Rate limiting by bandwidth (see the example after this list)
  • Flexible filters by extension, Content-Type, ETag and object mtime
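
Both rate limiters can be combined with any sync direction. A hedged illustration (credentials, bucket and path are placeholders): cap the sync at 500 objects/sec and 10 MB/s of bandwidth:

s3sync --sk KEY --ss SECRET --ratelimit-objects 500 --ratelimit-bandwidth 10M -w 128 s3://shared fs:///opt/backups/s3/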

Key feature: very high speed.
Average listing speed is around 5k objects/sec for S3.
With 128 workers, the average sync speed is around 2k objects/sec for small objects (1-20 KB), limited by a 1 Gbit uplink.

Limitations

  • Each object is loaded into RAM, so you need roughly <avg object size> * <workers count> of RAM.
    If you don't have enough RAM, you can use swap; a large (32-64 GB) swap on SSD does not affect the tool's performance.
    This is because the tool was designed to synchronize billions of small files and is optimized for that workload.
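    As an illustration (numbers are hypothetical): with 128 workers and an average object size of 1 MB, expect roughly 128 MB of RAM; with 10 MB objects, roughly 1.3 GB.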

Usage

>> s3sync --help
Really fast sync tool for S3
VersionId: dev, commit: none, built at: unknown
Usage: s3sync [--sk SK] [--ss SS] [--st ST] [--sr SR] [--se SE] [--tk TK] [--ts TS] [--tt TT] [--tr TR] [--te TE] [--s3-retry S3-RETRY] [--s3-retry-sleep S3-RETRY-SLEEP] [--s3-acl S3-ACL] [--s3-storage-class S3-STORAGE-CLASS] [--s3-keys-per-req S3-KEYS-PER-REQ] [--fs-file-perm FS-FILE-PERM] [--fs-dir-perm FS-DIR-PERM] [--fs-disable-xattr] [--filter-ext FILTER-EXT] [--filter-not-ext FILTER-NOT-EXT] [--filter-ct FILTER-CT] [--filter-not-ct FILTER-NOT-CT] [--filter-after-mtime FILTER-AFTER-MTIME] [--filter-before-mtime FILTER-BEFORE-MTIME] [--filter-modified] [--workers WORKERS] [--debug] [--sync-log] [--sync-progress] [--on-fail ON-FAIL] [--error-handling ERROR-HANDLING] [--disable-http2] [--list-buffer LIST-BUFFER] [--ratelimit-objects RATELIMIT-OBJECTS] [--ratelimit-bandwidth RATELIMIT-BANDWIDTH] SOURCE TARGET

Positional arguments:
  SOURCE
  TARGET

Options:
  --sk SK                Source AWS key
  --ss SS                Source AWS secret
  --st ST                Source AWS session token
  --sr SR                Source AWS Region
  --se SE                Source AWS Endpoint
  --tk TK                Target AWS key
  --ts TS                Target AWS secret
  --tt TT                Target AWS session token
  --tr TR                Target AWS Region
  --te TE                Target AWS Endpoint
  --s3-retry S3-RETRY    Max number of retries to sync a file
  --s3-retry-sleep S3-RETRY-SLEEP
                         Sleep interval (sec) between sync retries on error
  --s3-acl S3-ACL        S3 ACL for uploaded files. Possible values: private, public-read, public-read-write, aws-exec-read, authenticated-read, bucket-owner-read, bucket-owner-full-control
  --s3-storage-class S3-STORAGE-CLASS
                         S3 Storage Class for uploaded files.
  --s3-keys-per-req S3-KEYS-PER-REQ
                         Max number of keys retrieved per List request [default: 1000]
  --fs-file-perm FS-FILE-PERM
                         File permissions [default: 0644]
  --fs-dir-perm FS-DIR-PERM
                         Dir permissions [default: 0755]
  --fs-disable-xattr     Disable FS xattr for storing metadata
  --filter-ext FILTER-EXT
                         Sync only files with given extensions
  --filter-not-ext FILTER-NOT-EXT
                         Skip files with given extensions
  --filter-ct FILTER-CT
                         Sync only files with given Content-Type
  --filter-not-ct FILTER-NOT-CT
                         Skip files with given Content-Type
  --filter-after-mtime FILTER-AFTER-MTIME
                         Sync only files modified after given unix timestamp
  --filter-before-mtime FILTER-BEFORE-MTIME
                         Sync only files modified before given unix timestamp
  --filter-modified      Sync only modified files
  --workers WORKERS, -w WORKERS
                         Workers count [default: 16]
  --debug, -d            Show debug logging
  --sync-log             Show sync log
  --sync-progress, -p    Show sync progress
  --on-fail ON-FAIL, -f ON-FAIL
                         Action on failure. Possible values: fatal, skip, skipmissing (DEPRECATED, use --error-handling instead) [default: fatal]
  --error-handling ERROR-HANDLING
                         Controls error handling. Sum of the values: 1 for ignoring NotFound errors, 2 for ignoring PermissionDenied errors OR 255 to ignore all errors
  --disable-http2        Disable HTTP2 for http client
  --list-buffer LIST-BUFFER
                         Size of list buffer [default: 1000]
  --ratelimit-objects RATELIMIT-OBJECTS
                         Rate limit objects per second
  --ratelimit-bandwidth RATELIMIT-BANDWIDTH
                         Set bandwidth rate limit in bytes/s. Allowed suffixes: K, M, G
  --help, -h             display this help and exit
  --version              display version and exit
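
The --error-handling value is a sum: for example, --error-handling 3 (1 + 2) ignores both NotFound and PermissionDenied errors. A hedged illustration (credentials and bucket are placeholders):

s3sync --sk KEY --ss SECRET --error-handling 3 -w 128 s3://shared fs:///opt/backups/s3/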

Examples:

  • Sync Amazon S3 bucket to FS:
    s3sync --sk KEY --ss SECRET -w 128 s3://shared fs:///opt/backups/s3/
  • Sync S3 bucket with custom endpoint to FS:
    s3sync --sk KEY --ss SECRET --se "http://127.0.0.1:7484" -w 128 s3://shared fs:///opt/backups/s3/
  • Sync directory (/test) from Amazon S3 bucket to FS:
    s3sync --sk KEY --ss SECRET -w 128 s3://shared/test fs:///opt/backups/s3/test/
  • Sync directory from local FS to Amazon S3:
    s3sync --tk KEY --ts SECRET -w 128 fs:///opt/backups/s3/ s3://shared
  • Sync directory from local FS to Amazon S3 bucket directory:
    s3sync --tk KEY --ts SECRET -w 128 fs:///opt/backups/s3/test/ s3://shared/test_new/
  • Sync one Amazon bucket to another Amazon bucket:
    s3sync --tk KEY2 --ts SECRET2 --sk KEY1 --ss SECRET1 -w 128 s3://shared s3://shared_new
  • Sync S3 bucket with custom endpoint to another bucket with custom endpoint:
    s3sync --tk KEY2 --ts SECRET2 --sk KEY1 --ss SECRET1 --se "http://127.0.0.1:7484" --te "http://127.0.0.1:7484" -w 128 s3://shared s3://shared_new
  • Sync one Amazon bucket directory to another Amazon bucket:
    s3sync --tk KEY2 --ts SECRET2 --sk KEY1 --ss SECRET1 -w 128 s3://shared/test/ s3://shared_new

SOURCE and TARGET should be directories. Syncing a single file is not supported (for example, this will not work: s3sync --sk KEY --ss SECRET s3://shared/megafile.zip fs:///opt/backups/s3/).

You can use filters (see the combined example after this list).

  • The timestamp filter (--filter-after-mtime) syncs only files that have been modified after the specified Unix timestamp. It is useful for differential backups.
  • The file extension filter (--filter-ext) syncs only files with the specified extensions. It can be specified multiple times (like this: --filter-ext .jpg --filter-ext .png --filter-ext .bmp).
  • The Content-Type filter (--filter-ct) syncs only files with the specified Content-Type. It can be specified multiple times.
  • The ETag filter (--filter-modified) syncs only modified files. It has a few restrictions: if you are using FS storage, the files must have been created by s3sync, and the filesystem must support xattr.
  • There are also inverted filters (--filter-not-ext, --filter-not-ct and --filter-before-mtime).
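
Filters can be combined. A hedged illustration (credentials, bucket and timestamp are placeholders): back up only .jpg files modified after 2021-01-01 00:00:00 UTC (Unix timestamp 1609459200):

s3sync --sk KEY --ss SECRET --filter-ext .jpg --filter-after-mtime 1609459200 -w 128 s3://shared fs:///opt/backups/s3/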

Install

Download a binary from the Releases page.
Or use docker image larrabee/s3sync like this:

docker run --rm -ti larrabee/s3sync --tk KEY2 --ts SECRET2 --sk KEY1 --ss SECRET1 -w 128 s3://shared/test/ s3://shared_new
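
When the source or target is the local filesystem (fs://), mount the corresponding host directory into the container; otherwise the synced data stays inside the container. A hedged illustration (paths and credentials are placeholders):

docker run --rm -ti -v /opt/backups/s3:/opt/backups/s3 larrabee/s3sync --sk KEY --ss SECRET -w 128 s3://shared fs:///opt/backups/s3/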

Building

Minimum go version: 1.13
Build it with:

go mod vendor
go build -o s3sync ./cli
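
Since this is a standard Go build, cross-compilation works with the stock toolchain. A sketch (not a documented project target) building a static Linux binary from another platform:

CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o s3sync ./cli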

Using module

You can easily use s3sync as a module in your own application. See the example in the cli/ folder.

License

GPLv3

Notes

s3sync performs a non-destructive one-way sync: it never deletes files in either the source or the target, even if they are out of sync.
