Transient file system service featuring highly paralleled indexing on both file data and file system metadata.

DeltaFS

This software was developed, in part, under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. Please see the accompanying LICENSE.txt for further information. DeltaFS is still under development.

Features

  • Serverless design featuring zero dedicated metadata servers and no global file system namespace.
  • Application-owned metadata service harnessing compute nodes to handle metadata and achieve highly agile scalability.
  • Freedom from unjustified synchronization among HPC applications that do not need to use the file system to communicate.
  • Write-optimized LSM-based metadata representation with file system namespace snapshots as the basis of inter-job data sharing and workflow execution.
  • A special directory type with an embedded stripped-down streaming Map-Reduce pipeline.
  • A file system as no more than a thin service composed by each application at runtime to provide a temporary view of a private namespace backed by a stack of immutable snapshots and a collection of shared data objects.
  • Simplified data center storage consisting of multiple independent underlying object stores, providing flat namespaces of data objects, and oblivious of file system semantics.

Platform

DeltaFS runs on Linux, Mac OS, and most UNIX platforms for development and local testing. Production deployments require Linux. DeltaFS is mostly written in C++. C++11 is not required to compile the DeltaFS code, but it is used if the compiler supports it. Features from C++14 or later are currently not used.

Documentation

Our paper deltafs_pdsw15 provides an overview of the file system, and our other two papers, deltafs_pdsw17 and deltafs_sc18, provide an overview of the Indexed Massive Directory in DeltaFS.

Software requirements

Compiling DeltaFS requires a recent C/C++ compiler, cmake, make, mpi, snappy, glog, and gflags. Compiling some of DeltaFS' dependencies requires a recent autoconf, automake, and libtool.

On Ubuntu 14.04 LTS or later, you may use the following to prepare the system environment for DeltaFS.

sudo apt-get install gcc g++ make  # Alternatively, this can also be clang
sudo apt-get install autoconf automake libtool pkg-config
sudo apt-get install cmake cmake-curses-gui
sudo apt-get install libsnappy-dev libgflags-dev libgoogle-glog-dev
sudo apt-get install libmpich-dev  # Alternatively, this can also be libopenmpi-dev
sudo apt-get install mpich

For Mac OS, see README.mac.md for instructions.
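
After installing the packages, a quick check that the build tools are on PATH can save a failed configure step later. The following is a minimal sketch, not part of DeltaFS: the helper name `missing_tools` is hypothetical, and the command names are assumptions about what the packages above provide on a typical Ubuntu system.

```shell
#!/bin/sh
# Report which of the given commands are not found on PATH.
# Prints one missing command per line; prints nothing if all are present.
# (missing_tools is a hypothetical helper, not shipped with DeltaFS.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names for the packages installed above; adjust as needed
# (for example, clang/clang++ instead of cc/c++).
missing=$(missing_tools cc c++ make cmake autoconf automake pkg-config mpicc mpirun)

if [ -n "$missing" ]; then
    echo "Missing build tools:"
    echo "$missing"
else
    echo "All build tools found."
fi
```

This only checks for executables. Development libraries such as snappy, glog, and gflags are not covered and are still detected by cmake at configure time.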

Object store

DeltaFS assumes an underlying object storage service to store file system metadata and file data. This underlying object store may simply be a shared parallel file system such as Lustre, GPFS, PanFS, or HDFS. For high performance, however, a scalable object storage service is recommended; DeltaFS currently supports Ceph RADOS.

RADOS

On Ubuntu 14.04 LTS or later, RADOS can be installed via apt-get.

sudo apt-get install librados-dev

RPC

Distributed DeltaFS instances require an RPC library to communicate with each other. Currently, we use Mercury, which supports multiple network backends, such as MPI, bmi on tcp, and cci on a variety of underlying network abstractions including verbs, tcp, sock, and raw eth.

Mercury

Please follow the online Mercury documentation to install Mercury and one or more of its backends. To start, we suggest using bmi as the network backend. Compiling Mercury may also require installing openpa, depending on whether <stdatomic.h> is available.

# BMI
git clone http://git.mcs.anl.gov/bmi.git && cd bmi
./prepare && ./configure --enable-shared --enable-bmi-only
make && sudo make install

# OpenPA -- only needed when <stdatomic.h> is absent
git clone https://github.com/pmodels/openpa.git && cd openpa
./autogen.sh && ./configure --enable-shared
make && sudo make install

# Mercury
git clone --recurse-submodules https://github.com/mercury-hpc/mercury.git && cd mercury
mkdir build && cd build
cmake -DBUILD_SHARED_LIBS=ON \
-DMERCURY_USE_CHECKSUMS=OFF -DNA_USE_BMI=ON ..
make && sudo make install
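
Whether the OpenPA step above is needed can be estimated up front by asking the compiler to parse a file that includes <stdatomic.h>. This is a rough sketch under stated assumptions: the helper name `has_stdatomic` is hypothetical, the compiler is assumed to accept the gcc/clang options `-x c` and `-fsyntax-only`, and the probe only approximates whatever check Mercury's own build performs.

```shell
#!/bin/sh
# Probe whether the C compiler can find <stdatomic.h>.
# CC defaults to "cc"; override it to test another compiler.
# (has_stdatomic is a hypothetical helper, not part of Mercury or DeltaFS.)
has_stdatomic() {
    printf '#include <stdatomic.h>\nint main(void) { return 0; }\n' |
        "${CC:-cc}" -x c -fsyntax-only - >/dev/null 2>&1
}

# A missing or failing compiler is reported the same way as a missing header.
if has_stdatomic; then
    echo "stdatomic.h: present"
else
    echo "stdatomic.h: missing"
fi
```

If the header is reported missing, installing OpenPA before Mercury is the safer choice.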

Building

After all software dependencies are installed, we can proceed to build DeltaFS. DeltaFS uses cmake, and we recommend an out-of-source build. To do that, create a dedicated build directory and run the 'ccmake' command from it:

cd deltafs
mkdir build
cd build
ccmake -DDELTAFS_COMMON_INTREE=ON ..

Type 'c' multiple times and choose suitable options. Recommended options are:

 BUILD_SHARED_LIBS                ON
 BUILD_TESTS                      ON  ## <-- turn this off to skip building tests
 CMAKE_BUILD_TYPE                 RelWithDebInfo
 CMAKE_INSTALL_PREFIX             /usr/local
 CMAKE_PREFIX_PATH                ## <-- this can be empty
 DELTAFS_BENCHMARKS               OFF
 DELTAFS_COMMON_INTREE            ON  ## <-- this must be ON
 DELTAFS_MPI                      ON  ## <-- this must be ON
 PDLFS_GFLAGS                     ON
 PDLFS_GLOG                       ON
 PDLFS_MARGO_RPC                  OFF
 PDLFS_MERCURY_RPC                ON
 PDLFS_RADOS                      ON  ## <-- only if deltafs needs to run on rados
 PDLFS_SNAPPY                     ON
 PDLFS_VERBOSE                    1
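
For scripted builds, the same options can be passed to plain cmake with -D flags instead of being set in the ccmake screen. The following is a sketch under assumptions: the helper name `deltafs_cmake_args` and the `RUN` switch are hypothetical, and PDLFS_RADOS is set to OFF here on the assumption of a local, non-RADOS build.

```shell
#!/bin/sh
# Print the recommended options as cmake -D flags, one per line.
# (deltafs_cmake_args is a hypothetical helper, not part of DeltaFS.)
deltafs_cmake_args() {
    cat <<'EOF'
-DBUILD_SHARED_LIBS=ON
-DBUILD_TESTS=ON
-DCMAKE_BUILD_TYPE=RelWithDebInfo
-DCMAKE_INSTALL_PREFIX=/usr/local
-DDELTAFS_BENCHMARKS=OFF
-DDELTAFS_COMMON_INTREE=ON
-DDELTAFS_MPI=ON
-DPDLFS_GFLAGS=ON
-DPDLFS_GLOG=ON
-DPDLFS_MARGO_RPC=OFF
-DPDLFS_MERCURY_RPC=ON
-DPDLFS_RADOS=OFF
-DPDLFS_SNAPPY=ON
-DPDLFS_VERBOSE=1
EOF
}

# Dry run by default: the command is only printed. Set RUN=1 to execute it
# from inside the build directory. The unquoted substitution relies on word
# splitting, which is safe here because no option contains spaces.
if [ "${RUN:-0}" = "1" ]; then
    cmake $(deltafs_cmake_args) ..
else
    echo cmake $(deltafs_cmake_args) ..
fi
```

Running the script with RUN=1 from the build directory should leave the cmake cache in the same state as choosing these values interactively.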

Once you exit the CMake configuration screen and are ready to build the targets, do:

make

Local testing

To test DeltaFS on a local machine using the local file system to store file system metadata and file data, we can run two DeltaFS server instances and then use a DeltaFS shell to access the namespace. See README.rados.md for testing DeltaFS against Ceph RADOS.

mpirun -n 2 ./build/src/server/deltafs-srvr -v=1 -logtostderr

This will start two DeltaFS server instances that store file system metadata in /tmp/deltafs_outputs and file data in /tmp/deltafs_data. Please remove these two folders if they exist before running DeltaFS. The two DeltaFS server instances will begin listening on TCP ports 10101 and 10102.
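
The cleanup step can be scripted so that each local test run starts from an empty state. This is a minimal sketch; the helper name `reset_deltafs_dirs` is hypothetical, and the default paths are simply the two folders named above.

```shell
#!/bin/sh
# Remove leftover DeltaFS state so that a new local test starts clean.
# With no arguments, the two default folders named above are used.
# (reset_deltafs_dirs is a hypothetical helper, not part of DeltaFS.)
reset_deltafs_dirs() {
    if [ "$#" -eq 0 ]; then
        set -- /tmp/deltafs_outputs /tmp/deltafs_data
    fi
    for dir in "$@"; do
        if [ -e "$dir" ]; then
            rm -rf -- "$dir"
            echo "removed $dir"
        fi
    done
}

reset_deltafs_dirs
```

Run it before the mpirun command, and only after any earlier server instances have been stopped.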

env "DELTAFS_MetadataSrvAddrs=127.0.0.1:10101&127.0.0.1:10102" "DELTAFS_NumOfMetadataSrvs=2" \
    ./build/src/cmds/deltafs-shell -v=1 -logtostderr

This will start a DeltaFS shell and instruct it to connect to the DeltaFS servers we started previously. Currently, this is just a simple shell that allows us to create directories, copy files from the local file system to DeltaFS, and cat files in DeltaFS.
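
The address list in the command above is a set of host:port pairs joined by '&'. For more than a couple of servers it may be easier to generate the value than to type it. The sketch below assumes that server i listens on the base port plus i, which matches the two-server example but is not otherwise confirmed here; the helper name `metadata_srv_addrs` is hypothetical.

```shell
#!/bin/sh
# Build a DELTAFS_MetadataSrvAddrs value: host:port pairs joined by '&'.
# Assumption: server i listens on base_port + i, as in the two-server example.
# (metadata_srv_addrs is a hypothetical helper, not part of DeltaFS.)
metadata_srv_addrs() {
    host=$1 base_port=$2 count=$3
    addrs="" i=0
    while [ "$i" -lt "$count" ]; do
        # Prepend the existing list and a '&' separator only when non-empty.
        addrs="${addrs:+$addrs&}$host:$((base_port + i))"
        i=$((i + 1))
    done
    printf '%s\n' "$addrs"
}

DELTAFS_MetadataSrvAddrs=$(metadata_srv_addrs 127.0.0.1 10101 2)
DELTAFS_NumOfMetadataSrvs=2
export DELTAFS_MetadataSrvAddrs DELTAFS_NumOfMetadataSrvs
echo "$DELTAFS_MetadataSrvAddrs"
```

With both variables exported, the shell binary can be started directly from the same session without the env prefix.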

DeltaFS app

Currently, applications have to explicitly link to the DeltaFS user library (include/deltafs_api.h) in order to call DeltaFS. Alternatively, DeltaFS may be invoked implicitly by using a preload library that intercepts the file system calls made by an application and redirects them to DeltaFS. We have developed one such library, available at https://github.com/pdlfs/pdlfs-preload.
