All Projects → data61 → cuda-fixnum

data61 / cuda-fixnum

Licence: other
Extended-precision modular arithmetic library that targets CUDA.

Programming Languages

Cuda
1817 projects
python
139335 projects - #7 most used programming language
C++
36643 projects - #6 most used programming language
Makefile
30231 projects

Projects that are alternatives of or similar to cuda-fixnum

kodex
A privacy and security engineering toolkit: Discover, understand, pseudonymize, anonymize, encrypt and securely share sensitive and personal data: Privacy and security as code.
Stars: ✭ 70 (+79.49%)
Mutual labels:  privacy-enhancing-technologies
mobiletrackers
A repository of telemetry domains and URLs used by mobile location tracking, user profiling, targeted marketing and aggressive ads libraries.
Stars: ✭ 118 (+202.56%)
Mutual labels:  privacy-enhancing-technologies
awesome-secure-computation
Awesome list for cryptographic secure computation paper. This repo includes *Lattice*, *DifferentialPrivacy*, *MPC* and also a comprehensive summary for top conferences.
Stars: ✭ 125 (+220.51%)
Mutual labels:  privacy-enhancing-technologies
he-toolkit
The Intel Homomorphic Encryption (HE) toolkit is the primordial vehicle for the continuous distribution of the Intel HE technological innovation to users. The toolkit has been designed with usability in mind and to make it easier for users to evaluate and deploy homomorphic encryption technology on the Intel platforms.
Stars: ✭ 40 (+2.56%)
Mutual labels:  privacy-enhancing-technologies
hyperdome
the safest place to reach out
Stars: ✭ 26 (-33.33%)
Mutual labels:  privacy-enhancing-technologies
gpc-optmeowt
Browser extension for opting out from the sale and sharing of personal information per the California Consumer Privacy Act and other privacy laws
Stars: ✭ 75 (+92.31%)
Mutual labels:  privacy-enhancing-technologies
privacy-preserving-primitives
primitives and protocols for implementing privacy preserving networks
Stars: ✭ 14 (-64.1%)
Mutual labels:  privacy-enhancing-technologies
gretel-python-client
The Gretel Python Client allows you to interact with the Gretel REST API.
Stars: ✭ 28 (-28.21%)
Mutual labels:  privacy-enhancing-technologies
voco
Privacy friendly voice control for the Candle Controller / WebThings Gateway
Stars: ✭ 18 (-53.85%)
Mutual labels:  privacy-enhancing-technologies
PrivacyQA EMNLP
PrivacyQA, a resource to support question-answering over privacy policies.
Stars: ✭ 24 (-38.46%)
Mutual labels:  privacy-enhancing-technologies
Fawkes
Fawkes, privacy preserving tool against facial recognition systems. More info at https://sandlab.cs.uchicago.edu/fawkes
Stars: ✭ 4,362 (+11084.62%)
Mutual labels:  privacy-enhancing-technologies
conclave
Query compiler for secure multi-party computation.
Stars: ✭ 86 (+120.51%)
Mutual labels:  privacy-enhancing-technologies
HElib
HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizations for efficient homomorphic evaluation, focusing on effective use of ciphertext packing techniques and on the Gentry-Halevi-Smart optimizations.
Stars: ✭ 2,913 (+7369.23%)
Mutual labels:  privacy-enhancing-technologies
PrivacyRaven
Privacy Testing for Deep Learning
Stars: ✭ 111 (+184.62%)
Mutual labels:  privacy-enhancing-technologies
fluxfonts
A unique tool that blurs your device’s unique font fingerprint by continuously installing and uninstalling new fonts that are generated on the fly.
Stars: ✭ 59 (+51.28%)
Mutual labels:  privacy-enhancing-technologies
MixEth
MixEth: efficient, trustless coin mixing service for Ethereum
Stars: ✭ 49 (+25.64%)
Mutual labels:  privacy-enhancing-technologies
privapi
Detect Sensitive REST API communication using Deep Neural Networks
Stars: ✭ 42 (+7.69%)
Mutual labels:  privacy-enhancing-technologies
MOTION
An efficient, user-friendly, modular, and extensible framework for mixed-protocol secure multi-party computation with two or more parties
Stars: ✭ 59 (+51.28%)
Mutual labels:  privacy-enhancing-technologies
Hemmelig.app
Keep your sensitive information out of chat logs, emails, and more with encrypted secrets.
Stars: ✭ 183 (+369.23%)
Mutual labels:  privacy-enhancing-technologies
shifting
A privacy-focused list of alternatives to mainstream services to help the competition.
Stars: ✭ 31 (-20.51%)
Mutual labels:  privacy-enhancing-technologies

cuda-fixnum

NB: This repository is no longer maintained. All development on cuda-fixnum now takes place at unzvfu/cuda-fixnum.

cuda-fixnum is a fixed-precision SIMD library that targets CUDA. It provides the apparatus necessary to easily create efficient functions that operate on vectors of n-bit integers, where n can be much larger than the size of a usual machine or device register. Currently supported values of n are 32, 64, 128, 256, 512, 1024, and 2048 (larger values will be possible in a forthcoming release).

The primary use case for fast arithmetic of numbers in the range covered by cuda-fixnum is in cryptography and computational number theory; in particular it can form an integral part in accelerating homomorphic encryption primitives as used in privacy-preserving machine learning. As such, special attention is given to support modular arithmetic; this is used in an example implementation of the Paillier additively homomorphic encryption scheme and of elliptic curve scalar multiplication. Future releases will provide additional support for operations useful to implementing Ring-LWE-based somewhat homomorphic encryption schemes.

Finally, the library is designed to be fast. Through exploitation of warp-synchronous programming, vote functions, and deferred carry handling, the primitives of the library are currently competitive with the state-of-the-art in the literature for modular multiplication and modular exponentiation on GPUs. The design of the library allows transparent substitution of the underlying arithmetic, allowing the user to select whichever performs best on the available hardware. Moreover, several algorithms, both novel and from the literature, will be incorporated shortly that will improve performance by a further 25-50%.

The library is currently at the alpha stage of development. It has many rough edges, but most features are present and it is performant enough to be competitive. Comments, questions and contributions are welcome!

Example

To get a feel for what it's like to use the library, let's consider a simple example. Here is an implementation of encryption in the Paillier cryptosystem:

#include "functions/quorem_preinv.cu"
#include "functions/modexp.cu"

using namespace cuFIXNUM;

template< typename fixnum >
class paillier_encrypt {
    fixnum n;                     // public key
    fixnum n_sqr;                 // ciphertext modulus n^2
    modexp<fixnum> pow;           // function x |--> x^n (mod n^2)
    quorem_preinv<fixnum> mod_n2; // function x |--> x (mod n^2)

    // This is a utility whose purpose is to allow calculating 
    // and using n^2 in the constructor initialisation list.
    __device__ fixnum square(fixnum n) {
        fixnum n2;
        fixnum::sqr_lo(n2, n);
        return n2;
    }
    
public:
    /*
     * Construct an encryption functor with public key n_.
     * n_ must be less fixnum::BITS/2 bits in order for n^2
     * to fit in a fixnum.
     */
    __device__ paillier_encrypt(fixnum n_)
        : n(n_)
        , n_sqr(square(n_))
        , pow(n_sqr, n_)
        , mod_n2(n_sqr)
        { }

    /*
     * Encrypt the message m using the public key n and randomness r.
     * Precisely, return
     *
     *   ctxt  <-  (1 + m*n) * r^n  (mod n^2)
     *
     * m and r must be at most fixnum::BITS/2 bits (m is interpreted 
     * modulo n anyhow).
     */
    __device__ void operator()(fixnum &ctxt, fixnum m, fixnum r) const {
        fixnum::mul_lo(m, m, n);             // m  <-  m * n   (lo half mult)
        fixnum::incr_cy(m);                  // m  <-  1
        pow(r, r);                           // r  <-  r^n (mod n^2)
        fixnum c_hi, c_lo;                   // hi and lo halves of wide multiplication
        fixnum::mul_wide(c_hi, c_lo, m, r);  // (c_hi, c_lo)  <-  m * r (wide mult) 
        mod_n2(ctxt, c_hi, c_lo);            // ctxt  <-  (c_hi, c_lo) (mod n^2)
    }
};

A few features will be common to most user-defined functions such as the one above: They will be template function objects that rely on a fixnum, which will be instantiated with one of the fixnum arithmetic implemententations provided, usually the warp_fixnum. Functions in the fixnum class are static and (usually) return their results in the first one or two parameters. Complicated functions that might perform precomputation, such as modular exponentiation (modexp) and quotient & remainder with precomputed inverse (quorem_preinv) are instance variables in the object that are initialised in the constructor.

Although it is not (yet) the focus of this project to help optimise host-device communication, the fixnum_array facility is provided to make it easy to apply user-defined functions to data originating in the host. Using fixnum_array will often look something like this:

using namespace cuFIXNUM;

// In this case we need to wrap paillier_encrypt above to read the 
// public key from memory and pass it to the constructor.
__device__ uint8_t public_key[] = ...; // initialised earlier

template< typename fixnum >
class my_paillier_encrypt {
    paillier_encrypt<fixnum> encrypt;
    
    __device__ fixnum load_pkey() {
        fixnum pkey;
        fixnum::from_bytes(pkey, public_key, public_key_len);
        return pkey;
    }
    
public:
    __device__ my_paillier_encrypt()
        : encrypt(load_pkey())
        { }
    
    __device__ void operator()(fixnum &ctxt, fixnum m, fixnum r) const {
        fixnum c;  // Always read into a register, then set the result.
        encrypt(c, m, r);
        ctxt = c;
    }
};

void host_function() {
    ...
    // fixnum represents 256-byte numbers, using a 64-bit "basic fixnum".
    typedef warp_fixnum<256, u64_fixnum> fixnum;
    typedef fixnum_array<fixnum> fixnum_array;

    fixnum_array *ctxts, *ptxts, *rnds, *pkeys;

    int nelts = ...;  // can be as much as 1e6 to 1e8
    // Usually this could be as much as fixnum::BYTES == 256, however
    // in this application it must be at most fixnum::BYTES/2 = 128.
    int message_bytes = ...;

    // Input plaintexts
    ptxts = fixnum_array::create(input_array, message_bytes, nelts);
    // Randomness
    rands = fixnum_array::create(random_data, fixnum::BYTES/2, nelts);
    // Ciphertexts will be put here
    ctxts = fixnum_array::create(ptxts->length());

    // Map  ctxts  <-  [paillier_encrypt(p, r) for p, r in zip(ptxts, rands)]
    fixnum_array::template map<my_paillier_encrypt>(ctxts, rands, ptxts);

    // Access results.
    ctxts->retrieve_all(byte_buffer, buflen);
   ...
}

Building

The build system for cuda-fixnum is currently, shall we say, primitive. Basically you can run make bench to build the benchmarking program, or make check to build and run the test suite. The test suite requires the Google Test framework to be installed. The Makefile will read in the variables CXX and GENCODES from the environment as a convenient way to specify the C++ compiler to use and the Cuda compute capability codes that you want to compile with. The defaults are CXX = g++ and GENCODES = 50.

Benchmarks

Here is the output from a recent run of the benchmark with a GTX Titan X (Maxwell, 1GHz clock, 3072 cores):

$ bench/bench 5000000
Function: mul_lo, #elts: 5000e3
fixnum digit  total data   time       Kops/s
 bits  bits     (MiB)    (seconds)
   32    32      19.1     0.000    24630541.9
   64    32      38.1     0.000    11547344.1
  128    32      76.3     0.001     5091649.7
  256    32     152.6     0.003     1775568.2
  512    32     305.2     0.008      619578.7
 1024    32     610.4     0.030      166855.8

   64    64      38.1     0.000    14619883.0
  128    64      76.3     0.001     7824726.1
  256    64     152.6     0.002     2908667.8
  512    64     305.2     0.006      829875.5
 1024    64     610.4     0.023      221749.2
 2048    64    1220.7     0.087       57611.0


Function: mul_wide, #elts: 5000e3
fixnum digit  total data   time       Kops/s
 bits  bits     (MiB)    (seconds)
   32    32      19.1     0.000    25906735.8
   64    32      38.1     0.000    10775862.1
  128    32      76.3     0.001     3861003.9
  256    32     152.6     0.005      985998.8
  512    32     305.2     0.018      271164.4
 1024    32     610.4     0.060       83847.6

   64    64      38.1     0.000    14662756.6
  128    64      76.3     0.001     6765899.9
  256    64     152.6     0.003     1904036.6
  512    64     305.2     0.009      530278.9
 1024    64     610.4     0.036      140024.6
 2048    64    1220.7     0.129       38680.5


Function: modexp, #elts: 50e3
fixnum digit  total data   time       Kops/s
 bits  bits     (MiB)    (seconds)
   32    32       0.2     0.000      292397.7
   64    32       0.4     0.001       68306.0
  128    32       0.8     0.003       16388.1
  256    32       1.5     0.017        3015.5
  512    32       3.1     0.108         463.6
 1024    32       6.1     0.748          66.9

   64    64       0.4     0.000      113378.7
  128    64       0.8     0.002       20798.7
  256    64       1.5     0.015        3403.2
  512    64       3.1     0.105         476.8
 1024    64       6.1     0.658          76.0
 2048    64      12.2     4.959          10.1

It is interesting to note that performance is consistently better with 64-bit integer arithmetic, even though 64-bit registers are simulated on nVidia devices. Looks like my 32-bit integer arithmetic code could use some work!

Sources

The main sources for the algorithms in this library are

Author and licence

The principal author of cuda-fixnum is Dr Hamish Ivey-Law (@unzvfu). Email: hamish (at) ivey-law.name

cuda-fixnum is copyright (c) 2016-2019 Commonwealth Scientific and Industrial Research Organisation (CSIRO).

cuda-fixnum is released under a modified MIT/BSD Open Source Licence (see LICENCE for details).

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].