Enchecap

An encrypted (enclave-based) heterogeneous computation protocol built on NVIDIA CUDA and Intel SGX, with a simple cuBLAS sample. Designed and implemented by Tinghao Xie, Haoyang Shi, and Zihang Li.

Enchecap illustration:

[Figure: demo]

Enchecap illustration (with protected and trusted regions):

[Figure: demo]

Note: The user-server part is not yet implemented in the library or the sample, since it is similar to the host-device part of the protocol. For now, only the host-device part is implemented. This repository shows how to wrap cudaMemcpy() into secureCudaMemcpy(), which performs encryption and decryption implicitly for convenient secure deployment.
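The wrapping idea can be illustrated with a minimal host-only sketch. It is not the repository's secureCudaMemcpy(): the names secure_memcpy_sketch, CopyKind, ToyKeys, and toy_crypt are hypothetical, std::memcpy stands in for cudaMemcpy() so the example runs without a GPU, and a one-byte XOR stands in for the real cipher. The only point is that the caller passes plaintext and gets plaintext back, while only ciphertext crosses the untrusted copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical direction flag, loosely mirroring cudaMemcpyKind.
enum class CopyKind { HostToDevice, DeviceToHost };

// Hypothetical toy keys. A one-byte XOR key replaces each RSA key pair.
struct ToyKeys {
    std::uint8_t sgx;  // stands in for the enclave's key pair
    std::uint8_t gpu;  // stands in for the GPU's key pair
};

// Placeholder cipher (NOT secure): XOR is its own inverse, so the same
// call both encrypts and decrypts.
static void toy_crypt(std::uint8_t* buf, std::size_t n, std::uint8_t key) {
    for (std::size_t i = 0; i < n; ++i) buf[i] ^= key;
}

// Copies n bytes from src to dst so that only ciphertext is in transit.
// Returns the bytes that crossed the "bus" so a caller can inspect them.
std::vector<std::uint8_t> secure_memcpy_sketch(void* dst, const void* src,
                                               std::size_t n, CopyKind kind,
                                               const ToyKeys& keys) {
    assert(dst != nullptr && src != nullptr);
    // The sender encrypts for the receiver: GPU key for host-to-device,
    // enclave key for device-to-host.
    const std::uint8_t key =
        (kind == CopyKind::HostToDevice) ? keys.gpu : keys.sgx;

    const auto* in = static_cast<const std::uint8_t*>(src);
    std::vector<std::uint8_t> transit(in, in + n);
    toy_crypt(transit.data(), n, key);                   // sender: encrypt

    std::memcpy(dst, transit.data(), n);                 // stand-in for cudaMemcpy
    toy_crypt(static_cast<std::uint8_t*>(dst), n, key);  // receiver: decrypt
    return transit;
}
```

In a real deployment the sender-side step would run inside the enclave (or on the GPU), and the receiver-side step would run in a kernel on the device (or inside the enclave), rather than in one host function.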

Phase I: Initialization

  • Create an enclave
  • The enclave generates its own keys (key generation is currently a stub), then broadcasts its public key to the user and the device
  • The GPU generates its own keys (key generation is currently a stub), then broadcasts its public key to the host and the user

Phase II: Calculation

  • En/Decrypt in enclave (decrypt with SGX’s private key, encrypt with GPU’s public key)
  • En/Decrypt on GPU (decrypt with GPU’s private key, encrypt with SGX’s public key)
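The re-encryption step in Phase II can be sketched with textbook RSA on tiny numbers. This is an illustration under stated assumptions, not the repository's code: RsaKey, mod_pow, rsa_encrypt, rsa_decrypt, enclave_reencrypt, and the two key constants are hypothetical, the primes are far too small to be secure, and no padding is applied. Each message must be smaller than both moduli.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical key holder. A real party would hold only its own d and
// the other party's (n, e).
struct RsaKey {
    std::uint64_t n;  // modulus
    std::uint64_t e;  // public exponent
    std::uint64_t d;  // private exponent
};

// Toy keys for illustration only (insecure):
// SGX: p = 61, q = 53, n = 3233; GPU: p = 89, q = 97, n = 8633.
const RsaKey kSgxKey{3233, 17, 2753};
const RsaKey kGpuKey{8633, 5, 5069};

// Square-and-multiply modular exponentiation. Safe from overflow here
// because mod stays far below 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

std::uint64_t rsa_encrypt(std::uint64_t m, const RsaKey& pub) {
    assert(m < pub.n);  // textbook RSA requires m < n
    return mod_pow(m, pub.e, pub.n);
}

std::uint64_t rsa_decrypt(std::uint64_t c, const RsaKey& priv) {
    return mod_pow(c, priv.d, priv.n);
}

// Enclave step: decrypt with the SGX private key, then encrypt with the
// GPU public key (only gpu.n and gpu.e are used).
std::uint64_t enclave_reencrypt(std::uint64_t c, const RsaKey& sgx,
                                const RsaKey& gpu) {
    return rsa_encrypt(rsa_decrypt(c, sgx), gpu);
}
```

A value encrypted for the enclave with rsa_encrypt(m, kSgxKey), passed through enclave_reencrypt, and then decrypted with rsa_decrypt(c, kGpuKey) yields m again. The GPU-side step for returning results is symmetric, with the roles of the two keys swapped.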

Enchecap performance:

[Figure: performance]


Installation

To build the project, you’ll need to install and configure:

  • SGX SDK
  • CUDA Toolkit
  • CUDA Samples

Then set CUDA_PATH and INCLUDES in the Makefile, and make sure your SGX environment is activated with:

source /PATH_OF_SGXSDK/environment

(see the official SGX SDK site for more details)

Then build with:

make # SGX hardware mode

if your CPU and BIOS support SGX, or build the SGX app in simulation mode with:

make SGX_MODE=SIM  # SGX simulation mode

(check README_SGX.txt for more details)


To run the project, you’ll need to correctly install and configure:

  • SGX PSW
  • SGX driver, if you build in hardware mode and your CPU and BIOS support SGX
  • CUDA driver (an NVIDIA GPU is required)

Run with:

./app

Future Work

  • The GPU’s and SGX’s keys are currently hard-coded; this needs to be fixed
  • The current RSA encryption/decryption algorithm is still extremely naive (further work includes regrouping, big-number support…)
  • Add the user-server part into the sample, including
    • Remote attestation with Intel SGX
    • Broadcast the user’s public key to the enclave and GPU, and record their public keys
    • Send encrypted data to the server
    • Receive encrypted results from the server
  • Integration with real-world CUDA-based software (such as PyTorch)
  • Integration with a real trusted GPU (far from our reach for now)