Model Level: Privacy-Preserving Training & Inference via Equivariant Encryption

Securely adapting a biomedical model—whether a transformer like BioBERT or a large language model (LLM) like Llama—requires addressing privacy and ownership constraints. We distinguish two scenarios based on trust models, data ownership, and institutional roles.


Scenario 1: Federated Learning + EE Inference

In this setup, multiple institutions train a shared model using federated learning (FL) without sharing raw data. Each participant retains local data and contributes encrypted updates. EE is applied to the final model to secure inference.

Federated Workflow:

  1. Local Data Stays On-Premise: Each institution (e.g., hospital ( H_i )) keeps its data locally. It pulls the current global model and computes updates (e.g., using SGD or Adam) without exposing raw data.

  2. Gradient Computation and Optional Privacy

    • Institutions compute gradient updates ( \Delta \theta_i ).

    • Optionally apply gradient clipping, differential privacy (DP), or secure aggregation to protect sensitive info.

    • EE may also be used to encrypt the updates themselves. Gradient inversion attacks exist, but they can be mitigated with the measures above (clipping, DP noise, secure aggregation).

  3. Aggregator Collects Updates: The aggregator (e.g., a neutral coordinator or cloud provider) collects either plaintext or securely aggregated updates. It never sees raw data.

  4. Global Model Convergence: The aggregator applies federated averaging to produce a new global model ( \theta_{\text{global}} ). This cycle repeats until training converges (see the averaging sketch after this list).

  5. EE Encryption of Final Model: Once converged, the model is encrypted with EE:

    A' = T A T^{-1}, \quad b' = T b

    where ( T ), ( T^{-1} ), and ( \delta ) are secret keys. The EE-encrypted model appears scrambled to untrusted parties.

  6. Secure Inference with EE: Inference is run on encrypted input:

    x' = T x + \delta, \quad y' = R(A' x' + b')

    Only the querying client can decrypt ( y' ) using ( T^{-1} ). No raw input/output is exposed to others.
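
The aggregation in step 4 is just parameter averaging. Below is a minimal, illustrative sketch of a few federated rounds for a linear model; the three-site setup, the data, and the local_update helper are hypothetical, and a real deployment would layer secure aggregation, DP noise, or EE onto the exchanged updates.

```python
# Minimal federated-averaging sketch for Scenario 1 (illustrative only).
# The three-site setup, data, and local_update helper are hypothetical;
# real deployments add secure aggregation, DP noise, and/or EE on updates.
import numpy as np

rng = np.random.default_rng(0)
d = 8

def local_update(theta_global, X, y, lr=0.1):
    """One local SGD step on a linear model; raw data never leaves the site."""
    pred = X @ theta_global
    grad = X.T @ (pred - y) / len(y)        # gradient of the MSE loss
    return theta_global - lr * grad          # locally updated parameters

# Three institutions, each holding private data on-premise.
sites = [(rng.normal(size=(32, d)), rng.normal(size=32)) for _ in range(3)]
theta_global = np.zeros(d)

for _ in range(5):                           # federated rounds
    # Each site pulls the global model and trains on its own data.
    local_models = [local_update(theta_global, X, y) for X, y in sites]
    # The aggregator sees only parameters (or deltas), never raw data.
    theta_global = np.mean(local_models, axis=0)   # federated averaging
```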


Compliance and Privacy Summary

  • Local-Only Training Data: All raw data stays on-premise, supporting HIPAA/GDPR compliance.

  • Optional Gradient Security: DP or secure aggregation may be used for further gradient protection.

  • Encrypted Inference with EE: No intermediate activations or outputs are revealed post-training (see the sketch after this list).

  • No Runtime Overhead: EE maintains runtime complexity similar to standard inference, unlike HE.
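
As a concrete check of the encrypted-inference claim, the toy example below verifies that decrypting ( y' ) with ( T^{-1} ) recovers the plaintext output ( y = R(Ax + b) ) of a single layer. It assumes, for simplicity, that ( T ) is a permutation matrix and ( \delta = 0 ), so the elementwise nonlinearity commutes with ( T ); the production EE scheme is more general than this sketch.

```python
# Toy check that EE inference matches plaintext inference after decryption.
# Assumptions (ours, not the source's): T is a permutation matrix and
# delta = 0, so an elementwise nonlinearity commutes with T. Illustrates
#   A' = T A T^{-1},  b' = T b,  x' = T x,  y' = R(A' x' + b').
import numpy as np

rng = np.random.default_rng(1)
d = 6

# Plaintext layer: y = R(A x + b), with R = ReLU.
A = rng.normal(size=(d, d))
b = rng.normal(size=d)
relu = lambda z: np.maximum(z, 0.0)

# Secret key: a random permutation matrix T (so T^{-1} = T.T).
T = np.eye(d)[rng.permutation(d)]
T_inv = T.T

# Encrypt the trained model once, after federated training converges.
A_enc = T @ A @ T_inv
b_enc = T @ b

# The client encrypts its query; the server computes on ciphertext only.
x = rng.normal(size=d)
x_enc = T @ x
y_enc = relu(A_enc @ x_enc + b_enc)

# Only the key holder can decrypt the result.
y = T_inv @ y_enc
assert np.allclose(y, relu(A @ x + b))   # matches plaintext inference
```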


Scenario 2: Single-Party Fine-Tuning with Encrypted Data (Pharma + Rexis)

Here, a pharma company owns sensitive data and collaborates with Rexis for compute (e.g., GPU-heavy fine-tuning). The model may remain in plaintext, but data is encrypted via EE.

Workflow:

  1. Model Transfer (Optional): Pharma and Rexis agree on a model. Pharma may hold plaintext weights; no model encryption is required.

  2. Local Data Encryption: Each training batch is encrypted locally:

    x' = T x + \delta

    The encrypted batch ( x' ) is sent to Rexis's HPC cluster.

  3. Encrypted Forward + Backprop. Rexis computes:

    • The forward pass using ( x' ) and the model weights

    • The backward pass and parameter updates

    • All without access to raw inputs or activations

  4. Loss and Update

    • Encrypted outputs can be returned to Pharma, which computes the loss locally

    • Alternatively, Rexis completes all updates under EE

    • The updated model may be stored in plaintext or encrypted

  5. Iterative Fine-Tuning: Pharma continues sending encrypted batches to Rexis. Rexis trains without seeing unencrypted data or gradients (a client-side sketch follows this list).

  6. Model Retrieval: Pharma retrieves the final model in plaintext or decrypts it locally.
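
A minimal sketch of the client (pharma) side of this loop is shown below. It applies the batch encryption ( x' = T x + \delta ) from step 2 and marks where the remote calls would go; the rexis_* helpers are hypothetical placeholders, and the provider-side arithmetic that runs the forward and backward passes on ciphertext is abstracted away.

```python
# Client-side (pharma) sketch of Scenario 2, illustrative only.
# Assumptions (ours): T is a permutation matrix, and returned outputs are
# encrypted with the same keys so the client can decrypt them locally.
# The rexis_* calls are hypothetical placeholders for the provider's API.
import numpy as np

rng = np.random.default_rng(2)
d = 16

# Secret EE keys, held only by the data owner.
T = np.eye(d)[rng.permutation(d)]         # permutation key, so T^{-1} = T.T
delta = rng.normal(size=d)

def encrypt(X):
    """Apply x' = T x + delta to each row; shape and dtype are unchanged."""
    return X @ T.T + delta

def decrypt(Y_enc):
    """Invert the transform on returned outputs: y = T^{-1} (y' - delta)."""
    return (Y_enc - delta) @ T

for _ in range(10):                        # fine-tuning iterations
    X = rng.normal(size=(32, d))           # private plaintext batch, stays local
    X_enc = encrypt(X)                      # only ciphertext leaves the site
    # Y_enc = rexis_forward(X_enc)          # provider: encrypted forward pass
    # loss = pharma_loss(decrypt(Y_enc))    # client: decrypt outputs, score loss
    # rexis_apply_update(...)               # provider: update weights under EE
```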


Security Guarantees & Trade-offs

  • Data Privacy: Rexis sees only encrypted data ( x' ), never plaintext.

  • Model Transparency: The model can remain in plaintext on Pharma’s side; no encryption is required.

  • Low Local Overhead: Pharma encrypts data locally; Rexis handles the heavy compute.

  • Efficient Performance: EE avoids ciphertext expansion and keeps runtime close to standard plaintext execution (see the check after this list).
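
The no-expansion claim can be illustrated directly: under the affine transform ( x' = T x + \delta ), the ciphertext has the same shape, dtype, and byte size as the plaintext. A small self-contained check (the permutation key and batch are, again, illustrative assumptions):

```python
# Ciphertext size equals plaintext size under EE's affine transform
# (contrast with HE, where ciphertexts are typically much larger).
import numpy as np

rng = np.random.default_rng(3)
d = 16
T = np.eye(d)[rng.permutation(d)]     # secret permutation key
delta = rng.normal(size=d)

X = rng.normal(size=(32, d))          # plaintext batch
X_enc = X @ T.T + delta                # x' = T x + delta, row-wise

assert X_enc.shape == X.shape          # no dimensional blow-up
assert X_enc.dtype == X.dtype          # still ordinary floats
assert X_enc.nbytes == X.nbytes        # identical memory footprint
```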


When to Use This

  • A single institution holds sensitive data

  • It lacks internal compute capacity

  • It requires data privacy without needing to hide model weights

  • The compute provider (Rexis) is semi-trusted or untrusted

By leveraging Equivariant Encryption (EE), pharma institutions can use high-performance compute resources securely—ensuring privacy while retaining model ownership and full control.

Model Level Scenarios: (1) Federated Learning + EE Inference, (2) Two-party Encrypted Fine-tuning. Top (Scenario 1): Multiple institutions train a shared model using federated learning (FL) on local data and apply Equivariant Encryption (EE) only for inference. Bottom (Scenario 2): A pharma entity (data owner) and Rexis (model owner) jointly fine-tune a large model under EE, ensuring data and model confidentiality.