# Model Level: Privacy-Preserving Training & Inference via Equivariant Encryption

Securely adapting a biomedical model, whether an encoder such as BioBERT or a large language model (LLM) such as Llama, requires addressing privacy and ownership constraints.\
We distinguish two scenarios based on trust models, data ownership, and institutional roles.

<figure><img src="/files/O01M7dxUbB5fKeUl2sFs" alt=""><figcaption><p>Model Level Scenarios: (1) Federated Learning + EE Inference, (2) Two-party Encrypted Fine-tuning. Top (Scenario 1): Multiple institutions train a shared model using federated learning (FL) on local data and apply Equivariant Encryption (EE) only for inference. Bottom (Scenario 2): A pharma entity (data owner) and Rexis (model owner) jointly fine-tune a large model under EE, ensuring data and model confidentiality.</p></figcaption></figure>

***

#### Scenario 1: Federated Learning + EE Inference

In this setup, multiple institutions train a shared model using **federated learning (FL)** without sharing raw data. Each participant retains its local data and contributes only model updates, which may optionally be protected before they leave the institution. EE is applied to the final model to secure inference.

**Federated Workflow:**

1. **Local Data Stays On-Premise**\
   Each institution (e.g., hospital $$H_i$$) keeps its data locally. It pulls the current global model and computes updates (e.g., using SGD or Adam) without exposing raw data.
2. **Gradient Computation and Optional Privacy**
   * Institutions compute gradient updates $$\Delta \theta_i$$.
   * Optionally, they apply gradient clipping, differential privacy (DP), or secure aggregation to protect sensitive information.
   * EE may also be used to encrypt updates.\
     Gradient inversion attacks can reconstruct training inputs from shared updates; the protections above mitigate that risk.
3. **Aggregator Collects Updates**\
   The aggregator (e.g., a neutral coordinator or cloud provider) collects either plaintext or securely aggregated updates. It never sees raw data.
4. **Global Model Convergence**\
   The aggregator applies federated averaging to produce a new global model $$\theta_{\text{global}}$$. This cycle repeats until training converges.
5. **EE Encryption of Final Model**\
   Once converged, the model is encrypted with EE:

   $$
   A' = TAT^{-1}, \quad b' = Tb
   $$

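   where $$A$$ and $$b$$ are a layer's weight matrix and bias, and the invertible matrix $$T$$ (with its inverse $$T^{-1}$$) and the offset $$\delta$$ applied to inputs at inference are the secret keys. To parties without the keys, the EE-encrypted model appears scrambled.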
6. **Secure Inference with EE**\
   Inference is run on encrypted input:

   $$
   x' = T x + \delta, \quad y' = R(A' x' + b')
   $$

   Here $$R$$ denotes the layer's nonlinearity. Only the querying client, which holds the keys, can decrypt $$y'$$ using $$T^{-1}$$. No plaintext input or output is exposed to other parties.
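
Steps 1 to 4 above can be sketched as a single federated averaging round. This is a minimal illustration rather than the Rexis implementation: the helper names `clip_update` and `fed_avg_round`, the L2 clipping threshold, and the weighting of each institution by its local sample count are assumptions made for the example, and the DP noise and secure aggregation options from step 2 are left out.

```python
import math

def clip_update(update, max_norm):
    """Scale an update so its L2 norm is at most max_norm (hypothetical helper)."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def fed_avg_round(global_weights, client_updates, client_sizes, max_norm=1.0):
    """One round: clip each client's delta, then take a size-weighted average."""
    total = sum(client_sizes)
    averaged = [0.0] * len(global_weights)
    for update, size in zip(client_updates, client_sizes):
        clipped = clip_update(update, max_norm)
        for i, u in enumerate(clipped):
            averaged[i] += (size / total) * u
    # The aggregator only ever handles updates, never raw records.
    return [w + d for w, d in zip(global_weights, averaged)]

# Three hypothetical hospitals, each sending a delta computed on local data.
theta_global = [0.0, 0.0]
updates = [[0.3, 0.4], [3.0, 4.0], [-0.6, 0.0]]  # the second one gets clipped
sizes = [100, 100, 200]
theta_global = fed_avg_round(theta_global, updates, sizes)
```

Clipping before averaging bounds how much any single institution can move the global model, which is also the usual precondition for adding calibrated DP noise.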

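Steps 5 and 6 can be checked end to end on a single layer. The sketch below makes assumptions that the text does not state: `T` is a permutation matrix (so its inverse is its transpose and it commutes with an elementwise nonlinearity), `R` is a ReLU, and the bias is additionally shifted by `-A' delta`. That last correction is specific to this sketch: with `b' = T b` exactly as written, a nonzero `delta` would leave a residual `A' delta` term in the pre-activation.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    cols = list(zip(*N))
    return [[sum(m * n for m, n in zip(row, col)) for col in cols] for row in M]

def relu(v):
    return [max(x, 0.0) for x in v]

# Plaintext layer y = R(A x + b); R is assumed to be an elementwise ReLU.
A = [[1.0, -2.0, 0.5], [0.0, 1.5, -1.0], [2.0, 0.0, 1.0]]
b = [0.1, -0.2, 0.3]
x = [1.0, 2.0, -1.0]
y = relu([s + t for s, t in zip(matvec(A, x), b)])

# Secret keys. Assumption: T is a permutation matrix, so T^{-1} = T^T.
T = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
T_inv = [list(col) for col in zip(*T)]
delta = [0.7, -1.3, 2.1]

# Step 5: encrypt the model (A' = T A T^{-1}, b' = T b).
A_enc = matmul(matmul(T, A), T_inv)
b_enc = matvec(T, b)
# Assumption of this sketch: fold -A' delta into the bias so the input
# offset cancels in the pre-activation.
b_enc = [s - t for s, t in zip(b_enc, matvec(A_enc, delta))]

# Step 6: the client encrypts x, the server evaluates, the client decrypts.
x_enc = [s + t for s, t in zip(matvec(T, x), delta)]
y_enc = relu([s + t for s, t in zip(matvec(A_enc, x_enc), b_enc)])
y_dec = matvec(T_inv, y_enc)  # matches y up to floating-point rounding
```

With a dense invertible `T` instead of a permutation, ReLU would no longer commute with `T`, so decrypting with `T_inv` would not recover `y` in general.
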
***

#### Compliance and Privacy Summary

* **Local-Only Training Data**\
  All raw data stays on-premise, which supports HIPAA/GDPR compliance.
* **Optional Gradient Security**\
  DP or secure aggregation may be used for further gradient protection.
* **Encrypted Inference with EE**\
  No intermediate activations or outputs are revealed post-training.
* **Low Runtime Overhead**\
  EE keeps runtime complexity similar to that of standard inference, unlike homomorphic encryption (HE).
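
The secure aggregation option can be illustrated with pairwise additive masking, a common construction in which each pair of institutions shares a random mask that one adds and the other subtracts, so the masks cancel in the sum. The sketch below is an assumption about how such a step might look, not a description of the Rexis protocol: updates are already quantized to integers, arithmetic is modulo 2^32, and a seeded `random.Random` stands in for a real pairwise key agreement. The helper names are hypothetical.

```python
import random

MOD = 2 ** 32  # all arithmetic is done modulo a fixed ring size

def mask_updates(updates, seed=0):
    """Return masked copies of integer update vectors (hypothetical helper).

    For every pair i < j, a shared random vector is added to client i's
    update and subtracted from client j's, so the masks cancel in the sum.
    """
    rng = random.Random(seed)  # stand-in for pairwise key agreement
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            pair_mask = [rng.randrange(MOD) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + pair_mask[k]) % MOD
                masked[j][k] = (masked[j][k] - pair_mask[k]) % MOD
    return masked

def aggregate(vectors):
    """Sum vectors coordinate-wise modulo MOD, as the aggregator would."""
    return [sum(col) % MOD for col in zip(*vectors)]

# Three hypothetical clients with quantized (integer) updates.
updates = [[5, 10], [7, 1], [2, 2]]
masked = mask_updates(updates)
total = aggregate(masked)  # equals the plain sum of the updates
```

The aggregator learns only `total`; each individual masked vector is offset by random values it cannot remove on its own.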

***

#### Scenario 2: Two-Party Fine-Tuning with Encrypted Data (Pharma + Rexis)

Here, a pharma company owns sensitive data and collaborates with Rexis for compute (e.g., GPU-heavy fine-tuning). The model may remain in plaintext, but data is encrypted via EE.

**Workflow:**

1. **Model Transfer (Optional)**\
   Pharma and Rexis agree on a model. Pharma may hold plaintext weights—no model encryption is required.
2. **Local Data Encryption**\
   Each training batch is encrypted locally:

   $$
   x' = T x + \delta
   $$

   The encrypted batch $$x'$$ is sent to Rexis's high-performance computing (HPC) cluster.
3. **Encrypted Forward + Backprop**\
   Rexis computes the following without access to raw inputs or plaintext activations:
   * Forward pass using $$x'$$ and the model weights
   * Backward pass and weight updates
4. **Loss and Update**
   * Encrypted outputs can be returned to Pharma, which computes the loss locally.
   * Alternatively, Rexis completes all updates under EE.
   * The updated model may be stored in plaintext or encrypted.
5. **Iterative Fine-Tuning**\
   Pharma continues sending encrypted batches to Rexis. Rexis trains without seeing unencrypted data or gradients.
6. **Model Retrieval**\
   Pharma retrieves the final model in plaintext or decrypts it locally.
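
The workflow above can be illustrated on a toy linear model. The sketch rests on several assumptions that the text does not specify: `T` is a permutation matrix, the model is a single linear layer trained with plain SGD, and labels are shared with Rexis in the clear purely to keep the example short (returning outputs to Pharma for the loss, as in step 4, would avoid that). It shows one way parameters trained on encrypted inputs could be mapped back to plaintext input space by the key holder.

```python
import random

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# --- Pharma: secret key. T is assumed to be a permutation matrix here.
T = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
delta = [0.5, -1.0, 2.0]

def encrypt(x):
    """x' = T x + delta, applied to one example."""
    return [t + d for t, d in zip(matvec(T, x), delta)]

# Toy plaintext data that stays with Pharma.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
Y = [dot([2.0, -1.0, 0.5], row) for row in X]
X_enc = [encrypt(row) for row in X]  # sent to Rexis (labels too, for brevity)

# --- Rexis: ordinary SGD on a linear model over the encrypted features.
w_enc, b_enc, lr = [0.0, 0.0, 0.0], 0.0, 0.01
for _ in range(50):
    for xe, y in zip(X_enc, Y):
        err = dot(w_enc, xe) + b_enc - y
        w_enc = [w - lr * err * v for w, v in zip(w_enc, xe)]
        b_enc -= lr * err

# --- Pharma: map the trained parameters back to plaintext input space, using
# w' . (T x + delta) + b' == (T^T w') . x + (b' + w' . delta)
T_transpose = [list(col) for col in zip(*T)]
w_plain = matvec(T_transpose, w_enc)
b_plain = b_enc + dot(w_enc, delta)
```

In this sketch only the key holder can perform the final mapping, since it needs both `T` and `delta`. Encrypted batches have the same shape as the plaintext ones, so the training loop on the compute side is unchanged.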

***

#### Security Guarantees & Trade-offs

* **Data Privacy**\
  Rexis sees only encrypted data $$x'$$, never plaintext.
* **Model Transparency**\
  The model can remain in plaintext on Pharma’s side—no encryption required.
* **Low Local Overhead**\
  Pharma encrypts data locally; Rexis handles compute.
* **Efficient Performance**\
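  An EE ciphertext $$x' = Tx + \delta$$ has the same dimensions as its plaintext, so there is no ciphertext expansion and runtime stays close to that of unencrypted training.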

***

#### When to Use This

* A single institution holds sensitive data
* It lacks internal compute capacity
* It requires data privacy without needing to hide model weights
* The compute provider (Rexis) is semi-trusted or untrusted

By leveraging **Equivariant Encryption (EE)**, pharma institutions can use high-performance compute resources securely—ensuring privacy while retaining model ownership and full control.

