154 changes: 154 additions & 0 deletions mongodb-qe-tutorial/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,154 @@
# MongoDB Queryable Encryption Tutorial (Python)
**Automatic Client-Side Field Level Encryption with Azure Key Vault – Including CMK Rotation in Atlas**

## Overview

This repository demonstrates how to set up [MongoDB Queryable Encryption (QE)](https://www.mongodb.com/docs/manual/core/queryable-encryption/#std-label-qe-manual-feature-qe) using Python and Azure Key Vault, including secure Data Encryption Key (DEK) management and rewrapping after Customer Master Key (CMK) rotation in MongoDB Atlas.

Queryable Encryption allows you to **encrypt sensitive data client side**, perform expressive queries on encrypted fields, and manage your encryption keys securely with cloud KMS providers such as Azure Key Vault.

## Features

- **Create encrypted MongoDB collections** with [automatic encryption](https://www.mongodb.com/docs/manual/core/queryable-encryption/install-library/#std-label-qe-csfle-install-library)
- **Encrypt and decrypt fields transparently** in application code
- Use [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) for secure key management (CMK)
- **Rewrap DEKs** (change key under which your encrypted keys are wrapped) after CMK rotation
- Full Python demo including helper functions, insertion, and querying

## Prerequisites

### Software

- **Python 3**
- [MongoDB Atlas Cluster](https://www.mongodb.com/cloud/atlas/register)
- [PyMongo Driver](https://www.mongodb.com/docs/languages/python/pymongo-driver/current/) (`>=4.4`)
- [pymongocrypt](https://pypi.org/project/pymongocrypt/) (`>=1.6`)
- Automatic Encryption Shared Library ([crypt_shared](https://www.mongodb.com/docs/manual/core/queryable-encryption/install-library/#automatic-encryption-shared-library))

### Cloud Providers (Azure)

- [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) with your **CMK**
- [Register your application in Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app)
- Assign the application the **Key Vault Administrator** role, or permissions to wrap/unwrap keys

### Other Supported KMS Providers
- AWS, GCP, KMIP, or local (see `.env` placeholders)

---

## Getting Started

### 1. Clone This Repository

```bash
git clone https://github.com/<your-org>/<your-repo>.git
cd <your-repo>/mongodb-qe-tutorial
```

### 2. Populate Environment Variables

Edit the **.env** file and replace all placeholder values (`<Your ...>`) with your credentials.

```bash
# Azure Example:
export AZURE_TENANT_ID="<Your Azure tenant ID>"
export AZURE_CLIENT_ID="<Your Azure client ID>"
export AZURE_CLIENT_SECRET="<Your Azure client secret>"
export AZURE_KEY_NAME="<Your Azure Key Name>"
export AZURE_KEY_VERSION="<Your Azure Key Version>"
export AZURE_KEY_VAULT_ENDPOINT="<Your Azure Key Vault Endpoint>"
export KEY_VAULT_MONGODB_URI="<Your Atlas Connection String>"
export MONGODB_URI="<Your Atlas Connection String>"
export SHARED_LIB_PATH="/full/path/to/mongo_crypt_v1.so"
...
```

See `.env` in repo for a full example including other KMS providers.
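
Before running any script, it can help to confirm that no placeholder was left behind. The following is a minimal sketch, not part of the repository: `missing_variables` and `azure_kms_providers` are hypothetical helper names, and the repository's own `queryable_encryption_helpers.py` may build its credentials differently. The variable names come from the `.env` example above; the `tenantId`, `clientId`, and `clientSecret` keys are the fields the Azure KMS provider expects.

```python
import os

# Names taken from the Azure example above. AZURE_KEY_VERSION is left out
# because this sketch assumes it is optional.
REQUIRED_AZURE_VARS = (
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "AZURE_CLIENT_SECRET",
    "AZURE_KEY_NAME",
    "AZURE_KEY_VAULT_ENDPOINT",
    "MONGODB_URI",
    "KEY_VAULT_MONGODB_URI",
    "SHARED_LIB_PATH",
)


def missing_variables(env=None, required=REQUIRED_AZURE_VARS):
    """Return names that are unset, empty, or still a '<Your ...>' placeholder."""
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "").strip()
        if not value or value.startswith("<Your"):
            missing.append(name)
    return missing


def azure_kms_providers(env=None):
    """Build the KMS provider credentials document for Azure (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        "azure": {
            "tenantId": env["AZURE_TENANT_ID"],
            "clientId": env["AZURE_CLIENT_ID"],
            "clientSecret": env["AZURE_CLIENT_SECRET"],
        }
    }
```

Calling `missing_variables()` with no arguments checks the current process environment, so it should run after the `.env` file has been loaded.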

### 3. Install Python Dependencies

```bash
python -m pip install -r requirements.txt
```

### 4. Download Automatic Encryption Shared Library

Follow [these instructions](https://www.mongodb.com/docs/manual/core/queryable-encryption/install-library/#automatic-encryption-shared-library) to download the correct `mongo_crypt_v1.so` (or `.dylib` for Mac) for your system, and record its full path in your `.env`.

---

## Usage

### Step 1: Create Key Vault and Encrypted Collection

This script creates the **key vault collection** (to hold your DEKs) and sets up an **encrypted collection** for your data.

```bash
python create_encrypted_collections.py
```
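
The encrypted collection is defined by a description of which fields are encrypted and which of them can be queried. A minimal sketch of that description follows, using the field paths from `create_encrypted_collections.py`. The function names `build_encrypted_fields` and `queryable_paths` are hypothetical and exist only for illustration; the key IDs are whatever the script obtained when it created the DEKs.

```python
def build_encrypted_fields(ssn_key_id, billing_key_id):
    """Describe the encrypted fields of the patients collection."""
    return {
        "fields": [
            {
                # Encrypted and queryable with equality matches.
                "path": "patientRecord.ssn",
                "bsonType": "string",
                "queries": [{"queryType": "equality"}],
                "keyId": ssn_key_id,
            },
            {
                # Encrypted only; no "queries" entry, so it cannot be queried.
                "path": "patientRecord.billing",
                "bsonType": "object",
                "keyId": billing_key_id,
            },
        ]
    }


def queryable_paths(encrypted_fields):
    """List the field paths that declare at least one query type."""
    return [f["path"] for f in encrypted_fields["fields"] if f.get("queries")]
```

Each field gets its own DEK, so rotating or revoking one key does not affect the other field.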

### Step 2: Insert Encrypted Document

This script uses automatic encryption to insert a document with encrypted fields.

```bash
python insert_encrypted_doc.py
```

**Sample output:**
```plaintext
Successfully inserted another patient with ssn: 123-45-6789
{...decrypted document...}
```
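
With automatic encryption, application code works with plaintext values and the driver encrypts and decrypts them. As a rough sketch, the document and the equality filter might look like the following. The `patientRecord.ssn` and `patientRecord.billing` paths come from `create_encrypted_collections.py`; the `patientName` field, the billing layout, and both function names are assumptions made for illustration and may not match `insert_encrypted_doc.py`.

```python
def build_patient(name, ssn, billing):
    """Build a plaintext patient document; encryption happens in the driver."""
    return {
        "patientName": name,  # hypothetical unencrypted field
        "patientRecord": {
            "ssn": ssn,          # encrypted, equality-queryable
            "billing": billing,  # encrypted, not queryable
        },
    }


def ssn_filter(ssn):
    """Equality filter on the encrypted SSN, written like any other query."""
    return {"patientRecord.ssn": ssn}


patient = build_patient(
    "Jon Doe",
    "123-45-6789",
    {"type": "Visa", "number": "4111111111111111"},
)
query = ssn_filter("123-45-6789")
```

On a client configured for automatic encryption, `patient` would be passed to `insert_one` and `query` to `find_one` unchanged.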

### Step 3: Rotate Your CMK in Azure Key Vault

- Use the Azure Portal to [rotate your root key](https://learn.microsoft.com/en-us/azure/key-vault/keys/change-key-version).
- Record the new version in your `.env` if needed.

### Step 4: Rewrap Data Encryption Keys (DEKs)

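After CMK rotation, rewrap all the DEKs in the key vault collection. Rewrapping re-encrypts each DEK under the new version of your master key. The DEKs themselves do not change, so existing encrypted data stays readable and does not need to be re-encrypted.

If needed, edit `rewrap_deks.py` with your new CMK details, then run it: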

```bash
python rewrap_deks.py
```
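
The core of a rewrap can be sketched with PyMongo's `ClientEncryption.rewrap_many_data_key`. This is an illustration under stated assumptions, not the contents of `rewrap_deks.py`: `azure_master_key` and `rewrap_all_deks` are hypothetical names, and `client_encryption` is assumed to be an already configured `pymongo.encryption.ClientEncryption` instance. The environment variable names come from the `.env` example.

```python
import os


def azure_master_key(env=None):
    """Build the Azure master key document from environment variables."""
    env = os.environ if env is None else env
    master_key = {
        "keyVaultEndpoint": env["AZURE_KEY_VAULT_ENDPOINT"],
        "keyName": env["AZURE_KEY_NAME"],
    }
    # keyVersion is optional; include it only when one is configured.
    version = env.get("AZURE_KEY_VERSION", "").strip()
    if version:
        master_key["keyVersion"] = version
    return master_key


def rewrap_all_deks(client_encryption, master_key, provider="azure"):
    """Rewrap every DEK in the key vault under the given master key.

    An empty filter ({}) matches all key documents in the key vault collection.
    """
    result = client_encryption.rewrap_many_data_key(
        {}, provider=provider, master_key=master_key
    )
    # bulk_write_result is None when no key documents matched the filter.
    return result.bulk_write_result
```

To rewrap a single key instead of all of them, the empty filter could be replaced with one that matches on `keyAltNames`.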

---

## Troubleshooting

### Common Issues

- **"Not all keys were satisfied":**
If demo code is run multiple times without dropping collections, documents may be encrypted under keys that are lost or missing. Drop your vault and collection, restart, and generate keys once.

- **Shared library load errors:**
Example:
```
Error while opening candidate for crypt_shared dynamic library [/path/mongo_crypt_v1.so]
```
- Ensure your library matches your OS and CPU arch (`file mongo_crypt_v1.so`, `uname -a`)
- Path must be correct and the file must be present
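
The checks above can be partly automated. The following sketch is not part of the repository, and `shared_lib_problems` is a hypothetical name. It only verifies that the path is absolute, that the file exists and is readable, and that its extension matches the current operating system; it cannot confirm that the library was built for your CPU architecture.

```python
import os
import platform

# File extensions of the crypt_shared library on each platform.
EXPECTED_SUFFIX = {"Linux": ".so", "Darwin": ".dylib", "Windows": ".dll"}


def shared_lib_problems(path):
    """Return a list of human-readable problems with the configured path."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable")
    suffix = EXPECTED_SUFFIX.get(platform.system())
    if suffix and not path.endswith(suffix):
        problems.append(f"expected a {suffix} file on {platform.system()}")
    return problems
```

A typical call would be `shared_lib_problems(os.environ["SHARED_LIB_PATH"])`; an empty list means none of these basic checks failed.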

---

## File Reference

- `requirements.txt` – Python package requirements
- `.env` – Environment variables for all supported KMS providers
- `queryable_encryption_helpers.py` – Helper functions for KMS credentials and encryption setup
- `create_encrypted_collections.py` – Create vault, DEKs, and encrypted collection
- `insert_encrypted_doc.py` – Insert and query encrypted documents
- `rewrap_deks.py` – Rewrap DEKs after master key rotation

---

## References & Documentation

- [Queryable Encryption Tutorials](https://www.mongodb.com/docs/manual/core/queryable-encryption/tutorials/#queryable-encryption-tutorials)
- [Queryable Encryption Quick Start](https://www.mongodb.com/docs/manual/core/queryable-encryption/quick-start/#queryable-encryption-quick-start)
- [MongoDB Atlas](https://www.mongodb.com/docs/atlas/)
- [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview)
119 changes: 119 additions & 0 deletions mongodb-qe-tutorial/create_encrypted_collections.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
from pymongo import MongoClient  # Client class for connecting to MongoDB servers/clusters
import queryable_encryption_helpers as helpers  # Helper functions from this tutorial
import os  # For reading environment variables
from dotenv import load_dotenv  # Loads variables from a .env file into the environment

load_dotenv()  # Load the values from the .env file

# start-setup-application-variables
kms_provider_name = "azure"

# URIs for Atlas clusters
key_vault_uri = os.environ['KEY_VAULT_MONGODB_URI'] # Key Vault Cluster!
data_uri = os.environ['MONGODB_URI'] # Application Data Cluster!

key_vault_database_name = "queryable_encryption"
key_vault_collection_name = "queryable_keyVault"
key_vault_namespace = f"{key_vault_database_name}.{key_vault_collection_name}"
encrypted_database_name = "mongoMedicalRecords"
encrypted_collection_name = "mongoDBpatients"



kms_provider_credentials = helpers.get_kms_provider_credentials(kms_provider_name)
customer_master_key_credentials = helpers.get_customer_master_key_credentials(kms_provider_name)

# Drop old collections for a fresh setup
data_client = MongoClient(data_uri)
try:
    data_client[encrypted_database_name][encrypted_collection_name].drop()
except Exception:
    pass

key_vault_client = MongoClient(key_vault_uri)
try:
    key_vault_client[key_vault_database_name][key_vault_collection_name].drop()
except Exception:
    pass

# ---- Ensure the key vault collection has a unique index on keyAltNames ----
key_vault_client[key_vault_database_name][key_vault_collection_name].create_index(
    "keyAltNames",
    unique=True,
    # Index only documents that actually have keyAltNames (not all do).
    partialFilterExpression={"keyAltNames": {"$exists": True}},
)
print("Created unique index on keyAltNames for key vault collection.")

# Set up the ClientEncryption object.
# It lets you securely create and use data encryption keys (DEKs), and is
# built from the key vault client, KMS provider, credentials, and key vault namespace.
client_encryption = helpers.get_client_encryption(
    key_vault_client,
    kms_provider_name,
    kms_provider_credentials,
    key_vault_namespace,
)



# ---- Create DEKs with keyAltNames (one per field) ----
ssn_altname = f"{encrypted_database_name}.ssn"
billing_altname = f"{encrypted_database_name}.billing"

# Create each DEK once and record its key ID.
# Each key ID is a BSON Binary (UUID, subtype 4); the IDs are used below in the
# encrypted fields map, one per field.
ssn_key_id = client_encryption.create_data_key(
    kms_provider_name,
    master_key=customer_master_key_credentials,
    key_alt_names=[ssn_altname],
)
billing_key_id = client_encryption.create_data_key(
    kms_provider_name,
    master_key=customer_master_key_credentials,
    key_alt_names=[billing_altname],
)
print(f"Created SSN Key ID: {ssn_key_id}")
print(f"Created Billing Key ID: {billing_key_id}")

# Save the DEK IDs to files for use in insert_encrypted_doc.py
with open("ssn_key_id.bin", "wb") as f:
    f.write(ssn_key_id)
with open("billing_key_id.bin", "wb") as f:
    f.write(billing_key_id)


# start-encrypted-fields-map

encrypted_fields_map = {
    "fields": [
        {
            "path": "patientRecord.ssn",
            "bsonType": "string",
            "queries": [{"queryType": "equality"}],
            "keyId": ssn_key_id,
        },
        {
            "path": "patientRecord.billing",
            "bsonType": "object",
            "keyId": billing_key_id,
        },
    ]
}


# Create the encrypted collection in your MongoDB data cluster.

try:
    client_encryption.create_encrypted_collection(
        data_client[encrypted_database_name],
        encrypted_collection_name,
        encrypted_fields_map,
        kms_provider_name,
        customer_master_key_credentials,
    )
    print("Encrypted collection created successfully.")
except Exception as e:
    print("Unable to create encrypted collection due to the following error:", e)

data_client.close()
key_vault_client.close()
40 changes: 40 additions & 0 deletions mongodb-qe-tutorial/env_template
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
# MongoDB connection URI(s) and automatic encryption shared library path
# In most deployments, the key vault and application data use the same Atlas cluster.
# Only separate them if you have a specific security or compliance reason.

export KEY_VAULT_MONGODB_URI="Your Atlas cluster URL" # Used for the key vault collection
export MONGODB_URI="Your Atlas cluster URL" # Used for your encrypted application data
export SHARED_LIB_PATH="/full/path/to/the downloaded Automatic_Encryption_Shared_Library"

# AWS Credentials

export AWS_ACCESS_KEY_ID="<Your AWS access key ID>"
export AWS_SECRET_ACCESS_KEY="<Your AWS secret access key>"
export AWS_KEY_REGION="<Your AWS key region>"
export AWS_KEY_ARN="<Your AWS key ARN>"

# Azure Credentials

export AZURE_TENANT_ID="<Your Azure tenant ID>"
export AZURE_CLIENT_ID="<Your Azure client ID>"
export AZURE_CLIENT_SECRET="<Your client secret>"
export AZURE_KEY_NAME="<Your key name>"
export AZURE_KEY_VERSION="<Your key version>"
export AZURE_KEY_VAULT_ENDPOINT="<Your key vault endpoint>"

# GCP Credentials

export GCP_EMAIL="<Your GCP email>"
export GCP_PRIVATE_KEY="<Your GCP private key>"

export GCP_PROJECT_ID="<Your project id>"
export GCP_LOCATION="<Your location>"
export GCP_KEY_RING="<Your key ring>"
export GCP_KEY_NAME="<Your key name>"
export GCP_KEY_VERSION="<Your key version>"

# KMIP Credentials

export KMIP_KMS_ENDPOINT="<Endpoint for your KMIP KMS>"
export KMIP_TLS_CA_FILE="<Full path to your KMIP certificate authority file>"
export KMIP_TLS_CERT_FILE="<Full path to your client certificate file>"