Queryable Encryption is one of MongoDB's newer security features. It lets you encrypt sensitive fields on the client side, before they leave your application, so the data stays confidential throughout its lifecycle. The server stores those fields as randomized ciphertext and never sees the plaintext values.
As a result, sensitive data stays encrypted in transit, at rest, in use, in backups, and even in logs, and only a client holding the encryption keys can decrypt it. In this article, we'll look at how MongoDB's Queryable Encryption works, how to set it up, and what it costs in performance and operational flexibility.
Setting up Queryable Encryption
Queryable Encryption can be used in two ways, automatic or explicit encryption, and both rely on envelope encryption for key management:
Automatic Encryption
With automatic encryption, encrypted reads and writes happen transparently: the driver encrypts and decrypts the configured fields for you, so your application needs no explicit encryption calls.
Note that Automatic Encryption requires MongoDB Enterprise Edition or MongoDB Atlas; MongoDB Community Edition does not support it.
Explicit Encryption
Explicit encryption gives you direct control. You call MongoDB's encryption library through your driver and define the encryption logic for each field yourself.
Envelope Encryption
First, your data is protected with a Data Encryption Key (DEK). Then, this DEK is further secured by encrypting it with a Customer Master Key (CMK). The CMK, your ultimate key protector, is created using tools like a cloud Key Management Service (KMS). MongoDB keeps the encrypted DEKs safe in the Key Vault collection. Deleting a DEK makes the associated data unreadable, and deleting a CMK makes data tied to its DEKs permanently inaccessible. Envelope Encryption is like a double lock for your data, ensuring its safety and confidentiality.
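The two-level key hierarchy can be sketched in a few lines of Python. This is only an illustration of the wrapping idea: the cipher below is a toy built from the standard library so the example runs anywhere, and it is not the algorithm MongoDB uses. The names `seal`, `open_sealed`, and `key_vault` are made up for this sketch.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR data with an HMAC-SHA256 keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block_index = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + block_index, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt with a fresh random nonce and prepend the nonce to the result."""
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def open_sealed(key: bytes, sealed: bytes) -> bytes:
    """Reverse seal(): split off the nonce and XOR the keystream away."""
    nonce, body = sealed[:16], sealed[16:]
    return _keystream_xor(key, nonce, body)


# The CMK would live in a KMS; here it is just random bytes.
cmk = secrets.token_bytes(32)
# The DEK is the key that actually encrypts field values.
dek = secrets.token_bytes(32)

# Stand-in for the Key Vault collection: it only ever holds the wrapped DEK.
key_vault = {"patient-dek": seal(cmk, dek)}

# The value stored in the database is encrypted with the DEK.
stored_field = seal(dek, b"123-45-6789")

# Reading: unwrap the DEK with the CMK, then decrypt the field with the DEK.
unwrapped = open_sealed(cmk, key_vault["patient-dek"])
recovered = open_sealed(unwrapped, stored_field)
```

If the wrapped DEK is removed from `key_vault`, or the CMK is lost, nothing in this sketch can recover `stored_field`, which mirrors the deletion behaviour described above.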
Exploring the Automatic Encryption Shared Library for Queryable Encryption
The Automatic Encryption Shared Library is the component that lets your application perform automatic Queryable Encryption. It determines which fields need to be encrypted or decrypted and ensures that your application handles encrypted data correctly.
To get it, simply head over to the MongoDB Download Center. Select your version and platform, and then download the library.
Installation
Packages
Code Example
In the provided code, we established the patients collection within the medicalRecords database. For security reasons in production, it's essential to avoid using a local key file. Instead, consider using key providers like AWS, GCP, or Azure.
The shared library plays a crucial role in identifying encrypted fields and preventing unsupported actions on encrypted data. If, for any reason, the Automatic Encryption Shared Library isn't accessible, the driver will attempt to connect to mongocryptd for encryption.
Note: Queryable Encryption is only available for new collections. You cannot add or remove Queryable Encryption from existing collections.
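As a rough illustration of creating an encrypted collection, here is a minimal sketch using PyMongo (4.4 or later, installed with its encryption extra). It is not the article's original code. The field names `ssn` and `billing`, the key vault namespace, and the helper names are assumptions chosen for the example, and the local key provider is for development only.

```python
import os


def patient_encrypted_fields() -> dict:
    """Encrypted fields for the patients collection (field names are hypothetical)."""
    return {
        "fields": [
            # keyId=None asks the driver to create a DEK for this field.
            {"path": "ssn", "bsonType": "string", "keyId": None,
             "queries": {"queryType": "equality"}},
            # No "queries" entry: encrypted, but not queryable.
            {"path": "billing", "bsonType": "object", "keyId": None},
        ]
    }


def create_patients_collection(uri: str, local_master_key: bytes):
    """Create medicalRecords.patients as an encrypted collection."""
    # Imported lazily so the helper above works without PyMongo installed.
    from bson.codec_options import CodecOptions
    from pymongo import MongoClient
    from pymongo.encryption import ClientEncryption

    # Development only: in production use a KMS such as AWS, GCP, or Azure.
    kms_providers = {"local": {"key": local_master_key}}
    client = MongoClient(uri)
    client_encryption = ClientEncryption(
        kms_providers, "encryption.__keyVault", client, CodecOptions()
    )
    collection, fields = client_encryption.create_encrypted_collection(
        client["medicalRecords"],
        "patients",
        patient_encrypted_fields(),
        kms_provider="local",
    )
    return collection, fields


if __name__ == "__main__" and os.environ.get("MONGODB_URI"):
    # The local provider expects a 96-byte master key.
    create_patients_collection(os.environ["MONGODB_URI"], os.urandom(96))
```

The `MONGODB_URI` environment variable is also an assumption of this sketch. In a real application the master key would be persisted securely rather than generated on each run.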
Key Vault Collection
MongoDB stores Data Encryption Keys (DEKs) in a dedicated Key Vault collection, conventionally named __keyVault. These keys encrypt and decrypt the fields within your encrypted collections.
Additionally, when you create an encrypted collection, MongoDB generates two metadata collections, referred to as ESC and ECOC (named enxcol_.<collectionName>.esc and enxcol_.<collectionName>.ecoc). As you insert documents with queryable encrypted fields, MongoDB updates these auxiliary collections so that the fields can be searched efficiently. This bookkeeping consumes extra storage and slows down write operations, so there is a trade-off between fast searches on one side and storage and write performance on the other.
Document Insertion
Inserting documents with encrypted fields requires configuring the autoEncryption parameter during the setup of your database connection. Here's a code snippet that demonstrates this procedure:
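A minimal PyMongo sketch of such a connection might look like the following. It is an illustration rather than the article's original snippet: the helper names, the key vault namespace, and the shared library path are assumptions, and it presumes the encrypted collection already exists on the server.

```python
def auto_encryption_settings(local_master_key: bytes, shared_lib_path: str) -> dict:
    """Keyword arguments for PyMongo's AutoEncryptionOpts (values are illustrative)."""
    return {
        # Development only: use a cloud KMS provider in production.
        "kms_providers": {"local": {"key": local_master_key}},
        "key_vault_namespace": "encryption.__keyVault",
        # Path to the Automatic Encryption Shared Library on this machine.
        "crypt_shared_lib_path": shared_lib_path,
    }


def insert_patient(uri: str, settings: dict, patient: dict):
    """Insert one document through an auto-encrypting client and read it back."""
    # Imported lazily so the settings helper works without PyMongo installed.
    from pymongo import MongoClient
    from pymongo.encryption_options import AutoEncryptionOpts

    opts = AutoEncryptionOpts(**settings)
    with MongoClient(uri, auto_encryption_opts=opts) as client:
        patients = client["medicalRecords"]["patients"]
        # Encrypted fields are encrypted by the driver before being sent.
        result = patients.insert_one(patient)
        # The same client decrypts them transparently on the way back.
        return patients.find_one({"_id": result.inserted_id})
```

A call could look like `insert_patient(uri, auto_encryption_settings(key, "/path/to/mongo_crypt_v1.so"), {"name": "Jon", "ssn": "123-45-6789"})`, where the path and the document are placeholders.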
Encrypted vs. Normal Document Insertion
Inserting 100,000 documents into an encrypted collection with two encrypted fields took 18 minutes and 27.816 seconds. The same operation on a non-encrypted collection took only about 3.6 seconds, which makes the encrypted run roughly 300 times slower on this host. Encryption significantly enhances data security, but it comes with a clear trade-off in insertion speed.
Host information: Memory: 3.8 GB, CPU: 2 cores (x86_64), Linux (Ubuntu 22.04)
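A comparison like this can be reproduced with a small timing helper. The sketch below is an assumption about how such a measurement might be taken, not the method used for the numbers above; the function name and the batch size of 1,000 are arbitrary choices.

```python
import time
from typing import Callable, Sequence


def time_batched_inserts(
    insert_many: Callable[[list], object],
    documents: Sequence[dict],
    batch_size: int = 1000,
) -> float:
    """Return the seconds spent passing documents to insert_many in batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = time.perf_counter()
    for offset in range(0, len(documents), batch_size):
        insert_many(list(documents[offset:offset + batch_size]))
    return time.perf_counter() - start
```

With PyMongo, the callable would typically be `collection.insert_many`, run once against the encrypted collection and once against a plain one with the same documents.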
Encrypted Insertion
Normal Document Insertion
Query Flexibility with Automatic Encryption
- Automatic encryption supports equality queries on encrypted fields with the following operators: $eq, $ne, $in, $nin, $and, $or, $not, and $nor.
- To let clients run read and write queries on an encrypted field, add the queries property to that field's definition in your encryption schema. This property specifies which fields are queryable.
- If you omit the queries property, the field is still encrypted but cannot be queried, which lets you balance data security against query flexibility.
Controlling Contention Factor:
- The contention factor defaults to 8. Increasing it speeds up insert and update operations, especially for low-cardinality fields.
- However, a higher contention factor can slow down find operations, so pick a value that balances write and read performance for each encrypted field.
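The queries property and the contention factor both live in a field's definition. The sketch below builds such definitions as plain Python dictionaries; the helper names and the example fields are hypothetical, and the layout follows the encrypted fields format as I understand it.

```python
def queryable_field(path: str, bson_type: str, contention: int = 8) -> dict:
    """Field definition that is encrypted and supports equality queries."""
    if contention < 0:
        raise ValueError("contention must be non-negative")
    return {
        "path": path,
        "bsonType": bson_type,
        "keyId": None,  # placeholder: a DEK id is filled in when the collection is created
        "queries": {"queryType": "equality", "contention": contention},
    }


def encrypted_only_field(path: str, bson_type: str) -> dict:
    """Field definition that is encrypted but cannot be queried (no queries property)."""
    return {"path": path, "bsonType": bson_type, "keyId": None}


encrypted_fields = {
    "fields": [
        # Low-cardinality field: a higher contention factor favours writes.
        queryable_field("bloodType", "string", contention=16),
        # Default contention of 8.
        queryable_field("ssn", "string"),
        encrypted_only_field("notes", "string"),
    ]
}
```

Because the contention factor cannot be changed once the field is defined, it is worth choosing the value before the collection is created.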
Enabling Queryability:
- When you make an encrypted field queryable, MongoDB creates an index for that field.
- This can slow down write operations on those fields, because every write that modifies an indexed field also has to update the associated index.
Usage of $or Operator:
- When using the $or operator, remember that only encrypted equality queries can be executed with an encrypted equality index.
Usage of $ne Operator:
- When using the $ne operator, the totalKeysExamined value in the query's explain output will be 0, which means the query does not use the index and scans all the documents.
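One way to catch unsupported operators before a query reaches the driver is a small pre-flight check. The helper below is a hypothetical convenience, not part of any MongoDB API. It uses the operator list given earlier in this section and, for simplicity, assumes the whole filter targets encrypted fields.

```python
SUPPORTED_OPERATORS = frozenset(
    {"$eq", "$ne", "$in", "$nin", "$and", "$or", "$not", "$nor"}
)


def unsupported_operators(filter_doc) -> set:
    """Return every $-operator in the filter that is not in the supported set."""
    found = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if isinstance(key, str) and key.startswith("$"):
                    if key not in SUPPORTED_OPERATORS:
                        found.add(key)
                walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                walk(item)

    walk(filter_doc)
    return found


# Equality-style filter: nothing is flagged.
ok = unsupported_operators({"$or": [{"ssn": "123"}, {"ssn": {"$in": ["a", "b"]}}]})
# Range and pattern operators are flagged.
bad = unsupported_operators({"$and": [{"ssn": {"$gt": "1"}}, {"name": {"$regex": "^J"}}]})
```

Here `ok` is an empty set, while `bad` contains `$gt` and `$regex`.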
Data Backup with Queryable Encryption
Backups become more complicated with Queryable Encryption. Tools like mongodump, mongorestore, mongoimport, and mongoexport do not currently support it. When you attempt to restore data, you may encounter an error message such as bulk write exception: write errors: [Cannot insert a document with field name __safeContent__]. Unfortunately, the available documentation is not very clear about how to perform backups of encrypted collections.
If you have concerns or questions about data backup in such scenarios, please feel free to leave them in the comment box for further assistance.
Limitations of MongoDB's Queryable Encryption
- Contention Factor: The contention factor can only be set when defining a field for encryption. Once a field is designated for encryption, the contention factor remains unchangeable.
- Restoration of Encrypted Collections: Restoring a document that contains the field name __safeContent__ into an encrypted collection results in an error.
- Manual Metadata Compaction: If the metadata collections exceed a size of 1 GB, you must compact them manually.
- Excluded CRUD Operations: Certain CRUD (Create, Read, Update, Delete) operations are excluded from being recorded in the slow operations query log and the Database Profiler's system.profile collection when performed on an encrypted collection.
- Standalone Deployments: Queryable encryption is not supported in standalone deployment configurations.
For more detailed information, you can refer to the source link: MongoDB Queryable Encryption Limitations.
In summary, MongoDB's Queryable Encryption is a robust security feature that keeps sensitive data encrypted on the server, both at rest and in transit. It does have limitations: the contention factor can only be set when a field is defined and cannot be changed afterwards, and certain CRUD operations on encrypted collections are left out of the slow query log and the profiler. Despite these limitations, Queryable Encryption is a significant enhancement to data protection in database deployments.