Unveiling Expected Features in MongoDB 8.0

Manosh Malai
Jun 14, 2024

I'm truly excited about what's coming with MongoDB 8.0—it's poised to boost performance and introduce some really cool features. While we're still waiting on the official release date, I've already dived into the details from the MongoDB.local event in NYC and done some digging into Jira tickets and GitHub code. I can't wait to share my discoveries and insights in this blog post. Stay tuned!

Performance Enhancements in MongoDB 8.0

  • Benchmarks using YCSB, a popular database benchmarking tool, showcase significant performance gains in MongoDB 8.0. Compared to version 7.0, write-heavy workloads (YCSB Write Bulk Data) can expect up to a 54% improvement in speed.
  • Read performance also sees a boost, with YCSB read-heavy workloads (100% Read) experiencing up to a 27% improvement and mixed workloads (95% Read, 5% Write) achieving up to a 25% improvement.
  • Additionally, other benchmarks like Linkbench suggest an 18% performance increase, while Time Series workloads (TSBS) are expected to benefit from up to a 60% performance boost.
  • Faster Writes and Reads: Taken together, these gains translate to quicker data processing and retrieval times for both write-heavy and read-heavy workloads.

Benchmark                              Improvement over 7.0
YCSB Write Bulk Data (write-heavy)     Up to 54%
YCSB Read-Heavy (100% Read)            Up to 27%
YCSB Mixed (95% Read, 5% Write)        Up to 25%
Linkbench                              Up to 18%
Time Series Workloads (TSBS)           Up to 60%

Workload Specific Improvements in MongoDB 8.0

Time Series Enhancements

  • MongoDB 6.0 & 7.0: These releases brought a significant advancement for time series data management with the introduction of columnar storage. The format improves performance for critical tasks such as aggregation, sorting, and visualization. In particular, the $group operation, which is central to summarizing and analyzing time series data, sees substantial gains when querying large volumes of data.
  • MongoDB 8.0 (expected): We're super excited about the new block processing feature. It works on time series data in blocks, which means faster queries and smarter use of storage space. It builds on the columnar storage format, which is what lets the engine handle and compress data more effectively. Block processing is confirmed, but we're still on the lookout for the detailed documentation to see exactly how it's going to make MongoDB 8.0 better.
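
To make the workload concrete, here is a minimal sketch of the kind of time series $group aggregation these improvements target. The collection name weather and the fields timestamp, sensorId, and temperature are assumptions made up for illustration. Whether the server applies block processing to a given pipeline is its own decision, and nothing in this sketch controls that.

```javascript
// Options for a time series collection; every field name here is an illustrative assumption.
const timeseriesOptions = {
  timeseries: { timeField: "timestamp", metaField: "sensorId", granularity: "hours" },
};

// Builds a pipeline that averages one measurement per sensor inside a time window.
function buildAvgPerSensorPipeline(from, to) {
  return [
    { $match: { timestamp: { $gte: from, $lt: to } } },
    { $group: { _id: "$sensorId", avgTemperature: { $avg: "$temperature" } } },
    { $sort: { _id: 1 } },
  ];
}

// In mongosh, against a live deployment (not executed here):
//   db.createCollection("weather", timeseriesOptions);
//   db.weather.aggregate(
//     buildAvgPerSensorPipeline(ISODate("2024-06-01"), ISODate("2024-06-02")));
```

Running the same pipeline on 7.0 and on 8.0 against identical data is probably the simplest way to see what the upgrade means for your own workload.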

Command Path Optimization

Express Path Efficiency

MongoDB 8.0 introduces the Express Path, a streamlined route that optimizes the processing of specific queries, especially those suitable for quick resolution through basic index or clustered index scans. This innovative approach is designed to boost performance for eligible queries by focusing on speed and reducing overhead.

Key Benefits of the Express Path

Optimized Speed: The Express Path is engineered to expedite queries that can directly benefit from simple index scans. By bypassing more complex query planning and execution steps, this approach significantly reduces execution time.

Reduced Overhead: The Express Path simplifies data retrieval by eliminating unnecessary operations, minimizing computational overhead. This is especially advantageous for frequently executed operations like point lookups.

Targeted Use Cases: Best suited for common query types, such as point lookups on the _id index (IDHACK-eligible queries), the Express Path enhances the efficiency of retrieving documents based on their unique ID.

Improved Visibility: Queries utilizing the Express Path are distinctly marked in explain outputs and slow query logs as "EXPRESS_IXSCAN" or "EXPRESS_CLUSTERED_IXSCAN". This clear identification aids developers and database administrators in optimizing query performance and troubleshooting efficiently.

Resource Efficiency: By avoiding the creation of complex BSON objects and minimizing the traversal needed through the database architecture, the Express Path conserves resources. This leads to not only faster query performance but also more available resources for other database operations.
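
As a rough way to check whether a query took the Express Path, the sketch below searches an explain plan for a stage whose name starts with EXPRESS_. The helper name findExpressStage and the two sample plans are hypothetical and heavily trimmed. Real explain output has many more fields, and its exact layout can vary by version.

```javascript
// Returns the name of the first EXPRESS_* stage found in an explain plan, or null.
// Walks inputStage / inputStages, the usual nesting fields in explain output.
function findExpressStage(plan) {
  if (!plan || typeof plan !== "object") return null;
  if (typeof plan.stage === "string" && plan.stage.startsWith("EXPRESS_")) {
    return plan.stage;
  }
  const children = [plan.inputStage, ...(plan.inputStages || [])];
  for (const child of children) {
    const found = findExpressStage(child);
    if (found) return found;
  }
  return null;
}

// Hypothetical, trimmed explain documents used only to exercise the helper.
const expressExplain = { queryPlanner: { winningPlan: { stage: "EXPRESS_IXSCAN" } } };
const classicExplain = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
};

console.log(findExpressStage(expressExplain.queryPlanner.winningPlan)); // EXPRESS_IXSCAN
console.log(findExpressStage(classicExplain.queryPlanner.winningPlan)); // null
```

In mongosh, the same helper could be pointed at the winning plan of a real point lookup, for example the explain() result of a find on _id.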

Express Path: 17% Latency Improvement in MongoDB 8.0

Streamlining Data Retrieval with seekForKeyValueView

In MongoDB 8.0, the seekForKeyValueView method represents a pivotal enhancement, consolidating the keystring and RecordId (RID) into a tuple. This method avoids the unnecessary creation of BSONObj, streamlining the data retrieval process.

Current Method

The current approach involves the use of SortedDataInterface::seek within MongoDB's typical query execution pathway but not within the Express Path. This function retrieves an IndexKeyEntry, which includes the RecordId, and requires the allocation of a BSONObj for the dehydrated index key. Although this BSONObj encapsulates the required index key, it often remains underutilized, leading to inefficiencies:

  • Invocation: The seek function is activated with the specific index key.
  • BSONObj Creation: Constructs a BSONObj to encapsulate the dehydrated index key, even though it's not effectively used afterwards.
  • Data Retrieval: Fetches the IndexKeyEntry that includes the necessary RecordId.

Proposed Improvement for Express Path

The proposed method suggests incorporating the seekForKeyValueView into the Express Path, which is not currently utilized in this optimized pathway. This function would enhance efficiency by directly returning a view that combines the keystring and RecordId (RID) into a tuple, thus avoiding the creation of an underutilized BSONObj:

  • Direct Retrieval: Calls seekForKeyValueView with the necessary index parameters.
  • Avoids BSONObj: This method bypasses the BSONObj creation, directly providing a keystring + RID tuple.
  • Efficient Data Access: The tuple offers all necessary data for the Express Path's processing needs, eliminating overhead and improving performance.

Benefits

  • Reduced Overhead: By eliminating the creation of unnecessary BSONObj, the query process is streamlined.
  • Enhanced Query Speed: Direct data retrieval speeds up the overall execution time for eligible queries in the Express Path.
  • Improved Resource Efficiency: Lower resource consumption per query allows for better overall system performance and resource allocation.

This integration of seekForKeyValueView into the Express Path aims to leverage its efficiency for direct and rapid query execution, aligning with MongoDB's goal to optimize performance and resource utilization in high-demand scenarios.
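
The difference between the two lookups can be pictured with a toy model. This is a conceptual sketch only, not MongoDB's C++ storage code. The index layout, the key string format, and the names seekWithKeyObject and seekForView are all invented for illustration.

```javascript
// Toy index: entries sorted by an encoded key string, each paired with a record id.
const toyIndex = [
  { keyString: "k:0001", rid: 11 },
  { keyString: "k:0002", rid: 12 },
  { keyString: "k:0003", rid: 13 },
];

let decodedKeyObjects = 0; // counts how many key objects get materialized

// Binary search for the first entry whose keyString is >= target.
function lowerBound(target) {
  let lo = 0;
  let hi = toyIndex.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (toyIndex[mid].keyString < target) lo = mid + 1;
    else hi = mid;
  }
  return lo < toyIndex.length ? toyIndex[lo] : null;
}

// Models the general path: rebuilds a key object for every hit, used or not.
function seekWithKeyObject(target) {
  const entry = lowerBound(target);
  if (!entry) return null;
  decodedKeyObjects += 1;
  const key = { "": Number(entry.keyString.slice(2)) }; // stand-in for the BSONObj
  return { key, rid: entry.rid };
}

// Models the view-style path: hands back the stored key string and rid as a tuple.
function seekForView(target) {
  const entry = lowerBound(target);
  return entry ? [entry.keyString, entry.rid] : null;
}
```

Both functions find the same record id, but only the first one pays for building a key object that a point lookup never reads.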

Ref: https://jira.mongodb.org/browse/SERVER-89445

Memory Fragmentation Reduced by 18%

  • Utilization of Per-CPU Cache: The upgraded memory allocator (TCMalloc) keeps its caches per CPU rather than per thread, which speeds up allocation and reduces fragmentation.
  • 18% Reduction in Memory Fragmentation: Minimizes wasted memory space, enhancing overall system efficiency.
  • Enhanced Peak Load Behavior: Improves system responsiveness and stability under high load conditions.

Improved Visibility In Query Insights: MongoDB Atlas


Query Insight Features

MongoDB Atlas now provides detailed namespace-level metrics to help achieve faster issue resolution. The latest updates allow users to compare performance across collections and utilize a heatmap panel to understand query trends.

Query Shape and Operation Rejection Filters

In MongoDB, the query shape hash is calculated using a method called kToRepresentativeParseableValue. This method standardizes all literals of the same data type to a consistent value. This standardization allows MongoDB to classify queries with the same structural framework but different specific values under a unified "query shape." This process, known as "shapifying," simplifies queries by focusing on their structural patterns rather than the literal values they contain. This technique is essential for efficiently identifying and optimizing structurally similar queries.
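
The idea of shapifying can be sketched in a few lines. This is a simplified illustration and not the server's algorithm: the function name shapify and the "?type" placeholders are assumptions, and the hashing step that produces the final query shape hash is left out.

```javascript
// Replaces every literal with a placeholder naming its type, keeping field names
// and operators intact. Illustrative only; the server's rules are more detailed.
function shapify(value) {
  if (Array.isArray(value)) return value.map(shapify);
  if (value instanceof Date) return "?date";
  if (value === null) return "?null";
  if (typeof value === "object") {
    const shaped = {};
    for (const key of Object.keys(value)) shaped[key] = shapify(value[key]);
    return shaped;
  }
  return "?" + typeof value; // "?number", "?string", "?boolean"
}

const shapeA = shapify({ status: "open", qty: { $gt: 5 } });
const shapeB = shapify({ status: "closed", qty: { $gt: 900 } });
const shapeC = shapify({ status: "open", qty: { $lt: 5 } });

console.log(JSON.stringify(shapeA) === JSON.stringify(shapeB)); // true: same shape, different literals
console.log(JSON.stringify(shapeA) === JSON.stringify(shapeC)); // false: different operator
```

Two filters that differ only in their literal values collapse to one shape, while changing an operator or a field name produces a different one.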

For more details, please refer to the following resources:

  • Code Path: mongo-master/src/mongo/db/query/query_shape
  • CmdSpecificShapeComponents for general command shapes: check query_shape.h
  • AggCmdShapeComponents for aggregation command shapes: check agg_cmd_shape.h
  • DistinctCmdShapeComponents for distinct command shapes: check distinct_cmd_shape.h
  • FindCmdShapeComponents for find command shapes: check find_cmd_shape.h
  • LetShapeComponent for commands incorporating the let statement: check cmd_with_let_shape.h

Reject Operations Using Query Shape

You can reject a particular query using its query shape without changing anything in the application. This helps prevent resource-consuming queries from causing further issues.

 
db.adminCommand({
  // 'xxxxxx' is a placeholder for the query shape hash
  setQuerySettings: 'xxxxxx',
  settings: {
    reject: true
  }
})
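
The rejection command above can be wrapped in small helpers so the hash is validated before anything is sent. The helper names are hypothetical. The second helper uses the removeQuerySettings command to lift the rejection again.

```javascript
// Builds the admin command that rejects every query matching a shape hash.
function buildRejectCommand(queryShapeHash) {
  if (typeof queryShapeHash !== "string" || queryShapeHash.length === 0) {
    throw new Error("queryShapeHash must be a non-empty string");
  }
  return { setQuerySettings: queryShapeHash, settings: { reject: true } };
}

// Builds the command that removes the settings again, lifting the rejection.
function buildRemoveSettingsCommand(queryShapeHash) {
  if (typeof queryShapeHash !== "string" || queryShapeHash.length === 0) {
    throw new Error("queryShapeHash must be a non-empty string");
  }
  return { removeQuerySettings: queryShapeHash };
}

// In mongosh (not executed here), with a hash taken from your own logs:
//   db.adminCommand(buildRejectCommand(hash));
//   db.adminCommand(buildRemoveSettingsCommand(hash));
```

Keeping the removal command next to the rejection makes it easier to roll back quickly if a rejected shape turns out to be needed.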
	

Define Read Timeout Globally

You can define a default timeout (maxTimeMS) for all read operations on your cluster. This setting helps protect your cluster from resource-intensive queries due to suboptimal or unindexed queries.

 
db.adminCommand({
  setClusterParameter: {
    defaultMaxTimeMS: { readOperations: 20000 }
  }
})
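
A minimal sketch of how this default interacts with per-query timeouts: as far as I understand the feature, a maxTimeMS set on an individual operation takes precedence over the cluster-wide default. The helper names below are hypothetical, and effectiveTimeoutMs only models that precedence rule rather than querying a server.

```javascript
// Builds the setClusterParameter command for a cluster-wide read timeout.
function buildDefaultReadTimeoutCommand(timeoutMs) {
  if (!Number.isInteger(timeoutMs) || timeoutMs < 0) {
    throw new Error("timeoutMs must be a non-negative integer");
  }
  return { setClusterParameter: { defaultMaxTimeMS: { readOperations: timeoutMs } } };
}

// Effective timeout for one read: an explicit per-operation maxTimeMS wins,
// otherwise the cluster default applies. Models the precedence only.
function effectiveTimeoutMs(operationMaxTimeMs, clusterDefaultMs) {
  return operationMaxTimeMs !== undefined ? operationMaxTimeMs : clusterDefaultMs;
}

console.log(effectiveTimeoutMs(500, 20000)); // 500
console.log(effectiveTimeoutMs(undefined, 20000)); // 20000
```

In mongosh, the built command would be passed to db.adminCommand(), and a long-running report could still opt out with its own maxTimeMS.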
	

Persistent Query Settings

Previously, we achieved this using planCacheSetFilter:

 
db.runCommand(
   {
      planCacheSetFilter: <collection>,
      query: <query>,
      sort: <sort>,
      projection: <projection>,
      indexes: [ <index1>, <index2>, ... ]
   }
)
	

However, this setting does not persist between restarts, causing many challenges on the operational side.

Now, we can overcome all these challenges using the new feature, Persistent Query Settings.

Advantages:

  • Applies the chosen index to every query whose shape matches the hash
  • Replicates to all members of the replica set
  • Persists between restarts
  • Requires no changes to our application code
 
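db.adminCommand({
   setQuerySettings: "<queryShapeHash>",
   settings: {
      indexHints: { ns: { db: "<db>", coll: "<collection>" }, allowedIndexes: [ "<indexName>" ] }
   }
})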
	

Example slow query log entry (the queryShapeHash field is the value you pass to setQuerySettings):

 
{
   planSummary: "COLLSCAN",
   queryShapeHash: '7F312F79FCOC3_*',
   durationMillis: 11123232 // Too long
}
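
Given a batch of log entries shaped like the example above, a small filter can surface which query shapes deserve attention first. This is a sketch under assumptions: the function name slowScanShapes and the sample entries are hypothetical, and real log lines carry more fields and may nest them differently.

```javascript
// Returns distinct shape hashes of slow collection scans, worst duration first.
function slowScanShapes(entries, minDurationMillis) {
  const worst = new Map(); // queryShapeHash -> worst durationMillis seen
  for (const entry of entries) {
    if (entry.planSummary !== "COLLSCAN" || !entry.queryShapeHash) continue;
    if (entry.durationMillis < minDurationMillis) continue;
    const previous = worst.get(entry.queryShapeHash) || 0;
    worst.set(entry.queryShapeHash, Math.max(previous, entry.durationMillis));
  }
  return [...worst.entries()]
    .map(([queryShapeHash, durationMillis]) => ({ queryShapeHash, durationMillis }))
    .sort((x, y) => y.durationMillis - x.durationMillis);
}

// Hypothetical entries; the hashes are made-up labels, not real shape hashes.
const sampleLogEntries = [
  { planSummary: "COLLSCAN", queryShapeHash: "HASH_A", durationMillis: 9000 },
  { planSummary: "IXSCAN { _id: 1 }", queryShapeHash: "HASH_B", durationMillis: 12000 },
  { planSummary: "COLLSCAN", queryShapeHash: "HASH_A", durationMillis: 15000 },
  { planSummary: "COLLSCAN", queryShapeHash: "HASH_C", durationMillis: 40 },
];

console.log(slowScanShapes(sampleLogEntries, 1000));
// [ { queryShapeHash: 'HASH_A', durationMillis: 15000 } ]
```

Each hash returned this way is a candidate for an index hint or, in the worst case, a rejection through setQuerySettings.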
	

Advanced Sharding Capabilities in MongoDB 8.0

8.0 Move Unsharded Collection

In MongoDB 8.0, moving an unsharded collection from one shard to another is now easier:

  • Use db.adminCommand({moveCollection: "mydbops.mongodb", toShard: "shard1"}) to move a particular collection from ShardX to ShardY without interrupting CRUD operations.
  • Before 8.0, the movePrimary command was available, but it was effectively an offline operation, and read or write behavior during the move was undefined.

8.0 Convert Sharded Collection to Unsharded

In MongoDB 8.0, converting a sharded collection to unsharded is simpler:

  • Use db.adminCommand({unshardCollection: "mydbops.mongodb", toShard: "shard1"}) to convert a sharded collection to unsharded.
  • Resharding is also faster, whether you are adding or removing sharding, and it has less impact on the running workload than previous methods.
  • For example, resharding 500GB on a 1TB cluster now takes hours instead of days.
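
Both moveCollection and unshardCollection take a full namespace and a target shard, so one small builder can validate the inputs for either. The function name buildPlacementCommand is hypothetical, and this sketch always requires toShard, as in the examples in this post.

```javascript
// Validates a "database.collection" namespace and a shard name, then builds the command.
function buildPlacementCommand(commandName, namespace, toShard) {
  if (commandName !== "moveCollection" && commandName !== "unshardCollection") {
    throw new Error("unsupported command: " + commandName);
  }
  const dot = String(namespace).indexOf(".");
  if (dot <= 0 || dot === String(namespace).length - 1) {
    throw new Error("namespace must look like database.collection");
  }
  if (typeof toShard !== "string" || toShard.length === 0) {
    throw new Error("toShard is required in this sketch");
  }
  return { [commandName]: namespace, toShard: toShard };
}

console.log(buildPlacementCommand("moveCollection", "mydbops.mongodb", "shard1"));
// { moveCollection: 'mydbops.mongodb', toShard: 'shard1' }

// In mongosh, connected to mongos (not executed here):
//   db.adminCommand(buildPlacementCommand("unshardCollection", "mydbops.mongodb", "shard1"));
```

Catching a malformed namespace on the client side avoids starting a long-running operation that was never going to succeed.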

Queryable Encryption Enhancements in MongoDB 8.0

Queryable Encryption Range Support

MongoDB 8.0 introduces support for range queries on encrypted fields, covering dates and numeric values, including Decimal128 for precise financial values. Encrypted fields now support the $gt, $lt, $gte, and $lte operators.
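
A rough sketch of what a range-enabled field might look like in an encryptedFields configuration, together with a matching filter. The field name billAmount, the bounds, and both helper names are assumptions for illustration. In a real driver call, min and max must be BSON values of the declared type, and key management is not shown here.

```javascript
// Describes one encrypted field that supports range queries between min and max.
function rangeEncryptedField(path, bsonType, min, max) {
  const supported = ["int", "long", "double", "decimal", "date"];
  if (!supported.includes(bsonType)) throw new Error("unsupported bsonType: " + bsonType);
  if (min > max) throw new Error("min must not exceed max");
  return { path, bsonType, queries: { queryType: "range", min, max } };
}

// Builds a closed-interval filter using the $gte and $lte operators.
function betweenFilter(path, low, high) {
  return { [path]: { $gte: low, $lte: high } };
}

// Hypothetical configuration with a single range-queryable field.
const encryptedFields = { fields: [rangeEncryptedField("billAmount", "int", 0, 100000)] };

console.log(JSON.stringify(betweenFilter("billAmount", 1000, 5000)));
// {"billAmount":{"$gte":1000,"$lte":5000}}
```

The application writes the filter exactly as it would for an unencrypted field; the driver and server handle the encrypted comparison.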

These expected features in MongoDB 8.0 paint a promising picture for database administrators and developers. Stay tuned for the official release to experience these advancements firsthand and unlock the full potential of your database operations!

Ready to unlock the full potential of MongoDB 8.0? Our MongoDB consulting experts can help you navigate the new features and optimize your database performance. We also offer managed and remote DBA services to ensure your MongoDB environment runs smoothly. Contact us today for a free consultation!


About the Author

Manosh Malai

CTO, Mydbops
