
TiDB Distributed eXecution Framework (DXF): Revolutionizing Database Performance
We've all experienced this. It’s late at night, and you find yourself waiting for a database operation that seems like it’s taking forever – maybe you’re trying to create an index and it just never seems to finish, or perhaps you’re trying to import data and it’s going on and on and on. Minutes become hours, and you start wondering whether you are ever going to get to log off.
Now picture this scenario: instead of one machine doing all the work, your database operations are distributed across multiple nodes. Data is processed more quickly, and you don’t have to sit around waiting for operations to finish. This is the beauty of TiDB’s Distributed eXecution Framework (DXF).
In this blog, we will cover how DXF makes database operations easier and more seamless. Let's dive in!

What is DXF?
TiDB follows a computing-storage separation model, allowing for better scalability and flexibility. With the release of version 7.1.0, TiDB introduced the Distributed eXecution Framework (DXF)—a major step toward improving distributed database management. DXF brings unified task scheduling, distributed execution, and centralized resource management, ensuring efficient system performance while maximizing resource utilization.
DXF is TiDB's answer to distributed task management. Rather than overwhelming one node with intensive tasks, DXF distributes operations across multiple nodes in the cluster. Think of a team of experienced workers sharing a project: the work finishes faster and more smoothly, without exhausting any single resource. With DXF, operations such as ADD INDEX, IMPORT INTO, ANALYZE, and TTL management become much more streamlined.
Use Cases of DXF
DXF is designed for executing large-scale database management tasks beyond core Transactional Processing (TP) and Analytical Processing (AP), such as:
- DDL operations (e.g., ADD INDEX)
- IMPORT INTO (importing CSV, SQL, Parquet files)
- TTL (Time-to-Live) management
- ANALYZE for statistics collection
- Backup and Restore
Key Benefits of DXF
- The most impressive aspect of DXF is that it scales well without compromising availability or performance. It spreads work intelligently across your TiDB cluster, making the best use of available computing resources when and where they are needed.
- DXF also keeps resource usage balanced, so the system as a whole stays healthy while individual tasks still complete.
Limitations of DXF
There are certain limitations to keep in mind, however. DXF can schedule at most 16 tasks at the same time, counting both ADD INDEX and IMPORT INTO tasks.
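Before submitting more work, it can help to see which index builds and imports are already in flight. The statements below are a minimal sketch: ADMIN SHOW DDL JOBS and SHOW IMPORT JOBS are standard TiDB statements, but treating each running row as one slot against the 16-task limit is an assumption made here for illustration.

```sql
-- List recent and running DDL jobs (ADD INDEX jobs appear here).
ADMIN SHOW DDL JOBS;

-- List IMPORT INTO jobs visible to the current user.
SHOW IMPORT JOBS;
```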
Enabling and Configuring DXF
Before using DXF for ADD INDEX, Fast Online DDL mode must be enabled:
System Variables:
- tidb_ddl_enable_fast_reorg: Enabled by default from v6.5.0.
- tidb_ddl_disk_quota: Configures the disk quota for Fast Online DDL mode.
- temp-dir: A local disk path is required for Fast Online DDL mode, with at least 100 GiB of free space recommended. Note that temp-dir is a TiDB configuration file item rather than a system variable.
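A quick way to confirm these prerequisites from a SQL client is sketched below. The 200 GiB quota is only an illustrative value, not a recommendation from this article; pick a figure that matches the free space under your temp-dir.

```sql
-- Check whether Fast Online DDL mode is on (ON by default from v6.5.0).
SHOW GLOBAL VARIABLES LIKE 'tidb_ddl_enable_fast_reorg';

-- Turn it on explicitly if it was disabled.
SET GLOBAL tidb_ddl_enable_fast_reorg = ON;

-- Inspect the current disk quota (the value is in bytes).
SHOW GLOBAL VARIABLES LIKE 'tidb_ddl_disk_quota';

-- Illustrative value only: 200 GiB = 200 * 1024^3 bytes.
SET GLOBAL tidb_ddl_disk_quota = 214748364800;
```

The temp-dir path itself cannot be changed with SET; it is set in the TiDB configuration file.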
How to Enable DXF
DXF is enabled by default from TiDB v8.1.0. For earlier versions, it can be enabled manually using the following command:
SET GLOBAL tidb_enable_dist_task = ON;
When enabled, supported statements (e.g., ADD INDEX, IMPORT INTO) execute in a distributed manner across all TiDB nodes.
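As a rough end-to-end sketch, you might confirm the setting and then run an index build as usual. The table, column, and index names here (orders, customer_id, idx_customer_id) are hypothetical and only serve as an example.

```sql
-- Confirm the framework is on (ON by default from v8.1.0).
SHOW GLOBAL VARIABLES LIKE 'tidb_enable_dist_task';

-- Hypothetical table and column names, for illustration only.
ALTER TABLE orders ADD INDEX idx_customer_id (customer_id);

-- From another session, watch the state of the DDL job.
ADMIN SHOW DDL JOBS;
```

The ADD INDEX statement itself does not change; distribution happens behind the scenes once the framework is enabled.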
Recommended System Variables for DXF
- tidb_ddl_reorg_worker_cnt (Default: 4, recommended maximum: 16)
- tidb_ddl_reorg_batch_size (Default: System-defined, recommended maximum: 1024)
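The following sketch shows how these two variables could be inspected and adjusted. The values 8 and 512 are arbitrary examples rather than tuned recommendations; starting from the defaults and raising them gradually while watching cluster load is usually the safer path.

```sql
-- Inspect the current settings.
SHOW GLOBAL VARIABLES LIKE 'tidb_ddl_reorg_worker_cnt';
SHOW GLOBAL VARIABLES LIKE 'tidb_ddl_reorg_batch_size';

-- Example values only; choose them based on observed load.
SET GLOBAL tidb_ddl_reorg_worker_cnt = 8;
SET GLOBAL tidb_ddl_reorg_batch_size = 512;
```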
Task Scheduling and Management
Tasks are distributed across all TiDB nodes by default. Starting from v7.4.0, the tidb_service_scope system variable can be configured to control which nodes execute tasks.
Controlling Task Execution (v7.4.0+)
- v7.4.0 - v8.0.0:
  - tidb_service_scope can be set to '' (default) or background.
  - If any nodes have tidb_service_scope = 'background', DXF prioritizes scheduling tasks on them.
  - If no such nodes exist, tasks run on nodes with the default service scope.
- v8.1.0+:
  - tidb_service_scope can be assigned any valid value.
  - Tasks are scheduled only on nodes with the same tidb_service_scope as the submitting node.
  - Newly added nodes automatically follow these rules.
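To dedicate a node to DXF work, the scope can be set from a session connected to that node. This is a minimal sketch; it assumes, as the author of this example understands it, that tidb_service_scope takes effect only on the TiDB instance you are connected to, so the statement needs to be repeated on each node you want to mark.

```sql
-- Run while connected directly to the TiDB node that should take DXF tasks.
SET GLOBAL tidb_service_scope = 'background';

-- Verify the value on this node.
SHOW VARIABLES LIKE 'tidb_service_scope';
```

Connecting through a load balancer may land the statement on an arbitrary node, so a direct connection to the intended instance is the safer choice.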
Best Practices
- For clusters running v7.4.0 to v8.0.0, set tidb_service_scope = 'background' on at least two TiDB nodes.
- Ensure sufficient resources are available by monitoring the system’s resource utilization.
- Review and adjust the number of concurrent tasks for optimized performance.
DXF Architecture
DXF Workflow Diagram

As shown in the preceding diagram, the execution of tasks in the DXF is mainly handled by the following modules:
- Dispatcher: generates the distributed execution plan for each task, manages the execution process, converts the task status, and collects and feeds back the runtime task information.
- Scheduler: replicates the execution of distributed tasks among TiDB nodes to improve the efficiency of task execution.
- Subtask Executor: the actual executor of distributed subtasks. In addition, the Subtask Executor returns the execution status of subtasks to the Scheduler, and the Scheduler updates the execution status of subtasks in a unified manner.
- Resource pool: provides the basis for quantifying resource usage and management by pooling computing resources of the above modules.
Benchmark Report
TiDB DXF significantly enhances resource utilization, task scheduling efficiency, and execution performance for distributed workloads. It is a robust framework tailored for large-scale data operations in a TiDB cluster, ensuring optimal performance and scalability.

Summary
Managing large-scale databases doesn’t have to be a constant headache. With TiDB’s DXF, you get the power of distributed task management, optimized resource utilization, and faster operations without sacrificing sleep. Whether it’s handling massive data imports, creating indexes, or running analytics, DXF ensures everything runs smoothly.
So, would you rather keep wrestling with the familiar challenges of traditional database management? Or would you choose a future where DXF and distributed databases free up your time and resources?
Looking to scale your databases without the headaches? Mydbops offers specialized TiDB Consulting Services to help you implement DXF for peak performance. Our team will guide you from setup to fine-tuning, ensuring your database works smarter, not harder.
Let’s make your database challenges a thing of the past. Reach out to Mydbops today and discover how DXF can bring unparalleled scalability to your business.