Automate Database Optimization: Say Goodbye to Manual Indexing with TiDB Index Advisor

Mydbops
Mar 11, 2024
12
Mins to Read
All

Database performance tuning hinges on optimal index selection. However, this process is notoriously complex and time-consuming, even for seasoned database administrators. Even experienced experts struggle to identify the most effective indexing strategy for intricate workloads involving numerous tables (dozens or even hundreds) and countless SQL queries (thousands).

TiDB Index Advisor

TiDB Index Advisor is a command-line tool that can automatically recommend indexes based on the workload, statistics, and execution plan cost in TiDB, which can greatly reduce the workload of index maintenance in performance tuning.

How TiDB Index Advisor works

Index Advisor is based on the Hypo Index feature of TiDB. This feature allows users to create and maintain a series of hypothetical indexes in the optimizer. These indexes are only maintained in the optimizer and will not be actually created, so the overhead is very low. Combined with the Explain statement, you can evaluate the impact of an index on the query plan, and then determine whether the index is valuable.

How TiDB Index Advisor Works
How TiDB Index Advisor Works

The working principle of Index Advisor is as follows, which can be roughly divided into three steps:

  • Data Gathering: It collects relevant information from the TiDB instance's system tables, including table structures, statistics, and related queries within the workload.
  • Candidate Generation: Based on the collected data, Index Advisor generates potential indexes (candidate indexes). It then leverages the Hypo Index feature to create these hypothetical indexes within the optimizer.
  • Evaluation and Recommendation: Using the Explain statement, Index Advisor assesses the value of each candidate index. This evaluation determines if the index can reduce the cost of query plans. Finally, based on this analysis, Index Advisor recommends valuable indexes for implementation.

How to Use TiDB Index Advisor

TiDB Index Advisor offers two usage modes: online and offline.

  • Online Mode: This mode provides a simpler setup. Index Advisor directly accesses your TiDB instance to analyze your workload and recommend indexes. It only reads system tables and creates temporary Hypo Indexes, so there's no impact on your data.
  • Offline Mode: This mode offers more flexibility but requires some data preparation. Index Advisor starts a local TiDB instance, imports the data you provide, and then analyzes it to recommend indexes.

Breakdown of the online vs. offline modes:

  • Online Mode: Easier to use and more secure. It directly analyzes your existing TiDB cluster without requiring any data export. However, it offers less flexibility.
  • Offline Mode: More flexible but requires some data preparation upfront. You'll need to export data and provide it to Index Advisor for analysis.
How TiDB Index Advisor Works
Online Vs. Offline Mode

Installation

There are two main ways to install TiDB Index Advisor:

Online Mode (Recommended): This is the simpler and more secure option. It directly analyzes your existing TiDB cluster without requiring any data export.

Prerequisites:

  • TiDB version: v7.3 or higher (required for the Hypo Index feature)
  • Statement Summary: Enabled (needed to read query information)
  • tidb_redact_log feature: Disabled (optional, but recommended for better performance)

Pre-check your environment to ensure compatibility:

 
index_advisor precheck --dsn='root:@tcp(127.0.0.1:4000)'
	

Installation Command:

 
curl --proto '=https' --tlsv1.2 -sSf https://raw.githubusercontent.com/pingcap/index_advisor/main/install.sh | sh
	

Here's a breakdown of how to use TiDB Index Advisor in online mode:

 
index_advisor advise-online --dsn= --max-num-indexes= --output=
	

Explanation of Options:

  • --dsn: Provide the Data Source Name (DSN) of your TiDB instance.
  • --max-num-indexes (Optional): Specify the maximum number of indexes you want Index Advisor to recommend (defaults to 5).
  • --output (Optional): Set the path where you want the analysis results saved. Leaving it blank will display the output on the terminal.

Filter Options (Optional):

  • --query-schemas: (Comma-separated list) Analyze queries only from specific schemas (e.g., db1, db2).
  • --query-exec-time-threshold: Focus on queries with execution times exceeding a certain threshold (in milliseconds) (e.g., 300).
  • --query-exec-count-threshold: Analyze queries executed more than a specified number of times (e.g., 20).

Offline Mode (Optional): This mode offers more flexibility but requires some data preparation upfront. You'll need to export data and provide it to Index Advisor for analysis.

Note: To use offline mode, you'll need TiUP installed first.

Install TiUP:

 
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
	

Offline mode offers more flexibility for index analysis but requires some preparation compared to online mode. Here's what you'll need:

Data Preparation:

  • Query Data: Prepare your queries in either a single file or a folder.
  • Schema Information: Create a file containing the original CREATE TABLE statements for your tables, separated by semicolons (;).
  • Statistics information folder: which stores the statistics information files of related tables. Ref : TiDB statistics information dump.

Using Offline Mode:

Once you have your data prepared, use the following command to run Index Advisor in offline mode:

 
index_advisor advise-offline \
  --tidb-version= \
  --query-path= \
  --schema-path= \
  --stats-path= \
  --max-num-indexes= \
  --output=
	

Explanation of Options:

  • <tidb_version>: Specify the TiDB version to be used (e.g., v7.5.0). Index Advisor will start a local instance of this version.
  • <query_data_path>: Path to the file or folder containing your query data.
  • <schema_file_path>: Path to the file containing your schema information (CREATE TABLE statements).
  • <statistics_folder_path>: Path to the folder containing your dumped statistics information files.
  • <number_of_indexes>(Optional): Set the maximum number of indexes you want Index Advisor to recommend (defaults to 5).
  • <output_path>(Optional): Specify the path where you want the analysis results saved. Leaving it blank will display the output on the terminal.

Understanding the Output

After running Index Advisor, it will generate a folder containing the analysis results. Here's a breakdown of what you'll find inside:

Content of the Output Folder:

  • summary.txt: This file contains a high-level overview of the recommended indexes and their expected benefits. It will list the suggested indexes and provide an estimate of the performance improvement you can expect.
  • ddl.sql: This file is a DDL script containing the CREATE INDEX statements for all the recommended indexes. You can directly execute this script in your TiDB instance to implement the suggested improvements.
  • q.txt (Multiple Files):* These files represent individual query analyses. Each file is named qX.txt where X is a sequential number. These files detail the expected benefits for each query in your workload. They will typically show the query plan and its estimated cost before and after applying the recommended indexes, allowing you to visualize the performance improvement for each specific query.

Sample Output:

TiDB Index Advisor provides a detailed analysis to help you optimize your database performance. Here's a breakdown of the key findings:

Workload Analysis:

  • Total Queries: 21 (This indicates the number of queries analyzed)

Recommended Indexes:

  • Total Count: 5 (This is the number of indexes Index Advisor suggests for improvement)
  • CREATE INDEX statements: These statements provide the exact DDL code for creating the recommended indexes. You can directly copy and execute them in your TiDB instance
 
CREATE INDEX idx_l_partkey_l_quantity_l_shipmode ON tpch.lineitem (l_partkey, l_quantity, l_shipmode);

CREATE INDEX idx_l_partkey_l_shipdate_l_shipmode ON tpch.lineitem (l_partkey, l_shipdate, l_shipmode);

CREATE INDEX idx_l_suppkey_l_shipdate ON tpch.lineitem (l_suppkey, l_shipdate);

CREATE INDEX idx_o_custkey_o_orderdate_o_totalprice ON tpch.orders (o_custkey, o_orderdate, o_totalprice);

CREATE INDEX idx_ps_suppkey_ps_supplycost ON tpch.partsupp (ps_suppkey, ps_supplycost);
	

Workload Performance Improvement:

  • Original Workload Cost: 1.37E+10 (Estimated cost before applying recommendations)
  • Optimized Workload Cost: 1.02E+10 (Estimated cost after applying recommendations)
  • Cost Reduction Ratio: 25.22% (This significant reduction highlights the overall performance improvement)

Top Queries Benefiting from Indexes:

  • The output includes a section showcasing the top 10 queries with the most significant cost reduction after implementing the indexes. This helps you prioritize optimizations by focusing on queries with the greatest potential gains.
 
Alias: q22, Cost Reduction Ratio: 1.97E+08->4.30E+06(0.02)

Alias: q19, Cost Reduction Ratio: 2.89E+08->1.20E+07(0.04)

Alias: q20, Cost Reduction Ratio: 3.40E+08->4.39E+07(0.13)
	

Above is the summary of the recommendation, which contains the recommended indexes, the expected benefits to the entire workload, and the expected benefits of the top 5 queries.

Future Enhancements

  • Identifying Redundant and Unwanted Indexes: Currently, TiDB Index Advisor doesn't detect redundant (duplicate) or unnecessary indexes. Future versions will include this functionality to provide a more comprehensive analysis and avoid creating superfluous indexes.
  • Multivalued Index Support: Multivalued indexes are not yet supported by TiDB Index Advisor. Integration of this feature will allow the tool to analyze and recommend indexes for tables containing columns with multiple values per row.
  • TiUP Integration: Simplifying usability is a priority. Future plans include integrating TiDB Index Advisor with TiUP, a popular package manager for the TiDB ecosystem. This will streamline the installation and management of TiDB Index Advisor.

TiDB Index Advisor emerges as a valuable tool for database administrators struggling to optimize index selection for complex workloads. By automating index recommendations and leveraging the Hypo Index feature, Index Advisor simplifies the process and reduces the time investment required for performance tuning.

While TiDB Index Advisor is a powerful tool, there's always room for human expertise. Our TiDB Consulting and remote DBA services can help you leverage TiDB Index Advisor's recommendations and go even further. Ready to unlock the full potential of your TiDB database? Contact us today to learn more about our TiDB Consulting and remote DBA services!

{{cta}}

No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.