Share

Teradata Explain Statement: A Guide to Optimizing SQL Performance

Teradata Explain Statement: A Guide to Optimizing SQL Performance
sql4

What is the Teradata Explain Statement?

Optimizing performance can be challenging, often requiring significant effort to identify the root issues. Thankfully, Teradata provides the EXPLAIN statement as a robust tool for assessing the execution of SQL statements. In Teradata, SQL statements are processed in a series of distinct tasks that correspond to specific components of the statement, such as retrieving rows, joining tables, or aggregating results. By utilizing our knowledge of how Teradata handles these tasks and scrutinizing the execution plan furnished by EXPLAIN, we can swiftly pinpoint weak points within a query and utilize this data to enhance performance.

The EXPLAIN statement can be intimidating for novices. At first, the result may seem disorderly and incomprehensible. Nonetheless, deciphering the steps is easier than it seems. We must evaluate each step's effect on performance and whether Teradata has made ideal selections, such as choosing an inappropriate join strategy.

Our analysis will entail a thorough evaluation of every Execution Plan element, identification of crucial operations, and support in identifying potential issues.


Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Early subscribers keep launch access before the paid plan launches.

Get the next issue


The Execution Plan comprises retrieval, joining, aggregation, and other tasks. Nonetheless, these three tasks make up 95% of all steps; therefore, attention will be paid to them. In brief, the Execution Plan entails reading, joining, and aggregating, ultimately creating a final result set. Each step depends on the preceding one's output.

Decoding the Plan Steps

Teradata's blocks correspond to distinct operations, including fetching rows, joining tables, and conducting aggregations. In addition to the operation category, we collect information on the number of AMPs involved, including All-AMPs, Single-AMP, or Group-AMP.

Each block's EXPLAIN output will show something akin to:

AMPs retrieve, and single AMP retrieve steps, AMPs join, and single AMP join steps, and an AMPs step for aggregation.

By identifying the various types of operations, we can analyze and obtain detailed information about each one.

Prior to each transaction, certain preparation steps are necessary, such as redistributing and sorting the rows. Sorting is essential to utilize the most efficient search algorithm, specifically the binary search.

Join Preparation

Redistribution is required as the AMPs hold the rows exclusively for joining. Since the AMPs are separate processes and lack access to other AMPs' disks, the two tables' rows to be joined must reside on the same AMP. To accomplish this, the easiest method is rehashing one or both tables in question.

Get the next issue by email.

The EXPLAIN result for row movement might show something like:

duplicated on all AMPs, redistributed by the hash code of (new hash column(s)), redistributed by rowkey, etc.

The EXPLAIN output's sorting section will show something akin to this:

Sort according to hash code, order by row hash, partition by rowkey, etc.

Row retrieval strategy

A vital aspect of all operations is retrieving rows from disks. The Teradata Database offers various access paths to obtain data, such as full table scans, primary and secondary index access, and more.

The EXPLAIN output for row retrieval may contain the following information:

by way of an all-row scan, by way of a rowhash match scan, by way of the primary index, by way of the hash value, etc.

Join Type

The EXPLAIN output will provide information on the selected join strategy if the operation is a join.

using a product join, a single partition hash join, a merge join, a rowkey-based merge join, etc.

Confidence

Each block denotes the level of confidence in its corresponding rows. Although desirable, attaining a high level of confidence is not always possible.

The EXPLAIN output will display a similar format:

high confidence, low confidence, no confidence, and join index confidence

This article explains the essential elements of an EXPLAIN statement, delivering about 80% of the relevant data present in an explanation output.


Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe before the paid plan launches and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.

Subscribe to DWHPro Letters

Practical field notes on enterprise data engineering, production AI systems, platform migration, and the senior engineering market.
Written by Roland Wenzlofsky Founder of DWHPro Author of Teradata Query Performance Tuning
Get the next issue
Subscribe