Share

Practical Approach for Teradata SQL Tuning: Finding the Root Cause of Performance Issues

Practical Approach for Teradata SQL Tuning: Finding the Root Cause of Performance Issues
sql3

To optimize Teradata SQL performance, it is crucial to identify the root cause of any issues. The SQL statement itself is typically not the culprit but rather one or more stages of the execution plan.

This article does not cover genuine optimization techniques. Instead, it presents a pragmatic method for query optimization. I have noticed that people often rely on a trial and error approach, which may result in a successful outcome through sheer luck, but without a clear understanding of the solution.

We must adopt a detective-like approach to the issue, identifying the incorrect steps in the execution plan and determining their root causes. Isolated inspection is the appropriate method in this case.


Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Early subscribers keep launch access before the paid plan launches.

Get the next issue


We must adopt a detective-like approach to the

SQL performance issues can be attributed to two primary factors: uneven distribution of rows, also known as skew or an inefficient workload. Typically, these problems stem from missing or outdated statistics.

The checklist:
  • Take a snapshot of your SQL statement's EXPLAIN plan, showing any changes in the execution plan caused by your optimization activities.
  • Check that your statistics are complete and up to date. Do this even before taking a closer look at the execution plan. Add missing statistics and update the old ones. Watch out for steps with low or no confidence, but don't assume you can consistently achieve high confidence levels. Certain conditions prevent high confidence (OR conditions, for example).
  • If statistics must be refreshed or added, run another EXPLAIN, and examine whether the execution plan changed. If the optimizer has a new execution plan, rerun your query. The new execution plan could have solved your performance issue.
  • If your performance problem did not resolve, it is time to look at each step of the execution plan. While you could start with static analysis on the EXPLAIN output, I prefer to start with real-time monitoring as it is usually the less time-consuming approach. Any real-time monitoring tool (like Viewpoint, PMON, DBMON) gives you the essential information needed.
  • Execute your SQL statement and watch each step in real time.
  • Wasteful workload: Look for discrepancies between estimated rows and rows returned in each step. This will mislead the optimizer, causing it to choose the wrong data preparation (copy, rehash, etc.) and join strategies (product join instead of merge join). If this is the case and all your statistics are fine, you need to consider possible solutions like temporary tables, advanced indexing, partitioning, query rewriting, etc.
  • Skewed data: If you detect a step in the execution plan, which is skewed, your base tables have a well-distributing primary index, and the statistics are excellent. A skewed intermediate spool probably causes the problem. Skewed spool happens for several reasons, such as two joined tables being redistributed to a common ROWHASH with only a few distinct values. There is no secret to solving such a problem. Suppose the goal is to avoid the skewed spool. In that case, you should consider all optimization techniques that can help you, like breaking up queries into smaller parts and splitting the query into two parts, one handling the skewed values and the other handling the un-skewed values.
  • Repeat the above approach until no wrong steps are left. Don't forget to set statistics after each impacting change on the SQL statement.

To summarize, optimization consists of the following steps:

Detect the wrong steps in the execution plan, and figure out what is done wrong by the optimizer (bad join, etc.) and why it is wrong.
When you know the reason, you can browse your optimization toolkit for the appropriate tool and use your creativity: Many roads lead to Rome.


Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe before the paid plan launches and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.

Subscribe to DWHPro Letters

Practical field notes on enterprise data engineering, production AI systems, platform migration, and the senior engineering market.
Written by Roland Wenzlofsky Founder of DWHPro Author of Teradata Query Performance Tuning
Get the next issue
Subscribe