Share

How to Avoid Full Table Scans When Using LIKE Operators in Teradata Indexing

Learn how to avoid resource-killing full table scans on extensive tables with Teradata Indexing. Discover a fantastic trick to optimize query performance.

How to Avoid Full Table Scans When Using LIKE Operators in Teradata Indexing
sql2

When Teradata Indexing fails

Selecting rows from extensive tables with billions of entries can be resource-intensive. While it may not cause significant strain on resources when accessing rows directly through primary indices or unique secondary indices (USI), using a "LIKE" operator to select rows can be extremely resource-intensive.

SELECT * FROM wide_table WHERE column_B LIKE '%A%';

Typically, the above query is resolved through a full table scan (FTS), without any additional measures.

Using a NUSI or single table join index (STJI) can minimize disk IO and decrease query runtime.


Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Early subscribers keep launch access before the paid plan launches.

Get the next issue


Using an STJI may be viable; however, persuading the optimizer to replace the FTS on the base table with an FTS on a NUSI sub-table could prove challenging. The inadequate statistics (whether collected or sampled) may not adequately support index usage. Typically, when the search parameter begins with "value%", the optimizer may opt for a NUSI traversal instead of "%value%". This preference stems from the storage manner of statistic histograms in Teradata.

Get the next issue by email.

Note that the use of LIKE operators prevents a binary search in NUSI data blocks due to sorting the row pointer array by ROWHASH of indexed columns, which LIKE operators cannot hash. This limits access to a full table scan of the NUSI sub-table, which is only feasible when the number of NUSI data blocks to be read is significantly lower than the number of base table data blocks accessed with an FTS.

The Solution without Teradata Indexing

Fortunately, there is a great technique to prevent a complete table scan on a broad table.

We generate a condensed version of the big table, comprising solely the primary key columns and those essential for fulfilling the LIKE operator's requirements. To employ this technique, it is crucial that the primary key and primary index of the table match.

We will join our auxiliary table with the wide table instead of selecting directly from the latter.

SELECT t01.*FROM     wide_table t01INNER JOIN    auxiliary_table t02 ON   t01.primary_key = t02.primary_key WHERE   t02.column_A LIKE  '%A%';

If all necessary statistics are available for the query, the optimizer will generate an execution plan that is less demanding on resources than the original plan.

Performing a complete scan of the smaller auxiliary table will isolate only the rows that fulfill the LIKE operator and transfer them to the spool. Since the auxiliary table is significantly smaller, there will be a significant decrease in IO.

The second step involves a swift and AMP-local primary index merge join of the spooled records with the wide table.

The above solution was effectively executed in multiple telecommunication projects, which involved carrying out daily queries using LIKE operators on massive call detail record (CDR) tables. Typically, the generation of auxiliary tables is deferred until the nightly batch load window.


Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe before the paid plan launches and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.

Subscribe to DWHPro Letters

Practical field notes on enterprise data engineering, production AI systems, platform migration, and the senior engineering market.
Written by Roland Wenzlofsky Founder of DWHPro Author of Teradata Query Performance Tuning
Get the next issue
Subscribe