Share

Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts

Learn how the Teradata CLOB datatype can improve query performance and reduce resource consumption for storing longer texts in your database.

Improve Query Performance by Using the Teradata CLOB Datatype for Storing Longer Texts
sql1

Longer texts, such as descriptions of other columns, are stored in VARCHAR() columns. The maximum character capacity for storage is 64,000 for the LATIN character set and 32,000 for the UNICODE character set. When considering the impact of longer texts on query performance, it is important to note that the number of rows per data block decreases with longer text inputs, resulting in more IO-intensive full table scans.

The Teradata CLOB Datatype

CLOBs are a superior option for storing long strings as they are stored in a distinct sub-table. By separating CLOBs from the main table, queries accessing the table with a full table scan can be significantly enhanced. This is because Teradata reads fewer data blocks when the CLOB column is not selected, resulting in fewer IOs.

Below is an example table with detailed descriptions in each row's "Desc" column.


Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Early subscribers keep launch access before the paid plan launches.

Get the next issue


CREATE TABLE Varchar_Table
(
ID INTEGER NOT NULL,
COl1 CHAR(01),
Desc VARCHAR(30000)
) PRIMARY INDEX (ID);

The absence of a secondary index requires copying all data blocks from the disk to the FSG cache, regardless of our interest in only the "ID" column content. Given that all rows contain extensive textual information in the "Desc" column, this results in a substantial number of IOs.

SELECT ID FROM Varchar_Table WHERE COl1 = 'X' ;

Changing the data type from VARCHAR() to CLOB can significantly reduce resource consumption by allowing more rows to be stored per data block and requiring less disk transfer.

CREATE TABLE CLOB_Table
(
ID INTEGER NOT NULL,
COl1 CHAR(01),
Desc CHARACTER LARGE OBJECT
) PRIMARY INDEX (ID);
SELECT ID FROM CLOB_Table WHERE COl1 = 'X'; -- fewer blocks have to be moved from disk to FSG cache

Selecting a CLOB column does not confer any performance advantage. Instead, both the base table and the sub-table containing the CLOB data must be read.

SELECT ID,Desc FROM CLOB_Table WHERE COl1 = 'X' ;

Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe before the paid plan launches and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.

Subscribe to DWHPro Letters

Practical field notes on enterprise data engineering, production AI systems, platform migration, and the senior engineering market.
Written by Roland Wenzlofsky Founder of DWHPro Author of Teradata Query Performance Tuning
Get the next issue
Subscribe