Share

Teradata DATABLOCKSIZE: How to Minimize Disk I/Os and Optimize Workload Requirements

Learn how to minimize disk IOs with the Teradata DatablockSize option. Read on for an overview of how it works and how to choose the best size.

Teradata DATABLOCKSIZE: How to Minimize Disk I/Os and Optimize Workload Requirements
admin4

What is the Teradata DATABLOCKSIZE Option?

Most of us are familiar with the Teradata option DATABLOCKSIZE, included in the CREATE TABLE statement. However, only a few know how to use it to reduce disk I/Os. This article provides a brief overview of how to use this option effectively.

Teradata stores data in data blocks consisting of multiple sectors on a disk cylinder.

Normally, a data block contains rows from a single table only, with no intermixing of rows from different tables. As such, the block header of each data block includes a table ID that identifies the table to which the block belongs.


Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Early subscribers keep launch access before the paid plan launches.

Get the next issue


Columnar storage was introduced in Teradata 14.10, allowing the values of one or more columns from multiple rows to be compressed into a single data block.

Get the next issue by email.

The DATABLOCKSIZE option specifies the number of sectors composing a data block, each with a default size of 512 bytes. Teradata appends an additional 512-byte sector to the data block when it reaches full capacity, continuing until the maximum block size is reached.

Teradata will split a data block into two blocks when it reaches the maximum block size.

What does this mean in practice? The same number of rows can be stored in a few large data blocks or in many smaller ones. Inserting rows into smaller data blocks may lead to more frequent block splits.

The optimal data block size depends on your workload requirements.

Tables accessed through full table scans should have a larger data block size, as this allows multiple rows to be transferred efficiently into AMP memory.

If your table has many distinct index accesses retrieving only a few rows at a time, it is advisable to use smaller data blocks to avoid unnecessarily loading large amounts of data into AMP memory during retrieval.


Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe before the paid plan launches and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.

Subscribe to DWHPro Letters

Practical field notes on enterprise data engineering, production AI systems, platform migration, and the senior engineering market.
Written by Roland Wenzlofsky Founder of DWHPro Author of Teradata Query Performance Tuning
Get the next issue
Subscribe