Teradata Statistics Collection for Improved Query Performance
Learn how to optimize your Teradata queries with statistics collection. Discover what to collect and what to omit for accurate query estimation.
Teradata Collect Statistics - The Decision
Assuming that all Teradata physical data model and table design issues have been resolved, it is expected that your tables can be regularly utilized.
Inform Teradata of a table's expected properties before executing an SQL statement, with a clear directive to utilize the information. Teradata can more accurately estimate the path through a query by doing so.
Statistics are a potent and costless tool for accomplishing this.
Want more practical data engineering analysis like this?
Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Launch access is open. The single paid plan will be EUR 49 per year.
The Scope of the Teradata Statistics Collection
Statistics can be collected on a single column or a combination of columns.
If you work with enterprise data platforms, migrations, performance tuning, or AI-driven delivery teams, DWHPro Letters is written for you. Get the next issue by email.
The table's physical design influences what you gather; otherwise, it should be based on how it is utilized.
Always include the following as initial collection choices:
- on the dummy column PARTITION, irrespective of whether your table is partitioned or not
- on the Primary Index columns (in combination)
- on the Secondary Index columns (in combination)
- on the Columns used for partitioning
The second usage-driven set of collect options consists of:
- Join columns: columns typically used to join the table to others.
- WHERE condition columns: columns frequently used to select data subsets from the table
Based on our experience, certain attributes should be excluded from the collection unless there is compelling evidence to include them.
- time and date columns in table landscapes of daily changes
- Code, type, or flag columns hardly used in follow-up queries
Planning or surviving an enterprise data platform migration?
I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.
Subscribe before the paid plan launches and keep launch access.
Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.