Stop disqualifying yourself: how to decode a job posting
Something is broken in how roles are advertised, and it is worth naming plainly, because most of you are blaming the wrong party for it. When you read a posting and feel that you d
Browse every published letter by topic and date. This archive shows metadata and excerpts only; Ghost still controls member-only and paid access when you open a post.
If you have spent twenty years writing tests for data pipelines, you carry a mental model of what a passing test means. The input is fixed. The function is deterministic. The outpu
"All models are wrong, but some are useful." George Box A short note before the piece. Starting next week, this newsletter will run on a steady rhythm: a short, practical piece ev
"The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and relearn." Alvin Toffler Open any job board, scroll any Li
How to think about the choice when the standard path no longer reliably works. Foreword The path that worked through most of the previous decade — pick any reasonable engineerin
Banks are cutting junior data engineers today. The bill comes due in 2035. The 2026 layoff numbers are real. Tens of thousands of tech workers lost their jobs in early 2026, with
Three default moves senior technical practitioners make when rates begin to compress, and why each makes the underlying position worse.
Every enterprise AI strategy deck I have seen in the past few years contains the same promise: “We will build a RAG-based knowledge assistant…
How corporate cost-cutting is creating the exact problem AI was supposed to solve.
Z-ordering and Liquid Clustering both aim to improve Databricks query performance through data skipping. But when your data is skewed, one of them quietly becomes useless. A visual explanation of why — and how the Hilbert curve changes everything.
“The spreadsheet is the most dangerous piece of software ever created.” — Daniel Lemire
I have spent over twenty years building and optimising enterprise data warehouses, and in that time, I have watched the industry cycle…
There is a question that is almost never asked in the data platform debate, and the reason it is not asked is that nobody benefits from the…
A vendor-neutral look at the architectural trade-offs that actually matter
When Microsoft launched Fabric, it promised to unify analytics under one roof. Databricks, meanwhile, has been building its lakehouse…
If you have spent any amount of time working with Teradata, you know that the Primary Index is one of the most important design decisions you make. It determines how data is distri
In my last post, I described a time when five people could build and maintain an entire data warehouse. That era ended not because the…
Chapter 1: When Data Warehousing Was Still Fun
Somewhere around 2020, the data world quietly arrived at a conclusion that Teradata engineers could have told you in 1984: SQL on a massively parallel architecture is a pretty good
If you come from Teradata, Oracle, or SQL Server, the first thing you learn about Snowflake stops you cold: there are no indexes on…
Every data platform eventually faces the same question: how do you serve both the analyst running a full-table scan across billions of rows…
Snowflake in 2026 is not the platform you onboarded onto. The question is whether your team has noticed.
Migration success stories are everywhere. A quick search reveals case studies of companies that moved from Teradata to Snowflake and…
Why data warehouse professionals have been doing “bronze, silver, gold” for over 20 years.
What Twenty Years of Teradata Expertise Won’t Prepare You For
Introducing DWHPro Query Master: A Modern Web-Based Alternative to Teradata Studio As Teradata professionals, we've all been there: waiting for Teradata Studio to launch, dealing
Database features should be compared based on their documented behavior, their operational impact, and the architectural principles behind them. This applies especially to physical
How to Track Query and User Activity After Migrating from Teradata to Snowflake By Roland Wenzlofsky · DWHPro 🎯 Why This Matters In Teradata, QUERY_BAND has long been a hidden
One of the smallest but most persistent sources of data mismatches during a Teradata-to-Snowflake migration comes from an unassuming function: ROUND(). While both systems follow A
1. The Forgotten Performance Trick: A Semicolon That Saves Time For decades, Teradata developers have quietly used one of the smallest but most powerful performance optimizations
Performance degradation caused by uneven workload distribution is one of the oldest and most persistent challenges in parallel data warehouse systems. Both Teradata and Snowflake c
At first glance, an UPDATE looks universal. In reality, it’s one of the most misleading similarities between Teradata and Snowflake. The SQL is the same, but the storage, logging,
For many years, Teradata was the undisputed leader in large-scale data warehousing. Banks, insurers, and telcos built their most critical systems on it. Today, the market is very d
When migrating analytical workloads from Teradata to Snowflake, one subtle but important performance factor often gets overlooked: how the two systems handle GROUP BY operations on
Big files, real risks, and how not to overspend On Snowflake, COPY INTO scales with the number of files, not the total GB. A single big file equals one unit of work. To go fast a
Snowflake’s physical join execution is predominantly hash-based. In practice you’ll observe hash-join variants with two distributions: * Broadcast hash joins — replicate a small
I was recently approached to support a case of importing the results of a Teradata query into a third-party vendor database. On the export side, Teradata happily wrote close to 60
Introduction to the Teradata AMP Worker Task The Teradata AMP Worker Task or AWT is the heart of the AMP, responsible for executing tasks and ensuring the smooth functioning of th
Introduction to Teradata Performance and NOT NULL Welcome to our latest Teradata performance blog post, a series designed to provide valuable insights into SQL queries. This artic
Learn how to optimize your Teradata SQL performance by leveraging secondary indexes! Avoid full table scans by bypassing the COALESCE function in the WHERE clause.
Learn how to overcome the hurdle of using Teradata Identity columns with Volatile Tables. Discover a workaround using the 'CREATE TABLE AS' statement.
Running out of free Cylinders in Teradata Encountering a situation where free cylinders are exhausted is a significant concern when managing a system, and no more Teradata Space i
Introduction In a row-oriented database engine like Teradata, data is organized and stored in units called data blocks. Each data block features a fixed header and accommodates mu
The Teradata flavor of SQL is still, in principle, a declarative language. Hence, there can be multiple ways to describe an SQL query and achieve the same result. While the answer
Occasionally, it is necessary to utilize a cursor within a stored procedure to execute specific functionality. I recently encountered a stored procedure that contained a loop with
Learn how to easily load a CSV file into a Teradata database using Python with minimal code. No need for complex scripts or ETL tools. Read more.
Functions on Join Columns and Their Impact on Teradata Performance In many Teradata systems, developers apply functions directly in join conditions to work around data-model incon
Introduction Processing and analyzing large volumes of data quickly and efficiently is essential in today's data-driven world. Apache Spark, an open-source big data processing eng
Teradata NOS facilitates querying data in an S3 object store with ease. To attain maximum performance, partitioning external data is crucial for efficient reading. This article out
What is the Teradata TD_WhichMax Function? Teradata releases new SQL features with each update, providing enhanced functionality and improved performance with reduced resource usa
In a previous article, Roland explained how to use recursion to build a data range without touching a physical table. This time, we explore an alternative approach that avoids recu
Introduction to Teradata Compression Note: Teradata Block Level Compression is now permanently enabled and cannot be turned off. Nonetheless, this article remains useful for curre
Splitting column content can be a challenging task. Teradata provides multiple methods to achieve this, each with unique pros and cons. In this article, we will examine these techn
This article compares the index types in SQL Server and Teradata. It can benefit those transitioning between the two platforms to understand their distinctions and overlaps, despit
Introduction to Teradata Multiload and Fastload What is the Teradata Multiload? Teradata Multiload is a utility that supports loading data (INSERT) and other DML statements (UPDAT
Teradata NUSI Selectivity and Data-Block Density The goal of this article is to show how the number of rows per base-table data block impacts the selectivity threshold for Non-Uni
The Teradata Tutorial to support the selection of indexes How much space does the Teradata index occupy? An index can take up a surprising amount of space, especially if the Teradata s
Teradata Statistics Histograms - A Short Introduction Many are familiar with the Optimizer's statistical confidence levels. I was recently surprised to discover that a "high confi
Hadoop is a buzzword in the world of big data, but its actual value can be concealed by the hype. This article compares Teradata and Hadoop Data Warehousing, highlighting the advantages of leveraging Hadoop's scalability and preprocessing capabilities to improve Teradata's performance. However, the
Have you ever experienced extended waiting times for a table comparison to yield results? Have you ever been compelled to halt and defer quality checks on sizable tables owing to e
Typically, query tuning involves altering the composition of various objects. An alternative method for achieving quicker results, in cases where modifying SQL is not feasible or
Introduction to Teradata Data Types Further Data Type considerations require occasional attention to additional issues. Happenstance Nullability Strive for nullability settings
Have you encountered a poorly designed physical data model where object columns are distributed randomly across tables, and you wish to unify them in their rightful place? Merging
Collect Statistics in Teradata - The Evaluation After collecting every combination considered necessary and helpful, you can check the result of the collected statistics on a tabl
Introduction Teradata offers several methods for conducting joins, but all necessitate one prerequisite. The paired table rows must reside on identical AMPs. The chosen method f
The Art of Teradata Performance Tuning As a Teradata Performance Tuner, technical expertise and experience are essential, occasionally accompanied by fortuitous circumstances. I'
A recent question came in about how the Teradata Optimizer uses multi-column statistics. Here are the essential details: The Optimizer uses multi-column statistics when the query
The Teradata Express Edition is a great opportunity to experiment with a fully functional Teradata system. It can be downloaded from https://downloads.teradata.com/download/databas
Here is an illustration depicting the design of real-world map-reduce implementations, such as Hadoop: The input files reside in a distributed file system, such as HDFS for Hadoop
Learn about the differences between GROUP BY and DISTINCT for aggregations in Teradata. The optimizer now selects the appropriate algorithm depending on data demographics. Understand when to use each method to optimize query performance.
Learn about locking in Teradata to ensure data integrity and consistency. Teradata automatically selects the best lock for each situation to prevent data inconsistencies.
Learn about the goals of SQL tuning and how to optimize database performance by reducing resource usage. Skew, IOs, and CPU seconds are key metrics. Discover how to ensure completeness and correctness of Teradata statistics, detect missing and stale statistics, and improve query plans.
Learn about Query Logging with Teradata DBQL Tables, a powerful feature for workload analysis and performance tuning. Configure settings and select which key figures to store and their level of detail. The article covers how to implement and activate DBQL tables, determine which information to colle
Introduction Teradata efficiently manages complex workflows by distributing and expanding processes across numerous AMPs. However, when an AMP's maximum capacity is reached, it ca
Learn about Teradata's Parallel Transporter Utility (TPT), the all-in-one tool that combines Fastload, Multiload, TPUMP, BTEQ, and Fastexport functionalities. Discover the benefits of TPT's consistent syntax and parallelism, as well as a comprehensive overview of its operators.
In this article, we will delve into the world of Teradata set operators, exploring their functionality, applications for attaining peak performance, and the optimization techniques
Add Snowflake, BigQuery, and Redshift to your Knowledge As a passionate Teradata expert, I have fine-tuned thousands of SQL queries throughout my professional journey. It was an e
1. Statistics In Teradata, understanding and managing statistics is essential for optimizing database performance. Statistics provide the optimizer with precise data about stored
What are Deadlocks in Teradata? Deadlocks arise when two transactions hold locks on database objects required by the other transaction. Here is an example of a deadlock: Transac
The Teradata Parallel Transporter (TPT) is a Teradata Tools and Utilities (TTU) product. Teradata TPT offers under one roof an SQL-like scripting language that simplifies the synta
Table cloning is required for purposes such as data backup, testing, and replication. Teradata, a leading data warehousing platform, provides an efficient method for cloning tables
Introduction Relational databases are essential for contemporary data management and analysis. Joining tables, which merges data from two or more tables based on a shared column o
Introduction Teradata uses various mechanisms, such as hash maps, master and cylinder indexes, and binary and sequential search algorithms, to locate table rows. This article expl
This article outlines the procedure for migrating the Teradata Express Edition image from VMware to a level 1 hypervisor. Teradata provides the Express Edition for download on dif
Teradata ORDER BY and Performance To employ a basic sorting algorithm, all rows must be present in one location for sorting. However, this is not feasible in Teradata, where numer
Teradata employs various join methods and techniques to merge the rows of two tables onto a single AMP, which is essential for joining. The combination of join technique and data g
Teradata MERGE INTO vs. UPDATE This article compares the UPDATE statement to the MERGE INTO statement, analyzing their respective performance differences and limitations. The Ter
Teradata has incorporated several analytical functions from Teradata Aster in its recent releases. Although many of these functions are tailored for web click analysis, Teradata An
Teradata table skew is a common issue encountered while working with the Teradata database. If you're reading this page, you may have experienced this problem. Common knowledge
A normalized data model can increase the complexity of creating performant queries due to the higher number of tables that must be linked compared to a denormalized data model. It
The latest generation of Teradata systems always has Block Level Compression (BLC) enabled. When using MultiValue compression, the compression factor is typically low. How Block L
Introduction This blog post discusses tactical workloads on a Teradata system. Despite Teradata's implementation of features that support tactical workloads, this workload categor
This article assumes prior basic knowledge of Python and demonstrates the easy process of loading data using Python and the Teradata SQL Driver for Python with Fastload. If you ar
How to find out if the Teradata Statistics we created for a specific workload are used? Teradata statistics greatly affect SQL query efficiency. We need a reliable method to get
Problem-solving without a Teradata Recursive Query To depict a corporate hierarchy in our instance, we can employ a non-recursive approach, illustrated by the query presented belo
Teradata is commonly used for tactical workloads and OLTP applications in my projects. However, it is crucial to avoid designing databases carelessly. Teradata excels as a database
This article illustrates loading a flat file into an empty Teradata table using TPT. The example was successfully tested on Teradata 16.20. Although TPT does offer a wide range of
1. Complete and up-to-date Statistics At the start of Teradata SQL Tuning, statistics are a vital concern. The Teradata Optimizer employs statistics to formulate the optimal execu
This article will highlight the contrasting architectural features of Teradata and SQL Server, along with their shared aspects and performance impacts. The piece titled "Teradata v
What is Teradata Query Rewriting? Teradata query rewriting is an integral component of the optimization process for Teradata. The optimizer replaces your query with a more effici
Introduction to Teradata Dynamic AMP Sampling Teradata calculates dynamic AMP samples for indexed columns (PI, USI, NUSI) at runtime without requiring statistics. These samples pr
It's time to share a new Teradata SQL tuning case study that showcases the impressive impact of query rewriting on performance. We are studying the query below that originally too
Teradata and Redshift share similar architectures and data distribution methods. Teradata's AMPs store portions of table data, while Redshift utilizes slices. There are notable di
Teradata employs two distinct approaches in Ordered Analytic Functions for preparing the data layout necessary for processing. This article explains both approaches and their respe
What is Teradata Join Estimation? This article demonstrates the functioning of Teradata Join Estimation in the absence of statistics. It presents the heuristics employed to estima
Learn how to use the Teradata OVERLAPS command to detect overlapping time periods with this example query. Caution: minimum overlap required.
1. Outline This showcase demonstrates optimizing statements with multiple JOINs using Teradata Optimizer's tuning approach. The approach efficiently determines the best JOIN strat
Learn how to optimize LIKE operator queries in Teradata to avoid full table scans and improve performance. Read on for expert tips and tricks.
Learn about the benefits of compression in Teradata Columnar, including a reduction in permanent space and disk IOs. Different compression methods are used, including run-length, dictionary, and delta compression. This article explains how each method works and the advantages of using them.
When designing tables for Teradata, it is important to distribute the rows across all AMPs in the system evenly. For instance, on a 100-AMP system with 100,000 rows, the objective
Learn how to optimize load times in Teradata by choosing the correct data types and sizes for stage tables. Read on for a real-life example.
Learn how Teradata 14.10 simplifies statistics collection for better performance. Discover new features that identify unused and missing statistics.
Learn how to optimize your Teradata queries with statistics collection. Discover what to collect and what to omit for accurate query estimation.
Introduction Parallel database architectures have undergone significant advancements over the past four decades, transitioning from shared memory architecture to shared disk archi
Introduction Discover the world of Teradata, an industry-leading data warehouse and analytics platform that offers unparalleled performance, scalability, and ease of use. This com
Teradata Load Times vs. Snowflake Load Times Elasticity is a crucial aspect of contemporary cloud databases like Snowflake that sets them apart from on-premise shared-nothing data
Situation and Setup of a Historisation in Teradata Vantage Imagine that the time has come to improve your physical data model. Flawed or obsolete Teradata Vantage historisation de
Teradata ALTER Table versus INSERT INTO Altering the structure of a substantial Teradata table can consume significant resources. Essentially, there are two approaches: altering t
Teradata introduced several new features, including one that caught our attention: object use counts (OUC). This feature optimizes the calculation of extrapolated statistics, impro
The initial situation without any index In this blog, I will demonstrate how to optimize a query using Teradata's tools. We will begin with the following test scenario: CREATE T
To comprehend the issue of Teradata hash collisions, I will briefly explain how rows are allocated. If you are unfamiliar with Teradata Architecture or require a refresher, I sugge
What is a Teradata Rollback? A rollback reverses an incomplete transaction. Transactions can be unfinished for various reasons, such as user termination or database restart. Rollbac
Why is Teradata I/O so crucial? Minimizing Teradata I/O is a crucial aspect of performance tuning. IOs involve transferring data from storage to main memory, which is essential fo
Customize Workload Management Workload management in Redshift means optimizing the queues. This includes the correct selection of queues, the assignment of queries, and the allocat
How Advanced Is The Teradata Columnar Solution? A Column Store is defined by three distinct properties: * Improved compression compared to Row Stores. This results from c
Teradata transitioned to the cloud to prevent disconnection from contemporary cloud-based databases.
Accessing data has always been the bottleneck of database systems. Once the data resides in the main memory, it can be promptly processed. Although SSDs are now predominantly used
Experienced Teradata users are familiar with the concept of a primary index. However, a new term has surfaced with the introduction of Teradata Release 15: Primary AMP Index. This
In this blog post, I will show you what you need to pay special attention to when switching from Teradata to Snowflake.
What Are Zero-Copy Clones? We are all familiar with creating table backups before performing specific operations. This is not a significant problem for smaller tables, as duplica
Introduction to Teradata in the Cloud Cloud databases are pressuring traditional data warehousing MPP systems. This blog post will illustrate the reasons for this and outline the
This post will contrast the architecture of two widely-used database systems. The similarities between Teradata and Amazon Redshift are notable, as much of your knowledge about Te
Isolation Levels and their Impact on Performance & Concurrency Isolation is a crucial factor in determining the visibility of transaction integrity to database users. This propert
What Are The Main Characteristics Of Transactions in Teradata? What is a transaction? Transactions ensure data integrity. All requests within a transaction must be successful. Oth
Despite implementing Multivalue Compression to minimize the size of our tables, we still require additional space. Shrinking A Teradata Table To A Minimum Size I once used a tri
The Idea Behind This Trick for Teradata UNION ALL What if you need to apply a UNION ALL operation to distinct columns within a single table? Typically, the process would involve:
This post aims to compile all crucial aspects to be considered while constructing a Teradata Data Warehouse, including the ETL process and SQL queries. This list is just the begin
The optimizer has various access paths to consider when creating the execution plan, each with unique advantages and disadvantages. This article will introduce the 8 most essential
Do you work in an environment where a Teradata compression tool is installed that conforms to your satisfaction? If not, I assume you are probably not very pleased with having to
How does Teradata handle Skew in Joins? The imbalanced workload is a primary issue that must be addressed in performance optimization. Efficient optimization is crucial for join
How Do I Select The Appropriate Data Type In Teradata? Converting datatypes incurs substantial costs and demands significant CPU resources when dealing with extensive tables. Inc
What is a Teradata Macro? The Teradata macro executes multiple SQL queries in a single transaction, ensuring that all requests are successful or changes to database objects are ro
What is the Waterfall Model? The waterfall model facilitates the sequential progression of a data warehouse project. Each phase must be concluded before the subsequent stage comme
Introduction to Teradata Sample Statistics Discover the optimal utilization of Teradata Sample Statistics, including when, how, and why to implement them. What is Teradata Sampl
Introduction to Teradata Referential Integrity What is Teradata Referential Integrity? Referential integrity ensures that a value created in a foreign key (FK) column of a referen
Teradata Sessions explained in a few Sentences What is a Teradata session? A session is a LOGON of a user on a Teradata system through which one or more sequential transactions ca
To exemplify the impact of mistakes in Teradata Data Warehouse projects, consider the analogy of a medical team. Imagine yourself as the project, preparing for a crucial and costly
Teradata Table Design Checklist As creating tables is a frequent task, I deemed it necessary to provide a checklist. * Primary Index or NOPI? * Is the Primary Index distributin
Teradata Query Parallelism A query on a Teradata system runs in parallel at every step, whether for joining, sorting, or aggregating data. Teradata's uniqueness lies in its abili
What are Teradata Surrogate Keys? A Teradata Surrogate Key is an artificial key that maps to a natural key. It is usually of the data type INTEGER or BIGINT and is represented by
Method 1, avoiding any helper tables or functions which are only available in new releases of Teradata: WITH RECURSIVE NumberRanges(TheNumber,TheString) AS ( SELECT 1 AS TheNumber
What is Partial Group By? Joins are costly. Before the introduction of PARTIAL GROUP BY, the join would be performed first, and then the aggregated result would follow. PARTIAL G
Have you observed the different behavior of the Teradata LIKE operator when applied to a CHAR or VARCHAR data type column? Consider the following table as an illustration: Creatin
Teradata SQL and Advanced GROUPING Functions What advanced GROUPING methods are there in Teradata SQL? GROUP BY GROUPING SETS, GROUP BY ROLLUP, and GROUP BY CUBE. What is GROUP B
Improve the Performance of your Teradata System Executing the following query may reveal that a small number of queries are responsible for consuming a significant portion of the
There are some important considerations to keep in mind when upgrading to Teradata 16.20. After upgrading to Teradata 16.20, we noticed that certain reporting queries produced inc
Executive summary. Guest post by Artemiy Kozyr. Today I shed some light on how Data Warehousing lies at the core of Retail Banking operations. We will see the actual case of vital ma
Become a Teradata Vantage expert by downloading our new Android app for free from the Google Play store and practicing.
Multivalue compression and null indication serve distinct purposes yet share a common objective of storing information about the values of each row's columns. To conserve space, Te
The Optimizer typically excels in utilizing statistics, yet examining the execution plan and cardinality estimations can sometimes be beneficial. Since Teradata 14.10, I have habi
When do you need a fire brigade most? When your house is on fire! When is the best time to establish the fire brigade? No, not when your house is on fire, but before the city the
Learn about Teradata's priority scheduler and how resources are allocated across workloads and requests in this detailed guide on workload management.
All-AMP and single-AMP operations are well-understood. All-AMP retrieval typically involves full table scans, including NUSI sub-table scans. On the other hand, single-AMP retrieva
This course focuses on Teradata indexing techniques. In this two-hour video, you will learn about Teradata's architecture and Full Table Scans, different types of Primary Index Ac
Introduction Teradata SQL lacks a built-in function to determine the ASCII code of a character. But fear not, for there exists a query workaround to this dilemma. This blog post w
As any experienced Teradata DBA will tell you, some tables in the Teradata dictionary (DBC) need housekeeping. But some customer sites that I have worked with ignore table DBC.Acce
Learn about Teradata Heuristics and Optimizer's approach to estimating cardinality for non-indexed columns in this article. Discover how skew improves estimations and why heuristics should be avoided.
Learn how Teradata stores dates internally as INTEGER values and how to efficiently calculate dates after 1900-01-01 using a simple formula. Get more useful Teradata date calculations in this article.
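As a rough illustration of the integer encoding mentioned in this excerpt, here is a minimal Python sketch. It assumes the commonly documented Teradata formula `(year - 1900) * 10000 + month * 100 + day`. The function names `to_teradata_int` and `from_teradata_int` are hypothetical helpers, not part of any Teradata driver, and the sketch only handles dates on or after 1900-01-01.

```python
from datetime import date


def to_teradata_int(d: date) -> int:
    """Encode a date the way Teradata stores it internally (assumed formula)."""
    if d < date(1900, 1, 1):
        # Earlier dates yield negative values; this sketch does not handle them.
        raise ValueError("sketch only covers dates on or after 1900-01-01")
    return (d.year - 1900) * 10000 + d.month * 100 + d.day


def from_teradata_int(n: int) -> date:
    """Decode the internal integer back into a calendar date."""
    if n < 101:
        # 101 corresponds to 1900-01-01, the smallest value this sketch accepts.
        raise ValueError("sketch only covers dates on or after 1900-01-01")
    year = n // 10000 + 1900
    month = (n % 10000) // 100
    day = n % 100
    return date(year, month, day)


if __name__ == "__main__":
    encoded = to_teradata_int(date(2024, 3, 15))
    print(encoded)                      # 124 * 10000 + 3 * 100 + 15
    print(from_teradata_int(encoded))   # round-trips to the same date
```

The round trip makes the layout visible: the last two digits are the day, the two before them the month, and the remainder is the year offset from 1900.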
Learn why you should opt for DELETE instead of DROP when it comes to Teradata DDL. Avoid locking dictionary tables and conserve resources.
Teradata DBMS offers various compression options to save disk space and improve performance. However, implementing compression costs staff time. This article discusses the cost vs. benefit of compression and suggests carefully considering the compression setting when designing new or modif
I made organizing Stored Procedures into Chapters and Steps a principle. These separation points are marked by comments and variables used in performance and result logging. A st
Discover the importance of Teradata statistics in optimizing query execution plans. Learn how to collect statistics and when to use them for best results.
Learn how to analyze Teradata tables for Multivalue compression without costly operations by using the SHOW STATISTICS VALUES COLUMN statement. This method delivers excellent compression suggestions with less resource and time consumption than traditional approaches. However, it only works for colum
Introduction to Teradata Partial Redistribution and Partial Duplication Managing data skew is a critical aspect of optimizing Teradata performance. While table-level skew can typi
Turning on the Teradata Locking Logger by Vasudeva Rao To activate the locking logger, access the DBS console window or cnsterm subsystem and ensure it is enabled. Note that the c
Teradata UNION ALL - The New Features of Release 16 Teradata 16 introduces new features for UNION ALL queries that can enhance their performance, although their application is not
TO_NUMBER I have often needed a function to verify whether a character column contains a numeric value. Typically, I have resorted to using a workaround, such as: CASE WHEN UPPER
Teradata Big Data Blocks - Introduction Teradata 13.10 introduced larger file systems components, such as cylinders and data blocks. The cylinder size was increased from 1.9 MB to
As a Teradata SQL specialist, I find skew to be a common challenge. In fact, approximately 90% of SQL performance difficulties stem from skew, based on my experience. However, this
Learn an unusual trick to improve the performance of joining partitioned tables with different primary indexes or partitioning. Read on to find out more.
Learn about Teradata Utility Workload and its specialized and optimized user requests that are not SQL protocol-based. Explore its protocols and management techniques.
Learn about Teradata Workload and its benefits in this article. Workloads group requests with common characteristics, making it easier to manage them.
Teradata administrators worldwide face similar challenges. I propose creating a worldwide database containing the characteristics of Teradata systems. This will allow for easy com
Learn about Teradata Statistics Extrapolation, a technique that allows for reasonable estimations when querying historical data with stale statistics.
Get a Grip on Clumpy Row Distribution - the Teradata PPI Table. For this technique to work properly, the primary index in Teradata should not include the partition column(s). We
Learn how specialization in Teradata development can lead to the development of huge SQL statements and how to reduce the risk of bad performance.
Learn the truth about Teradata statistics and their impact on system performance. Address common misconceptions, follow best practices, and optimize your system for peak performance.
The Teradata Architecture Video Course. This video will give you an overview of the Teradata Shared Nothing Architecture. We will show you how Teradata distributes data evenly acro
A thorough understanding of the Teradata system's technical intricacies is crucial to attaining expertise in performance optimization. Although many of us are knowledgeable about e
Introduction to Teradata Recursions To link records from a primary table with a specific date range from a secondary table, a common requirement is to create a simulated history t
Welcome to part two of our Teradata performance optimization series. In this article, we will delve into how a good data model can prevent future performance issues in data warehousing projects. We will also discuss the drawbacks of taking over operational systems' source definitions directly and th
Learn how to use compression as a table column addition to manage your data warehouse's space effectively. Discover constraints and best practices.
This article will address some common questions about formatting date and time values in Teradata. Teradata is a powerful relational database management system (RDBMS) that provide
Teradata offers various options for designing Partitioned Primary Index Tables. The optimal design choice depends on the anticipated workload. This article examines the impact of
Learn about the Teradata Nested Join, the cheapest join method using indexes. We explore its prerequisites, variants, and performance.
What is the Teradata Sample Function? The SAMPLE function returns a set of randomly selected rows. Here are the characteristics: * Ask for a sample with an absolute number of ro
Fast Path INSERT/SELECT efficiently loads a single table. Requirements for Teradata Fast Path INSERT /SELECT * All tables involved have the same primary index. * The target ta
A hot standby node belongs to a clique of nodes and does not host virtual processes such as AMPs or Parsing Engines. The standby node remains idle until a member of its clique cra
Without Teradata CHECK Constraint When a non-indexed column is utilized in a retrieval operation (i.e. WHERE statement) and lacks Teradata Check Constraints, the Optimizer employs
Skewed Teradata Joins - The Initial Situation Consider the scenario where one table includes various currencies while the other comprises customers' accounts with their correspond
Learn how Teradata Integrated Workload Management assigns workloads to queries and offers features like filters and throttles to improve system performance.
Teradata 16.10 introduces Sparse Maps, a feature to place small table rows on one AMP. Learn more about how it can optimize your system here.
Learn about the Teradata row size limits and how to deal with the "3577 Row size or Sort Key size overflow" error in this informative article.
Teradata data access is categorized based on the number of AMPs involved and the technique employed to search rows within a data block. In our article about SQL Tuning, we mention
Teradata 16 introduces a new option for handling sets of rows combined with "UNION ALL" in views and derived tables, reducing resource usage. The optimizer can apply aggregation steps and join operations to each branch of the "UNION ALL" separately, resulting in smaller spool sizes and improved perf
Learn about heuristics in Teradata optimizer and how they estimate the number of selected data rows for nonindexed columns in WHERE condition predicates. This post analyzes several queries and their estimates based on heuristics and provides insights on how to replace heuristic estimations with more
Learn about referential integrity and how it ensures data consistency in databases. Discover soft referential integrity and its use in inner join elimination.
Learn how to optimize joins in Teradata with this case study. Discover how simple changes in query text can improve query performance.
Learn how biased value histograms store exact cardinalities and provide up-to-date statistics for queries, and witness how the Optimizer estimates the number of rows for a given query.
Learn how query rewriting can improve performance in row partition elimination. Follow the test setup to optimize the retrieval of rows from a table.
Natural skew may hinder join performance, but the PDPR capability in Teradata 14 can alleviate this problem. However, PDPR may not detect all potential applications if skewed value
Learn how to analyze workload in Teradata using the DBC.ResUsageSAWT table. This guide explores the essential columns and measures to consider.
Learn how express requests work in Teradata, bypassing stages like syntax checking and permission checking, and how Teradata 14.10 offers a new feature to expedite them.
Learn how to optimize Teradata performance with these general tuning ideas. From changing one parameter at a time to testing on the target system.
To optimize Teradata SQL performance, it is crucial to identify the root cause of any issues. The SQL statement itself is typically not the culprit but rather one or more stages of
The Teradata NUSI is an alternative access path to table rows, similar to other relational database indexes. Indexes are used to minimize resource consumption, especially IOs. Unli
Learn about the benefits of partitioned primary index (PPI) in Teradata, a method for reducing I/Os and improving performance in database systems.
Learn how the CHAR2HEXINT function helped solve a real-world problem of extracting number parts from a string in Teradata SQL. Read on for the solution.
Learn about Teradata's temporal data management functionality based on TSQL2 specification in this article. Discover how it simplifies historization.
Learn about the latest improvements in Teradata 14.00 that allow you to customize the collection of statistics to better suit your needs. These improvements include the option to set different sample sizes, consider more bytes for histogram creation, and choose the number of intervals for building s
Block Level Compression (BLC) is a feature that allows compression of entire data blocks, leading to disk space reduction. Read on to know more.
Learn how variable declarations work in a Teradata Stored Procedure. Declarations are always local to the surrounding compound statement. See examples.
Learn about the Teradata Hash Index, designed to minimize disk IOs and improve access to rows. Discover how it differs from Single Table Join Indexes.
Learn how to optimize SQL tuning on Teradata by minimizing I/Os and maximizing parallelism, which can be achieved by evenly distributing rows across all AMPs.
Learn about decomposable columns, which can be split and lead to better performance for data access. Find out how to optimize their use in this guide.
Learn about Teradata IPE, a technique that uses additional information collected during query execution to create execution plans in fragments.
The importance of the PDM Compression is a widely used method for reducing IOs by enabling the accommodation of more rows in each data block. Data blocks are the smallest unit exc
Learn how to troubleshoot a statistics problem in Teradata. Discover why estimations for a query can be wrong and how to fix them.
Learn how to improve the performance of SQL statements by rewriting them through this article on Teradata tuning. Follow along as we analyze a high-runtime SQL statement and optimize it to significantly reduce disk I/O and spool space usage.
What are Throttles used for in Teradata Workload Management? Throttles restrict concurrent sessions, requests, or utilities to safeguard vital resources, such as CPU, memory, and
Learn how Teradata Concurrency Control uses locking and transactions to prevent simultaneous updates, deletions, or insertions in different sessions.
Learn about optimizing SQL queries in Teradata 14.10 by analyzing the execution plan of two queries and using statistics on columns and expressions.
Learn why it's important to be careful when applying SQL functions on WHERE condition columns and how it can impact query performance.
Learn how the Teradata CLOB datatype can improve query performance and reduce resource consumption for storing longer texts in your database.
Learn how Teradata applies two effective join optimization methods, Early and Partial GROUP BY, for decision support workloads with lots of aggregations. These transformations reduce resource usage and are widely used in modern databases. Discover how to improve the optimizer's chance to apply these
Learn about Teradata value ordering, an optimization technique for range scans of 32-bit integers. Read on to understand how it is implemented.
Learn about the characteristics of static and dynamic SQL in Teradata stored procedures. Discover how to use variables and parameters in SQL statements.
Learn about Teradata Partitioning and how row and column partitioning can minimize data access resource usage. Read on for benefits and examples.
Learn how to avoid resource-killing full table scans on extensive tables with Teradata Indexing. Discover a fantastic trick to optimize query performance.
Discover the improvements for Teradata statistics with the new syntax starting from Release 14.00. Learn how to optimize performance and usage.
Teradata SQL Stored Procedures enhance traditional SQL with procedural language features like iterations, condition and error handling, and variables. Learn more here.
Learn about Teradata's MaxValueLength feature, which allows users to specify how many bytes or characters should be used when creating statistics histograms.
Learn about Teradata's load utilities, which can be divided into two groups: those that bypass the transient journal and those that use it. Depending on your requirements, you may choose to load data either way. This article offers advice on how and when to use each tool, including BTEQ, TPump, Fast
Learn how to solve the problem of accessing a table containing many rows using Teradata Partition Elimination with a Stored Procedure.
Learn about Teradata Nodes - Linux systems packed into cabinets with multicore CPUs, memory, and parallel database extension software (PDE).
Learn how to minimize disk IOs with the Teradata DatablockSize option. Read on for an overview of how it works and how to choose the best size.
Optimizing Teradata performance can be time-consuming and unpredictable. Before starting, it's crucial to define clear goals and measure improvements using absolute criteria like CPU usage and disk access. In this article, we'll explore the areas that can be optimized in a typical Teradata Data Ware
Learn about the different types of access rights in Teradata, including implicit, explicit, inherited, and automatic privileges, and how they are granted to owners and creators of database objects.
Learn about the Teradata Primary Index, which distributes table rows across all AMPs for parallel execution. Discover how to select the best index for performance and avoid hash collisions.
To gain a comprehensive insight into the operation and performance of your load utilities and DSA jobs, the DBQLogTbl offers a starting point. The DBQLogTbl provides metrics for t
Introduction Teradata is a name synonymous with data warehousing and large-scale data analytics. Teradata has provided organizations with innovative solutions and technologies for
Discover the powerful capabilities of Teradata SQL Assistant, a user-friendly interface for managing and querying data in the Teradata database environment. Learn how to connect and export CSV files, and explore popular alternatives. Plus, find out how to access it for free with Teradata Express.
Discover the power and unique features of Teradata SQL, the dialect of SQL used within Teradata's large-scale data warehousing and analytics platform. Learn about its differences from standard SQL and how to run queries using Teradata SQL Assistant and BTEQ.
Teradata BTEQ is a powerful tool for SQL query execution and data transfer. Learn why it is often a better option than ETL tools for developers.
Learn how to work with Teradata OLAP Functions in your data warehouse applications. This systematic approach covers five key design decisions.
The Teradata Primary Index and Hash Collisions Using either INTEGER or CHARACTER data types for the Teradata Primary Index does not usually impact performance. The hashing algorit
Learn how to maintain your Teradata data warehouse to keep it running smoothly. Discover how to collect statistics and refresh them as needed.
Learn how to gather statistical samples in Teradata systems to reduce workload and generate comparable execution plans through sample statistics.
Learn about the features and benefits of Teradata Identity Columns, and why they are used for transactional and bulk inserts. Discover the data types, syntax, and more.
Learn why selecting only necessary columns can improve query performance. This article includes an example query and execution plan analysis.
Learn about tactical workload tuning in Teradata, which requires unique skills and tools. This article covers the skills needed to optimize tactical queries and highlights the difference between tactical and decision support workloads. Discover the Tactical Workload Toolkit and how to use join index
Learn why business intelligence tools often use outer joins and why they can be vulnerable to skewing. Read on to understand the details and risks involved.
Teradata Active System Management (TASM) offers dynamic workload management features that allow workload adaptation based on system conditions. Learn more here.
Learn how to optimize SQL queries in Teradata with this case study. See how a recursive query was replaced with a more efficient approach.
Teradata Viewpoint is a web-based tool for monitoring and managing Teradata data warehousing systems. This article explores the tool's functionalities and how to utilize it for mon
Learn how Teradata Comparison Operators and Performance are linked, and how choosing the right operator can impact query performance. Column statistics play a crucial role in the Teradata Optimizer, and this article explores how to use them effectively.
Introduction To round up decimal values in Teradata, use the CEILING function. It is efficient and accurate. This blog post will show how to use the CEILING function in Teradata t
Learn how to improve your data quality and consistency with the Teradata TRIM function. Remove unwanted spaces and characters for more accurate analysis.
Learn how to enhance the efficiency and potency of Teradata queries using the CURRENT_DATE function. Discover how to filter data and calculate time intervals with ease.
Learn the most essential Teradata SQL statements and functions in this article. From CREATE TABLE to Teradata-specific hashing functions, enhance your data manipulation and analysis capabilities on the platform. Whether you're a beginner or an experienced user, this guide will help you master Terada
Discover the key architectural differences between Teradata and Snowflake in this informative article. Learn about data distribution, scalability, indexes, workload management, SQL tuning, and more. Read on to find out if Snowflake could be the right fit for your data needs.
Teradata QueryGrid is a Teradata database management system component that enables users to retrieve data from various sources and conduct distributed queries. This article examine
Learn how to detect and prevent Teradata skewing, one of the main causes of slow query performance. Discover the different types of skew and how to analyze tables, joins, and column values to detect and resolve issues. With practical tips and example queries, this article is a must-read for performa
Learn about the Teradata Transient Journal, a crucial component for ensuring data integrity and protecting against system failures. This dictionary table is managed by each AMP, containing rows copied before changes are made to a transaction. If a transaction fails, the journal is used to roll back
Learn how to optimize your Teradata Load Performance by choosing the right loading method per table. Bulk loads for large tables and transactional loads for small ones. Find out more in this article.
Introduction to Teradata Volatile Tables and Snowflake Temporary Tables Teradata and Snowflake offer temporary tables. This post will compare and contrast the types of tables avai
Learn about the basics of Teradata architecture and why it's still a top RDBMS for Data Warehousing. Discover how the system is built for parallelism and how hashing algorithms help distribute data evenly among AMPs. Explore the essential roles of Parsing Engines, AMPs, and Nodes in executing instru
Learn how to efficiently add new partitions to a Row Partitioned Table. We compare two methods, ALTER TABLE and INSERT…SELECT, with test results.
Learn how to handle rounding issues in Teradata with this helpful article. Discover a possible solution to a common problem and improve your results!
Learn how to optimize Teradata SQL Tuning with comprehensive analysis of available tools in this article. Discover the optimal approach for query optimization for both tactical and strategic workloads, as well as methods for eliminating full table scans, reducing skew, and examining the execution pl
Learn about Teradata Spool Space, a temporary workspace used to store intermediate and final result sets. Discover ways to increase and manage Spool Space.
Learn the differences between Teradata SUBSTR and SUBSTRING functions for string manipulation. Discover the syntax variations, parameters, and usage examples.
Discover the differences between Teradata MULTISET and SET tables for effective data modeling. Learn how to avoid duplicates and improve performance.
Discover the power of Teradata FastLoad - the bulk load utility that efficiently loads vast amounts of data, bypassing the transient journal. Learn more here.
Learn about Teradata Volatile Tables, their features, restrictions, and performance tuning possibilities. Find answers to common questions about their usage.
Avoid data warehouse project failures by choosing the right data model. In this article, we explore denormalization techniques for when it makes sense, including repeating groups, prejoins, and derived information. We also discuss alternatives to denormalization, such as Global Temporary Tables and
Learn how Teradata protects your data against potential loss with its various features. Each feature offers unique protection on different levels.
Discover the benefits of Teradata Intelligent Memory and its unique approach to storing data. Learn how it prioritizes frequently accessed data for faster access.
The Teradata Permanent Journal captures permanent table snapshots pre- and post-modifications. Prior to any changes, the original row duplicates are saved in the permanent journal
Learn about Teradata's unique secondary indexes (USI) and how they provide an alternate access path to reduce disk IOs while retrieving data. This article explains the technical differences between USI and other indexes, and provides detailed information on the data retrieval process. Discover how t
Discover the history of parallel database architectures - from shared memory to shared disk and shared-nothing. Learn about the advantages and limitations of each architecture and how fault tolerance is handled. Explore the shift towards big data and the trend of "Hadoop over SQL."
Learn how to determine SQL Query Performance in Teradata with these three essential parameters: AMPCPUTime, TotalIOCount, and SpoolUsage. Discover how to get SQL Query Stats and make the right decision for your queries using the DBC tables DBQLOGTBL and DBQLSQLTBL. Follow these three points to consi
Learn about the Teradata parsing engine's tasks and how it impacts overall system performance. Find out how to clean up the DBC.AccessRights table to improve parsing times in this informative article.
Discover the benefits of using a Teradata NESTED join strategy for economical data joining. Learn how to enable this feature and optimize query design.
Learn how skewing on Teradata can impact query run times with this informative test scenario. Find key figures and observations to help optimize your performance.
Learn about the challenges of Teradata Performance Optimization from an expert. Discover why fixing a poorly designed data model is critical to success.
Learn about Teradata's column-oriented storage, which offers an alternative way of laying out data on disks, benefiting data retrieval in the era of big data. This article explains how columnar tables work, how they differ from row-oriented databases, and how you can create them in Teradata 14.00.
Discover how MapReduce is becoming a key feature of most database vendors' RDBMS. Follow an example of an SQL aggregation statement that joins two tables.
This article presupposes your existing familiarity with the fundamental structure of a Teradata System. As you are aware, the AMPs operate on a Teradata System. The number of AMP
Learn how to clone tables in Teradata quickly and easily using the CREATE TABLE AS statement. Transfer statistics and definitions without wasting resources.
Learn how to accurately evaluate the statistics of a table beyond just the collection date. Find out how to test the validity of your table's statistics with a simple query.
What is the Teradata Explain Statement? Optimizing performance can be challenging, often requiring significant effort to identify the root issues. Thankfully, Teradata provides th
Learn about Teradata Partitioning and how it can improve query performance. Discover strategies for using row-level partitioning and considerations before partitioning a table.
Learn about the Teradata Join Index, a performance tuning tool that stores pre-joined tables, aggregations, or table content in a different structure. Discover its benefits, what it is used for, and when to use it.
Learn how to handle errors in Teradata Stored Procedures. Error handlers provide event-driven functionality and make error handling more efficient.
Discover the exciting new features of Teradata Version 16.00, which can significantly improve data warehouse administration and loading. Learn how to avoid error messages and manage skewed tables with the new Global Space Soft Limit and SKEW features.
Learn how Teradata macros can improve your database efficiency with reusable blocks of SQL code. Create macros once, execute them anytime!
Teradata's MAPS Feature allows for hardware expansion with minimal downtime by delaying table redistribution. Learn how to use MAPS INSERT…SELECT here.
Discover how Teradata summary statistics provide valuable information about the number of rows in a table, helping to optimize performance tuning.
Learn about the Teradata Primary Index and how it differs from a primary key. Discover the benefits of defining a primary index and how many can be created per table.
Introduction to Teradata Temporary Tables This article highlights the distinct types and uses of Teradata temporary tables. 1. Derived Tables * Are materialized by a select st
Learn about Cursors in Teradata SQL stored procedures, which are used to iterate over result sets for record-oriented data processing with SQL. Find out why they are necessary and the different types of cursor declarations available.
Discover Teradata TPump, the real-time loading tool for transactional systems. TPump loads one row at a time, allowing concurrent INSERTs and UPDATEs using row hash locks. With TPump, you can throttle statement rates and execute DML functions, all while bypassing some of the limitations of bulk load
Learn what a Teradata Query Band is and how it can be used to optimize queries in a Teradata system. Discover how to set, update, and read query bands, and how to use them for performance tuning. Find out what to consider when creating a query band in Teradata.
Learn how to create Teradata recursive queries using either the WITH RECURSIVE clause or CREATE RECURSIVE VIEW statement in this guide. Explore a real-world example of finding the shortest paths in a graph through a recursive solution.
Learn how Teradata and Snowflake minimize IOs to enhance performance. Teradata uses row partitioning while Snowflake relies on micro-partitions.
What is the DIAGNOSTIC HELPSTATS ON FOR SESSION Command in Teradata? Gathering statistics is a crucial aspect of optimizing performance. Collecting statistics requires resources,
Learn how to export and upload tables using Teradata TPT scripts with bulk loading. Check out our example and recommended load methods here.
The Teradata TO_CHAR Function Utilize the Teradata TO_CHAR function to: * Convert numeric expressions to character strings * Convert dates, times, interval data types, or times
Is Teradata-Specific SQL different from SQL? While the SQL Teradata uses is a dialect of SQL, there are some differences compared to ANSI SQL. It is specifically designed for the Te
Introduction Data lakes have gained significant popularity in recent years as organizations seek more flexible and scalable solutions to store and process their ever-growing volum
Teradata DBQLOGTBL is a system table in the Teradata database management system that stores log data for all queries on the system. It is a valuable resource for tracking and analy
Teradata Vantage has also made its move into the public cloud. Teradata Vantage is now available in AWS, Microsoft Azure, and Google Cloud. It is worth noting that the same
Introduction to Teradata Sparse Maps The Teradata hashing architecture and how AMPs are used to split tables evenly for maximum parallelism have been unchanged since the beginning
BLC, Row Level Partitioning, and Datablock Size Tuning In this article, we will learn how block-level compression, row-level partitioning, and selection of the appropriate data bl
How to use the Teradata MAPS Architecture? In an earlier article (link to All Rows on One AMP – By Design), I showed an alternative design for small reference tables, ensuring tha
The Essential Facts About NOPI Tables in Teradata What are Teradata NOPI tables? The rows of a Teradata NOPI table are not distributed evenly across all AMPs using the hashing alg
Teradata FastExport is a powerful tool for rapidly moving large volumes of data between Teradata Vantage and client utilities. This tool uses multiple sessions to export data to a
What is a Teradata AMP? Teradata AMP (Access Module Processor) is a critical component in the Teradata system, responsible for managing and processing its data share. Each AMP opera
What are Teradata roles? Roles serve to simplify rights management. A user assigned a role automatically has the same rights as the role. What are the advantages of roles in Terad
What is the purpose of the Teradata COALESCE function? COALESCE receives a list of scalars as arguments and returns the first value that is not NULL. If all scalars are NULL, then C
Joining tables can be costly since each join requires that the rows being joined are on the same AMP. In cases where the two tables possess different primary indices, copying one o
Summary Today I will show you how to identify the source problem to tune the query's performance utilizing the unnecessary amount of IO, the UII indicator, and how the WHERE claus
Overview * What is A Data Model? * What are the Major Types of Data Models? * What is FS-LDM? * How is FS-LDM Implemented? Dimensional Modeling * Dimensional Modeling
Introduction In modern business, prompt response to opportunities and issues is crucial. Detect and collect business events for prompt processing, generating either an alert for
Teradata Secure Zones I've had a few conversations recently with different customers about the "Secure Zones" optional feature added in Teradata 15.10. There is a fair amount of m
Introduction to the Teradata Sliding Window Merge Join DWH Pro thoroughly analyzes the Teradata sliding window merge join, which has limited available information. Our extensive r
As always, when something comes to the attention of a larger number of people, some embrace it, some frown. We were asked to dissociate the name of this website from the term Tera
A Teradata Show Table Proxy for Full DDL Recreation from DBC Throughout my recent development activities, an old desire became relevant again. It goes as follows: Can one write a
Introduction to Teradata Physical Storage In the second part of this series on Teradata Physical Storage, let me take over from Roland to continue where we stopped in part 1. We w
Over the years, numerous debates have emerged concerning the superior performance of specific statements: SELECT <COLUMN> GROUP BY 1 or SELECT DISTINCT <COLUMN> Many personal ex
Introduction to Teradata Data Types and Data Modeling Suppose you work in a well-organized environment where diligence and discipline prevail and significant effort is put into de
For various reasons, we may need to switch our Teradata historisation from snapshot tables to a start date/end date logic table, which I usually call a change history, as only chan