Teradata TPump: What it is and How it Facilitates Real-Time Data Updates

Discover Teradata TPump, the real-time loading tool for transactional systems. TPump loads one row at a time, allowing concurrent INSERTs and UPDATEs using row hash locks. With TPump, you can throttle statement rates and execute DML functions, all while bypassing some of the limitations of bulk load

tool1

What is Teradata TPump?

TPump is short for Teradata Parallel Data Pump.

Teradata TPump loads data one row at a time, utilizing row hash locks to facilitate numerous concurrent INSERTs and UPDATEs on a table. This approach differs from Fastload or Multiload, which transfer data in large blocks.

Teradata TPump is not intended for loading large volumes of data rapidly. Instead, it facilitates a steady stream of data to enter the database.

Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. DWHPro Letters is free. Subscribe to get new issues by email.

Get the next issue

The Teradata TPump Features

Transactional systems require rapid data transfer to the data warehouse. TPump facilitates near real-time updates from source systems to Teradata.

Throttling: TPump enables you to establish a statement rate for data updates during loading, indicating the desired frequency of updates per minute.

The statement rate can be adjusted during TPump job execution. It may be expedient to accelerate the load process during nightly batch processing and decelerate it (thus reducing resource utilization) when business users generate their reports during the day.

DML Functionality: TPump can do DML functions (like Multiload). This includes INSERT, UPDATE, and DELETE statements.

TPump offers advantages over bulk loading with Fastload or Multiload by allowing for the use of USI and NUSI. Unlike Fastload, which requires the target table to be empty, TPump can load data into an already populated table, including tables using Multiload.

Row duplicates are permissible in contrast to Fastload and Multiload. Additionally, dropping triggers or referential integrity is not necessary when loading.

Get the next issue by email.

The most critical TPump Limitations

No concatenation of input data files is allowed.
TPump can't handle aggregates, arithmetic functions, or exponentiation.
The use of SELECT is not allowed.
A single load task may use no more than four IMPORT commands. We can directly read at most four files in a single run.
TPump performance will decrease if access logging is enabled on the target objects.

Monitoring TPump

The TPump Monitor tool, included with Teradata TPump, enables real-time monitoring of TPump job statuses and permits modification of statement rates for active jobs.

To initiate the monitor on LINUX, use the command:

tpumpmon [-h] [TDPID/],[,]

TPump Error Handling

Each target table in TPump has its own dedicated error table that retains a duplicate of every erroneous row.

Common TPump load errors include:

2816: Failed to insert a duplicate row into the TPump Target Table.

TPump encountered a duplicate row error, causing Teradata to insert the initial record solely while discarding the duplicates.

2817: Activity count greater than one for TPump UPDATE/DELETE.

TPump encounters this error when it attempts to UPDATE or DELETE additional rows.

2818: Activity count zero for TPump UPDATE or DELETE.

This indicates that the UPDATE or DELETE operation failed.

TPump Restartability

TPump can be restarted completely, provided the log and error tables have not been dropped.

TPump Example Script

.logtable TheDatabase.tpumplog;
.logon DWHPRO,******;
database TheDatabase;
.name test;
.begin load errortable ET_CUST
sleep 5
checkpoint 5
sessions 8
errlimit 4
pack 4
tenacity 4
serialize on;
specify a PRIMAY KEY in the .FILED command */
.layout mylayout;
.field CUSTID * varchar(5) key;
.field CUSTNAME * varchar(30);
.dml label INST;
insert into CUSTOMER
(CUSTID,CUSTOMER_NAME
)
values
(:CUSTID,
:CUSTNAME
);
.import infile CUSTDATA.txt
format vartext ','
layout mylayout
apply INST;
.end load;
.logoff;

The Fundamentals of FastLoading on Teradata

Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe for free and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.