Share

Understanding Soft Referential Integrity and Its Role in Improving Performance

Learn about referential integrity and how it ensures data consistency in databases. Discover soft referential integrity and its use in inner join elimination.

Understanding Soft Referential Integrity and Its Role in Improving Performance
design1

What is Referential Integrity?

Referential integrity establishes table relationships through primary and foreign keys, specifying the referencing table columns corresponding to the referenced table's foreign key.

Referential integrity is a reliable mechanism for maintaining data consistency in a database.

What is Soft Referential Integrity?

Soft Referential Integrity informs the Optimizer of a table relationship, which the Optimizer will utilize without enforcing referential integrity, assuming instead that the user will ensure it.


Want more practical data engineering analysis like this?

Join DWHPro Letters and get field-tested notes on Teradata, Snowflake, AI, migrations, performance, and enterprise data work. Early subscribers keep launch access before the paid plan launches.

Get the next issue


How is Soft Referential Integrity used?

Soft referential integrity facilitates INNER JOIN elimination in views.

Get the next issue by email.

By exclusively selecting columns from the referencing table's primary key side in a view where an inner join exists with soft referential integrity between two tables, the Optimizer can bypass the need for the inner join. Consider the following illustration:

REPLACE VIEW TradeInfo ASSELECTt01.TraderId,t01.TraderName,t01.TraderCountry,t02.AccountCurrency,t02.BaseCurrencyFROMTrader t01INNER JOINTradeAccount t02ONt01.TraderId = t02.TraderId;

The TraderId column defines soft referential integrity between the primary and foreign keys (Trader) (Trades).

Executing the following SQL statement will not result in the Optimizer joining to the "Trades" table. This is because all referenced columns are from the "Trader" table, and the defined soft referential integrity guarantees that all rows in "Trades" will be present in "Trader". As a result, there is no need to filter non-matching rows using an INNER JOIN.

Teradata's soft referential integrity option can improve performance when achieving join elimination. While we could opt for "hard" referential integrity, doing so would result in performance overhead that we may wish to avoid.

It is crucial to note that "soft" referential integrity relies on our ability to guarantee the relationship's integrity. The Optimizer places its trust in us.

To establish soft referential integrity, ensure that the column's data type is identical in both tables and that the referenced column is uniquely indexed.

CREATE MULTISET TABLE Trader(TraderId INTEGERREFERENCES WITH NO CHECK OPTION TradeAccount(TraderId),TraderName VARCHAR(255),...TraderCountry CHAR(03)) UNIQUE PRIMARY INDEX (TraderId);

Planning or surviving an enterprise data platform migration?

I write regularly about the performance, cost, architecture, and project mistakes that show up in real Teradata, Snowflake, Databricks, and enterprise data work.

Subscribe before the paid plan launches and keep launch access.

Written by Roland Wenzlofsky, founder of DWHPro and author of Teradata Query Performance Tuning. DWHPro has helped data warehouse practitioners for 15+ years.

Subscribe to DWHPro Letters

Practical field notes on enterprise data engineering, production AI systems, platform migration, and the senior engineering market.
Written by Roland Wenzlofsky Founder of DWHPro Author of Teradata Query Performance Tuning
Get the next issue
Subscribe