Back Home Up Next
| |
Declarative referential integrity v/s triggers.
��� Relational theory says you must normalize data to make it
fit inside a relational model. Of course, entities must be connected in some
way. The "glue" or link between two normalized entities is a common
field for both entities. The common field acts as a bridge so you can pass from
one entity to another using SQL and it's the physical representation of a
logical "relationship" between two entities. Indeed, you can have one
entity linked with itself or three entities linked at the same time, too,
depending on the semantics of your model. When a model is normalized, this
means usually you get one real-world concept or thing decomposed in several
normalized entities, usually referred as tables. But these tables must be kept
in sync to get a consistent data repository. For example, a bill and its details
are two separated tables, but no detail should be allowed if it refers to a bill
that doesn't exist. Contrarily, a bill cannot be deleted unless there aren't
details or these details are erased first. Validating these rules in client
applications is not the bulletproof solution because people might be able to use
a third party tool to insert, delete or update data. To the rescue comes
referential integrity. This is a metadata declaration that enforces a dependency
relationship between two tables through one or more fields. So, the engine
itself is in charge of verifying consistency across tables; you can think that
one table "refers" to other for validation and hence the name
"referential integrity". In IB, a referential integrity constraint
can be set at a field level or at a table level. When more than one field is
involved, a table-level constraint is required. In SQL, this constraint is
called a foreign key (FK) declaration.
The lookup table (the one where� values are verified against) must exist.
Three important restrictions are:
 |
Currently, a FK declaration can be done only
when one connection is made to the database. If more connections are in
effect, an error message like "object is in use" will appear. |
 |
If the lookup table has been used in the
current session even if only there's one connection to the database, it
might be necessary to disconnect and reconnect. |
 |
The lookup table must have declared already
the referenced field(s) with an unique
index (UK) or the command won't be accepted. Only the child (referrer)
table will have a non-unique index automatically generated to handle the
relationship. |
Now that FK declarations have established the database schema, you will find
that's not very easy to modify metadata of tables with referential integrity
constraints. Specifically, you cannot drop or alter fields that are involved in
a FK declaration. Because of these limitations, you may be tempted to define
your referential integrity not with DDL standard
commands but handle the validations for yourself by means of triggers. However,
some drawbacks exist:
 |
The main reason why a referred (master,
lookup) table must be altered, too is that a master record cannot be
deleted if child records pointing to it exist, so you'll have to define a
trigger not only on the child (to avoid insertions without parent) but
also on the master. This may become a burden as the number of tables
increase. |
 |
FK declarations are self-evident by reading
them. Having to read the source of a trigger to discover what it does is
not as attractive. Also, because a table can have many triggers, to
user-defined ones for referential integrity purposes are better given
meaningful names. Triggers are most used to make entries in audit tables,
to modify or fill some fields before inserting or changing data, to make
possible auto-increment fields and to stop operations in case a condition
fails and this condition cannot be handled by declarative table-level or
column-level constraints so it has to be done with procedural
instructions. Also, you can't drop fields that are used by triggers or
stored procedures. |
 |
Usually, a FK declaration is optimized so it
makes a fast check of the conditions. If you intend to replace it with a
trigger, make sure you make a good lookup. For example, don't use count(*)
to know if you find a match; you only are interested if a match happens,
not how many matches exist, so an EXISTS test with a plain select_* will
be faster. |
 |
From IB5 onwards, declarative FK declarations
can include options that specify what should happen when a record in a
parent table is deleted or modified and there are dependant records on the
child table. The typical options are CASCADE, NULL and DEFAULT. Having to
reproduce these features in a trigger becomes unattractive. |
 |
Perhaps the most important point was left at
the end: IB goes to great pains to ensure that referential integrity
constraints defined through PK declarations are ran against committed data
from other transactions in a way that's independent on the state and
isolation level of your transaction. Of course, inside your own
transaction there's no visibility problems. Remember that IB always uses a
transaction, either implicit or explicit. Because the default isolation is
snapshot (see the explanation on transaction
options) you risk missing some entries that were posted and committed
after your transaction started, so a trigger making a select will not see
these new entries. Here, two things may occur: either you cannot insert in
the child table even though another transaction has committed the needed
record in the master table or worse, you delete a record from the master
table and your trigger doing a delete on the child table doesn't erase
some records that meet the condition to be erased because they were
committed after your transaction started, so you are leaving some
"dangling records" (orphan records) in a problem that resembles
the dangling pointers in programming languages, the only difference being
that an SQL sentence with orphan records won't crash the engine as an
invalid pointer may stop a program. These nightmares are avoided with
declarative referential integrity. |
In IB, stored procedures and triggers cannot start transactions by themselves
and cannot change the current transaction isolation. They run inside the
transaction started by the client. You can say that procedural referential
integrity (namely, using triggers and not declarations) still can be used
provided that the read committed level of isolation is used by all the client.
After all, other relational engines used read committed as the default level and
the BDE uses it, too. However, nothing prevents your client from using snapshot
explicitly (even on other engines) and you have no way to detect that in a
trigger. Also, one aim of relational engines is passing the enforcement of all
basic rules to the engine, so there's no need to reinvent the wheel unless you
really need a very complex lookup in another table. In this case, it can be a
sign of a normalization problem. � |