Database Terminology
ASSIGNMENT No. 2
DATABASE TERMINOLOGY; GIVE THE DEFINITION FOR EACH TERMINOLOGY.
THE INFORMATION MUST BE CLEAR AND INCLUDE AN EXAMPLE.
ITT (Information Technology in Tourism)
Submitted to:
Saritha Pradhan
Submitted by:
Bal Gopal Subudhi
PGDM (TT)
Roll No.18
9th Oct 2009
KEY TERMS
1. FOREIGN KEY
In the context of relational databases, a foreign key is a referential constraint between two tables. The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The referenced columns must form the primary key or another candidate key of the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table. Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). In this way references can be made to link information together, which is an essential part of database normalization. Multiple rows in the referencing table may refer to the same row in the referenced table. Most of the time, this reflects a one (master table, or referenced table) to many (child table, or referencing table) relationship. The referencing and referenced table may be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003 as a self-referencing or recursive foreign key.
Example
An accounts database has a table with invoices and each invoice is associated with a particular
supplier. Supplier details (such as address or phone number) are kept in a separate table; each
supplier is given a 'supplier number' to identify it. Each invoice record has an attribute
containing the supplier number for that invoice. Then, the 'supplier number' is the primary key
in the Supplier table. The foreign key in the Invoices table points to that primary key.
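The supplier/invoice example above can be sketched in SQL. The following is a minimal sketch using SQLite from Python; the table and column names are illustrative assumptions, not taken from an actual schema.

```python
import sqlite3

# In-memory database for the sketch; SQLite enforces foreign keys
# only when the pragma below is enabled on the connection.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Supplier (supplier_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE Invoice (
                  invoice_number  INTEGER PRIMARY KEY,
                  amount          REAL,
                  supplier_number INTEGER REFERENCES Supplier(supplier_number))""")

conn.execute("INSERT INTO Supplier VALUES (1, 'Acme Travel')")
conn.execute("INSERT INTO Invoice VALUES (100, 250.0, 1)")  # OK: supplier 1 exists

try:
    # Supplier 42 does not exist, so the referential constraint rejects this row.
    conn.execute("INSERT INTO Invoice VALUES (101, 99.0, 42)")
    failed = False
except sqlite3.IntegrityError:
    failed = True
print("second insert rejected:", failed)
```

The rejected insert illustrates the rule stated above: a row in the referencing table cannot contain a value that does not exist in the referenced table.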
2. FORMS GENERATOR
3. HETEROGENEOUS DATA ENVIRONMENT
4. HIERARCHICAL DATABASE MODEL
A hierarchical data model is a data model in which the data is organized into a tree‐like
structure. The structure allows repeating information using parent/child relationships: each
parent can have many children but each child only has one parent. All attributes of a specific
record are listed under an entity type.
In a database, an entity type is the equivalent of a table; each individual record is represented as a row and an attribute as a column. Entity types are related to each other using 1:N mappings, also known as one-to-many relationships.
The most recognized and used hierarchical database is IMS developed by IBM.
5. MAIN MEMORY
Primary storage, presently known simply as memory, is the only storage directly accessible to the CPU. The CPU continuously reads instructions stored there and executes them as required. Any data actively operated on is also stored there in a uniform manner.
Historically, early computers used delay lines, Williams tubes, or rotating magnetic drums as
primary storage. By 1954, those unreliable methods were mostly replaced by magnetic core
memory, which was still rather cumbersome. A revolution began with the invention of the transistor, which soon enabled then-unbelievable miniaturization of electronic memory via solid-state silicon chip technology. This led to modern random-access memory (RAM): small and light, but quite expensive. (The particular types of RAM used for primary storage are also volatile, i.e. they lose the information when not powered.)
6. MEMORY BUFFER
A Memory Buffer Register (MBR) is the register in a computer's processor, or central processing
unit, CPU, that stores the data being transferred to and from the immediate access store. It acts
as a buffer allowing the processor and memory units to act independently without being
affected by minor differences in operation. A data item will be copied to the MBR ready for use
at the next clock cycle, when it can be either used by the processor or stored in main memory.
This register holds the contents of the memory which are to be transferred from memory to
other components or vice versa. A word to be stored must be transferred to the MBR, from
where it goes to the specific memory location, and the arithmetic data to be processed in the
ALU first goes to the MBR, then to the accumulator register, and is then processed in the ALU.
7. METADATA
Metadata (Meta data, or sometimes metainformation) is "data about data", of any sort in any
media. Metadata is text, voice, or image that describes what the audience wants or needs to
see or experience. The audience could be a person, group, or software program. Metadata is
important because it aids in clarifying and finding the actual data. An item of metadata may
describe an individual datum, or content item, or a collection of data including multiple content
items and hierarchical levels, such as a database schema. In data processing, metadata provides
information about, or documentation of, other data managed within an application or
environment. This commonly defines the structure or schema of the primary data.
Example
Metadata would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about data itself (where it is located, how it is associated, ownership, etc.). Metadata may include descriptive
information about the context, quality and condition, or characteristics of the data. It may be
recorded with high or low granularity.
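A database's own catalog is a concrete case of metadata: it stores data about the tables and columns, not the data itself. The sketch below uses SQLite from Python; the `booking` table and its columns are invented for illustration.

```python
import sqlite3

# Create an example table, then inspect the metadata the DBMS keeps about it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE booking (id INTEGER PRIMARY KEY, guest TEXT, nights INTEGER)")

# sqlite_master holds the schema text: data describing the data.
ddl = conn.execute("SELECT sql FROM sqlite_master WHERE name = 'booking'").fetchone()[0]
print(ddl)

# PRAGMA table_info lists each column's name and declared data type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(booking)")]
print(columns)  # [('id', 'INTEGER'), ('guest', 'TEXT'), ('nights', 'INTEGER')]
```

Here the column names and types are exactly the "data about data elements or attributes" the definition describes.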
8. NETWORK DATABASE MODEL
The network model is a database model conceived as a flexible way of representing objects and
their relationships. Its distinguishing feature is that the schema, viewed as a graph in which
object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or
lattice.
The network model's original inventor was Charles Bachman, and it was developed into a
standard specification published in 1969 by the CODASYL Consortium.
9. NON-VOLATILE STORAGE
Non‐volatile memory, nonvolatile memory, NVM or non‐volatile storage, is computer memory
that can retain the stored information even when not powered. Examples of non‐volatile
memory include read‐only memory, flash memory, most types of magnetic computer storage
devices (e.g. hard disks, floppy disks, and magnetic tape), optical discs, and early computer
storage methods such as paper tape and punch cards.
Non‐volatile memory is typically used for the task of secondary storage, or long‐term persistent
storage. The most widely used form of primary storage today is a volatile form of random
access memory (RAM), meaning that when the computer is shut down, anything contained in
RAM is lost. Unfortunately, most forms of non‐volatile memory have limitations that make
them unsuitable for use as primary storage. Typically, non‐volatile memory either costs more or
performs worse than volatile random access memory.
Several companies are working on developing non‐volatile memory systems comparable in
speed and capacity to volatile RAM. For instance, IBM is currently developing MRAM
(Magnetoresistive RAM). Not only would such technology save energy, but it would allow for
computers that could be turned on and off almost instantly, bypassing the slow start‐up and
shutdown sequence.
Non-volatile data storage can be categorized into electrically addressed systems (read-only memory) and mechanically addressed systems (hard disks, optical discs, magnetic tape,
holographic memory and such). Electrically addressed systems are expensive, but fast, whereas
mechanically addressed systems have a low price per bit, but are slow. Non‐volatile memory
may one day eliminate the need for comparatively slow forms of secondary storage systems,
which include hard disks.
10. OBJECT‐ORIENTED DATABASE MODEL
The object-oriented paradigm has been applied to database technology, creating a new kind of programming model known as object databases. These databases attempt to
bring the database world and the application programming world closer together, in particular
by ensuring that the database uses the same type system as the application program. This aims
to avoid the overhead (sometimes referred to as the impedance mismatch) of converting
information between its representation in the database (for example as rows in tables) and its
representation in the application program (typically as objects). At the same time, object
databases attempt to introduce the key ideas of object programming, such as encapsulation
and polymorphism, into the world of databases.
A variety of ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects
manipulated by the program persistent. This also typically requires the addition of some kind of
query language, since conventional programming languages do not have the ability to find
objects based on their information content. Others have attacked the problem from the
database end, by defining an object‐oriented data model for the database, and defining a
database programming language that allows full programming capabilities as well as traditional
query facilities.
11. OBJECT‐RELATIONAL DATABASE MODEL
An object‐relational database (ORD), or object‐relational database management system
(ORDBMS), is a database management system (DBMS) similar to a relational database, but with
an object‐oriented database model: objects, classes and inheritance are directly supported in
database schemas and in the query language. In addition, it supports extension of the data
model with custom data‐types and methods.
An object‐relational database can be said to provide a middle ground between relational
databases and object‐oriented databases (OODBMS). In object‐relational databases, the
approach is essentially that of relational databases: the data resides in the database and is
manipulated collectively with queries in a query language; at the other extreme are OODBMSes
in which the database is essentially a persistent object store for software written in an object‐
oriented programming language, with a programming API for storing and retrieving objects, and
little or no specific support for querying.
12. OPERATING SYSTEM SOFTWARE
An Operating System (OS) is an interface between hardware and user which is responsible for
the management and coordination of activities and the sharing of the resources of the
computer that acts as a host for computing applications run on the machine. As a host, one of
the purposes of an operating system is to handle the details of the operation of the hardware.
This relieves application programs from having to manage these details and makes it easier to
write applications. Almost all computers (including handheld computers, desktop computers,
supercomputers, video game consoles) as well as some robots, domestic appliances
(dishwashers, washing machines), and portable media players use an operating system of some
type. Some of the oldest models may, however, use an embedded operating system that may
be contained on a compact disk or other data storage device.
13. PHYSICAL DATA POINTER
14. PRIMARY KEY
In relational database design, a unique key or primary key is a candidate key to uniquely
identify each row in a table. A unique key or primary key comprises a single column or set of
columns. No two distinct rows in a table can have the same value (or combination of values) in
those columns. Depending on its design, a table may have arbitrarily many unique keys but at
most one primary key.
A unique key must uniquely identify all possible rows that exist in a table and not only the
currently existing rows.
Example
Social Security numbers (associated with a specific person) or ISBNs (associated with a specific
book). Telephone books and dictionaries cannot use names, words, or Dewey Decimal system
numbers as candidate keys because they do not uniquely identify telephone numbers or words.
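The uniqueness rule can be demonstrated directly: the DBMS rejects a second row carrying the same primary-key value. A minimal sketch using SQLite from Python, with an illustrative `book` table keyed by ISBN:

```python
import sqlite3

# ISBN is a natural candidate key for books: no two books share one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO book VALUES ('978-0132350884', 'Clean Code')")

try:
    # Same ISBN again: the primary-key constraint must reject it.
    conn.execute("INSERT INTO book VALUES ('978-0132350884', 'Another Title')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True: no two rows may share a primary-key value
```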
15. PRODUCTION DATABASE (DBMS)
16. QUERY
Literally, a question you ask about data in the database. It takes the form of a command, written in a query language, that defines the selection criteria and sort order used to generate an ad hoc list of records. The term also refers to the output subset of data produced in response to a query.
17. QUERY OPTIMIZER
The query optimizer is the component of a database management system that attempts to
determine the most efficient way to execute a query. The optimizer considers the possible
query plans for a given input query, and attempts to determine which of those plans will be the
most efficient. Cost‐based query optimizers assign an estimated "cost" to each possible query
plan, and choose the plan with the smallest cost. Costs are used to estimate the runtime cost of
evaluating the query, in terms of the number of I/O operations required, the CPU requirements,
and other factors determined from the data dictionary. The set of query plans examined is
formed by examining the possible access paths (e.g. index scan, sequential scan) and join
algorithms (e.g. sort‐merge join, hash join, nested loops). The search space can become quite
large depending on the complexity of the SQL query.
Generally, the query optimizer cannot be accessed directly by users: once queries are submitted to the database server and parsed by the parser, they are passed to the query
optimizer where optimization occurs. However, some database engines allow guiding the query
optimizer with hints.
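Many database engines expose the plan the optimizer chose. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` from Python; the table and index names are invented for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

# Given an index on the filtered column, the optimizer should choose
# an index search (access path) rather than a full sequential scan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("CREATE INDEX idx_city ON customer(city)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customer WHERE city = 'Delhi'").fetchall()
detail = plan[0][-1]  # last field of each plan row is a human-readable description
print(detail)  # e.g. "SEARCH customer USING INDEX idx_city (city=?)"
```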
18. QUERY PROCESSOR
19. RAID
RAID is an acronym first defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the
University of California, Berkeley in 1987 to describe a redundant array of inexpensive disks, a
technology that allowed computer users to achieve high levels of storage reliability from low‐
cost and less reliable PC‐class disk‐drive components, via the technique of arranging the devices
into arrays for redundancy.
More recently, marketers representing industry RAID manufacturers reinvented the term to
describe a redundant array of independent disks as a means of dissociating a "low cost"
expectation from RAID technology.
"RAID" is now used as an umbrella term for computer data storage schemes that can divide and
replicate data among multiple hard disk drives. The different schemes/architectures are named
by the word RAID followed by a number, as in RAID 0, RAID 1, etc. RAID's various designs
involve two key design goals: increase data reliability and/or increase input/output
performance. When multiple physical disks are set up to use RAID technology, they are said to
be in a RAID array. This array distributes data across multiple disks, but the array is seen by the
computer user and operating system as one single disk. RAID can be set up to serve several
different purposes.
20. RELATIONAL DATABASE MODEL (DBMS)
The relational model for database management is a database model based on first‐order
predicate logic, first formulated and proposed in 1969 by E.F. Codd.
Its core idea is to describe a database as a collection of predicates over a finite set of predicate
variables, describing constraints on the possible values and combinations of values. The content
of the database at any given time is a finite (logical) model of the database, i.e. a set of
relations, one per predicate variable, such that all predicates are satisfied. A request for
information from the database (a database query) is also a predicate.
21. RELATIONSHIP
22. REPORT WRITER
Report writing refers to the transfer of data into some form or document. It occurs not only during the origination and distribution steps, but also throughout the processing cycle.
23. SCHEMA
The schema (pronounced skee‐ma) of a database system is its structure described in a formal
language supported by the database management system (DBMS). In a relational database, the
schema defines the tables, the fields, relationships, views, indexes, packages, procedures,
functions, queues, triggers, types, sequences, materialized views, synonyms, database links,
directories, Java, XML schemas, and other elements.
Schemas are generally stored in a data dictionary. Although a schema is defined in a text-based database language, the term is often used to refer to a graphical depiction of the database structure.
24. SECONDARY STORAGE
Secondary storage in popular usage differs from primary storage in that it is not directly
accessible by the CPU. The computer usually uses its input/output channels to access secondary
storage and transfers the desired data using intermediate area in primary storage. Secondary
storage does not lose the data when the device is powered down—it is non‐volatile. Per unit, it
is typically also an order of magnitude less expensive than primary storage. Consequently,
modern computer systems typically have an order of magnitude more secondary storage than
primary storage and data is kept for a longer time there.
In modern computers, hard disk drives are usually used as secondary storage. The time taken to
access a given byte of information stored on a hard disk is typically a few thousandths of a
second, or milliseconds. By contrast, the time taken to access a given byte of information stored
in random access memory is measured in billionths of a second, or nanoseconds. This illustrates
the very significant access‐time difference which distinguishes solid‐state memory from
rotating magnetic storage devices: hard disks are typically about a million times slower than
memory. Rotating optical storage devices, such as CD and DVD drives, have even longer access
times.
25. STRUCTURED QUERY LANGUAGE (SQL)
SQL (Structured Query Language) is a database computer language designed for managing data
in relational database management systems (RDBMS). Its scope includes data query and
update, schema creation and modification, and data access control. SQL was one of the first
languages for Edgar F. Codd's relational model in his influential 1970 paper, "A Relational Model
of Data for Large Shared Data Banks" and became the most widely used language for relational
databases.
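The scope described above (schema creation, data update, and data query) can be sketched in a few statements. The following runs SQL through SQLite from Python; the `tourist` table and its contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema creation
conn.execute("CREATE TABLE tourist (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
# Data insertion
conn.executemany("INSERT INTO tourist (name, country) VALUES (?, ?)",
                 [("Asha", "India"), ("Ben", "UK"), ("Chen", "China")])
# Data update
conn.execute("UPDATE tourist SET country = 'United Kingdom' WHERE name = 'Ben'")
# Data query: filter and sort
rows = conn.execute(
    "SELECT name FROM tourist WHERE country <> 'India' ORDER BY name").fetchall()
print(rows)  # [('Ben',), ('Chen',)]
```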
26. TABLE (Database)
In relational databases and flat file databases, a table is a set of data elements (values) that is
organized using a model of vertical columns (which are identified by their name) and horizontal
rows. A table has a specified number of columns, but can have any number of rows. Each row is
identified by the values appearing in a particular column subset which has been identified as a
candidate key.
Table is another term for relation, although there is a difference: a table is usually a multi-set (bag) of rows, whereas a relation is a set and does not allow duplicates. Besides the
actual data rows, tables generally have associated with them some meta‐information, such as
constraints on the table or on the values within particular columns.
The data in a table does not have to be physically stored in the database. Views are also
relational tables, but their data are calculated at query time. Another example is a nickname, which represents a pointer to a table in another database.
27. TRANSACTION
A database transaction comprises a unit of work performed within a database management
system (or similar system) against a database, and treated in a coherent and reliable way
independent of other transactions. Transactions in a database environment have two main
purposes:
To provide reliable units of work that allow correct recovery from failures and keep a
database consistent even in cases of system failure, when execution stops (completely
or partially) and many operations upon a database remain uncompleted, with unclear
status.
To provide isolation between programs accessing a database concurrently. Without
isolation the programs' outcomes are typically erroneous.
A database transaction, by definition, must be atomic, consistent, isolated and durable.
Database practitioners often refer to these properties of database transactions using the
acronym ACID.
Transactions provide an "all‐or‐nothing" proposition, stating that each work‐unit performed in a
database must either complete in its entirety or have no effect whatsoever. Further, the system
must isolate each transaction from other transactions, results must conform to existing
constraints in the database, and transactions that complete successfully must get written to
durable storage.
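The all-or-nothing property can be sketched with a classic transfer that fails halfway. The example below uses SQLite from Python with explicit `BEGIN`/`ROLLBACK`; the account data and the simulated crash are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit mode: manage transactions explicitly
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")

try:
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 80 WHERE name = 'alice'")
    raise RuntimeError("crash before the matching credit")  # simulated failure
    conn.execute("UPDATE account SET balance = balance + 80 WHERE name = 'bob'")
    conn.execute("COMMIT")
except RuntimeError:
    conn.execute("ROLLBACK")  # undo the partial debit: all or nothing

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'alice': 100, 'bob': 50}
```

Because the unit of work did not complete in its entirety, the rollback leaves both balances exactly as they were, keeping the database consistent.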
28. UNIFIED MODELING LANGUAGE (UML)
Unified Modeling Language (UML) is a standardized general‐purpose modeling language in the
field of software engineering.
The Unified Modeling Language (UML) is used to specify, visualize, modify, construct and document the artifacts of an object-oriented, software-intensive system under development.
UML offers a standard way to visualize a system's architectural blueprints, including elements
such as:
Actors
Business processes
(Logical) components
Activities
Programming language statements
Database schemas, and
Reusable software components.
29. VIEW
In database theory, a view consists of a stored query accessible as a virtual table composed of
the result set of a query. Unlike ordinary tables (base tables) in a relational database, a view
does not form part of the physical schema: it is a dynamic, virtual table computed or collated
from data in the database. Changing the data in a table alters the data shown in subsequent
invocations of the view.
Functions (in programming) can provide abstraction, so database users can create abstraction
by using views. In another parallel with functions, database users can manipulate nested views,
thus one view can aggregate data from other views. Without the use of views the normalization
of databases above second normal form would become much more difficult. Views can make it
easier to create lossless join decomposition.
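The idea that a view is a stored query, recomputed from the base table on each use, can be sketched as follows. This uses SQLite from Python; the `booking` table and the 100-unit threshold are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE booking (id INTEGER PRIMARY KEY, guest TEXT, amount REAL)")
# The view stores a query, not data: it is computed when selected from.
conn.execute("CREATE VIEW big_booking AS SELECT guest FROM booking WHERE amount > 100")

conn.execute("INSERT INTO booking VALUES (1, 'Asha', 50)")
print(conn.execute("SELECT * FROM big_booking").fetchall())  # []

conn.execute("INSERT INTO booking VALUES (2, 'Ben', 250)")
result = conn.execute("SELECT * FROM big_booking").fetchall()
print(result)  # [('Ben',)] -- the view reflects the new base-table row
```

Changing the base table changed what the view showed, with no data stored in the view itself.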
30. ATTRIBUTE
In computing, an attribute is a specification that defines a property of an object, element, or
file. An attribute of an object usually consists of a name and a value; of an element, a type or
class name; of a file, a name and extension.
Each named attribute has an associated set of rules called operations: one doesn't add characters to, or manipulate and process, an integer array as an image object, and one doesn't process text as type floating point (decimal numbers).
It follows that an object definition can be extended by imposing data typing: a
representation format, a default value, and legal operations (rules) and restrictions
("Division by zero is not to be tolerated!") are all potentially involved in defining an
attribute, or conversely, may be spoken of as attributes of that object's type. A JPEG file
is not decoded by the same operations (however similar they may be—these are all
graphics data formats) as a PNG or BMP file, nor is a floating point typed number
operated upon by the rules applied to typed long integers.
Example
In computer graphics, line objects can have attributes such as thickness (with real values), color
(with descriptive values such as brown or green or values defined in a certain color model, such
as RGB), dashing attributes, etc. A circle object can be defined in similar attributes plus an origin
and radius.
31. BINARY LARGE OBJECT (BLOB) DATA TYPE
A binary large object, also known as a blob, is a collection of binary data stored as a single entity
in a database management system. Blobs are typically images, audio or other multimedia
objects, though sometimes binary executable code is stored as a blob. Database support for
blobs is not universal.
Blobs were originally just amorphous chunks of data invented by Jim Starkey at DEC, who
describes them as "the thing that ate Cincinnati, Cleveland, or whatever". Later, Terry
McKiever, a marketing person for Apollo, felt that it needed to be an acronym and invented the backronym Basic Large Object. Then Informix invented an alternative backronym, Binary Large Object.
32. CENTRALIZED MODEL (DBMS)
A database system is centralized if the data is stored at a single computer site. A centralized model can support many users, but the DBMS and the database themselves reside entirely at a single computer site.
33. CONCURRENCY CONTROL
In computer science, especially in the fields of computer programming (see also concurrent
programming, parallel programming), operating systems, multiprocessors, and databases,
concurrency control ensures that correct results for concurrent operations are generated, while
getting those results as quickly as possible.
Concurrency control in database management systems (DBMS) ensures that database
transactions are performed concurrently without the concurrency violating the data integrity of
a database. Executed transactions should follow the ACID rules, described under Transaction above. The DBMS must guarantee that only serializable (unless serializability is intentionally relaxed) and recoverable
schedules are generated. It also guarantees that no effect of committed transactions is lost, and
no effect of aborted (rolled back) transactions remains in the related database.
34. CRUD
Create, read, update and delete (CRUD) are the four basic functions of persistent storage. Sometimes CRUD is expanded with the word retrieve instead of read, or destroy instead of delete. It is also sometimes used to describe user interface conventions that facilitate viewing, searching, and changing information, often using computer-based forms and reports.
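Each CRUD verb maps onto one SQL statement. A minimal sketch using SQLite from Python, with an illustrative `note` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)")

conn.execute("INSERT INTO note (body) VALUES ('draft')")                 # Create
body = conn.execute("SELECT body FROM note WHERE id = 1").fetchone()[0]  # Read
conn.execute("UPDATE note SET body = 'final' WHERE id = 1")              # Update
updated = conn.execute("SELECT body FROM note WHERE id = 1").fetchone()[0]
conn.execute("DELETE FROM note WHERE id = 1")                            # Delete
remaining = conn.execute("SELECT COUNT(*) FROM note").fetchone()[0]
print(body, updated, remaining)  # draft final 0
```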
35. DATA ABSTRACTION
The characteristic that allows program-data independence and program-operation independence is called data abstraction.
36. DATABASE ENGINE
A database engine (or "storage engine") is the underlying software component that a database
management system (DBMS) uses to create, retrieve, update and delete (CRUD) data from a
database. One may command the database engine via the DBMS's own user interface, and
sometimes through a network port.
37. DATABASE MANAGEMENT SYSTEM (DBMS)
A Database Management System (DBMS) is a set of computer programs that controls the
creation, maintenance, and the use of the database of an organization and its end users. It
allows organizations to place control of organization‐wide database development in the hands
of database administrators (DBAs) and other specialists. DBMSs may use any of a variety of
database models, such as the network model or relational model. In large systems, a DBMS
allows users and other software to store and retrieve data in a structured way. It helps to
specify the logical organization for a database and to access and use the information within a database. It provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring the database.
38. DATABASE PRACTITIONER
39. DATABASE SOFTWARE
Database software is another name for a Database Management System (DBMS); see the DBMS entry above.
40. DATA CATALOG
41. DATA DICTIONARY
A data dictionary, as defined in the IBM Dictionary of Computing, is a "centralized repository of
information about data such as meaning, relationships to other data, origin, usage, and
format." The term may have one of several closely related meanings pertaining to databases
and database management systems (DBMS):
A document describing a database or collection of databases.
An integral component of a DBMS that is required to determine its structure.
A piece of middleware that extends or supplants the native data dictionary of a DBMS.
42. DATA REPOSITORY
A data dictionary, as defined in the IBM Dictionary of Computing, is a "centralized repository of
information about data such as meaning, relationships to other data, origin, usage, and
format."
43. DATA TYPE
Almost all programming languages explicitly include the notion of data type, though different languages may use different terminology. Most programming languages also allow the
programmer to define additional data types, usually by combining multiple elements of other
types and defining the valid operations of the new data type. For example, a programmer might
create a new data type named "Person" that specifies that data interpreted as Person would
include a name and a date of birth. Common data types may include:
Integers,
Floating‐point numbers (decimals), and
Alphanumeric strings.
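The "Person" type described above can be sketched directly. The following is a minimal Python sketch; the `age_on` operation is an invented example of defining valid operations for the new type.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Person:
    """A new data type combining a name and a date of birth."""
    name: str
    date_of_birth: date

    def age_on(self, day: date) -> int:
        """Whole years elapsed between date_of_birth and `day`."""
        born = self.date_of_birth
        return day.year - born.year - ((day.month, day.day) < (born.month, born.day))

p = Person("Ada Lovelace", date(1815, 12, 10))
print(p.age_on(date(1852, 11, 27)))  # 36
```

The type bundles elements of other types (a string and a date) and defines which operations on the combination are valid, exactly as the paragraph above describes.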
44. DECISION SUPPORT DATABASE
A database from which data is extracted and analysed statistically (but not modified) in order to
inform business or other decisions. This is in contrast to an operational database which is being
continuously updated.
For example:
A decision support database might provide data to determine the average salary of different types of workers, whereas an operational database containing the same data would be used to calculate pay check amounts. Often, decision support data is extracted from operational databases.
45. DIRECT MEMORY ACCESS (DMA)
Direct memory access (DMA) is a feature of modern computers and microprocessors that
allows certain hardware subsystems within the computer to access system memory for reading
and/or writing independently of the central processing unit. Many hardware systems use DMA
including disk drive controllers, graphics cards, network cards and sound cards. DMA is also
used for intra-chip data transfer in multi-core processors, especially in multiprocessor systems-on-chip, where each processing element is equipped with a local memory (often called scratchpad memory) and DMA is used for transferring data between the local memory and the main memory. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel. Similarly, a processing element inside a multi-core processor can transfer data to and from its local memory without occupying its processor time, allowing computation and data transfer to proceed concurrently.
46. DISTRIBUTED MODEL (DBMS)
In a distributed model, the actual database and the DBMS software are distributed over many sites, connected by a computer network.
47. ENTITY
An entity is something that has a distinct, separate existence, though it need not be a material
existence. In particular, abstractions and legal fictions are usually regarded as entities. In
general, there is also no presumption that an entity is animate. Entities are used in system
developmental models that display communications and internal processing of, say, documents
compared to order processing.
In software engineering, an Entity‐Relationship Model (ERM) is an abstract and conceptual
representation of data. Entity‐relationship modeling is a database modeling method, used to
produce a type of conceptual schema or semantic data model of a system, often a relational
database, and its requirements in a top‐down fashion.