
Distributed Database Concept

This presentation introduces the concept of a distributed database. A database is a collection of related objects managed by a software package called a database management system (DBMS). The presentation focuses on databases distributed over a computer network.

Uploaded by

Jerry Alem

Distributed database concept

We can define a distributed database (DDB) as a collection of multiple logically interrelated
databases distributed over a computer network, and a distributed database management
system (DDBMS) as a software system that manages a distributed database while making
the distribution transparent to the user.
What Constitutes a DDB
For a database to be called distributed, the following minimum
conditions should be satisfied:
■ Connection of database nodes over a computer network. There are multiple computers,
called sites or nodes. These sites must be connected by an underlying network to transmit
data and commands among sites.
■ Logical interrelation of the connected databases. It is essential that the information in the
various database nodes be logically related.
■ Possible absence of homogeneity among connected nodes. It is not necessary that all nodes
be identical in terms of data, hardware, and software.
Availability and Reliability

• Reliability and availability are two of the most common potential
advantages cited for distributed databases.
• Reliability is broadly defined as the probability that a system is
running (not down) at a certain time point.
• Availability is the probability that the system is continuously available
during a time interval.
Data Fragmentation, Replication, and Allocation Techniques for
Distributed Database Design

• In this section, we discuss techniques that are used to break up the
database into logical units, called fragments.
• We also discuss the use of data replication, which permits certain
data to be stored in more than one site to increase availability and
reliability.
• Finally, we discuss the process of allocating fragments (or replicas of
fragments) for storage at the various nodes.
FRAGMENTATION

Fragmentation breaks up a global relation R into smaller relations R1, R2, ..., Rn such that
these smaller relations contain enough information to reconstruct the global
relation R.

1. Horizontal fragmentation
   • Primary horizontal fragmentation
   • Derived horizontal fragmentation
2. Vertical fragmentation
3. Hybrid fragmentation
Horizontal fragmentation

• Horizontal fragmentation breaks a global relation R by rows into two or
more sub-relations without changing the structure of the relation.
• Every fragment has the same schema as R:
SCHEMA(R) = SCHEMA(R1) = SCHEMA(R2) = ... = SCHEMA(Rn)

As an example, consider the student table:

ID    NAME    GENDER  CAMPUS
2050  MAME    MALE    WOLLO
2055  MUAZ    MALE    MEKELE
1555  MIRAJ   MALE    WELKITE
1121  MIHRET  FEMALE  MEKELE
1542  FIKER   FEMALE  AMBO
3456  SEMHAL  FEMALE  HARAMAYA
cont...
Schema of student: student(ID, NAME, GENDER, CAMPUS)
SELECT * FROM student WHERE campus = 'MEKELE';

ID    NAME    GENDER  CAMPUS
2055  MUAZ    MALE    MEKELE
1121  MIHRET  FEMALE  MEKELE

Horizontal fragmentation is just a selection operation performed on a
global relation R with an appropriate selection condition.
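The selection view of horizontal fragmentation can be sketched in Python. This is a minimal illustration, not part of the original slides: the names `STUDENT` and `horizontal_fragment` are hypothetical, and the second fragment (all campuses other than MEKELE) is an assumption added so that the fragments together cover the whole table.

```python
# Student relation from the slides, stored as a list of row dictionaries.
STUDENT = [
    {"ID": 2050, "NAME": "MAME", "GENDER": "MALE", "CAMPUS": "WOLLO"},
    {"ID": 2055, "NAME": "MUAZ", "GENDER": "MALE", "CAMPUS": "MEKELE"},
    {"ID": 1555, "NAME": "MIRAJ", "GENDER": "MALE", "CAMPUS": "WELKITE"},
    {"ID": 1121, "NAME": "MIHRET", "GENDER": "FEMALE", "CAMPUS": "MEKELE"},
    {"ID": 1542, "NAME": "FIKER", "GENDER": "FEMALE", "CAMPUS": "AMBO"},
    {"ID": 3456, "NAME": "SEMHAL", "GENDER": "FEMALE", "CAMPUS": "HARAMAYA"},
]


def horizontal_fragment(relation, predicate):
    """Return the rows of `relation` that satisfy `predicate` (a selection)."""
    return [row for row in relation if predicate(row)]


# Fragment 1 matches the slide's query; fragment 2 is its complement (assumed).
mekele = horizontal_fragment(STUDENT, lambda r: r["CAMPUS"] == "MEKELE")
others = horizontal_fragment(STUDENT, lambda r: r["CAMPUS"] != "MEKELE")

# Reconstruction of a horizontally fragmented relation is a union of fragments.
reconstructed = sorted(mekele + others, key=lambda r: r["ID"])
original = sorted(STUDENT, key=lambda r: r["ID"])

print([r["NAME"] for r in mekele])  # ['MUAZ', 'MIHRET']
print(reconstructed == original)    # True
```

Every fragment keeps all four columns, which is the SCHEMA(R) = SCHEMA(Ri) property stated earlier.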
Vertical fragmentation
• In vertical fragmentation, the fragments are created by dividing the attributes (columns) of the relation.
• Vertical fragmentation is more complex than horizontal fragmentation.
• The fields or columns of a table are grouped into fragments. To make reconstruction
possible, each fragment should contain the primary key field(s) of the
table.
• The next slide shows an example.
Cont..
Global relation R:
P  A   B   C   D
1  A1  B1  C1  D1
2  A2  B2  C2  D2
3  A3  B3  C3  D3

Vertical fragment R1 (P, A, B):
P  A   B
1  A1  B1
2  A2  B2
3  A3  B3

Vertical fragment R2 (P, C, D):
P  C   D
1  C1  D1
2  C2  D2
3  C3  D3
Cont…
• In vertical fragmentation, the fields or columns of a table are grouped
into fragments. To maintain reconstructibility, each fragment
should contain the primary key field(s) of the table.
• Vertical fragmentation can be used to enforce data privacy, for example
by storing sensitive columns in a separate fragment.
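The role of the primary key in vertical fragmentation can be shown with a short Python sketch. It uses the relation with columns P, A, B, C, D from the slides; the helper names `vertical_fragment` and `join_on_key` are hypothetical and chosen only for this illustration.

```python
# Global relation from the slides; P is the primary key.
R = [
    {"P": 1, "A": "A1", "B": "B1", "C": "C1", "D": "D1"},
    {"P": 2, "A": "A2", "B": "B2", "C": "C2", "D": "D2"},
    {"P": 3, "A": "A3", "B": "B3", "C": "C3", "D": "D3"},
]


def vertical_fragment(relation, key, attributes):
    """Project `relation` onto the key column plus the given attributes."""
    columns = [key] + list(attributes)
    return [{c: row[c] for c in columns} for row in relation]


def join_on_key(left, right, key):
    """Rebuild rows by matching the two fragments on their shared key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


r1 = vertical_fragment(R, "P", ["A", "B"])
r2 = vertical_fragment(R, "P", ["C", "D"])

print(r1[0])                          # {'P': 1, 'A': 'A1', 'B': 'B1'}
print(join_on_key(r1, r2, "P") == R)  # True
```

Without P in both fragments there would be no column to join on, so the original rows could not be reassembled reliably.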
Hybrid fragmentation
• In hybrid fragmentation, a combination of horizontal and vertical
fragmentation is used.
• It is the most flexible fragmentation technique since it generates
fragments with minimal extraneous information. However,
reconstruction of the original table is often an expensive task.
• Hybrid fragmentation can be done in two ways:
1. First generate a set of horizontal fragments, then generate
vertical fragments from one or more of the horizontal fragments.
Cont…

2. First generate a set of vertical fragments, then generate horizontal
fragments from one or more of the vertical fragments.
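As an example, the relation below is first fragmented vertically, and the
(P, A, B) fragment is then fragmented horizontally.

Global relation:
P  A   B   C
1  A1  B1  C1
2  A2  B2  C2
3  A3  B3  C3

Vertical fragment (P, A, B):
P  A   B
1  A1  B1
2  A2  B2
3  A3  B3

Vertical fragment (P, C):
P  C
1  C1
2  C2
3  C3

Horizontal fragments of (P, A, B):
P  A   B
1  A1  B1
2  A2  B2

P  A   B
3  A3  B3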
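The vertical-then-horizontal variant of hybrid fragmentation can be sketched as follows. This is an illustration under assumptions: the slides show only which rows end up in each fragment, so the split condition `P <= 2` and the helper names `project` and `select` are hypothetical.

```python
# Global relation (P, A, B, C) from the hybrid fragmentation example.
R = [
    {"P": 1, "A": "A1", "B": "B1", "C": "C1"},
    {"P": 2, "A": "A2", "B": "B2", "C": "C2"},
    {"P": 3, "A": "A3", "B": "B3", "C": "C3"},
]


def project(relation, columns):
    """Vertical step: keep only the listed columns."""
    return [{c: row[c] for c in columns} for row in relation]


def select(relation, predicate):
    """Horizontal step: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Step 1: vertical fragments, both carrying the key P.
pab = project(R, ["P", "A", "B"])
pc = project(R, ["P", "C"])

# Step 2: split the (P, A, B) fragment horizontally (assumed condition).
pab_low = select(pab, lambda r: r["P"] <= 2)
pab_high = select(pab, lambda r: r["P"] > 2)

# Reconstruction: union the horizontal pieces, then join with (P, C) on P.
c_by_key = {row["P"]: row["C"] for row in pc}
rebuilt = [{**row, "C": c_by_key[row["P"]]} for row in pab_low + pab_high]

print(len(pab_low), len(pab_high))  # 2 1
print(rebuilt == R)                 # True
```

Reconstruction needs both a union and a join, which is one reason rebuilding the original table tends to cost more than with a single fragmentation type.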
Data Replication and Allocation
• Replication is useful in improving the availability of data. The most extreme case is
replication of the whole database at every site in the distributed
system, thus creating a fully replicated distributed database. This can
improve availability remarkably.
• Full replication also improves performance of retrieval (read performance).
• The disadvantage of full replication is that it can slow down update
operations (write performance) drastically.
• Full replication makes the concurrency control and recovery techniques
more expensive than they would be if there were no replication.
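The trade-off between read availability and update cost can be made concrete with a simplified model. The model is an assumption and does not come from the slides: each site is taken to be up independently with probability `p`, a read succeeds if at least one replica is reachable, and an update is assumed to need every replica (a "write-all" policy, which is only one of several possible update strategies).

```python
def read_availability(p, n):
    """Probability that at least one of n independent replicas is up."""
    return 1 - (1 - p) ** n


def write_all_availability(p, n):
    """Probability that all n replicas are up (assumed write-all policy)."""
    return p ** n


# Assumed per-site availability of 90%, for illustration only.
p = 0.9
for n in (1, 3, 5):
    print(
        f"replicas={n} "
        f"read={read_availability(p, n):.5f} "
        f"write={write_all_availability(p, n):.5f}"
    )
```

Under these assumptions, adding replicas pushes read availability toward 1 while the chance that an update can reach every copy falls, which matches the qualitative points in the bullets above.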
Cont…
• The other extreme involves having no replication: each
fragment is stored at exactly one site. In this case, all fragments must
be disjoint. This is also called nonredundant allocation.
• Between these two extremes, we have a wide spectrum of partial
replication of the data: some fragments of the database may
be replicated whereas others may not.
• A description of the replication of fragments is sometimes called a
replication schema. Each fragment (or each copy of a fragment)
must be assigned to a particular site in the distributed system. This
process is called data distribution (or data allocation).
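A replication schema can be represented as a mapping from each fragment to the set of sites that hold a copy. The sketch below classifies such a mapping into the three cases described above. The function name `classify_replication` and the fragment and site names are hypothetical, introduced only for illustration.

```python
def classify_replication(schema, sites):
    """Classify a replication schema.

    schema: dict mapping fragment name -> set of sites storing a copy.
    sites:  iterable of all sites in the distributed system.
    """
    all_sites = set(sites)
    copies = [set(placed) for placed in schema.values()]
    if any(not placed for placed in copies):
        raise ValueError("every fragment must be allocated to at least one site")
    if all(placed == all_sites for placed in copies):
        return "full replication"
    if all(len(placed) == 1 for placed in copies):
        return "no replication (nonredundant allocation)"
    return "partial replication"


SITES = ["site1", "site2", "site3"]  # assumed site names

full = {"F1": {"site1", "site2", "site3"}, "F2": {"site1", "site2", "site3"}}
none = {"F1": {"site1"}, "F2": {"site2"}}
partial = {"F1": {"site1", "site2"}, "F2": {"site3"}}

print(classify_replication(full, SITES))     # full replication
print(classify_replication(none, SITES))     # no replication (nonredundant allocation)
print(classify_replication(partial, SITES))  # partial replication
```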
Types of Distributed Databases

• Distributed databases can be broadly classified into homogeneous and
heterogeneous distributed database environments, each with further
sub-divisions, as shown in the following illustration.
Homogeneous Distributed Databases

• In a homogeneous distributed database, all the sites use identical
DBMS and operating systems. Its properties are:
• The sites use very similar software.
• Each site is aware of all other sites and cooperates with other sites to
process user requests.
• The database is accessed through a single interface as if it were a single
database.
Cont….
Types of Homogeneous Distributed Database
• There are two types of homogeneous distributed database:
• Autonomous: each database is independent and functions on its
own. The databases are integrated by a controlling application and use message
passing to share data updates.
• Non-autonomous: data is distributed across the homogeneous nodes
and a central or master DBMS coordinates data updates across the
sites.
Heterogeneous Distributed Databases

• In a heterogeneous distributed database, different sites have different
operating systems, DBMS products and data models. Its properties are:
• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs like relational,
network, hierarchical or object oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited
cooperation in processing user requests.
Cont….
• Types of Heterogeneous Distributed Databases
• Federated − The heterogeneous database systems are independent in
nature and integrated together so that they function as a single
database system.
• Un-federated − The database systems employ a central coordinating
module through which the databases are accessed.
23.5.1 Distributed Query Processing

A distributed database query is processed in stages as follows:

1. Query Mapping. The input query on distributed data is specified formally
using a query language. It is then translated into an algebraic query on global
relations.
This translation is largely identical to the one performed in a centralized
DBMS. The query is first normalized, analyzed for semantic errors, simplified, and finally
restructured into an algebraic query.
2. Localization. In a distributed database, fragmentation results in relations
being stored in separate sites, with some fragments possibly being replicated.
This stage maps the distributed query on the global schema to separate queries
on individual fragments using data distribution and replication information.
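The localization stage can be sketched as a lookup against a catalog of fragments. Everything in this example is an assumption made for illustration: the catalog contents, the fragment and site names, and the rule of picking the first listed replica (a real system would choose the replica during optimization).

```python
# Hypothetical catalog: each horizontal fragment of STUDENT is defined by a
# campus value and is stored at one or more sites.
CATALOG = {
    "student_mekele": {"campus": "MEKELE", "sites": ["site1", "site2"]},
    "student_wollo": {"campus": "WOLLO", "sites": ["site2"]},
    "student_ambo": {"campus": "AMBO", "sites": ["site3"]},
}


def localize(campus=None):
    """Map a global query on STUDENT to (fragment, site) pairs.

    If the query filters on campus, fragments whose defining condition
    cannot match are pruned. Otherwise every fragment must be read.
    """
    plan = []
    for name, info in CATALOG.items():
        if campus is None or info["campus"] == campus:
            # Simplification: always use the first replica in the list.
            plan.append((name, info["sites"][0]))
    return plan


print(localize("MEKELE"))  # [('student_mekele', 'site1')]
print(len(localize()))     # 3
```

Pruning fragments that cannot contribute rows is what lets a query with a campus filter touch one site instead of all of them.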
Cont….
3. Global Query Optimization. Optimization consists of selecting a
strategy from a list of candidates that is closest to optimal.
• A list of candidate queries can be obtained by permuting the ordering
of operations within a fragment query generated by the previous stage.
• Time is the preferred unit for measuring cost. The total cost is a
weighted combination of costs such as CPU cost, I/O costs, and
communication costs.
• Since DDBs are connected by a network, often the communication
costs over the network are the most significant. This is especially true
when the sites are connected through a wide area network (WAN).
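The weighted cost combination can be illustrated with a small sketch. The two candidate strategies, their numbers, and the weights (including the assumed network cost of one microsecond per byte) are invented for illustration and are not measurements from any system.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    """A candidate execution strategy with its estimated resource use."""
    name: str
    cpu_seconds: float
    io_seconds: float
    bytes_shipped: int


def total_cost(s, w_cpu=1.0, w_io=1.0, seconds_per_byte=1e-6):
    """Weighted sum of CPU, I/O, and communication cost, in seconds."""
    return (
        w_cpu * s.cpu_seconds
        + w_io * s.io_seconds
        + seconds_per_byte * s.bytes_shipped
    )


# Two hypothetical ways to answer the same query across sites.
candidates = [
    Strategy("ship_whole_relation", cpu_seconds=0.2, io_seconds=0.5,
             bytes_shipped=50_000_000),
    Strategy("filter_at_source", cpu_seconds=0.4, io_seconds=0.5,
             bytes_shipped=500_000),
]

best = min(candidates, key=total_cost)
print(best.name)  # filter_at_source
```

With these assumed numbers the communication term dominates, so the strategy that ships less data wins even though it uses more CPU.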
Cont…..
4. Local Query Optimization. This stage is common to all sites in the
DDB. The techniques are similar to those used in centralized systems.
The first three stages discussed above are performed at a central
control site, whereas the last stage is performed locally.
THANK YOU
