Storage System Hierarchy in DBMS
The storage hierarchy typically has multiple levels, each with its specific
characteristics. Here's a typical hierarchy from fastest (and usually most expensive
per byte) to slowest (and usually least expensive per byte):
1. Registers
Located within the CPU.
Smallest and fastest type of storage.
Used to hold data currently being processed.
2. Cache Memory
Small, fast memory located between the CPU and main memory.
Holds frequently used data and instructions to speed up processing.
3. Main Memory (RAM)
Volatile working memory where the operating system, programs, and data in current use reside.
4. Flash Memory / Solid-State Drives (SSDs)
Non-volatile solid-state storage with no moving parts.
Faster than magnetic disks but more expensive per byte.
5. Magnetic Disks (HDDs)
The traditional medium for long-term database storage.
Supports random access to data.
6. Optical Disks
CDs and DVDs, mainly used for software distribution and archiving.
7. Magnetic Tapes
Sequential access storage, unlike disks which are random access.
Often used for backups and archiving due to their high capacity and low cost.
Much slower access times compared to magnetic disks.
Types of Storage:
1. Primary Storage: Includes registers, cache memory, and main memory
(RAM). It's the main memory where the operating system, application
programs, and data in current use are kept for quick access by the computer's
processor.
2. Secondary Storage: Encompasses data storage devices like HDDs, SSDs,
CDs, and USB drives. It is non-volatile and retains data even when the
computer is turned off.
3. Tertiary Storage or Off-line Storage: Often involves magnetic tape systems
or optical disk archives. This is slower than secondary storage and is used for
data archiving and backup.
4. Quaternary Storage: Refers to methods like cloud storage or other remote
storage techniques where data is stored in remote servers and is fetched over
the internet or other networks.
File Organization
File organization refers to the arrangement of data on storage devices. The method
chosen can have a profound effect on the efficiency of various database operations.
Common methods of file organization include sequential, hash, and indexed organization.
Sequential File Organization
Ordered Records: Records in a sequential file are stored based on a key field.
Contiguous Memory Allocation: The records are stored in contiguous memory locations.
No Direct Access: To access a record, you have to traverse from the first record
until you find the desired one.
Simplicity: The design and logic behind sequential file organization are
straightforward.
Efficient for Batch Processing: Since records are stored in sequence, sequential
processing (like batch updates) can be done efficiently.
Less Overhead: There's no need for complex algorithms or mechanisms to store
records.
Inefficient for Random Access: If you need a specific record, you may have to go through many records before finding the desired one. This makes random access slow.
Insertion and Deletion: Inserting or deleting a record (other than at the end) can be time-consuming, since you may need to shift records to maintain the order.
Redundancy Issues: There's a risk of redundancy if checks are not made before inserting records. For example, a record with the same key might get added twice if not checked.
Practical Application: Suppose you have a file of students ordered by their roll
number:
Roll No | Name
--------|--------
1 | Madhu
2 | Naveen
4 | Shivaji
5 | Durga
In a sequential file, if you wanted to add a student with roll number 6, you would
append them at the end. However, if you wanted to insert a student with a roll
number 3 which is between 2 and 4, you would need to shift all subsequent records
to maintain the sequence, which can be time-consuming.
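The shifting cost can be sketched in a few lines of Python; the list stands in for contiguous records on disk, and the new student's name is made up for illustration:

```python
# Sequential file ordered by roll number (the key field).
records = [(1, "Madhu"), (2, "Naveen"), (4, "Shivaji"), (5, "Durga")]

def insert_sequential(records, new_record):
    """Insert while keeping key order; every later record is shifted right."""
    key = new_record[0]
    pos = 0
    while pos < len(records) and records[pos][0] < key:
        pos += 1
    records.insert(pos, new_record)  # shifts all records after pos by one slot

insert_sequential(records, (3, "Ravi"))  # "Ravi" is a hypothetical name
print([r[0] for r in records])  # keys are back in order: [1, 2, 3, 4, 5]
```

On disk the shift means rewriting every block after the insertion point, which is why appending (roll number 6) is cheap but inserting in the middle (roll number 3) is not.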
Hash File Organization
Hash Function: A hash function converts a record's key value into an address.
Buckets: A bucket typically stores one or more records. A hash function might map
multiple keys to the same bucket.
No Ordering of Records: Records are not stored in any specific logical order.
Rapid Access: If the hash function is efficient and there's minimal collision, the
retrieval of a record is very quick.
Uniform Distribution: A good hash function will distribute records uniformly across
all buckets.
Efficient Search: Searching becomes efficient as only a specific bucket needs to be
searched rather than the entire file.
Collisions: A collision occurs when two different keys hash to the same bucket. Handling collisions can be tricky and might affect access time.
Dependency on Hash Function: The efficiency of a hash file organization depends heavily on the hash function used. A bad hash function can lead to clustering and inefficient utilization of space.
Dynamic Growth and Shrinking: If the number of records grows or shrinks significantly, rehashing might be needed, which is an expensive operation.
Practical Application: Imagine a database that holds information about books.
Each book has a unique ISBN number. A hash function takes an ISBN and returns
an address. When you want to find a particular book's details, you hash the ISBN,
which directs you to a particular bucket. If two books' ISBNs hash to the same value,
you handle that collision, maybe by placing the new record in a linked list associated
with that bucket.
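A minimal sketch of this scheme in Python, assuming a fixed bucket count and made-up ISBNs, with collisions handled by chaining records within a bucket:

```python
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]  # each bucket is a chain of records

def bucket_for(isbn):
    """Hash function: map an ISBN string to a bucket number."""
    return hash(isbn) % NUM_BUCKETS

def insert(isbn, title):
    # Colliding keys simply extend the same bucket's chain.
    buckets[bucket_for(isbn)].append((isbn, title))

def lookup(isbn):
    # Only one bucket is scanned, not the whole file.
    for key, title in buckets[bucket_for(isbn)]:
        if key == isbn:
            return title
    return None

insert("978-0135957059", "Book A")  # illustrative ISBNs, not real catalog data
insert("978-0262033848", "Book B")
print(lookup("978-0262033848"))  # Book B
```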
Indexed File Organization
Primary Data File: The actual database file where records are stored.
Index: An auxiliary file that contains key values and pointers to the corresponding
records in the data file.
Multi-level Index: Sometimes, if the index becomes large, a secondary (or even
tertiary) index can be created on the primary index to expedite searching further.
Quick Random Access: Direct access to records is possible using the index.
Flexible Searches: Since an index provides a mechanism to jump directly to
records, different types of search operations (like range queries) can be efficiently
supported.
Ordered Access: If the primary file is ordered, then indexed file organization can
support efficient sequential access too.
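A rough Python sketch of indexed access, assuming a small data file and a sorted (key, pointer) index; both direct lookups and range queries go through the index:

```python
import bisect

data_file = ["rec-A", "rec-B", "rec-C", "rec-D"]  # primary data file (illustrative)
index = [(10, 0), (20, 1), (30, 2), (40, 3)]      # (search key, pointer to record)
keys = [k for k, _ in index]

def find(key):
    """Direct access: binary-search the index, then follow the pointer."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return data_file[index[i][1]]
    return None

def range_query(lo, hi):
    """Range query: walk the contiguous run of index entries in [lo, hi]."""
    i = bisect.bisect_left(keys, lo)
    j = bisect.bisect_right(keys, hi)
    return [data_file[p] for _, p in index[i:j]]

print(find(30))             # rec-C
print(range_query(15, 35))  # ['rec-B', 'rec-C']
```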
Indexing
Indexing involves creating an auxiliary structure (an index) to improve data retrieval
times. Just like the index in the back of a book, a database index provides pointers to
the locations of records.
Structure of Index
An index can be created on one or more columns of a table.
|-------------|----------------|
| Search Key | Data Reference |
|-------------|----------------|
The search key is the index's first column; it holds a copy of the table's primary key or candidate key. The key values are stored in sorted order so that the related data can be accessed quickly.
The data reference is the index's second column. It contains pointers to the disk block where the record with that key value can be found.
Types of Indexes:
1. Single-level Index: A single index table that contains pointers to the actual
data records.
2. Multi-level Index: An index of indexes. This hierarchical approach reduces
the number of accesses (disk I/O operations) required to find an entry.
3. Dense and Sparse Indexes:
o In a dense index, there's an index entry for every search key value in
the database.
o In a sparse index, there are fewer index entries. One entry might point
to several records.
4. Primary and Secondary Indexes:
o A primary index is an ordered file whose records are of fixed length
with two fields. The first field is the same as the primary key, and the
second field is a pointer to the data block. There's a one-to-one
relationship between the number of entries in the index and the number
of records in the main file.
o A secondary index provides a secondary means of accessing data. For
each secondary key value, the index points to all the records with that
key value.
5. Clustered vs. Non-clustered Index:
o In a clustered index, the rows of data in the table are stored on disk in
the same order as the index. There can only be one clustered index
per table.
o In a non-clustered index, the order of rows does not match the index's
order. You can have multiple non-clustered indexes.
6. Bitmap Index: Used mainly for data warehousing setups, a bitmap index
uses bit arrays (bitmaps) and usually involves columns that have a limited
number of distinct values.
7. B-trees and B+ trees: Balanced tree structures that ensure logarithmic
access time. B+ trees are particularly popular in DBMS for their efficiency in
disk I/O operations.
Benefits of Indexing:
Faster search and retrieval times for database operations.
Drawbacks of Indexing:
Overhead for insert, update, and delete operations, as indexes need to be
maintained.
Additional storage requirements for the index structures.
For example, if book records are stored in the order of their BookID and an index is built on that column, the DBMS can quickly locate a book by its ID.
1. Dense Index: In this, an index entry appears for every search key value in the
data file.
2. Sparse (or Non-Dense) Index: Here, index records are created only for some
of the search key values. A sparse index reduces the size of the index file.
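The difference can be sketched in Python, assuming each block holds two records (an illustrative layout, reusing the roll-number data from earlier):

```python
blocks = [[(1, "Madhu"), (2, "Naveen")],   # block 0
          [(4, "Shivaji"), (5, "Durga")]]  # block 1

# Dense index: one entry per search key value.
dense = {1: 0, 2: 0, 4: 1, 5: 1}           # key -> block number

# Sparse index: one entry per block (the first key stored in that block).
sparse = [(1, 0), (4, 1)]                   # (first key in block, block number)

def sparse_lookup(key):
    """Find the last sparse entry with key <= target, then scan its block."""
    block_no = 0
    for first_key, b in sparse:
        if first_key <= key:
            block_no = b
    for k, name in blocks[block_no]:
        if k == key:
            return name
    return None

print(sparse_lookup(4))  # Shivaji, found via only two index entries
```

The sparse index holds half as many entries here; the price is the short scan inside the chosen block.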
Characteristics of a Secondary Index:
1. Provides an alternative path to access the data.
2. Can be either dense or sparse.
3. Allows for non-unique values.
4. Does not guarantee the order of records in the data file.
5. Unlike a primary (often clustered) index, a secondary index is typically a non-clustered index. This means the physical order of rows in a table is not the same as the index order.
Consider a secondary index on students' ages: each age value in the index points directly to the corresponding record. If another student with an age of 22 is added, the entry for 22 simply gains a second pointer, since a secondary index allows non-unique values.
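A secondary index on a non-unique column such as age can be sketched as follows (the student records are illustrative):

```python
from collections import defaultdict

students = [("S1", "Abhi", 21), ("S2", "Bharath", 22), ("S3", "Chinni", 23)]

# Secondary index: age -> list of record positions (non-unique values allowed).
age_index = defaultdict(list)
for pos, (_, _, age) in enumerate(students):
    age_index[age].append(pos)

# Adding another 22-year-old extends the existing entry for 22.
students.append(("S4", "Devid", 22))
age_index[22].append(len(students) - 1)

print([students[p][1] for p in age_index[22]])  # ['Bharath', 'Devid']
```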
Hash-Based Indexing
In hash-based indexing, a hash function is used to convert a key into a hash code.
This hash code serves as an index where the value associated with that key is
stored. The goal is to distribute the keys uniformly across an array, so that access
time is, on average, constant.
Let's break down some of these elements to further understand how hash-based
indexing works in practice:
Buckets
In hash-based indexing, the data space is divided into a fixed number of slots known
as "buckets." A bucket usually contains a single page (also known as a block), but it
may have additional pages linked in a chain if the primary page becomes full. This is
known as overflow.
Hash Function
The hash function is a mapping function that takes the search key as an input and
returns the bucket number where the record should be located. Hash functions aim
to distribute records uniformly across buckets to minimize the number of collisions
(two different keys hashing to the same bucket).
Insert Operations
When a new record is inserted into the dataset, its search key is hashed to find the
appropriate bucket. If the primary page of the bucket is full, an additional overflow
page is allocated and linked to the primary page. The new record is then stored on
this overflow page.
Search Operations
To find a record with a specific search key, the hash function is applied to the search
key to identify the bucket. All pages (primary and overflow) in that bucket are then
examined to find the desired record.
Limitations
Hash-based indexing is not suitable for range queries or when the search key is not known. In such cases, a full scan of all pages is required, which is resource-intensive.
Hash Function: H(x) = ASCII value of first letter of the name mod 3
Alice: 65 mod 3 = 2
Bob: 66 mod 3 = 0
Carol: 67 mod 3 = 1
Buckets:
Bucket 0: Bob
Bucket 1: Carol
Bucket 2: Alice
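The same bucket assignment, written out in Python:

```python
def H(name):
    """H(x) = ASCII value of the first letter of the name, mod 3."""
    return ord(name[0]) % 3

buckets = {0: [], 1: [], 2: []}
for name in ["Alice", "Bob", "Carol"]:
    buckets[H(name)].append(name)

# 'A' = 65 -> 2, 'B' = 66 -> 0, 'C' = 67 -> 1
print(buckets)  # {0: ['Bob'], 1: ['Carol'], 2: ['Alice']}
```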
Tree-based Indexing
The most commonly used tree-based index structure is the B-Tree, and its variations
like B+ Trees and B* Trees. In tree-based indexing, data is organized into a tree-like
structure. Each node represents a range of key values, and leaf nodes contain the
actual data or pointers to the data.
Sorted Data: They maintain data in sorted order, making it easier to perform
range queries.
Balanced Tree: B-Trees and their variants are balanced, meaning the path
from the root node to any leaf node is of the same length. This balancing
ensures that data retrieval times are consistently fast, even as the dataset
grows.
Multi-level Index: Tree-based indexes can be multi-level, which helps to
minimize the number of disk I/Os required to find an item.
Dynamic Nature: B-Trees are dynamic, meaning they're good at inserting
and deleting records without requiring full reorganization.
Versatility: They are useful for both exact-match and range queries.
ID | Name
---|--------
1  | Abhi
2  | Bharath
3  | Chinni
4  | Devid
[1, 3]
/ \
[1] [3, 4]
/ \ / \
1 2 3 4
In the tree, navigating from the root to the leaf nodes will lead us to the desired data
record.
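What the tree buys us can be approximated in Python with a sorted key array and binary search standing in for the root-to-leaf traversal (a sketch, not a real B+ tree):

```python
import bisect

table = {1: "Abhi", 2: "Bharath", 3: "Chinni", 4: "Devid"}
sorted_keys = sorted(table)  # a B+ tree keeps keys sorted in its leaves

def exact_match(key):
    # O(log n) search, analogous to one root-to-leaf walk in the tree.
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return table[key]
    return None

def range_query(lo, hi):
    """Sorted leaves make a range scan a contiguous walk of keys."""
    i = bisect.bisect_left(sorted_keys, lo)
    j = bisect.bisect_right(sorted_keys, hi)
    return [table[k] for k in sorted_keys[i:j]]

print(exact_match(3))     # Chinni
print(range_query(2, 4))  # ['Bharath', 'Chinni', 'Devid']
```

Hash-based indexing cannot answer the range query above without scanning everything, which is exactly why B+ trees dominate general-purpose DBMS indexes.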
File Organizations
1. Objective: To physically store records on storage media in an organized manner.
2. Methodologies: Includes sequential, random (or direct), and hashed file
organizations, among others.
3. Implications:
Sequential organization is efficient for batch processing but slow for random access.
Hashed organization gives fast exact-match access but handles range queries poorly.
The chosen organization determines how cheaply records can be inserted, deleted, and scanned.
Indexing
1. Objective: To create a data structure that improves the speed of data retrieval
operations.
2. Methodologies: Includes clustered, non-clustered, primary, secondary,
composite, bitmap, and hash indexes, among others.
3. Implications:
Clustered indexes are excellent for range-based queries but slow down
insert/update operations.
Non-clustered indexes improve data retrieval speed but can take up additional
storage.
Bitmap indexes are useful for low-cardinality columns.
4. Real-world Examples: Search engines, e-commerce websites, any application
that requires fast data retrieval.
Performance Tuning
1. Objective: To optimize the resources used by the database for efficient
transaction processing.
2. Methodologies: Query optimization, index tuning, denormalization, database
sharding, caching, partitioning, etc.
3. Implications:
Query optimization can dramatically reduce the resources needed for query
processing.
Proper indexing can mitigate the need for full-table scans.
Denormalization and caching can improve read operations but may
compromise data integrity or consistency.
4. Real-world Examples: Financial trading systems, real-time analytics, high-performance computing.
Key Differences
1. Granularity:
File organization is about how data is stored at the file level.
Indexing is about improving data access at the table or even column level.
Performance tuning is a broad set of activities that can encompass both file
organization and indexing among many other techniques.
2. Resource Usage:
File organization techniques aim to use disk space efficiently.
Indexing aims to use both disk space and memory for fast data retrieval.
Performance tuning aims to optimize all system resources including CPU,
memory, disk, and network bandwidth.
3. Query Efficiency:
File organization generally impacts how efficiently data can be read or written
to disk.
Indexing significantly impacts how efficiently queries can retrieve data.
Performance tuning seeks to optimize both read and write operations through
a variety of methods.
4. Complexity:
File organization is relatively straightforward.
Indexing can become complex depending on the types of indexes and the
nature of the queries.
Performance tuning is usually the most complex as it involves a holistic
understanding of hardware, software, data, and queries.