IN3020/4020 - Database Systems Spring 2020, Week 3.1 Indexing
IN3020/4020 - Database Systems Spring 2020, Week 3.1 Indexing
INDEXING
Index
entries in an index. Search keys may only be constructed for
the key columns in the index, and may contain one or more
column values.
Quick overview:
hVps://en.wikipedia.org/wiki/Database_index
Less quick but more detailed overview (for geeks):
hVps://www.geeksforgeeks.org/indexing-in-databases-set-1/
Primary index:
Dense vs. Sparse Indexes
A dense index has one
lookup for each value of the
search key
A simple comparison
• 1.000.000 entries of 300B,4B search key, 4B pointers
• 4K block size, average 5,6ms for fetching a block
• 13,6 records per block, i.e., 76924. blocks with data
• 512 indexes per block, i.e., 1954 blocks for a dense
o Without index: index and 151 blocks for a sparse index
o 76924/2 = 38462 average block access.
o Takes 38462 * 5,6 ms = 215,4 s
o With dense index and binary search:
o [log2(1954)] + 1 = 11 + 1 = 12 block accesses (max.),
o Takes 12 * 5,6 ms = 67,2 ms
o 3205 Hmes faster than the one without indexes!
o With sparse index and binary search:
o [log2(151)] + 1 = 8 + 1 = 9 block accesses (max.),
Takes 9 * 5,6 ms = 50,4 ms
o 4272 Hmes faster as compared to no index, and 1,33 Hmes faster than dense index
A simple comparison
o An index can occupy several blocks
o A mulH-level index (i.e., an index on an
index) can improve performance
o ConHnuing the example:
o Needs only 1954 / 512 = 4 blocks for 2. level h
o [log2(4)] + 1 + 1 = 2 + 1 + 1 = 4 block access,
takes 4 * 5,6 ms = 22,4 ms
o 2,25 faster than a simple sparse index,
3 Hmes faster than a dense index
o One can in principle have any number of
indices
Deletion in the case of
sparse index
o Delete entry where a = 60
o No change necessary for the
index
DENSE SPARSE
Space required One index field per entry One index field per data block
Block access “Many” “Feẘ”
Access to entry Direct access Must search in the data block
Must always access the data
Exists-queries Uses only the index
block
Usage All cases Not on unsorted elements
Always updated if entry Updated only if the first entry of
Updates/changes
sequence is changed the data block is changed
Inverted Indices
oStrategy 1:
o Use an index, f.ex. on a
o Find and fetch all records with a = 30.
o Search through these records to find records with b < 5.
☺ Simple strategy
☹ Risks reading many unnecessary records from disk (storage)
Several conditions: Strategy 2
select c from R where a = 30 and b = ´x´