PHYSICAL STRUCTURES
"You have a point there," said Trurl,
"We will have to figure some way around that......The
main thing, however, is the algorithm."
"Any child knows that! What's a
beast without an algorithm?"
Stanislaw Lem,
The Cyberiad
Access Methods
Accessing
is the process of obtaining data stored on storage devices.
Access methods are generalized I/O routines that conceal most
device-dependent details from the file management or database management system.
Some existing access routines also include functions generally attributed to
a DBMS, such as keyword compression, sequencing, sorting, secondary index
generation, etc.
Buffering
The purpose of
buffering
is to allow the operating system to
transfer one buffer of data, while the application program is processing data
in another buffer. An example is illustrated below.
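The overlap can be sketched in Python (the producer thread, queue, and block-reader below are illustrative assumptions, not part of the text): one thread reads the next buffer while the main program processes the current one.

```python
import threading
import queue

def double_buffered_read(read_block, num_blocks):
    """Yield blocks for processing while the next block is being read."""
    buffers = queue.Queue(maxsize=1)    # at most one extra buffer in flight

    def producer():                     # plays the role of the operating system
        for i in range(num_blocks):
            buffers.put(read_block(i))  # "I/O" into the other buffer
        buffers.put(None)               # end-of-file marker

    threading.Thread(target=producer, daemon=True).start()
    while (buf := buffers.get()) is not None:
        yield buf                       # the application processes this buffer

# Simulated device: each of 3 blocks holds 4 logical records.
blocks = [[f"rec{i}-{j}" for j in range(4)] for i in range(3)]
records = [r for buf in double_buffered_read(lambda i: blocks[i], 3) for r in buf]
```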
Blocking
Blocking
refers to the practice of placing several logical records
within one physical record. The purposes of blocking are: (a) to reduce
access time by reading/writing more data in one I/O access, and (b) to increase the
amount of data stored on I/O devices. The first reason is quite evident,
because I/O devices such as disks and tapes usually have long latency times
and relatively short data transmission times. The second reason becomes
evident when we consider the inter-record gaps that waste space on I/O devices.
For tapes and disks, it is common
practice to block many logical records into a single physical record, in order
to reduce this inter-record gap.
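Blocking and deblocking of fixed-length records can be sketched as follows (the record size and blocking factor are arbitrary choices for illustration):

```python
RECORD_SIZE = 20        # bytes per logical record (hypothetical)
BLOCKING_FACTOR = 4     # logical records per physical record

def block_records(records):
    """Pack fixed-length logical records into physical blocks."""
    padded = [r.ljust(RECORD_SIZE)[:RECORD_SIZE].encode() for r in records]
    return [b"".join(padded[i:i + BLOCKING_FACTOR])
            for i in range(0, len(padded), BLOCKING_FACTOR)]

def deblock(block):
    """Split one physical record back into its logical records."""
    return [block[i:i + RECORD_SIZE].decode().rstrip()
            for i in range(0, len(block), RECORD_SIZE)]

# 10 logical records occupy 3 physical records instead of 10,
# reducing the number of inter-record gaps paid on the device.
blocks = block_records([f"record-{i}" for i in range(10)])
```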
Sequential Access Method
The logical records are accessed in a sequential
order, either according to their natural sequence of entry (or
entry-sequence),
or according to a fixed sequence based upon the sorted value of a key (or
key-sequence).
Direct Access Method
If the records are stored in a direct-access storage
device, then with the correct formatting and blocking, the records could be
accessed directly by their physical position, or address, or by a key.
A symbolic key may or may not be used directly to locate a logical record in the
storage device. If there exists a
key-to-address transformation
which could be
implemented by a computer program to convert a symbolic key to the physical
address of the record, then we can use symbolic keys for direct access of
records.
Indexed Access Method
The indexed access method can be used for either
sequential or direct processing and accessing of records in a file. The
records are logically arranged with respect to a certain collating sequence,
according to a key field in each record. A
directory
(sometimes called an
index)
is maintained for either sequential or direct access.
Conceptually, key-to-address transformation is analogous
to using a mathematical function to specify a mapping from keys to addresses, while an index
is analogous to using a table to specify the mapping.
An example is illustrated. Using this table (or index, directory),
we can find the address of any record, given the key value. For example,
the record with key '00120' has the address '1'.
The index above is searched sequentially. When the number of records
becomes large, such sequential searching becomes uneconomical. A
search tree
(also called tree structure index, multilevel index, or multilevel
directory) can be used, as shown below. A search tree, or a multilevel
directory, is a generalization of the binary search tree.
To find the address of the record with key '04800', we start with the first-level
index, and compare the key '04800' with the keys stored in this index. We find
the first key which is greater than or equal to the
search key
(the key for the record to be searched) in the collating sequence, and follow the pointer
to the second-level index. In this example, since '06127' is larger than
'04800', the second-level index B is entered. We again compare '04800' against
the stored keys, and find a match. Since there are no more levels of indices,
the address of the record with key '04800' is found to be '5'.
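This lookup can be sketched in Python (keys and addresses for index B are taken from the example; index A and its addresses are illustrative assumptions). Each entry pairs a key with either a lower-level index or a record address, and at each level we follow the first entry whose key is greater than or equal to the search key.

```python
from bisect import bisect_left

# Two-level index modeled on the example above.
LEVEL2_A = [("00120", 1), ("02345", 7)]            # hypothetical addresses
LEVEL2_B = [("04800", 5), ("06127", 3)]
LEVEL1 = [("02345", LEVEL2_A), ("06127", LEVEL2_B)]

def search(index, key):
    """Follow the first entry whose key is >= the search key at each level."""
    node = index
    while True:
        keys = [k for k, _ in node]
        i = bisect_left(keys, key)        # first stored key >= search key
        if i == len(keys):
            return None                   # beyond the largest stored key
        k, child = node[i]
        if isinstance(child, list):       # descend to the next index level
            node = child
        else:                             # lowest level: a record address
            return child if k == key else None

address = search(LEVEL1, "04800")   # follows '06127' into index B, finds 5
```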
In the above example, the directory contains an entry for every record. Such an index is called a
dense index.
If a dense index is used, the indexed access
method is sometimes called the
Indexed Direct Access Method (IDAM).
For
Indexed Sequential Access Method (ISAM),
the index is usually
nondense.
As an example, if the second-level tables are removed, then it becomes a
nondense index. When a nondense index is used, the addresses in the directory
are the addresses of the first records in blocks of records, as shown below. In
order to locate the correct record, a sequential scan of the records in a block
has to be made.
Key-To-Address Transformation
Key-to-address transformation algorithms
enable the retrieval of records in one access, by directly computing the
record address from the given search key. An example is illustrated below.
Hashing Technique
In hashing schemes, the key is converted into a near-random
number, and this number is used to determine the record address, or the address
of a group of records, called a
bucket.
The number of logical records stored
in this area is referred to as the
bucket capacity.
The hashing scheme is illustrated below.
When the file is initially loaded, the location in which the records are stored
is determined as follows:
(1) The key is converted into a near-random number n, that lies between 1 and M,
where M is the number of storage buckets.
(2) The logical bucket address n is converted to the physical bucket address,
and the physical record constituting that bucket is read.
(3) If the bucket is full, the record is stored in an
overflow area.
When records are read from the file, the method of finding them is as follows:
(1) The key is converted into a near-random number n, using the same hashing algorithm.
(2) The logical bucket address n is converted to the physical bucket address,
and the bucket is searched to find the desired logical record, by sequential
scanning, block-search, or binary-search techniques.
(3) If the record is not in the bucket, then the overflow area is searched.
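The load-and-fetch procedure can be sketched as follows (the modulus-based transformation, bucket count, and capacity are illustrative assumptions; a real implementation would read physical buckets from disk):

```python
M = 5                   # number of storage buckets (hypothetical)
BUCKET_CAPACITY = 2     # logical records per bucket

buckets = [[] for _ in range(M)]
overflow = []           # a single shared overflow area, for illustration

def bucket_of(key):
    """Key-to-address transformation: key -> bucket number in 0..M-1."""
    return sum(map(ord, key)) % M

def store(key, record):
    bucket = buckets[bucket_of(key)]
    if len(bucket) < BUCKET_CAPACITY:
        bucket.append((key, record))
    else:
        overflow.append((key, record))    # bucket full: spill to overflow area

def fetch(key):
    for k, r in buckets[bucket_of(key)]:  # sequential scan of the bucket
        if k == key:
            return r
    for k, r in overflow:                 # not found: search the overflow area
        if k == key:
            return r
    return None

for i in range(12):     # 12 records into a capacity of 10: some must overflow
    store(f"{i:05d}", {"recno": i})
```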
The factors affecting the performance of a hashing algorithm are the following:
(1) The
bucket size,
or how many logical records can be stored per bucket. Smaller
buckets cause a higher proportion of overflows, which necessitates
additional bucket reads and possibly additional seeks. If a direct-access
storage device with a long access time is used, it is desirable to minimize
the number of overflows by choosing a bucket size of 10 or more. If the
entire file is stored in main memory, then a bucket size of 1 can be chosen,
so that the bucket-searching operation is more efficient. In practice,
bucket capacity is often tailored to the hardware characteristics. For
example, one track of a disk unit, or half a track, may be made one bucket.
(2) The
packing density,
or the ratio of the number of records in a file to
the total number of records that can be stored in the buckets. Because of
the random nature of the hashing algorithm, it is not possible to achieve
100% packing density. A higher packing density means more overflows but
more efficient storage; a lower packing density means fewer overflows
but less efficient storage. The trade-off between storage and access time can
be investigated by adjusting the two parameters - bucket size and packing
density.
(3) The hashing algorithm to perform key-to-address transformation. The ideal
transform is one which can distribute the key set uniformly across the
address space. The best way to choose a transform is to take the key set
for the file in question and simulate the behavior of many possible transforms.
For each transform, all the keys in the file are converted, and
the number of records in each bucket counted. The percentage of overflows
for given bucket sizes can then be evaluated, and the best hashing algorithm
chosen.
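The simulation can be sketched as follows (the key set, the two candidate transforms, and the bucket parameters are all illustrative assumptions):

```python
def overflow_fraction(keys, transform, num_buckets, bucket_size):
    """Hash every key in the file and count the records that would overflow."""
    counts = [0] * num_buckets
    for key in keys:
        counts[transform(key) % num_buckets] += 1
    overflowed = sum(max(0, c - bucket_size) for c in counts)
    return overflowed / len(keys)

# Hypothetical key set and two candidate transforms to compare.
keys = [f"{n:05d}" for n in range(0, 5000, 7)]
transforms = {
    "division": lambda k: int(k),                         # division-remainder
    "mid-square": lambda k: (int(k) ** 2 >> 8) & 0xFFFF,  # one mid-square variant
}
results = {name: overflow_fraction(keys, t, num_buckets=100, bucket_size=10)
           for name, t in transforms.items()}
```

For this key set the division-remainder transform spreads the keys evenly and produces no overflows at these parameters; the comparison would be repeated for each serious candidate.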
(4) The method of handling overflows, or the way of storing overflowed records
in overflow areas. The commonly used techniques are illustrated below.
Overflowed records can be organized into
overflow chains
in a separate overflow
area (a). The overflow area can also be organized with
overflow buckets,
so that the number of additional disk seeks can be
reduced (b). In the latter case, when a record is deleted, a
deletion flag
is set. Another overflowed record may fill the slot at a later time.
However, if there are many deletions, the overflow area should be reorganized
periodically, to remove the deleted records.
The overflow buckets can also be distributed among the
buckets in the prime area, as shown below (a). This method has the advantage
that the overflow bucket is physically close to the overflowed bucket. No
additional disk seek is necessary, and no chains have to be maintained. In
case an overflow bucket overflows, the next consecutive overflow bucket is
used.
Instead of reserving overflow buckets, the easiest way to handle overflows is the
consecutive spill
method, where overflowed records are stored in the next
consecutive bucket (b). If that bucket is full, the record is stored in the next bucket,
and so forth. The advantage of the consecutive spill
method is that a lengthy seek is usually not needed for overflows. However,
when neighboring buckets become crowded, many buckets may have to be examined
before a free slot is found. The method is efficient if the bucket size is
large, and inefficient otherwise.
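A sketch of consecutive spill (the bucket count, capacity, and transformation are illustrative assumptions; the early stop in the lookup is valid only while no records are deleted):

```python
M = 8                   # number of buckets (hypothetical)
BUCKET_SIZE = 2
table = [[] for _ in range(M)]

def home_bucket(key):
    return sum(map(ord, key)) % M

def insert_spill(key, record):
    """Store in the home bucket; if full, spill into the following buckets."""
    home = home_bucket(key)
    for step in range(M):
        bucket = table[(home + step) % M]
        if len(bucket) < BUCKET_SIZE:
            bucket.append((key, record))
            return True
    return False                          # every bucket is full

def lookup(key):
    home = home_bucket(key)
    for step in range(M):
        bucket = table[(home + step) % M]
        for k, r in bucket:
            if k == key:
                return r
        if len(bucket) < BUCKET_SIZE:     # a non-full bucket ends the search
            return None                   # (valid while no records are deleted)
    return None
```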
The method of handling overflows should be selected, in consideration
of disk access speeds, percentage of overflows, frequencies of insertion and
deletion of records, bucket size, file size, etc.
Type of Indices
An index is
dense
if it includes all key values of a file, and
nondense
otherwise. The indexed sequential access method uses a nondense
index, and the indexed direct access method uses a dense index.
An index can be built on a primary key (or primary composite key). It can also
be built on a secondary key (or secondary composite key). The keys can be
compressed by various techniques. The primary
index can be either dense or nondense, depending on whether ISAM or IDAM is used.
The pointers in the lowest-level index can be absolute pointers, logical pointers,
or symbolic pointers. The pointers can point to an individual record, or to a
block of records, again depending upon the access method used.
There is another kind of index which is called a
secondary index.
A secondary index is built upon a non-key attribute, instead of a key or a composite key.
A secondary index is usually employed as an auxiliary index for information
retrieval from a file, alongside the primary index used for record access
based upon the primary key. However, we must note that for a secondary index
there is no one-to-one address-key correspondence.
The figure below illustrates some of the useful index types. Whether any index should be
constructed, depends upon the access patterns exhibited by the queries, and the
applications intended.
The file is shown in (a). The primary key is A0, and the attributes are
A1, A2, and A3. A nondense primary index is shown in (b). (c)
illustrates a secondary index on A1, where the pointers are logical addresses.
The same secondary index can also be represented by (d), where symbolic
pointers are used (so that the primary index must be used to find the records);
or by (e), where the logical addresses are coded into a bit-string, and
the entire matrix is called a bit matrix. (f) illustrates a nondense
index on A2. This attribute has values between 0 and 30. To compress information,
we have retained only six quantized levels for the attribute A2, and
the logical block addresses are represented by a bit matrix. Finally, a
secondary index on A3 is shown in (g), where only the logical addresses
of records with A3 = "F" are retained. A composite secondary index can also
be formed, as illustrated in (h), where the composite attribute {A2, A3} is
used.
Construction of Primary and Secondary Indices
As an example, the file shown below has six records, with PNO as the
primary key, and SEX as the descriptor. The generalized loader creates an
ISAM file and an ISAM index, plus the file containing (SEX,PNO) pairs as shown.
After sorting on SEX, a secondary index (or descriptor index) is created.
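That construction can be sketched as follows (the six (PNO, SEX) records are hypothetical placeholders for the file shown):

```python
from itertools import groupby

# Hypothetical file: six records with PNO as primary key, SEX as descriptor.
records = [("P1", "M"), ("P2", "F"), ("P3", "F"),
           ("P4", "M"), ("P5", "F"), ("P6", "M")]

# The loader emits (SEX, PNO) pairs, which are then sorted on SEX.
pairs = sorted((sex, pno) for pno, sex in records)

# Grouping the sorted pairs yields the secondary (descriptor) index.
descriptor_index = {sex: [pno for _, pno in group]
                    for sex, group in groupby(pairs, key=lambda p: p[0])}
```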
Inverted File Structure
In a regular file structure, the records are ordered according to the
primary key. Regular file structures are not well suited for answering
attribute queries, i.e., queries based upon non-key attributes.
For example, the RESIDENT file contains
six records as shown below.
The inverted file structure is shown below.
Multilist File Structure
An example of a multilist file structure is illustrated below.
It should be noted that it is not necessary to have lists for every attribute,
and for some attributes, quantized levels are used to reduce the number of lists and size of indices.
In the directory for a multilist file, the list length for every (attribute, value)
pair is also stored; this can be used to decide which list to search first
in query processing.
B-Trees
An example of the construction of a B-tree is illustrated below.
Whenever one index table overflows, we create two half-full index tables
to replace the overflowing table. With every split, one more entry is added
to the next level up. When the top level table splits, a new master level is created
beginning with just two entries.
For example, a new record with key "21" is to be inserted. Since the
original index table T21 overflows, it is split into two new tables
T21 and T22, as shown in (b). Similarly, in (b), after the insertion of record
with key "41", the original index table T23 overflows.
The restructuring of the tree results in the creation of a new master table,
as illustrated in (c).
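The split rule can be sketched as follows (the table capacity and the other key values are illustrative; only the inserted key "21" comes from the example):

```python
CAPACITY = 4   # maximum number of entries per index table (hypothetical)

def split(table):
    """Replace an overflowing index table with two half-full tables.

    Returns the two new tables and the entry promoted to the next level up,
    as in the split of T21 into T21 and T22 described above."""
    mid = len(table) // 2
    left, right = table[:mid], table[mid:]
    return left, right, left[-1]   # the highest key of the left half goes up

# Inserting key 21 overflows a table already holding CAPACITY entries.
overflowing = sorted([10, 18, 25, 33] + [21])
left, right, promoted = split(overflowing)
```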
The IBM VSAM file structure, which employs a variation of Knuth's B+-tree, is illustrated below.
Extendible Hashing
This approach
is similar to the B-tree technique of dynamically reorganizing
the indexing tree. An example of extendible hashing is illustrated below.
Suppose the relation R(K1,K2) has two domains
K1
and
K2,
and
K1
= {A, B, C, D, E, F, G, H, I, J},
K2
= {1, 2, 3, 4, 5, 6}. The relation R is:
K1 K2
A 1
B 1
C 1
D 2
E 1
F 3
G 2
H 4
I 4
J 6
The hash function f1:
K1
-> I is:
K1 I
A 0000
B 0001
C 0010
D 0011
E 0100
F 0101
G 0110
H 0111
I 1000
J 1001
The hash function f1 is a one-to-one
function mapping keys to binary hash codes. However, to construct the hash file,
only a prefix of the hash code string is used, not the entire string.
Suppose there are five records with keys 'A', 'B', 'E', 'G', and 'I'. We
may use only two bits of the hash code. The situation is illustrated in
(a), where d is the length of the truncated hash code, in this case
2. The hash file contains pointers to blocks, each capable of holding two
logical records. Attached to each block is a
number indicating how many bits of the hash code are needed to distinguish the
records stored in that block. In (a), the first two blocks contain
2 records each, and the third block contains 1 record.
Suppose record (F, 3) is to be inserted. Since the truncated hash code of
key "F" is "01", it should be inserted into the block containing records
(E, 1) and (G, 2), causing a split because the block can accommodate only two
records. After the split, one more block is added, and records in the
overflowed block are distributed to the two blocks using the new hash code,
which has been extended by 1 bit, as illustrated in (b).
Since one more bit is used, the depth d is increased to 3, and the hash
table is doubled in size. To avoid doubling the hash table, it is possible
to use a tree structure to
encode the hash table. The resultant dynamic hashing technique is even
closer to the B-tree technique.
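The example's insertions can be sketched in Python (the hash codes are those of f1 above and each block holds two records as in the example; the class layout, the per-block bit count, and the table-doubling mechanics are a sketch of the described scheme, not a definitive implementation):

```python
HASH_BITS = 4
BLOCK_CAPACITY = 2      # each block holds two logical records

# Hash codes from the function f1 defined above.
F1 = {"A": 0b0000, "B": 0b0001, "C": 0b0010, "D": 0b0011, "E": 0b0100,
      "F": 0b0101, "G": 0b0110, "H": 0b0111, "I": 0b1000, "J": 0b1001}

class ExtendibleHash:
    def __init__(self):
        self.d = 1                                    # truncated-code length
        self.table = [{"bits": 1, "recs": {}},        # one block per 1-bit prefix
                      {"bits": 1, "recs": {}}]

    def _slot(self, key):
        return F1[key] >> (HASH_BITS - self.d)        # first d bits of the code

    def insert(self, key, value):
        block = self.table[self._slot(key)]
        if key in block["recs"] or len(block["recs"]) < BLOCK_CAPACITY:
            block["recs"][key] = value
            return
        if block["bits"] == self.d:                   # no spare bit: double table
            self.d += 1
            self.table = [b for b in self.table for _ in range(2)]
        bits = block["bits"] + 1                      # split on one more code bit
        halves = ({"bits": bits, "recs": {}}, {"bits": bits, "recs": {}})
        for k, v in block["recs"].items():
            halves[(F1[k] >> (HASH_BITS - bits)) & 1]["recs"][k] = v
        for i, b in enumerate(self.table):            # redirect pointers to the
            if b is block:                            # overflowed block
                self.table[i] = halves[(i >> (self.d - bits)) & 1]
        self.insert(key, value)                       # retry in the new layout

    def lookup(self, key):
        return self.table[self._slot(key)]["recs"].get(key)

h = ExtendibleHash()
for k, v in [("A", 1), ("B", 1), ("E", 1), ("G", 2), ("I", 4)]:
    h.insert(k, v)       # reaches the layout of (a): d == 2
h.insert("F", 3)         # overflows the {E, G} block; d grows to 3, as in (b)
```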
Key and Data Compression/Encoding Techniques
A straightforward technique for compression of character strings is to
truncate
strings into fixed-length strings.
The construction of an index with truncated keys is shown below.
Rear-end truncation
is another commonly used technique for key compression.
The keys are first sorted. The truncated key is obtained by using the least
number of characters, starting from the left, that will be required to completely
distinguish the key from the one immediately preceding it. An example is
illustrated in column 2.
In order to be able to use truncated keys in an index, a
bidirectional rear-end truncation technique
should be used. In column 3, a truncated
key is obtained by using the least number of characters, starting from the
left, that will be required to completely distinguish the key from the one
immediately succeeding it. In column 4, the truncated key is
the longer of the truncated keys in columns 2 and 3.
Bidirectional truncated keys have the property that no truncated key is the
initial segment of another truncated key, unless the truncated key is identical
to the original untruncated key, in which case it is marked by a flag (an asterisk in columns 3 and 4).
In making comparisons, such keys are treated as having a blank character as the
rightmost character. (Thus, BELL* is treated as a five-character string, where the
symbol * indicates a blank). The bidirectional truncated keys can be used in a tree
index without causing confusion.
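Both truncations can be computed mechanically; a sketch (the key set is a hypothetical example, and the asterisk flag for keys equal to their truncation is noted in the docstring but not emitted):

```python
def distinct_prefix(key, other):
    """Shortest prefix of key differing from the same-length prefix of other."""
    n = 1
    while key[:n] == other[:n] and n < len(key):
        n += 1
    return key[:n]

def rear_end_truncate(sorted_keys):
    """Column 2: distinguish each key from the key immediately preceding it."""
    return [distinct_prefix(k, sorted_keys[i - 1] if i else "")
            for i, k in enumerate(sorted_keys)]

def bidirectional_truncate(sorted_keys):
    """Column 4: the longer of the predecessor- and successor-distinguishing
    prefixes. A result equal to the full key would carry the * flag."""
    out = []
    for i, key in enumerate(sorted_keys):
        prev = sorted_keys[i - 1] if i > 0 else ""
        nxt = sorted_keys[i + 1] if i + 1 < len(sorted_keys) else ""
        out.append(max(distinct_prefix(key, prev),
                       distinct_prefix(key, nxt), key=len))
    return out

keys = ["BELL", "BOLD", "BOLT"]   # hypothetical sorted key set
```

Note that in the bidirectional result no truncated key is an initial segment of another, which is what makes it safe in a tree index.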
Further compression can be obtained, when the index is constructed as a
parsing tree
in one or more levels. With the parsing tree, the highest level of index contains
a portion of the key. The next level of index contains the following character
or characters, but does not repeat those which were in the higher index. An
example is illustrated below.
It can be seen that this technique combines key compression and tree structure methods.
Sometimes a
bit matrix
can be used to store a variable-length list of values
of the same descriptor. For example, an employee may serve both as an
accountant and as a secretary. If the job category is encoded in code#1
then the above information can be stored as {001, 101}. If the job
category is encoded in code#2, then it is possible to combine the
codes for ACCOUNTANT and SECRETARY into 10001, which is the logical disjunction
or logical OR of the codes 00001 and 10000, as shown below.
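The code#2 scheme can be sketched with bitwise operators (the bit assignments other than ACCOUNTANT and SECRETARY are hypothetical):

```python
# One bit per job category, as in code#2.
CODE2 = {"ACCOUNTANT": 0b00001, "CLERK": 0b00010, "ENGINEER": 0b00100,
         "MANAGER": 0b01000, "SECRETARY": 0b10000}

jobs = CODE2["ACCOUNTANT"] | CODE2["SECRETARY"]   # logical OR: 0b10001
is_secretary = bool(jobs & CODE2["SECRETARY"])    # membership test by AND
```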
Huffman code
is an example of a variable-length code which results
in the smallest average code-word length. The procedure for generating a binary
Huffman code is illustrated below. First, the code-words are sorted into
a descending sequence according to their probabilities of occurrence. Then,
the two smallest probabilities are added together, leading to a higher node in
the coding tree. This node has a probability equal to the sum of its two descendant
nodes. Again, the two smallest probabilities are combined. The process
repeats, until the highest node with probability 1 is reached, and a coding
tree has been constructed. For each branch in this coding tree, a 0 or a 1 is
assigned, as shown below. Finally, each original code-word is encoded
into the binary sequence shown in the coding tree. The average code-word
length in the above example is 2 bits. It can be shown that
the Huffman code is theoretically optimal in the sense of having minimum average code-word
length.
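The construction can be sketched with a priority queue (the probability values are illustrative; with four equally likely code-words every code is 2 bits long, matching the 2-bit average mentioned above):

```python
import heapq
from itertools import count

def huffman(probabilities):
    """Build a binary Huffman code from a {symbol: probability} mapping."""
    tiebreak = count()                        # keeps heap entries comparable
    heap = [(p, next(tiebreak), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)        # the two smallest probabilities
        p2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (a, b)))  # higher node

    codes = {}
    def walk(node, prefix):                   # assign a 0 or 1 to each branch
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

# Four equally likely code-words: average code-word length is 2 bits.
codes = huffman({"w1": 0.25, "w2": 0.25, "w3": 0.25, "w4": 0.25})
```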