We describe the design and implementation of a new data layout scheme, called multi-dimensional clustering, in DB2 Universal Database Version 8. Many applications, e.g., OLAP and data warehousing, process a table or tables in a database using a multi-dimensional access paradigm. Currently, most database systems can only support organization of a table using a primary clustering index. Secondary indexes are created to access the tables when the primary key index is not applicable. Unfortunately, secondary indexes perform many random I/O accesses against the table for a simple operation such as a range query. Our work in multi-dimensional clustering addresses this important deficiency in database systems. Multi-dimensional clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. We describe novel techniques for maintaining this physical layout efficiently and methods of processing database operations that provide significant performance improvements. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.
Published in 2003.
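To make the layout idea in the abstract concrete, the following is a minimal in-memory sketch, not DB2's implementation. It groups records into cells keyed by their dimension values, so a query that restricts the dimensions only has to read the matching cells. The function names `build_cells` and `query`, the sample columns `region`, `year`, and `sales`, and the use of Python lists as a stand-in for on-disk storage are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cells(records, dims):
    """Group records into cells keyed by the tuple of their dimension values.

    Hypothetical helper: each cell stands in for the physical cluster that
    holds records sharing the same values on every clustering dimension.
    """
    cells = defaultdict(list)
    for rec in records:
        key = tuple(rec[d] for d in dims)
        cells[key].append(rec)
    return dict(cells)


def query(cells, dims, predicates):
    """Return records from every cell whose key satisfies all predicates.

    `predicates` maps a dimension name to a function on that dimension's
    value. Cells are accepted or rejected as a whole, so individual
    records in non-matching cells are never inspected.
    """
    position = {d: i for i, d in enumerate(dims)}
    result = []
    for key, recs in cells.items():
        if all(pred(key[position[d]]) for d, pred in predicates.items()):
            result.extend(recs)
    return result


# Illustrative data only; not taken from the paper's experiments.
records = [
    {"region": "east", "year": 2001, "sales": 10},
    {"region": "west", "year": 2001, "sales": 20},
    {"region": "east", "year": 2002, "sales": 30},
    {"region": "east", "year": 2001, "sales": 40},
]
dims = ("region", "year")
cells = build_cells(records, dims)

# Restrict both dimensions: an equality on region and a range on year.
east_2001 = query(
    cells,
    dims,
    {"region": lambda v: v == "east", "year": lambda v: 2001 <= v <= 2001},
)
```

In this sketch the range predicate on `year` is evaluated once per cell rather than once per record, which is the intuition behind avoiding the random per-record accesses that a secondary index incurs.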
