
Algorithms and Architectures for Parallel Processing
Description

Content
PHC: A Rapid Parallel Hierarchical Cubing Algorithm on High Dimensional OLAP. (p. 72-73)
Kongfa Hu¹, Ling Chen¹, and Yixin Chen²
¹ Department of Computer Science and Engineering, Yangzhou University, 225009, China
² Department of Computer Science and Engineering, Washington University, 63130, USA
Abstract. The data cube plays an essential role in OLAP (online analytical processing). Pre-computing data cubes is critical for improving the response time of OLAP systems. However, as the data cube grows, the time needed for this pre-computation becomes a significant performance bottleneck. In high-dimensional OLAP, it may not be practical to build all the cuboids and their indices. In this paper, we propose a parallel hierarchical cubing algorithm based on an extension of the earlier minimal cubing approach. The algorithm has two components: a decomposition of the cube space based on multiple dimension attributes, and an efficient OLAP query engine based on a prefix bitmap encoding of the indices. The method partitions the high-dimensional data cube into low-dimensional cube segments. This approach significantly reduces CPU and I/O overhead for many queries by restricting the number of cube segments that must be processed for both the fact table and the bitmap indices. The proposed data allocation and processing model supports parallel I/O and parallel processing, as well as load balancing for disks and processors. Experimental results show that the proposed parallel hierarchical cubing method is significantly more efficient than other existing cubing methods.
Keywords: data cube, parallel hierarchical cubing algorithm (PHC), high dimensional OLAP.
1 Introduction
Data warehouses integrate massive amounts of data from multiple sources and are primarily used for decision support. They have to process complex analytical queries for different forms of access, such as OLAP (online analytical processing) and data mining. OLAP refers to the technologies that allow users to efficiently retrieve data from the data warehouse for decision support [1]. Much research has aimed to improve OLAP query performance and to provide fast response times for queries on large data warehouses. Efficient indexing [2], materialization [3], and data cubing [4] are common techniques for speeding up OLAP query processing. Many efficient cube computation algorithms have been proposed recently, such as BUC [5], H-cubing [6], Quotient cubing [7], and Star-Cubing [8]. However, in large data warehouse applications such as bioinformatics, the data usually has high dimensionality, with more than 100 dimensions. Since a data cube grows exponentially with the number of dimensions, it is generally too costly in both computation time and storage space to materialize a full high-dimensional data cube. For example, a data cube of 100 dimensions, each with 100 distinct values, may contain as many as 101^100 cells. If we also consider the dimension hierarchies, the number of aggregate cells grows even further. Although the condensed cube [9], dwarf cube [10], and star cube [8] can delay the explosion, they do not solve the fundamental problem [11]. The minimal cubing approach of Li and Han [11] can alleviate this problem, but it does not consider dimension hierarchies and cannot efficiently handle OLAP queries. In this paper, we develop a feasible parallel hierarchical cubing algorithm (PHC) that supports dimension hierarchies for high-dimensional data cubes and answers OLAP queries efficiently. The algorithm decomposes a multi-dimensional hierarchical data cube into smaller cube segments.
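The cell count quoted above follows from each dimension taking either one of its distinct values or the aggregate value "*", which gives (cardinality + 1) choices per dimension. The short Python sketch below reproduces that arithmetic and compares it with an upper bound for materializing only low-dimensional segments. The function names and the segment size of 3 are illustrative assumptions, not values taken from the paper, and the counts ignore sparsity and dimension hierarchies.

```python
def full_cube_cells(num_dims, cardinality):
    """Upper bound on cells in a full cube: each dimension is a value or '*'."""
    return (cardinality + 1) ** num_dims


def segmented_cube_cells(num_dims, cardinality, segment_size):
    """Upper bound when dimensions are split into consecutive segments
    and each segment is materialized as its own small cube.
    segment_size is a hypothetical tuning parameter."""
    whole, rest = divmod(num_dims, segment_size)
    sizes = [segment_size] * whole + ([rest] if rest else [])
    return sum((cardinality + 1) ** size for size in sizes)


full = full_cube_cells(100, 100)              # the 101^100 figure from the text
segmented = segmented_cube_cells(100, 100, 3)  # 33 segments of 3 dims, 1 of 1 dim

print("digits in full cube cell count:", len(str(full)))
print("cells across all segments:", segmented)
```

The bound for the segmented layout grows linearly in the number of dimensions rather than exponentially. The trade-off is that aggregates spanning several segments are no longer stored and have to be assembled at query time.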
The proposed data allocation and processing model supports parallel I/O and parallel processing, as well as load balancing across disks and processors. The resulting cubing algorithm is an efficient and scalable parallel algorithm for cube computation.
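To make the idea of cube segments with bitmap indices more concrete, here is a minimal Python sketch. It assumes a plain value-to-bitmap index per segment, with Python integers used as bitmaps. It does not reproduce the paper's prefix bitmap encoding, its handling of dimension hierarchies, or its data allocation scheme, and all names (`partition_dimensions`, `build_segment_index`, `point_query`) are hypothetical.

```python
def partition_dimensions(dims, segment_size):
    """Split the dimension list into consecutive low-dimensional segments."""
    return [tuple(dims[i:i + segment_size])
            for i in range(0, len(dims), segment_size)]


def build_segment_index(rows, segment):
    """Map (dimension, value) -> bitmap of matching row ids for one segment.
    A bitmap is a Python int; bit i is set when row i has that value."""
    index = {}
    for row_id, row in enumerate(rows):
        for dim in segment:
            key = (dim, row[dim])
            index[key] = index.get(key, 0) | (1 << row_id)
    return index


def point_query(indices, segments, conditions, n_rows):
    """Return ids of rows matching every dim == value condition.
    Only segments that contain a constrained dimension are consulted."""
    result = (1 << n_rows) - 1  # start with all rows selected
    for segment, index in zip(segments, indices):
        for dim in segment:
            if dim in conditions:
                result &= index.get((dim, conditions[dim]), 0)
    return [i for i in range(n_rows) if (result >> i) & 1]


# Toy fact table with four dimensions (illustrative data only).
rows = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2"},
    {"A": "a2", "B": "b1", "C": "c2", "D": "d1"},
    {"A": "a1", "B": "b1", "C": "c2", "D": "d1"},
]
segments = partition_dimensions(["A", "B", "C", "D"], 2)
indices = [build_segment_index(rows, seg) for seg in segments]
matches = point_query(indices, segments, {"A": "a1", "D": "d1"}, len(rows))
print(matches)
```

Because each segment index is built from the fact table independently of the others, the list comprehension that builds `indices` is the natural place to distribute work across processors or disks, which is the kind of parallelism the text describes.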
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (only limited: Kindle).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.