Worst Case Efficient Data Structures

Gerth Stølting Brodal

January 1997


We study the design of efficient data structures. In particular we focus on the design of data structures where each operation has a worst case efficient implementations. The concrete problems we consider are partial persistence, implementation of priority queues, and implementation of dictionaries.

The first problem we consider is how to make bounded in-degree and out-degree data structures partially persistent, i.e., how to remember old versions of a data structure for later access. A node copying technique of Driscoll et al. supports update steps in amortized constant time and access steps in worst case constant time. The worst case time for an update step can be linear in the size of the structure. We show how to extend the technique of Driscoll et al. such that update steps can be performed in worst case constant time on the pointer machine model.

We present two new comparison based priority queue implementations, with the following properties. The first implementation supports the operations FINDMIN, INSERT and MELD in worst case constant time and DELETE and DELETEMIN in worst case time tex2html_wrap_inline60 . The priority queues can be implemented on the pointer machine and require linear space. The second implementation achieves the same worst case performance, but furthermore supports DECREASEKEY in worst case constant time. The space requirement is again linear, but the implementation requires auxiliary arrays of size tex2html_wrap_inline60 . Our bounds match the best known amortized bounds (achieved by respectively binomial queues and Fibonacci heaps). The data structures presented are the first achieving these worst case bounds, in particular supporting MELD in worst case constant time. We show that these time bounds are optimal for all implementations supporting MELD in worst case time o(n). We also present a tradeoff between the update time and the query time of comparison based priority queue implementations. Finally we show that any randomized implementation with expected amortized cost t comparisons per INSERT and DELETE operation has expected cost at least tex2html_wrap_inline68 comparisons for FINDMIN.

Next we consider how to implement priority queues on parallel (comparison based) models. We present time and work optimal priority queues for the CREW PRAM, supporting FINDMIN, INSERT, MELD, DELETEMIN, DELETE and DECREASEKEY in constant time with tex2html_wrap_inline60 processors. Our implementation is the first supporting all of the listed operations in constant time. To be able to speed up Dijkstra's algorithm for the single-source shortest path problem we present a different parallel priority data structure. With this specialized data structure we give a parallel implementation of Dijkstra's algorithm which runs in O(n) time and performs tex2html_wrap_inline74 work on a CREW PRAM. This represents a logarithmic factor improvement for the running time compared with previous approaches.

We also consider priority queues on a RAM model which is stronger than the comparison model. The specific problem is the maintenance of a set of n integers in the range tex2html_wrap_inline78 under the operations INSERT, DELETE, FINDMIN, FINDMAX and PRED (predecessor query) on a unit cost RAM with word size w bits. The RAM operations used are addition, left and right bit shifts, and bit-wise boolean operations. For any function f(n) satisfying tex2html_wrap_inline84 , we present a data structure supporting FINDMIN and FINDMAX in worst case constant time, INSERT and DELETE in worst case O(f(n)) time, and PRED in worst case tex2html_wrap_inline88 time. This represents the first priority queue implementation for a RAM which supports INSERT, DELETE and FINDMIN in worst case time tex2html_wrap_inline90 -- previous bounds were only amortized. The data structure is also the first dictionary implementation for a RAM which supports PRED in worst case tex2html_wrap_inline92 time while having worst case tex2html_wrap_inline90 update time. Previous sublogarithmic dictionary implementations do not provide for updates that are significantly faster than queries. The best solutions known support both updates and queries in worst case time tex2html_wrap_inline96 .

The last problem consider is the following dictionary problem over binary strings. Given a set of n binary strings of length m each, we want to answer d-queries, i.e., given a binary query string of length m to report if there exists a string in the set within Hamming distance d of the query string. We present a data structure of size O(nm) supporting 1-queries in time O(m) and the reporting of all strings within Hamming distance 1 of the query string in time O(m). The data structure can be constructed in time O(nm). The implementation presented is the first achieving these optimal time bounds for the preprocessing of the dictionary and for 1-queries. The data structure can be extended to support the insertion of new strings in amortized time O(m)

Available as PostScript, PDF.

[BRICS symbol] BRICS WWW home page