SODA 2011
Ordered and Unordered Top-K Range Reporting in Large Data Sets
Abstract
We study the following problem: Given an array A storing N real numbers, preprocess it to allow fast reporting of the K smallest elements in the subarray A [ i, j ] in sorted order, for any triple ( i, j, K ) with 1 ≤ i ≤ j ≤ N and 1 ≤ K ≤ j − i + 1. We are interested in scenarios where the array A is large, necessitating an I/O-efficient solution. For a parameter f with 1 ≤ f ≤ log m n, we construct a data structure that uses O(( N/f ) log m n ) space and achieves a query bound of O(log B N + fK/B ) I/Os, 1 where B is the block size, M is the size of the main memory, n: = N/B, and m: = M/B. Our main contribution is to show that this solution is nearly optimal. To be precise, we show that achieving a query bound of O(log α n + fK/B ) I/Os, for any constant α, requires space, assuming B = Ω(log N ). For M ≥ B 1+ε, this is within a log log m n factor of the upper bound. The lower bound assumes indivisibility of records and holds even if we assume K is always set to j − 1 + 1. We also show that it is the requirement that the K smallest elements be reported in sorted order which makes the problem hard. If the K smallest elements in the query range can be reported in any order, then we can obtain a linear-size data structure with a query bound of O(log B N + K/B ) I/Os.
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue
- ACM-SIAM Symposium on Discrete Algorithms
- Archive span
- 1990-2025
- Indexed papers
- 4674
- Paper id
- 79702412958515124