FOCS 1999

Cache-Oblivious Algorithms

Conference Paper Session 6B Algorithms and Complexity · Theoretical Computer Science

Abstract

This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size $Z$ and cache-line length $L$, where $Z = \Omega(L^2)$, the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/L)$. The number of cache misses for either an $n$-point FFT or the sorting of $n$ numbers is $\Theta(1 + (n/L)(1 + \log_Z n))$. We also give a $\Theta(mnp)$-work algorithm to multiply an $m \times n$ matrix by an $n \times p$ matrix that incurs $\Theta(1 + (mn + np + mp)/L + mnp/(L\sqrt{Z}))$ cache faults. We introduce an "ideal-cache" model to analyze our algorithms. We prove that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels, and that the assumption of optimal replacement in the ideal-cache model can be simulated efficiently by LRU replacement. We also provide preliminary empirical results on the effectiveness of cache-oblivious algorithms in practice.
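To make the idea of a cache-oblivious algorithm concrete, here is a minimal sketch of a divide-and-conquer matrix transpose in the spirit of the abstract. It is an illustration, not the paper's code: the names `transpose`, `transposed`, and the `base` cutoff are assumptions introduced here. The recursion halves the larger dimension until the block is small, and nothing in it depends on the cache size Z or the line length L. The `base` parameter only limits recursion overhead, and `base=1` gives the untuned recursion. Python lists of lists are not stored contiguously, so this shows the recursive structure only and will not reproduce the stated miss bounds.

```python
def transpose(a, b, r0, r1, c0, c1, base=16):
    """Write the transpose of the block a[r0:r1][c0:c1] into b (b[j][i] = a[i][j]).

    `base` (assumed >= 1) is a fixed recursion cutoff, not a cache parameter.
    """
    rows, cols = r1 - r0, c1 - c0
    if rows <= base and cols <= base:
        # Small block: copy element by element.
        for i in range(r0, r1):
            for j in range(c0, c1):
                b[j][i] = a[i][j]
    elif rows >= cols:
        # Split the longer dimension (rows) in half and recurse on each half.
        mid = r0 + rows // 2
        transpose(a, b, r0, mid, c0, c1, base)
        transpose(a, b, mid, r1, c0, c1, base)
    else:
        # Split the longer dimension (columns) in half and recurse on each half.
        mid = c0 + cols // 2
        transpose(a, b, r0, r1, c0, mid, base)
        transpose(a, b, r0, r1, mid, c1, base)


def transposed(a, base=16):
    """Return a new n-by-m list of lists holding the transpose of the m-by-n input."""
    m, n = len(a), len(a[0])
    b = [[None] * m for _ in range(n)]
    transpose(a, b, 0, m, 0, n, base)
    return b
```

For example, `transposed([[1, 2, 3], [4, 5, 6]])` returns `[[1, 4], [2, 5], [3, 6]]`. Splitting the longer dimension keeps the sub-blocks close to square, so at some recursion depth a block and its transposed image fit in cache together regardless of the actual cache size.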

Authors

Keywords

  • Sorting
  • Algorithm design and analysis
  • Strontium
  • Laboratories
  • Hardware
  • Central Processing Unit
  • Banking
  • Optimization Algorithm
  • Fast Fourier Transform
  • Real Matrices
  • Transpose Of Matrix
  • Caching
  • Practical Algorithm
  • P Matrix
  • Cache Size
  • Cache Misses
  • Lower Bound
  • Hierarchical Regression
  • Iterative Algorithm
  • Tuning Parameter
  • Matrix Multiplication
  • Base Case
  • Square Matrix
  • Discrete Fourier Transform
  • Two-level Model
  • Temporal Localization
  • Recursive Algorithm
  • Power-of-two
  • L2 Cache
  • Sorting Algorithm
  • Divide-and-conquer Approach
  • Number Of Misses
  • Memory Hierarchy
  • Local Memory
  • Induction Hypothesis
  • Hash Function

Context

Venue
IEEE Symposium on Foundations of Computer Science
Archive span
1975-2025
Indexed papers
3809
Paper id
1006007533876289294