Arrow Research search
Back to FOCS

FOCS 2018

Deterministic Document Exchange Protocols, and Almost Optimal Binary Codes for Edit Errors

Conference Paper Accepted Paper Algorithms and Complexity · Theoretical Computer Science

Abstract

We study two basic problems regarding edit errors (insertions and deletions). The first one is document exchange, where two parties Alice and Bob hold two strings x and y with a bounded edit distance k. The goal is to have Alice send a short sketch to Bob, so that Bob can recover x based on y and the sketch. The second one is the fundamental problem of designing error correcting codes for edit errors, where the goal is to construct an explicit code to transmit a message x through a channel that can add at most k worst case insertions and deletions, so that the original message x can be successfully recovered at the other end of the channel. Both problems have been extensively studied for decades, and in this paper we focus on deterministic document exchange protocols and binary codes for insertions and deletions (insdel codes). If the length of x is n, then it is known that for small k (e. g. , k ≤ n/4), in both problems the optimal sketch size or the optimal number of redundant bits is Θ(k log n/k). In particular, this implies the existence of binary codes that can correct ε fraction of insertions and deletions with rate 1-Θ(ε log (1/ε). However, known constructions are far from achieving these bounds. In this paper we significantly improve previous results on both problems. For document exchange, we give an efficient deterministic protocol with sketch size O(k log 2 n/k). This significantly improves the previous best known deterministic protocol, which has sketch size O(k 2 + k log 2 n) [2]. For binary insdel codes, we obtain the following results: 1) An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log 2 n/k). In particular this implies an explicit family of binary insdel codes that can correct ε fraction of insertions and deletions with rate 1-O(ε log 2 1/(1-ε))=1-Õ(ε). This significantly improves the previous best known result which only achieves rate 1-Õ(√ε) [11], [10], and is optimal up to a log (1/ε) factor. 1) An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log n). This significantly improves the previous best known result of [4], which only works for constant k and has redundancy O(k 2 log k log n); and that of [2], which has redundancy O(k 2 + k log 2 n). Our code has optimal redundancy for k ≤ n 1-α, any constant 0 <; α <; 1. This is the first explicit construction of binary insdel codes that has optimal redundancy for a wide range of error parameters k, and this brings our understanding of binary insdel codes much closer to that of standard binary error correcting codes. In obtaining our results we introduce several new techniques. Most notably, we introduce the notion of ε-self matching hash functions and ε-synchronization hash functions. We believe our techniques can have further applications in the literature.

Authors

Keywords

  • Protocols
  • Error correction codes
  • Redundancy
  • Standards
  • Binary codes
  • Synchronization
  • Binary Code
  • Editing Errors
  • Deterministic Protocol
  • Hash Function
  • Wide Range Of Parameters
  • Edit Distance
  • Explicit Construction
  • Explicit Code
  • High Probability
  • Running Time
  • Total Size
  • Set Of Functions
  • Block Size
  • Small Interval
  • Block Length
  • Maximum Norm
  • String Length
  • Study Code
  • Split Point
  • Deletion Errors
  • Reed-Solomon Codes
  • Random Bits
  • Insertion Errors
  • Alphabet Size
  • Fractional Error
  • Output Bits
  • Codeword Length
  • Sphere Packing
  • Collision Happens
  • Edit errors, Document exchange, Error correcting codes

Context

Venue
IEEE Symposium on Foundations of Computer Science
Archive span
1975-2025
Indexed papers
3809
Paper id
57724954184954126