Highlights 2021
Synchronization Schemas for Modeling Data Streams
Abstract
We present a type-theoretic framework for data stream processing for real-time decision making, where the desired computation involves a mix of sequential computation, such as smoothing and detection of peaks and surges, and naturally parallel computation, such as relational operations, key-based partitioning, and map-reduce. To unify sequential (ordered) and relational (unordered) data models, we define synchronization schemas as types and series-parallel streams (SPS) as objects of these types. A synchronization schema imposes a hierarchical structure over relational types that succinctly captures ordering and synchronization requirements among different kinds of data items. Series-parallel streams naturally model objects such as relations, sequences, sequences of relations, sets of streams indexed by key values, time-based and event-based windows, and more complex structures obtained by nesting of these. We introduce series-parallel stream transformers (SPST) as a domain-specific language for modular specification of deterministic transformations over such streams. SPSTs provably specify only monotonic transformations allowing streamability, have a modular structure that can be exploited for correct parallel implementation, and are composable allowing specification of complex queries as a pipeline of transformations. We conclude by discussing research directions for both theory of automata, logics, and transducers for distributed stream processing.
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue
- Highlights of Logic, Games and Automata
- Archive span
- 2013-2025
- Indexed papers
- 1236
- Paper id
- 895592125328325854