Database Seminar: Helena Caminal (Google) – “Vector architectures for data-parallel workloads”

Speaker: Helena Caminal
Location: Soda 380
Date: October 11, 2023
Time: 11 AM – 12 PM PST

Title: “Vector architectures for data-parallel workloads”


Abstract: In this talk, I will first discuss the potential of next-generation vector extensions for data-parallel workloads. Rapid advances in sensors, compute power at the edge, and online activity are dramatically increasing the volume and complexity of the data that must be managed and processed. Data-intensive workloads, such as database analytics and deep learning, exhibit abundant data-level parallelism. Today, most database analytics systems run on general-purpose hardware and exploit that parallelism via instruction-, task-, and program-level parallelism. Columnar-layout databases enable further performance gains through single-instruction multiple-data (SIMD) instructions. While SIMD extensions are ubiquitous in today's CPUs, they offer only moderate speedups and limited functionality. Next-generation vector extensions (e.g., the RISC-V vector extension and Arm SVE2) provide a more flexible abstraction for mapping today's data-intensive workloads, thanks to their vector-length agnosticism and predicated mode of execution.

Second, I will give an overview of the associative processing (AP) computational paradigm, a concept from the 1970s for building parallel processors out of memories. Recent studies show that AP can be mapped onto next-generation vector ISAs to accelerate database analytics, yielding manifold performance gains over state-of-the-art out-of-order processors with SIMD support. Database analytics, even when expressed using SIMD abstractions, suffers from memory bottlenecks. Processing-in-memory architectures attempt to
alleviate this bottleneck by enabling memories to process data in situ. The AP paradigm leverages content-addressable memories to realize bit-serial arithmetic and logic operations via sequences of search and update memory operations applied to many memory rows in parallel. Associative processors thus combine data-level parallelism, content-addressability, and processing-using-memory features, which can be leveraged across a multitude of domains in the context of data-intensive problems.

Lastly, I will touch on how sparsity challenges the mapping of data-parallel workloads onto parallel processor architectures. Analytical queries over columnar formats can be expressed as SIMDized programs using machine-specific SIMD intrinsics. Sparsity commonly arises as operators are applied to the data columns and fewer rows remain selected for the subsequent operators in the query. In SIMD, and vector abstractions in general, this translates into either using slow vector gather and scatter memory instructions or underutilizing the hardware. How should we handle sparsity efficiently, both in the hardware and in the database systems' abstractions?
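
As a concrete illustration of that trade-off, the sketch below contrasts the two options in plain Python for a simple filter-then-aggregate query; the column data, names, and threshold are invented for the example, and the lists stand in for vector registers:

```python
import random

# Hypothetical two-operator query fragment over a columnar table:
#   SELECT SUM(b) FROM t WHERE a < threshold
# The filter on column `a` produces a sparse selection that the SUM over
# column `b` must then cope with.

random.seed(0)
n = 1024
a = [random.randrange(100) for _ in range(n)]  # filter column
b = [random.randrange(100) for _ in range(n)]  # aggregated column
threshold = 10  # low selectivity: only ~10% of rows survive the filter

# Strategy 1: selection vector + gather.
# A dense index list keeps later operators fully utilized, but on real
# vector hardware reading b[i] at scattered indices maps to slow
# gather memory instructions.
sel = [i for i in range(n) if a[i] < threshold]
total_gather = sum(b[i] for i in sel)

# Strategy 2: predication over the full vector width.
# Every lane executes the aggregate, but masked-out lanes contribute
# nothing, so the hardware is underutilized when selectivity is low.
mask = [a[i] < threshold for i in range(n)]
total_pred = sum(bi if m else 0 for bi, m in zip(b, mask))

assert total_gather == total_pred  # both strategies compute the same sum
selectivity = len(sel) / n
print(total_pred, round(selectivity, 3))
```

Real columnar engines face the same choice between selection vectors (gather-friendly) and bitmaps (predication-friendly); which one wins depends on selectivity and on the cost of gather/scatter on the target ISA.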