Beyond the embarrassingly parallel – New languages, compilers, and runtimes for big-data processing
Large-scale data processing requires large-scale parallelism. Data-processing systems from traditional databases to Hadoop and Spark rely on embarrassingly parallel relational primitives (e.g., map, reduce, filter, and join) to extract parallelism from input programs. But many important applications, such as machine learning and log processing, iterate over large data sets with true loop-carried dependences across iterations. As such, these applications are not readily parallelizable in current data-processing systems.
In this talk, I will challenge the premise that parallelism requires independent computations. In particular, I will describe a general methodology for extracting parallelism from dependent computations. The basic idea is to replace dependences with symbolic unknowns and execute the dependent computations symbolically in parallel. The challenge of parallelization then becomes the (hopefully mechanizable) task of performing the resulting symbolic execution efficiently. This methodology opens up the possibility of designing new languages for data-processing computations, compilers that automatically parallelize such computations, and runtimes that exploit the additional parallelism. I will describe our initial successes with this approach and the research challenges that lie ahead.
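As a rough illustration of the idea in the abstract (not the speaker's actual system), the sketch below parallelizes a loop whose state carries from one iteration to the next. It assumes a deliberately simple recurrence, x = a * x + b, so that the effect of any chunk on an unknown incoming state can be summarized symbolically as an affine function A * x_in + B. The function names, the chunking scheme, and the use of a thread pool are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(updates, x0):
    """Reference loop with a true loop-carried dependence on x."""
    x = x0
    for a, b in updates:
        x = a * x + b
    return x


def summarize(chunk):
    """Run one chunk symbolically, treating the incoming state as unknown.

    The state is tracked as A * x_in + B, where x_in is the symbolic
    unknown. Applying x -> a * x + b to that form gives
    (a * A) * x_in + (a * B + b).
    """
    A, B = 1, 0  # identity: the state starts out as x_in itself
    for a, b in chunk:
        A, B = a * A, a * B + b
    return A, B


def parallel(updates, x0, num_chunks=4):
    """Summarize chunks independently, then resolve the unknowns in order."""
    size = max(1, -(-len(updates) // num_chunks))  # ceiling division
    chunks = [updates[i:i + size] for i in range(0, len(updates), size)]
    # Threads are used only to show that the chunks do not depend on each
    # other; a real system would distribute this work across machines.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(summarize, chunks))
    x = x0
    for A, B in summaries:  # cheap sequential pass: one step per chunk
        x = A * x + B
    return x


updates = [(2, 1), (3, -4), (1, 5), (-2, 7), (4, 0)]
print(sequential(updates, 3), parallel(updates, 3, num_chunks=2))
```

The remaining sequential work is proportional to the number of chunks rather than the number of iterations. The sketch works only because affine updates compose into a compact symbolic form; keeping such summaries small for richer computations is the kind of efficiency problem the abstract refers to.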
Madan Musuvathi is a Principal Researcher at Microsoft Research working at the intersection of programming languages and systems, with a specific focus on concurrency and parallelism. His interests span program analysis, systems, model checking, verification, and theorem proving. His research has led to several tools that improve the lives of software developers both at Microsoft and at other companies. He received his Ph.D. from Stanford University in 2004.
Chair: Mary Lou Soffa (University of Virginia)
#4: Tongping Liu and Xu Liu. Cheetah: Detecting False Sharing Efficiently and Effectively
#27: Dehao Chen, Xinliang David Li and Tipp Moseley. AutoFDO: Automatic Feedback-directed Optimization for Warehouse-scale Applications
#32: Ivan Jibaja, Ting Cao, Steve Blackburn and Kathryn McKinley. Portable Performance on Asymmetric Multicore Processors
Chair: Dorit Nuzman (Intel)
#53: Probir Roy and Xu Liu. MemTool: A Lightweight Profiler to Guide Structure Splitting
#29: Linchuan Chen, Peng Jiang and Gagan Agrawal. Exploiting Recent SIMD Architectural Advances for Irregular Applications
#59: Hao Zhou and Jingling Xue. Exploiting Mixed SIMD Parallelism by Reducing Data Reorganization Overhead
Chair: Vijay Janapa Reddi (University of Texas)
#52: Raj Barik, Naila Farooqui, Brian Lewis, Chunling Hu and Tatiana Shpeisman. A Black-box Approach to Energy-Aware Scheduling on Integrated CPU-GPU Systems
#5: Christos Margiolas and Michael F.P. O’Boyle. Portable and Transparent Software Managed Scheduling on Accelerators for Fair Resource Sharing
#62: Dong Nguyen and Jongeun Lee. Communication-Aware Mapping of Stream Graphs for Multi-GPU Platforms
#8: Jingyue Wu, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng and Robert Hundt. gpucc: An Open-Source GPGPU Compiler