Programme

Mar
13
Sun
2016
Building Dynamic Tools with DynamoRIO on x86 and ARM (DynamoRIO) @ Tibidabo
Mar 13 @ 9:00 am – 12:30 pm

This tutorial will present the DynamoRIO tool platform and describe how to use its API to build custom tools that utilize dynamic code manipulation for instrumentation, profiling, analysis, optimization, introspection, security, and more. The DynamoRIO tool platform was first released to the public in June 2002 and has since been used by many researchers to develop systems ranging from taint tracking to prefetch optimization.  DynamoRIO is publicly available in open source form and operates on Linux and Windows on IA-32, AMD64, and ARM platforms.

An Open-Source GPGPU Compiler (GPUCC) @ BNC A
Mar 13 @ 2:00 pm – 5:30 pm

This tutorial will present gpucc, an open-source compiler built by Google targeting CUDA and NVIDIA GPUs. gpucc performs various general and CUDA-specific optimizations to generate high performance code. It outperforms NVIDIA’s toolchain (nvcc) on internal large-scale end-to-end benchmarks by up to 51%, and is on par for several open-source benchmarks (Rodinia, SHOC and Tensor). It supports modern language features such as those in C++11 and C++14, and compiles code 8% faster than nvcc, up to 2.4x faster for pathological compiles.

This tutorial will cover the following topics:

  • Using gpucc
    • gpucc system overview: a brief description of how gpucc works under the hood
    • Detailed performance results of gpucc vs nvcc
    • Compiling CUDA programs with gpucc: a demo on how to install gpucc and compile some sample CUDA programs
  • Contributing to gpucc
    • Performance debugging: how to debug the performance of generated binary by using nvprof and observing device code
    • Writing new optimizations for gpucc
Mar
14
Mon
2016
Session 1: Profiling Feedback (Mary Lou Soffa)
Mar 14 @ 10:00 am – 11:15 am

Chair: Mary Lou Soffa (University of Virginia)

#4: Tongping Liu and Xu Liu. Cheetah: Detecting False Sharing Efficiently and Effectively

#27: Dehao Chen, Xinliang David Li and Tipp Moseley. AutoFDO: Automatic Feedback-directed Optimization for Warehouse-scale Applications

#32: Ivan Jibaja, Ting Cao, Steve Blackburn and Kathryn McKinley. Portable Performance on Asymmetric Multicore Processors

Session 2: Data Layout and Vectorization (Dorit Nuzman)
Mar 14 @ 11:35 am – 12:50 pm

Chair: Dorit Nuzman (Intel)

#53: Probir Roy and Xu Liu. MemTool: A Lightweight Profiler to Guide Structure Splitting

#29: Linchuan Chen, Peng Jiang and Gagan Agrawal. Expoliting Recent SIMD Architectural Advances for Irregular Applications

#59: Hao Zhou and Jingling Xue. Exploiting Mixed SIMD Parallelism by Reducing Data Reorganization Overhead

Session 3: GPU (Vijay Janapa Reddi)
Mar 14 @ 2:20 pm – 4:00 pm

Chair: Vijay Janapa Reddi (University of Texas)

#52: Raj Barik, Naila Farooqui, Brian Lewis, Chunling Hu and Tatiana Shpeisman. A Black-box Approach to Energy-Aware Scheduling on Integrated CPU-GPU Systems

#5: Christos Margiolas and Michael F.P. O’Boyle. Portable and Transparent Software Managed Scheduling on Accelerators for Fair Resource Sharing

#62: Dong Nguyen and Jongeun Lee. Communication-Aware Mapping of Stream Graphs for Multi-GPU Platforms

#8: Jingyue Wu, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng and Robert Hundt. gpucc: An Open-Source GPGPU Compiler

Session 4: ACM Student Research Competition Presentations
Mar 14 @ 4:20 pm – 6:00 pm
Mar
15
Tue
2016
Session 5: Affine Programs (Louis-Noël Pouchet)
Mar 15 @ 10:00 am – 11:15 am

Chair: Louis-Noël Pouchet (Ohio State University)

#91: Daniele G. Spampinato and Markus Püschel. A Basic Linear Algebra Compiler for Structured Matrices

#38: Lénaïc Bagnères, Oleksandr Zinenko, Stéphane Huot and Cédric Bastoul. Opening Polyhedral Compiler’s Black Box

#64: Gabriel Rodríguez, José M. Andión, Mahmut Kandemir and Juan Tourino. Trace-based Affine Reconstruction of Codes

Session 6: Static Analysis (Michael O’Boyle)
Mar 15 @ 11:35 am – 12:50 pm

Chair: Michael O’Boyle (University of Edinburgh)

#42: Mateus Tymburiba, Rubens Emílio and Fernando Pereira. Inference of Peak Density of Indirect Branches to Detect ROP Attacks

#25: Yulei Sui, Peng Di and Jingling Xue. Sparse Flow-Sensitive Pointer Analysis for Multithreaded C Programs

#43: Vitor Paisante, Maroua Maalej, Leonardo Barbosa, Laure Gonnord and Fernando Pereira. Symbolic Range Analysis of Pointers

Session 7: Programming Models (Mauricio Breternitz)
Mar 15 @ 2:20 pm – 3:35 pm

Chair: Mauricio Breternitz (AMD)

#74: Vassilis Vassiliadis, Jan Riehme, Jens Deussen, Konstantinos Parasyris, Christos D. Antonopoulos, Nikolaos Bellas, Spyros Lalis and Uwe Naumann. Towards Automatic Significance Analysis for Approximate Computing

#17: Kevin Brown, Hyoukjoong Lee, Tiark Rompf, Arvind Sujeeth, Christopher De Sa, Christopher Aberger and Kunle Olukotun. Have Abstraction and Eat Performance Too: Optimized Heterogeneous Computing with Parallel Patterns

#28: Melanie Kambadur and Martha Kim. NRG-Loops: Adjusting Power from Within Applications

Mar
16
Wed
2016
Session 8: Correctness (Aaron Smith)
Mar 16 @ 10:00 am – 11:15 am

Chair: Aaron Smith (Microsoft)

#45: Soham Chakraborty and Viktor Vafeiadis. Validating Optimizations of Concurrent C/C++ Programs

#85: Ignacio Laguna, Martin Schulz, David F. Richards, Jon Calhoun and Luke Olson. IPAS: Intelligent Protection Against Silent Output Corruption in Scientific Applications

#99: Adarsh Yoga and Santosh Nagarakatte. Atomicity Violation Checker for Task Parallel Programs

Session 9: Binary/Virtualization (Soo-mook Moon)
Mar 16 @ 11:35 am – 12:50 pm

Chair: Soo-mook Moon (Seoul National University)

#95: Daniele Cono D’Elia and Camil Demetrescu. Flexible On-Stack Replacement in LLVM

#96: Byron Hawkins, Brian Demsky and Michael Taylor. BlackBox: Lightweight Security Monitoring for COTS Binaries

#69: Toshihiko Koju, Reid Copeland, Motohiro Kawahito and Moriyoshi Ohara. Re-constructing High-Level Information for Language-Specific Binary Re-optimization