Concurrency III

Thu 10:30-12:00 pm - Salon C (GB)
session chair: Xipeng Shen
session chairXipeng Shen, The College of William and Mary, United States
Software Data-Triggered Threads
Hung-Wei Tseng, University of California, San Diego, United States
Dean Tullsen, University of California, San Diego, United States

The data-triggered threads (DTT) programming and execution model can increase parallelism and eliminate redundant computation. However, the initial proposal requires significant architecture support, which impedes existing applications and architectures from taking advantage of this model.

This work proposes a pure software solution that supports the DTT model without any hardware support. This research uses a prototype compiler and runtime libraries running on top of existing machines. Several enhancements to the initial software implementation are presented, which further improve the performance.

The software runtime system improves the performance of serial C SPEC benchmarks by 15% on a Nehalem processor, but by over 7X over the full suite of single-thread applications. It is shown that the DTT model can work in conjunction with traditional parallelism. The DTT model provides up to 64X speedup over parallel applications exploiting traditional parallelism.

Efficiently Combining Parallel Software using Fine-grained, Language-level, Hierarchical Resource Management Policies
Zachary Anderson, ETH Zurich, Switzerland

This paper presents Poli-C, a language extension, runtime library, and system daemon enabling fine-grained, language-level, hierarchical resource management policies. Poli-C is suitable for use in applications that compose parallel libraries, frameworks, and programs. In particular, we have added a powerful new statement to C for expressing resource limits and guarantees in such a way that programmers can set resource management policies even when the source code of parallel libraries and frameworks is not available. Poli-C enables application programmers to manage any resource exposed by the underlying OS, for example cores or IO bandwidth. Additionally, we have developed a domain-specific language for defining high-level resource management policies, and a facility for extending the kinds of resources that can be managed with our language extension. Finally, through a number of useful variations, our design offers a high degree of composability. We evaluate Poli-C by way of three case-studies: a scientific application, an image processing webserver, and a pair of parallel database join implementations. We found that using Poli-C yields efficiency gains that require the addition of only a few lines of code to applications.

Execution Privatization for Scheduler-Oblivious Concurrent Programs
Jeff Huang, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong
Charles Zhang, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong

Making multithreaded execution less non-deterministic is a promising solution to address the difficulty of concurrent programming plagued by the non-deterministic thread scheduling. In fact, a vast category of concurrent programs are scheduler-oblivious: their execution is deterministic, regardless of the scheduling behavior. We present and formally prove a fundamental observation of the privatizability property for scheduler-oblivious programs, that paves the theoretical foundation for privatizing shared data accesses on a path segment. With privatization, the non-deterministic thread interleavings on the privatized accesses are isolated and as the consequence many concurrency problems are alleviated. We further present a path and context sensitive privatization algorithm that safely privatizes the program without introducing any additional program behavior. Our evaluation results show that the privatization opportunity pervasively exists in real world large complex concurrent systems. Through privatization, several real concurrency bugs are fixed and notable performance improvements are also achieved on benchmarks.

Integrating Task Parallelism with Actors
Shams Imam, Rice University, United States
Vivek Sarkar, Rice University, United States

This paper introduces a unified concurrent programming model combining the previously developed Actor Model (AM) and task-parallel Async-Finish Model (AFM). With the advent of multi-core computers, there is a renewed interest in programming models that can support a wide range of parallel programming patterns. The proposed unified model shows how the divide-and-conquer approach of the AFM and the no-shared mutable state and event-driven philosophy of the AM can be combined to solve certain classes of problems more efficiently and productively than either of the aforementioned models individually. The unified model adds actor creation and coordination to the AFM, while also enabling parallelization within actors. This paper explains two implementations of the unified model as extensions of Habanero-Java and Habanero-Scala. The unified model adds to the foundations of parallel programs, and to the tools available for the programmer to aid in productivity and performance while developing parallel software.

 

News & Announcements

SPLASH 2013 will be in Indianapolis, Indiana. We hope to see you there!

Old News