OOPSLA 2A: Tools for Reliability and Testing

Tue 2:00-3:30 pm - Pavilion East
Integrated Language Definition Testing: Enabling Test-Driven Language Development
Lennart C. L. Kats, Delft University of Technology, Netherlands
Rob Vermaas, Delft University of Technology, Netherlands
Eelco Visser, Delft University of Technology, Netherlands

The reliability of compilers, interpreters, and development environments for programming languages is essential for effective software development and maintenance. Yet these tools are often tested only as an afterthought. Languages with a smaller scope, such as domain-specific languages, often remain untested. General-purpose testing techniques and test case generation methods fall short in providing a low-threshold solution for test-driven language development. In this paper we introduce the notion of a language-parametric testing language (LPTL) that provides a reusable, generic basis for declaratively specifying language definition tests. We integrate the syntax, semantics, and editor services of a language under test into the LPTL for writing test inputs. This paper describes the design of an LPTL and the tool support provided for it, shows use cases using examples, and describes our implementation in the form of the Spoofax testing language.

Catch Me If You Can: Performance Bug Detection in the Wild
Milan Jovic, University of Lugano, Switzerland
Andrea Adamoli, University of Lugano, Switzerland
Matthias Hauswirth, University of Lugano, Switzerland

Profilers help developers to find and fix performance problems. But do they find performance bugs - performance problems that real users actually notice? In this paper we argue that - especially in the case of interactive applications - traditional profilers find irrelevant problems but fail to find relevant bugs. We then introduce lag hunting, an approach that identifies perceptible performance bugs by monitoring the behavior of applications deployed in the wild. The approach transparently produces a list of performance issues, and for each issue provides the developer with information that helps in finding the cause of the problem. We evaluate our approach with an experiment where we monitor an application used by 24 users for 1958 hours over the course of 3 months. We characterize the resulting 881 issues, and we successfully find and fix the causes of a set of representative examples.

keywords: Profiling, Latency bug, Perceptible performance

PREFAIL: A Programmable Tool for Multiple-Failure Injection
Pallavi Joshi, University of California, Berkeley, United States
Haryadi S. Gunawi, University of California, Berkeley, United States
Koushik Sen, University of California, Berkeley, United States

As hardware failures are no longer rare in the era of cloud computing, cloud software systems must “prevail” against multiple, diverse failures that are likely to occur. Testing software against multiple failures poses the problem of combinatorial explosion of multiple failures. To address this problem, we present PreFail, a programmable failure-injection tool that enables testers to write a wide range of policies to prune down the large space of multiple failures. We integrate PreFail into three cloud software systems (HDFS, Cassandra, and ZooKeeper), show a wide variety of useful pruning policies that we can write for them, and evaluate the speed-ups that we obtain by using the policies.
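
The combinatorial explosion mentioned in this abstract, and the effect of a pruning policy on it, can be illustrated with a small sketch. This is not PreFail's API or policy language: the failure points, the function names, and the "distinct nodes" policy below are all hypothetical, chosen only to show how a tester-written predicate might shrink the space of multiple-failure combinations.

```python
from itertools import combinations

# Hypothetical failure points as (node, operation) pairs.
# These names are illustrative assumptions, not taken from PreFail.
FAILURE_POINTS = [
    (node, op)
    for node in ("node1", "node2", "node3")
    for op in ("read", "write", "sync")
]


def all_failure_sets(points, k):
    """Enumerate every way to inject k failures among the given points."""
    return list(combinations(points, k))


def distinct_nodes_policy(failure_set):
    """Example policy: keep only sets whose failures hit distinct nodes."""
    nodes = [node for node, _ in failure_set]
    return len(set(nodes)) == len(nodes)


def prune(failure_sets, policy):
    """Keep only the failure sets that the policy accepts."""
    return [fs for fs in failure_sets if policy(fs)]


if __name__ == "__main__":
    for k in (1, 2, 3):
        full = all_failure_sets(FAILURE_POINTS, k)
        kept = prune(full, distinct_nodes_policy)
        print(f"k={k}: {len(full)} combinations, {len(kept)} after pruning")
```

With nine failure points, the unpruned space grows from 9 single failures to 36 pairs and 84 triples, while the hypothetical policy keeps 27 pairs and 27 triples. A real policy would presumably be written against the system under test rather than against a fixed list like this one.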

Synthesizing Method Sequences for High-Coverage Testing
Suresh Thummalapenta, IBM Research, Bangalore, India
Tao Xie, Department of Computer Science, North Carolina State University, Raleigh, United States
Nikolai Tillmann, Microsoft Research, Redmond, United States
Jonathan De Halleux, Microsoft Research, Redmond, United States
Zhendong Su, Department of Computer Science, University of California, Davis, United States

High-coverage testing is challenging. Modern object-oriented programs present additional challenges for testing. One key difficulty is the generation of proper method sequences to construct desired objects as method parameters. Existing approaches suffer from low coverage. In this paper, we cast the problem as an instance of program synthesis, which automatically produces candidate programs to satisfy a user-specified intent. In our setting, candidate programs are method sequences, and desired object states specify intent. Automatic generation of desired method sequences is difficult because of the large search space: sequences often involve methods from multiple classes and require specific primitive values. This paper introduces a novel approach, Seeker, to intelligently navigate the large search space. Seeker synergistically combines static and dynamic analyses: (1) dynamic analysis generates method sequences to cover branches; (2) static analysis uses dynamic analysis information for not-covered branches to generate candidate sequences; and (3) dynamic analysis explores and eliminates statically generated sequences. For evaluation, we implemented Seeker and demonstrate its effectiveness on object-oriented test-case generation. We show that Seeker achieves higher branch coverage and def-use coverage than existing state-of-the-art approaches on four subject applications totalling 28K LOC. We also show that Seeker detects 34 unknown defects missed by existing tools.