Execution comparison is becoming more common as a means of debugging faulty programs or simply explaining program behavior. Often, such as when debugging, the goal is to understand particular aspects of a single execution, and it is not immediately clear against what we should compare this execution. Prior work has led to approaches for acquiring a second execution, or peer, with which to compare the first. The earliest of these searched test suites for suitable candidates. More recent work focuses on synthesizing a new execution, either by generating new input for the program or by directly mutating the execution itself. In spite of these proposals, it is unclear what advantages these different techniques for finding peers might have over each other. In this paper, we implement five different existing techniques and examine their impact on 20 real bugs. These bugs represent the full set of reported bugs for three programs during one year. We present a metric to evaluate the quality of the peers. It is based on the similarity of the peers to the executions of the patched programs. We also discuss in detail the different scenarios where these techniques hold advantages.
@inproceedings{DBLP:conf/issta/SumnerBZ11,
  author    = {William N. Sumner and
               Tao Bao and
               Xiangyu Zhang},
  title     = {Selecting peers for execution comparison},
  booktitle = {ISSTA},
  year      = {2011},
  pages     = {309--319},
  ee        = {http://doi.acm.org/10.1145/2001420.2001458},
  crossref  = {DBLP:conf/issta/2011},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}