Colloquium: Finding Substitutable Binary Code By Synthesizing Adapters

September 24, 2018 - 11:15am to 12:15pm
3-180 Keller Hall

Stephen McCamant has been an Assistant Professor of Computer Science
and Engineering at the University of Minnesota since the fall of 2012,
where his main research area is program analysis for software security
and correctness. He is especially interested in binary code analysis
and transformation, hybrid dynamic/static techniques and symbolic
execution, information flow/taint analysis, and applications of
decision procedures. His research on software-based fault isolation
was a key foundation for the Google Native Client system, and he is
the primary author of the FuzzBALL binary symbolic execution system
which participated in the DARPA Cyber Grand Challenge and is available
open-source.  He received his Ph.D from MIT in 2008, and from
2008-2012 he was a postdoc at UC Berkeley.

In software it is common to encounter pieces of code that intuitively
implement the same functionality, but have different interfaces. A
developer might use one such fragment in place of another by writing a
wrapper function that bridges their interface difference. We build on
this idea in an approach to determine whether one code fragment can be
used in place of another. Our approach attempts to automatically
create such a wrapper (an adapter) that gives equivalent semantics.
Our counterexample-guided algorithm synthesizes such an adapter from a
family of adapters, or terminates if no suitable adapter exists in the
family, as determined by either concrete enumeration or symbolic
execution. Adapter synthesis can be applied to a variety of
development tasks: we demonstrate our binary-level implementation for
reverse-engineering functionality, switching between compatible
implementations, and avoiding security vulnerabilities, among
others. In a large-scale evaluation, our system examined more than
61,000 fragments of binary code extracted from a ARM image built for
the iPod Nano 2g, and found across more than a million synthesis tasks
that many fragments are equivalent to instances of known functions
from the VLC media player.