Pepe Vila explains at RootedCON the automata theory for reversing modern CPUs

March 11, 2020

Pepe Vila explains at RootedCON the automata theory for reversing modern CPUs

The pre-doctoral researcher at the IMDEA Software Institute, Pepe Vila, participated at RootedCON 2020 last 7th of March giving a talk titled ‘The ’80s never died: automata theory for reversing modern CPUs’ (view presentation). He explained his latest paper developed with the support and of the researchers Pierre Ganty and Marco Guarnieri, from IMDEA Software Institute, and Boris Köpf, from Microsoft Research, ‘CacheQuery: Learning Replacement Policies from Hardware Caches’ which reveals the previously unknown implementation of cache replacement policies used in modern processors.

Understanding the timing behavior of modern CPUs is crucial for optimizing code and for ensuring timing-related security and safety properties. Unfortunately, the timing behavior of modern processors depends on subtle and poorly documented details of their microarchitecture, which has triggered laborious efforts to reverse-engineer microarchitectural details. Cache replacement policies have received special attention, because they control the content stored in the memory hierarchy and hence heavily influence execution time.

In the paper, which has been accepted into PLDI'20 (Conference on Programming Language Design and Implementation), researchers from the IMDEA Software Institute and Microsoft Research present an end-to-end solution for automatically learning cache replacement policies from hardware time measurements. Their contribution is based on two main contributions: a tool, called CacheQuery, that provides an interface to any set in the cache hierarchy, freeing the user from having to deal with intricate details such as address translation, index mapping, interference from other cache levels or measurement noise; and, an algorithm, called Polka, that provides an abstract automaton for the cache replacement policy, and exploits various symmetries that make the automaton learning techniques applicable to the problem. In addition, they have been able to synthesize programs to automatically derive readable descriptions of the learned replacement policy.

In the experimental phase, their tests have been successful in learning several cache replacement policies used in Intel’s latest processors, including two previously undocumented policies.