Menu:Noticias:Aleks Nanevski (Ph.D. Carnegie Mellon, 2004) se ha incorporado a la plantilla de IMDEA Software como Assistant Research Professor. Realizó estancia postdoctoral en Microsoft Research, Cambridge[leer mas] Pierre Ganty (Ph.D. Université Libre de Bruxelles, 2007) se ha unido a la plantilla de IMDEA Software como Assistant Research Professor. Realizo su posdoctorado en UCLA [leer mas] |
2010 Invited talks
Monday, May 10Julien Bertrane, Researcher, Computer Science Departament, Carnegie Mellon University, USA Developing temporal abstract domains that prove the temporal specifications of reactive systemsAbstract: We consider the complex behaviors of embedded systems with several communicating imperfectly-clocked synchronous systems. In order to prove statically and automatically their properties, we introduced a "time-continuous" semantics and several ad hoc temporal abstract domains. Then we show how to build new temporal domains from these simple basic domains, through the use of abstract transformers and of reduced products. The temporal aspects of this domains eases the automatisation of some of this optimizations and makes reductions more precise.
Monday, May 3Ruy Ley-Wild, Researcher, Computer Science Departament, Carnegie Mellon University, USA Programmable Self-Adjusting ComputationAbstract: In the self-adjusting computation model, programs can respond automatically and efficiently to input changes by tracking the dynamic data dependencies of the computation and incrementally updating the output as needed. After a run from scratch, the input can be changed and the output can be updated via change-propagation, a mechanism for re-executing the portions of the computation affected by the changes while reusing the unaffected parts. Previous research shows that self-adjusting computation can be effective at achieving near-optimal update bounds for various applications. We address the question of how to write and reason about self-adjusting programs. We propose a language-based technique for annotating ordinary programs and compiling them into equivalent self-adjusting versions. We also provide a cost semantics and a concept of trace distance that enables reasoning about the effectiveness of self-adjusting computation at the source level. To facilitate asymptotic analysis, we propose techniques for composing and generalizing concrete distances via trace contexts (traces with holes). The translation preserves the extensional semantics of the source programs, the intensional cost of from-scratch runs, and ensures that change-propagation between two evaluations takes time bounded by their relative distance
Tuesday, April 13Xavier Rival, Researcher, ENS Paris Shape analysis using separating shape graphsAbstract: The purpose of shape analysis is to infer properties about unbounded data structures, such as trees or lists. We will present a shape analysis based on abstract inerpretation, which uses separation logic and inductive definitions. Our analysis targets C programs. The elements of our abstract domain can be seen as separation logic formulae or equivalently as separating shape graphs. It is parametric in the inductive definitions of the structures ti be used for a given analysis. We will present the definition of this abstract domain together with the main analysis algorithms such as materialization (local concretization) and widening (global abstraction) and our recent progress for accomodating low level aspects of the C language.
Tuesday, April 6Emerson Murphy-Hill, Researcher, University of British Columbia Programmer-Friendly Software Restructuring ToolsAbstract: Tools that perform semi-automated software restructuring (refactoring) are currently under-utilized by programmers. If more programmers adopted refactoring tools, software projects could make enormous productivity gains. However, as more advanced refactoring tools are designed, a great chasm widens between how the tools must be used and how programmers want to use them. This talk discusses work that begins to bridge this chasm by exposing usability guidelines to direct the design of the next generation of programmer-friendly refactoring tools, so that refactoring tools fit the way programmers behave, not vice-versa.
Wednesday, March 24Laurent Mauborgne, Researcher, École Normale Supérieure, France Segmented RelationsAbstract: The core mechanism of abstract interpretation based program analysis consists in approximating a process of collecting more and more behaviors until reaching a fixpoint. When considering program values this process results in relations between values, and the precise approximation of such behaviors requires relations that can express disjunctions in some way. This expressiveness is necessarily limited in order to scale up and in the Astrée analyzer we developed some relations that could express disjunctions, such as the trace partitioning domain. Recent examples presented cases where those domains could not scale, because they introduced too many disjunctions. So we developed more powerful relations that express disjunctions based on segmentations. Segmentations can be seen as ordered partitions where the boundaries can be arbitrary expressions. The use of expressions allows good expressiveness, whereas the ordering imposes constraints on the disjunction which allow efficient computation and representation. Those relations can be parameterized in order to tune the cost and the precision. They have applications for simple numerical variables but also for arrays, which where either very imprecisely handled or very costly in Astrée. The main subject of this talk is joint work with Patrick Cousot and Radhia Cousot.
Tuesday, March 23Juan Caballero, Researcher, Carnegie Mellon University, USA Binary Program Analysis and Model Extraction for Security ApplicationsAbstract: In this talk I present a platform to extract models of security-relevant functionality from program binaries, enabling multiple security applications such as active botnet infiltration, finding deviations between implementations of the same functionality, vulnerability signature generation, and finding content-sniffing cross-site scripting (XSS) attacks. In this talk, I present two applications: active botnet infiltration and finding content-sniffing XSS attacks. Botnets, large networks of infected computers under control of an attacker, are one of the dominant threats in the Internet, enabling fraudulent activities such as spamming, phishing, and distributed denial-of-service attacks. To build strong botnet defenses, defenders need information about the botnet's capabilities and the attacker's actions. One effective way to obtain that information is through active botnet infiltration, but such infiltration is challenging due to the encrypted and proprietary protocols that botnets use to communicate. In this talk, I describe techniques for reverse-engineering such protocols and present how we use this information to infiltrate a prevalent, previously not analyzed, spam botnet. Cross-site scripting attacks are the most prevalent class of attacks nowadays. One subtle class of overlooked XSS attacks are content-sniffing XSS attacks. In this talk, I present model extraction techniques and how they enable finding content-sniffing XSS attacks. We use those models to find attacks against popular web sites and browsers such as Wikipedia when accessed using Internet Explorer 7. I describe our defenses for these attacks and how our proposals have been adopted by widely used browsers such as Google Chrome and IE8, as well as standardization groups.
Monday, March 22Saurabh Srivastava, PhD candidate, University of Maryland Satisfiability-based Program Reasoning and SynthesisAbstract: Today, software is ubiquitous---it is deployed on virtually all electronic devices, small and large, including those that are life- and safety-critical. The need for robust, certifiably correct software requires us to develop the theory and tools for mechanically reasoning about, and also automatically generating, programs. In this talk, I will present the theory and practice behind a novel approach to program reasoning and program synthesis. Program reasoning is the task of verifying a program against its specification, and of inferring the specification given a program. Program synthesis is the task of automatically generating the program corresponding to a specification. I will describe novel algorithms that reduce program reasoning and synthesis to SAT/SMT solving problems, which permits leveraging the substantial engineering advances present in today's solvers. Our tools based on this theory can reason about expressive properties of programs, e.g., quantified properties, that were beyond the reach of previous techniques. They can also automatically synthesize programs from specifications, e.g., sorting programs and Strassen's matrix multiplication, which was not possible earlier.
Monday, March 15Zvonimir Rakamaric, Researcher, University of British Columbia Modular Verification of Shared-Memory Concurrent System SoftwareAbstract: Software is large, complex, and error-prone. The trend in hardware design of switching to multi-core architectures makes software development even more complex. Cutting software development costs and ensuring higher reliability of software is of global interest and a grand challenge. This is especially true of the system software that is the foundation beneath all general-purpose application programs. The verification of system software poses particular challenges: system software is typically written in a low-level programming language with dynamic memory allocation and pointer manipulation, and system software is also highly concurrent, with shared-memory communication being the main concurrent programming paradigm. Available verification tools usually perform poorly when dealing with the aforementioned challenges. In this talk I'll present my research that addresses these problems by enabling precise and scalable verification of low-level, shared-memory, concurrent programs.
Friday, March 12Viktor Vafeiadis, Researcher, University of Cambridge, UK Towards full verification of concurrent librariesAbstract: Modern programming platforms, such as Microsoft's .NET, provide libraries of efficient concurrent data structures, which are used in a wide range of applications. In this talk, I will discuss some of the challenges in implementing such concurrent data structures, what correctness of these libraries means, how one can formally prove that a given library is correct, and the extent to which these proofs can be carried out automatically.
Monday, March 8Alexander Malkis, Researcher, University of Freiburg, Germany Abstract ThreadsAbstract: Verification of large multithreaded programs is challenging. Automatic approaches cannot overcome the state explosion in the number of threads; semi-automatic methods require expensive human time for finding global inductive invariants. Ideally, automatic methods should not deal with the composition of the original threads and a human should not supply a global invariant. We provide such an approach. In our approach, a human supplies a specification of each thread in the program. Here he has the freedom to ignore or to use the knowledge about the other threads. The checks whether specifications of threads are sound as well as whether the composition of the specifications is error-free are handed over to the off-the-shelf verifiers. We show how to apply this divide-and-conquer approach to the interleaving semantics with shared variables communication where specifications are targeted to real-world programmers: a specification of a thread is simply another thread. The new approach extends thread-modular reasoning by relaxing the structure of the transition relation of a specification. We demonstrate the feasibility of our approach by verifying two protocols governing the teardown of important data structures in Windows device drivers.
Monday, March 4Mark Marron, Post-doctoral Researcher, IMDEA Software High-Level Heap Abstractions for Debugging ProgramsAbstract: The identification, isolation, and correction of program defects require the understanding of both the algorithmic structure of the code as well as the data structures that are being manipulated. While modern development environments provide substantial support for examining the program source code (the algorithmic aspect of the program), they provide relatively weak support for examining heap-based data structures. The goal of our work is to provide support for understanding data structures on the heap and the relations between them. We introduce a novel approach that emphasizes high-level concepts about heap based data structures (their layout and size) and relationships between them (sharing and connectivity). The proposed abstract representation of the program is designed to help the developer look past, often unimportant, details and focus on understanding the overall structure of the program's computation. When used in conjunction with the low-level view of individual values and objects provided by traditional debuggers, the high-level information in the abstract heap representation can be used to identify and diagnose subtle errors, as demonstrated via several case studies identifying heap related program defects. Further, we give an efficient algorithm for computing this abstract representation that scales quasilinearly with the size of the heap.
Monday, March 1Boris Köpf, Researcher, Max Planck Institute for Software Systems, Saarbruecken,Alemania Quantitative Information-Flow Analysis - Automation and ApplicationsAbstract: Information-flow analysis is a powerful technique for reasoning about the sensitive information that is exposed by a program during execution. In this talk, I will present the first automatic method for information-flow analysis that discovers what information is leaked and computes its comprehensive quantitative interpretation. The leaked information is characterized by an equivalence relation on secret inputs, and is represented by a logical assertion over the corresponding program variables. For quantifying the leaked information, we determine the number of equivalence classes and their sizes. This data provides the basis for computing a comprehensive set of information-theoretic quantities. We provide an implementation of our method that builds upon existing tools for program verification and model counting. I will give an outlook on an extension of our method based on randomization and approximation techniques. Subsequently, I will report on the quantitative information-flow analysis of cryptosystems that are protected by blinding, the state-of-the-art countermeasure against timing attacks. The analysis exhibits the modifications needed to make a blinded implementation provably secure, in the sense that the number of timing measurements required for recovering a significant number of key bits is too large for any realistic attacker. We use this result to propose the first practical countermeasure against timing attacks that is backed up by a formal security guarantee. The talk describes joint work with Michael Backes, Andrey Rybalchenko (automation), and Markus Duermuth (timing attacks).
Wednesday, February 24Joshua Dunfield, Researcher, McGill University, Montreal Verifying Functional Programs with Type RefinementsAbstract: Types express properties of programs; typechecking is specification checking. But the specifications expressed by conventional type systems are imprecise. Type refinements enable programmers to express more precise properties, while keeping typechecking decidable. I present a system that unifies and extends past work on datasort and index refinements. Intersection and union types permit a powerful type system that requires no user input besides type annotations. Instead of seeing type annotations as a burden or just as a shield against undecidability, I see them as a desirable form of machine-checked documentation. Accordingly, I embrace the technique of bidirectional typechecking for everything from dimension types to first-class polymorphism. My implementation of this type system, for a subset of Standard ML, found several bugs in the SML/NJ data structure libraries.
Tuesday, January 26Sumit Gulwani, Researcher, Microsoft, USA The Reachability-bound ProblemAbstract: The "reachability-bound problem" is the problem of finding a symbolic worst-case bound on the number of times a given control location inside a procedure is visited in terms of the inputs to that procedure. This has applications in bounding resources consumed by a program such as time, memory, network-traffic, power, as well as estimating quantitative properties (as opposed to boolean properties) of data in programs, such as amount of information leakage or uncertainty propagation. Our approach to solving the reachability-bound problem brings together three fundamentally different techniques for reasoning about loops in an effective manner. This includes (a) abstract-interpretation based iterative technique for computing precise disjunctive invariants to summarize nested loops, (b) arithmetic constraint solving based technique for computing ranking functions for individual paths inside loops, and (c) proof-rules based technique for appropriate composition of ranking functions for individual paths for precise loop bound computation. We have implemented our solution to the reachability-bound problem in a tool called SPEED, which computes symbolic computational complexity bounds for procedures in .Net code-bases. The tool scales to large programs taking an average of around one second to analyze each procedure.
Tuesday, January 26Derek Dreyer, Researcher, Max Planck Institute for Software Systems, Alemania A Modal Logic for Equational Reasoning in ML-Like LanguagesAbstract: The method of logical relations is a classic technique for proving the equivalence of higher-order programs that implement the same observable behavior but employ different internal data representations. Although it was originally studied for pure, strongly normalizing languages like System F, it has been extended over the past two decades to reason about increasingly realistic languages. In particular, Appel and McAllester's idea of step-indexing has been used recently to develop syntactic Kripke logical relations for ML-like languages that mix functional and imperative forms of data abstraction. However, while step-indexed models are powerful tools, reasoning with them directly is quite painful, as one is forced to engage in tedious step-index arithmetic to derive even simple results. In this work, we propose a logic LADR for equational reasoning about higher-order programs in the presence of existential type abstraction, general recursive types, and higher-order mutable state. LADR exhibits a novel synthesis of features from Plotkin-Abadi logic, Gödel-Löb logic, S4 modal logic, and relational separation logic. Our model of LADR is based on Ahmed, Dreyer, and Rossberg's state-of-the-art step-indexed Kripke logical relation, which was designed to facilitate proofs of representation independence for "state-dependent" ADTs. LADR enables one to express such proofs at a much higher level, without counting steps or reasoning about the subtle, step-stratified construction of possible worlds. This is joint work with Georg Neis, Andreas Rossberg, Amal Ahmed, and Lars Birkedal.
Monday, January 25Stan Rosenberg, PhD candidate, Stevens Institute of Technology, Hoboken, NJ, USA Local reasoning for Java programs and its automationAbstract: Shared mutable objects are a cornerstone of the object-oriented paradigm. Modular reasoning remains a challenge. For example, how can a client of setLeftZero be certain that the right subtree did not change after the invocation? In this talk, I will describe a program logic, called Region Logic, designed to control aliasing through explicit use of effects and disjointness assertions. The logic employs regions (sets of references) in a novel way, by using them in ghost state (i.e., as program annotations), effects and assertions. The region assertion language is especially instrumental in reasoning about many interesting "lightweight" specifications (e.g., without reachability predicates or inductive definitions). To substantiate the preceding claim, I will present a specification (in Region Logic) derived from the Composite design pattern challenge problem and describe how to automatically verify it. Lastly, I will give some insight into a decision procedure for quantifier-free region assertions.
Monday, January 25Thomas Wies, Post-doctoral Researcher, Institute of Science and Technology (IST), Austria Forward Analysis of Depth-Bounded ProcessesAbstract: We study the verification of message passing systems that admit unbounded creation of threads and name mobility. Depth-bounded processes form the most expressive known fragment of such systems for which interesting verification problems are still decidable. A configuration of a message passing system can be represented as a graph. In a depth-bounded system the lengths of the acyclic paths in all reachable configurations are bounded by a constant. Many real-life use cases of message passing concurrency are depth-bounded. We develop an adequate domain of limits for the well-structured transition systems that are induced by depth-bounded processes. An immediate consequence of our result is the existence of a forward algorithm that decides the covering problem for this class. Unlike backward algorithms, the forward algorithm terminates even if the depth of the system is not known a priori. More importantly, our result promises a whole spectrum of forward algorithms that enable the effective verification of a large class of mobile systems and concurrent programs based on message passing. This is joint work with Tom Henzinger and Damien Zufferey.
Monday, January 25Ruzica Piskac, PhD Student, EPFL, Suiza Combining Theories with Shared Set OperationsAbstract: We explore automated reasoning techniques for the non-disjoint combination of theories that share set variables and operations. We are motivated by applications to software analysis, where verification conditions are often expressed in a combination of such theories. The standard Nelson-Oppen result on combining theories does not apply in this setting, because the theories share more than just equalities. We state and prove a new combination theorem and use it to show the decidability of the satisfiability problem for a class of formulas obtained by applying propositional connectives to formulas belonging to: 1) Boolean algebra with Presburger arithmetic (with quantifiers over sets and integers) 2) weak monadic second-order logic over trees (with first and second order quantifiers) 3) two-variable logic with counting quantifiers 4) the Bernays-Schönfinkel-Ramsey class of first-order logic with equality (with \forall^* \exists^* quantifier prefix) 5) the quantifier-free logic of multisets with cardinality constraint We illustrate our result through verification conditions expressing properties of data structures. This is a joint work with Thomas Wies and Viktor Kuncak.
Monday, January 25Sumit Gulwani, Researcher, Microsoft, USA Dimensions in Program SynthesisAbstract: In this talk, I will briefly describe some recent program synthesis projects that are motivated by various reasons: enabling people with no programming background to develop utility programs, helping regular programmers automatically discover tricky/mundane details, and even discovery of new algorithms. These projects target synthesis of a variety of programs such as standard undergraduate textbook algorithms (e.g., sorting, dynamic programming), program inverses (e.g., decoders, deserializers), bit-vector manipulation routines, deobfuscators, graph algorithms, data transformers, algebraic algorithms, and mutual exclusion algorithms. Our tools today scale to synthesis of about 25 lines of code within an hour. These projects cover various design points along three dimensions in program synthesis: functional specification, search space, and search technique. (1) The functional specification is in the form of formal pre/postconditions, input-output examples, or partial/inefficient/related programs. Interactive rounds play an important role in resolving any potential under/over-specification. (2) The search space is either over imperative/functional programs, or over restricted models of computation such as regular/context-free transducers, or succinct logical representations. (3) The search technique has two components: constraint generation,and constraint solving. (a) Constraint generation is input-based, path-based, or invariant-based. (b) Constraint solving of resultant quantified formulas typically involves use of quantifier elimination techniques (such as Farkas Lemma, cover algorithms, sampling) that further enable efficient solving of quantifier-free constraints using off-the-shelf SAT/SMT solvers.
|