Fhtec Da Lang
Written at the University of Applied Sciences Technikum Wien Master Degree Programme Embedded Systems
Affidavit
I hereby declare by oath that I have written this paper myself. Any ideas and concepts taken from other sources either directly or indirectly have been referred to as such. The paper has neither in the same nor similar form been handed in to an examination board, nor has it been published.
Place, Date
Signature
Abstract
Verification of embedded systems software is crucial for providing flawless functionality of today's intelligent computer systems found in automobiles, elevators, aircraft, medical devices, robots, etc. The approach most widely used in industry relies on testing defined corner cases. Although it is well known that only a very limited part of the test space can be covered in this way, no more complete approaches have been widely adopted so far. Formal verification methods such as model checking, complemented with various techniques to reduce state spaces, have recently gained some momentum in this regard. Nevertheless, formal verification of embedded systems has played only a minor role in the past. Practical restrictions of this approach are due to (i) the need to (manually) create a model of the system beforehand and (ii) the resulting large state spaces. This master thesis focuses on model checking and static analysis of Intel MCS-51 assembly code with the [mc]square framework. In the presented approach, issue (i) is solved by using a dedicated target CPU simulator. In order to tackle (ii), existing abstraction techniques are adapted for the Intel MCS-51 target architecture. A novel state space reduction technique termed Delayed Nondeterminism with Look Ahead is introduced. The presented abstraction technique centers around the coherence among Boolean operators with particular regard to the 3-valued microcontroller memory model. Besides, the Intel MCS-51 CPU simulator is integrated into the existing static analysis framework of [mc]square. A novel data-flow analysis termed Register Bank Analysis is described in order to handle register bank swapping. Register bank swapping is a particular feature of some embedded microcontrollers such as the Intel MCS-51. This approach allows narrowing and refining the subsequent data-flow analyses, leading to more precise analysis results. The additional precision in turn contributes to a reduction of state spaces during model checking. In order to evaluate the benefits and to show the applicability of the introduced concepts, a real-world case study is conducted. The case study source code is taken from an industrial application. The microcontroller software is model checked with [mc]square by taking advantage of the presented state space abstractions and static analysis techniques.
Keywords: Assembly code model checking, static analysis of assembly code, abstraction techniques, case study, [mc]square
Kurzfassung (German Abstract)
The verification of software for embedded systems is a necessary criterion for guaranteeing the error-free operation of intelligent computer systems in automobiles, elevators, aircraft, medical devices, robots, etc. The standard method widely used in industry relies on covering a few representative test cases. It is known that this approach can cover only a very small part of the actual test space; nevertheless, there are only few mature concepts for remedying this verification deficit. Formal verification methods such as model checking are promising approaches for showing that software is free of errors. In the context of embedded systems, these formal approaches have played only a minor role in the past. In practice, difficulties arise from (i) the manual modeling of the system and (ii) the unmanageably large state spaces. This master thesis deals with model checking and static analysis of Intel MCS-51 assembly code using the [mc]square framework. The presented approach attempts to solve problem (i) by means of a dedicated microcontroller simulator. Existing abstraction techniques are adapted for the Intel MCS-51 microcontroller in order to minimize the resulting state spaces (ii). A new state space reduction named Delayed Nondeterminism with Look Ahead is presented. This approach is based on the relationships between Boolean logic and the three-valued memory model of the microcontroller simulator. Furthermore, the existing Intel MCS-51 simulator is integrated into the static analysis framework of [mc]square. A novel data-flow analysis (Register Bank Analysis) is developed to account for the architecture-specific switching of register banks. This approach makes it possible to narrow and refine the subsequent analysis results. The precision gained allows a further state reduction during model checking. To demonstrate the benefits and the applicability of the presented concepts, a case study is presented. The software of the case study originates from an industrial application. The microcontroller program is verified with the [mc]square model checker using the presented abstraction techniques and static analyses.
Keywords: Assembly code model checking, static analysis of assembly code, abstraction techniques, case study, [mc]square
Acknowledgements
Not because it is customary, but because it is appropriate: I would like to thank my advisor, FH-Prof. Dr. Martin Horauer, for his excellent guidance and for giving me the opportunity to join one of his research projects within the Department of Embedded Systems at the University of Applied Sciences FH Technikum Wien. He allowed me a great degree of freedom in my work and kindly helped me to gain ground in academic work. Most valuable to me were his numerous tips, his pragmatic approach to doing things, and our fruitful discussions both related and unrelated to work. I highly enjoyed the time working together. Next, I want to thank Dr. Bastian Schlich and his team from the Embedded Software Laboratory at RWTH Aachen University. Even though we were geographically separated most of the time, he greatly contributed to setting up a smooth and rich collaboration. He was always willing to listen to my problems and gave me plenty of support to get started with model checking and [mc]square. Last but definitely not least, I thank my family and friends. They supported me in everything I did and greatly helped me to make my way.
Contents
1 Motivation and Introduction
2 Contribution
  2.1 Status Quo
  2.2 Thesis Contribution
  2.3 Long-Term Vision
3 Background
  3.1 A World Where Nothing Works and Nobody Knows Why
  3.2 Formal Verification
  3.3 The Verification Problem
  3.4 Model Checking
    3.4.1 The Model Checking Problem
    3.4.2 The Kripke Structure
    3.4.3 The Temporal Logic CTL
    3.4.4 The Model Checking Workflow
    3.4.5 Coffee Vending Machine Example
    3.4.6 Local vs. Global Model Checking Algorithms
    3.4.7 The Pros and Cons of Model Checking
  3.5 Assembly Code Model Checking and [mc]square
  3.6 C51Simulator - Intel MCS-51 Simulator Component
    3.6.1 The Intel MCS-51 Microcontroller
    3.6.2 The Big Picture
    3.6.3 Test and Verification of the C51Simulator Component
    3.6.4 The Software Architecture of the C51Simulator
  3.7 Related Work
    3.7.1 The Assembly Code Model Checking Approach
    3.7.2 3-valued Abstraction Techniques
    3.7.3 Static Analysis
    3.7.4 Simulators for [mc]square
4 Abstraction Techniques
  4.1 Abstraction in Model Checking
    4.1.1 Reducing System Complexity through Abstraction
    4.1.2 Turing's Halting Problem and Why Model Checking Works Anyway
    4.1.3 Nondeterministic Behavior in Assembly Code Model Checking
  4.2 Implementation - Abstraction Techniques for the C51Simulator
    4.2.1 Delayed Nondeterminism
    4.2.2 Delayed Nondeterminism with Look Ahead
    4.2.3 Nondeterministic Program Status Word
5 Static Analysis
  5.1 Background - Static Analysis of Embedded Systems Code
    5.1.1 Control Flow Graphs
    5.1.2 Data-flow Analysis
    5.1.3 Forward Data-flow Analysis - RDA
    5.1.4 Backward Data-flow Analysis - LVA
    5.1.5 Solving Data-flow Equations
  5.2 Implementation - Static Analysis for the C51Simulator
    5.2.1 Overview
    5.2.2 Control Flow Graph Building
    5.2.3 Action List Building
    5.2.4 Live Variable Analysis
    5.2.5 Reaching Definitions Analysis
    5.2.6 Register Bank Analysis
    5.2.7 Stack Analysis
    5.2.8 Interrupt Flag Analysis
    5.2.9 Path Reduction
    5.2.10 Implementation Summary
  5.3 Remaining Challenges in Static Analysis of (Intel MCS-51) Assembly Code
    5.3.1 Indirect Addressing
    5.3.2 Indirect Control Flow
    5.3.3 Self-Modifying Code
    5.3.4 Loop Bounds
    5.3.5 Summary
6 Real Life Case Study
  6.1 Introduction and Motivation
  6.2 The Knitting Machine Monitoring Device - Hardware Overview
  6.3 The Knitting Machine Monitoring Device - Software Overview
    6.3.1 The Main Building Blocks
    6.3.2 Serial Receive and Transmit Ringbuffer
    6.3.3 The Communication Protocol
  6.4 Extracting CTL Properties Out of the Textual Specification
    6.4.1 The Given Textual Specification
    6.4.2 CTL Properties
    6.4.3 Comments
    6.4.4 Reviewing Properties #4a to #8a
    6.4.5 Communication Protocol Verification
  6.5 Results
    6.5.1 Numbers
    6.5.2 Stack Analysis
    6.5.3 The Circular Buffer Implementation
    6.5.4 The Receiver State Machine
    6.5.5 Properties #4a to #8a
    6.5.6 The Communication Protocol
    6.5.7 Compiler Criticism
    6.5.8 Comparison of Abstraction Techniques
7 Remaining Challenges and Future Work
  7.1 Local Model Checking and Resulting Counterexamples
  7.2 Getting the Intel MCS-51 Simulator Implementation Right
  7.3 The Automatic Generated Target Simulator
  7.4 Counterexample Validation
  7.5 Coping with the State-Explosion Problem
8 Conclusion
Bibliography
List of Figures
List of Tables
List of Algorithms
List of Listings
List of Abbreviations
1 Motivation and Introduction
Embedded systems are becoming ubiquitous. Most existing intelligent computer systems do not even have a screen or input devices. They are embedded and therefore hidden in various kinds of objects: automobiles, elevators, aircraft, medical devices, industrial robots, etc. The demand for efficiency and flexibility in information processing leads to a movement from manual, mechanical, and hydraulic systems towards highly integrated embedded solutions. Each day, we are putting increasing trust in these software and hardware systems. Software is the main enabler for innovative features and new application areas and is most often the most elaborate part of the system. However, a fact that is often overlooked is the natural imperfection of the design team involved in the software implementation process. The ever-increasing system complexity is another contributor to the vulnerability of state-of-the-art embedded systems [1]. The assembly code that was written for the first moon landing in 1969 was the estimated equivalent of 7500 lines of C code. The code had to fit into the few kByte of program memory featured by the mission computer [2]. Nowadays, embedded solutions can easily scale up to a few million lines of code. A lot of things may have changed since then, but one decisive question remains: How to guarantee and prove that the software is working correctly, without any flaws?

Formal verification methods such as model checking, theorem proving, and abstract interpretation have gained some momentum in verifying those systems. Indeed, almost all notable software companies [3, 4, 5] have developed and deployed model checking tools to ensure design correctness. In 2008, the achievements of model checking were greatly honored when the Association for Computing Machinery (ACM) awarded the prestigious Turing Award (the "Nobel Prize in computer science") to the pioneers in this field: Edmund Clarke, Allen Emerson, and Joseph Sifakis. As of today, model-based software development and formal verification are well established in most of today's software engineering processes. Nevertheless, formal verification has played a minor role in the context of embedded systems in the past. The reasons are manifold, e.g., past model checking tools were only capable of handling small designs with a few hundred lines of machine code, and generating the required behavioral models is most often tedious, challenging, and error-prone. This is especially true for the area of embedded systems. Software written for embedded systems is always linked to a certain application and a target hardware platform. Microcontroller-specific programming language extensions are used to access particular hardware features that cannot be enabled by high-level programming language syntax. Formal verification of the high-level application code is often not sufficient to master the verification challenges of highly reliable and safety-critical applications. Target platform peculiarities make formal verification of existing embedded software a tough job.

Recently, model checking of assembly code became the focus of research projects [6, 7, 8]. It has some remarkable advantages compared to model checking programs written in high-level programming languages. The code that is deployed to the hardware is checked and not just an intermediate representation; thus, any errors introduced during the development process can be found (e.g., compiler errors, toolchain errors, wrong periphery setup, and errors not visible in the C code at all). First tools, such as [mc]square (Model Checking for Micro Controllers) [9] from the Technical University of Aachen, emerged and proved their feasibility in research and academia. Although this approach seems promising for formally verifying embedded software, further abstraction techniques are needed to mitigate the prevalent state-explosion problem.
2 Contribution
2.1 Status Quo
In 2004, the RWTH Aachen University started research initiatives towards a model checker for microcontroller assembly code. A first architecture was proposed in [6], and a tool named [mc]square was developed. While early versions of the tool focused exclusively on model checking, static source code analysis gradually took over a major part in [mc]square. The initial target microcontroller supported was the ATMEL ATmega family. In 2007, the Department of Embedded Systems of the University of Applied Sciences Technikum Wien established a research cooperation [10] with the RWTH Aachen University. Henceforth, the Department of Embedded Systems was actively involved in assembly code model checking research as well as in the further development of [mc]square. One of the first tasks was to extend [mc]square by an Intel MCS-51 simulator component, thus allowing a wider area of application for the toolchain. Consequently, the Intel MCS-51 simulator integration brought along significant know-how for microcontroller families that might be included in future versions of [mc]square. First research results were presented to the scientific community in the paper "Challenges in Embedded Model Checking - a Simulator for the [mc]square Model Checker" [11], presented at the Symposium on Industrial Embedded Systems (SIES) 2008. More details and a first example code verified by [mc]square using the Intel MCS-51 simulator were published in [12].
However, it would be foolhardy to state that tools such as [mc]square will ever become the holy grail of program verification. Nevertheless, they are without doubt a step in the right direction.
3 Background
"If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization." (Gerald Weinberg's Second Law)
This chapter presents (theoretical) background related to formal verification and model checking. First, the need for formal verification in the embedded systems domain is motivated by examples of famous software bugs. Next, a classification of formal verification methods is given and the term verification problem is defined. Then, the foundations of model checking are presented, the temporal logic CTL is covered, and the advantages and disadvantages of model checking are discussed. After that, the Intel MCS-51 simulator component of the [mc]square model checker is described. The chapter concludes with a summary of related work.
root cause for the loss of the spacecraft was the use of imperial units instead of metric units, leading to an erroneous trajectory computed from this incorrect data (cf. [17]).

US-Northeast blackout (2003). A massive power outage on August 14th, 2003, affected over 50 million people in the northeastern USA and eastern Canada. A previously unknown software flaw in a widely deployed energy management system contributed to the devastating scope of the blackout. The software flaw caused alarm systems to stall because of a race condition (cf. [18]).

Toyota Prius software causes stopping and stalling on highways (2005). A software bug in the Electronic Control Module (ECM) caused Toyota Prius gas-electric hybrid cars to stall or shut down while driving at highway speeds. Approximately 75,000 vehicles were affected by this software bug (cf. [19]).

Microsoft Excel multiplication bug (2007). Any multiplication evaluating to 65,535 delivers incorrect results in early versions of Microsoft Excel 2007. For instance, the multiplication of 850 by 77.1 results in 100,000 instead of the correct value of 65,535 (cf. [20]).

A1 mobile network breakdown (2008). A software problem was responsible for the breakdown of the mobile network service in October 2008, affecting nearly 500,000 customers in Lower Austria and Vienna (cf. [21]).

ÖBB train ticketing machine selling single fare tickets for 3720.80 € (2008). A single fare ticket for the domestic railway line between Hollabrunn (Lower Austria) and Handelskai (Vienna) is normally sold for 6.80 €. However, in some rare cases the fully automatic ticket machine at the platform charges the passenger 3720.80 €. This happens only if the language is changed from German to English before the ticket buying process is initiated (speaking from the author's own experience).

Even when programming rules and design guidelines are strictly followed, software is man-made and, therefore, may never be perfect. The development and use of methods that attempt to remove man-made errors is crucial to pave the way for further advances in software engineering. This is the ultimate goal of formal software verification. Hence, the formal approach to software verification may be seen as a major contributor to the correctness, reliability, and safety of present and future applications.
more than four decades [22]. Figure 3.1 gives a rough classification of formal verification methods. The main concept behind formal verification relies on the observation that computer programs can be seen as mathematical objects with well-determined behavior. Mathematical logic is used to describe the desired behavior of the computer program that is subject to verification. Formally verifying a program then means giving a mathematical proof that the program works as specified.

[Figure 3.1: Formal verification methods classification. Formal verification branches into model checking and theorem proving.]

Basically, the literature distinguishes two main areas of formal software verification approaches. The first one is rather mathematical in nature and is called theorem proving. In theorem proving, a proof of correctness is achieved through the derivation of a theorem. A short overview of theorem proving is given in [23]. However, software verification can also be achieved without explicitly establishing mathematical proofs. The more popular approach to formal verification is called model checking and is very well received in modern-day software development processes.
the observed shift from stand-alone to truly ubiquitous, pervasive, and networked safety-critical applications calls for effective methods to formally prove the correct behavior of a design. Until now, the advances in formal verification have helped to successfully verify simple programs of moderate size that are used in safety-critical applications. As recently stated by Hoare and Misra, the forthcoming challenge in the field of formal verification is to merge the elaborate theoretical understanding of computer programs with existing tools in order to enable fully automatic verification of real-life, large-scale, and complex embedded designs. In [25], Hoare and Misra proclaim the "verification grand challenge" as an international project to construct a program verifier that would use logical proof to give an automatic check of the correctness of programs submitted to it. What at first sounds out of touch with reality is, based on their assumptions, within reach in the next 20 years. In their vision, the verification grand challenge will lead to a tool that can be seen as the Swiss army knife of formal verification, solving the verification challenge for future hardware and software designs. Hoare and Misra estimated more than a thousand person-years of effort to accomplish this project. To get an idea of the complexity of such a project: the Linux Kernel v2.6, one of the world's largest software projects, started its development back in 1991, and since then the development effort has accumulated to five thousand person-years [26]. The verification grand challenge is undoubtedly an ambitious and catchy project; nevertheless, if it succeeds, it will revolutionize the way we develop (safety-critical) software and it will make essential contributions to the reliability, safety, and trustworthiness of future software developments.
In 2002, the US Department of Commerce estimated annual costs to the US economy of about 60 billion US dollars due to avoidable software errors [27]. Thus, producing error-free software is not only safer for the people using those systems, it is also highly advantageous economically. It is a long and steep way to fully automatic, formal software verification, and contributions towards this ambitious goal come piece by piece. Hence, the work put into this thesis can be seen as a small step towards Hoare and Misra's vision of fully automatic software verification.
$$M = (S, s_0, R, L)$$

$S$: finite set of states
$s_0$: initial state, $s_0 \in S$
$R$: transition relation, $R \subseteq S \times S$
$L$: interpretation function, $L : S \rightarrow 2^{AP}$

The transition relation $R$ specifies for each state which successor states are possible. $R$ is required to be total, i.e., for each state $s \in S$ there is a successor state $s' \in S$ with $(s, s') \in R$. The interpretation function $L$ labels each state with the set of atomic propositions ($AP$) that are true in that state. A path in the Kripke structure $M$ from a state $s$ is a sequence of states $\pi = s_0 s_1 s_2 \ldots$ such that $s_0 = s$ and $R(s_i, s_{i+1})$ holds for all $i \geq 0$ [28].
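To make the definition more tangible, the following minimal sketch encodes a Kripke structure as a plain Python data structure. The class name `Kripke`, its methods, and the three example states are hypothetical and chosen for illustration only; they are not part of [mc]square.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    """M = (S, s0, R, L) over a set AP of atomic propositions."""
    states: set        # S: finite set of states
    initial: str       # s0, must be an element of S
    transitions: set   # R: set of (s, s') pairs, a subset of S x S
    labels: dict       # L: maps each state to the set of APs true in it

    def successors(self, s):
        """All states s' with (s, s') in R."""
        return {t for (u, t) in self.transitions if u == s}

    def is_total(self):
        """True if every state has at least one successor."""
        return all(self.successors(s) for s in self.states)

    def is_path_prefix(self, seq):
        """True if consecutive states of seq are related by R."""
        return all((a, b) in self.transitions for a, b in zip(seq, seq[1:]))


# Hypothetical example structure with three states and AP = {p, q}.
m = Kripke(
    states={"s0", "s1", "s2"},
    initial="s0",
    transitions={("s0", "s1"), ("s1", "s2"), ("s2", "s0"), ("s2", "s2")},
    labels={"s0": set(), "s1": {"p"}, "s2": {"p", "q"}},
)
```

Since paths are infinite, `is_path_prefix` can only check a finite prefix of a path; totality of $R$ guarantees that every such prefix can be extended.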
CTL Temporal Operators
X $\varphi$: $\varphi$ holds neXt time
F $\varphi$: $\varphi$ holds sometime in the Future
G $\varphi$: $\varphi$ holds Globally in the future
$\varphi$ U $\psi$: $\varphi$ holds Until $\psi$ holds

In CTL, a temporal operator must always be preceded by a path quantifier. The suggestion of using temporal logic for reasoning about ongoing concurrent programs (reactive systems) goes back to Pnueli in 1977 [34]. This thesis focuses exclusively on CTL model checking. A survey of other temporal logics is given in [35]. A few examples of common CTL expressions are given in Figure 3.2. Another well-received temporal logic is Linear Temporal Logic (LTL). Whereas CTL considers the whole computation tree, LTL considers only individual runs of the automaton. Thus, CTL allows reasoning about the branching behavior, considering multiple possible runs at once. CTL and LTL have a large overlap, so a considerable number of properties are expressible in both temporal logics. Although they have a common superset, namely Computation Tree Logic* (CTL*), not all properties can be expressed in both logics. For instance, a property commonly known as resetability is expressed in CTL as AG (EF $\varphi$) (from any state there is always a path where $\varphi$ eventually holds) and cannot be expressed in LTL. Conversely, some LTL properties, such as A (FG $\varphi$) (along every path, there is some state from which $\varphi$ will hold forever) and fairness constraints (cf. Section 6.4.5), cannot be expressed in CTL. More on the expressiveness of CTL*, CTL, and LTL is given in [36, 29].
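The meaning of the combined operators can be illustrated with the standard fixpoint characterizations: EF $\varphi$ is the least set $Z$ with $Z = \varphi \cup \mathrm{EX}\,Z$, and AG $\varphi$ is the greatest set $Z$ with $Z = \varphi \cap \mathrm{AX}\,Z$. The sketch below computes both on an explicit state set and evaluates the resetability property AG (EF $\varphi$). The function names and the two small transition relations are hypothetical; the code is an illustration of the semantics, not the algorithm used by any particular tool.

```python
def pre_exists(R, X):
    """EX: states with at least one successor in X."""
    return {s for (s, t) in R if t in X}


def pre_forall(S, R, X):
    """AX: states all of whose successors are in X."""
    return {s for s in S if all(t in X for (u, t) in R if u == s)}


def ef(S, R, P):
    """EF p: least fixpoint of Z = P | EX Z."""
    Z = set(P)
    while True:
        new = Z | pre_exists(R, Z)
        if new == Z:
            return Z
        Z = new


def ag(S, R, P):
    """AG p: greatest fixpoint of Z = P & AX Z."""
    Z = set(P)
    while True:
        new = Z & pre_forall(S, R, Z)
        if new == Z:
            return Z
        Z = new


S = {0, 1, 2}
P = {0}                                   # states in which p holds (assumed)

# State 2 is a trap: once entered, state 0 can never be reached again.
R_trap = {(0, 1), (1, 0), (1, 2), (2, 2)}
# Same system, but state 2 leads back to state 0.
R_reset = {(0, 1), (1, 0), (1, 2), (2, 0)}

resettable_trap = ag(S, R_trap, ef(S, R_trap, P))
resettable_reset = ag(S, R_reset, ef(S, R_reset, P))
```

In the first system, no state satisfies AG (EF p), because every state can reach the trap; in the second, all states satisfy it.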
[Figure 3.2: Examples of common CTL expressions: (a) AF p (finally p), (b) AG p (globally p), (c) AX p (next p), (d) A p U q (p until q), (e) EF p (finally p), (f) EG p (globally p), (g) EX p (next p), (h) E p U q (p until q).]
[Figure 3.3: The model checking workflow. A system model $M$ and a system property $\varphi$ are passed to the model checker, which decides $M \models \varphi$ and returns either a notification (yes) or a counterexample (no).]

Proving a certain property is performed by determining the truth of formulas in certain system states. In order to apply model checking, one needs a modeling language in which the system is described, a notation for the formulation of properties, and algorithms to step through the state space. As shown in Figure 3.3, a typical model checking workflow is composed of three major steps:

(i) Define a formal model of the system that is subject to verification by creating a model of the system in a language that fits the model checker's input language. Those modeling languages are usually tightly coupled to the model checker itself, such as the Process or Protocol Meta Language (PROMELA) used by the SPIN model checker [37]. System modeling usually involves abstraction (see Section 4.1), i.e., simplifying the original system. System modeling focuses on the main properties in order to better manage the system complexity.

(ii) Provide a particular system property that should be proved. In other words, a question about the system behavior is formulated that should be answered by the model checker. The system property is usually derived from the specification and given in a temporal logic.

(iii) Invoke the model checking tool and receive a notification whether the given system property was fulfilled or not. If the system property could not be verified, a counterexample is generated that pinpoints the source of the error in the system model.
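The three steps can be sketched for the simplest kind of property, an invariant of the form AG p: the model is given as an initial state plus a successor function, the property as a predicate on states, and the checker either confirms the property or returns a counterexample trace. The function `check_invariant` and the toy counter model are hypothetical and only illustrate the workflow; real model checkers support full temporal logics and far more elaborate state storage.

```python
from collections import deque


def check_invariant(initial, successors, holds):
    """Breadth-first search over reachable states.

    Returns (True, None) if `holds` is true in every reachable state,
    otherwise (False, trace) with a shortest trace to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        if not holds(s):
            trace = []
            while s is not None:        # walk back to the initial state
                trace.append(s)
                s = parent[s]
            return False, trace[::-1]
        for t in successors(s):
            if t not in parent:         # visit each state only once
                parent[t] = s
                queue.append(t)
    return True, None


# Step (i): a toy model, a counter that wraps around after 3.
def step(n):
    return [(n + 1) % 4]


# Step (ii): two properties, one violated and one satisfied.
# Step (iii): run the check and inspect the answer.
ok_bad, trace_bad = check_invariant(0, step, lambda n: n != 3)
ok_good, trace_good = check_invariant(0, step, lambda n: n < 4)
```

For the violated property the returned trace is the sequence of states leading from the initial state to the violation, which is exactly the role of the counterexample in Figure 3.3.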
$M = (S, s_0, R, L)$
$S := \{S1, S2, S3\}$
$s_0 := S1$
$R := \{(S1, S2), (S2, S1), (S2, S3), (S3, S1), (S3, S3)\}$
$L(S1) = \{coin, brew, selection\}$
$L(S2) = \{coin, brew, selection\}$
$L(S3) = \{coin, brew, selection\}$
As noted in Section 3.4.1, model checking is based on a graph search; therefore, the transition system is transformed into computation paths. This is done by unwinding the Kripke structure to obtain a computation tree, as shown in Figure 3.4(b). Most model checkers expect system properties given in some temporal logic. For the coffee vending machine, meaningful system properties might be:

Coffee is brewed after a selection was made.
Coffee is brewed sometime.

These properties can be written in CTL as:

AG[selection $\rightarrow$ brew]: Whenever a selection is made, coffee is brewed for sure.
EF[brew]: There is a state where coffee is brewed.
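The example can be replayed with a few lines of code. The sketch below uses the states and the transition relation given above; the labeling, however, is an assumption made for illustration (coin is taken to hold in S2 and S3, selection and brew only in S3), and the helper names are hypothetical. Because both properties only involve a single temporal operator over state predicates, they can be evaluated directly on the set of states reachable from S1.

```python
R = {("S1", "S2"), ("S2", "S1"), ("S2", "S3"), ("S3", "S1"), ("S3", "S3")}

# Assumed labeling: the set of atomic propositions that are true per state.
LABELS = {
    "S1": set(),
    "S2": {"coin"},
    "S3": {"coin", "selection", "brew"},
}


def successors(s):
    return sorted(t for (u, t) in R if u == s)


def reachable(s0):
    """All states reachable from s0 (including s0)."""
    seen, stack = {s0}, [s0]
    while stack:
        for t in successors(stack.pop()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def unwind(s, depth):
    """Computation tree rooted at s, cut off after `depth` transitions."""
    if depth == 0:
        return (s, [])
    return (s, [unwind(t, depth - 1) for t in successors(s)])


states = reachable("S1")

# AG[selection -> brew]: the implication holds in every reachable state.
ag_selection_implies_brew = all(
    "selection" not in LABELS[s] or "brew" in LABELS[s] for s in states
)

# EF[brew]: some reachable state is labeled with brew.
ef_brew = any("brew" in LABELS[s] for s in states)

tree = unwind("S1", 2)
```

Under the assumed labeling both properties hold. The value of `tree` is the computation tree of the machine cut off after two transitions; the full tree is infinite because of the cycles in $R$.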
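[Figure 3.4: Coffee vending machine example with the states S1, S2, and S3 and the transition labels "insert coin" and "select"; panel (b) shows the computation tree obtained by unwinding the structure.]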
debugging, such an error trace is profoundly advantageous, since the counterexample gives a complete insight into the system's behavior. Nevertheless, all that glitters is not gold. The broad application of model checking in industry is taking place quite slowly, mainly because of its three major disadvantages:

The state-explosion problem. The main challenge in model checking is to cope with the problem of state explosion. In general, a model checker aims to enumerate and analyze the set of states a system may ever reach. The overall number of system states, even when dealing with small systems, is often too large to be handled with reasonable computing resources. Peled [40] summarizes effective strategies for fighting state explosion and proposes a combination of Binary Decision Diagrams (BDD; see footnote 2), Partial Order Reduction (POR), and Symmetry. More details are also given by Clarke et al. in [28].

Reported errors may be false negatives [40]. Model checking requires, as the name implies, modeling of the system. In order to alleviate the state-explosion problem, abstraction is needed (cf. Section 4.1). Thus, the program that is verified may not be the original one. Consequently, if model checking reports a property violation in the abstracted model of the system, one has to make sure that the error is indeed a real one, i.e., that it can be reproduced on the real target platform. The process of checking the counterexample on the real system is often carried out manually. False negatives arise from the differences between an actual system's behavior and the behavior represented by the abstracted model. Manually ruling out false negatives is time-intensive and itself an error-prone task. Therefore, a major future challenge for the model checking community may be the automated elimination of false negatives. A more detailed discussion on how to overcome the problem of false negatives is carried out in Section 7.1.

Model checking can only verify a given specification. Thus, an important point is the completeness of the specification. It is challenging to make sure that the specification covers all properties that the system should satisfy and to establish a one-to-one match between a given textual specification and the derived formal specification.
Footnote 2: A BDD or a Propositional Directed Acyclic Graph (PDAG) is a data structure that is used to represent a Boolean function. It can be seen as a compressed representation of sets.
standardized programming languages with so-called microcontroller-specific extensions. These additional language features allow the engineer to enable or disable interrupts, read data from and write data to peripheral units, use special data types, invoke additional hardware blocks, etc. Not surprisingly, model checking of high-level descriptions often fails to meet the needs of verifying embedded systems code. Fortunately, model checking and static analysis of assembly code have gained attention in recent research projects [42, 7, 8]. Formal verification based on assembly code has significant advantages over model checking of high-level system models. The code that is deployed to the hardware is verified, and not just an intermediate representation. A compiler, which is a highly complex piece of software itself, translates the high-level code to microcontroller instructions. In most approaches to embedded code verification, a high-level behavior of the system is analyzed, but there is no cross-check verifying whether the behavior of the model remains unchanged after code compilation. Thus, model checking of assembly code can detect errors introduced during the whole development process, including compiler errors, toolchain errors, wrong periphery setup, and errors not visible in the C code at all.

[Figure 3.5 (diagram): the assembly source code or hex file is passed to the [mc]square model checker, which derives the system model M and checks it against the system property (M |= ?); the outcome is either "yes" (notification) or "no" (counterexample).]
Figure 3.5: The model checking workflow of the [mc]square approach (cf. Figure 3.3).

With [mc]square (Model Checking for Micro Controllers), the Department of Computer Science XI of the Technical University of Aachen developed a model checker that is precisely tailored for formal verification in the context of microcontrollers. [mc]square is an explicit, timeless, CTL-based assembly code model checker and features model checking and static source code analysis of software written for embedded targets. Supported target platforms are the ATMEL ATmega series [9], the Intel MCS-51 [11], the Infineon XC16x [43], and Programmable Logic Controllers (PLCs) [44]. The [mc]square model checker uses an accurate and customized Central Processing Unit (CPU) simulator to automatically derive the system model from an implementation. Thus, the manual and often error-prone process of model creation is shifted from the test engineer to the implementation of the verification tool. This leads to the revised model checking workflow shown in Figure 3.5. In the following, a high-level introduction to the C51Simulator component of [mc]square is given, and only those parts of the model checker that are relevant for this thesis are discussed. More details on assembly code model checking and the tool [mc]square are given by Schlich in [9].
- Two 16 bit timer units
- Full-duplex Universal Asynchronous Receiver Transmitter (UART)
- Five different interrupt sources and two levels of interrupt priorities
- 256 different instructions
- Five different addressing modes
- The majority of instructions are executed within 12 system clock cycles

Registers as well as I/O ports are memory mapped and are therefore accessed like any other memory location. The stack is located within the IRAM area and grows towards higher data memory addresses. A particular and powerful architecture feature is the bit-manipulating capability of the CPU. Single bits can be set, cleared, or involved in other logical calculations. Four separate register banks are located at the bottom of the IRAM, occupying the first 32 bytes of data memory. Register banks are switched by modifying two dedicated register bank selection bits within the Program Status Word (PSW). 21 Special Function Registers (SFRs) allow the configuration of peripherals. A few of them are bit-addressable, some are only byte-addressable, and some can be accessed in either mode.

Instruction Set
The instruction set covers 256 different instructions, hence resulting in 8 bit wide opcodes. Owing to the CISC architecture, instructions are one, two, or three bytes long. They can be separated into five groups: logical, arithmetic, program branching, data transfer, and Boolean operations.

Supported Addressing Modes
Data and program memory are accessed by one of the five available addressing modes:

- Immediate addressing is used whenever the source operand is a constant value rather than a variable. The constant value can either be included as a single byte in the instruction or be derived from the opcode itself.
- Direct addressing is used for accessing any IRAM location, including SFRs.
- Indirect addressing uses the registers R0 or R1 from the active register bank as base registers. The value stored in these registers indicates an address in IRAM where data should be read from or written to. Any pointer makes use of indirect addressing.
- Extended direct addressing is basically the same as direct addressing, but it is used to access additional external memory locations rather than IRAM locations.
- Indirect from program memory enables reading from program memory.

The interested reader is referred to the Intel MCS-51 datasheet [46] for more details on the architecture and the instruction core.
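[Figure 3.6 (diagram): the components shown include the program parser, state space, static analyzer, counterexample generator, CTL property, CTL parser, and the simulator components (e.g., PLC, AVR).]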
Figure 3.6: The [mc]square framework.

As shown in Figure 3.6, the [mc]square framework provides a full CTL model checker, a counterexample generator, a comfortable Graphical User Interface (GUI), and an assembly code static analyzer. Whenever [mc]square needs data generated by the hardware, the respective simulator component is invoked. A useful side effect of the simulator-based approach is a full CPU simulator, which allows the user to analyze and debug the code prior to model checking. Notably, [mc]square offers a new way of analyzing microcontroller programs that differs considerably from standard COTS tools, since the simulation covers the whole state space of the application.
care must be taken in verifying the implementation of the simulator component. This is achieved by verifying the actual implementation against available Intel MCS-51 simulators such as the Keil µVision debugger or µCSim, which is included in the Small Device C Compiler (SDCC) [48] toolchain. The conceptual test approach is shown in Figure 3.7. A test pattern file is loaded into both simulators, and each instruction is executed independently by the two simulators. After the execution, the whole memory area of both simulators is dumped into separate files, and these files are compared against each other. More on the applied test and verification strategy is given in [45].
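[Figure 3.7 (diagram): a test pattern file (e.g., MOV PSW,#0; MOV A,#10; MOV R0,#10; ADD A,R0; LJMP FAILED) is executed on both simulators and the memory dumps are compared. If they match, the instruction is verified; if not, troubleshooting follows.]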
[Figure (diagram): components of the C51Simulator: interface to [mc]square, memory model, Splitter, and Determinizer.]
Instruction Set Core
A basic, straightforward implementation of the semantics of the opcodes supported by the microcontroller, as defined in the corresponding datasheet [46].
Memory Model
The memory model acts as a representation of the Intel MCS-51 data and program memory. As described in [49], [mc]square uses abstraction techniques that center around the idea of a 3-valued memory representation. Such a ternary memory representation allows certain memory locations to be marked as unknown in order to avoid the creation of unneeded successor paths. For this reason, the memory model requires shadow memory to indicate whether the actual value is known. Consequently, the simulator manages two blocks of memory. As shown in Table 3.1, every byte of memory is represented by its actual value and a second byte serving as a mask that indicates whether or not a certain bit is deterministic (bits with the value Nondeterministic (ND) are indicated by a *).

Location   Binary value   ND-mask      Ternary value
@ 0x0A     b 11110000     b 00001100   1111**00
@ 0x0B     b 00001111     b 11110000   ****1111
@ 0x0C     b 10101010     b 01010101   1*1*1*1*
@ 0x0D     b 00000000     b 01100110   0**00**0
@ 0x0E     b 00110011     b 00000000   00110011
@ 0x0F     b 01010101     b 11111111   ********

Table 3.1: Memory representation in [mc]square.

More on the benefits of this 3-valued memory representation is given in [13, 9] and in Section 4.1.3.
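The relation between the binary value, the ND-mask, and the ternary value in Table 3.1 can be sketched in a few lines of Python. This is only an illustration of the encoding, not the memory model of the C51Simulator; the function name ternary and the choice of Python are assumptions made for this sketch.

```python
def ternary(value: int, nd_mask: int, width: int = 8) -> str:
    """Render a byte and its ND-mask as a ternary string ('*' marks ND bits).

    Hypothetical helper for illustration; not part of [mc]square.
    """
    chars = []
    for bit in range(width - 1, -1, -1):       # most significant bit first
        if (nd_mask >> bit) & 1:
            chars.append("*")                  # bit is nondeterministic
        else:
            chars.append(str((value >> bit) & 1))
    return "".join(chars)


# Rows of Table 3.1: (location, binary value, ND-mask)
TABLE_3_1 = [
    (0x0A, 0b11110000, 0b00001100),
    (0x0B, 0b00001111, 0b11110000),
    (0x0C, 0b10101010, 0b01010101),
    (0x0D, 0b00000000, 0b01100110),
    (0x0E, 0b00110011, 0b00000000),
    (0x0F, 0b01010101, 0b11111111),
]

for location, value, nd_mask in TABLE_3_1:
    print(f"@ 0x{location:02X}  {ternary(value, nd_mask)}")
```

Note that the value bits underneath an ND bit carry no information; only the mask decides whether a bit is shown as known.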
Splitter
At certain points in the model checking flow, it is necessary to evaluate predicates over memory locations in order to prove a given specification. Thus, if a memory location involved in the formula is marked as ND, there must be a mechanism to expand ND memory locations into every possible value combination resulting from the ND bits. That is exactly what the Splitter is used for. The actual implementation of the Splitter can become quite tricky and complex; one of the main reasons is the variety of addressing modes supported by the respective target hardware. A few straightforward examples are given in Table 3.2.

Location   Ternary value   Value combinations
@ 0x0A     1111**00        2^2 = 4
@ 0x0B     ****1111        2^4 = 16
@ 0x0F     ********        2^8 = 256

Table 3.2: Examples of value combinations generated by the Splitter.
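The expansion performed by the Splitter can be illustrated with a minimal sketch that enumerates every concrete byte represented by a (value, ND-mask) pair. The function name split is hypothetical, and the sketch ignores addressing modes, which the text names as the main source of complexity in the real implementation.

```python
def split(value: int, nd_mask: int, width: int = 8) -> list:
    """Enumerate all concrete values represented by a ternary byte.

    Illustrative only: one result per combination of the ND bits.
    """
    nd_bits = [b for b in range(width) if (nd_mask >> b) & 1]
    base = value & ~nd_mask & ((1 << width) - 1)   # keep deterministic bits only
    results = []
    for combo in range(1 << len(nd_bits)):         # 2^(number of ND bits) cases
        concrete = base
        for i, bit in enumerate(nd_bits):
            if (combo >> i) & 1:
                concrete |= 1 << bit
        results.append(concrete)
    return sorted(results)


# The three ternary values used as Splitter examples in the text
print([hex(v) for v in split(0b11110000, 0b00001100)])
print(len(split(0b00001111, 0b11110000)))
print(len(split(0b01010101, 0b11111111)))
```

The number of results doubles with every ND bit, which is why the Splitter should only be invoked for locations that actually occur in the formula.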
Determinizer
The Determinizer is, in principle, the decision-making part that acts whenever the C51Simulator has to resolve nondeterministic behavior. For any given state in the state space, it is capable of generating all possible successor states. Furthermore, the Determinizer takes over the proper handling of interrupts and branches to Interrupt Service Routines (ISRs).

Interface
A slim interface connects the C51Simulator to the [mc]square model checker as well as to the GUI.
Estes model checks assembly code for the 68HC11 microcontroller, constructing the state space either with a simulator or with real hardware using the GNU debugger. In practice, this approach is only feasible for small programs. Constructing the state space for the model via the hardware takes time (unless dedicated hardware support is provided). Furthermore, using an out-of-the-box simulator/debugger to construct the model restricts optimizations that minimize the state space. MCESS, in contrast, translates the assembly code of ATMEL ATmega 16 microcontrollers into hardware-independent byte-code for a specific virtual machine that is able to check properties given in LTL. However, due to this approach, most hardware issues are abstracted rather coarsely, eventually removing essential information, which may invalidate the entire verification process. Unlike these approaches, [mc]square constructs the model with special, tailored simulators for microcontrollers.
4 Abstraction Techniques
All the world is an abstract interpretation (of all the world). (David Schmidt)
In this chapter, the concept of abstraction is introduced and the need for abstraction in model checking is emphasized. First, the terms over-approximation and under-approximation are explained. Next, a thought experiment is conducted, showing the exponential connection between the state space size and the amount of data memory of a microcontroller. Then, nondeterministic behavior in assembly code model checking is addressed and a 3-valued memory model is presented. Finally, three state space abstraction techniques and their actual implementation in the Intel MCS-51 simulator component are described.
Bisimulation refers to a relation between state transition systems, associating systems which behave in the same way in the sense that one system simulates the other and vice-versa.
Abstraction is usually based on using additional human knowledge through manual or semiautomatic tools. Applying abstraction is challenging and usually a walk on a thin line between sound results and missing important properties in the abstracted model. The literature defines the term over-approximation for system models containing more information than needed and, as a counterpart, the term under-approximation for system models lacking important system properties one is interested in (cf. Figure 4.1).

[Figure 4.1 (diagram): exact world, over-approximation, under-approximation.]
4.1.2 Turing's Halting Problem and Why Model Checking Works Anyway
Alan Turing first proved that there is no general way of deciding, once a computer has started a calculation, whether that calculation will terminate. In other words, it is not decidable whether a Turing Machine [24, 40] will come to a halt given a particular program input. The problem is known as the Halting Problem for Turing Machines and was first discussed in 1936 [24]. For the field of software verification, the halting problem means that it is in general not possible to write a program that automatically checks another program given as input parameter. Thus, the halting problem is the foundation for the mathematical fact that, in general, verification of a program is undecidable. More on the limits of what can be decided by an algorithm is given by the theory of computability [82].

A legitimate question that now arises is why formal program verification is still gaining tremendous attention in recent research [32], and why even commercial tools are celebrating great achievements in the field of automatic program verification, when Alan Turing already proved back in 1936 that all those problems are in general undecidable. The computers that we use today are not comparable to Turing Machines. A Turing Machine is a mathematical model that uses a linear tape as a storage device. The tape is divided into cells, and each cell is labeled by a symbol from a given alphabet. The tape has a fixed left end and is infinite on the right. A single cell on the tape corresponds to a register in main memory within modern-day computers. Whereas the storage device of a Turing Machine has infinite capacity (due to the infinite tape), memory is always limited in conventional computers, especially in embedded systems. It follows that a Turing Machine can reside in an infinite number of distinct system states. This is not true for conventional computers: since physical memory is always limited, the number of system states is finite. Therefore, program code that runs on conventional computers can be described by a Finite State Machine (FSM). An FSM has a finite number of states and a finite number of transitions between those states. The upper limit of possible states is defined by all possible register and memory configurations. The transitions depend on the underlying hardware architecture. It is even possible to generate a finite state graph for all possible programs that may run on the computer. Each program would have a different entry node in the state graph. Depending on the current instruction of the program, which represents the transitions, it is possible to follow the graph in order to observe the behavior intended by the program. As one can imagine, those (complete) state graphs are huge, even though their generation is theoretically possible. In summary, the undecidable Halting Problem for Turing Machines is reduced to a decidable one for real-life computer systems with limited memory, since the focus lies on:

- model checking of finite state machines, i.e., finite state reactive systems
- propositional temporal logics to describe properties of the FSM model

Nevertheless, model checking of assembly code remains a tough job, mainly because of the state-explosion problem. To illustrate the state-explosion problem, a thought experiment is conducted. Imagine an ordinary microcontroller featuring a read-only program memory and a read-write data memory. Each memory location is 8 bit wide. Table 4.1 shows the relation between the number of data memory bytes and the resulting number of states the system may reside in. It is evident that the resulting state space is exponential in the data memory size.

Data memory size   Resulting system states (state space size)
1 byte             256^1  = 256
2 byte             256^2  = 65536
3 byte             256^3  = 16777216
4 byte             256^4  = 4294967296
5 byte             256^5  = 1099511627776
6 byte             256^6  = 281474976710656
7 byte             256^7  = 72057594037927936
8 byte             256^8  = 18446744073709551616
16 byte            256^16 = 340282366920938463463374607431768211456
Table 4.1: Data memory size and resulting system states.

In fact, for the Intel MCS-51 target, the IRAM is composed of 256 bytes of memory, leading to approximately 256^256 possible system configurations. Under the spell of Moore's Law (the number of transistors that can be placed inexpensively on an integrated circuit doubles every two years [83]), the number of possible system configurations increases tremendously with every new microcontroller family. As the presented examples make clear, even for tiny systems with only a few bytes of memory the number of possible system states is enormous, which calls for the use of abstraction in order to alleviate the state-explosion problem.
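The figures of the thought experiment can be reproduced with arbitrary-precision integer arithmetic. The following Python sketch is only a worked calculation; the function name state_space_size is an assumption made for this illustration.

```python
def state_space_size(data_bytes: int, bits_per_byte: int = 8) -> int:
    """Upper bound on the number of system states for a given data memory size."""
    return (2 ** bits_per_byte) ** data_bytes


# Rows of the thought experiment: 1 to 8 bytes and 16 bytes of data memory
for size in (1, 2, 3, 4, 5, 6, 7, 8, 16):
    print(f"{size:2d} byte: 256^{size} = {state_space_size(size)}")

# 256 bytes of IRAM: print only the number of decimal digits of 256^256
iram_states = state_space_size(256)
print("decimal digits of 256^256:", len(str(iram_states)))
```

Each additional byte multiplies the bound by 256, which is the exponential growth stated in the text.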
    MOV 0x20, P0
    MOV 0x21, P1

Listing 4.1: Assembly code excerpt.

Following the idea of explicit state space generation reveals that the two assembler instructions shown in Listing 4.1 generate altogether 256 · 256 = 65536 successor states. The considerable number of successors originates from the immediate instantiation of nondeterministic values contained in I/O ports. The two MOV instructions are stored successively in the program memory. The value of P0 is unknown, and therefore the model checker creates 256 successor states to remove the uncertainty concerning the actual value of the port. Afterwards, the second MOV instruction is executed. Each of the 256 successors then creates a further 256 successors for the instantiation of P1, whose actual value is unknown too. Environment information is not present; hence, all 65536 successors are created in order to cover all conceivable situations. Let us consider immediate successor creation from a different point of view. Suppose that the claim that is subject to verification includes statements neither over memory location 0x20 nor over 0x21. In this case, there is no need to create successor states. It is sufficient to find a mechanism to mark certain bit positions whose value is unknown and which can thus be read as ND.
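The difference between immediate instantiation and marking bits as ND can be made concrete with a small sketch. It models a state as a dictionary from address to content; the function names and the (value, ND-mask) tuple layout are assumptions for this illustration and do not reflect the internal state representation of [mc]square.

```python
from itertools import product


def explicit_states(targets):
    """Immediate instantiation: one state per combination of port values."""
    return [dict(zip(targets, values))
            for values in product(range(256), repeat=len(targets))]


def lazy_state(targets):
    """ND marking: a single state; each byte is (value, nd_mask) with all bits ND."""
    return {address: (0x00, 0xFF) for address in targets}


targets = (0x20, 0x21)                 # destinations of the two MOV instructions
print(len(explicit_states(targets)))   # number of explicit successor states
print(lazy_state(targets))             # one state covering all of them
```

As long as the property under verification does not refer to 0x20 or 0x21, the single lazy state can stand in for all explicit ones.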
To that end, a 3-valued logic approach for modeling the microcontroller memory is used. Whereas binary logic is composed of elements that are valued in the set {0, 1}, i.e., each value is either true or false, 3-valued logic or ternary logic [58] is defined in [84] as follows: Ternary logic is a system whose elements, called statements, are valued in the set {0, 1, 2}. If x is a statement (see footnote 3), the value of x can be interpreted as a mapping to {0, 1, 2} such that:

(4.1)

In the remainder of this thesis, the term ND is used for the first line of the semantic representation stated in Equation 4.1. Ternary logic is well known in hardware description languages such as VHDL or Verilog, where it represents unknown values of, e.g., input circuit latches or uninitialized memory locations. Synthesis tools use this ND representation to reveal design errors, which the designer can correct before synthesis towards an actual circuit. From the state space view, the 3-valued memory representation introduces a certain type of states, namely lazy states. A lazy state combines both explicit and symbolic parts of the state space (see footnote 4). Any state including memory locations marked as ND is called a lazy state. Consequently, a single lazy state represents a set of explicit states. A lazy state and the corresponding nondeterministic state space representation are shown in Figure 4.2.
[Figure 4.2 (diagram): a lazy state and the explicit states it represents, drawn over the levels S(n), S(n+1), and S(n+2) of the state space.]
Footnote 3: Note that in our approach a statement refers to a single bit location within the IRAM of the microcontroller.
Footnote 4: [mc]square still uses explicit model checking algorithms.
Table 4.3: Memory contents before and after the MOV instruction.

To illustrate the concept of Delayed Nondeterminism, the instruction MOV [0xA, 0xB] is considered. With the Delayed Nondeterminism approach, whenever the C51Simulator executes a MOV instruction, it does not only copy the value from 0xB to 0xA, as one would expect when reading the defined instruction semantics in the datasheet; rather, it copies both the actual value and the corresponding ND-mask. Hence, the generation of multiple successors is avoided by delaying the instantiation of the involved and possibly nondeterministic memory location 0xB. This procedure is documented in Table 4.3 and illustrated in Figure 4.3.

[Figure 4.3 (diagram): the value and the ND-mask are both copied from the source to the destination location.]

Figure 4.3: The Delayed Nondeterminism approach of handling the MOV [0xA, 0xB] instruction.

A case study by Noll and Schlich [49] revealed the effect of Delayed Nondeterminism for different program configurations, showing a possible state space reduction of 70% and above. Nevertheless, actual savings due to Delayed Nondeterminism depend on various factors, such as the source code structure and the targeted hardware.
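The copy semantics of Delayed Nondeterminism for a MOV between two memory locations can be sketched as follows. Memory is modeled as a dictionary from address to a (value, ND-mask) pair; both function names are hypothetical, and the immediate variant is shown only for comparison.

```python
def mov_delayed(memory: dict, dst: int, src: int) -> list:
    """MOV dst, src under Delayed Nondeterminism: copy value and ND-mask."""
    successor = dict(memory)
    successor[dst] = memory[src]        # the (value, nd_mask) pair is copied as a whole
    return [successor]                  # exactly one successor state


def mov_immediate(memory: dict, dst: int, src: int) -> list:
    """MOV dst, src with immediate instantiation of all ND bits of the source."""
    value, nd_mask = memory[src]
    nd_bits = [b for b in range(8) if (nd_mask >> b) & 1]
    successors = []
    for combo in range(1 << len(nd_bits)):
        concrete = value & ~nd_mask & 0xFF
        for i, bit in enumerate(nd_bits):
            if (combo >> i) & 1:
                concrete |= 1 << bit
        successor = dict(memory)
        successor[src] = (concrete, 0x00)   # source is now deterministic
        successor[dst] = (concrete, 0x00)
        successors.append(successor)
    return successors


# 0x0B holds the ternary value ****1111, 0x0A is deterministic
memory = {0x0A: (0x00, 0x00), 0x0B: (0x0F, 0xF0)}
print(len(mov_delayed(memory, 0x0A, 0x0B)))
print(len(mov_immediate(memory, 0x0A, 0x0B)))
```

The delayed variant defers the expansion until a later instruction or the checked formula actually needs a concrete value.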
Delayed Nondeterminism with Look Ahead can be seen as an extension of Delayed Nondeterminism. Delayed Nondeterminism loses its advantage when dealing with logic operations, since the straightforward approach of copying ND-masks, as described in Section 4.2.1, can no longer be applied. The main idea of the Delayed Nondeterminism with Look Ahead approach to state space reduction is to take the semantic relations of the instructions into account. Delayed Nondeterminism with Look Ahead centers around the coherence among the Boolean operators ∨, ∧, and ¬ with particular regard to 3-valued logic. The relevant relations are summarized in Table 4.4.

A       B       A ∨ B   A ∧ B   ¬A
true    true    true    true    false
true    ND      true    ND      false
true    false   true    false   false
false   true    true    false   true
false   ND      ND      false   true
false   false   false   false   true
ND      true    true    ND      ND
ND      ND      ND      ND      ND
ND      false   ND      false   ND

Table 4.4: Truth table for 3-valued logic.

Embedded systems code is tightly coupled to the environment of the microcontroller. Analog and digital values are read from sensors, giving the application the possibility to react to changes in the environment. Therefore, reading data over the I/O ports of the microcontroller is essential for embedded applications. Since reading unknown values from the environment is one of the major contributors to the state-explosion problem, special care has to be taken to avoid the generation of needless successor states from the very beginning. Delayed Nondeterminism with Look Ahead tackles the problem right at the point where data is read from the I/O ports. Reading values over the microcontroller I/O often involves bitwise operations performed with bit masks, since ports can either be accessed byte-wise only, or the application is only interested in a certain number of bits, e.g., the lower nibble of an 8 bit wide port. Bit masking, or bit twiddling, is a common way to perform individual operations on single bits. A summary of the most common usages of bitmasks is given in Table 4.5. Compilers translate those bit-twiddling statements from the high-level language into logic operations supported by the microcontroller's instruction set. In the following, a simple example is presented to explain the idea of Delayed Nondeterminism with Look Ahead.

Example
The C code in Listing 4.2 represents typical (low-level) embedded code. In what follows, this code excerpt is used to discuss the concept of Delayed Nondeterminism with Look Ahead. The code reads the value of the 8 bit wide I/O port, termed Port1, and uses a bitmask to extract the upper two bits of the I/O port.
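The relations of Table 4.4 can be written down as three small functions. In this sketch, ND is encoded as Python's None, which is purely an assumption of the illustration; the function names are likewise hypothetical.

```python
ND = None  # assumed encoding of the third truth value for this sketch


def t_and(a, b):
    """3-valued AND: false dominates, otherwise ND is contagious."""
    if a is False or b is False:
        return False
    if a is ND or b is ND:
        return ND
    return True


def t_or(a, b):
    """3-valued OR: true dominates, otherwise ND is contagious."""
    if a is True or b is True:
        return True
    if a is ND or b is ND:
        return ND
    return False


def t_not(a):
    """3-valued NOT: the negation of ND stays ND."""
    return ND if a is ND else (not a)


# Print the nine rows in the order used by Table 4.4
for a in (True, False, ND):
    for b in (True, ND, False):
        print(a, b, t_or(a, b), t_and(a, b), t_not(a))
```

The rows in which one operand is ND and the result is nevertheless known (false ∧ ND, true ∨ ND) are the ones that Delayed Nondeterminism with Look Ahead exploits.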
Usage                         Op                      Operand    Mask       Example (result)
Setting bits to 0             y &= ~(1 << pos);       01101110   11110111   01100110
Setting bits to 1             y |= (1 << pos);        10010101   00001000   10011101
Toggling a bit                y ^= (1 << pos);        10011101   00001000   10010101
Testing a bit                 y = x & (1 << pos);     00011101   00001000   00001000
Extracting the lower nibble   y = x & 0x0F;           10011101   00001111   00001101
Extracting the upper nibble   y = x & 0xF0;           10011101   11110000   10010000

Table 4.5: Most common usages of bitmasks.
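The example values given for the bitmask usages of Table 4.5 can be checked with a few one-line helpers. The helper names and the bit position pos = 3 (which corresponds to the mask 00001000) are assumptions made for this sketch; results are limited to 8 bits to mimic a byte-wide register.

```python
def clear_bit(x: int, pos: int) -> int:
    return x & ~(1 << pos) & 0xFF      # AND with the inverted single-bit mask


def set_bit(x: int, pos: int) -> int:
    return x | (1 << pos)              # OR with the single-bit mask


def toggle_bit(x: int, pos: int) -> int:
    return x ^ (1 << pos)              # XOR with the single-bit mask


def masked_bit(x: int, pos: int) -> int:
    return x & (1 << pos)              # non-zero if the bit is set


def lower_nibble(x: int) -> int:
    return x & 0x0F


def upper_nibble(x: int) -> int:
    return x & 0xF0


POS = 3  # assumed bit position, matching the mask 00001000
print(f"{clear_bit(0b01101110, POS):08b}")
print(f"{set_bit(0b10010101, POS):08b}")
print(f"{toggle_bit(0b10011101, POS):08b}")
print(f"{masked_bit(0b00011101, POS):08b}")
print(f"{lower_nibble(0b10011101):08b}")
print(f"{upper_nibble(0b10011101):08b}")
```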
 1  unsigned char readValueFromIO(void) {
 2      /* read port value */
 3      value = Port1;
 4      /* mask the upper two bits */
 5      value &= 0xC0;
 6      return value;
 7  }
 8
 9  int main(void) {
10      while (1) {
11          readValueFromIO();
12          /* do something */
13      }
14  }
Listing 4.2: Embedded C code example program for the Intel MCS-51 target.

The source code line 5 in Listing 4.2 is mapped by the compiler (Keil C51 Compiler V8.01) to the opcode 0x53, representing the instruction ANL direct,#immediate. The ANL instruction performs a bitwise AND of the internal memory location (0x12) with the immediate value (#0xC0): a bit in the resulting byte is set only if it is set in both operands; otherwise, it is cleared.
    MOV 0x12, 0x90
    ANL 0x12, #0xC0

Listing 4.3: Translated assembly code for source code lines 4-5 of Listing 4.2.
In the following, the effect of the assembly code in Listing 4.3 on the abstraction techniques Delayed Nondeterminism and Delayed Nondeterminism with Look Ahead is discussed and compared. Delayed Nondeterminism helps to avoid generating successor states for the initial MOV instruction by simply copying the actual value as well as the ND-mask from memory location 0x90 to memory location 0x12 (cf. Figure 4.3). Reading from the environment always leads to a fully nondeterministic read, since environment information is not present. However, Delayed Nondeterminism forces us to determinize the involved memory locations (i.e., to create all possible successors) in preparation for the following ANL instruction. The variable value is unknown; thus, all 8 bits are marked as ND, and the model checker invokes the simulator to generate all possible successors arising from this uncertainty. The number of successor states is easily calculated and results in 2^8 = 256 states. Consequently, Delayed Nondeterminism leads to a wide branch in the computation tree, which has a negative impact on the state space and makes the state-explosion problem even worse. This scenario is depicted in Figure 4.4, showing the total number of 256 successor states generated.
[Figure 4.4 (diagram): a single state at level S(n) branches into 256 successor states S0 to S255 at level S(n+1), one for each value of the variable value from 0x00 (b 00000000) to 0xFF (b 11111111).]
Figure 4.4: The state-explosion problem.

However, the described approach results in a valid over-approximation (cf. Section 4.1.1) by replacing the ND value of memory location value with actual values (one at a time) and performing the ANL afterwards. Nevertheless, this approach does not consider the second operand of the operation, i.e., the constant value of the bitmask. As shown in the example code, the bitmask has the value 0xC0, which is b 11000000 on the binary level. Thus, the only two bits of interest in this calculation are the upper two, i.e., the two most significant bits. The remaining six bits will, according to the relations defined in Table 4.4, evaluate to false in any case. Hence, the number of successor states can be reduced from 2^8 = 256 down to 2^2 = 4. The resulting values are 0x00, 0x40, 0x80, and 0xC0, as detailed in Table 4.6. Figure 4.5 presents the differences in the number of resulting system states for the various abstraction techniques when executing the two assembler instructions of Listing 4.3. The Delayed Nondeterminism with Look Ahead approach helps to avoid over-approximation whenever logical operations are performed over ND memory locations.

Value          * * * * * * * *
Operation      ∧
Mask           1 1 0 0 0 0 0 0
Result         * * 0 0 0 0 0 0
Combinations   (i)   0x00  b 00000000
               (ii)  0x40  b 01000000
               (iii) 0x80  b 10000000
               (iv)  0xC0  b 11000000

Table 4.6: Details on the Delayed Nondeterminism with Look Ahead approach.

However, this approach to state space abstraction cannot be applied to all logical instructions of the microcontroller's instruction set. An example is the XOR instruction. On the bit level, neither the result of XOR [1, ND] nor that of XOR [0, ND] can be decided without knowing the actual value of the ND bit. The same applies to the negation, i.e., NOT [ND]. Nevertheless, considering the frequent I/O accesses and the common method of bit twiddling in typical embedded systems code, the presented abstraction technique can be seen as a promising contributor to state space reduction. In the C51Simulator implementation, Delayed Nondeterminism with Look Ahead is applied to 32 out of 256 instructions in total. In [13], a saving in overall state space of 99% is achieved by Delayed Nondeterminism with Look Ahead compared to plain explicit state space building. It should be noted that this result is only valid for the example chosen in [13]. Actual savings due to this method depend on the source code structure and the number of accesses to nondeterministic memory locations, i.e., for source code without any I/O accesses this concept will not contribute to state space reduction (but will not increase the state space either).

Implementation
The actual implementation in the C51Simulator component uses a visitor pattern. The visitor design pattern is a common way of separating an operation from the object structure upon which it operates. The major benefit lies in the ability to add new operations to existing objects without modifying those structures. More on the visitor design pattern is found in [85]. For the C51Simulator, the whole instruction set implementation is built around a visitor pattern. Based on the actual abstraction technique, the corresponding instruction visitor is selected and used to apply the desired abstraction mechanism.
[Figure 4.5 (diagram): three computation trees from S(n) to S(n+2). With immediate instantiation, Port1 is instantiated at once and 256 states (S0 to S255) result; with Delayed Nondeterminism, Port1 := ND and value := ND are kept first, but 256 states (S0 to S255) still result; with Delayed Nondeterminism with Look Ahead, only four states (S1 to S4) result.]

Figure 4.5: Successor state generation and resulting system states with options: instantiate immediately, Delayed Nondeterminism, and Delayed Nondeterminism with Look Ahead for the assembly code presented in Listing 4.3.

As an example, the Delayed Nondeterminism and Delayed Nondeterminism with Look Ahead instruction visitors for the ANL [direct, #immediate] instruction are presented in the following (the notation uses pseudo Java code). The Delayed Nondeterminism instruction visitor, shown in Listing 4.4, implements the ANL instruction as stated in the instruction set manual [46]. First, the involved memory location is read from the internal memory. Second, the logical ANL operation is performed and the new value is written back to the destination register. Note that for this particular instruction, the Delayed Nondeterminism visitor is the same as for plain state space building without any abstractions applied. Recall that Delayed Nondeterminism is only applied to data transfer instructions.
    public void visit(ANL_Direct_Const instruction) {
        int tmp2 = 0x00;
        /* Read direct address byte from memory */
        tmp2 = mcu.readRegisterByAddress(instruction.address);
        /* Perform AND and write back */
        mcu.writeRegisterByAddress(instruction.address,
                instruction.constant & tmp2);
    }

Listing 4.4: The Delayed Nondeterminism visitor pattern for the ANL [direct, #immediate] instruction.

In contrast, the visitor used by Delayed Nondeterminism with Look Ahead works in a different way, since it takes the relations defined in Table 4.4 into account. It works as follows. First, if the involved memory location is deterministic, i.e., none of its bits is marked as ND, the algorithm calls the standard visitor introduced in Listing 4.4 and returns. Second, the algorithm iterates over all bits of the involved memory location and extracts the bit values as well as the corresponding ND-mask values. Since the ANL [direct, #immediate] instruction involves a constant immediate value, the ND-mask of the constant value is always 0x00 (false). The gathered information is then evaluated according to Table 4.4 and written to a temporary register, termed resultReg. This procedure continues until all 8 bits of the operand are handled. Finally, the ND-mask and the actual value of resultReg are written to the destination register (cf. lines 24-25 in Listing 4.5). As mentioned above, Delayed Nondeterminism with Look Ahead can be applied to 32 out of 256 instructions. Although the individual realization of the Delayed Nondeterminism with Look Ahead approach for the remaining instructions may differ, the main idea remains the same. The interested reader is referred to the source code of the C51Simulator component for further details.
public void visit(ANL_Direct_Const instruction) {
    if (mcu.isAddressDeterministic(instruction.address)) {
        super.visit(instruction);
        return;
    }
    C51Register resultReg = new C51Register("", 0);
    boolean bitA, tbdA, bitB, tbdB;

    for (int i = 0; i < C51Utilities.STD_REG_LENGTH; i++) {
        bitA = mcu.getRegisterByAddress(instruction.address).bitGet(i);
        tbdA = mcu.getRegisterByAddress(instruction.address).bitGetTBD(i);
        bitB = C51Utilities.extractBitFromByte(instruction.constant, i);
        tbdB = false;

        if (/* A=0, B=nd -> RES = 0 */
            (bitA == false && tbdA == false && tbdB == true) ||
            /* A=nd, B=0 -> RES = 0 */
            (tbdA == true && bitB == false && tbdB == false)) {
            resultReg.bitSetTo(i, false);
        } else {
            resultReg.bitSetTBD(i, true);
        }
    }
    /* Finally, the ND mask and the value of resultReg are written to the
       destination register (statements not shown). */
}
Listing 4.5: The Delayed Nondeterminism with Look Ahead visitor pattern for the ANL [direct, #immediate] instruction.

In summary, Delayed Nondeterminism with Look Ahead avoids the generation of successor states whenever a microcontroller executes logic operations. It does so by taking advantage of the 3-valued memory representation of the [mc]square model checker.
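The look-ahead idea can also be expressed on whole bytes instead of single bits. The following is a minimal sketch and not the C51Simulator implementation: the class TernaryByte and its methods are hypothetical, and it assumes that Table 4.4 corresponds to the usual 3-valued AND, in which a known 0 in either operand forces the result bit to 0.

```java
/** Hypothetical helper: a byte whose bits are 0, 1, or nondeterministic (ND). */
final class TernaryByte {
    final int value;   // concrete bit values; ND bits are normalized to 0
    final int ndMask;  // bit set to 1 = the corresponding bit is ND

    TernaryByte(int value, int ndMask) {
        this.ndMask = ndMask & 0xFF;
        this.value = value & ~ndMask & 0xFF;
    }

    /** 3-valued AND: a result bit is ND only if no operand bit is a known 0. */
    TernaryByte and(TernaryByte other) {
        int knownZeroA = ~this.value & ~this.ndMask & 0xFF;
        int knownZeroB = ~other.value & ~other.ndMask & 0xFF;
        int knownZero = knownZeroA | knownZeroB;
        int nd = (this.ndMask | other.ndMask) & ~knownZero;
        return new TernaryByte(this.value & other.value, nd);
    }

    /** Renders the byte in the ternary notation used in the text, e.g. "****1111". */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 7; i >= 0; i--) {
            if (((ndMask >> i) & 1) == 1) {
                sb.append('*');
            } else {
                sb.append(((value >> i) & 1) == 1 ? '1' : '0');
            }
        }
        return sb.toString();
    }
}
```

For the operand ****1111, an AND with the constant 0x0F yields a fully deterministic result, whereas an AND with 0xF0 keeps the upper nibble nondeterministic. In neither case does a successor state have to be generated.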
Whenever one of the operands, i.e., A or R0, contains nondeterministic bits, [mc]square forces the C51Simulator to create successor states by replacing nondeterminism with actual values and performing the ADDC [A, R0] operation for each of them. The Nondeterministic Program Status Word approach avoids the generation of successor states in this case by setting the involved memory locations to nondeterministic. For the ADDC [A, R0] operation, Nondeterministic Program Status Word sets the Accumulator A and the flags C, OV, AC, and P to nondeterministic. The second operand R0 is not modified, since it is not written by the operation. This is detailed in Table 4.7.

Location                Binary value    ND-mask        Ternary value
before executing ADDC [A, R0]
Accumulator A           b11100000       b00000000      11100000
Working register R0     b00001111       b11110000      ****1111
Flags C OV AC P         b0001           b0000          0001
after executing ADDC [A, R0]
Accumulator A           b00000000       b11111111      ********
Working register R0     b00001111       b11110000      ****1111
Flags C OV AC P         b0000           b1111          ****
Table 4.7: The ADDC [A, R0] example.

Thus, Nondeterministic Program Status Word avoids creating successor states even for arithmetic instructions, leading to additional savings in the state space. Nevertheless, whenever a program branching instruction such as JC (Jump if Carry flag set) is encountered and the carry flag itself is nondeterministic, two successors are generated to maintain a sound over-approximation. Even though the contribution of this particular abstraction technique to state space reduction is tremendous, additional behavior is added which might not be present when executing the program on the real target hardware (cf. Table 4.2 for a rough overview).

Implementation

Nondeterministic Program Status Word is again implemented using a visitor pattern. As an example, the corresponding visitor patterns for the ADDC [A, R0] operation are discussed in the following. As shown in Listing 4.6, the Delayed Nondeterminism visitor pattern executes the instruction as specified in the manual. First, the two operands are read and the addition is performed. Then, the corresponding flags are set (cf. the helper setFlagsForADDC, source code lines 25-55). Finally, the result is written back to the Accumulator. For this particular instruction, the Delayed Nondeterminism visitor pattern again behaves exactly like plain state space building without any abstraction at all.
public void visit(ADDC_A_Rn instruction) {
    int tmp0 = 0x00;
    int tmp1 = 0x00;
    int tmp2 = 0x00;
    int tmp3 = 0x00;

    tmp1 = mcu.readAccumulator();
    tmp2 = mcu.readWorkingRegister(instruction.regNumber);

    /* If carry flag set, then add 1 to the result */
    if (mcu.psw.bitGet(C51PSW.FLAG_CY)) {
        tmp0 = 1;
    }

    /* Perform Addition */
    tmp3 = tmp0 + tmp1 + tmp2;

    /* Set corresponding flags */
    setFlagsForADDC(tmp0, tmp1, tmp2, tmp3);

    tmp3 &= C51Utilities.MASK_SCALE_TO_BYTE_VAL;
    mcu.writeAccumulator(tmp3);
}

private void setFlagsForADDC(int tmp0, int tmp1, int tmp2, int tmp3) {

    /* C: Check if there is a carryout at bit 7 */
    boolean newCarry = ((tmp3 & C51Utilities.MASK_CARRY_CHECK)
            == C51Utilities.MASK_CARRY_CHECK);
    if (newCarry) {
        mcu.psw.bitSetTo(C51PSW.FLAG_CY, true);
    } else {
        mcu.psw.bitSetTo(C51PSW.FLAG_CY, false);
    }

    /* AC: Check if there is a carryout at bit 3 */
    boolean newACarry = (((tmp1 & C51Utilities.MASK_EXTRACT_LOWER_NIBBLE)
            + (tmp2 & C51Utilities.MASK_EXTRACT_LOWER_NIBBLE) + tmp0)
            > C51Utilities.MASK_EXTRACT_LOWER_NIBBLE);
    if (newACarry) {
        mcu.psw.bitSetTo(C51PSW.FLAG_AC, true);
    } else {
        mcu.psw.bitSetTo(C51PSW.FLAG_AC, false);
    }

    /* OV: Check if Overflow Bit should be set */
    boolean carryAtPos6 = ((((tmp1 & 0x7F) + (tmp2 & 0x7F) + tmp0) & 0x80) == 0x80);
    if ((carryAtPos6) ^ (newCarry)) {
        mcu.psw.bitSetTo(C51PSW.FLAG_OV, true);
    } else {
        mcu.psw.bitSetTo(C51PSW.FLAG_OV, false);
    }
}
Listing 4.6: The Delayed Nondeterminism visitor pattern for the ADDC [A, R0] instruction.

Listing 4.7 shows the Nondeterministic Program Status Word visitor pattern. First, the algorithm checks whether all involved memory locations are deterministic. If so, the instruction is executed as specified in the instruction set manual and the algorithm returns. If nondeterministic memory locations are involved, no calculation is performed at all. Instead, the visitor pattern marks the modified memory locations as ND, thus implementing the concept described in Table 4.7.
public void visit(ADDC_A_Rn instruction) {
    if (mcu.isAccumulatorDeterministic() &&
        mcu.isAddressDeterministic(
            mcu.getAddressForWorkRegister(instruction.regNumber)) &&
        mcu.getPSW().isCarryDeterministic()) {
        super.visit(instruction);
        return;
    }

    /* Set involved registers to ND */
    mcu.setTBD(C51Utilities.REGISTER_ACC, 0xff);
    mcu.getPSW().bitSetTBD(C51PSW.FLAG_CY, true);
    mcu.getPSW().bitSetTBD(C51PSW.FLAG_AC, true);
    mcu.getPSW().bitSetTBD(C51PSW.FLAG_OV, true);
    mcu.getPSW().bitSetTBD(C51PSW.FLAG_P, true);
}
Listing 4.7: The Nondeterministic Program Status Word visitor pattern for the ADDC [A, R0] instruction.

In summary, Nondeterministic Program Status Word shifts the model checking approach of [mc]square further towards the idea of abstract interpretation, i.e., instructions are not executed in all their details. The broad application of over-approximation, by marking the involved memory locations as nondeterministic, helps to further shrink the resulting state space. This is an interesting observation: the massive use of over-approximation introduced by the Nondeterministic Program Status Word concept decreases the state space drastically, which is contrary to what one would expect without knowing the internals of the [mc]square approach and the underlying 3-valued memory model. Nevertheless, Nondeterministic Program Status Word introduces behavior which may not exist in reality and may thus yield false-negatives during model checking.
5 Static Analysis
"When I use a model checker, it runs and runs for ever and never comes back... when I use a static analysis tool, it comes back immediately and says 'I don't know'." (Patrick Cousot)
The following chapter focuses on static analysis of embedded systems assembly code. First, a brief introduction to Control Flow Graphs (CFGs) and data-flow analyses is given. Next, the [mc]square static analysis framework and relevant internals are presented. Then, the adaptation and implementation of regular data-flow analyses for the Intel MCS-51 microcontroller and an algorithm for CFG building are described. Later on, a novel data-flow analysis concerning the particular architectural feature of register bank swapping is discussed at length. Finally, remaining challenges in static analysis of Intel MCS-51 assembly code are pointed out and possible approaches to overcome them are stated.
Listing 5.1: Source code used for CFG building.

The source code uses three variables, a few assignments, and a conditional program branch. The resulting CFG is shown in Figure 5.1. It is composed of six vertices and eight edges.

Figure 5.1: The resulting CFG for Listing 5.1 (nodes: entry, [y:=x], [z:=1], [y>0], [z:=z*y], [y:=y-1], [y:=0], exit).

There is an edge from the entry to the first executable node of the CFG, that is, to the node derived from the first instruction in program memory. There is an edge to the exit from any node that contains an instruction that could be the last executed instruction of the program. If the final instruction of the program is not an unconditional jump, then the node containing the final instruction is one predecessor of the exit node. The same applies to any node that includes a jump to code outside the valid program memory range. CFGs and their importance for various compiler optimizations are discussed at length in [87]. In summary, a CFG is a representation of all paths that might be traversed through a microcontroller program.
Figure 5.2: Data-flow analysis (the figure annotates the CFG nodes [y>0], [y:=0], [z:=z*y], and [y:=y-1] with entry and exit values; one panel is labeled "Backward data-flow").

assignments that may have defined the current value of variables. RDA aims at answering the following questions [88]: Which definitions of variable x reach a given use of x in an expression? Is x used anywhere before it is defined?

A definition of a variable x is an operation that assigns, or may assign, an actual value to x. Furthermore, a definition R reaches a program location l if there is a path from the point immediately following R to l such that the definition R is not redefined along the path [89]. A variable is redefined between two program locations whenever there is an assignment that defines a new value of that variable. Considering the example code in Listing 5.2, the definition in code line 1 reaches line 2, but the definition made in code line 3 does not reach code line 5, since y is redefined by the assignment in line 4.
Listing 5.2: RDA example code.

The aforementioned informal statements about reaching definitions can be expressed as data-flow equations [81, 9] (note that l denotes the current program location and l' a predecessor of l):
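\[
RD_{entry}(l) =
\begin{cases}
\{(x, ?) \mid x \text{ is a variable of the program}\} & \text{if } l \text{ is the initial location} \\
\bigcup \{ RD_{exit}(l') \mid l' \text{ is a predecessor of } l \} & \text{otherwise}
\end{cases}
\]
\[
RD_{exit}(l) = \left( RD_{entry}(l) \setminus kill_{RD}(l) \right) \cup gen_{RD}(l)
\]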
The presented data-flow equations use two auxiliary functions, killRD(l) and genRD(l). Whereas genRD(l) represents the set of definitions created by an operation at program location l, killRD(l) represents the set of definitions destroyed by that operation. RDentry(l) contains the set of definitions that reach the entry of program location l. The set of definitions that reach the exit of program location l is contained in RDexit(l). For the example code shown in Listing 5.3, the results of killRD(l) and genRD(l) as well as the results for RDentry(l) and RDexit(l) are computed step by step and listed in Table 5.1. The presented example is based on an example given in [81].

The statement at program location 3 (l = 3) simply checks whether the variable x is greater than a constant value; thus, neither killRD(3) nor genRD(3) yields any results. Most importantly, the evaluation of RDentry(3) reveals that immediately before entering location 3, variable x was defined either at program location 1, denoted by (x, 1), or at location 5, denoted by (x, 5). Location 5 is included for the case that the body of the while loop was previously executed. The result of the RDA is an over-approximation of the definitions reaching this location, that is, a mapping indicating for each variable where it was possibly last written.
Listing 5.3: RDA example code.

l   killRD(l)                  genRD(l)   RDentry(l)                         RDexit(l)
1   (x, ?), (x, 1), (x, 5)     (x, 1)     (x, ?), (y, ?)                     (y, ?), (x, 1)
2   (y, ?), (y, 2), (y, 4)     (y, 2)     (y, ?), (x, 1)                     (x, 1), (y, 2)
3   ∅                          ∅          (x, 1), (y, 2), (y, 4), (x, 5)     (x, 1), (y, 2), (y, 4), (x, 5)
4   (y, ?), (y, 2), (y, 4)     (y, 4)     (x, 1), (y, 2), (y, 4), (x, 5)     (x, 1), (y, 4), (x, 5)
5   (x, ?), (x, 1), (x, 5)     (x, 5)     (x, 1), (y, 4), (x, 5)             (y, 4), (x, 5)
Table 5.1: Results after solving data-flow equations for source code Listing 5.3.

In the case of [mc]square, RDA is used to obtain a set of possible values of memory locations within the microcontroller simulator. These values are used as input data for further analyses, such as the global Interrupt Flag Analysis (IFA) [9] and the Register Bank Analysis (RBA) (see Sections 5.2.6 and 5.2.8).
necessary to permanently store that value if it is known to be dead at the end of the block [87]. It is important to note that LVA is used within [mc]square with a different intention. The results obtained from the LVA are used by the model checker to combine states that differ only in the values of dead memory locations. Dead memory locations can be reset, and states that differ only in dead memory locations can be merged into a single state. Thus, this analysis contributes to state space reduction and helps to contain over-approximation by the model checker.

As mentioned before, the information flow for LVA travels backwards through the CFG, opposite to the flow of control. The analysis aims to prove that the use of a variable at program location l is propagated to all points prior to l along an execution path, so that it is known at a prior point l' that the variable will have its value used. Similar to RDA, data-flow equations are used to express the LVA problem [81, 9] (l denotes the current program location and l' a successor of l in the control flow, i.e., a predecessor in the reversed flow):
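\[
LV_{exit}(l) =
\begin{cases}
\emptyset & \text{if } l \text{ is a final location} \\
\bigcup \{ LV_{entry}(l') \mid l' \text{ is a successor of } l \} & \text{otherwise}
\end{cases}
\]
\[
LV_{entry}(l) = \left( LV_{exit}(l) \setminus kill_{LV}(l) \right) \cup gen_{LV}(l)
\]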
Within the LVA, killLV(l) represents the set of variables defined by an operation at program location l, whereas genLV(l) represents the set of variables that are consumed by that operation. LVentry(l) contains the set of variables that are live at the entry of program location l. The set of variables that are live at the exit of program location l is contained in LVexit(l).
1  [x:=2];
2  [y:=4];
3  [x:=1];
4  (if [y>x] then
5      [z:=y]
6   else [z:=y*y]);
7  [x:=z]
Next, the functions killLV(l) and genLV(l) are evaluated for each location of the program shown in Listing 5.4. The results are inserted into the data-flow equations, resulting in the following statements.
\begin{align*}
LV_{entry}(1) &= (LV_{exit}(1) \setminus kill_{LV}(1)) \cup gen_{LV}(1) = LV_{exit}(1) \setminus \{x\} = \{\,\} \\
LV_{entry}(2) &= (LV_{exit}(2) \setminus kill_{LV}(2)) \cup gen_{LV}(2) = LV_{exit}(2) \setminus \{y\} = \{\,\} \\
LV_{entry}(3) &= (LV_{exit}(3) \setminus kill_{LV}(3)) \cup gen_{LV}(3) = LV_{exit}(3) \setminus \{x\} = \{y\} \\
LV_{entry}(4) &= (LV_{exit}(4) \setminus kill_{LV}(4)) \cup gen_{LV}(4) = LV_{exit}(4) \cup \{x, y\} = \{x, y\} \\
LV_{entry}(5) &= (LV_{exit}(5) \setminus kill_{LV}(5)) \cup gen_{LV}(5) = (LV_{exit}(5) \setminus \{z\}) \cup \{y\} = \{y\} \\
LV_{entry}(6) &= (LV_{exit}(6) \setminus kill_{LV}(6)) \cup gen_{LV}(6) = (LV_{exit}(6) \setminus \{z\}) \cup \{y\} = \{y\} \\
LV_{entry}(7) &= (LV_{exit}(7) \setminus kill_{LV}(7)) \cup gen_{LV}(7) = \{z\} \\
LV_{exit}(1) &= LV_{entry}(2) = \{\,\} \\
LV_{exit}(2) &= LV_{entry}(3) = \{y\} \\
LV_{exit}(3) &= LV_{entry}(4) = \{x, y\} \\
LV_{exit}(4) &= LV_{entry}(5) \cup LV_{entry}(6) = \{y\} \\
LV_{exit}(5) &= LV_{entry}(7) = \{z\} \\
LV_{exit}(6) &= LV_{entry}(7) = \{z\} \\
LV_{exit}(7) &= \{\,\}
\end{align*}

For example, the term LVentry(4) corresponds to the statement [y>x] found at source code line 4 (cf. Listing 5.4). LVentry(4) evaluates to {x, y}, revealing that at the entry point of that particular program location the only two live variables are x and y. Furthermore, LVentry(5) evaluates to {y}, indicating that y is still alive immediately before the statement [z:=y]. For the chosen example, LVexit(l) and LVentry(l') yield the same results, since the example code lacks program loops for the sake of simplicity.

l   killLV(l)   genLV(l)   LVentry(l)   LVexit(l)
1   x           ∅          ∅            ∅
2   y           ∅          ∅            y
3   x           ∅          y            x, y
4   ∅           x, y       x, y         y
5   z           y          y            z
6   z           y          y            z
7   x           z          z            ∅
Table 5.2: Results after solving LVA data-flow equations for source code Listing 5.4.

The annotated CFG of the example is shown in Figure 5.3. It shows that no variable is marked as live at the exit of the first program location, i.e., the statement [x:=2]. As a result, the first assignment of the value 2 to variable x is superfluous and can be omitted. The resulting minimized CFG is shown in Figure 5.3(c). In compiler theory, such a reduced CFG allows the compiler backend to generate more efficient code.
Figure 5.3, panel (a): CFG for Listing 5.4 (nodes: entry, [x:=2], [y:=4], [x:=1], [y>x], [z:=y], [z:=y*y], [x:=z], exit). The further panels show the same CFG annotated with the LVentry and LVexit sets along the LVA flow, and the minimized CFG without the node [x:=2].
However, [mc]square does not use this data-flow information to generate any code. Instead, it marks the memory location where variable x is stored as dead and resets it to its initial value. Resetting a memory location to its initial value increases the probability of finding equal states within the generated state space. Equal states do not contribute to the growth of the state space, since the model checker simply adds an additional edge to the state space graph.
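The effect of resetting dead memory locations can be illustrated with a small sketch. It is not taken from [mc]square: states are simplified to plain integer arrays, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of state merging after resetting dead locations. */
final class DeadVariableReductionSketch {

    /** Returns a copy of the state with every dead location reset to its initial value. */
    static int[] resetDead(int[] memory, int[] initial, Set<Integer> deadLocations) {
        int[] reduced = memory.clone();
        for (int address : deadLocations) {
            reduced[address] = initial[address];
        }
        return reduced;
    }

    /** Counts the states that remain distinct once dead locations are reset. */
    static int countDistinctStates(List<int[]> states, int[] initial, Set<Integer> deadLocations) {
        Set<List<Integer>> seen = new HashSet<>();
        for (int[] state : states) {
            List<Integer> key = new ArrayList<>();
            for (int cell : resetDead(state, initial, deadLocations)) {
                key.add(cell);
            }
            seen.add(key);  // equal states collapse into one entry
        }
        return seen.size();
    }
}
```

Two states that differ only in a dead location, e.g. {1, 7, 3} and {1, 9, 3} with location 1 marked as dead, collapse into a single state, whereas both remain distinct if no location is dead.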
RDexit(entry) = ∅;
foreach Node l other than entry do
    RDexit(l) = ∅;
end
while any RDexit(l) changes do
    foreach Node l other than entry do
        RDentry(l) = ∪ { RDexit(l') | l' predecessor of l };
        RDexit(l) = (RDentry(l) \ killRD(l)) ∪ genRD(l);
    end
end
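The iteration above can be sketched in Java for the example of Table 5.1. The sketch is not the [mc]square implementation. It makes two assumptions that are not stated in the text: the control flow 1 → 2 → 3 → 4 → 5 → 3 of the example (a while loop at location 3), and the seeding of the first location with the initial definitions (x, ?) and (y, ?), encoded here as the strings "x@?" and "y@?". All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of the fixed point iteration for reaching definitions. */
final class ReachingDefinitionsSketch {

    /** Returns a list holding RDentry (index 0) and RDexit (index 1), keyed by location. */
    static List<Map<Integer, Set<String>>> solve(Map<Integer, List<Integer>> predecessors,
                                                 Map<Integer, Set<String>> gen,
                                                 Map<Integer, Set<String>> kill,
                                                 int first, Set<String> initial) {
        Map<Integer, Set<String>> entry = new TreeMap<>();
        Map<Integer, Set<String>> exit = new TreeMap<>();
        for (Integer l : predecessors.keySet()) {
            entry.put(l, new TreeSet<String>());
            exit.put(l, new TreeSet<String>());
        }
        boolean changed = true;
        while (changed) {  // repeat until no set changes any more
            changed = false;
            for (Integer l : predecessors.keySet()) {
                Set<String> in = new TreeSet<>();
                if (l == first) {
                    in.addAll(initial);  // assumption: (x, ?) for every variable
                }
                for (Integer p : predecessors.get(l)) {
                    in.addAll(exit.get(p));
                }
                Set<String> out = new TreeSet<>(in);
                out.removeAll(kill.get(l));
                out.addAll(gen.get(l));
                if (!in.equals(entry.get(l)) || !out.equals(exit.get(l))) {
                    changed = true;
                }
                entry.put(l, in);
                exit.put(l, out);
            }
        }
        List<Map<Integer, Set<String>>> result = new ArrayList<>();
        result.add(entry);
        result.add(exit);
        return result;
    }

    static Set<String> defs(String... definitions) {
        return new TreeSet<>(Arrays.asList(definitions));
    }

    /** The example of Table 5.1 with the assumed flow 1 -> 2 -> 3 -> 4 -> 5 -> 3. */
    static List<Map<Integer, Set<String>>> example() {
        Map<Integer, List<Integer>> preds = new TreeMap<>();
        preds.put(1, Collections.<Integer>emptyList());
        preds.put(2, Arrays.asList(1));
        preds.put(3, Arrays.asList(2, 5));
        preds.put(4, Arrays.asList(3));
        preds.put(5, Arrays.asList(4));
        Map<Integer, Set<String>> gen = new TreeMap<>();
        Map<Integer, Set<String>> kill = new TreeMap<>();
        gen.put(1, defs("x@1")); kill.put(1, defs("x@?", "x@1", "x@5"));
        gen.put(2, defs("y@2")); kill.put(2, defs("y@?", "y@2", "y@4"));
        gen.put(3, defs());      kill.put(3, defs());
        gen.put(4, defs("y@4")); kill.put(4, defs("y@?", "y@2", "y@4"));
        gen.put(5, defs("x@5")); kill.put(5, defs("x@?", "x@1", "x@5"));
        return solve(preds, gen, kill, 1, defs("x@?", "y@?"));
    }
}
```

Calling example().get(0).get(3) returns the set RDentry(3), which contains the definitions of x from locations 1 and 5 and the definitions of y from locations 2 and 4.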
For the LVA a similar algorithm is used. As mentioned before, the information flow travels backwards through the control flow in the CFG. Thus, the LVA algorithm starts by initializing LVentry(exit) = ∅, and the sets LVentry and LVexit have their roles interchanged, as shown in Algorithm 2. For more details on the theoretical background of data-flow analysis, the interested reader is referred to the relevant literature, such as [87, 81].
Algorithm 2: A fixed point iteration algorithm to solve the data-flow equations for the LVA problem [87].
Input : A CFG with killLV(l) and genLV(l) resolved for each node.
Result: LVentry(l) and LVexit(l), the sets of variables live at the entry and exit of each node l ∈ CFG.

LVentry(exit) = ∅;
foreach Node l other than exit do
    LVentry(l) = ∅;
end
while any LVentry(l) changes do
    foreach Node l other than exit do
        LVexit(l) = ∪ { LVentry(l') | l' successor of l };
        LVentry(l) = (LVexit(l) \ killLV(l)) ∪ genLV(l);
    end
end
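As a hedged illustration, the backward iteration can be sketched in Java for the example of Listing 5.4, using the killLV and genLV sets of Table 5.2. The class and method names are hypothetical and the sketch is not the [mc]square implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of the fixed point iteration for live variables. */
final class LiveVariablesSketch {

    /** Returns a list holding LVentry (index 0) and LVexit (index 1), keyed by location. */
    static List<Map<Integer, Set<String>>> solve(Map<Integer, List<Integer>> successors,
                                                 Map<Integer, Set<String>> gen,
                                                 Map<Integer, Set<String>> kill) {
        Map<Integer, Set<String>> entry = new TreeMap<>();
        Map<Integer, Set<String>> exit = new TreeMap<>();
        for (Integer l : successors.keySet()) {
            entry.put(l, new TreeSet<String>());
            exit.put(l, new TreeSet<String>());
        }
        boolean changed = true;
        while (changed) {  // repeat until no set changes any more
            changed = false;
            for (Integer l : successors.keySet()) {
                Set<String> out = new TreeSet<>();
                for (Integer s : successors.get(l)) {
                    out.addAll(entry.get(s));  // information flows backwards
                }
                Set<String> in = new TreeSet<>(out);
                in.removeAll(kill.get(l));
                in.addAll(gen.get(l));
                if (!in.equals(entry.get(l)) || !out.equals(exit.get(l))) {
                    changed = true;
                }
                entry.put(l, in);
                exit.put(l, out);
            }
        }
        List<Map<Integer, Set<String>>> result = new ArrayList<>();
        result.add(entry);
        result.add(exit);
        return result;
    }

    static Set<String> vars(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /** The program of Listing 5.4 with the kill and gen sets of Table 5.2. */
    static List<Map<Integer, Set<String>>> example() {
        Map<Integer, List<Integer>> succs = new TreeMap<>();
        succs.put(1, Arrays.asList(2));
        succs.put(2, Arrays.asList(3));
        succs.put(3, Arrays.asList(4));
        succs.put(4, Arrays.asList(5, 6));
        succs.put(5, Arrays.asList(7));
        succs.put(6, Arrays.asList(7));
        succs.put(7, Collections.<Integer>emptyList());
        Map<Integer, Set<String>> gen = new TreeMap<>();
        Map<Integer, Set<String>> kill = new TreeMap<>();
        kill.put(1, vars("x")); gen.put(1, vars());
        kill.put(2, vars("y")); gen.put(2, vars());
        kill.put(3, vars("x")); gen.put(3, vars());
        kill.put(4, vars());    gen.put(4, vars("x", "y"));
        kill.put(5, vars("z")); gen.put(5, vars("y"));
        kill.put(6, vars("z")); gen.put(6, vars("y"));
        kill.put(7, vars("x")); gen.put(7, vars("z"));
        return solve(succs, gen, kill);
    }
}
```

For this input, the computed sets agree with the hand-derived equations: LVentry(4) is {x, y}, and LVexit(1) is empty, which is why the assignment [x:=2] is superfluous.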
Figure 5.4: The [mc]square static analysis framework for the Intel MCS-51 target.

The [mc]square static analysis framework is composed of the following components:

Parser and preparation handles the interaction with the user. It accepts a compiled and linked *.hex file and a specification given in CTL. Furthermore, it parses common debug formats in order to preserve the connection between the analyzed assembly code and the source code file, which may be written in C, C++, Java, or any other high-level language that can be compiled to machine code for the Intel MCS-51 microcontroller. As mentioned before, a complete and precise CFG is the basis for all further data-flow analyses. Consequently, the parser and preparation component is responsible for preparing the analyses and building the CFG.

Data-flow analyses performs forward and backward oriented data-flow analyses, such as RDA and LVA. It uses the CFG to execute those analyses. The extracted reaching definitions are further used by the particular abstraction techniques in order to gain a better understanding of the program. Further, it includes the novel RBA, a Stack Analysis (SA), and an Interrupt Flag Analysis (IFA).

Abstraction techniques use the information gathered by the data-flow analyses to apply state space reductions. A technique called Dead Variable Reduction (DVR) is used to mark dead memory locations (see footnote 1), prompting the model checker to reset certain memory locations during model checking. Another concept is Path Reduction (PR), which aims at combining single successor chains, e.g., of an ISR, into a single state.

Model checking uses the additional information about the verified program in order to reduce the overall number of system states.

In the following, a rather conceptual description of the actual implementation of the various analyses in the [mc]square framework is given. For a more detailed insight, the interested reader is referred to the respective source code.
5.2.1 Overview
Currently, [mc]square is capable of conducting the following static analyses:

- Control Flow Analysis (CFA) [81]
- Stack Analysis (SA) [9]
- Reaching Definition Analysis (RDA) [81]
- Interrupt Flag Analysis (IFA) [9]
- Live Variable Analysis (LVA) [81]
- Dead Variable Reduction (DVR) [81, 90]
- Path Reduction (PR) [91, 90]

The execution order is depicted in Figure 5.4. The SA is used to track dependencies between values pushed onto and popped from the stack. For instance, the PSW is frequently pushed onto the stack at the beginning of a function and then read from the stack at the end. The status of the interrupt registers is extracted from the reaching definitions, which then influences the RDA in the next iteration. The RBA described in Section 5.2.6 interacts with the RDA and, in consequence, increases the precision of the RDA and the IFA. All analyses are designed as interprocedural analyses due to the peculiarities of assembly code. For instance, all memory locations can be accessed globally. Data-flow analyses in [mc]square consist of the following steps:
1 To recapitulate, a dead memory location is a memory location that is not used anymore in the further progression of the input program.
(i) The static behavior of a function is determined, where the effects of function calls are ignored.
(ii) The static behavior of a called function is propagated from the return statement of a callee into the call site.
(iii) Data-flow information is propagated from a call site into a called function.

All these steps run as fixed point iterations (see footnote 2) to support recursive function calls. More details of this approach are described in [90, 9]. In what follows, the adaptation of these existing static analysis techniques for the Intel MCS-51 target architecture is described.
C:0x0800    MOV
C:0x0802    MOV
C:0x0803    ADDC
C:0x0805    XRL
C:0x0807    CPL

C:0x0900    SETB
C:0x0902    MOV
C:0x0903    ADDC
C:0x0905    XRL
C:0x0907    CPL

C:0x0800    78 D3 E8 35 20 64 11 F4
C:0x0900    D3 E8 35 20 64 11 F4 XX
2 A fixed point iteration is the common approach to solving data-flow equations. Usually, the analyses are repeated until no change can be detected. Most of the time, only the differences between iterations are considered in order to avoid redundant steps. The interested reader is referred to [81, 87].
To illustrate that behavior, consider Listing 5.5. Here, it is assumed that the compiler has already translated the high-level code to assembly instructions. The program memory ranges 0x0800 - 0x0807 and 0x0900 - 0x0907 contain almost identical code. The only difference is that the Carry flag is set at location 0x0900 before entering the calculations starting at locations 0x0802 and 0x0902, respectively. Thus, in order to save program memory space, the compiler might combine those similar blocks by replacing the five instructions located from 0x0900 to 0x0907 with an unconditional jump, e.g., an AJMP [0x0801]. This leads to a dynamic disassembly at location 0x0801, which yields the same instruction sequence as executing sequentially from 0x0900 to 0x0907. Thereby, the compiler can save five bytes of program memory, since the AJMP instruction itself is two bytes long.

Given this compiler optimization of sharing equal program memory bytes, it is not possible to decode the program memory purely sequentially, i.e., by iterating over the program memory and advancing the index pointer by the instruction length. A more elaborate approach to CFG building is needed. The implemented algorithm for building the CFG from a given Intel MCS-51 program memory is given in Algorithm 3.

Algorithm 3: CFG building algorithm for the Intel MCS-51 target.
Input : A disassembled program memory content P.
Result: An equivalent CFG representation for P.
initialize CFG;
foreach instruction I in P do
    if I is an indirect branching statement then
        quit CFG building;
    end
    add new node l to CFG;
    label node l with detail info of I;
    add new edge E from l to all its successors l';
end
while unknown target addresses exist do
    foreach edge E in CFG do
        obtain target addresses A of E;
        if A does not point to an existing l in CFG then
            dynamically disassemble P at address A;
            obtain new instruction I = P(A);
            if I is an indirect branching statement then
                quit CFG building;
            end
            add new node l to CFG;
            label node l with detail info of I;
            add new edge E from l to all its successors l';
        end
    end
end
First, an object capable of storing a CFG is initialized. Then, sequential decoding of the program memory starts. For each instruction a node is added to the CFG.
Moreover, edges are added from the current node to all its successors. For example, for instructions that do not branch, one edge is added from the current program counter location l to l', the location with the program counter value l + l.length. If an indirect branching statement is detected during decoding, CFG building is stopped, since it can no longer be guaranteed that the resulting CFG will be complete. Then, the iteration continues as long as branching targets are found that do not point to an existing node in the CFG. For each such target, the target address is extracted and the program memory is decoded again at this location. Finally, the dynamically decoded instructions are added to the CFG. The loop is left when all target addresses are resolved and map to a node in the CFG, thus ensuring that the resulting CFG is complete.
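The two phases described above can be sketched as follows. This is a simplified illustration under stated assumptions and not the [mc]square implementation: the decoder covers only a handful of MCS-51 opcodes, the example program is a condensed variant of the scenario discussed above (the shared code is terminated by an endless SJMP loop and followed directly by the AJMP), and all class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of CFG building with dynamic disassembly. */
final class CfgBuilderSketch {

    /** A CFG node: mnemonic, instruction length in bytes, successor addresses. */
    static final class Node {
        final String mnemonic;
        final int length;
        final List<Integer> successors;

        Node(String mnemonic, int length, int successor) {
            this.mnemonic = mnemonic;
            this.length = length;
            this.successors = Collections.singletonList(successor);
        }
    }

    /** Decodes the instruction at pc; returns null for an indirect branch. */
    static Node decode(int[] mem, int pc) {
        switch (mem[pc]) {
            case 0x78: return new Node("MOV R0,#imm", 2, pc + 2);
            case 0xE8: return new Node("MOV A,R0", 1, pc + 1);
            case 0x35: return new Node("ADDC A,direct", 2, pc + 2);
            case 0x64: return new Node("XRL A,#imm", 2, pc + 2);
            case 0xF4: return new Node("CPL A", 1, pc + 1);
            case 0xD3: return new Node("SETB C", 1, pc + 1);
            case 0x01: // AJMP into the first 256 bytes of the current 2 KiB page
                return new Node("AJMP", 2, ((pc + 2) & 0xF800) | mem[pc + 1]);
            case 0x80: // SJMP with a signed 8-bit offset relative to the next instruction
                return new Node("SJMP", 2, pc + 2 + (byte) mem[pc + 1]);
            case 0x73: // JMP @A+DPTR: the target is not known statically
                return null;
            default:
                throw new IllegalArgumentException("opcode not covered by this sketch");
        }
    }

    /** Builds the CFG for mem[start, end); returns null if an indirect branch is found. */
    static Map<Integer, Node> build(int[] mem, int start, int end) {
        Map<Integer, Node> cfg = new TreeMap<>();
        Deque<Integer> pending = new ArrayDeque<>();
        int pc = start;
        while (pc < end) {  // phase 1: sequential decoding
            Node node = decode(mem, pc);
            if (node == null) {
                return null;
            }
            cfg.put(pc, node);
            pending.addAll(node.successors);
            pc += node.length;
        }
        while (!pending.isEmpty()) {  // phase 2: resolve unknown target addresses
            int target = pending.pop();
            if (cfg.containsKey(target)) {
                continue;
            }
            Node node = decode(mem, target);  // dynamic disassembly
            if (node == null) {
                return null;
            }
            cfg.put(target, node);
            pending.addAll(node.successors);
        }
        return cfg;
    }

    /** Shared code at 0x0800, SJMP to itself at 0x0808, AJMP 0x0801 at 0x080A. */
    static Map<Integer, Node> example() {
        int[] code = {0x78, 0xD3, 0xE8, 0x35, 0x20, 0x64, 0x11, 0xF4, 0x80, 0xFE, 0x01, 0x01};
        int[] mem = new int[0x1000];
        System.arraycopy(code, 0, mem, 0x0800, code.length);
        return build(mem, 0x0800, 0x0800 + code.length);
    }
}
```

In the example, sequential decoding yields seven nodes. The AJMP target 0x0801 does not map to any of them, so the byte 0xD3, which is the immediate operand of the first MOV, is decoded again as SETB C and added as an eighth node whose successor is the existing node at 0x0802.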
Table 5.3 shows the evaluated gen(l) and kill(l) statements for the instructions above. ADDC [A, direct] reads the Accumulator, the specified direct memory location, and the carry flag; thus, gen(l) evaluates to {A, direct, C}. Similarly, the instruction writes the Accumulator and the flags carry, auxiliary-carry, overflow, and parity; thus, kill(l) evaluates to {A, C, AC, OV, P}.

Instruction          gen(l)             kill(l)
CLR [C]              ∅                  C
MOV [dest, src]      src                dest
ADDC [A, direct]     {A, direct, C}     {A, C, AC, OV, P}
Table 5.3: Action List Building: a few examples.

The implementation of the instruction visitors is straightforward. Listing 5.6 shows the corresponding visitor for the mnemonic ADDC [A, direct]. The instruction visitors for the remaining instructions work similarly, and the interested reader is referred to the actual source code of [mc]square for details.
public void visit(ADDC_A_Direct instruction) {
    addSingleRead(currentVertex.actionList, true, C51Utilities.REGISTER_ACC);
    addSingleRead(currentVertex.actionList, true, instruction.address);
    addPSWBitRead(currentVertex.actionList, true, PSW.CY);

    addSingleWrite(currentVertex.actionList, true, C51Utilities.REGISTER_ACC);
    addPSWBitWrite(currentVertex.actionList, true, PSW.CY);
    addPSWBitWrite(currentVertex.actionList, true, PSW.AC);
    addPSWBitWrite(currentVertex.actionList, true, PSW.OV);
    addPSWBitWrite(currentVertex.actionList, true, PSW.P);
}
Listing 5.6: The Action List Builder visitor pattern for the ADDC [A, direct] instruction.
instruction) is determined and in a later step the behavior of functions and interrupts is propagated through their callers in the CFG.

The C51LVABuilder applies an LVA to single nodes in the CFG. Usually, a CFG consists of several functions that are called by, e.g., the main routine. Moreover, especially in embedded systems software the use of interrupts is common; thus, a program may respond to various sources of interrupts. In principle, there is no difference between interrupts and functions, except that interrupts can occur at every location along the CFG, whereas a function is always explicitly called. Thus, it is not sufficient to consider the set of live variables at the node level; a broader approach is needed to obtain usable results from the LVA. This broader approach is realized by the base classes of C51LVABuilder, namely LVABuilder and BackwardProceduralAnalysis. The corresponding type hierarchy is given in Figure 5.6.

Analysis → BackwardProceduralAnalysis → LVABuilder → C51LVABuilder (architectural dependencies)

Figure 5.6: The type hierarchy of the C51LVABuilder.

Within those classes, [mc]square implements the propagation of LVA-relevant behavior from single nodes in the CFG to their predecessors and successors, from functions back to their callers, and from ISRs to all those locations within the CFG where interrupts are enabled.
Analysis → ForwardProceduralAnalysis → RDABuilder → C51RDABuilder (architectural dependencies)

Figure 5.8: The type hierarchy of the C51RDABuilder.

More details on the conceptual approach are given in [9]. For implementation details, the reader is referred to the actual source code of [mc]square.
Table 5.4: Register bank configurations of the Intel MCS-51.

Register bank swapping is frequently used by the compiler for passing data to functions or for saving status information before entering ISRs. Performing a register bank swap instead of pushing the values of memory locations onto the stack before entering an ISR minimizes interrupt latency and is thus the favored approach for time-critical interrupts. Programs with embedded assembly code, however, can change the register bank at any program location by writing the bits of the register bank pointer. Knowing the actual value of the register bank pointer is a decisive criterion for the precision and usefulness of further analyses, such as LVA and RDA. For example, if a variable resides within memory area (a) of the microcontroller, the analysis results can be significantly sharpened if precise values of the register bank pointer are determined.

Consider the Intel MCS-51 instruction MOV [R0, #const], which copies an immediate value to the working register R0 of the currently active register bank. MOV [R0, #const] reads the immediate #const and writes the working register R0. Hence, killLV() evaluates to R0 and genLV() to #const. In order to assign R0 to a certain register bank, however, the register bank pointer, which is composed of the control bits RS0 and RS1, needs special treatment.

Motivation

Missing prior information about the actual values of RS0 and RS1 is cumbersome. Any subsequent data-flow analysis suffers from the over-approximation caused by the unknown value of the register bank pointer. This effect is detailed in Table 5.5. For example, if only a single bit of the register bank pointer is ambiguous (either ⊤ or ⊥, see Section 5.2.6 for the definition), none of the working registers can be marked as killed, since the active register bank is unknown. The actual working register may be located in register bank 0, 1, 2, or 3. As mentioned in Section 5.1.4, dead variables are reset during state space building, leading to a greater number of equal states that can be merged. Therefore, a bit-wise analysis of the bank selection pointer is worthwhile and actively contributes to smaller state spaces.

              Bank Selection Pointer    killLV()
Banks         RS1      RS0              Register Bank Memory                Safe?
0             0        0                R0{0}                               yes
1             0        1                R1{0}                               yes
2             1        0                R2{0}                               yes
3             1        1                R3{0}                               yes
{0, 2}        ⊤/⊥      0                R0{0}, R2{0}                        no
{1, 3}        ⊤/⊥      1                R1{0}, R3{0}                        no
{0, 1}        0        ⊤/⊥              R0{0}, R1{0}                        no
{2, 3}        1        ⊤/⊥              R2{0}, R3{0}                        no
{0, 1, 2, 3}  ⊤/⊥      ⊤/⊥              R0{0}, R1{0}, R2{0}, R3{0}          no
Bit-wise Modeling

The register bank pointer is modeled at bit-level granularity to capture the effects of bit-wise operations on the PSW. For most of the other analyses, registers are modeled at byte-level granularity, which turned out to be accurate enough. Based on ideas from abstract interpretation [92], a single bit is represented using a complete lattice as shown in Figure 5.9. In the following, the lattice for a single bit depicted on the right-hand side is denoted by $L_1$.
(b) For a single bit.
Figure 5.9: Bit-wise modeling of the register bank selection pointer.

The lattice $L_1$ is composed of the values 0 (false), 1 (true), a top element ⊤ (all), and a bottom element ⊥ (unknown). The top element represents a bit that may have the value 0 or 1, and the bottom element states that no information is available at all. A 4-valued approach to bit-wise modeling is required since merging different paths in the CFG forces the analysis to generate a safe over-approximation. Branches in the CFG originate from conditional branching instructions, which change the flow of program execution. Examples are JZ (jump if accumulator zero), CJNE (compare and jump if not equal), and DJNZ (decrement and jump if not zero). Merging multiple predecessors in the CFG and combining their individual contributions at confluence points is performed by a join-operation as illustrated in Figure 5.10.

[Figure 5.10: the join-operator; the figure shows the operator table over the lattice values and a CFG fragment with the nodes DJNZ, SETB, MOV, and join.]
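The join on the single-bit lattice can be sketched in a few lines of Python. This is only an illustration under the usual lattice reading (⊥ below 0 and 1, both below ⊤); the names `BOT`, `TOP`, `join`, and `join_all` are hypothetical and not taken from [mc]square.

```python
# Four-valued bit lattice: BOT lies below 0 and 1, which both lie below TOP.
# The string encodings are an assumption made for this sketch.
BOT, ZERO, ONE, TOP = "bot", "0", "1", "top"


def join(a, b):
    """Return the least upper bound of two lattice elements."""
    if a == b:
        return a
    if a == BOT:
        return b
    if b == BOT:
        return a
    # 0 joined with 1, or anything joined with TOP, is no longer a single value.
    return TOP


def join_all(values):
    """Combine the contributions of all CFG predecessors at a confluence point."""
    result = BOT
    for value in values:
        result = join(result, value)
    return result
```

For instance, if one predecessor sets the bit (SETB) and another clears it, `join_all([ONE, ZERO])` yields `TOP`, which is the safe over-approximation described above.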
Formal Description

In the presented approach, the existing RDA is extended to analyze the register bank pointer at bit-level granularity. In a first run of the analysis, the reaching definitions for the bits RS0 and RS1 are gathered by an RDA at bit level. In this first pass, all register banks are assumed to be active in each program location. Then, in further iterations, the application of the join-operator introduced in Table 5.5 leads to more precise results. The join-operator is implicitly encoded in the equations explained in the remainder of this section. In the following, the notation of Nielson et al. [81] is used for the definition of the functions $gen_{RBA}$ and $kill_{RBA}$, which form the basis of the extension of the RDA. The function $\gamma : L_1 \to 2^{\{0,1\}}$ is used to project lattice elements representing register bank configurations to the domain of values they represent.
\[
\gamma(r) =
\begin{cases}
\{0, 1\} & \text{if } r = \top, \\
\emptyset & \text{if } r = \bot, \\
\{r\} & \text{otherwise.}
\end{cases}
\]

The function $\gamma$ is used in the function $\beta : L_1 \times L_1 \to 2^{\{0,1,2,3\}}$, which computes integer representations of possible register bank configurations. Here, $\beta(r_0, r_1)$ returns the set of all register banks that may be active due to the values of $r_0$ and $r_1$.

\[
\beta(r_0, r_1) = \{\, 2 x_0 + x_1 \mid x_0 \in \gamma(r_0),\ x_1 \in \gamma(r_1) \,\}
\]

A reaching definition is a pair $(v, \ell)$, where $v$ represents a memory location or a register and $\ell$ represents an instruction. Reaching definitions with register bank analysis are computed in several iterations. In the following, the values of RS0 and RS1 in program location $\ell$ after the $i$-th iteration are denoted by $R^i_{RS0}(\ell)$ and $R^i_{RS1}(\ell)$. They can be extracted from the results of the $i$-th iteration of the analysis. Initially, $R^0_{RS0} = R^0_{RS1} = \top$, which means that all register banks are assumed to be active. This leads to a conservative over-approximation of reaching definitions in the first iteration. For an assignment to $R_?\{k\}$, for instance through an instruction MOV [R0, #0x80], a reaching definition for register R0 on register bank $b$, denoted by $R_b\{k\}$, is generated in program location $\ell$ using $gen_{RBA}$ if $b$ is a possible register bank configuration. The notation $R_?\{k\}$ denotes that the instruction itself carries no knowledge about the active register bank.

\[
gen^{i+1}_{RBA}(\ell) = \{\, (R_b\{k\}, \ell) \mid R_?\{k\} \text{ is assigned a value in } \ell \,\wedge\, r_0 \in R^i_{RS0}(\ell) \,\wedge\, r_1 \in R^i_{RS1}(\ell) \,\wedge\, b \in \beta(r_0, r_1) \,\}
\]

If the register bank configuration is ambiguous, an over-approximation of the real behavior is generated because a reaching definition is generated for each possible register bank configuration. In like manner, a reaching definition is deleted by $kill_{RBA}$ only if the register bank configuration is unambiguous. That is, a definition can only be overwritten if exactly one register bank configuration is possible. Otherwise, no reaching definitions can be killed in order to guarantee an over-approximation.

\[
kill^{i+1}_{RBA}(\ell) = \{\, (R_b\{k\}, \ell') \mid R_?\{k\} \text{ is assigned a value in } \ell \,\wedge\, \forall r_0 \in R^i_{RS0}(\ell),\ r_1 \in R^i_{RS1}(\ell) : b \in \beta(r_0, r_1) \,\wedge\, |\beta(r_0, r_1)| = 1 \,\}
\]

If no assignment to a memory location addressable using register banks is found, the common equations for RDA are used.
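The projection of lattice elements to bank numbers, and the asymmetry between generating and killing definitions, can be sketched as follows. This is a minimal Python illustration, not the [mc]square implementation: the helper names `gamma`, `beta`, `gen_banks`, and `kill_banks` are hypothetical, and the bottom element is assumed to project to the empty set.

```python
from itertools import product

# Encodings of the single-bit lattice elements (assumed for this sketch).
BOT, ZERO, ONE, TOP = "bot", "0", "1", "top"


def gamma(r):
    """Project a lattice element to the set of concrete bit values it stands for."""
    if r == TOP:
        return {0, 1}
    if r == BOT:
        return set()
    return {int(r)}


def beta(r0, r1):
    """Set of register banks that may be active for the bank pointer bits r0, r1."""
    return {2 * x0 + x1 for x0, x1 in product(gamma(r0), gamma(r1))}


def gen_banks(rs0_values, rs1_values):
    """Banks for which a definition is generated: every bank that may be active."""
    banks = set()
    for r0, r1 in product(rs0_values, rs1_values):
        banks |= beta(r0, r1)
    return banks


def kill_banks(rs0_values, rs1_values):
    """Banks for which definitions may be killed: only an unambiguous bank."""
    pairs = list(product(rs0_values, rs1_values))
    return {b for b in range(4)
            if pairs and all(beta(r0, r1) == {b} for r0, r1 in pairs)}
```

With one ambiguous bit, `gen_banks({TOP}, {ZERO})` returns `{0, 2}` while `kill_banks({TOP}, {ZERO})` returns the empty set, so definitions are added for both candidate banks but none is removed.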
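\[
kill^{i}_{RDA}(\ell) = \{\, (R\{k\}, \ell') \mid R\{k\} \text{ is assigned a value in } \ell \,\wedge\, k \geq 8 \,\wedge\, (R\{k\}, \ell') \in RDA^{i-1}_{entry}(\ell) \,\}
\]

\[
gen^{i}_{RDA}(\ell) = \{\, (R\{k\}, \ell) \mid R\{k\} \text{ is assigned a value in } \ell \,\wedge\, k \geq 8 \,\}
\]

The entry- and exit-functions for RDA using RBA are then expressed in such a way that the specific equations $gen^i_{RBA}$ and $kill^i_{RBA}$ are only used for those memory locations addressable through register banks. That is, these functions are only used for $R_b\{k\}$ with $0 \leq b \leq 3$ and $0 \leq k \leq 7$. Hence, RBA is used for absolute memory addresses from 0x00 to 0x1F. For all other memory locations, $gen^i_{RDA}$ and $kill^i_{RDA}$ are used.

\[
RDA^{i}_{exit}(\ell) = \bigl( RDA^{i}_{entry}(\ell) \setminus ( kill^{i}_{RDA}(\ell) \cup kill^{i}_{RBA}(\ell) ) \bigr) \cup gen^{i}_{RDA}(\ell) \cup gen^{i}_{RBA}(\ell)
\]

\[
RDA^{i}_{entry}(\ell) = \bigcup \{\, RDA^{i}_{exit}(\ell') \mid \ell' \text{ is a predecessor of } \ell \text{ in the CFG} \,\}
\]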
The results are refined in further iterations. Due to monotonicity³, the results become smaller after each iteration and eventually stabilize after a finite number of iterations. In practice, a fixed point was reached after the second iteration for all programs checked, but it is possible to construct programs for which more iterations are required. After each iteration, concrete values for RS0 and RS1 are extracted from the RDA if possible and used in the next iteration. Consequently, RBA is conducted at least twice: the first time to collect reaching definitions for RS0 and RS1, and further times to refine the analysis results by actively using the previously extracted values of the register bank pointer for read and write accesses to working registers. For example, $R^i_{RS0}(\ell)$ holds the reaching definitions for the bit RS0 at a certain program location $\ell$, i.e., the set of all definitions $gen^i_{RDA}(\ell)$ detected for RS0 throughout the program. The definitions originate from the predecessors of $\ell$ in the CFG. In like manner, RBA also contributes to the precision of LVA by allowing more precise reasoning about which values are read in which locations. This enhanced precision is demonstrated using an example in the following section.

Example

To highlight the effectiveness of the introduced RBA, the analysis of an assembler program that alters the register bank pointer is described. In particular, the contribution of RBA to the precision of LVA is evaluated. The program is given in Listing 5.7.
³ A function $f : L_1 \to L_2$ between partially ordered sets $L_1 = (L_1, \sqsubseteq_1)$ and $L_2 = (L_2, \sqsubseteq_2)$ is monotone if $\forall l, l' \in L_1 : l \sqsubseteq_1 l' \Rightarrow f(l) \sqsubseteq_2 f(l')$. $\sqsubseteq$ denotes partial ordering [81].

 1-4  ...
 5    C:0x0105            RAM_CLR:
 6    C:0x0105  18            DEC   R0
 7    C:0x0106  7600          MOV   @R0,#0x00
 8    C:0x0108  E8            MOV   A,R0
 9    C:0x0109  70FA          JNZ   RAM_CLR
10    C:0x010B  E590          MOV   A,P1
11    C:0x010D  7006          JNZ   READ_P3
12    C:0x010F  ABA0          MOV   R3,P2
13    C:0x0111  0B            INC   R3
14    C:0x0112  02011E        LJMP  CONT
15    C:0x0115            READ_P3:
16    C:0x0115  75D008        MOV   PSW,#0x08
17    C:0x0118  ABB0          MOV   R3,P3
18    C:0x011A  0B            INC   R3
19    C:0x011B  75D000        MOV   PSW,#0x00
20    C:0x011E            CONT:
21    C:0x011E  E50B          MOV   A,0x0B
22    C:0x0120  2B            ADD   A,R3
23    C:0x0121  80FE          SJMP  $
24                        END
Listing 5.7: Example assembly code.

During the start-up code, the PSW is initialized with 0x00 (see source code lines 3-9); thus, the initial register bank is register bank 0 (RS0=0, RS1=0). In source code line 16, however, the register bank pointer is altered, and for lines 16-19 the active register bank is register bank 1 (RS0=0, RS1=1). For the remaining program, register bank 0 remains active. The results of the LVA are listed in Table 5.6. ISRs are not considered in the example in order to keep it simple and to focus on the main idea of the analysis. Functions and ISRs require the propagation of local analysis results to call sites, which makes the intermediate steps and results difficult to follow.

The results are evaluated in two ways. First, the results with the support of RBA are described. These results are then compared to the original analysis without the additional information gathered by RBA. The new analysis successfully generates the desired over-approximation and narrows the analysis results. For example, instead of adding registers $R_0\{3\}$, $R_1\{3\}$, $R_2\{3\}$, $R_3\{3\}$, and P3 to the set of live variables at line 17 (program location 0x118), the RBA reveals that the set of live variables at this location is only composed of $R_0\{3\}$ (working register 3 on bank 0) and P3. Hence, the number of live variables is reduced from 5 to 2, which is a significant improvement. Consequently, reducing the number of live variables increases the number of variables that can be marked as dead. The same applies to program location 0x10f, where the RBA successfully determines register bank 0 as active. Thus, instead of setting registers $R_0\{3\}$, $R_1\{3\}$, $R_2\{3\}$, $R_3\{3\}$, and P2 live, RBA reduces the set of live registers to $R_1\{3\}$ and P2.
Moreover, for the program locations 0x115, 0x118, 0x11a, and 0x11b, the analysis manages to recognize the change of the register bank pointer from 0 to 1, which is conducted by the instruction MOV [PSW, #0x08] in program location 0x115. Evaluating the resulting set of live variables in this example reveals that without the new RBA one is unable to make precise propositions about the register bank pointer
Figure 5.11: The corresponding CFG as generated with [mc]square for the assembly code in Listing 5.7.
configuration. In this case, the highest degree of over-approximation for working registers has to be applied, i.e., all four register bank combinations are added to the set of live variables (see Table 5.6). The RBA, however, significantly improves the LVA results. The actual contribution to state space reduction is difficult to quantify due to the strong interdependence of the analyses. It is simple to construct example codes where an enabled RBA leads to significant state space reductions. On the other hand, examples exist where RBA fails to further shrink the state space. To give an estimate for the example code at hand, the overall state space without static analysis consists of 263,683 states. With static analysis supported by RBA activated, the state space shrinks to 196,740 states, a reduction of approx. 25% for this specific example. Note that the large number of states results from the fact that the application reads three I/O ports.

PC      with RBA                             without RBA
0x000   R_0{3}, R_1{3}, P1, P2, P3           R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3
0x100   R_0{3}, R_1{3}, P1, P2, P3           R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3
0x103   R_0{3}, R_1{3}, P1, P2, P3           R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3
0x105   R_0{0,3}, R_1{3}, P1, P2, P3         R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3
0x106   R_0{0,3}, R_1{3}, P1, P2, P3         R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3
0x108   R_0{0,3}, R_1{3}, P1, P2, P3         R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3
0x109   R_0{0,3}, R_1{3}, P1, P2, P3, A      R_0{0,3}, R_1{0,3}, R_2{0,3}, R_3{0,3}, P1, P2, P3, A
0x10b   R_0{3}, R_1{3}, P1, P2, P3           R_0{3}, R_1{3}, R_2{3}, R_3{3}, P1, P2, P3
0x10d   R_0{3}, R_1{3}, P2, P3, A            R_0{3}, R_1{3}, R_2{3}, R_3{3}, P2, P3, A
0x10f   R_1{3}, P2                           R_0{3}, R_1{3}, R_2{3}, R_3{3}, P2
0x111   R_0{3}, R_1{3}                       R_0{3}, R_1{3}, R_2{3}, R_3{3}
0x112   R_0{3}, R_1{3}                       R_0{3}, R_1{3}, R_2{3}, R_3{3}
0x115   R_0{3}, P3                           R_0{3}, R_1{3}, R_2{3}, R_3{3}, P3
0x118   R_0{3}, P3                           R_0{3}, R_1{3}, R_2{3}, R_3{3}, P3
0x11a   R_0{3}, R_1{3}                       R_0{3}, R_1{3}, R_2{3}, R_3{3}
0x11b   R_0{3}, R_1{3}                       R_0{3}, R_1{3}, R_2{3}, R_3{3}
0x11e   R_0{3}, R_1{3}, A                    R_0{3}, R_1{3}, R_2{3}, R_3{3}
0x120   R_0{3}, A                            R_0{3}, R_1{3}, R_2{3}, R_3{3}, A
0x121   ∅                                    ∅

Table 5.6: Comparison of resulting live variables.

In summary, the described RBA is a powerful contribution to narrowing data-flow analysis results for the Intel MCS-51 architecture. It can be applied to a variety of programs and shows its precision whenever the compiler makes use of instructions involving register banks. For the Intel MCS-51 microcontroller, the RBA handles all of the 160 (out of 256) instructions that use register banks.
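The live-variable sets quoted for program locations 0x10f and 0x118 can be reproduced with a small backward data-flow sketch over the tail of Listing 5.7. The following Python code is an illustration only: the tuple encoding of instructions, the `PROGRAM` table, and the function `live_variables` are hypothetical, the active banks are entered by hand as if RBA had already resolved them, and flag updates in the PSW are ignored.

```python
ALL_BANKS = frozenset({0, 1, 2, 3})

# address: (plain reads, plain writes, working registers read,
#           working registers written, banks resolved by RBA, successors)
# A pair (b, k) stands for working register k on register bank b.
PROGRAM = {
    0x10F: ({"P2"}, set(), set(), {3}, {0}, [0x111]),      # MOV R3, P2
    0x111: (set(), set(), {3}, {3}, {0}, [0x112]),         # INC R3
    0x112: (set(), set(), set(), set(), {0}, [0x11E]),     # LJMP CONT
    0x115: (set(), {"PSW"}, set(), set(), {0}, [0x118]),   # MOV PSW, #0x08
    0x118: ({"P3"}, set(), set(), {3}, {1}, [0x11A]),      # MOV R3, P3
    0x11A: (set(), set(), {3}, {3}, {1}, [0x11B]),         # INC R3
    0x11B: (set(), {"PSW"}, set(), set(), {1}, [0x11E]),   # MOV PSW, #0x00
    # Direct address 0x0B is register 3 of bank 1, whatever bank is active.
    0x11E: ({(1, 3)}, {"A"}, set(), set(), {0}, [0x120]),  # MOV A, 0x0B
    0x120: ({"A"}, {"A"}, {3}, set(), {0}, [0x121]),       # ADD A, R3
    0x121: (set(), set(), set(), set(), {0}, [0x121]),     # SJMP $
}


def live_variables(program, use_rba):
    """Iterate the backward LVA equations until a fixed point is reached."""
    live_in = {addr: set() for addr in program}
    changed = True
    while changed:
        changed = False
        for addr, (reads, writes, reg_reads, reg_writes, banks, succs) in program.items():
            active = banks if use_rba else ALL_BANKS
            # Reads are generated for every bank that may be active.
            gen = set(reads) | {(b, k) for b in active for k in reg_reads}
            # Working registers are killed only if the bank is unambiguous.
            kill = set(writes)
            if len(active) == 1:
                kill |= {(b, k) for b in active for k in reg_writes}
            live_out = set().union(*(live_in[s] for s in succs))
            new_in = gen | (live_out - kill)
            if new_in != live_in[addr]:
                live_in[addr] = new_in
                changed = True
    return live_in
```

Calling `live_variables(PROGRAM, use_rba=True)[0x118]` gives the two-element set `{(0, 3), "P3"}`, while the same location without RBA carries five live entries, matching the reduction from 5 to 2 discussed for Table 5.6.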
the correct order. For the SA [9], the adaptations needed for the Intel MCS-51 target were minor and are thus not elaborated in this thesis.
Figure 5.12: The principle of PR.

In microcontroller code, such single-successor chains are found, for example, in ISRs. Although this abstraction contributes greatly to state space reduction, a minor drawback exists: the validity of the CTL neXt operator is not preserved because successor chains are compressed into single states. Thus, using path reduction restricts the applicable CTL statements and may lead to incomplete counterexamples that are difficult to understand. The subset of statements that can be used in combination with PR is called CTL-X, as described by Yorav and Grumberg in [91]. For the process of collapsing multiple successors into a single state, a set of rules for path reduction for the Intel MCS-51 target was established:
1. PR cannot be applied if the given CTL specification makes use of the neXt operator.
2. PR cannot be applied if one of the successors is an ISR. Similar to rule #4.
3. Any state in which a register is written that is part of the CTL specification cannot be collapsed, since the model checking algorithm has to evaluate the value of this register at this state in order to prove or falsify the specification.
4. Any branch in the CFG determines an end point such as S4 in Figure 5.12. Thus, path reduction must preserve the full control flow of the program.

The idea of PR is implemented in two components of [mc]square, i.e., the C51Simulator and the Path Compressor. The C51Simulator part is called the dynamic component and the Path Compressor part the static component of PR. The Path Compressor iterates over the nodes in the CFG and tracks whether a node writes a memory location involved in the CTL formula, thus covering rule #3. The obtained results are back-annotated into the CFG. Indirect control flow and successors branching to ISRs are detected on the fly during state space building by the C51Simulator, covering rule #2 and rule #4. Finally, rule #1 is checked by the input parser.
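The static part of these rules can be illustrated on a plain successor map. The sketch below is a simplified Python model, not the Path Compressor itself: `collapse_chains`, the `protected` set (standing in for nodes covered by rules #2 and #3), and the treatment of branch nodes as chain end points (rule #4) are assumptions made for illustration.

```python
def collapse_chains(succ, entry, protected=frozenset()):
    """Group single-successor chains of a CFG.

    succ maps every node to its list of successors. Returns a dict that maps
    each surviving node to the list of nodes collapsed into it.
    """
    preds = {node: [] for node in succ}
    for node, targets in succ.items():
        for target in targets:
            preds[target].append(node)

    def absorbable(node):
        # A node may be merged into its predecessor only if it is not
        # protected, is not a branch or end node itself, and is the sole
        # successor of its sole predecessor.
        if node == entry or node in protected:
            return False
        if len(succ[node]) != 1 or len(preds[node]) != 1:
            return False
        pred = preds[node][0]
        return pred != node and len(succ[pred]) == 1

    chains = {}
    for head in succ:
        if absorbable(head):
            continue
        chain = [head]
        current = head
        while len(succ[current]) == 1 and absorbable(succ[current][0]):
            current = succ[current][0]
            chain.append(current)
        chains[head] = chain
    return chains
```

For a graph shaped like Figure 5.12, with S1, S2, S3 in sequence and S4 branching to S5 and S6, the call `collapse_chains(graph, entry="S1")` groups S2 and S3 under S1 and keeps S4 as its own state.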
addressing mode, which provides flexibility for the compiler. For the Intel MCS-51, the working registers R0 or R1 may serve as base registers for indirect addressing. Again, when executing instructions such as MOV [A, @R0], it is the register bank pointer that selects the corresponding base register from one of the four register banks. Thus, the aforementioned RBA is used for resolving the actual base register. Without actually executing the code prior to the instruction that uses indirect addressing, it is in most cases not trivial to predict the content of the base register. This uncertainty forces our analysis to generate a conservative over-approximation. Consider the assembly snippet depicted in Listing 5.8.
Listing 5.8: Intel MCS-51 assembly snippet.

In fact, resolving the destination register is a rather challenging problem for the static analysis. The destination register is given by the value held by R0 in line 5 of Listing 5.8. To resolve the actual value, one has to consider the following:

(i) The actual values of memory locations 0x25 and 0x26 need to be detected. In order to do so, one has to trace back the operations performed on these memory locations until one can reason about the circumstances under which the actual values are generated.

(ii) The exact semantics of the involved instructions are required. Although the effect of instructions such as CLR and MOV on the memory content of the microcontroller seems obvious, determining it is a challenging task for complex instructions such as ADD, at least without explicitly executing the instruction, for instance, by a target platform simulator.

(iii) Embedded systems communicate actively with their environment; thus, various interrupt sources are likely to interfere with the execution of the main process. Special care has to be taken in this case. Interrupt handlers may alter the values of memory locations 0x25 and 0x26, or even the value of the working register R0. This situation becomes more challenging on target architectures supporting nested interrupts.

(iv) Even though the presented assembly code does not contain any branching instructions, the remaining program memory may contain direct and indirect jumps targeting any of the program locations stated in Listing 5.8. For instance, a branching instruction may target the program location holding MOV [R0, A] with an entirely different register configuration compared to the sequential execution of the program fragment.
statically computable, such as target addresses of indirect branches, the analysis framework is forced to generate a conservative over-approximation. For example, the unconditional indirect jump statement JMP [@A+DPTR] would add edges to all program locations reachable by the JMP [@A+DPTR] instruction. Indirect branches to dynamically calculated targets are commonly used by the compiler in order to generate optimized code. As a matter of fact, highly optimizing compilers are used in the embedded systems domain due to prevailing resource constraints. An interesting aspect of indirect control flow is that the target addresses of branch instructions can originate either (i) from the environment or (ii) from lookup tables stored in the program memory. The latter is the more common case, since reading branch target addresses from the environment is rarely found in real-life applications.
 1  void main() {
 2      ...
 3      switch (var) {
 4          case 0xAA:
 5          case 0xBB:
 6          case 0xCC:
 7          case 0xDD:
 8          case 0xEE:
 9          case 0xFF:
10          case 0xC0:
11          default:
12      }
13      while (1);
14  }
Listing 5.9: C source code containing switch statement.

Lookup tables, among others, are used by the compiler to realize switch-case statements as shown in Listing 5.9. The program in Listing 5.10 shows the resulting assembler code generated by the Keil C51 Compiler v8.01, without any optimizations enabled. The assembler routine C?CCASE is called from the main method to achieve the behavior required for the switch statement (cf. Listing 5.9, line 14). The call of the subroutine (indirectly) pushes the current program counter value (0x0808) onto the stack. Thereafter, the new data pointer value of 0x0808 is loaded from the stack by two consecutive POP statements. Next, the Accumulator holding the value of variable var1 is copied into working register R0, and the Accumulator is cleared afterwards. Line 7 fetches a byte from program memory at address C:0x0808. Listing 5.11 presents a memory dump of the particular program memory section, revealing that the instruction in line 7 actually reads the byte 0x08. Since this byte is non-zero, the conditional jump to address C:0x0860 is taken. The instructions at source lines 22 and 23 load the comparison value 0xAA for the first case branch (cf. Listing 5.9, line 14). The value residing in R0 and the comparison value in the Accumulator are XORed, thus carrying out a comparison of the two values. If the two values are equal, the Accumulator is zero after the comparison. In the first run, the comparison value (0xAA) does not match the value of variable var1 (0xC0); therefore, program flow reaches lines 26 to 29, which increment the data pointer by three bytes and jump back to line 6. The aforementioned sequence is repeated for the comparison values 0xBB and 0xC0, respectively. Both do not match the value of variable var1. Next, the program code is executed with the comparison value of 0xC0, which matches the actual value of variable var1. The comparison of the two values in line 24 leaves the accumulator cleared in line 25, thus forcing a jump to program location C:0x0855 (cf. Listing 5.10, line 14). Lines 14 to 17 load the address of the corresponding function C:0x082B, and the following two instructions write it to the data pointer (DPTR). Finally, the indirect jump in line 21 branches to the selected function void foo3(void), which resides at program address C:0x082B (cf. Listing 5.12, line 3).
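The table walk performed by C?CCASE can be mimicked in a few lines of Python. This is a hedged sketch rather than a model of the Keil runtime: the byte sequence in `TABLE` assumes that the dump in Listing 5.11 is read row by row starting at C:0x0808, the entry layout (address high byte, address low byte, comparison value, terminated by two zero bytes followed by the default address) is inferred from Listing 5.10, and the name `resolve_switch_target` is hypothetical.

```python
# Program memory starting at C:0x0808, as read from the dump (assumed order).
TABLE = bytes([
    0x08, 0x21, 0xAA,
    0x08, 0x26, 0xBB,
    0x08, 0x3F, 0xC0,
    0x08, 0x2B, 0xCC,
    0x08, 0x30, 0xDD,
    0x08, 0x35, 0xEE,
    0x08, 0x3A, 0xFF,
    0x00, 0x00,        # end-of-table marker
    0x08, 0x42,        # address used when no case matches
])


def resolve_switch_target(table, value):
    """Return the branch target selected for `value` by walking the table."""
    offset = 0
    while True:
        high, low = table[offset], table[offset + 1]
        if high == 0 and low == 0:
            # Both address bytes are zero: take the default address that follows.
            return (table[offset + 2] << 8) | table[offset + 3]
        if table[offset + 2] == value:
            # XRL A, R0 yields zero: the entry address becomes the new DPTR.
            return (high << 8) | low
        offset += 3  # corresponds to the three INC DPTR instructions
```

A static analysis that knew this layout could enumerate all targets of the JMP [@A+DPTR] instruction by calling the function for every table entry instead of adding edges to all program locations.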
 1  C:0x0805  120845      LCALL  C?CCASE(C:0845)
 2                    C?CCASE:
 3  C:0x0845  D083        POP    DPH(0x83)
 4  C:0x0847  D082        POP    DPL(0x82)
 5  C:0x0849  F8          MOV    R0,A
 6  C:0x084A  E4          CLR    A
 7  C:0x084B  93          MOVC   A,@A+DPTR
 8  C:0x084C  7012        JNZ    C:0860
 9  C:0x084E  7401        MOV    A,#0x01
10  C:0x0850  93          MOVC   A,@A+DPTR
11  C:0x0851  700D        JNZ    C:0860
12  C:0x0853  A3          INC    DPTR
13  C:0x0854  A3          INC    DPTR
14  C:0x0855  93          MOVC   A,@A+DPTR
15  C:0x0856  F8          MOV    R0,A
16  C:0x0857  7401        MOV    A,#0x01
17  C:0x0859  93          MOVC   A,@A+DPTR
18  C:0x085A  F582        MOV    DPL(0x82),A
19  C:0x085C  8883        MOV    DPH(0x83),R0
20  C:0x085E  E4          CLR    A
21  C:0x085F  73          JMP    @A+DPTR
22  C:0x0860  7402        MOV    A,#0x02
23  C:0x0862  93          MOVC   A,@A+DPTR
24  C:0x0863  68          XRL    A,R0
25  C:0x0864  60EF        JZ     C:0855
26  C:0x0866  A3          INC    DPTR
27  C:0x0867  A3          INC    DPTR
28  C:0x0868  A3          INC    DPTR
29  C:0x0869  80DF        SJMP   C:084A
Listing 5.10: Switch Statement Assembler code snippet. The corresponding lookup table assembled by the compiler contains the comparison values AA, BB, C0, CC, DD, EE, and FF (see Listing 5.11) for the case statements as well as the entry addresses of the called functions (see Listing 5.11 and Listing 5.12).
08 21 AA 08 26 BB 08 3F
C0 08 2B CC 08 30 DD 08
35 EE 08 3A FF 00 00 08
42 xx xx xx xx xx xx xx
Listing 5.11: Program memory content.

Again, at least for this particular case, it seems feasible to reason about actual target locations by searching for the pattern of comparison values in the program memory. Such naive approaches, however, may work for a very limited number of configurations, but they
heavily depend on the compiler version, optimization levels, etc. Most importantly, they are only applicable to a certain target architecture. Consequently, a more holistic approach needs to be found to overcome the problem of generating a precise CFG for programs containing indirect control flow.
1  void foo1(void);
2  void foo2(void);
3  void foo3(void);
4  void foo4(void);
5  void foo5(void);
6  void foo6(void);
7  void foo7(void);
8  while (1);
locations prior to the analysis, or by directly specifying upper loop bounds for constructs that cannot be analyzed automatically. Considering the enormous amount of research already performed on detecting loop bounds, we are eager to reuse the existing knowledge. For our particular purpose, we are interested in the following cases:

(i) For a precise pointer analysis, a narrow approximation of loop bounds is required. This necessitates taking the effects of complete sequences of instructions into account, as the conditions of branching instructions are frequently computed in a sequence of instructions.

(ii) Other techniques such as program slicing [96, 77] require the detection of loop termination. During program slicing for model checking, instructions that have no influence on the validity of a specification are removed from the program. Divergent behavior, i.e., non-termination, of the original program must remain visible in the sliced program. Hence, loops for which termination cannot be proven cannot be sliced. Moreover, statements that influence loop conditions cannot be sliced, which strongly affects the sizes of program slices.

In summary, a combination of detecting loop bounds and loop termination is required in order to compute precise results for slicing.
5.3.5 Summary
Even though application-tailored analysis methods, such as RBA, allow a significant narrowing of data-flow analyses, there are still difficulties to overcome in static analysis of assembly code. A scalable and precise pointer analysis would be a major boost for analysis precision within [mc]square. Consequently, further research is required to develop a widely generic approach for resolving indirect read and write accesses at the assembly code level. One requirement for such an approach is that only minor modifications to the analysis are needed to take the peculiarities of different target architectures into account. Such a framework could also be used for predicting actual target addresses at the assembly code level in order to accomplish CFG building for programs featuring indirect control.
In what follows, a real-life industry case study is conducted with the [mc]square model checker. The case study focuses on (i) assessing the feasibility of [mc]square when applied to real-life embedded applications, (ii) evaluating the effects of the implemented abstraction techniques on the resulting state space size, and (iii) identifying future research directions. First, an introduction and a motivation for the case study are given. Next, the hardware and software components of the application are presented. Later on, the communication protocol used is sketched. Then, temporal logic properties are postulated that correspond to the given textual specification. Finally, results and findings are presented.
in Aachen, Germany. Texion Software Solutions' core business is the development of individual, application-tailored embedded hardware and software solutions. Their product portfolio includes ProFab, a software system for production data acquisition. Their customers use the software system for networking textile knitting machines. A textile knitting machine produces various types of knitted fabrics of varying degrees of complexity. Modern knitting machines usually contain highly complex electronics controlling the needles and the yarn. Figure 6.1(a) shows such a machine. As in every industrial application, software reliability is a major concern. Thus, formal verification of the knitting machine's software is worthwhile, since failures caused by software faults are costly in terms of production losses and the associated additional maintenance effort. The target application of the case study is the software for a knitting machine monitoring device, as shown in Fig. 6.1(b). The source code for the knitting machine monitoring device was selected for the case study based on the following considerations:

Good conformance of the application with the applications we are aiming at. The application targets the Intel MCS-51 microcontroller, uses several on-chip peripheral modules, interacts with its environment, and makes use of various interrupt sources.

Commonality of interests with the developers and their willingness to cooperate. Our industry partner supported us by providing the full source code and a sample device. Furthermore, we can access all accompanying documents such as the software specification and the hardware schematics.

Criticality and the need for high reliability. As mentioned above, flawless software is crucial, since every malfunction is costly in industrial practice.

Complexity. The number of source code lines is within our reach, i.e., from the conceptual point of view, [mc]square is able to handle applications of this size. It should be noted that the source code line count is a rather unsuitable indicator of whether an application can be successfully model checked or whether the model checker will run out of resources (state-explosion problem) while examining the code. It is almost solely the source code complexity that is crucial.
A Schmitt trigger is a comparator circuit that incorporates positive feedback. When the input is higher than a certain threshold, the output is high. When the input is below another (lower) threshold, the output is low. When the input is between the two, the output retains its value [97].
(b) Knitting machine monitoring device.
corresponding I/O pins of the microcontroller (four pins of Port 3 and four pins of Port 1).

Microcontroller: executes the software subject to verification.

Serial interface: provides the physical link to the host application through an RS232 interface.

Host application: uses the data gathered by the monitoring device for further processing.

Miscellaneous (not depicted in Figure 6.2):

Watchdog module: a hardware timing module that triggers the reset input of the microcontroller in case of a fault condition. The fault condition is reached if the watchdog hardware timer overflows. The timer overflow can be avoided if the microcontroller application resets the watchdog module periodically.

Power and clock generation: provides inverse-polarity protection and a filtered and stabilized 5 V power supply. Furthermore, this module contains a quartz-controlled clock generation.

Light Emitting Diode (LED) module: operates three LEDs, signaling serial communication traffic and the liveness of the application.

Potential separation: uses a photo-coupler for electrical isolation and performs the needed voltage level adjustment.
[Figure: block diagram with the blocks Knitting machine, Knitting machine monitoring device (Microcontroller 87C51, 12 MHz; Watchdog; Serial interface; Input module; Power supply; LED module), and Host application.]
[Figure 6.4 shows a background part, void main(void) { InitBoard(); SendTxt(STX,'R',ETX); SetTime(); while(1) { Watchdog(); ... with its main loop, and a foreground part, Serial ISR().]
Figure 6.4: The software components.

From the conceptual point of view, the source code can be divided into five building blocks (cf. Figure 6.4):

Revolutions Per Minute (RPM) module: manages two external interrupts to count pulses from external rotary encoders. It initializes the two interrupt sources and defines their interrupt priority. The pulse count is internally mapped to a 16-bit unsigned data type.

Timer module: uses the Timer 0 peripheral module of the microcontroller to provide a system tick and four software timers. Furthermore, it provides trivial functions for time management like reset(), set(), and get_time().

State machine: implements the serial communication protocol and performs the needed housekeeping.

Serial interface module: initializes the serial communication device of the microcontroller to 9600 Baud and uses dedicated circular buffers for managing receive and transmit queues. It provides methods for sending and receiving characters.

Input/output module: reads the eight input ports for the monitoring function and handles the watchdog reset. Furthermore, it toggles the liveness LED.
The full source code of the case study consists of about 600 lines of C-code (i.e., 1400 lines of assembly code).
Figure 6.5: A software circular buffer model.

The read pointer indicates the element that is read next, and the write pointer determines the location that will be filled with the next character. Altogether, the case study uses two dedicated circular buffer structures, i.e., one for receiving and one for sending characters. The C code macros and the initialization calls are given in Listing 6.1.
/* Header macro */
#define RingBuffer(Name, DataType, IndexType, Exp, Attribute) \
struct {\
    IndexType ReadIndex;\
    IndexType WriteIndex;\
    IndexType Mask;\
    DataType Buffer[1<<Exp];\
} Attribute Name = {0, 0, (1 << (Exp)) - 1}

/* Ring buffer initialization */
RingBuffer(RxBuffer, char, word, 2, );  /* 4 char RingBuffer for receiver */
RingBuffer(TxBuffer, char, word, 2, );  /* 4 char RingBuffer for transmitter */
Listing 6.1: Ring buffer C code macro.

Considering the initialization code in Listing 6.1, it is easily seen that 4-byte-wide buffers are used. According to Table 6.1, a single RingBuffer element consists of 10 bytes altogether. As [mc]square reads and parses the relevant debug information, it allows C-code variable names to be included in the temporal specification, i.e., the CTL formulas. Thus, the column Formula name in Table 6.1 refers to the actual expression that is used within the CTL formulas.
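The role of the Mask field can be made concrete with a small executable model. The Python sketch below mirrors the fields of the macro in Listing 6.1, but the `put` and `get` operations and the convention of keeping one slot free to distinguish a full from an empty buffer are assumptions for illustration; the application's actual access routines are not shown in the source.

```python
class RingBuffer:
    """Power-of-two ring buffer whose indices wrap by masking."""

    def __init__(self, exp):
        self.read_index = 0
        self.write_index = 0
        self.mask = (1 << exp) - 1      # e.g. exp = 2 gives mask 0b11
        self.buffer = [0] * (1 << exp)

    def put(self, item):
        """Store one item; return False if the buffer is full."""
        next_write = (self.write_index + 1) & self.mask
        if next_write == self.read_index:
            return False                # assumed convention: one slot stays free
        self.buffer[self.write_index] = item
        self.write_index = next_write
        return True

    def get(self):
        """Remove and return the oldest item, or None if the buffer is empty."""
        if self.read_index == self.write_index:
            return None
        item = self.buffer[self.read_index]
        self.read_index = (self.read_index + 1) & self.mask
        return item
```

Because the buffer length is a power of two, the bit-wise AND with the mask replaces a modulo operation, which is cheap on an 8-bit target such as the MCS-51.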
Element                    Length [byte]
IndexType ReadIndex        2
IndexType WriteIndex       2
IndexType Mask             2
DataType Buffer[1<<Exp]    4
Sum                        10
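The layout above and the role of the mask can be illustrated with a small host-side sketch. It is not the code of the case study: it assumes that word is a 16-bit unsigned type (uint16_t here), and the helpers rb_put and rb_get as well as the policy of leaving one slot unused to tell a full buffer from an empty one are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t word;              /* assumption: 16-bit index type */

#define RB_EXP 2                    /* 1 << 2 = 4 buffer elements */

typedef struct {
    word ReadIndex;                 /* element that is read next */
    word WriteIndex;                /* element that is written next */
    word Mask;                      /* (1 << RB_EXP) - 1 = 0x0003 */
    char Buffer[1 << RB_EXP];
} RingBuffer4;

/* Initializer mirroring the macro: both indices 0, mask 3. */
#define RING_BUFFER4_INIT { 0, 0, (1 << RB_EXP) - 1, { 0 } }

/* Hypothetical helper: stores one character, returns 0 if the buffer is full. */
int rb_put(RingBuffer4 *rb, char c)
{
    word next = (word)((rb->WriteIndex + 1) & rb->Mask);
    if (next == rb->ReadIndex) {
        return 0;                   /* full: one slot stays unused */
    }
    rb->Buffer[rb->WriteIndex] = c;
    rb->WriteIndex = next;
    return 1;
}

/* Hypothetical helper: fetches one character, returns 0 if the buffer is empty. */
int rb_get(RingBuffer4 *rb, char *out)
{
    if (rb->ReadIndex == rb->WriteIndex) {
        return 0;                   /* empty */
    }
    *out = rb->Buffer[rb->ReadIndex];
    rb->ReadIndex = (word)((rb->ReadIndex + 1) & rb->Mask);
    return 1;
}
```

Because both indices are only ever advanced through the mask, they stay below 4 and their high bytes stay 0. On a typical host compiler the structure also occupies the 10 bytes listed in the table, although padding is compiler dependent.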
[Figure: message exchange between master and slave. Master requests: $ D #, $ R #, $ Z #, $ E #, $ V #. Slave responses: $ D RPM1 RPM2 #, $ R #, $ Z CNT1 CNT2 #, $ E INP #.]
Command  # Bytes  Slave response                                             # Bytes  Comment
$ D #    3        [RPM1 RPM2] revolutions per minute [2 bytes]               5        returns the current revolutions per minute
$ R #    3        system reset [0 bytes]                                     3        resets the device
$ Z #    3        [CNT1 CNT2] counter value [2 bytes]                        5        returns the current pulse counter value
$ E #    3        [INP] input representation [1 byte]                        4        returns the current input representation
$ V #    3        [VER1 VER2 VER3] version string [3 bytes]                  6        returns the software version number as string
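The frame lengths in the table follow directly from the payload sizes: one start byte, one command byte, the payload, and one stop byte. A minimal sketch of this arithmetic is given below. The functions payload_length and build_response are hypothetical and only illustrate the framing; the byte values '$' and '#' for STX and ETX are taken from the protocol description, and the byte order of the payload is left to the caller.

```c
#include <stddef.h>

#define STX '$'   /* start byte of every frame */
#define ETX '#'   /* stop byte of every frame */

/* Number of payload bytes in the slave response for a command byte,
   or -1 for an unknown command. */
int payload_length(char command)
{
    switch (command) {
    case 'D': return 2;   /* RPM1 RPM2 */
    case 'R': return 0;   /* reset, no payload */
    case 'Z': return 2;   /* CNT1 CNT2 */
    case 'E': return 1;   /* INP */
    case 'V': return 3;   /* VER1 VER2 VER3 */
    default:  return -1;
    }
}

/* Hypothetical helper: writes STX, command, payload, ETX into out
   (at least 6 bytes) and returns the frame length, or 0 on error. */
size_t build_response(char command, const unsigned char *payload,
                      unsigned char *out)
{
    int n = payload_length(command);
    size_t i = 0;
    int k;

    if (n < 0) {
        return 0;
    }
    out[i++] = (unsigned char)STX;
    out[i++] = (unsigned char)command;
    for (k = 0; k < n; k++) {
        out[i++] = payload[k];
    }
    out[i++] = (unsigned char)ETX;
    return i;
}
```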
Variable             Scope   Initial  Length  Comment

Variables used by the target application (excerpt)
Revolutions          global  0xffff   2 byte  Holds the current RPM
RxBuffer_i           global  0x00     1 byte  i-th byte of receive buffer memory area
RxBuffer_1           global  0x00     1 byte  The receive circular buffer read pointer
RxBuffer_4           global  0x00     1 byte  The receive circular buffer write pointer
TxBuffer_j           global  0x00     1 byte  j-th byte of transmit buffer memory area
TxBuffer_1           global  0x00     1 byte  The transmit circular buffer read pointer
TxBuffer_4           global  0x00     1 byte  The transmit circular buffer write pointer
Command_state        local   0x00     1 byte  Holds the actual state of the state machine

Supplementary variables inserted for model checking
startUpCodeFinished  global  0x00     1 byte  Set to 1 when main() is entered
mark                 global  0x00     1 byte  Serves as marker when entering certain PC locations

Table 6.3: Case study variables and their meaning.
Property # (explanation)
Textual representation: As stated in the textual specification of the knitting machine monitoring device application.
CTL representation ([mc]square notation): The postulated CTL formula. The formula is given in exactly the same notation as it is used as input to [mc]square.
Comment: Additional information and explanation of the CTL formula.

Property #1
Textual representation: The variable Revolutions is initialized to 0xffff.
CTL representation ([mc]square notation): (AG (startUpCodeFinished=0 & Revolutions=0x0000 → A Revolutions=0x0000 U Revolutions=0xff00) & AG (startUpCodeFinished=0 & Revolutions=0xff00 → A Revolutions=0xff00 U Revolutions=0xffff) & EF Revolutions=0x0000 & EF Revolutions=0xff00)
Comment: If variable Revolutions is 0x0000, then it remains 0x0000 until it becomes 0xff00. If variable Revolutions is 0xff00, then it remains 0xff00 until it becomes 0xffff. There is a path where variable Revolutions is 0x0000 and there is a path where Revolutions is 0xff00. The initialization must be completed before the startup code is left.

Property #1a
Textual representation: The variable Revolutions is initialized to 0xffff.
CTL representation ([mc]square notation): AF Revolutions=0xffff & startUpCodeFinished=1
Comment: On all paths the variable Revolutions has the value 0xffff when the startup code is left.

Property #1b
Textual representation: The variable Revolutions is initialized to 0xffff.
CTL representation ([mc]square notation): AF Revolutions=0xffff
Comment: On all paths there is a state where variable Revolutions is set to 0xffff.
Property #2
Textual representation: Initialization of the receive circular buffer.
CTL representation ([mc]square notation): AF(RxBuffer_0=0x00 & RxBuffer_1=0x00 & RxBuffer_2=0x00 & RxBuffer_3=0x00 & RxBuffer_4=0x00 & RxBuffer_5=0x03 & RxBuffer_6=0x00 & RxBuffer_7=0x00 & RxBuffer_8=0x00 & RxBuffer_9=0x00 & startUpCodeFinished=1)
Comment: On all paths within the startup code there is finally a state where all bytes of RxBuffer_i with i = {0 . . . 9} are initialized to 0x00, except RxBuffer_5, which is initialized to 0x03 since it acts as the circular buffer mask. See Listing 6.1 for details.

Property #3
Textual representation: Initialization of the transmit circular buffer.
CTL representation ([mc]square notation): AF(TxBuffer_0=0x00 & TxBuffer_1=0x00 & TxBuffer_2=0x00 & TxBuffer_3=0x00 & TxBuffer_4=0x00 & TxBuffer_5=0x03 & TxBuffer_6=0x00 & TxBuffer_7=0x00 & TxBuffer_8=0x00 & TxBuffer_9=0x00 & startUpCodeFinished=1)
Comment: On all paths within the startup code there is finally a state where all bytes of TxBuffer_j with j = {0 . . . 9} are initialized to 0x00, except TxBuffer_5, which is initialized to 0x03 since it acts as the circular buffer mask. See Listing 6.1 for details.

Property #4
Textual representation: It is possible to reach the sleep state of the application where the application idles in an endless loop.
CTL representation ([mc]square notation): EF mark=MARK_SLEEP
Comment: There is a state where the application reaches the sleep state. Note that this formula can also be expressed by involving the PC in the property, such as AG (EF PC=0xc0ffee). For the sake of clarity, however, the variable mark is introduced to allow self-explanatory CTL expressions.
Property #5
Textual representation: It is possible to reach the send version state of the application where the application sends its version string to the host application.
CTL representation ([mc]square notation): EF mark=MARK_SENDVERSION
Comment: There is a state where the application reaches the send version state.

Property #6
Textual representation: It is possible to reach the send inputs state of the application where the application sends the actual value of the digital input lines to the host application.
CTL representation ([mc]square notation): EF mark=MARK_SENDINPUTS
Comment: There is a state where the application reaches the send inputs state.

Property #7
Textual representation: It is possible to reach the send pulse count state of the application where the application sends the actual value of the pulse counter to the host application.
CTL representation ([mc]square notation): EF mark=MARK_SENDPULSCNT
Comment: There is a state where the application reaches the send pulse count state.

Property #8
Textual representation: It is possible to reach the send RPM state of the application where the application sends the actual RPM value to the host application.
CTL representation ([mc]square notation): EF mark=MARK_SENDRPM
Comment: There is a state where the application reaches the send RPM state.
Property #9
Textual representation: The default path of the receiver state machine in function readCommand() (cf. Listing 6.2) is executed at least once.
CTL representation ([mc]square notation): EF mark=MARK_DEFAULT
Comment: There is a state where the default path of the switch statement in function readCommand() is executed.

Property #10
Textual representation: The receiver state machine may only reside in states 0, 1, or 2. All other states are invalid.
CTL representation ([mc]square notation): Inv:(Command_state=0 | Command_state=1 | Command_state=2)
Comment: On all paths the actual state of the state machine is either 0, 1, or 2. The term Inv stands for invariant model checking. An equivalent expression of this formula is AG (Command_state=0 | Command_state=1 | Command_state=2).

Property #11
Textual representation: Changes in the receiver state machine can only follow the patterns 0 → 1 → 2 → 0 or 0 → 1 → 0. All other transitions are invalid.
CTL representation ([mc]square notation): (AG (Command_state=0 → A Command_state=0 U Command_state=1 | Command_state=0) & AG (Command_state=1 → A Command_state=1 U Command_state=0 | Command_state=2 | Command_state=1) & AG (Command_state=2 → A Command_state=2 U Command_state=0 | Command_state=2) & AF Command_state=0)
Comment: If Command_state = 0, it remains 0 until it changes to 1; if Command_state = 1, it remains 1 until it changes to 0 or 2; if Command_state = 2, it remains 2 until it changes to 0. On all paths Command_state eventually becomes 0, and the receiver state machine may always remain in its current state.
Property #12
Textual representation: The serial receive circular buffer read and write pointers may never exceed the circular buffer bounds.
CTL representation ([mc]square notation): AG (RxBuffer_1<4 & RxBuffer_0=0 & RxBuffer_3<4 & RxBuffer_4=0)
Comment: RxBuffer_1 (the read pointer low byte) and RxBuffer_3 (the write pointer low byte) are on all paths lower than the circular buffer bound, i.e., 4 bytes. Moreover, the high bytes of the read and write pointers (RxBuffer_0 and RxBuffer_4) remain 0.

Property #13
Textual representation: The serial transmit circular buffer read and write pointers may never exceed the circular buffer bounds.
CTL representation ([mc]square notation): AG (TxBuffer_1<4 & TxBuffer_0=0 & TxBuffer_3<4 & TxBuffer_4=0)
Comment: TxBuffer_1 (the read pointer low byte) and TxBuffer_3 (the write pointer low byte) are on all paths lower than the circular buffer bound, i.e., 4 bytes. Moreover, the high bytes of the read and write pointers (TxBuffer_0 and TxBuffer_4) remain 0.

Property #14
Textual representation: The microcontroller application sends $ R # to the host application after power-up.
CTL representation ([mc]square notation): EF (TxBuffer_6=$ & TxBuffer_7=R & TxBuffer_8=# & TxBuffer_9=0)
Comment: There is a path where the transmit circular buffer is filled with the sequence $ R #.
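Properties #4 to #9 rely on the supplementary variable mark being written at selected program locations. A minimal sketch of such an instrumentation is shown below. The numeric marker values and the function names enter_sleep_state and send_version are hypothetical, since the actual constants and call sites of the case study are not given in the text.

```c
/* Hypothetical marker values; only the names appear in the properties. */
enum {
    MARK_NONE        = 0,
    MARK_SLEEP       = 1,
    MARK_SENDVERSION = 2
};

/* Global marker observed by the model checker (initial value 0x00). */
volatile unsigned char mark = MARK_NONE;

/* Hypothetical location: the application is about to idle. */
void enter_sleep_state(void)
{
    mark = MARK_SLEEP;          /* makes EF mark=MARK_SLEEP checkable */
}

/* Hypothetical location: the version string is about to be sent. */
void send_version(void)
{
    mark = MARK_SENDVERSION;    /* makes EF mark=MARK_SENDVERSION checkable */
}
```

Writing a named marker keeps the formulas independent of concrete program counter values, which change with every rebuild of the binary.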
6.4.3 Comments
It is notable that property #1 reveals one of the major strengths of our assembly code model checking approach. As the verification process is based on machine instructions, it is even possible to verify the exact initialization sequence of the 16-bit variable Revolutions. Property #1 requires that the high byte (located at the higher address) is initialized first and then the low byte is initialized, as is usual on little-endian processor architectures. In contrast, a model checker targeting C code is in most cases not able to make assumptions at byte or memory location granularity because it lacks details about the target platform. Property #1a is a property suited for C code model checkers. However, as the property AF Revolutions=0xffff & startUpCodeFinished=1 only verifies that variable Revolutions will eventually reach the value 0xffff within the startup sequence, it excludes the details of how the initialization of Revolutions is accomplished.
For example, it is possible that the variable Revolutions is, for whatever reason, first set to 0xfc0f and later set to 0xffff. In this case, property #1a will evaluate to true, since the initialization sequence is not sufficiently specified. However, as [mc]square allows CTL properties to refer to single memory locations, property #1 will evaluate to false and show the erroneous behavior within the startup code as a counterexample. The same applies to properties #12 and #13.
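The difference between the two properties can be illustrated on a single recorded trace of Revolutions values. The sketch below is not a CTL model checker: it only inspects one execution trace, and the function names eventually_ffff and exact_init_sequence are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Weak check in the spirit of property #1a: some state holds 0xffff. */
int eventually_ffff(const uint16_t *trace, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (trace[i] == 0xffffu) {
            return 1;
        }
    }
    return 0;
}

/* Strict check in the spirit of property #1: the only values allowed are
   0x0000, then 0xff00, then 0xffff, in exactly this order. */
int exact_init_sequence(const uint16_t *trace, size_t n)
{
    static const uint16_t expected[3] = { 0x0000u, 0xff00u, 0xffffu };
    size_t phase = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (trace[i] == expected[phase]) {
            continue;                    /* value unchanged */
        }
        if (phase < 2 && trace[i] == expected[phase + 1]) {
            phase++;                     /* next byte was written */
            continue;
        }
        return 0;                        /* unexpected intermediate value */
    }
    return phase == 2;
}
```

For the trace 0x0000, 0xfc0f, 0xffff the weak check succeeds, whereas the strict check rejects the unexpected intermediate value 0xfc0f.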
Property #5a
Textual representation: It is always possible to reach the send version state of the application where the application sends its version string to the host application.
CTL representation ([mc]square notation): AG(EF mark=MARK_SENDVERSION)
Comment: From every reachable state there is a path on which the application finally reaches the send version state.

Property #6a
Textual representation: It is always possible to reach the send inputs state of the application where the application sends the actual value of the digital input lines to the host application.
CTL representation ([mc]square notation): AG(EF mark=MARK_SENDINPUTS)
Comment: From every reachable state there is a path on which the application finally reaches the send inputs state.
Property #7a
Textual representation: It is always possible to reach the send pulse count state of the application where the application sends the actual value of the pulse counter to the host application.
CTL representation ([mc]square notation): AG(EF mark=MARK_SENDPULSCNT)
Comment: From every reachable state there is a path on which the application finally reaches the send pulse count state.

Property #8a
Textual representation: It is always possible to reach the send RPM state of the application where the application sends the actual RPM value to the host application.
CTL representation ([mc]square notation): AG(EF mark=MARK_SENDRPM)
Comment: From every reachable state there is a path on which the application finally reaches the send RPM state.
The above stated property is invalid and incomplete for several reasons:

The property does not consider the values of the read and write pointers of the circular buffers. What if the read pointers Tx|RxBuffer_{0, 1} do not point to the first element in the circular buffer? What if the transmit write pointer TxBuffer_{2, 3} does not point to the first element in the circular buffer?

What if the model checker encounters a path where the serial interrupt is never fired?

The property does not consider the value of the circular buffer mask, i.e., Tx|RxBuffer_{4, 5}.

The property does not consider the value of the fourth byte of the receive buffer, i.e., RxBuffer_9.

How can we make sure that, once $ Z # has been decoded, the receive circular buffer is not altered anymore?

Once the receive circular buffer is filled with $ Z #, how can we make sure that the application reads the circular buffer content sequentially? What if, e.g., the read pointer is incremented twice, or is not incremented at all?

It is obvious that creating valid CTL expressions for the communication protocol verification, at least without additional knowledge of software internals, is quite challenging. It might be possible for some corner cases; however, the resulting formulas are unwieldy and hard to understand in full detail. In any case, there is no way to express fairness in plain CTL model checking.

The Unfair Path

To recapitulate, [mc]square abstracts from time, i.e., it can be categorized as a timeless, pure CTL model checker (cf. Section 3.5). Consider the property AG (AF mark=MARK_SENDRPM), which claims that on every path the desired mark, e.g., mark=MARK_SENDRPM, is always finally reached. As the target application uses interrupts and the Intel MCS-51 allows interrupt nesting, the model checker may get stuck within an interrupt loop, where the ISR is re-entered immediately after an interrupt has been executed. Such an interrupt loop is depicted in Figure 6.7 by the path {I1, I2, I3, I4, I1, I2, ...}, which we call an unfair path. The model checker may now find a counterexample in which the property AG (AF mark=MARK_SENDRPM) is disproved due to this unfair path, i.e., by getting stuck inside the interrupt loop. Clearly, such unfair paths are very unlikely when executing the code on the actual target hardware, albeit theoretically possible.
As a matter of fact, the serial interrupt may only occur at time instants determined by the selected serial baud rate and is thus very unlikely to produce unfair paths. As [mc]square follows a timeless model checking approach, we cannot use timing constraints to eliminate this behavior. In fact, we have to overcome the lack of fairness in CTL in order to obtain meaningful results for the present case study.

[Figure 6.7: application states S0 and A1 to A4 (A4: mark=MARK_SENDRPM) and the interrupt loop I1 to I4, which is entered via the transition "enter ISR".]

The Lack of Fairness in CTL

When using formal verification tools, one is often only interested in proving a property over fair paths. Thus, certain paths that are considered to be unrealistic for the actual target hardware, in our case the Intel MCS-51 microcontroller, need to be ruled out. In the literature [29, 40], an unfair computation is described as an unreasonable computation that ignores certain transition alternatives forever; all other computations are described as fair. In order to express fairness, fairness constraints [29] are used that operate on the path level and replace the standard meaning of "for all paths" with "for all fair paths" and of "there exists a path" with "there exists a fair path". Unfortunately, such fairness properties cannot be expressed directly in CTL [100, 36, 101], but they can be expressed in CTL*. In contrast, fairness assumptions can easily be added as a premise to an LTL formula. In LTL, a fairness assumption can be stated in the form (fairness) → (property), e.g., (GF enabled) → (GF occurs). However, Clarke et al. [28] show how a Kripke structure (see footnote 2) can be enriched by fairness constraints in order to enable fairness in CTL. In their approach, a fair path must contain an element of each fairness constraint infinitely often, i.e., each constraint is true infinitely often along the path. Consequently, they restrict the path quantifiers in the logic to those fair paths. From a conceptual point of view, we use a similar approach to introduce fairness into CTL model checking with [mc]square. In the following, we present the introduction of fairness through a model of the microcontroller environment.

Introducing Fairness through Environment Modeling

As [mc]square implements CTL model checking algorithms, we are limited to plain CTL without fairness constraints; thus, fairness must be introduced via an additional concept. To that end, we make use of environment modeling. Within [mc]square this particular feature is termed User Defined Environment (UDE) modeling [102, 103]. The UDE constrains the inputs read from the environment to a manually specified set of values and allows the occurrence of interrupt sources to be controlled. An automaton is used to define input values as well as interrupt and value transitions. The use of a UDE leads to the adapted model checking workflow shown in Figure 6.8.

Footnote 2: In their approach, a fair Kripke structure is a 4-tuple M = (S, R, L, F), where S, R, and L are defined as in Section 3.4.2 and \(F \subseteq 2^S\) is a set of fairness constraints. Let \(\pi\) be a path in M and let \(\inf(\pi) = \{ s \mid s = s_i \text{ for infinitely many } i \}\). A path \(\pi\) is fair iff for every \(P \in F\), \(\inf(\pi) \cap P \neq \emptyset\).
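The fairness condition of the footnote can be made concrete for paths that consist of a finite prefix followed by a cycle that is repeated forever. For such a path, the states visited infinitely often are exactly the states on the cycle. The following sketch checks the condition under these assumptions; the state numbering, the bit mask encoding of state sets, and the function names are hypothetical and not part of [mc]square.

```c
#include <stddef.h>
#include <stdint.h>

/* States are numbered 0..31; a set of states is encoded as a bit mask. */
typedef uint32_t state_set;

/* inf(pi) for a path that ends in a cycle repeated forever:
   exactly the states on that cycle. */
state_set inf_states(const int *cycle, size_t cycle_len)
{
    state_set s = 0;
    size_t i;
    for (i = 0; i < cycle_len; i++) {
        s |= (state_set)1 << cycle[i];
    }
    return s;
}

/* A path is fair iff inf(pi) intersects every fairness constraint P in F. */
int is_fair(const int *cycle, size_t cycle_len,
            const state_set *F, size_t num_constraints)
{
    state_set inf = inf_states(cycle, cycle_len);
    size_t k;
    for (k = 0; k < num_constraints; k++) {
        if ((inf & F[k]) == 0) {
            return 0;    /* constraint k is visited only finitely often */
        }
    }
    return 1;
}
```

With a single, hypothetical fairness constraint that contains the background states S0 and A1 to A4, the interrupt loop {I1, I2, I3, I4} is classified as unfair, while any cycle that passes through a background state is fair.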
The model checking workflow is now enriched by a third input, namely the environment automaton U. A UDE is realized by a communicating finite state machine [104], which interacts with a representation of the microcontroller, i.e., the C51Simulator component (see Section 3.6). During model checking, this automaton represents the environment and thus influences the behavior of the C51Simulator. From the user's point of view, UDE automata are created with a graphical editor inside the GUI of [mc]square. It works like any other automata drawing tool, e.g., the user adds states and transitions from toolbars onto a canvas. Alternatively, UDEs can be defined by using an environment description language [102]. Both approaches have the same expressiveness [103]. For the present case study, a UDE is used to define an exact sequence of values read from the serial port. Moreover, we block or fire the serial ISR at certain points to ensure fairness.

[Figure: the inputs UDE automaton U, assembly source code or hex file (system model M), and system property are passed to the [mc]square model checker, which decides M |= ? and returns a notification (yes) or a counterexample (no).]

Figure 6.8: The model checking workflow of [mc]square with UDE (cf. Figure 3.3).
An Environment Automaton for Fairness in the Communication Protocol Verification

Having discussed the principles of UDEs in [mc]square, the remainder of this section is dedicated to finding a suitable UDE automaton that provides fairness when verifying the communication protocol of the case study. Two requirements for the desired UDE automaton are derived:

(i) Values are read from the serial interface according to the communication protocol specification (cf. Figure 6.6).

(ii) The unfair path of interrupt loops (see Section 6.4.5) is avoided by blocking the corresponding ISRs for at least a single instruction after executing a RETI instruction. In other words, progress in the background part of the application (cf. Figure 6.3) is assured.

With respect to these requirements, we can generate the environment automaton U1, as shown in Figure 6.9. State changes among {S0, S1, S2, S3} are triggered whenever the application reads a value from the serial communication interface. For example, the transition label SBUF 35! indicates that the UDE forces the simulator to determinize the serial receive register SBUF to the value 35, i.e., ASCII # (cf. Table 6.4). The states {B0, B1, B2, B3} are responsible for blocking the serial interrupt after the serial ISR is left (transition ISR leave). The execution of the next instruction triggers the transition (Instr leave) back to one of the states {S0, S1, S2, S3}.
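The behavior of U1 described above can be sketched as a small state machine. This is only an illustration and not the UDE format of [mc]square: the event names, the type ude_u1, and the function ude_u1_step are hypothetical, and the forced byte sequence $ R # followed by # is an assumption that resolves the nondeterministic choice of the fourth transition to one fixed value.

```c
/* Events observed by the environment automaton (illustrative names). */
typedef enum {
    EV_SBUF_READ,     /* the application reads the SBUF register */
    EV_ISR_LEAVE,     /* the serial ISR is left (RETI executed) */
    EV_INSTR_LEAVE    /* one further instruction has been executed */
} ude_event;

typedef struct {
    int index;        /* 0..3: selects S0..S3 or B0..B3 */
    int isr_blocked;  /* 1 while the automaton is in a B state */
} ude_u1;

/* Performs one transition. Returns the value forced into SBUF,
   or -1 if the transition does not determinize SBUF. */
int ude_u1_step(ude_u1 *u, ude_event ev)
{
    /* Assumed byte sequence: '$' (36), 'R' (82), '#' (35), '#' (35). */
    static const int sbuf_value[4] = { 36, 82, 35, 35 };
    int forced = -1;

    if (u->isr_blocked) {
        if (ev == EV_INSTR_LEAVE) {
            u->isr_blocked = 0;              /* B_i -> S_i */
        }
    } else if (ev == EV_ISR_LEAVE) {
        u->isr_blocked = 1;                  /* S_i -> B_i, block ISR */
    } else if (ev == EV_SBUF_READ) {
        forced = sbuf_value[u->index];       /* SBUF <value>! */
        u->index = (u->index + 1) % 4;       /* S_i -> S_(i+1) */
    }
    return forced;
}
```

While isr_blocked is set, the simulator would not be allowed to enter the serial ISR, which guarantees that at least one instruction of the background program is executed between two interrupts.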
[Figure: states init, S0 to S3, and B0 to B3 (block ISR); transition labels SBUF 35!, SBUF {35,82,36}!, ISR leave, and Instr leave.]

Figure 6.9: A first UDE automaton proposal (U1).

Item   Description
SBUF   The serial transmit/receive register
35d    Decimal equivalent of ASCII #
36d    Decimal equivalent of ASCII $
68d    Decimal equivalent of ASCII D
69d    Decimal equivalent of ASCII E
86d    Decimal equivalent of ASCII V
82d    Decimal equivalent of ASCII R
90d    Decimal equivalent of ASCII Z

Table 6.4: Definitions for UDE modeling.

Nevertheless, the automaton U1 is still not sufficient for our protocol verification, due to the following consideration: What happens if the serial (receive) interrupt does not occur at all? In fact, this configuration is possible on the real target hardware. If the host application does not initiate any serial communication at all, the knitting machine monitoring device will not send any answer to the host. This can be seen as the idle state of the application.
However, as we are interested in the communication sequences rather than the idle state, the path where no serial (receive) interrupt occurs is unfair, too. As a result, when model checking the communication protocol with the UDE automaton U1, [mc]square disproves the properties by presenting counterexamples in which the serial interrupt is never activated. In order to eliminate those paths, we extend the automaton U1 to U2, as shown in Figure 6.10. In automaton U2, the states {F0, F1, F2, F3} actively trigger the execution of the serial interrupt. Again, {B0, B1, B2, B3} are states where the serial interrupt is blocked. Basically, there are four equivalent sequences, i.e., {S0, B0, F0}, {S1, B1, F1}, {S2, B2, F2}, and {S3, B3, F3}. Transitions between those sequences ({F0, S1}, {F1, S2}, {F2, S3}, and {F3, S0}) are used to determinize the serial receive register SBUF to the values of the protocol as defined in Table 6.3.3.

The transitions labeled ISR leave from {F0, F1, F2, F3} back to {B0, B1, B2, B3} are needed due to an implementation detail of the application. If the circular buffer is full, the application skips incoming serial data bytes; thus, it may happen that a serial receive interrupt occurs but the application does not read the value of the SBUF register. Hence, the transition is needed to prevent the automaton U2 from being stuck in the states where the receive circular buffer is full and the serial interrupt is fired again, i.e., one of {F0, F1, F2, F3}. Note that the transition SBUF {35,36,82}! is used to send any of these three bytes between communication sequences. Thus, possible sequences in our UDE model are {35,36,82,35} (#,$,R,#), {35,36,82,36} (#,$,R,$), {35,36,82,82} (#,$,R,R), {35,35,36,82} (#,#,$,R), . . . We can easily extend this to a fully nondeterministic read between communication sequences, e.g., by changing the transition to SBUF {0 . . . 255}!; however, as will turn out in the remainder of this section, this is not needed for our verification process.

It should be noted that without detailed knowledge of the application's software structure it is almost impossible to obtain a proper UDE automaton, at least for the example code at hand. As we are aiming towards a formal verification tool that can be used as early as the development phase, this insight into the software is contributed by the software development team.

After demonstrating that a UDE automaton is capable of introducing fairness into the model checking process, we state in the following the properties for the communication protocol verification that are model checked with the support of automaton U2. The extended [mc]square workflow shown in Figure 6.8 is used. Note that automaton U2 exactly corresponds to property #Comm2. For the remaining properties, we adapt the transitions to the actual values of the command, e.g., we replace the transitions SBUF 82! and SBUF {35,36,82}! of property #Comm2 with SBUF 68! and SBUF {35,36,68}! to obtain the UDE automaton for property #Comm1.
[Figure 6.10: the extended UDE automaton U2 with states I0 (init), S0 to S3, B0 to B3 (block ISR), and F0 to F3 (fire ISR); transition labels SBUF 36!, SBUF 82!, SBUF 35!, SBUF {35,36,82}!, ISR leave, and Instr leave.]
Property #Comm1
Textual representation: After receiving $ D # the knitting monitoring device answers with $ D RPM1 RPM2 #, i.e., sends the variable Revolutions.
CTL representation ([mc]square notation): AG(AF mark=MARK_SENDRPM)
Comment: If the application reads $ D # from the serial port, the application always reaches the state MARK_SENDRPM.

Property #Comm2
Textual representation: After receiving $ R # the knitting monitoring device answers with $ R #, i.e., enters the reset state.
CTL representation ([mc]square notation): AG(AF mark=MARK_RESET)
Comment: If the application reads $ R # from the serial port, the application always reaches the state MARK_RESET.

Property #Comm3
Textual representation: After receiving $ Z # the knitting monitoring device answers with $ Z CNT1 CNT2 #, i.e., sends the current pulse counter value.
CTL representation ([mc]square notation): AG(AF mark=MARK_SENDCOUNTER)
Comment: If the application reads $ Z # from the serial port, the application always reaches the state MARK_SENDCOUNTER.

Property #Comm4
Textual representation: After receiving $ E # the knitting monitoring device answers with $ E INP #, i.e., sends the current input lines value.
CTL representation ([mc]square notation): AG(AF mark=MARK_SENDINPUTS)
Comment: If the application reads $ E # from the serial port, the application always reaches the state MARK_SENDINPUTS.
Property #Comm5
Textual representation: After receiving $ V # the knitting monitoring device answers with $ V VER1 VER2 VER3 #, i.e., sends the software version string.
CTL representation ([mc]square notation): AG(AF mark=MARK_SENDVERSION)
Comment: If the application reads $ V # from the serial port, the application always reaches the state MARK_SENDVERSION.
6.5 Results
In what follows, the results of the case study are presented. Note that, for clarity's sake, only a summary of the most significant results is given in this section.
6.5.1 Numbers
Table 6.5 shows the results of the first [mc]square model checking run. The item States created refers to the overall number of states that are created by the model checker. The item States stored comprises the states that are stored in main memory. It is computed as the total number of states created (i.e., the item States created) reduced by the number of state revisits, i.e., visits to single states that are already present in main memory. The column Time refers to the time needed by [mc]square to finish model checking. The numbers were generated on a Dual-Core AMD Opteron(TM) Processor 8220 with 2.80 GHz (8 cores), equipped with 256 GB of RAM, running 64-bit Windows Server Enterprise Edition and a Java(TM) server virtual machine version 1.6.0 (with settings -server -Xmx200G -Xss120M). Source code revision 4338 of [mc]square was used.
Property #  States stored  States created  Revisits  Time [hh:mm:ss]
1           2,107,785      22,627,672      721,167   00:05:11
1a          230            634             0         00:00:01
1b          146            145             0         00:00:01
2           230            634             0         00:00:01
3           230            634             0         00:00:01
4           123,242        126,486         3,244     00:00:02
5           123,304        126,551         3,247     00:00:02
6           104,234        107,051         2,817     00:00:02
7           142,248        156,924         3,676     00:00:02
8           161,195        165,300         4,106     00:00:03
9           2,325,333      20,099,919      525,454   00:04:25
10          2,333,585      20,065,707      524,632   00:04:28
11          2,333,585      20,065,707      524,632   00:04:20
12          2,497,397      19,999,441      524,632   00:04:33
13          2,359,645      20,018,732      524,632   00:04:35
14          1,519          344             14        00:00:01

Revisited Properties #4 to #8
4a          2,325,333      20,099,919      525,454   00:04:30
5a          20,459         150,896         3,247     00:00:03
6a          20,456         150,890         3,247     00:00:02
7a          23,893         173,084         3,676     00:00:02
8a          27,328         195,275         4,106     00:00:03

Comm. Protocol Verification with UDE (faulty receiver implementation)
Comm1       4,810          4,811           1         00:00:01
Comm2       4,810          4,811           1         00:00:01
Comm3       4,810          4,811           1         00:00:01
Comm4       4,810          4,811           1         00:00:01
Comm5       4,810          4,811           1         00:00:01

Comm. Protocol Verification with UDE (fixed receiver implementation)
Comm1       101,277        101,616         339       00:00:02
Comm2       48,513         48,672          159       00:00:01
Comm3       101,205        101,544         339       00:00:01
Comm4       100,989        101,328         339       00:00:02
Comm5       101,133        101,472         339       00:00:02

Table 6.5: Case study results.
Note that ? stands for one of {D,R,Z,E,V} (cf. Figure 6.6). Note that the condition n > 0 is needed, since the configuration n = 0 is invalid; a sequence of {D,#} does not yield any response of the microcontroller application.
host application sends an odd number of start bytes, i.e., {[2n+1]n>0 $],?,#}. Listing 6.2 shows the erroneous implementation of the receiver state machine. The counterexample presented by [mc]square reveals that a sequence of {[2n]n>0 $],?,#} resets the variable Command_state to 0 after the initial state is left for the first time. In fact, the readCommand() implementation is now only sensitive to start bytes ($), but the host application already starts transmitting the command part (one of {D,R,Z,E,V}) of the communication sequence. As a result, the following protocol bytes are skipped until a new valid command sequence ({[2n + 1]n>0 $],?,#}) is received. The root cause of the failure lies in source code line 26 of Listing 6.2. The application waits for either a start byte (STX) or a stop byte (ETX); thus, an even number of start bytes resets the variable Command_state to 0 over and over again, as shown in source code line 28.
 1  char readCommand(void)
 2  {
 3      static byte Command_state = 0;
 4      static char Command;
 5      char c;
 6
 7      if (charavail())
 8      {
 9          c = rcvchar();
10      }
11      else
12      {
13          return 0;
14      }
15
16      switch (Command_state)
17      {
18          case 0:
19              /* Initial State */
20              if (c == STX)
21              {
22                  Command_state = 1;
23              }
24              break;
25          case 1:
26              if ((c == STX) || (c == ETX))
27              {
28                  Command_state = 0;
29              }
30              else
31              {
32                  Command = c;
33                  Command_state = 2;    /* Command found */
34              }
35              break;
36          case 2:
37              Command_state = 0;        /* Command returned if ETX received */
38              if (c == ETX)
39              {
40                  return Command;
41              }
42              break;
43          default:
44              Command_state = 0;
45              mark = MARK_DEFAULT;
46              break;
47      }
48      return 0;
49  }
Listing 6.2: The erroneous receiver state machine implementation.

Revising the Receiver State Machine Implementation

In order to correct the receiver state machine implementation, source code lines 25-35 of Listing 6.2 are adapted to lines 25-39 of Listing 6.3. Whenever the host now sends sequences of the form {[2n]n>0 $],?,#}, the variable Command_state is not erroneously reset to 0 but set to 1, which forces the state machine to wait for the following command byte.
 1  char readCommand(void)
 2  {
 3      static byte Command_state = 0;
 4      static char Command;
 5      char c;
 6
 7      if (charavail())
 8      {
 9          c = rcvchar();
10      }
11      else
12      {
13          return 0;
14      }
15
16      switch (Command_state)
17      {
18          case 0:
19              /* Initial State */
20              if (c == STX)
21              {
22                  Command_state = 1;
23              }
24              break;
25          case 1:
26              if (c == ETX)
27              {
28                  Command_state = 0;    /* ETX received */
29              }
30              else if (c == STX)
31              {
32                  Command_state = 1;    /* STX repeat */
33              }
34              else
35              {
36                  Command = c;
37                  Command_state = 2;    /* Command found */
38              }
39              break;
40          case 2:
41              Command_state = 0;        /* Command returned if ETX received */
42              if (c == ETX)
43              {
44                  return Command;
45              }
46              break;
47          default:
48              Command_state = 0;
49              mark = MARK_DEFAULT;
50              break;
51      }
52      return 0;
53  }
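The effect of the revision can be reproduced on a host machine with a small harness. The sketch below is an adaptation and not the target code: the static variables are moved into a hypothetical struct receiver, the character is passed as a parameter instead of being fetched with rcvchar(), the marker write in the default branch is omitted, and the values '$' and '#' for STX and ETX are assumptions based on the protocol description.

```c
#include <stddef.h>

#define STX '$'   /* assumption: start byte of the protocol */
#define ETX '#'   /* assumption: stop byte of the protocol */

typedef struct {
    unsigned char state;   /* corresponds to Command_state */
    char command;          /* corresponds to Command */
} receiver;

/* One step of the receiver state machine. With fixed == 0 a repeated
   STX in state 1 resets the state to 0 (erroneous variant); with
   fixed != 0 it keeps the state at 1 (revised variant). */
char receiver_step(receiver *r, char c, int fixed)
{
    switch (r->state) {
    case 0:                            /* initial state */
        if (c == STX) {
            r->state = 1;
        }
        break;
    case 1:
        if (c == ETX) {
            r->state = 0;              /* ETX received */
        } else if (c == STX) {
            r->state = fixed ? 1 : 0;  /* STX repeat */
        } else {
            r->command = c;            /* command found */
            r->state = 2;
        }
        break;
    case 2:
        r->state = 0;
        if (c == ETX) {
            return r->command;         /* command complete */
        }
        break;
    default:
        r->state = 0;
        break;
    }
    return 0;
}

/* Feeds a byte string and returns the first decoded command, or 0. */
char decode(const char *bytes, int fixed)
{
    receiver r = { 0, 0 };
    for (; *bytes != '\0'; bytes++) {
        char cmd = receiver_step(&r, *bytes, fixed);
        if (cmd != 0) {
            return cmd;
        }
    }
    return 0;
}
```

For the input "$$D#", which contains an even number of start bytes, the erroneous variant decodes no command, whereas the revised variant returns 'D'.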
[Table columns: abstraction techniques (NDPSW, DNDlA, DND), static analysis (DVR, RBA, IFA, SA), state space, and time.]

Table 6.6: Case study results for plain state space building.

Based on the results in Table 6.6, the following conclusions are drawn:

(i) For the present case study, it is not feasible to build the state space without any abstraction techniques applied. This is not surprising, since the application heavily interacts with the environment.

(ii) The model checking run with the Delayed Nondeterminism option enabled was canceled after running for two days on the server. Thus, Delayed Nondeterminism does not provide enough abstraction to build the state space within reasonable time and resource constraints. The same applies to the option Delayed Nondeterminism with Look Ahead.

(iii) The state space could only be built when enabling both options Delayed Nondeterminism with Look Ahead and Nondeterministic Program Status Word. These options drastically reduced the state space and consequently the run time.

(iv) Enabling static analysis additionally helps to mitigate the state-explosion problem. In particular, the option Path Reduction contributes greatly to state space reduction.

(v) For the conducted case study, the option Dead Variable Reduction leads only to a minor reduction of system states. This can be explained by the size of the source code: static code analysis gets coarser with increasing source code complexity. Nevertheless, this result can be seen as an indicator that there is still vast room for improvement in the existing data-flow analyses.

(vi) Due to the implemented abstraction techniques for the Intel MCS-51 target, the state space could be reduced to a size that can easily be handled by conventional desktop computers. There is no need for a dedicated server in order to build the state space.

Note that the presented results are only valid for the investigated case study. The actual savings of the individual abstraction techniques heavily depend on the source code structure, its complexity, and the number of accesses to nondeterministic memory locations.
104
PR
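Why accesses to nondeterministic memory locations weigh so heavily on the state space can be illustrated with a small sketch. It is not the [mc]square implementation; it only assumes a 3-valued bit model (0, 1, or nondeterministic) with a Kleene-style AND, and all names (`ND`, `and3`, `anl`, `instantiate`) are hypothetical. The sketch compares instantiating a fully nondeterministic byte before a masking operation with applying the mask first and instantiating only the bits that are still nondeterministic afterwards.

```python
from itertools import product

ND = None  # hypothetical marker for a nondeterministic bit


def and3(a, b):
    """Kleene-style 3-valued AND on single bits (0, 1, or ND)."""
    if a == 0 or b == 0:
        return 0          # a definite 0 decides the result
    if a == 1 and b == 1:
        return 1
    return ND             # at least one operand is ND, none is 0


def anl(byte_bits, mask):
    """AND a list of 8 three-valued bits (MSB first) with an integer mask."""
    mask_bits = [(mask >> i) & 1 for i in range(7, -1, -1)]
    return [and3(b, m) for b, m in zip(byte_bits, mask_bits)]


def instantiate(byte_bits):
    """Enumerate all concrete byte values consistent with the 3-valued bits."""
    nd_positions = [i for i, b in enumerate(byte_bits) if b is ND]
    values = set()
    for combo in product((0, 1), repeat=len(nd_positions)):
        bits = list(byte_bits)
        for pos, value in zip(nd_positions, combo):
            bits[pos] = value
        values.add(int("".join(map(str, bits)), 2))
    return sorted(values)


port = [ND] * 8                      # e.g. an input port with unknown content

# Instantiating first creates one successor per concrete port value.
print(len(instantiate(port)))        # 256

# Masking first leaves a single nondeterministic bit to instantiate.
print(instantiate(anl(port, 0x01)))  # [0, 1]
```

In this toy setting, a single masked read yields 2 instead of 256 successors; how much is saved in a real program depends on which bits the code actually inspects.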
This section summarizes remaining challenges of the [mc]square approach and highlights future research possibilities. First, the problem of finding understandable counterexamples is highlighted. Next, the issue of verifying the simulator implementation is discussed and the idea of automatically generating simulators from high-level descriptions is presented. Then, the need for counterexample validation is clarified. Finally, coping with the state-explosion problem is discussed.
microcontroller is a challenging, lengthy, and error-prone task. Consequently, verification of the simulator is tricky, too. Whereas the instruction set part of the implementation can easily be verified against other commercially available target simulators (cf. Section 3.6.3), reasonable verification of the customized parts of the target simulator remains an open issue. It is fair to state that implementation errors residing in the simulator itself are very likely to be uncovered during model checking, since the model checker forces the simulator to execute single instructions with a huge number of input configurations. As a result, bugs that stem from the simulator implementation are revealed by wrong counterexamples presented by [mc]square. This is an especially effective method to get the simulator bug-free in the early stages of an implementation. However, the following conclusions are noted: (i) Strategies for handling the verification of the customized CPU simulator have to be proposed. (ii) Methods for an automatic verification of the CPU simulator have to be derived.
Thus, an innovative approach for counterexample validation is needed. In the long run, it might be feasible to tie model checking to the root from which all software errors emerge, namely the hardware unit whose software is itself subject to verification. It might be promising to extend existing microcontroller IP cores so that they support state space building and the automatic validation of counterexamples. However, the following conclusions are noted: (i) Further research is needed to automatically validate a given counterexample on the real target hardware. (ii) Possibilities of generating a customized IP core for state space generation and counterexample validation, based on available microcontroller implementations, have to be evaluated. (iii) Strategies to contain massive over-approximations due to abstraction techniques are needed.
8 Conclusion
The main contribution of the present master thesis is the enhancement of the existing C51Simulator component of the [mc]square model checker. The Intel MCS-51 simulator is (i) extended by state-space abstraction techniques and (ii) integrated into the static analysis framework of [mc]square. Finally, (iii) a real-life embedded systems application is formally verified with [mc]square by taking advantage of the implemented abstraction techniques.
Regarding (i), a novel abstraction technique termed Delayed Nondeterminism with Look Ahead is proposed. When applied to formal verification of I/O-intensive embedded systems assembly code, it achieves a notable state space reduction. In particular, this approach helps to avoid the generation of successor states whenever a microcontroller executes logic operations. The presented approach centers around the coherence among boolean operators, with particular regard to 3-valued logic.
Regarding (ii), existing static analyses are adapted to the Intel MCS-51 architecture. Furthermore, a novel data-flow analysis termed Register Bank Analysis is introduced. This new analysis is used to support static assembly code analysis within [mc]square. In particular, the approach leads to more precise Reaching Definition Analysis and Live Variable Analysis results, which allows the detection of additional dead variables. Thus, the number of overall system states is reduced during model checking. Typical data-flow analyses for high-level languages cannot be applied to assembly code one to one. Analyses such as Reaching Definition Analysis have to be adapted to be applicable to assembly code. The approach shows that it is necessary to take architectural peculiarities into account during the analysis to achieve precise results.
Regarding (iii), a real-life industrial application is model checked with [mc]square. A specification in Computation Tree Logic (CTL) is derived from a textual specification.
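As an illustration of the shape such a CTL specification can take, consider a generic response property. The atomic propositions $\mathit{start}$ and $\mathit{decoded}$ are hypothetical placeholders and are not taken from the case study specification:

```latex
\mathbf{AG}\,\bigl(\mathit{start} \rightarrow \mathbf{AF}\,\mathit{decoded}\bigr)
```

Read: on all paths and in every state, whenever $\mathit{start}$ holds, $\mathit{decoded}$ eventually holds on every continuation. A formula of this shape can be violated by a single path on which the environment never delivers further input, so its verdict depends on which paths are considered admissible.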
It is possible to reveal an implementation error concerning the receiver state machine, which is responsible for decoding incoming data bytes from the serial interface. The error found is very likely to go unnoticed with traditional testing methods, since the erroneous behavior only shows up in the rare case that the host application sends sequences of the form {[2n]n>0 $],?,#} to the target microcontroller, i.e., sequences with an even number of start bytes.
It turns out that some of the system properties cannot be sufficiently specified in CTL, due to the lack of fairness in CTL. Unfair paths in the microcontroller program are determined and ruled out by taking advantage of a concept termed User Defined Environment (UDE). With UDE it is possible to introduce the required fairness constraints into the [mc]square model checking process. A solution to fix the existing implementation error is given and is proved to be correct in a further model checking run.
[mc]square proved to be a promising approach for model checking and static analysis of Intel MCS-51 assembly source code. It aims at a push-button formal verification approach for embedded systems code by relying on custom, highly optimized simulator components. When compared to traditional model checking approaches, the effort for the verification of a system or software is shifted from the user and a system model towards the model
checking tool and the implementation itself. Nevertheless, as recognized by Gerth in [3], the real challenge, besides all the technical issues that have to be solved in formal verification, lies in convincing the design teams that devoting some of their verification resources to formal methods leads to a higher design quality. Thus, the major future challenge is to move formal verification upstream in the embedded systems design flow. Factors such as ever-shortening design cycles and stringent time-to-market requirements strongly support the case for formal verification even at very early design stages. It is about time to transform projects successful in research and academia into practical tools ready to be used in day-to-day (embedded) software engineering practice.
To conclude, a vague and rather incomplete personal outlook on future trends in formal verification is given:
(i) The holy grail of full program verification has been abandoned. It will probably remain abandoned for the coming years.
(ii) Less ambitious tools like [mc]square might emerge and become more widely used to formally verify sensitive parts of the application software.
(iii) Future tools will exploit ideas from various analysis disciplines, such as abstract interpretation, static analysis, and model checking.
(iv) Future tools will aim at alleviating the chicken-and-egg problem of writing specifications.
Bibliography
[1] M. Woodward and P. Mosterman, Challenges for embedded software development, in Proceedings of the 50th International Midwest Symposium on Circuits and Systems (MWSCAS), Montreal, Canada, August 2007, pp. 630-633.
[2] G. J. Holzmann, Software safety in rocket science, ERCIM News Special: Safety-Critical Software, vol. 75, pp. 14-15, October 2008.
[3] R. Gerth, Model checking if your life depends on it: A view from Intel's trenches, in Proceedings of the 8th International SPIN Workshop, Toronto, Canada, 2001.
[4] L. Holenderski, A model checking project at Philips research, in Proceedings of the 8th International SPIN Workshop, Toronto, Canada, 2001.
[5] D. Cofer, E. Engstrom, R. Goldman, D. Musliner, and S. Vestal, Applications of model checking at Honeywell Laboratories, in Proceedings of the 8th International SPIN Workshop, Toronto, Canada, 2001.
[6] B. Schlich and S. Kowalewski, [mc]square: A model checker for microcontroller code, in Proceedings of the 2nd International Symposium on Leveraging Applications of Formal Methods, Verification and Validation (ISoLA 2006), Paphos, Cyprus, 2006, pp. 466-473.
[7] A. Fehnker, R. Huuck, F. Rauch, and S. Seefried, Some assembly required - program analysis of embedded systems code, in Proceedings of the 8th IEEE International Working Conference on Source Code Analysis and Manipulation, Beijing, China, September 2008, pp. 15-24.
[8] E. Mercer and M. Jones, Model checking machine code with the GNU debugger, in Proceedings of the 12th SPIN Workshop on Model Checking Software, ser. Lecture Notes in Computer Science, vol. 3639, August 2005.
[9] B. Schlich, Model checking of software for microcontrollers, Dissertation, RWTH Aachen University, Aachen, Germany, June 2008. [Online]. Available: http://sunsite.informatik.rwth-aachen.de/Publications/AIB/2008/2008-14.pdf
[10] UAS Technikum Wien, FHplus project design methods for embedded control systems (DECS), visited: May 2009. [Online]. Available: http://embsys.technikum-wien.at/projects/decs/index.html
[11] T. Reinbacher, M. Kramer, M. Horauer, and B. Schlich, Challenges in embedded model checking - a simulator for the [mc]square model checker, in Proceedings of the 3rd Int'l Symposium on Industrial Embedded Systems (SIES 2008), Montpellier, France, 2008, pp. 245-248.
[12] T. Reinbacher, M. Kramer, M. Horauer, and B. Schlich, Motivating model checking for embedded systems software, in Proceedings of the 4th IEEE/ASME Int'l Conf. Mechatronic and Embedded Systems and Applications (MESA 2008), Beijing, China, October 2008, pp. 546-551.
[13] T. Reinbacher, M. Horauer, and B. Schlich, Using 3-valued memory representation for state space reduction in embedded assembly code model checking, in Proceedings of the 12th IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS 2009), Liberec, Czech Republic, April 15-17 2009, pp. 114-119.
[14] T. Reinbacher, J. Brauer, M. Horauer, and B. Schlich, Refining assembly code static analysis for the Intel MCS-51 microcontroller, in Proceedings of the 4th Int'l Symposium on Industrial Embedded Systems (SIES 2009), Lausanne, Switzerland, July 8-10 2009, accepted for publication.
[15] E. S. Raymond, The Cathedral and the Bazaar. Musings on Linux and Open Source by an Accidental Revolutionary. O'Reilly Media, 1999. [Online]. Available: http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/
[16] J. L. Lions, ARIANE 5 flight 501 failure report, July 1996, visited: May 2009. [Online]. Available: http://www.ima.umn.edu/~arnold/disasters/ariane5rep.html
[17] M. I. Board, Mars climate orbiter - phase I report, November 1999, visited: May 2009. [Online]. Available: ftp://ftp.hq.nasa.gov/pub/pao/reports/1999/MCO_report.pdf
[18] NYISO, Interim report on the August 14, 2003 blackout, January 2004, visited: May 2009. [Online]. Available: http://www.hks.harvard.edu/hepg/Papers/NYISO.blackout.report.8.Jan.04.pdf
[19] M. Kanellos, Software glitch stalls some Toyota hybrids, October 2005, visited: May 2009. [Online]. Available: http://news.cnet.com/Software-glitches-stalls-some-Toyota-hybrids/2100-11389_3-5895574.html
[20] D. Gainer, Microsoft Excel calculation issue update, September 2007, visited: May 2009. [Online]. Available: http://blogs.msdn.com/excel/archive/2007/09/25/calculation-issue-update.aspx
[21] APA, A1-Netzausfall: Mobilkom gibt Entwarnung, October 2008, in German. Visited: May 2009. [Online]. Available: http://futurezone.orf.at/stories/317646/
[22] E. A. Emerson, The beginning of model checking: A personal perspective, 25 Years of Model Checking: History, Achievements, Perspectives, pp. 27-45, 2008.
[23] T. Reinbacher, Introduction to embedded software verification, 2008, Student's Paper, UAS Technikum Wien, Master Embedded Systems, Course: System Architecture and Engineering SS-08.
[24] A. Turing, On computable numbers, with an application to the Entscheidungsproblem, in Proceedings of the London Mathematical Society, ser. 2, vol. 42, 1936, pp. 230-265.
[25] T. Hoare and J. Misra, Verified software: theories, tools, experiments - vision of a grand challenge project, in Verified Software: Theories, Tools, Experiments (VSTTE 2005), Toronto, Canada, 2005.
[26] D. A. Wheeler, Linux Kernel 2.6: It's worth more! November 2007, visited: May 2009. [Online]. Available: http://www.dwheeler.com/essays/linux-kernel-cost.html
[27] US Department of Commerce, The economic impacts of inadequate infrastructure for software testing, May 2002. [Online]. Available: http://www.nist.gov/director/prog-ofc/report02-3.pdf
[28] E. M. Clarke, O. Grumberg, and D. A. Peled, Model Checking. 1999, ISBN 0262032708.
[29] C. Baier and J.-P. Katoen, Principles of Model Checking. The MIT Press, ISBN 026202649X.
[30] E. M. Clarke and E. A. Emerson, Design and synthesis of synchronization skeletons using branching time temporal logic, in Workshop on Logic of Programs, ser. Lecture Notes in Computer Science, vol. 131, 1981, pp. 52-71.
[31] J.-P. Queille and J. Sifakis, Specification and verification of concurrent systems in CESAR, in Proceedings of the 5th Colloquium on International Symposium on Programming, London, UK, 1982, pp. 337-351.
[32] E. Clarke, The birth of model checking, in 25 Years of Model Checking, ser. Lecture Notes in Computer Science, vol. 5000, 2008, pp. 1-26.
[33] E. A. Emerson and E. M. Clarke, Characterizing correctness properties of parallel programs using fixpoints, in Proceedings of the 7th Colloquium on Automata, Languages and Programming, 1980, pp. 169-181.
[34] A. Pnueli, The temporal semantics of concurrent programs, in Proceedings of the International Symposium on Semantics of Concurrent Computation. Springer-Verlag, 1979, pp. 1-20.
[35] A. Pnueli and Z. Manna, The Temporal Logic of Reactive and Concurrent Systems: Specification. Springer-Verlag GmbH, 1991, ISBN 0387976647.
[36] E. M. Clarke and I. A. Draghicescu, Expressibility results for linear-time and branching-time logics, in Linear Time, Branching Time and Partial Order in Logics and Models for Concurrency, School/Workshop, vol. 354, London, UK, 1989, pp. 428-437.
[37] G. J. Holzmann, The model checker SPIN, IEEE Transactions on Software Engineering, vol. 23, pp. 279-295, 1997.
[38] K. Heljanko, Model checking the branching time temporal logic CTL, Helsinki University of Technology, Digital Systems Laboratory, Espoo, Finland, Tech. Rep. A45, May 1997.
[39] T. Schuele and K. Schneider, Global vs. local model checking: a comparison of verification techniques for infinite state systems, in Proceedings of the 2nd IEEE International Conference on Software Engineering and Formal Methods (SEFM 2004), Beijing, China, September 26-30 2004, pp. 67-76.
[40] D. Peled, Software Reliability Methods. Springer, 2001, ISBN 0387951067.
[41] G. Balakrishnan, T. Reps, D. Melski, and T. Teitelbaum, WYSINWYX: What you see is not what you execute, in Verified Software: Theories, Tools, Experiments (VSTTE 2005), Toronto, Canada, 2005.
[42] B. Schlich, M. Rohrbach, M. Weber, and S. Kowalewski, Model checking software for microcontrollers, Technischer Bericht AIB-2006-11, RWTH Aachen, Tech. Rep., 2006.
[43] F. Scheuer, Extending the model checker [mc]square to handle the Infineon XC167 microcontroller, Master's thesis, RWTH Aachen University, Department of Computer Science 11, May 2007, (in German).
[44] J. Wernerus, Model-checking of instruction list programs for programmable logic controllers using [mc]square, Master's thesis, RWTH Aachen University, Department of Computer Science 11, 2008, (in German).
[45] T. Reinbacher, MCS-51 simulator integration into the [mc]square model checker, Department of Embedded Systems, University of Applied Sciences Technikum Wien, Tech. Rep., 2007.
[46] Intel Corporation, MCS 51 Microcontroller Family User's Manual, 1994, order No.: 272383-002.
[47] NXP Semiconductors, 80C51 family programmer's guide and instruction set, 1997. [Online]. Available: http://www.standardics.nxp.com
[48] S. Dutta, SDCC compiler user guide, online, visited: May 2009. [Online]. Available: http://sdcc.sourceforge.net/doc/sdccman.pdf
[49] T. Noll and B. Schlich, Delayed nondeterminism in model checking embedded systems assembly code, in Proceedings of the 3rd Int'l Haifa Verification Conf. (HVC 2007), ser. Lecture Notes in Computer Science, vol. 4899, 2008, pp. 185-201.
[50] T. Ball and S. Rajamani, The SLAM project: Debugging system software via static analysis, in Proceedings of the Symposium on Principles of Programming Languages (POPL 2002), Portland, USA, January 16-18 2002, pp. 1-3.
[51] D. Beyer, T. Henzinger, R. Jhala, and R. Majumdar, The software model checker BLAST: Applications to software engineering, International Journal on Software Tools for Technology Transfer, vol. 9, pp. 505-525, 2007.
[52] S. Chaki, E. Clarke, A. Groce, J. Ouaknine, O. Strichman, and K. Yorav, Efficient verification of sequential and concurrent C programs, Formal Methods in System Design (FMSD), vol. 25, pp. 129-166, 2004.
[53] H. Chen, D. Dean, and D. Wagner, Model checking one million lines of C code, in Proceedings of the 11th Annual Network and Distributed System Security Symposium (NDSS), 2004, pp. 171-185.
[54] M. Gallardo, P. Merino, and D. Sanan, Towards model checking C code with OPEN/CAESAR, in Proceedings of the 4th International Workshop on Modelling, Simulation, Verification and Validation of Enterprise Information Systems (MSVVEIS'06), Paphos, Cyprus, 2006, pp. 198-201.
[55] P. de la Cámara, M. Gallardo, P. Merino, and D. Sanán, Model checking software with well-defined APIs: the socket case, in Proceedings of the 10th international workshop on Formal Methods for Industrial Critical Systems (FMICS), Lisbon, Portugal, 2005, pp. 17-26.
[56] B. Schlich and S. Kowalewski, Model checking C source code for embedded systems, in Proceedings of the IEEE/NASA Workshop Leveraging Applications of Formal Methods, Verification, and Validation (ISoLA 2005), 2005.
[57] T. Mehler, Challenges and applications of assembly-level software model checking, Dissertation, University of Dortmund, 2006. [Online]. Available: https://eldorado.tu-dortmund.de/bitstream/2003/22435/1/main.pdf
[58] S. C. Kleene, Introduction to Metamathematics, 11st ed. North Holland, 1996.
[59] G. Bruns and P. Godefroid, Model checking partial state spaces with 3-valued temporal logics, in Proceedings of the 11th Int'l Conf. Computer Aided Verification (CAV'99), ser. Lecture Notes in Computer Science, vol. 1633, 1999, pp. 274-287.
[60] E. Yahav, Verifying safety properties of concurrent Java programs using 3-valued logic, in Proceedings of the 27th ACM Principles of Programming Languages Conference (POPL 2000), vol. 36, no. 3, Boston, USA, 2001, pp. 27-40.
[61] R. E. Bryant, A methodology for hardware verification based on logic simulation, Journal of the ACM, vol. 38, no. 2, pp. 299-328, 1991.
[62] T. Feng, L.-C. Wang, K.-T. Cheng, M. Pandey, and M. S. Abadir, Enhanced symbolic simulation for efficient verification of embedded array systems, in Proceedings of the 2003 Conference on Asia South Pacific Design Automation (ASPDAC), Kitakyushu, Japan, 2003.
[63] C. Seger and R. Bryant, Formal verification by symbolic evaluation of partially-ordered trajectories, in Formal Methods in System Design, 1993, pp. 147-190.
[64] C.-J. H. Seger, R. B. Jones, J. W. O'Leary, T. Melham, M. D. Aagaard, C. Barrett, and D. Syme, An industrially effective environment for formal hardware verification, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, pp. 1381-1405, 2005.
[65] P. Godefroid, N. Klarlund, and K. Sen, DART: directed automated random testing, in Proceedings of the 2005 ACM SIGPLAN Conf. Programming language design and implementation (PLDI '05), vol. 40, no. 6, 2005, pp. 213-223.
[66] K. Sen, D. Marinov, and G. Agha, CUTE: a concolic unit testing engine for C, in Proceedings of the 10th European Software Engineering Conference/13th ACM SIGSOFT Int. Symp. on Foundations of Software Engineering (ESEC/FSE-13 2005), 2005, pp. 263-272.
[67] T. Reps, M. Sagiv, and R. Wilhelm, Static program analysis via 3-valued logic, in Proceedings of the 16th Int'l Conf. Computer Aided Verification (CAV 2004), ser. Lecture Notes in Computer Science, vol. 3114, Boston, USA, July 13-17 2004, pp. 15-30.
[68] M. Sagiv, T. Reps, and R. Wilhelm, Parametric shape analysis via 3-valued logic, ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 24, no. 3, pp. 217-298, 2002.
[69] A. Fehnker, R. Huuck, B. Schlich, and M. Tapp, Static analysis for microcontrollers, in Proceedings of the Current Trends in Theory and Practice of Computer Science (SOFSEM '09), ser. Lecture Notes in Computer Science, Špindlerův Mlýn, Czech Republic, 2009, to appear.
[70] J. Regehr and A. Reid, HOIST: A system for automatically deriving static analyzers for embedded systems, ACM SIGOPS Operating Systems Review, vol. 38, no. 5, pp. 133-143, 2004.
[71] J. Regehr and U. Duongsaa, Deriving abstract transfer functions for analyzing embedded software, in Proceedings of the ACM SIGPLAN/SIGBED Conference on Language, Compiler, and Tool Support for Embedded Systems (LCTES 2006), Ottawa, Canada, 2006, pp. 34-43.
[72] J. Bergeron, M. Debbabi, M. M. Erhioui, and B. Ktari, Static analysis of binary code to isolate malicious behaviors, in Proceedings of the 8th Workshop on Enabling Technologies on Infrastructure for Collaborative Enterprises (WETICE 1999), Stanford, USA, 1999, pp. 184-189.
[73] D. Brylow, N. Damgaard, and J. Palsberg, Static checking of interrupt-driven software, in Proceedings of the 23rd International Conference on Software Engineering (ICSE 2001), Toronto, Canada, 2001, pp. 47-56.
[74] F. Martin, M. Alt, R. Wilhelm, and C. Ferdinand, Analysis of loops, in Proceedings of the 7th International Conference on Compiler Construction (CC 1998), ser. Lecture Notes in Computer Science, vol. 1383, Lisbon, Portugal, 1998, pp. 80-94.
[75] J. Regehr, A. Reid, and K. Webb, Eliminating stack overflow by abstract interpretation, in Proceedings of the 3rd International Conference on Embedded Software (EMSOFT 2003), Philadelphia, USA, 2003, pp. 306-322.
[76] C. Linn, S. Debray, G. Andrews, and B. Schwarz, Stack analysis of x86 executables, 2004, visited: May 2009. [Online]. Available: http://www.cs.arizona.edu/people/debray/papers/stack-analysis.ps
[77] C. Cifuentes and A. Fraboulet, Intraprocedural static slicing of binary executables, in Proceedings of the International Conference on Software Maintenance (ICSM 1997), Bari, Italy, 1997, pp. 188-195.
[78] A. Lal and T. Reps, Reducing concurrent analysis under a context bound to sequential analysis, in Proceedings of the 20th International Conference on Computer Aided Verification (CAV 2008), Princeton, USA, 2008.
[79] S. Qadeer and J. Rehof, Context-bounded model checking of concurrent software, in Proceedings of the 11th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS 2005), ser. LNCS, vol. 3440, Edinburgh, UK, 2005, pp. 93-107.
[80] A. Lal, T. Touili, N. Kidd, and T. Reps, Interprocedural analysis of concurrent programs under a context bound, in Proceedings of the 14th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS 2008), ser. LNCS, vol. 4963, Budapest, Hungary, 2008, pp. 282-298.
[81] F. Nielson, H. Nielson, and C. Hankin, Principles of Program Analysis. Springer, 2004, ISBN 3540654100.
[82] J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 1979, ISBN 0201441241.
[83] G. E. Moore, Cramming more components onto integrated circuits, Electronics Magazine, vol. 38, no. 8, April 1965. [Online]. Available: ftp://download.intel.com/museum/Moores_Law/Articles-Press_Releases/Gordon_Moore_1965_Article.pdf
[84] J. P. Arpasi, Introduction to ternary logic, November 2003. [Online]. Available: http://www.aymara.org/ternary/ternary.pdf
[85] R. Martin, Agile Software Development, Principles, Patterns, and Practices. Prentice Hall, 2002, ISBN 0135974445.
[86] V. Kamin, Extending the symbolic representation of states in [mc]square, Master's thesis, RWTH Aachen University, Department of Computer Science 11, 2008, (in German).
[87] A. Aho, M. Lam, R. Sethi, and J. Ullman, Compilers: Principles, Techniques, and Tools, 2nd ed. Addison Wesley, 2006.
[88] J. Blieberger and B. Burgstaller, Symbolic reaching definitions analysis of Ada programs, in Lecture Notes in Computer Science, 1998.
[89] S. Mahlke, Introduction to compilers: Dataflow analysis, liveness analysis, reaching definitions, 2003, Lecture Notes EECS 483 (Lecture 18), University of Michigan, November 2003. [Online]. Available: http://www.eecs.umich.edu/~mahlke/483f03/lectures/483L18.pdf
[90] B. Schlich, J. Löll, and S. Kowalewski, Application of static analyses for state space reduction to microcontroller assembly code, in Proceedings of the 12th Int'l Workshop Formal Methods for Industrial Critical Systems (FMICS 2007), ser. Lecture Notes in Computer Science, vol. 4916, Berlin, Germany, 2007, pp. 21-37.
[91] K. Yorav and O. Grumberg, Static analysis for state-space reductions preserving temporal logics, Formal Methods in System Design, vol. 25, pp. 67-96, 2004.
[92] P. Cousot and R. Cousot, Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints, in Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, 1977.
[93] B. Anckaert, M. Madou, and K. D. Bosschere, A model for self-modifying code, in Lecture Notes in Computer Science, vol. 2007, 2007, pp. 232-248.
[94] R. Heckmann and C. Ferdinand, Worst-case execution time prediction by static program analysis, White Paper, 2008, AbsInt Angewandte Informatik GmbH. [Online]. Available: http://www.absint.com/aiT_WCET.pdf
[95] C. Healy, M. Sjödin, V. Rustagi, D. Whalley, and R. V. Engelen, Supporting timing analysis by automatic bounding of loop iterations, Real-Time Systems, vol. 18, pp. 129-156, 2000.
[96] M. Weiser, Program slicing, in Proceedings of the 5th International Conference on Software engineering (ICSE '81), San Diego, USA, 1981, pp. 439-449.
[97] P. Horowitz and W. Hill, The art of electronics. Cambridge Univ. Press, 1980, ISBN 0521370957.
[98] J. J. Labrosse, MicroC/OS-II The Real Time Kernel. CMP Books, 2002, ISBN 1578201039.
[99] M. J. Pont, Patterns for time-triggered embedded systems: building reliable applications with the 8051 family of microcontrollers. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., 2001.
[100] L. Lamport, Sometime is sometimes not never: on the temporal logic of programs, in Proceedings of the 7th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '80), Las Vegas, USA, 1980, pp. 174-185.
[101] E. A. Emerson and J. Y. Halpern, Sometimes and not never revisited: on branching versus linear time (preliminary report), in Proceedings of the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL '83), Austin, USA, 1983, pp. 127-140.
[102] D. Gückel, Extending the model checker [mc]square by user-defined environments, Master's thesis, RWTH Aachen University, Department of Computer Science 11, December 2007, (in German).
[103] B. Schlich, D. Gückel, and S. Kowalewski, Modeling the environment of microcontrollers to tackle the state-explosion problem in model checking, in Proceedings of the 7th Symp. Formal Methods for Automation and Safety in Railway and Automotive Systems (FORMS/FORMAT 2008), Budapest, Hungary, 2008, pp. 27-34.
[104] D. Brand and P. Zafiropulo, On communicating finite-state machines, J. ACM, vol. 30, no. 2, pp. 323-342, 1983.
List of Figures
3.1 Formal verification methods classification.
3.2 CTL examples and intuitions.
3.3 The model checking workflow.
3.4 The coffee vending machine example.
3.5 The model checking workflow of the [mc]square approach (cf. Figure 3.3).
3.6 The [mc]square framework.
3.7 C51Simulator verification process.
3.8 Software architecture of the C51Simulator.
4.1 Over- and under-approximation in abstraction [81].
4.2 Nondeterministic state space representation.
4.3 The Delayed Nondeterminism approach of handling the MOV [0xA, 0xB] instruction.
4.4 The state-explosion problem.
4.5 Successor state generation and resulting system states with options: instantiate immediately, Delayed Nondeterminism, and Delayed Nondeterminism with Look Ahead for the assembly code presented in Listing 4.3.
5.1 The resulting CFG for Listing 5.1.
5.2 Data-flow analysis.
5.3 LVA example.
5.4 The [mc]square static analysis framework for the Intel MCS-51 target.
5.5 The type hierarchy of the C51LVALatticeElement.
5.6 The type hierarchy of the C51LVABuilder.
5.7 The type hierarchy of the C51RDALatticeElement.
5.8 The type hierarchy of the C51RDABuilder.
5.9 Bit-wise modeling of the register bank selection pointer.
5.10 The join-operator and a simple CFG.
5.11 The corresponding CFG as generated with [mc]square for the assembly code in Listing 5.7.
5.12 The principle of PR.
6.1 The target application.
6.2 The knitting machine monitoring device.
6.3 The foreground/background design pattern.
6.4 The software components.
6.5 A software circular buffer model.
6.6 Communication sequence chart.
6.7 The unfair path.
6.8 The model checking workflow of [mc]square with UDE (cf. Figure 3.3).
6.9 A first UDE automata proposal (U1).
6.10 The final UDE automata (U2).
List of Tables
3.1 Memory representation in [mc]square.
3.2 ND memory representations and resulting value combinations.
4.1 Data memory size and resulting system states.
4.2 Comparison of abstraction techniques for the C51Simulator.
4.3 Memory contents before and after the MOV instruction.
4.4 Truth table for 3-valued logic.
4.5 How bitmasks are used in embedded software.
4.6 Details on the Delayed Nondeterminism with Look Ahead approach.
4.7 The ADDC [A, R0] example.
5.1 Results after solving data-flow equations for source code Listing 5.3.
5.2 Results after solving LVA data-flow equations for source code Listing 5.4.
5.3 Action List Building - a few examples.
5.4 Register bank configurations of the Intel MCS-51.
5.5 Evaluating killLV () for MOV [R0, #const].
5.6 Comparison of resulting live variables.
6.1 Ringbuffer elements and their size.
6.2 The master-slave communication protocol.
6.3 Case study variables and their meaning.
6.4 Definitions for UDE modeling.
6.5 Case study results.
6.6 Case study results for plain state space building.
List of Algorithms
1 A fixed point iterating algorithm to solve data-flow equations for the RDA problem [87].
2 A fixed point iterating algorithm to solve data-flow equations for the LVA problem [87].
3 CFG building algorithm for the Intel MCS-51 target.
Listings
4.1 Assembly code excerpt.
4.2 Embedded C code example program for the Intel MCS-51 target.
4.3 Translated assembly code for source code lines 4-5 of Listing 4.2.
4.4 The Delayed Nondeterminism visitor pattern for the ANL [direct, #immediate] instruction.
4.5 The Delayed Nondeterminism with Look Ahead visitor pattern for the ANL [direct, #immediate] instruction.
4.6 The Delayed Nondeterminism visitor pattern for the ADDC [A, R0] instruction.
4.7 The Nondeterministic Program Status Word visitor pattern for the ADDC [A, R0] instruction.
5.1 Source code used for CFG building.
5.2 RDA example code.
5.3 RDA example code.
5.4 LVA example code (cf. [81]).
5.5 Code sharing within the program memory.
5.6 The Action List Builder visitor pattern for the ADDC [A, direct] instruction.
5.7 Example assembly code.
5.8 Intel MCS-51 assembly snippet.
5.9 C source code containing switch statement.
5.10 Switch Statement Assembler code snippet.
5.11 Program memory content.
5.12 Entry addresses for called functions.
6.1 Ringbuffer C code macro.
6.2 The erroneous receiver state machine implementation.
6.3 The revised receiver state machine implementation.
List of Abbreviations
ACM  Association for Computing Machinery
ASCII  American Standard Code for Information Interchange
ASIC  Application Specific Integrated Circuit
BDD  Binary Decision Diagrams
CFA  Control Flow Analysis
CFG  Control Flow Graph
CISC  Complex Instruction Set Computer
COTS  Commercial Off The Shelf
CPU  Central Processing Unit
CTL  Computation Tree Logic
CTL*  Computation Tree Logic*
DND  Delayed Nondeterminism
DNDlA  Delayed Nondeterminism with Look Ahead
DVR  Dead Variable Reduction
ECM  Electronic Control Module
FIFO  First In First Out
FPGA  Field Programmable Gate Array
FSM  Finite State Machine
GNU  GNU's Not Unix
GUI  Graphical User Interface
IC  Integrated Circuit
IE  Interrupt Enable
IFA  Interrupt Flag Analysis
IP  Intellectual Property
IRAM  Internal Random Access Memory
ISR  Interrupt Service Routine
LED  Light Emitting Diode
LTL  Linear Temporal Logic
LVA  Live Variable Analysis
ND  Nondeterministic
NDPSW  Nondeterministic Program Status Word
PC  Program Counter
PDAG  Propositional Directed Acyclic Graph
PLC  Programmable Logic Controller
POR  Partial Order Reduction
PROMELA  Process or Protocol Meta Language
PR  Path Reduction
PSW  Program Status Word
RAM  Random Access Memory
RBA  Register Bank Analysis
RDA  Reaching Definition Analysis
RISC  Reduced Instruction Set Computer
ROM  Read Only Memory
RPM  Revolutions Per Minute
RTL  Register Transfer Level
SA  Stack Analysis
SDCC  Small Device C Compiler
SFR  Special Function Register
SIES  Symposium on Industrial Embedded Systems
STE  Symbolic Trajectory Evaluation
UART  Universal Asynchronous Receiver Transmitter
UDE  User Defined Environment
VHDL  Very High Speed Integrated Circuit Hardware Description Language