Lectures Flashcards

(148 cards)

1
Q

What is Reaching Definitions Analysis?

A

For each location, RDA determines for each variable v the last locations where v may have been assigned.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

In what direction does the LVA go and why?

A
  • When a variable is written, it becomes dead in the locations
    before it
  • When a variable is read, it becomes alive in the locations
    before it

-> So for us to determine for each location which variables may potentially be read before their next write, the LVA should traverse the CFA in backward direction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

When a variable is alive?

A

When a variable is read, it becomes alive in the locations
before it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

When is a variable dead?

A

When a variable is written, it becomes dead in the locations
before it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is a live variable?

A
  • A variable is live at a location if there exists a path from this
    location to a read of the variable (without being written in-between)
  • We assume all output variables Var_out to be live at the end of a PLC cycle
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Define LVA.

A
  • Live Variables Analysis
  • Determines for each location which variables may potentially be read before their next write
  • Can be used to avoid storing information about dead variables
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the connection between CFAs and Semantics?

A
  • All program executions (as given by our semantics) are represented in the CFA
  • Not all paths through the CFA are possible
  • When the model contains all real behaviours but might include spurious behaviours, it is called an overapproximation, and it is typical for static analysis
  • Reason: Exact analysis is typically impossible or prohibitively expensive
  • Overapproximation gives us a specific kind of inexactness ⇒ Results are still useful
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How would a CFA look like for a while statement?

A

We remove the exit location at CFA 1 and redirect all incoming edges the the global “l” entry location. The “l” entry location has two edges - the one leads to CFA 1 if b holds and the other leads to the exit location 0, if b doesn’t hold.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How would a CFA look like for an IF statement?

A

We remove both exit locations at CFA 1 and CFA 2, redirect their incoming edges to an 0 state which is the global exit location. e add a global entry location and if the condition holds, then we go to CFA 1, if not - to CFA 2.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How would a CFA for a sequence of stmt’s look like?

A

If we have CFA 1 and CFA 2, we remove the exit location of the CFA 1 and every edge that leads to the removed exit location we redirect to the entry location of CFA 2. We set the global entry location to the entry location of CFA 1 and the global exit location to th exit location of CFA 2.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Define CFA.

A

A Control Flow Automaton (CFA) is a 4-tuple
A = (L, E, entry, exit) where
▶ L ⊆ N is the finite set of locations
▶ E ⊆ L × Frag × L is the finite set of edges
▶ entry ∈ L is the entry location
▶ exit ∈ L is the exit location

  • define the paths through the program
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Why do we need control flow?

A

Because the behaviour of a program at locations is dependent on surrounding locations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is SOS and how do they work?

A
  • Structural Operation Semantics
    • Define behaviour of program through behaviour of its parts
  • Specification in terms of inference rules (SOS-rules)
    -> Premise/Conclusion
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Explain a Concrete Transition Relation.

A

A concrete transition relation

→ ⊆ (Stmt × Σ) × ((Stmt ∪ {↓}) × Σ)

defines the evolution of concrete configurations during program execution.

t.e. kak se promenqt konfiguraciite dokato palim programata

t.e. 2 konfiguracii oznachava dvoika - edin stmt (programa) i stoinostite na variables v neq programa

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a concrete configuration?

A

We call a pair ⟨stmt, σ⟩ ∈ ((Stmt ∪ {↓}) × Σ) a concrete configuration.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Explain the evaluation functions.

A

Given a state σ ∈ Σ, the evaluation functions

  • val^Z_σ : AExpr → Z
  • val^B_σ : BExpr → B

are defined recursively over the structure of the expressions.

Example:
Given σ = [x → 3, y → 5] ∈ Σ, it holds that
* val^Z_σ (4) = 4
* val^Z_σ (x) = 3
* val^Z_σ (x + 4) = 7
* val^B_σ (x + 4 > 5) = true

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is a concrete state space?

A

The concrete state space is the set of such valuations

Σ := {σ | σ : Var → Z}

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is a concrete state?

A

A concrete state is a function assigning a value to each variable σ : Var → Z

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

When is a ST_0 program P valid?

A

P is a valid ST0 program ⇐⇒ stmt_P has the form CYCLE stmt_rem where stmt_rem does not contain CYCLE stmt or IO

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is ST_0 program?

A

An ST0 program P is a tuple
P = (Var_in , Var_state , Var_out , stmt_P , init_P)

Var_in and Var_out are sets of the input and output variables.

Var_state are the state variables (t.e. vsichki variables deto se definirat za da se izplozvat v programata)

stmt_P is a labelled statement over the variables Var (Var is the set of all the variables above)

init_P is the initial assignment of all program variables

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is ST_0

A

Simplified programming language for Structured Text

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What are formal methods?

A

Formal methods are made of mathematically tools for
the specification, design and verification of software and
hardware systems

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is the purpose of formal methods?

A

Tests give limited results, there are too many things and instances to be tested. That is why we need guaranteed system properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What does a definition of a variable means?

A

The assignment of a variable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
A definition of a variable v at a location l_i reaches a location l_j if?
A path exists and v is not assigned in any statement in-between.
26
How does RDA work?
If we have "res at 3", we look at the edges before res but not only until the entry node but also go to the exit, as if it was just another cycle!
27
What happens when a variable is read in RDA?
Variable reads have no impact on RDA information
28
What happens when a variable is written in RDA?
When a variable is written, the following locations gain pairs for that variable and edge, but lose pairs for that variable and other edges
29
In what direction does the RDA go and why?
- Variable reads have no impact on RDA information - When a variable is written, the following locations gain pairs for that variable and edge, but lose pairs for that variable and other edges -> The RDA should go in forward direction
30
What does write access do in the RDA?
It kills but also generates.
31
What do we do if a location has a several incoming edges?
We unite the RDA information.
32
What do we want in a good analysis overall?
The smaller a solution is, the more accurate it is. - Here, talks about over-approximation (For example, an over-approximating solution to the LVA equation system will definitely contain all live variables, but might also contain some dead variables, but we are interested in the smallest solution) => poset
33
What is a partially ordered set?
(D, ⊆) where ⊆ /in 2 hoch (DxD) is reflexive, transitive and anti-symmetric.
34
How do we depict posets?
With Hasse diagrams
35
What are Hasse diagrams good for?
Hasse diagrams depict posets where two elements d1, d2 from D have a nedge iff - d1⊆ (bez chertata otdolu) d2 (if one is strictly smaller than the other) - and there is no element that fits between them
36
What is an upper bound?
Let (D, ⊑) be a poset. An element d ∈ D is an upper bound of S ⊆ D if ∀s∈S s ⊑ d
37
What is a least upper bound?
Let (D, ⊑) be a poset. An upper bound d ∈ D of S ⊆ D is the least upper bound if d ⊑ d′ holds for every upper bound d′ of S
38
How do we denote the least upper bound?
We use d1 ⊔ d2 to denote the least upper bound of {d1, d2}. Ex. 1 ⊔ {1,3,6} = 6 because it is the biggest number Ex. 2 {x} ⊔ {y,z} = {x,y,z}
39
What is a complete lattice?
A complete lattice is a partial order (D, ⊑) where each S ⊆ D has a least upper bound. - For exmple, (N, <) has no greatest element
40
Define a dataflow system.
The dataflow system S = (L, I, F, (D, ⊑), ι, φ) is defined by ▶ a set of labels L ▶ initial labels I ⊆ L ▶ a flow relation F ⊆ L × Frag × L ▶ a complete lattice (D, ⊑) ▶ initial information ι ∈ D ▶ transfer functions φ
41
Explain an Induced Iteration Function
- Equation System contains recursive dependencies -> we have to compute the fixpoint iteratively - Termination is not guaranteed - Solution is not unique - So the least fixpoint is most precise - Updates of analysis information may cause it to oscillate ( = kolebaq se) - So we need transfer functions to be monotonic
42
What does it mean for a function to be monotonic?
Let (D, ⊑) and (D′, ⊑′) be posets and Φ: D → D′. Φ is monotonic iff ∀d1,d2∈D d1 ⊑ d2 ⇒ Φ(d1) ⊑′ Φ(d2)
43
Define a chain.
Let (D, ⊑) be a partial order. A non-empty subset S ⊆ D is a chain if ∀d1,d2∈S d1 ⊑ d2 ∨ d2 ⊑ d1 (basically ako ima edge mejdu tqh v hasse diagrams)
44
Define ascending and descending chains.
Let (D, ⊑) be a partial order and S ⊆ D a chain. S is called an ascending chain if ∃f :S→N∀s1,s2∈S : s1 ⊑ s2 ⇔ f (s1) ≤ f (s2) and a descending if ∃f :S→N∀s1,s2∈S : s1 ⊒ s2 ⇔ f (s1) ≤ f (s2)
45
What is the connection between a transfer function and chains?
Monotonic transfer functions generate a (possibly infinite) ascending chain for every location -> Need guarantee for termination
46
What is the Ascending/Descending Chain Condition?
A partial order (D, ⊑) satisfies the ascending/descending chain condition (ACC/DCC) if all ascending/descending chains S ⊆ D are finite.
47
Explain the Worklist Algorithm.
Create a stack of edges ∈ F . In each step, calculate the transition for the top edge, and add all successor edges to the stack if information at the target changed.
48
What does the most precise solution correspond to?
A least fixpoint.
49
What are the requirements for a Fixpoint Theorem?
1. Complete lattice (D, ⊑) 2. Monotonicity of Φ
50
What is the Fixpoint Theorem?
If you have a monotonic function f on a complete lattice (a special kind of poset where every subset has a least upper bound and greatest lower bound), then f f has at least one fixpoint.
51
Why is the termination guaranteed?
Termination guaranteed by ACC.
52
What is the purpose of the Worklist Algorithm?
Worklist algorithm avoids recomputation of unaffected AIs. It is sound but it requires the ACC condition for termination!
53
Why do we need a VSA?
Many unwanted program behaviours depend on the possible valuations of the program variables * Division by zero * Overflow * Infinite loops ⇒ We can do some sort of error-detection by calculating the possible value sets for each variable, at each location
54
What is a VSA?
A value set analysis (VSA) is a dataflow analysis that, for each location, stores a representation of possible valuations of the program variables
55
What should be the domain of the VSA?
Value set analysis has to provide value sets for all program variables ⇒ The domain should be some sort of assignment
56
What is an Assignment Lattice?
Contains all possible values with the join operator
57
What does the domain in VSA use?
Intervals, instead of integers - reduced complexity, more abstract - for the intervals we again use overapproximation
58
What do we need for perfect modelling?
A SMT Solver but it is expensive!
59
Explain Value Set Analysis.
- Motivation : Many unwanted program behaviours depend on the possible valuations of the program variables * Division by zero * Overflow * Infinite loops ⇒ We can do some sort of error-detection by calculating the possible value sets for each variable, at each location - Domain: Value set analysis has to provide value sets for all program variables ⇒ The domain should be some sort of assignment: assignment lattice - contains all possible values - We use intervals, not integers, overapproximation
60
Why is ST0 used instead of ST?
ST is a high-level programming language that means it has weak formalisms. It does not have semantics. For automated analysis, we want something with well-defined semantics. Which is ST0. ST0 is more simplified as it has ● no function calls ● no complex data structures or references ● no boolean variables
61
How do we define the semantics of ST_0?
State ● State Space ● Expression (arithmetic and boolean) ● Transition relation ● SOS (Structural Operational Semantics) ○ Define a program behaviour by defining the parts ○ Rules Premise and conclusion ○ 10 rules (skip, IO, cycle, asgn, if1 x 2, while x 2, seq x 2) -> Syntax of an ST0 expression: ● arithmetic constant and variable ● arithmetic addition, subtraction, multiplication, negation ● boolean value, equality, greater than, negation, conjunction ● other things by syntactic sugar Syntax of an ST0 statement: ● skip, assignment, first do this stmt then that stmt ● if b then this stmt else that stmt, while b then stmt ● cycle stmt, I/O Components is an ST0 programme: ● var_in, var_out, var_state ● init_p, stmt_p
62
What’s the difference between weak and strong formalisms?
Strong formalism: syntax and semantics Weak formalism: syntax
63
LVA, RDA - once we have the common pattern, why do we use the common pattern algorithm? What's the benefit of it? how do we conduct the data flow analysis?
LVA: ● Determines for each location which variable may be potentially read before the next write ● A variable is alive if there is path from the location to the read without any write in between ● Analysis is done in backward direction in CFA RDA: ● Determines for each location, where the variable was last defined/ assigned ● Additional virtual location and edge to denote the initial assignment ● Analysis is done forward direction ● Used for slicing Common Pattern Algo ● try to fuse both the functions to a into a Analysis Info ● Use AI to create Dataflow system and solve by Induced iteration function ● Posets ● Complete lattice ● (L, I, F,(D,<),I,phi)
64
What is Concrete State Space?
It is a set of functions in which each function assigns a value to all variables.
65
What is a Concrete Transition Relation?
It tells how to go from one place to another. From a non-terminated statement and a state, we go to a statement which may be terminated and a new state.
66
Tell about Structural Operational Semantics and its transition relations.
SOS define behaviour of a programme through behaviour of its parts. It gives a syntax oriented, inductive definition of semantics. Premise (Rule) ------------------- Conclusion Iterative application yields a derivative tree.
67
Why do we use LVA and RDA?
● Provable Correctness: Soundness and completeness of the algorithm can be formally proven based on mathematical principles, guaranteeing reliable results with minimal room for error. ● Efficiency: The system can be solved efficiently using iterative algorithms, such as worklist scheduling. ● Modularity: The system is modular, so that different dataflow analysis problems can be solved by changing the transfer functions and the meet operator.
68
What is the benefit of LVA and RDA?
● Efficiency: The system can be solved efficiently using iterative algorithms, such as worklist scheduling. ● Modularity: The system is modular, so that different dataflow analysis problems can be solved by changing the transfer functions and the meet operator.
69
Does least fix point terminate?
Yes, because of the ascending chain condition.
70
What is ACC? How does it help in the termination?
ACC: Every subset of the domain D should be finite Let (D, ⊑) be a partial order and S ⊆ D a chain. S is called an ascending chain if ∃f:S→N∀s1,s2∈S : s1 ⊑ s2 ⇔ f(s1) ≤ f(s2) ACC ensures there are only finite set of elements
71
How do we use VSA and why?
VSA ● For each location, it stores a representation of a possible set of values of a program variable. Why? ● Detecting unwanted program behaviour ● Generate warning: ○ Unreachable code: with empty assignment for all variables ○ Datatype overflow: x=a, width(x) might be small, all can be detected ○ Expression annihilation: Mult by 0
72
What is a Control Flow Automata? How is it different from ST0?
CFA is a graphical representation of an ST0 programme. It is a 4 tuple: ● L: finite set of locations ● E: finite set of edges ● entry and exit location CFA may over-approximate the ST0 programme, thus not all paths may be feasible. Give example of x*x<0
73
What is a Complete Lattice?
It is a partial order, where each subset has a least upper bound.
74
What are the components of a Dataflow System?
It is a mathematical structure used to hold analysis information. ● a set of labels ● initial labels ● a flow relation - tell how to go from loc to loc by executing a fragment ● a complete lattice ● initial information ● transfer function - tells how an edge transforms information
75
What is an Induced Iteration Function? What are the problems with it?
Here we are taking 1 piece of information for each location in our programme and making with this iteration function new information for each location. It takes AI and computes new AI. It computes the fixpoint iteratively. Problems: ● termination not guaranteed ● solution not unique ● least fixpoint is most precise
76
What is monotonicity?
A poset is monotonous if for all subsets in it, if a subset is smaller than or equal to another, then even after applying the transfer function, the smaller than or equal to relation should hold. It is required so that updates of AI do not oscillate.
77
Why is widening used and how does it guarantee termination?
The problem with VSA is that it is non terminating so widening is the solution Example: If there is an infinite ascension, instead of incrementing one by one, set the interval to infinity and propagate forward. Makes very safe over-approximation but leads to loss of precision. Widening turns infinite chains into finite chains which causes guarantees for termination. The operator must be sound (widening operator gives us something greater than equal to join operator, tells that widening is an over approximation) and stable (order does not matter, will reach fix point, tells that the operator makes our analysis terminate)
78
What is narrowing? Why should we use it?
Narrowing: - Fixed-point iteration may not terminate in absence of ACC - Widening turns infinite into finite asc chains by over-approximation -> termination, but loss of precision - Narrowing is better in over-approximation but introduces infinite desc chains -> non-termination -> enforce via narrowing operator - - Sensitive to Worklist order
79
How do we integrate widening in the worklist algorithm?
We use join operator to combine old and new information so we widen old information with respect to the new one but it is sensitive to worklist order
80
What is Control Dependency?
Given a CFA, a location l or edge e is control dependent on another location l’ if l’′ “decides” whether l or e are executed or not. Is 3 contron dependent on 1? Dali 1 reshava dali 3 moje da byde poseteno ili ne V toq sluchai ne, zashtoto 2 reshava dali 3 da byde poseteno, ne 1 No 5 e control dependent on 1, zashtoto ako 1 izbere nadqsno - nqma da poluchim 5, ako izbere nalqvo - shte poluchim 100% 5
81
What is the Control Dependency Function?
In an ST0 programme, the control dependency function will give for each fragment, the location it is control dependent on or if such a location does not exist. Denoted as 〚.〛
82
What is the Programm Dependence Graph?
It is the combination of RDA and control dependency. The vertices of the PDG are the edges of the CFA. The edges of the PDG are the dependencies between the vertices. Slicing is done on PDG
83
What is slicing and its types?
Given with an instruction line and a set of variables, ● backward slicing computes the relevant code and variables on which they depend on ● forward slicing computes the code affected by them
84
What are the slicing dimensions?
-> They are: ● Syntax ○ preserving: just delete statements from the original programm ○ amorphous: new statements possible ● Semantic ○ strong: on termination, the sliced programme behaves as the original ○ weak: may not terminate or behaves differently ● Size ○ less: fewer instructions than gathered by PDG traversal ○ equal: same amount of instructions than gathered by PDG traversal ○ more: more instructions than gathered by PDG traversal Not all combinations are possible in general.
85
What are the other types of slicing?
● static: no assumptions on input values ● dynamic: fixed value for each input ● conditioned: boolean constraints on input values
86
What are the applications of slicing?
They are: ● Support of debugging - original motivation ● Model reduction for verification ● Predict impact of changes in the code ● Reduce necessary tests in regression testing ● Refactoring - method extraction
87
What is the motivation behind Galois connections?
- Proving certain properties may not require precise information or too costly - Instead of operating on lattice of concrete values, use of abstract values -> so we need a relation between concrete and absolute domain
88
Define alpha and gamma functions.
- Alpha is the abstraction function, gamma is the concretisation function. - The abstraction function over-approximates - Abstraction after concretisation yields no imprecision (you won’t lose any accuracy)
89
What is a Galois connection?
(alpha, gamma) if alpha and gamma monotonic
90
What are the galois connections useful for?
Useful for transforming computations into approximate computations with better time, space, and termination.
91
How do we transform exactly computations into approximate?
Two ways of transforming the analysis: 1. Replace analysis using L with an analysis using M (inducing along abstraction) 2. Only use M for approximating the fixed point computations in L (inducing along concretisation)
92
Talk about the Inducing along Abstraction.
-You completely switch from the concrete to the abstract domain. -You define a new analysis in the abstract world (M). - First we have concretisation, then the f function, the abstraction - This is very expensive, that is why we use safe approximation
93
What is a safe approximation?
- it overapproximates the result of the original function in a controlled way. - Approximation is safe if after f# we get something larger than or equal to what we had. Ideally equal, but realistically as small as possible. It might miss some specifications, but it never introduces unsafe behaviour that didn't exist in the original system. - It is possible for some cases or hard for others (for example, the xor operator for the addition function works good for signs, but not for intervals)
94
Talk about the Inducing along Concretisation.
- Use the abstract domain just to perform widening to make sure the analysis terminates, but we use concrete analysis, otherwise now we have something abstract, we are using it to build something concrete which is why this is called inducing along the concretisation function. - We need to find the least fixpoint, so we use a widening operator to induce a concrete widening operator. - Steps: 1. Abstract from both l1 and l2 2. Perform widening on abstract domain M 3. Concretise widened result to L - Properties of the new concrete widening operator 1. Approximates least fixed point of f : L → L while calculating over M 2. Advantage: Coarse structure of M only used to ensure convergence but does not reduce precision of other operations
95
How can multiple galois connections be combined?
2 ways - Sequential composition - You apply one Galois connection after another like chaining steps, when you want to abstract in stages - Parallel composition - you combine different aspects of the program side by side - Independent attribute method - analyze each property separately and combine results - Relational method - analyze how the properties are related
96
What do we represent with Galois Connections?
Values and functions
97
What is the problem with Galois Connections?
We now want an abstract version of the semantics for efficient use ▶ Problem: Abstraction via Galois Connections abstracts values and functions, not relations ▶ Solution: Reinterpret transition relation as transition function and abstract with the known method
98
What is an Extraction Function and how to use Safe Approximation on it?
It is the abstraction through point-wise mapping. It takes a concrete value from the concrete set and builds an abstract value.
99
What is State Space Exploration?
Exploring a state space is the process of enumerating possible states in search of a goal state.
100
What is the "next" function?
A concrete transition function
101
what is a concrete transition function?
it uses the concrete transition relation to transform from one statement to another.
102
what is an abstract transition function?
it is a function in which we take a concrete transition function and make it more abstract. this is done by doing galois connections. to sum it up, it is a concept which is used to describe the relationship between a concrete transition function and a galois connection. We just talked about the next function: next:P(State)→P(State) It takes a set of states and returns all states reachable in one step. This is concrete because it works with the real program states. 2. Why abstract it? The problem: The set of states is often infinite (e.g. all integers, all memory configurations). Computing next exactly is not feasible. So instead of exact states, we want to compute over a smaller, abstract domain M.
103
what is an abstract state and abstract state space?
An (abstract) state is a function assigning an (abstract) value to each variable. The (abstract) state space is the set of such valuations.
104
what is plc
programmable logic controller
105
How are PLC states different?
State of PLC during cycle not observable. When engineers talk about state, they mean the valuation at the end/beginning of a cycle. An additional cycle SOS rule is required.
106
explain cegar
because the states of the plc are not observable, we use cylce transition relations and functions because abstract space state suggests violation because the chosen abstraction is too coarse. so we need other abstraction. -> cegar - it is an iterative algorithm used for model checking , particurarly suitable for verifying large and complex software systems. it uses an abstract model , simplified from the original system and iteratively improves its accuracy -> explain the diagram : initial abstraction -> property satisfied -> yes -> property verified -> no Analyse counter example -> spurious -> refine by removing counterexample -> realisable -> violation found
107
What is a specification automata?
it is a tuple with set of states, alphabet, initial state, transition relation, and set of alarm states
108
what does it mean to satisfy a specification?
each path in the transition system (abstract space state) must have a corresponding accepting run in the automaton -> each state in such path must fulfill the corresponding condition in the run for all concrete values
109
What is Linear Time Property?
A LT Property over AP(set of atomic propositions) is a language E of infinite words over the alphabet 2AP(power set of AP). eg: "Whenever a holds, b should hold for at least 2 states"
110
What are Bad Prefixes and Safety Properties?
Bad Prefix is a trace where the LT property is violated. This trace is finite. Safety property says that bad things should not happen. An LT property is a Safety Property if for all the traces No matter how we extend a bad prefix, we will not end up with a word in E
111
What is model checking
it is a formal verification technique used to automatically verify the correctness of a system based on a formal model and specified properties
112
How does Model Checking work?
Construct a synchronised product of the transition system and the specification automaton. This synchronised product tells us if an alarm state is reachable, if yes then the specification is violated. Safety properties easily expressible via alarm states."Something should not happen" ⇒ Encode into state-sequence ending in an alarm state.
113
what is ltl
linear temporal logic it is used to specify Safety “The system will never reach a bad state” ▶ Liveness “The system will always reach a desired state again” ▶ Both “The system will always reach a state from which no bad states are reachable” In LTL, one can encode formulae about the future of paths, e.g., a condition will eventually be true, a condition will be true until another fact becomes true
114
How do we model check LTL?
Model Checking Idea: 1. Negate and normalise the LTL formula (only variables should be negated) 2. Transform the negated LTL formula into a Büchi automaton (BA) A¬Φ 3. Check if L(TS × A¬Φ) = ∅, check whether the acceptance condition can not be triggered.
115
What is a Büchi Automaton?
An automaton which can accept/reject infinite inputs it is more powerful than ltl and specification automata but hard to use directly
116
How does Galois connection relate to model checking? What is model checking? What specifications are used for the representations?
Abstraction plays a crucial role in model checking, especially for large systems. Manually checking all possible states can be infeasible. Galois connections provide a structured way to create and reason about abstractions of the original model. Model Checking: It is a formal verification technique used to automatically verify the correctness of a system based on a formal model and specified properties Specifications used: - ltl - specification automata - büchi automata
117
Compare Specification Automata with LTL.
They are incomparable. ● LTL can not "count", like divide by 3, even or odd “whenever a holds, not a holds for an even number of times before a holds again” ● LTL can express more than safety properties, like expecting something to happen in the future, “there will always be a point from which a holds forever” ● Specification Automata are restricted to safety properties, only checks bad prefixes ● Büchi Automata are more powerful than both, but hard to use directly
118
Explain Explicit State Representation.
A set of states is represented explicitly, if the data of each state is stored explicitly, i.e. can be accessed and modified directly
119
Explain Symbolic State Representation.
A set of states is represented symbolically if data of each state is stored implicitly i.e. access requires processing or solving -> Space efficient storage possible -> Relation constrains easy to represent -> Transformation of formula causes transformation of state-set
120
Why do we want to use Symbolic State Representation?
In practice, state space is huge but the memory is limited but the reachable state space often features structures and symmetries. Most domains are hard to represent explicitly, e.g. x * y < const
121
Explain Explicit Transition Relation.
1. Relation stored as graph between explicit states 2. Checking reachability of some bad states amounts to graph traversal (the process of systematically visiting all elements within a data structure)
122
Explain Symbolic Transition Relation.
Relation stored as predicate over two variables instances i.e. source and destination valuation e.g. T (x, y, x′, y′) := x < y ∧ x′ = x ∧ y′ = y -> checking reachability of some bad state amounts to checking satisfiability of a formula
123
Compare Explicit and Symbolic Transition Relations.
Explicit - Modifying single state - Computational complexity Symbolic - Modifying set of states - Memory Efficiency The use of concrete or abstract domains is unrelated to explicit or symbolic representation ▶ All four combinations are possible and have pros and cons ⇒ Explicit concrete execution often infeasible due to memory constraints but feasible via symbolic (concrete) execution
124
What are the problems in SAT checking?
SAT is NP-complete We need SMT solver as formulae contain non-boolean variable
125
Explain SMT
▶ A (Boolean) SAT instance is a set of constraints over boolean variables ▶ An SMT instance is a generalisation of a SAT instance where some variables are replaced by predicates (toest zamestvame bukvite s chisla ili izrazi) -> An instance is satisfiable iff a valuation exists which satisfies all constraints
126
Explain the Solver Approaches of SMT.
1. Eager Approach -> reduce to pure SAT Checking but not suitabe for all theories and doesnt exploit high level information (e.g. de Morgan) 2. Lazy Approach -> * Prior to theory-specific reasoning, check Boolean abstraction * If abstraction satisfiable, take corresponding theory-constraints into consideration * CEGAR-based How it works: - Create a Boolean skeleton - Check if satisfiable (t.e. assign-i na vsichkite bukvichki true or false) - Check consistency of respective constraints (t.e. dali sme na b1 da e true i na b3 da e true ama v realnostta b1 e x < 0, a b3 e x = 7, tei che ne stava) - Learn an explanation for inconsistency (t.e. dobavi nqkakvo pravilo deto shte predotvrati tova) - Check new skeleton - Check if SAT - Check constistency No inconsistnecy? SMT instance is satisfiable
127
Explain Bounded Model Checking.
It is a method for testing software and hardware systems for errors but it works by simulating the system for a limited number of steps or bounds and checking if it violates any properties during simulation Given an initial constraint a bound k and a characterization of bad states we check if bad states are reachable in k states - if satisfiable, the valuations c_0, ..., c_k represent a path up to length of k exibiting the bad behavior -if unsatisfiable there is no such path up to the length of k, but there might be a longer one -> it can be much faster than other model checking, especially for large systems -> but it can only find errors that occur withing the specified bound, it can be difficult to choose the bound
128
What is incremental SMT solving?
manage constraints using a stack -> faster than monolithic approach
129
Explain incremental BMC.
- no need to consider k+1 th step if a bad state has been found at distance k -when considering k+1 th step, reusing auxiliary lemmata of previous (k-th) check possible - avoid reconsidering visited configurations -if there is a bad state reachable, there is a finite path leading to it (in both finite and infinite spaces) - one can prove that no bad state is reachable if reachable state space is finite and k is sufficiently large -> n-induction for infinite spaces
130
What is a reachability diameter?
minimal distance to reach all reachable states but computation is as hard as determing reachable state space
131
Do we encode at every location?
In practise, we are only interested in the valuation at the beginning ot at the end of a cycle
132
What is the strongest postcondition?
SP_stmt(P) is the strongest predicate holding after executing stmt in a state satisfying P - SP_stmt is the strongest postcondition predicate transformer for a statement stmt - if P=>Q, then P is stronger than Q t.e. ako imame x := x + 1 or x := x + 2 i x = 2 togava the strongest postcondition shte e x′=3∨x′=4 zashtoto e nai-ogranichavashto i nai-tochni otgovori dava
133
Explain Basic Block Encoding.
- A basic block is a maximal non-branching sequence of statements - Basic block encoding treats each basic block like a single composite statement, i.e. logically characterizing it without intermediate locations and valuations - apply repeatedly until fixpoint is reached - Combine semantics by applying strongest postcondition to an identity mapping
134
Explain Large Block Encoding.
- A large block is a maximal non-looping sequence of statements - Large block encoding treats each large block like a single composite statement - Apply repeatedly until fixpoint reached - Combine semantics by applying strongest postcondition to an identity mapping
135
How do we test symbolic?
▶ Instead of executing program on concrete inputs use symbolic inputs ▶ Explore all paths through the program (ambitious goal) ▶ For each path, collect constraints at branching points ▶ Solve constraints to generate concrete input values ▶ Direct search into new locations by negating constraints
136
Explain Symbolic Context.
consists of ▶ a program counter pc ∈ LP ▶ a symbolic store ρ : Var → SymVal which maps variables to to their (possibly) symbolic value ▶ a set of constraints Π corresponding to the choices at previous branches -> Execution of a symbolic context results in application of the corresponding statement’s symbolic semantics -> Termination - Procedure not complete - cannot determine unreachability. In practice, timeout is introduced -> Priority Queue - Maximises coverage, ordering by minimal distance to unvisited locations (most promising one is popped)
137
What are the test vectors for?
A test vector is simply an input/output example generated from symbolic execution. Formally, a test vector is a sequence of pairs: V=(Varin→Z×Varout→Z)∗ That means each step contains: -An input assignment (what values the input variables take) -An output assignment (what values the output variables produce) The ∗ means we can have a sequence over multiple cycles (time steps). Example: V1=[(x=2)↦(y=5),(x=3)↦(y=6)]
138
How do we generate tests?
The algorithm automatically builds a Structured Text test harness: Wrap the original program as a function block. Generate a driver program with a switch over test cases. For each test case, feed inputs, run the function block, check outputs, and report pass/fail.
139
What is the benefit of large block encoding?
- Reduced state space: combining blocks leads to fewer unique program states, speeding up verification - Faster analysis with SMT solvers: symbolically analyzes larger blocks, potentially manual effort - Reduced path explosion: fewer paths to explore by analyzing combined blocks, improving scalability
140
What does a ST_0 expression contain?
An arithmetic expression, Boolean expression, bitfield, and pointer expression
141
Which integers do we use in ST_0?
Before, we modeled program variables as infinite integers. But real hardware uses fixed-width registers. With ST_0, all bitfields are modeled as 8-bit values, so in verification we must model the exact number of bits to capture real hardware behavior correctly.
142
What are machine integers?
A machine integer of width n is a fixed-width n-bit vector that represents an integer value in hardware, with arithmetic performed modulo 2n. In abstract domains, machine integers may be extended with unknown bits (*) to represent uncertainty. We should not model bit field values with intervals but with machine integers. To use machine integers as abstract values, we need to know how to add, subtract, and multiply them (among other things) ⇒ We need a Galois connection to sets of integers.
143
What is Product Domain?
A product domain combines intervals and machine integers into pairs to get the best of both worlds — intervals for arithmetic, machine integers for bitwise operations — then reduces and normalizes them for sound analysis. //////// Machine integers are suitable for bitwise operations and represent memory locations of limited size. For arithmetic operations, the abstraction is not as suitable, intervals perform better- Ideally, wed like to combine the strengts of both abstractions. Idea. Store an interbal and a machine integer at the same time, using a modified version of the independent attribute method For intervals and machine integers, the product domain#s elements are pairs of one interval and one machine integer After this get the reduced product domain and normalise
144
What are the problems with pointers?
- Pointers need their own storage - Null pointer expressions get propagated to the top level and end up changing the output - Pointer Manipulation is a common source of errors
145
Explain SFC.
- Sequential Function Charts -Define the larger-scope behaviour of PLC over multiple layers -Components: *steps *transitions *actions *jumps *parallel/alternative branches -Steps can be active or unactive *activated -> entry action executed *deactivated -> exit action executed -if all of a transition#s predecessor steps are activ and the transition condition is fulfilled, the transition#s predecesssor steps are deactivated and its successor steps are activated -init_step - set of all steps that are active at the beginning of a program
146
Translation of SFC to ST_0.
-We need to track all currently active steps -> boolean var S.X for each step, tells if active -We also need to track whether a step will be active in the next cycle -> boolean var S._X for each step, tells if active in next step -Transitions - from each transition from steps S to T *check whether the input steps are active (S.X) *check if the transition has a fulfilled condition *if yes, set S.X to false and T._X to true *at the cycle end, set all .X to the corresponding ._X -Actions *for exit actions, S.X is true and S._X is false *for entry actions, S.X is false and S._X is true *for active actions, S.X is true -We define a statement stmtt for each t ∈ Trans and then concatenate them into one statement stmtTrans for all transitions, as well as one statement stmtStep to handle setting S.X to S._X.
147
Explain Analysis of SFC.
-All of our exisiting analysis can be applied to it with small modification to improve results -Symbolic methods, in particular bounded model checking, can be used as-is ▶ Unless actions contain loops, the construction is loop-free, which makes large-block encoding suitable ▶ Split into x and x′ variables allows to project away ._X -Abstract interpretation - mostly unmodified -Dataflow analysis - we get a valid overapproximation, but since the information from different steps overlaps in the ST0 code, the overapproximation is very coarse -> mode absraction
148
What is mode abstraction?
In order to split up dataflow information from different steps, we need to track information based on the currently active steps separately -> split variables into mode variables (tell what configuration the machine is in) and active variables (these actually do the operations) *in other analyses (than VSA), there is no direct way of gettinf the mode from the domain, so VSA is necessary (because in other analyses the domain is abstract so only the range is given, but VSA gives you the exact domain so we can use it to have the mode information)