Questions Flashcards

getgood (24 cards)

1
Q

At which cases do your static analysis outperform your dynamic?

A

Static analysis can reason about many possible inputs and execution paths at
once, including paths that dynamic analysis never explores. It performs better
when the program has many branches, large input spaces, or rare corner-case
behaviors, because it does not rely on specific test inputs to reach them. Dy-
namic analysis gives deep insight into the concrete runs it executes, but static
analysis covers a much broader range of potential behaviors.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

In what circumstances would you want a may vs a must analysis?

A

The explanation of both is:
* A may analysis finds all program states where a property may hold.
* A must analysis finds all program states where a property must hold.

If the goal is to find all states where the program may behave undesirably, you use a may analysis, because it over-approximates the program’s behaviors and therefore includes every state where an issue could potentially occur. If the goal is to find all states where the program is guaranteed to behave undesirably, you use a must analysis, because it under-approximates and returns
only states that the program truly reaches.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Is program analysis easy?

A

To determine the easiness, it is useful to use the classification from Rice’s the-
orem, which states that any non-trivial semantic property of programs is unde-
cidable.
For non-trivial semantic properties this means that semantic analysis is never
going to be perfect. A non-trivial semantic property means some property of
the behavior of the program during execution.
The difficulty depends on the purpose of the analysis and the program being
analyzed. If the purpose is a syntactic property of counting the number of
assertions, that analysis is not as hard. This is a syntactic property and therefore
is a decidable problem. Syntatic analysis can be always made perfect unlike
semantic analysis.
The structure of a program is easier to determine than the behavior of a
program, because the behavior is a semantic property. Semantic properties
depend on the behavior of the program on different inputs and by Rice’s theorem
that is undecidable.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Name cases where the choice of a dynamic analysis makes the most sense?

A

Most programming languages come with a built-in interpreter, so it is sometimes
easier to run the program than using syntactic analysis. A dynamic analysis
also gives the concrete trace that produces the error.
It also makes sense to use dynamic analysis when looking for true positive
warnings or errors that only shows up when running the program.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Name some ways can you describe the semantics of a program?

A

There are several methods, and we will mention three of them.
Axiomatic Semantics revolves around describing programs asserting precon-
ditions and postconditions to every statement in a program using Hoare triplets,
{P } C {Q}. Using this to prove correctness on parts of the program can lead to a proof of total correctness of the program, making it a strong tool for program
verification but not program analysis.
Denotational Semantics is about mapping the program to another language,
assigning it mathematical meanings. Using this method, the program’s meaning
can be described abstractly and in a precise manner.
Operational Semantics, which is the method relevant for the course, involves
describing the program as changes to a state. Within Operational Semantics
there is Structural Operational Semantics, aka Small Step Semantics, and Nat-
ural Operational Semantics, aka Big Step Semantics. Small Step Semantics
involves describing the change of the state of the program for a single opera-
tion, where Big Step Semantics describes the state of running the program until
it halts. Obviously, Big Step Semantics cannot reason about infinitely running
programs and cannot be implemented practically. Small Step Semantics can
however easily be converted into an interpreter, and the Big Step Semantics is
recoverable through using Small Step Semantics until the program terminates.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Is Java a regular language, context free or recursive enumerable?

A

Java cannot be a regular language (type 3), since regular languages do not encompass nesting which java heavily relies on (e.g., parentheses, blocks, and
expressions). By the Java language specification, Java’s syntax uses a context-
free grammar. All context-free languages are also recursively enumerable; this
makes Java’s syntax recursively enumerable as well. However, Java can also
be viewed as a programming language that runs the program and either halts
or does not. We are then looking at the set: LJava−halt= {p|p is a valid Java
program and halts}. This makes Java recursively enumerable but not decidable, since Java is Turing-complete. A Turing machine can recognize halting
programs by running them, but cannot decide for all programs if they halt.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are some good uses for SMT Solving?

A

SMT solving is useful for highly complex problems that can be solved logically
that include richer theories such as arithmetic, bit-vectors, arrays, and booleans.
In program analysis, it would be relevant using SMT solving to check for path
feasibility, generating test inputs, detecting assertion violations, etc. simply
put, it can logically reason about formulas of all sorts. Compared to keeping
intervals, the SMT solver is able to keep track of all logical constraints.
Other cases where SMT solving can be applied: NP hard problems, program
verification, bounded model checking, solving a system of linear equations, or
the N-queens problem.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the big limitations of Symbolic Ex-
ecution?

A

Although there are many advantages to using symbolic execution, abstracting
concrete values for memory aliasing and array-indexing makes no sense. For ex-
ample, you need a concrete value for finding where the data is stored (memory
aliasing). Additionally, one-way functions like hashing become very difficult to
solve. An significant challenge is path explosion, the number of possible exe-
cution paths grows exponentially with branching and loops, making exhaustive
exploration infeasible in practice. This would lead to extremely high memory
consumption and long analysis.
Symbolic execution depends on SMT solving that becomes computation-
ally expensive when constraints involve non-linear arithmetic, pointers, complex
data structures, or one way functions such as hashing.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the primary limitation and challenges of dynamic analysis?

A

It might be hard or impossible to set up the correct environment of a program
without running it in production.
The amount of traces can also make it impossible to test them all and a bug
might exist outside the traces that was tested. So dynamic analysis can only
prove the existence of something not the non-existence

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What does it mean to interpret a piece of code?

A

To interpret a piece of code means to examine the code and execute its opera-
tions step by step according to the language’s semantics. An interpreter is called
an interpreter for a reason: It reads code (most likely bytecode), interprets it
and carries out the corresponding computations. This is exactly a practical im-
plementation of Small Step Semantics. While not all languages are interpreted
at runtime, interpretation provides a direct way of executing code by following
its semantics step by step.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the advantage of Galoi connections?

A

The Galois connection maps concrete values to abstract values and back. It
guarantees the abstraction is sound and well-defined, allowing us to reason about
concrete behavior based on abstract analysis results.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is the drawback of a context sensitive
analysis?

A

The main drawback of context-sensitive analysis is cost. The analyzer must
maintain separate abstract states for different call chains, which can lead to a
combinatorial explosion in both time and memory usage. For recursive methods,
the number of contexts may be unbounded or dependent on input, making
the analysis potentially non-terminating. Thus, context sensitivity increases
precision, but at the price of much higher computational overhead.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

When is random sampling of inputs better than testing small values and visa versa?

A

Testing small values can work because of the small-scope hypothesis, which says
that most bugs in a program can be found with a small sample of inputs. Many
programs have branching conditions based on integer values where the absolute
number is small. It also has the advantage that if a bug is found, it finds the
smallest input that creates that error.
Random sampling is good when the input space is large and complex, so
random sampling gives a chance to hit interesting traces. Random sampling
is also good at finding unexpected bugs because random inputs might explore
unusual or unexpected paths.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

When is syntactic analysis a good tool?

A

As mentioned previously, syntactic analysis focuses on structural patterns and
syntax. This makes it a great tool for pattern matching, early-stage bug detec-
tion, and a fast analyzer.
Syntactic analysis is great for identifying patterns and stylistic problems in code.
This is similar to linters that catch common mistakes, like naming conventions.
This leads to the types of errors it catches: mismatched parentheses, missing
semicolons, invalid operators, undeclared variables, etc. All of these checks are completed before runtime. This makes syntactic analysis fast since there is no
need to run the program and write to the stack.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Which cases are the Sign abstraction not able
to catch?

A

The sign abstraction cannot capture exact values or small ranges. It loses detail
about which concrete values cause an issue. Although it is sound, it may produce
false positives because many values collapse into the same abstract category.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Why are method calls complicated?

A

Method calls are complicated to analyze because they break local reasoning.
When a method is called, the analyzer must understand behaviour in a different
environment with new variable bindings and a different call stack. Multiple call
sites may invoke the same method with different inputs, and in object-oriented
languages such as Java, dynamic dispatch makes the actual callee dependent
on the runtime type. Recursion adds further complexity, as the depth of calls
may depend on input and can be unbounded. All of this makes it difficult to
analyze a method call statically without additional techniques such as inlining
or call-graph construction.

17
Q

Why do we need a widening operator?

A

In certain situations, a fixpoint cannot be reached: A program may contain a
loop or recursive process that produces an infinite ascending chain of changes to
the abstract state. A widening operator, ∇, forces convergence by computing a
fixpoint, ensuring that the analysis eventually terminates. While this fixpoint
may be less precise, since it may be over-approximated, it guarantees that the
static analysis will converge.

18
Q

Why should we use abstractions when analyzing code?

A

Abstractions let us analyze programs without having to explore every concrete
value or state by over-approximating sets of values. They reduce an infinite state
space to a finite one, making static analysis computable. Abstractions also let
us reason about many values at once, which improves performance while keeping
the analysis sound.

19
Q

Why would you use Concolic Execution?

A

Concolic execution, which is a combination of concrete and symbolic execution,
which is used to overcome some practical limitation of pure symbolic execution.
Pure symbolic analysis suffers from path explosion, complex SMT constraints,
and difficulties handling operations that cannot be modelled symbolically such
as hashing, system calls or external input dependencies.
Concolic executions offers following benefits:
* Avoid infeasible paths, concrete execution prevents exploration of sym-
bolic branches that would not show up in practice.
* Reduces constraint complexity, concrete values replace parts of expressions
that would have been too hard for the SMT solver to understand (e.g.,
hash functions, array indexes).
* Systematic path exploration, negating symbolic branch conditions allows
the tester to generate new inputs that explore alternative execution paths.
* Fewer test cases are needed, paths that are logically equivalent share the
same symbolic state, reducing redundancy.
Therefore, concolic execution provides a balance between scalability and
precision, which makes it pretty effective for automated test generation, bug
detection, and exploring program behaviour.

20
Q

Why would you use context sensitive analysis?

A

Context-sensitive analysis distinguishes different calls to the same method, al-
lowing the analysis to preserve information about which caller triggered which
behaviour. This improves precision because method behaviour can vary depend-
ing on the calling context. It is particularly useful in languages with polymor-
phism and dynamic dispatch, where the actual target method may depend on
the object type and call site.

21
Q

Is analysing bytecode a semantic analysis ?

A

It depends on how you analyse the bytecode. If you only count or pattern-match
instructions in the bytecode, then it’s a syntactic analysis as you only look at the
structure. If you instead look at and reason about the language’s operational semantics
(state space, single-set transition relations, and initial states) then the analysis would be
semantic.

22
Q

What are the limitations of a syntactic analysis?

A

Syntactic analysis only sees structure, but not semantics and the meaning of code.
Hence it can conclude that a program is structured correctly, but it cannot detect issues
such as the trace of an if-statement.

23
Q

What does it mean for two variables to be dependent ?

A

Two variables are dependent if changing the value or condition of one affects or
constrains the possible values or paths of the other. In program analysis, this means
certain execution paths or variable states can only occur together, as the control flow
depends on their relationships.

24
Q

What is the primary problem when doing
unbounded static analysis?

A

Non-termination. If the abstract domain has infinite height, the analysis never reaches a
fixed point. For example, in interval abstraction, i can grow infinitely ([0,0], [0,1], [0,2],
…). To avoid this, analyses must ensure a fixed point is reached, otherwise, they run
forever.