bytecode verification in a single forward-pass

ECMA for CLR claims that "It shall be possible, with a single forward-pass through the CIL instruction stream for any method, to infer the exact state of the evaluation stack at every instruction"

In other words, itshall be possible to perform bytecode verification in a single forward-pass.

To make that possible , ECMA requires that

"if that single-pass analysis arrives at an

instruction, call it location X,
that immediately follows an unconditional

branch, and where X is not the
target of an earlier branch instruction, then

the state of the evaluation
stack at X shall be empty."

On the other

hand, ECMA claims that it "simulates all control flow paths".

I don't get

how this simulation of all paths is done in a "single

forward-pass".

Suppose that I have a backward branch instruction Y,

targeting an (earlier)
instruction Z. The stack state at Y has to be

"merged" with the stack state existing at Z, right?

But then, I have

to propagate again the (new) stack state at Z to its
successors. But here,

it is a contradiction with the "single forward-pass"
bytecode

verification.

My guess is the following:

  • that "shall be possible" doesn't mean that the bytecode verification is performed in a single forward pass on the original bytecode stream. The code array has first to be rearranged in order to make possible the verification in a single pass. That probably happens during a JIT compilation.

  • If it's not rearranged, the bytecode verification is similar to the one of JVM, i.e., a dataflow analysis with a fix point interation.
Please someone correct me if I'm wrong.

Thanks!

[1825 byte] By [catalinione] at [2007-12-27]
# 1
I'm not able to follow your train of thought here. The way I read it: if location X is not the target of any previously parsed branch instruction, it must be the target of a backward branch instruction later in the instruction stream. To discover the stack state of location X, it would thus be necessary to parse ahead to the branch instruction and discover the stack state at the location of the branch. That violates the "single forward pass" requirement. To meet the requirement, it is thus necessary to assume that the stack is empty at location X. And by inference, it needs to be empty at the branch instruction which is easily verified.

With this requirement in place, there won't be any need to rearrange the byte codes and the byte code verifier can be quite fast with minimal internal state requirements.

nobugz at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 2

My concern is not related with that constraint.

Consider the following example:

X1 and X2 are direct subclasses of X3

and assume that in the single forward-pass, the verification infers the following stack states:

1: [X1]
2: [int32]
3: []
4: [X2]

Assume that the instruction 4 is a conditional branch instruction, pointing (back) to the instruction 1. How does the verifier proceed? Are the two stack states [X1] and [X2] merged to [X3]? If yes, then the stack state at 1 becomes [X3], but then it shall be propagated to its successor 2 - and this is not single forward pass.

catalinione at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 3
That is specifically disallowed by the specification. If location 4 branches back to location 1, the stack needs to be empty at both locations.
nobugz at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 4
my example is accepted by the verification.

The constraint you're referring to requires that the instruction 1 shall immediately preceed an "unconditional branch instruction". But, in my example, there is no unconditional branch instruction at 0.

catalinione at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 5
Okay, I missed the "conditional". Can you think of any C# code that would produce such a stack state?
nobugz at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 6
I know what you mean, but that's not the point of the verification.

My program is a verifiable CIL program though it does not correspond to any program in a .NET compliant language

catalinione at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 7

Hello All.

catalinione:

Leaving aside the discussion of the stack state, are you trying to say that a non-.NET compliant language should not be CIL verifiable?

MarkBenningfield at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...
# 8

No no. I think the point is that you can represent more in IL than you can in higher-level languages. The IL verifier itself doesn't care what languages was used to produce the IL.

When a branch occurs, the state of the stacks at the point of the branch target must be merged. If either the stack depth (as in this case) are not equal or the types on the stack are not compatible, the code is unverifiable. Here is an example I whipped up pretty quick in IL:

.method public static void Test(int32)
{
Start:
ldarg.0
dup
brzero End
ldc.i4.1
sub
starg 0
ldc.i4.0
br Start
End:
pop
ret
}

In a nutshell, this is pushing the number of arguments onto the stack as the parameter specifies. So the stack depth could be variable at the Start label. I think this more-or-less represents the idea above. This code is indeed unverifiable. When PEVerify is run on the code it reports the following:

[IL]: Error: [C:\public\verif.exe : <Module>::Test][offset 0x0000000E] Stack depth differs depending on path.

Brian Stern [MSFT]

BrianStern-MSFT at 2007-9-4 > top of Msdn Tech,.NET Development,Common Language Runtime...

.NET Development

Site Classified