MSIL compiler

I need to build a compiler that generates MSIL code. Can anyone help me? Is there a sample?

Thanks a lot!!!

[112 byte] By [PhoenixStudent] at [2008-2-22]
# 1
We do indeed provide a sample -- the Grammar-Based Analyzer -- which is a parser and MSIL code generator for a simple language.
AndyAyers-MSFT at 2007-10-2 > top of Msdn Tech,Visual Studio,Phoenix...
# 2
OK, but this sample doesn't use Phoenix's MSIL generator. It writes the MSIL code directly, instruction by instruction. I need a sample that uses the MSIL writer to translate the IR to MSIL: something like the CHAD compiler, but one that generates MSIL code.
PhoenixStudent at 2007-10-2
# 3

Unfortunately, we don't have any samples that do this exactly. And while we provide most of the components you would need to create an MSIL compiler, there are some challenges in integrating these parts. I'll sketch an outline here.

First, you need to decide whether your compiler will follow the separate compilation model like C/C++ or the whole-assembly model like C#. Since I do not know anything about your source language, I can't give you any guidance there. Second, you need to decide how the types in your language map onto MSIL types, and how much freedom users of your language would have to access the full MSIL type system (e.g., importing types from other assemblies). Once you know these things, you can use a combination of the metadata, PE writing, and object writing capabilities of Phoenix to express your language in MSIL. The IR part is probably relatively straightforward; getting the type data correct and properly expressed in metadata is the tricky part.
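
To make the type-mapping decision concrete, here is a minimal sketch in plain C# (it does not use Phoenix at all). The source-language type names ("Int", "Double", "Bool", "String") and the TypeMapSketch class are hypothetical, invented only for illustration; a real compiler would derive this table from its own language definition.

```csharp
using System;
using System.Collections.Generic;

static class TypeMapSketch
{
    // Hypothetical source-language type names mapped to the CLR types
    // that would represent them in generated MSIL.
    static readonly Dictionary<string, Type> Map = new Dictionary<string, Type>
    {
        { "Int",    typeof(int) },
        { "Double", typeof(double) },
        { "Bool",   typeof(bool) },
        { "String", typeof(string) },
    };

    // Returns the full CLR type name for a source-language type name,
    // or throws if the language has no mapping for it.
    public static string ClrNameFor(string sourceTypeName)
    {
        Type clrType;
        if (Map.TryGetValue(sourceTypeName, out clrType))
        {
            return clrType.FullName;
        }
        throw new ArgumentException("No CLR mapping for type: " + sourceTypeName);
    }

    static void Main()
    {
        Console.WriteLine(ClrNameFor("Int"));    // System.Int32
        Console.WriteLine(ClrNameFor("String")); // System.String
    }
}
```

A table like this only covers primitive types. User-defined and imported types would still each need their own metadata entries, which is where most of the work lies.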

AndyAyers-MSFT at 2007-10-2
# 4
My idea is to build a compiler for a functional language like Haskell. The main objective at first is to map functional features onto MSIL code. After that, I plan to expand the scope to allow the use of the .NET libraries.
I really need guidance to do this. All help is welcome!

Thanks!!!

PhoenixStudent at 2007-10-2
# 5
How can I translate a simple program into MSIL code using the Phoenix API? Is it possible to do this with the PE writer and the MSIL runtime and architecture? I need a simple sample, something that generates the code for a simple program like:

Code Snippet


using System;

class Program
{
    static void Main()
    {
        Hello hello = new Hello();
        Console.WriteLine(hello.sayHello());
    }

    class Hello
    {
        public string sayHello()
        {
            return "Hello World!!";
        }
    }
}


I really need guidance. Can somebody please help me?

Thanks

NewPhoenixStudent at 2007-10-2
# 6

We don't have any samples like this today, unfortunately. Your best bet is to start out with some relatively empty assembly as an input, and treat this as a pe read/write scenario. There is no good way to create a PE file or metadata from scratch.

I will see if I can put up a code sketch for you in the next few days.

AndyAyers-MSFT at 2007-10-2
# 7
Ok,

I will be waiting for the code sketch.

Thanks a lot!!!

NewPhoenixStudent at 2007-10-2
# 8
Hello Andy Ayers,

I really need your help. I'm waiting for the code sketch. When can you post it?

thanks.

NewPhoenixStudent at 2007-10-2
# 9

Sorry for the delay -- I've been out of the office for a few days, and it will take me a few more to get around to coding this up.

AndyAyers-MSFT at 2007-10-2
# 10

I've finally got something. It's a little on the long side but I will try posting it here and we will see how it goes. Since it is fairly long I am going to break it into parts.

You first need to create a trivial assembly to use as a model. I used ilasm for this, and the assembly file was simply:

Code Snippet
.assembly foo {}

Save this to foo.il, and then run ilasm to get foo.dll:

Code Snippet
ilasm /dll foo.il

The code illustrates how to create a PE file with custom content. In this case there's one class with one simple method. The Initialize and LookupAssembly methods are similar to the ones found in the mtrace sample. The main function contains a number of subtle points and hard-to-diagnose problems. I'm just going to post the code as is for now, and as questions come up I will describe what is going on in some of the trickier spots.

Also note that a real compiler would create the classes and IR by walking the parsed program input. I've left that part for you to fill in. Presumably you could write your own parser or try to adapt lex/yacc, as is done in the Grammar-Based Analyzer sample.
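
As a rough illustration of what "walking the parsed program input" means, the sketch below traverses a tiny expression tree and emits MSIL mnemonics as text. It is plain C# and does not use Phoenix; the Expr, Const, Add and EmitSketch types are hypothetical names made up for this example. A Phoenix-based compiler would presumably build IR instructions at the same points in the walk instead of appending strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical AST for a toy language with integer constants and addition.
abstract class Expr { }

sealed class Const : Expr
{
    public readonly int Value;
    public Const(int value) { Value = value; }
}

sealed class Add : Expr
{
    public readonly Expr Left;
    public readonly Expr Right;
    public Add(Expr left, Expr right) { Left = left; Right = right; }
}

static class EmitSketch
{
    // Post-order walk: emit both operands, then the operator. This matches
    // the stack-based evaluation model of MSIL.
    public static void Emit(Expr expr, List<string> code)
    {
        Const constant = expr as Const;
        if (constant != null)
        {
            code.Add("ldc.i4 " + constant.Value);
            return;
        }

        Add add = expr as Add;
        if (add != null)
        {
            Emit(add.Left, code);
            Emit(add.Right, code);
            code.Add("add");
            return;
        }

        throw new NotSupportedException("Unknown node: " + expr.GetType().Name);
    }

    static void Main()
    {
        // (1 + 2) + 3
        Expr tree = new Add(new Add(new Const(1), new Const(2)), new Const(3));
        List<string> code = new List<string>();
        Emit(tree, code);
        foreach (string line in code)
        {
            Console.WriteLine(line);
        }
    }
}
```

For the tree (1 + 2) + 3 this prints ldc.i4 1, ldc.i4 2, add, ldc.i4 3, add, one per line.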

Code Snippet

// Part 1 of 3
using System;
using System.Collections.Generic;
using System.Text;

namespace MsilCompiler
{
    class Compiler
    {
        static void Main(string[] args)
        {
            // Initialize Phoenix.
            Initialize(args);

            // Open the model assembly as a PE module unit and direct
            // the output to a new image.
            Phx.PEModuleUnit peModuleUnit = Phx.PEModuleUnit.Open("foo.dll");
            peModuleUnit.OutputImagePath = "foo2.exe";
            peModuleUnit.LoadEncodedIRUnitList();
            peModuleUnit.PreAssignTypeTokens();

            // Create a reference to System.Object.
            // Note: LookupAssembly returns null if foo.dll has no
            // reference to mscorlib.
            Phx.Symbols.AssemblySymbol mscorlibSymbol =
                LookupAssembly(peModuleUnit, "mscorlib");
            Phx.Name objTypeName =
                Phx.Name.New(lifetime, "System.Object");
            Phx.Symbols.MsilTypeSymbol objTypeSymbol =
                Phx.Symbols.MsilTypeSymbol.New(peModuleUnit.SymbolTable, objTypeName, 0);
            Phx.Types.AggregateType objType =
                Phx.Types.AggregateType.NewDynamicSize(typeTable, objTypeSymbol);
            objType.IsPrimary = true;
            objTypeSymbol.Visibility = Phx.Symbols.Visibility.ClrTokenReference;

            // Now, attach the class type to the mscorlib assembly reference.
            mscorlibSymbol.InsertInLexicalScope(objTypeSymbol, objTypeName);

            // Create a class that extends System.Object.
            Phx.Name classTypeName =
                Phx.Name.New(Phx.GlobalData.GlobalLifetime, "SampleClass");
            Phx.Symbols.MsilTypeSymbol classTypeSym =
                Phx.Symbols.MsilTypeSymbol.New(peModuleUnit.SymbolTable, classTypeName, 0);
            Phx.Types.AggregateType classType =
                Phx.Types.AggregateType.NewDynamicSize(peModuleUnit.TypeTable,
                    classTypeSym);
            classType.Access = Phx.Symbols.Access.Public;
            classType.IsDefinition = true;
            Phx.Types.BaseTypeLink baseLink =
                Phx.Types.BaseTypeLink.New(objType, false);
            classType.AppendBaseTypeLink(baseLink);
            classType.IsPrimary = true;
            classType.IsSelfDescribing = true;

            // Now, insert the class type into the assembly scope.
            Phx.AssemblyUnit assemblyUnit = peModuleUnit.AssemblyUnit;
            Phx.Symbols.AssemblySymbol assemblySymbol = assemblyUnit.AssemblySymbol;
            assemblySymbol.InsertInLexicalScope(classTypeSym, classTypeName);

            // Create a method void Main(string[] args).
            Phx.Types.Type arrayOfStringType =
                Phx.Types.ManagedArrayType.New(typeTable, typeTable.ObjectPointerSystemStringType, null);
            Phx.Types.FunctionType mainMethodType =
                typeTable.GetFunctionType(Phx.Types.CallingConventionKind.ClrCall,
                    typeTable.VoidType, arrayOfStringType, null, null, null);
            Phx.Name mainMethodName =
                Phx.Name.New(Phx.GlobalData.GlobalLifetime, "Main");
            Phx.Symbols.FunctionSymbol mainMethodSymbol =
                Phx.Symbols.FunctionSymbol.New(peModuleUnit.SymbolTable, 0, mainMethodName,
                    mainMethodType, Phx.Symbols.Visibility.GlobalDefinition);
            Phx.Name argsName = Phx.Name.New(lifetime, "args");
            Phx.Symbols.FunctionArgument functionArgument =
                Phx.Symbols.FunctionArgument.New(mainMethodSymbol, argsName, 0);
            mainMethodSymbol.AppendFunctionArgument(functionArgument);

            // Add it as a method of the class type.
            classType.AddMethod(mainMethodSymbol);
            classTypeSym.InsertInLexicalScope(mainMethodSymbol, mainMethodName);

            // Allocate it in the .text section.
            Phx.Name textName =
                Phx.Name.New(lifetime, ".text");
            Phx.Symbols.SectionSymbol textSymbol =
                peModuleUnit.SymbolTable.LookupByName(textName).AsSectionSymbol;
            mainMethodSymbol.AllocationBaseSectionSymbol = textSymbol;
            mainMethodSymbol.Access = Phx.Symbols.Access.Public;
            mainMethodSymbol.MethodSpecifier = Phx.Symbols.MethodSpecifier.Static;

// End of Part 1 of 3

AndyAyers-MSFT at 2007-10-2
# 11

Here's the second part:

Code Snippet

// Part 2 of 3

            // Create a function unit.
            Phx.FunctionUnit functionUnit =
                Phx.FunctionUnit.New(mainMethodSymbol, Phx.CodeGenerationMode.IJW, typeTable,
                    peModuleUnit.MsilRuntime.Architecture, peModuleUnit.MsilRuntime, peModuleUnit, 1);
            mainMethodSymbol.FunctionUnit = functionUnit;
            peModuleUnit.AppendChildUnit(functionUnit);
            Phx.Symbols.Table functionSymbolTable =
                Phx.Symbols.Table.New(functionUnit, 64, false);
            functionUnit.AllocateLifetime();
            Phx.Debug.Info.New(lifetime, functionUnit);
            Phx.Name compilandName =
                Phx.Name.New(lifetime, "");
            uint debugTag =
                functionUnit.DebugInfo.CreateTag(compilandName, 1, 1, 0);
            functionUnit.CurrentDebugTag = debugTag;

            // Create a parameter symbol for args.
            Phx.Symbols.LocalVariableSymbol argsSymbol =
                Phx.Symbols.LocalVariableSymbol.New(functionSymbolTable, 0, argsName,
                    typeTable.GetObjectPointerType(arrayOfStringType),
                    Phx.Symbols.StorageClass.Parameter);

            // Create a local x of type int.
            Phx.Name xName = Phx.Name.New(lifetime, "x");
            Phx.Symbols.LocalVariableSymbol xSymbol =
                Phx.Symbols.LocalVariableSymbol.New(functionSymbolTable, 0, xName,
                    typeTable.Int32Type, Phx.Symbols.StorageClass.Auto);

            // Create some basic IR: enter, x = x + 1, return, exit.
            Phx.IR.Instruction startInstruction = functionUnit.FirstInstruction;
            startInstruction.DebugTag = debugTag;
            Phx.IR.Instruction endInstruction = functionUnit.LastInstruction;
            endInstruction.DebugTag = debugTag;
            Phx.IR.LabelInstruction enterInstruction =
                Phx.IR.LabelInstruction.New(functionUnit, Phx.Common.Opcode.EnterFunction, mainMethodSymbol);
            enterInstruction.AppendDestination(
                Phx.IR.VariableOperand.New(functionUnit, arrayOfStringType, argsSymbol));
            startInstruction.InsertAfter(enterInstruction);
            startInstruction.AppendLabelSource(Phx.IR.LabelOperandKind.Technical,
                Phx.IR.LabelOperand.New(functionUnit, enterInstruction));
            Phx.IR.LabelInstruction exitInstruction =
                Phx.IR.LabelInstruction.New(functionUnit, Phx.Common.Opcode.ExitFunction);
            enterInstruction.InsertAfter(exitInstruction);
            Phx.IR.Instruction addInstruction =
                Phx.IR.Instruction.NewBinary(functionUnit, Phx.Common.Opcode.Add,
                    Phx.IR.VariableOperand.New(functionUnit, typeTable.Int32Type, xSymbol),
                    Phx.IR.VariableOperand.New(functionUnit, typeTable.Int32Type, xSymbol),
                    Phx.IR.ImmediateOperand.New(functionUnit, typeTable.Int32Type, 1));
            enterInstruction.InsertAfter(addInstruction);
            Phx.IR.BranchInstruction returnInstruction =
                Phx.IR.BranchInstruction.NewReturn(functionUnit, Phx.Common.Opcode.Return, exitInstruction);
            exitInstruction.InsertBefore(returnInstruction);
            functionUnit.FinishCreation();

            // Now encode this function.
            Phx.Phases.PhaseConfiguration phaseConfiguration =
                Phx.Phases.PhaseConfiguration.New(lifetime, "phases");
            Phx.Phases.PhaseList phaseList = phaseConfiguration.PhaseList;
            phaseList.AppendPhase(Phx.MirLowerPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.CanonicalizePhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.LowerPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.SwitchLowerPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.StackAllocatePhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.FrameGenerationPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Graphs.BlockLayoutPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.FlowOptimizer.FlowOptimizationsPhase.New(phaseConfiguration));

            // Add runtime-specific phases.
            functionUnit.Runtime.AddPhases(phaseConfiguration);

            // Add plug-in phases.
            Phx.GlobalData.BuildPlugInPhases(phaseConfiguration);

            // Run the phases.
            phaseConfiguration.PhaseList.DoPhaseList(functionUnit);

            // Clean up after encoding.
            peModuleUnit.CloseUnit(functionUnit);

            // Mark Main as the entry point of the executable.
            peModuleUnit.StartPointFunctionSymbol = mainMethodSymbol;

            // Write out the resulting binary. Since we're writing an executable,
            // turn off the DLL bit in the PE header. The XOR toggles the bit; it
            // clears it here because the input image, foo.dll, has it set.
            peModuleUnit.Characteristics ^= 0x2000;
            peModuleUnit.Close();
        }

// End of part 2 of 3
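
A side note on the `Characteristics ^= 0x2000` line in Part 2: 0x2000 is the IMAGE_FILE_DLL flag from the PE/COFF format, and XOR toggles it rather than clearing it. That works here only because the input image is known to be a DLL. The stand-alone sketch below shows the difference using plain integers; PeFlagsSketch and its helper methods are hypothetical names, and the sketch makes no assumption about the actual type of the Phoenix Characteristics property.

```csharp
using System;

static class PeFlagsSketch
{
    // IMAGE_FILE_DLL from the PE/COFF specification.
    public const ushort ImageFileDll = 0x2000;

    // Toggle: clears the bit if it is set, sets it if it is clear.
    public static ushort ToggleDllBit(ushort characteristics)
    {
        return (ushort)(characteristics ^ ImageFileDll);
    }

    // Clear: always ends with the bit off, whatever the input was.
    public static ushort ClearDllBit(ushort characteristics)
    {
        return (ushort)(characteristics & ~ImageFileDll);
    }

    static void Main()
    {
        ushort dllImage = 0x2102; // executable image | 32-bit machine | DLL
        ushort exeImage = 0x0102; // same flags without the DLL bit

        Console.WriteLine(ToggleDllBit(dllImage).ToString("X4")); // 0102
        Console.WriteLine(ClearDllBit(dllImage).ToString("X4"));  // 0102
        Console.WriteLine(ToggleDllBit(exeImage).ToString("X4")); // 2102
        Console.WriteLine(ClearDllBit(exeImage).ToString("X4"));  // 0102
    }
}
```

If the model assembly were ever built as an .exe instead of a .dll, the toggle form would turn the DLL bit on, so an explicit clear is the more defensive choice when the input is not fixed.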

AndyAyers-MSFT at 2007-10-2
# 12

And the last part....

Code Snippet

// Part 3 of 3

        static void Initialize(string[] args)
        {
            // Create and register architecture and runtime.
            Phx.Targets.Architectures.Architecture msilArchitecture =
                Phx.Targets.Architectures.Msil.Architecture.New();
            Phx.Targets.Runtimes.Runtime msilRuntime =
                Phx.Targets.Runtimes.Vccrt.Win.Msil.Runtime.New(msilArchitecture);
            Phx.GlobalData.RegisterTargetArchitecture(msilArchitecture);
            Phx.GlobalData.RegisterTargetRuntime(msilRuntime);

            Phx.Targets.Architectures.Architecture x86Architecture =
                Phx.Targets.Architectures.X86.Architecture.New();
            Phx.Targets.Runtimes.Runtime x86Runtime =
                Phx.Targets.Runtimes.Vccrt.Win32.X86.Runtime.New(x86Architecture);
            Phx.GlobalData.RegisterTargetArchitecture(x86Architecture);
            Phx.GlobalData.RegisterTargetRuntime(x86Runtime);

            // Initialize the infrastructure.
            Phx.Initialize.BeginInitialization();

            // Initialize controls set by command line, register plugins,
            // etc. Read control values first from the PHX environment variable,
            // then the command line, then from the _PHX_ environment variable.
            Phx.Initialize.EndInitialization("PHX|*|_PHX_|", args);

            // Enable some controls. We want to see the names of types, types
            // in our IR dumps, linenumbers in our IR dumps, and linenumbers
            // in our object file.
            Phx.Controls.Parser.ParseArgumentString(null,
                "-dumptypesym -dump:types -dump:linenumbers -lineleveldebug");

            // Cache pointer to the main type table.
            typeTable = Phx.GlobalData.TypeTable;

            // Create a lifetime.
            lifetime =
                Phx.Lifetime.New(Phx.LifetimeKind.Module, null);
        }

        static Phx.Symbols.AssemblySymbol LookupAssembly(Phx.ModuleUnit moduleUnit, string assemblyNameString)
        {
            Phx.Name assemblyName =
                Phx.Name.New(lifetime, assemblyNameString);
            Phx.Symbols.Table symbolTable = moduleUnit.SymbolTable;
            Phx.Symbols.NameMap nameMap = symbolTable.NameMap;

            // If the module was not compiled with debug, the module sym
            // table may not have a name map.
            if (nameMap == null)
            {
                nameMap = Phx.Symbols.NameMap.New(symbolTable, 64);
                symbolTable.AddMap(nameMap);
            }

            Phx.Symbols.Symbol sym = nameMap.Lookup(assemblyName);

            // Note that there might be a number of symbols with
            // identical names, so search through until we have an
            // assembly reference.
            while (sym != null)
            {
                if (sym.IsAssemblySymbol)
                {
                    return sym.AsAssemblySymbol;
                }
                sym = nameMap.LookupNext(sym);
            }

            return null;
        }

        static Phx.Types.Table typeTable;
        static Phx.Lifetime lifetime;
    }
}

AndyAyers-MSFT at 2007-10-2
# 13

Hi,

Firstly, thanks a lot! This example has been very helpful.

Now, the problems:

When I compile the code, the foo2.exe file is generated, but the following message is shown:

Code Snippet

Phoenix Assertion Failure: e:\sdkjul2007wix\src\phx\encoded-unit.cpp, Line 379
this->HasSeparatedLifetime
in (PEModule) foo.dll
in (Program) <unnamed unit>

I used peverify and ildasm to verify foo.exe and everything seemed normal, but it doesn't work. The only thing that I changed in your example was the foo.il file. I added ".assembly extern mscorlib {auto}", because without it the program didn't find the mscorlib reference.

NewPhoenixStudent at 2007-10-2
# 14

When you say it "doesn't work" are you referring to the assert? Or is there some other problem?

I should have mentioned that the code will produce the assert that you see. It comes from the small bit of unmanaged code that exists in the binary. You can safely ignore the assert. The program still completes properly.

I haven't tried this, but if you build the IL assembly with /NOCORSTUB you may be able to disable generation of this native code in the initial assembly, and then the assert may go away.

AndyAyers-MSFT at 2007-10-2
