MSIL compiler
Thanks a lot!!!
Unfortunately, we don't have any samples that do this exactly. And while we provide most of the components you would need to create an MSIL compiler, there are some challenges in integrating these parts. I'll sketch an outline here.
First, you need to decide whether your compiler will follow the separate compilation model like C/C++ or the whole-assembly model like C#. Since I do not know anything about your source language, I can't give you any guidance there. Second, you need to decide how the types in your language map onto MSIL types, and how much freedom users of your language would have to access the full MSIL type system (e.g., importing types from other assemblies). Once you know these things, you can use a combination of the metadata, PE writing, and object writing capabilities of Phoenix to express your language in MSIL. The IR part is probably relatively straightforward; getting the type data correct and properly expressed in metadata is the tricky part.
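To make the type-mapping decision concrete, here is a minimal sketch in plain C# that does not use Phoenix at all. The source-language type names ("int", "real", "text", "bool") and the TypeMapSketch class are hypothetical; your language will have its own names and rules.

```csharp
using System;
using System.Collections.Generic;

static class TypeMapSketch
{
    // Hypothetical source-language type names mapped to CLR types.
    static readonly Dictionary<string, Type> Map = new Dictionary<string, Type>
    {
        { "int",  typeof(int) },
        { "real", typeof(double) },
        { "text", typeof(string) },
        { "bool", typeof(bool) },
    };

    // Returns the full CLR type name for a source-language type name,
    // or throws if the language has no mapping for it.
    public static string ToClrName(string sourceType)
    {
        Type clrType;
        if (!Map.TryGetValue(sourceType, out clrType))
        {
            throw new ArgumentException("No MSIL mapping for type: " + sourceType);
        }
        return clrType.FullName;
    }

    static void Main()
    {
        Console.WriteLine(ToClrName("int"));   // prints System.Int32
        Console.WriteLine(ToClrName("text"));  // prints System.String
    }
}
```

A table like this only covers built-in types. User-defined classes and types imported from other assemblies still need their own metadata entries, which is the harder part mentioned above.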
Thanks!!!
using System;

class Program
{
    static void Main()
    {
        Hello hello = new Hello();
        Console.WriteLine(hello.sayHello());
    }

    class Hello
    {
        public string sayHello()
        {
            return "Hello World!!";
        }
    }
}
Thanks
We don't have any samples like this today, unfortunately. Your best bet is to start out with a relatively empty assembly as an input, and treat this as a PE read/write scenario. There is no good way to create a PE file or metadata from scratch.
I will see if I can put up a code sketch for you in the next few days.
thanks.
Sorry for the delay -- I've been out of the office for a few days, and it will take me a few more to get around to coding this up.
I've finally got something. It's a little on the long side but I will try posting it here and we will see how it goes. Since it is fairly long I am going to break it into parts.
You first need to create a trivial assembly to use as a model. I used ilasm for this, and the assembly file was simply:
Save this to foo.il, and then run ilasm to get foo.dll:
The code illustrates how to create a PE file with custom content. In this case there's one class with one simple method. The Initialize and LookupAssembly methods are similar to the ones found in the mtrace sample. The Main function contains a number of subtle points and hard-to-diagnose problems. I'm just going to post the code as is for now, and then as questions come up I will describe what is going on in some of the trickier spots.
Also note that a real compiler would create the classes and IR by walking the parsed program input. I've left that part for you to fill in. Presumably you could write your own parser or try to adapt lex/yacc, as is done in the grammar-based analyzer sample.
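As a rough illustration of "walking the parsed program input", the sketch below walks a tiny hand-built parse tree and lists the definitions a back end would have to create, in order. ClassNode, MethodNode and WalkSketch are hypothetical names, not part of Phoenix. In a real compiler each recorded step would be replaced by the kind of symbol, type, and IR construction shown in the code that follows.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parse-tree nodes; a real front end would define its own.
class MethodNode
{
    public string Name;
    public MethodNode(string name) { Name = name; }
}

class ClassNode
{
    public string Name;
    public List<MethodNode> Methods = new List<MethodNode>();
    public ClassNode(string name) { Name = name; }
}

static class WalkSketch
{
    // Walk the tree and record, in order, what the back end must create.
    public static List<string> Plan(IEnumerable<ClassNode> program)
    {
        var steps = new List<string>();
        foreach (ClassNode c in program)
        {
            steps.Add("define class " + c.Name);
            foreach (MethodNode m in c.Methods)
            {
                steps.Add("define method " + c.Name + "." + m.Name);
                steps.Add("build IR for " + c.Name + "." + m.Name);
            }
        }
        return steps;
    }

    static void Main()
    {
        var sample = new ClassNode("SampleClass");
        sample.Methods.Add(new MethodNode("Main"));
        foreach (string step in Plan(new[] { sample }))
        {
            Console.WriteLine(step);
        }
    }
}
```

Defining each class and its method symbols before building any method's IR mirrors the order used in the sample, where the type and method symbol exist before the function unit is created.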
// Part 1 of 3
using System;
using System.Collections.Generic;
using System.Text;

namespace MsilCompiler
{
    class Compiler
    {
        static void Main(string[] args)
        {
            // Initialize Phoenix.
            Initialize(args);

            // Open the model assembly as a PE module unit and redirect
            // the output to a new file.
            Phx.PEModuleUnit peModuleUnit = Phx.PEModuleUnit.Open("foo.dll");
            peModuleUnit.OutputImagePath = "foo2.exe";
            peModuleUnit.LoadEncodedIRUnitList();
            peModuleUnit.PreAssignTypeTokens();

            // Create a reference to System.Object.
            Phx.Symbols.AssemblySymbol mscorlibSymbol =
                LookupAssembly(peModuleUnit, "mscorlib");
            Phx.Name objTypeName =
                Phx.Name.New(lifetime, "System.Object");
            Phx.Symbols.MsilTypeSymbol objTypeSymbol =
                Phx.Symbols.MsilTypeSymbol.New(peModuleUnit.SymbolTable, objTypeName, 0);
            Phx.Types.AggregateType objType =
                Phx.Types.AggregateType.NewDynamicSize(typeTable, objTypeSymbol);
            objType.IsPrimary = true;
            objTypeSymbol.Visibility = Phx.Symbols.Visibility.ClrTokenReference;

            // Now, attach the class type to the mscorlib assembly reference.
            mscorlibSymbol.InsertInLexicalScope(objTypeSymbol, objTypeName);

            // Create a class that extends System.Object.
            Phx.Name classTypeName =
                Phx.Name.New(Phx.GlobalData.GlobalLifetime, "SampleClass");
            Phx.Symbols.MsilTypeSymbol classTypeSym =
                Phx.Symbols.MsilTypeSymbol.New(peModuleUnit.SymbolTable, classTypeName, 0);
            Phx.Types.AggregateType classType =
                Phx.Types.AggregateType.NewDynamicSize(peModuleUnit.TypeTable,
                    classTypeSym);
            classType.Access = Phx.Symbols.Access.Public;
            classType.IsDefinition = true;
            Phx.Types.BaseTypeLink baseLink =
                Phx.Types.BaseTypeLink.New(objType, false);
            classType.AppendBaseTypeLink(baseLink);
            classType.IsPrimary = true;
            classType.IsSelfDescribing = true;

            // Now, insert the class type into the assembly scope.
            Phx.AssemblyUnit assemblyUnit = peModuleUnit.AssemblyUnit;
            Phx.Symbols.AssemblySymbol assemblySymbol = assemblyUnit.AssemblySymbol;
            assemblySymbol.InsertInLexicalScope(classTypeSym, classTypeName);

            // Create a method void Main(string[] args).
            Phx.Types.Type arrayOfStringType =
                Phx.Types.ManagedArrayType.New(typeTable, typeTable.ObjectPointerSystemStringType, null);
            Phx.Types.FunctionType mainMethodType =
                typeTable.GetFunctionType(Phx.Types.CallingConventionKind.ClrCall,
                    typeTable.VoidType, arrayOfStringType, null, null, null);
            Phx.Name mainMethodName =
                Phx.Name.New(Phx.GlobalData.GlobalLifetime, "Main");
            Phx.Symbols.FunctionSymbol mainMethodSymbol =
                Phx.Symbols.FunctionSymbol.New(peModuleUnit.SymbolTable, 0, mainMethodName,
                    mainMethodType, Phx.Symbols.Visibility.GlobalDefinition);
            Phx.Name argsName = Phx.Name.New(lifetime, "args");
            Phx.Symbols.FunctionArgument functionArgument = Phx.Symbols.FunctionArgument.New(mainMethodSymbol, argsName, 0);
            mainMethodSymbol.AppendFunctionArgument(functionArgument);

            // Add it as a method of the class type.
            classType.AddMethod(mainMethodSymbol);
            classTypeSym.InsertInLexicalScope(mainMethodSymbol, mainMethodName);

            // Allocate it in the .text section.
            Phx.Name textName =
                Phx.Name.New(lifetime, ".text");
            Phx.Symbols.SectionSymbol textSymbol =
                peModuleUnit.SymbolTable.LookupByName(textName).AsSectionSymbol;
            mainMethodSymbol.AllocationBaseSectionSymbol = textSymbol;
            mainMethodSymbol.Access = Phx.Symbols.Access.Public;
            mainMethodSymbol.MethodSpecifier = Phx.Symbols.MethodSpecifier.Static;
// End of Part 1 of 3
Here's the second part:
// Part 2 of 3
            // Create a function unit.
            Phx.FunctionUnit functionUnit =
                Phx.FunctionUnit.New(mainMethodSymbol, Phx.CodeGenerationMode.IJW, typeTable,
                    peModuleUnit.MsilRuntime.Architecture, peModuleUnit.MsilRuntime, peModuleUnit, 1);
            mainMethodSymbol.FunctionUnit = functionUnit;
            peModuleUnit.AppendChildUnit(functionUnit);
            Phx.Symbols.Table functionSymbolTable =
                Phx.Symbols.Table.New(functionUnit, 64, false);
            functionUnit.AllocateLifetime();
            Phx.Debug.Info.New(lifetime, functionUnit);
            Phx.Name compilandName =
                Phx.Name.New(lifetime, "");
            uint debugTag =
                functionUnit.DebugInfo.CreateTag(compilandName, 1, 1, 0);
            functionUnit.CurrentDebugTag = debugTag;

            // Create a parameter symbol for args.
            Phx.Symbols.LocalVariableSymbol argsSymbol =
                Phx.Symbols.LocalVariableSymbol.New(functionSymbolTable, 0, argsName,
                    typeTable.GetObjectPointerType(arrayOfStringType),
                    Phx.Symbols.StorageClass.Parameter);

            // Create a local x of type int.
            Phx.Name xName = Phx.Name.New(lifetime, "x");
            Phx.Symbols.LocalVariableSymbol xSymbol =
                Phx.Symbols.LocalVariableSymbol.New(functionSymbolTable, 0, xName,
                    typeTable.Int32Type, Phx.Symbols.StorageClass.Auto);

            // Create some basic IR: enter, x = x + 1, return, exit.
            Phx.IR.Instruction startInstruction = functionUnit.FirstInstruction;
            startInstruction.DebugTag = debugTag;
            Phx.IR.Instruction endInstruction = functionUnit.LastInstruction;
            endInstruction.DebugTag = debugTag;
            Phx.IR.LabelInstruction enterInstruction =
                Phx.IR.LabelInstruction.New(functionUnit, Phx.Common.Opcode.EnterFunction, mainMethodSymbol);
            enterInstruction.AppendDestination(Phx.IR.VariableOperand.New(functionUnit, arrayOfStringType, argsSymbol));
            startInstruction.InsertAfter(enterInstruction);
            startInstruction.AppendLabelSource(Phx.IR.LabelOperandKind.Technical, Phx.IR.LabelOperand.New(functionUnit, enterInstruction));
            Phx.IR.LabelInstruction exitInstruction =
                Phx.IR.LabelInstruction.New(functionUnit, Phx.Common.Opcode.ExitFunction);
            enterInstruction.InsertAfter(exitInstruction);
            Phx.IR.Instruction addInstruction =
                Phx.IR.Instruction.NewBinary(functionUnit, Phx.Common.Opcode.Add,
                    Phx.IR.VariableOperand.New(functionUnit, typeTable.Int32Type, xSymbol),
                    Phx.IR.VariableOperand.New(functionUnit, typeTable.Int32Type, xSymbol),
                    Phx.IR.ImmediateOperand.New(functionUnit, typeTable.Int32Type, 1));
            enterInstruction.InsertAfter(addInstruction);
            Phx.IR.BranchInstruction returnInstruction =
                Phx.IR.BranchInstruction.NewReturn(functionUnit, Phx.Common.Opcode.Return, exitInstruction);
            exitInstruction.InsertBefore(returnInstruction);
            functionUnit.FinishCreation();

            // Now encode this function.
            Phx.Phases.PhaseConfiguration phaseConfiguration =
                Phx.Phases.PhaseConfiguration.New(lifetime, "phases");
            Phx.Phases.PhaseList phaseList = phaseConfiguration.PhaseList;
            phaseList.AppendPhase(Phx.MirLowerPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.CanonicalizePhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.LowerPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.SwitchLowerPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.StackAllocatePhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Targets.Runtimes.FrameGenerationPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.Graphs.BlockLayoutPhase.New(phaseConfiguration));
            phaseList.AppendPhase(Phx.FlowOptimizer.FlowOptimizationsPhase.New(phaseConfiguration));

            // Add runtime-specific phases.
            functionUnit.Runtime.AddPhases(phaseConfiguration);

            // Add plug-in phases.
            Phx.GlobalData.BuildPlugInPhases(phaseConfiguration);

            // Run the phases.
            phaseConfiguration.PhaseList.DoPhaseList(functionUnit);

            // Clean up after encoding.
            peModuleUnit.CloseUnit(functionUnit);

            // Mark Main as the entry point of the executable.
            peModuleUnit.StartPointFunctionSymbol = mainMethodSymbol;

            // Write out the resulting binary. Since we're writing an executable,
            // turn off the DLL bit (0x2000) in the PE header. XOR toggles the bit,
            // so this clears it only because the input foo.dll has it set.
            peModuleUnit.Characteristics ^= 0x2000;
            peModuleUnit.Close();
        }
// End of part 2 of 3
And the last part....
// Part 3 of 3
        static void Initialize(string[] args)
        {
            // Create and register architecture and runtime.
            Phx.Targets.Architectures.Architecture msilArchitecture = Phx.Targets.Architectures.Msil.Architecture.New();
            Phx.Targets.Runtimes.Runtime msilRuntime = Phx.Targets.Runtimes.Vccrt.Win.Msil.Runtime.New(msilArchitecture);
            Phx.GlobalData.RegisterTargetArchitecture(msilArchitecture);
            Phx.GlobalData.RegisterTargetRuntime(msilRuntime);
            Phx.Targets.Architectures.Architecture x86Architecture = Phx.Targets.Architectures.X86.Architecture.New();
            Phx.Targets.Runtimes.Runtime x86Runtime = Phx.Targets.Runtimes.Vccrt.Win32.X86.Runtime.New(x86Architecture);
            Phx.GlobalData.RegisterTargetArchitecture(x86Architecture);
            Phx.GlobalData.RegisterTargetRuntime(x86Runtime);

            // Initialize the infrastructure.
            Phx.Initialize.BeginInitialization();

            // Initialize controls set by command line, register plugins,
            // etc. Read control values first from the PHX environment variable,
            // then the command line, then from the _PHX_ environment variable.
            Phx.Initialize.EndInitialization("PHX|*|_PHX_|", args);

            // Enable some controls. We want to see the names of types, types
            // in our IR dumps, line numbers in our IR dumps, and line numbers
            // in our object file.
            Phx.Controls.Parser.ParseArgumentString(null,
                "-dumptypesym -dump:types -dump:linenumbers -lineleveldebug");

            // Cache pointer to the main type table.
            typeTable = Phx.GlobalData.TypeTable;

            // Create a lifetime.
            lifetime =
                Phx.Lifetime.New(Phx.LifetimeKind.Module, null);
        }

        static Phx.Symbols.AssemblySymbol LookupAssembly(Phx.ModuleUnit moduleUnit, string assemblyNameString)
        {
            Phx.Name assemblyName =
                Phx.Name.New(lifetime, assemblyNameString);
            Phx.Symbols.Table symbolTable = moduleUnit.SymbolTable;
            Phx.Symbols.NameMap nameMap = symbolTable.NameMap;

            // If the module was not compiled with debug, the module symbol
            // table may not have a name map.
            if (nameMap == null)
            {
                nameMap = Phx.Symbols.NameMap.New(symbolTable, 64);
                symbolTable.AddMap(nameMap);
            }

            Phx.Symbols.Symbol sym = nameMap.Lookup(assemblyName);

            // Note that there might be a number of symbols with
            // identical names, so search through until we have an
            // assembly reference.
            while (sym != null)
            {
                if (sym.IsAssemblySymbol)
                {
                    return sym.AsAssemblySymbol;
                }
                sym = nameMap.LookupNext(sym);
            }
            return null;
        }

        static Phx.Types.Table typeTable;
        static Phx.Lifetime lifetime;
    }
}
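One detail in Part 2 is worth a closer look: the Characteristics line uses XOR with 0x2000, which toggles the DLL bit instead of unconditionally clearing it. The standalone sketch below, which does not use Phoenix, shows the difference on a plain ushort. The DllBitDemo class and the sample value 0x2102 are illustrative assumptions; 0x2000 is the IMAGE_FILE_DLL flag from the PE file header.

```csharp
using System;

static class DllBitDemo
{
    // IMAGE_FILE_DLL flag in the PE file header characteristics.
    const ushort ImageFileDll = 0x2000;

    // XOR flips the bit: set becomes clear, clear becomes set.
    public static ushort Toggle(ushort characteristics)
    {
        return (ushort)(characteristics ^ ImageFileDll);
    }

    // AND with the complement always clears the bit.
    public static ushort Clear(ushort characteristics)
    {
        return (ushort)(characteristics & ~ImageFileDll);
    }

    static void Main()
    {
        ushort dll = 0x2102;  // illustrative value with the DLL bit set
        ushort exe = 0x0102;  // the same value with the DLL bit clear

        Console.WriteLine("{0:X4}", Toggle(dll)); // prints 0102
        Console.WriteLine("{0:X4}", Clear(dll));  // prints 0102
        Console.WriteLine("{0:X4}", Toggle(exe)); // prints 2102
        Console.WriteLine("{0:X4}", Clear(exe));  // prints 0102
    }
}
```

Because the input here is always foo.dll, the XOR in the sample produces the intended result. If the model assembly could ever be an .exe, a clearing form would be the safer choice, although the exact expression depends on the declared type of Characteristics, which is not shown in this thread.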
Hi,
Firstly, thanks a lot! This example has been very helpful.
Now, the problems:
When I compile the code, the foo2.exe file is generated, but the following message is shown:
Code Snippet
I used peverify and ildasm to verify the foo.exe and everything seemed normal, but it doesn't work. The only thing that I changed in your example was the foo.il file. I added ".assembly extern mscorlib {auto}", because without it the program didn't find the mscorlib reference.
When you say it "doesn't work" are you referring to the assert? Or is there some other problem?
I should have mentioned that the code will produce the assert that you see. It comes from the small bit of unmanaged code that exists in the binary. You can safely ignore the assert. The program still completes properly.
I haven't tried this, but if you build the IL assembly with /NOCORSTUB you may be able to disable generation of this native code in the initial assembly, and then the assert may go away.