Optimizing existing MSIL programs
I am currently working on a fairly large, performance-critical C# application.
The main issue we face is that the current CLR JIT compiler does not do any significant inlining, especially when working with structs (Value Types) which are otherwise very helpful from a performance point of view:
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=93858
I am not sure what Phoenix does exactly, but if it is a generic framework for compiler optimizations, it should be possible to do something like this:
read in MSIL where structs are not inlined
inline all struct methods
perform subsequent optimizations
write out a new MSIL file (dll or exe).
Is that correct, or is Phoenix just an optimizing C++ compiler?
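To make the idea concrete, here is a minimal source-level sketch of the transformation I have in mind. The Vec2 type and the method names are hypothetical and only for illustration; the real rewrite would of course happen on the MSIL, not on the C# source.

```csharp
using System;

// Hypothetical value type; the names are illustrative only.
public struct Vec2
{
    public double X, Y;
    public Vec2(double x, double y) { X = x; Y = y; }

    // Small struct method that one would want inlined at each call site.
    public static Vec2 Add(Vec2 a, Vec2 b)
    {
        return new Vec2(a.X + b.X, a.Y + b.Y);
    }
}

public static class InliningSketch
{
    // Before: one call to Vec2.Add per loop iteration.
    public static Vec2 SumWithCalls(Vec2[] items)
    {
        Vec2 sum = new Vec2(0, 0);
        for (int i = 0; i < items.Length; i++)
            sum = Vec2.Add(sum, items[i]);
        return sum;
    }

    // After: the body of Vec2.Add substituted by hand, which is roughly
    // what an IL-level inliner would aim to produce automatically.
    public static Vec2 SumInlined(Vec2[] items)
    {
        double x = 0, y = 0;
        for (int i = 0; i < items.Length; i++)
        {
            x += items[i].X;
            y += items[i].Y;
        }
        return new Vec2(x, y);
    }
}
```

Both methods compute the same result; the second simply has no call and no temporary struct copies left in the loop body.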
regards,
Rüdiger Klaehn
P.S. I know that inlining methods should be done by the JIT, but it seems that performance does not have any priority with the CLR JIT team... :-(
The JIT team is actually working on this problem. And Phoenix is more than an optimizing C++ compiler.
You can almost certainly use Phoenix to ameliorate some of the inlining limitations in the current JIT, using a PE read-write client and the general inliner framework we provide in Phoenix, with suitable customization for MSIL. I say "almost" because nobody has tried this yet, so you would be breaking new ground.
However, it will be challenging to get this to work across assembly boundaries, and the resulting assemblies will likely no longer have accurate source-level debug information. We don't have any samples that will help you, but for some inspiration you can look at Mark Eaddy's Morpher framework, which is able to do AOP-style weaving in a manner that is very similar to what would be done for inlining.
Thanks for the answer.
It is good to know that somebody at MS is finally fixing this issue. Our code relies heavily on generics, and the inner loops do not contain many virtual method invocations, so once the inlining issues are taken care of, it will really fly.
By the way: I noticed that the quality of the generated code is a bit better in the x64 version of the JIT. But there is still a lot of work to do. One thing I would love to see is escape analysis and stack allocation. I know that the CLR allocator is quite fast, but nothing is faster than just bumping the stack pointer :-)
cheers,
Rüdiger