Read non standard-conformant XML

Hi!

I want to read an XML file that is not completely standard-conformant. To be more precise, some of its nodes contain text with unescaped special characters (see the example below).

<RootElement>
<ConformantNode>Hello</ConformantNode>
<NonConformantNode>You & me</NonConformantNode>
</RootElement>

As you can see, "NonConformantNode" contains an & character in its text, which always makes System.Xml.XmlReader throw an exception.
How can I read the content of this document? In other words, how can I read an XML document without automatic character decoding?

Thanks for your help in advance!

P.S.: I already tried setting XmlReaderSettings.CheckCharacters to false, but that doesn't do the trick. Reading the entire file into a string and then replacing the special characters is not a good idea either, since the files can be quite big and performance is critical.

[994 byte] By [tommazzo] at [2007-12-17]
# 1
You may want to use some "forgiving" parser instead of XmlReader. Try SgmlReader, http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=B90FDDCE-E60D-43F8-A5C4-C3BD760564BC
OlegTkachenko at 2007-9-9 > top of Msdn Tech,.NET Development,XML and the .NET Framework...
# 2
That looks like an easy solution to me. Thanks!
The only problem is that this parser is quite big. Isn't there a way to derive a class from XmlReader that behaves just like XmlTextReader, except that it doesn't crash when it encounters an invalid character?
tommazzo at 2007-9-9
# 3
You can, but you have to implement parsing then. That's exactly what SgmlReader does.
And make sure you really cannot get your XML fixed in the first place.
OlegTkachenko at 2007-9-9
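
To make the point in reply #3 about fixing the XML at the source concrete: the sample document is rejected only because the bare & is not written as &amp;. The following is a minimal sketch, not .NET code. Python's standard xml.etree.ElementTree stands in for XmlReader purely for illustration, and try_parse is a made-up helper name. It shows a conformant parser refusing the original node text and accepting the escaped form.

```python
import xml.etree.ElementTree as ET

# The node from the question, once as posted and once with the ampersand escaped.
BROKEN = "<RootElement><NonConformantNode>You & me</NonConformantNode></RootElement>"
FIXED = "<RootElement><NonConformantNode>You &amp; me</NonConformantNode></RootElement>"


def try_parse(xml_text):
    """Return the text of the first child element, or None if the input is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return root[0].text


broken_result = try_parse(BROKEN)  # None: a bare '&' is not well-formed XML
fixed_result = try_parse(FIXED)    # "You & me": the parser decodes &amp; back to '&'
```

If whoever produces the files can emit &amp; instead of &, no special reader is needed at all.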
# 4
I'm afraid I have to, because application size is also more or less critical. Just one final question: is there any basic XmlReader implementation that I can use as a template, or do I have to start from scratch?
tommazzo at 2007-9-9
# 5
You can take a look at Rotor's implementation of XmlTextReader. I'm not sure whether you can use it directly.
You can also try to tweak Mono's sources for XmlTextReader.
OlegTkachenko at 2007-9-9
# 6
SgmlReader code is probably smaller than XmlTextReader, and you can then rip out the guts of all the Sgml validation stuff.
ChrisLovett at 2007-9-9
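
The P.S. in the original post rules out loading the whole file into a string before replacing characters. A replacement pass does not have to see the whole file at once, though: it can run chunk by chunk in front of an ordinary parser. The sketch below illustrates that idea in Python rather than .NET, and nobody in the thread proposed it. The names escape_bare_ampersands, parse_lenient, BARE_AMP and MAX_REF_LEN are hypothetical. It assumes the only defect is an unescaped & in element text, that no reference is longer than MAX_REF_LEN characters, and that the file contains no CDATA sections or comments with a literal &, which this filter would wrongly rewrite.

```python
import io
import re
import xml.etree.ElementTree as ET

# An '&' that does NOT start a named, decimal, or hex reference (&amp; &#38; &#x26;).
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")
MAX_REF_LEN = 32  # assumed upper bound on the length of a reference such as "&amp;"


def escape_bare_ampersands(stream, chunk_size=64 * 1024):
    """Yield the stream's text piece by piece, with every bare '&' rewritten as '&amp;'."""
    pending = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        text = pending + chunk
        # An '&' near the end may start a reference that continues in the next
        # chunk, so hold the tail back until a ';' (or enough input) has arrived.
        cut = text.rfind("&")
        if cut != -1 and len(text) - cut < MAX_REF_LEN and ";" not in text[cut:]:
            text, pending = text[:cut], text[cut:]
        else:
            pending = ""
        if text:
            yield BARE_AMP.sub("&amp;", text)
    if pending:
        # End of input: whatever is still held back can now be decided.
        yield BARE_AMP.sub("&amp;", pending)


def parse_lenient(stream, chunk_size=64 * 1024):
    """Feed the repaired text to an incremental parser and return the root element."""
    parser = ET.XMLParser()
    for piece in escape_bare_ampersands(stream, chunk_size):
        parser.feed(piece)
    return parser.close()
```

Calling parse_lenient(open("data.xml", encoding="utf-8")) on a hypothetical file data.xml would keep at most one chunk plus a short tail of unparsed text in memory. In .NET the same filter would presumably sit in a TextReader wrapper passed to XmlReader.Create. That mapping is an assumption and has not been tested here.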
