Friday, March 30, 2012

Is there a simple way to get everything there is to get from an XmlReader?

THE TASK
Using Visual Studio .NET 2003 and SQL Server 2000, I want to create a .NET application that gets SQL Server data obtained from a SELECT query with a FOR XML clause. I have seen the solution a long time ago in the MSCD .NET book "Developing Windows-Based Applications" for 70-306 and 70-316 and I thought, this looks really simple, I don't need to try it out. Now that I need to do it I am surprised how difficult it is.

TO RUN THE EXAMPLE
The code below demonstrates the expected solution based on the book and the failure of that solution. To run this example, create a Windows Forms project, put a TextBox called textBox1 and two Buttons called button1 and button2 on the Form. Double-click each button to generate the methods that handle the click events. Replace "MySQLServer" and "MyWorkstation" with your own names. The pubs database should be present, otherwise modify the connection string and the command text as well. Compile it and click button1.

THE OUTCOME
The text in textBox1 shows four XML elements but there are eight records in the publishers database. The pub_ID values show that every second record has been omitted.

THE PROBLEM
The code should be self-explanatory, so let's look directly at the while loop and the ReadOuterXml method. That's where the problem occurs.
The framework documentation reveals a bad surprise. If the XmlReader is positioned on a node at the leaf level, ReadOuterXml behaves like Read. This means the reader advances to the next position. In combination with Read this means that the reader advances by 2 positions although the while loop has only looped once. This may or may not happen depending on whether the reader happens to be on a leaf node or node. Try to play with nested parent and child records and you will see how funny this can be.
Apparently, this means there is no easy way to reliably loop through everything.
I find this behavior of the ReadOuterXml method very disappointing especially since there seems to be no straightforward way to control this behavior. When I research the subject, I only find articles that simply say "//Do some parsing here" in the while loop. I disagree. One should not do any parsing there. One should not query the node type and then reconstruct every aspect of the XML for every theroetically possible node type and write a lot of code to that end. Instead, one should trust that SQL Server has produced correct data and go ahead and save the data to a file. Compare how well-behaved the SqlDataReader is!
I wish the XmlReader had a ReadEverythingThereIsToReadAndReturnItAsATextStream method. Is there a known and simple solution to accomplish what that method whoud do?

THE CODE
private void button1_Click(object sender, System.EventArgs e)
{
string connectionString = "data source=MySQLServer;initial catalog=pubs;integrated security=SSPI;persist security info=False;workstation id=MyWorkStation;packet size=4096";
System.Data.SqlClient.SqlConnection connection = new System.Data.SqlClient.SqlConnection(connectionString);
string cmdText = "SELECT * FROM publishers FOR XML RAW";
System.Data.SqlClient.SqlCommand cmd = new System.Data.SqlClient.SqlCommand(cmdText, connection);
this.textBox1.Text = String.Empty;
connection.Open();
System.Xml.XmlReader reader = cmd.ExecuteXmlReader();
while (reader.Read())
{
this.textBox1.AppendText(reader.ReadOuterXml());
}
reader.Close();
connection.Close();
}

private void button2_Click(object sender, System.EventArgs e)
{
string filename = Application.StartupPath + System.IO.Path.DirectorySeparatorChar + "publishers.xml";
System.Xml.XmlTextWriter writer = new System.Xml.XmlTextWriter(filename, System.Text.Encoding.UTF8);
writer.Formatting = System.Xml.Formatting.Indented;
writer.WriteStartDocument();
writer.WriteStartElement("Publishers");
writer.WriteRaw(this.textBox1.Text);
writer.WriteEndDocument();
writer.Close();
}

If you want to write out everything the XmlReader returns then open up an XmlTextWriter and simply do xmlTextWriter.WriteNode(xmlReader, false).|||

Thank you Martin Honnen. What you wrote is useful, indeed.

HOWEVER!

I figured out a much simpler solution.

If there is no need to do any parsing on the XML that SQL Server returns, don't use an XmlReader and a while loop at all. Instead, do this:

string xmlResult = (string)cmd.ExecuteScalar();

I still think that the ReadOuterXml method (ReadInnerXml as well) should behave in a more controlled way, because as it is it's quite useless. But anyway, we have learned enough about this. Let's move on to new problems.

No comments:

Post a Comment