JavaScript XML Parser

For a web page I was working on, I needed to parse XML inside JavaScript. This wasn’t XML that was part of the page itself; it was a generic XML block the server was sending to JavaScript in response to an AJAX request. The code needed to do that still isn’t 100% clear to me, so I figured typing up an explanation of what I know would be a nice reminder.

Initializing the object and passing it XML to read is not only different between Internet Explorer (IE) and Firefox, it’s also different depending on whether you want to parse XML from a string or from a file. Once you have the XML loaded in the object, the way you retrieve the information is the same.

So let’s start by creating the object and loading it with XML from an XML string. To create it in IE you do the following:

var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");

You then have to set the async property to false. In Firefox you do:

var xmlParser = new DOMParser();

Next, to load the XML string in IE you call the loadXML() method and pass it the string to parse. In Firefox you call the parseFromString() method and pass it two values: the first is the XML string to parse, the second is a description of the content, which should be “text/xml”.

To keep things consistent I wrap both of these steps into a single function. This way the browser specific code is kept in a single place. The function I use looks like this:

function CreateXMLStringParser(XMLString)
{
    var xmlDoc;

    try
    {
        // Firefox and other browsers that provide DOMParser
        var xmlParser = new DOMParser();
        xmlDoc = xmlParser.parseFromString(XMLString, "text/xml");
    }
    catch(Err)
    {
        try
        {
            // Internet Explorer
            xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
            xmlDoc.async = false;
            xmlDoc.loadXML(XMLString);
        }
        catch(Err2)
        {
            window.alert("Browser does not support XML parsing.");
            return false;
        }
    }

    return xmlDoc;
}
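
One thing the function above doesn’t check is whether the XML actually parsed. As far as I know, DOMParser doesn’t throw on malformed XML in Firefox; it hands back a document containing a parsererror element instead, while the IE object records the failure in its parseError property. Here is a minimal sketch of a check under those assumptions. XMLParseFailed is a name I made up for illustration, not part of any browser API.

```javascript
// Hypothetical helper: returns true when the parsed document reports an error.
function XMLParseFailed(xmlDoc)
{
    // CreateXMLStringParser() returns false when no parser is available.
    if (!xmlDoc)
    {
        return true;
    }

    // IE (MSXML): parseError.errorCode is non-zero after a failed load.
    if (xmlDoc.parseError)
    {
        return xmlDoc.parseError.errorCode !== 0;
    }

    // DOMParser browsers: a failed parse leaves a <parsererror> element behind.
    return xmlDoc.getElementsByTagName("parsererror").length > 0;
}

// Usage, assuming CreateXMLStringParser() from this post:
// var doc = CreateXMLStringParser("<Root><Child></Root>");
// if (XMLParseFailed(doc)) { window.alert("Bad XML"); }
```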

If you want to load the XML from a file, you use the same object for IE. In Firefox, however, the object changes. It becomes:

document.implementation.createDocument("","",null);

Next you need to tell the object what XML to load. This is done by calling the load() method and passing it the file name.

Even though the method that loads the XML is the same, I wrap both steps in a single function to keep things consistent. The function I use looks like this:

function CreateXMLFileParser(XMLFile)
{
    var xmlDoc;

    try
    {
        // Firefox (note: null must be lowercase in JavaScript)
        xmlDoc = document.implementation.createDocument("", "", null);
        xmlDoc.load(XMLFile);
    }
    catch(Err)
    {
        try
        {
            // Internet Explorer
            xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
            xmlDoc.async = false;
            xmlDoc.load(XMLFile);
        }
        catch(Err2)
        {
            window.alert("Browser does not support XML parsing.");
            return false;
        }
    }

    return xmlDoc;
}

For the purposes of explaining how to use this object to read XML data I’m going to use this XML block:

<Root Attr="Value">
    <Child>
        Some Text
    </Child>
</Root>

I’m going to assume that it’s been preloaded into one of the objects discussed above using this code:

var XMLText = "<Root Attr=\"Value\">\n   <Child>\n      Some Text\n   </Child>\n</Root>";
var XMLObj = CreateXMLStringParser(XMLText);

Now that we have an XML object loaded with a chunk of XML we need to go through its contents. The XML object provides a set of arrays that will allow you to navigate the XML data any way you’d like.

This is done with the documentElement property of the XML object. It contains a series of nested arrays that hold each node in the XML. Each node has three properties: nodeName, which holds the name of the tag; nodeValue, which contains the text value of the node; and childNodes, which is an array of nested nodes.
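
To see those three properties on every node at once, a small recursive walk is enough. This is only a sketch. DescribeNode is a hypothetical name of my own, and it relies on nothing beyond nodeName, nodeValue, and childNodes.

```javascript
// Hypothetical helper: builds one line of text per node, indented by depth.
function DescribeNode(node, depth, lines)
{
    depth = depth || 0;
    lines = lines || [];

    // Two spaces of indent for each level below the starting node.
    var indent = new Array(depth + 1).join("  ");

    // Show newlines as \n so each node stays on a single line.
    var value = String(node.nodeValue).replace(/\n/g, "\\n");
    lines.push(indent + node.nodeName + " = " + value);

    for (var i = 0; i < node.childNodes.length; i++)
    {
        DescribeNode(node.childNodes[i], depth + 1, lines);
    }

    return lines;
}

// Usage, assuming XMLObj was loaded as shown above:
// window.alert(DescribeNode(XMLObj.documentElement).join("\n"));
```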

The documentElement has the properties for the root node. In the sample XML it would be the <Root> node. So the value of XMLObj.documentElement.nodeName would be “Root”.

In this example childNodes has three pieces in it. Each place where text resides is considered a node, and a piece is added to childNodes for it. That means the newline and spaces before and after the <Child> tag are each treated as a node of their own. XMLObj.documentElement.childNodes[0] is the newline and spaces before the <Child> tag, XMLObj.documentElement.childNodes[1] is the <Child> tag, and XMLObj.documentElement.childNodes[2] is the newline following the </Child> tag.
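
If the whitespace pieces get in the way, one option is to skip them by checking each node’s nodeType property, which is 1 for a tag and 3 for text. A rough sketch, where GetChildTags is a hypothetical helper name:

```javascript
// Hypothetical helper: returns only the child nodes that are tags.
function GetChildTags(node)
{
    var tags = [];

    for (var i = 0; i < node.childNodes.length; i++)
    {
        // nodeType 1 is an element (a tag); 3 is a text node.
        if (node.childNodes[i].nodeType === 1)
        {
            tags.push(node.childNodes[i]);
        }
    }

    return tags;
}

// Usage, assuming XMLObj holds the sample XML:
// var tags = GetChildTags(XMLObj.documentElement);
// tags[0].nodeName would then be "Child".
```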

Each piece in the childNodes array has the same values as in documentElement, so you can explore the <Child> tag in the same way through XMLObj.documentElement.childNodes[1]. So the value of XMLObj.documentElement.childNodes[1].nodeName would be “Child”. XMLObj.documentElement.childNodes[1].childNodes would be an array with a single piece, the text inside the <Child> tag.

The nodeValue property contains the text value of the node. It will contain all of the whitespace and text in it. So XMLObj.documentElement.childNodes[1].childNodes[0].nodeValue would be “\n      Some Text\n   ” since it is the text inside the <Child> tag.

If the node you are working on is a tag rather than text, then nodeValue is null. XMLObj.documentElement.childNodes[1].nodeValue would be null since it is the <Child> tag.
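
Since nodeValue is null on a tag and padded with whitespace on its text, I find it handy to wrap the lookup in a helper that gathers the text directly inside a tag and trims it. This is a sketch only. GetTagText is a hypothetical name, and it assumes each node exposes the standard nodeType property (3 for text, 4 for CDATA).

```javascript
// Hypothetical helper: returns the trimmed text directly inside a tag.
function GetTagText(tag)
{
    var text = "";

    for (var i = 0; i < tag.childNodes.length; i++)
    {
        var child = tag.childNodes[i];

        // Text nodes (3) and CDATA sections (4) carry their text in nodeValue.
        if (child.nodeType === 3 || child.nodeType === 4)
        {
            text += child.nodeValue;
        }
    }

    // Trim with a regular expression so it also works where trim() is missing.
    return text.replace(/^\s+|\s+$/g, "");
}

// Usage, assuming XMLObj holds the sample XML:
// GetTagText(XMLObj.documentElement.childNodes[1]) would give "Some Text".
```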

To avoid the extra childNodes pieces created by the XML formatting and keep the whitespace out of nodeValue you can make your XML like this:

<Root><Child>Some Text</Child></Root>

It’s a bit easier to parse out, but more difficult to read.

If you were looking for a specific tag you would have to loop through all the pieces of the childNodes array, checking the nodeName properties until you found it. This can get a bit cumbersome, so there’s a getElementsByTagName() method that will return an array with every occurrence of that tag. For example XMLObj.documentElement.getElementsByTagName("Child")[0].childNodes[0].nodeValue would be the text from the <Child> tag, “\n      Some Text\n   ”.
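
For comparison, here is roughly what the manual loop looks like. One difference worth knowing is that getElementsByTagName() searches every level below the node, while this loop only looks at direct children. FindChildTag is a hypothetical name I picked for the sketch.

```javascript
// Hypothetical helper: returns the first direct child tag with the given
// name, or null when there is no such child.
function FindChildTag(parent, tagName)
{
    for (var i = 0; i < parent.childNodes.length; i++)
    {
        if (parent.childNodes[i].nodeName === tagName)
        {
            return parent.childNodes[i];
        }
    }

    return null;
}

// Usage, assuming XMLObj holds the sample XML:
// var child = FindChildTag(XMLObj.documentElement, "Child");
```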

The only other method I’d like to mention is getAttribute(). This allows you to retrieve attribute values of any XML tag. It’s included in documentElement as well as in every tag in the childNodes arrays. You pass it the name of the attribute and it returns the value. For example XMLObj.documentElement.getAttribute("Attr") would return “Value”.
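
One thing to watch for: as far as I know, getAttribute() gives back null when the tag doesn’t have the attribute you asked for. A small wrapper can supply a default in that case. GetAttributeOrDefault is a hypothetical name, and this is just a sketch of the idea.

```javascript
// Hypothetical helper: returns the attribute value, or the supplied
// default when the tag does not have that attribute.
function GetAttributeOrDefault(tag, attrName, defaultValue)
{
    var value = tag.getAttribute(attrName);

    // A missing attribute is assumed to come back as null.
    if (value === null)
    {
        return defaultValue;
    }

    return value;
}

// Usage, assuming XMLObj holds the sample XML:
// GetAttributeOrDefault(XMLObj.documentElement, "Attr", "None");
```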

7 Responses to “JavaScript XML Parser”

  1. 1
    Marc-Andre Menard Says:

    I have a problem with passing a parameter to an XML file parser for the XSL tree… it works in Firefox, not in Explorer…

    Maybe you can get me to the error/solution … you have a lot information here, but not this one precisely…

    Thanks in advance

    here is the link with the problem

    http://w3schools.invisionzone.com/index.php?showtopic=17636&st=0&p=96325&#entry96325

  2. 2
    Exile Says:

    I looked over the example and I’m not 100% sure what you’re trying to do. My guess is that you have some parameters in your XSL file that you’d like to set via Javascript prior to doing the transformation.

    For Internet Explorer you have to do something like this:

    //Load the XSL
    var xsltXsl = new ActiveXObject("Microsoft.XMLDOM");
    xsltXsl.async = false;
    xsltXsl.load("YourStylesheet.xsl");

    //Create a compiled XSLT object
    var xsltCompiled = new ActiveXObject("MSXML2.XSLTemplate");
    xsltCompiled.stylesheet = xsltXsl.documentElement;

    //Create the XSL processor
    var xsltProc = xsltCompiled.createProcessor();

    //Load the XML (the file name is a placeholder for your own XML file)
    var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.load("YourXmlFile.xml");

    //Setup the XSLT processor
    xsltProc.input = xmlDoc;
    xsltProc.addParameter("NameOfYourParameter1", "ValueOfYourParameter1");
    xsltProc.addParameter("NameOfYourParameter2", "ValueOfYourParameter2");

    //Perform the transformation
    xsltProc.transform();

    I’m not sure if that’s exactly what you need though. If you can post the full example, including the XML and XSL files so that we can see the final results, it might help clarify exactly what you’re trying to do.

  3. 3
    Marc-Andre Menard Says:

    Amazing… you answered and it almost works…

    There is still a problem with the Microsoft.XMLDOM object.
    Error: object does not support this property… sob :-(

    here is the site page: http://www.alteraction.ca/html-ang/4-1-listedeprix2.html

  4. 4
    Marc-Andre Menard Says:

    I have changed my 58-line code for a script that is 188 lines… but it seems to work well… It still doesn’t work in Safari… don’t know why! What do you think of that?

    http://www.alteraction.ca/html-ang/4-1-listedeprix3.html

    By the way, if you’re interested in seeing the scripts, they are at: http://www.alteraction.ca/scripts/xmlreader3.js
    http://www.alteraction.ca/scripts/xmlreader.js

  5. 5
    Exile Says:

    In your first example you created the object xsltCompiled as a Microsoft.XMLDOM object. I think it needs to be an MSXML2.XSLTemplate object instead.

    As for Safari, I’m not sure what to say. I’ve done very little work in that browser so I don’t know what JavaScript works in there. Sorry, I’m not much use outside of Firefox or IE.

  6. 6
    nik Says:

    Thanks for this example and the wonderful explanation you have provided.

  7. 7
    John Roemer Says:

    There’s at least a couple of good free open source JavaScript XML parsers out there you may want to take a look at. One of them, XML for Script (http://xmljs.sourceforge.net), has been around for a while. You can find good examples of what they offer at their site, including some live samples. A newer one, sw8t.xml (http://sw8t.com), has a JDOM-like flavor to it … it’s pretty complete, the API is very easy to use, it parses XML into a DOM-type structure, and parsing from a URL is very straightforward. There’s also good documentation on their site. I would recommend trying either of those out for what you’re doing.


© 2007 Mindlence