XPath and the default namespace

I’ve been working on a make script that extracts the list of files it needs to build from a Microsoft Visual Studio project file. The .vcxproj file format is XML, so I imagined it would be easy to use a command line XML processor to do the job. However, because the project file declares a default namespace, this was not as easy as it first looked.

One tool that can perform XPath queries on XML files is xmllint. This is part of the libxml2 package and is available in most Linux distributions and also on Cygwin. As this script was for use on Windows 7, it seemed like a good choice. The method for running an XPath query using xmllint is the undocumented command line option --xpath:

$ xmllint --xpath /XmlTag1/XmlTag2/etc... XmlFile.xml

This works fine on simple sample XML files such as the standard books.xml. However, running this against a .vcxproj file produces:

$ xmllint --xpath "/Project/ItemGroup/ClCompile" project.vcxproj
XPath set is empty

Worse, the query that I actually want to run, which selects the Include attributes, crashes outright:

$ xmllint --xpath "/Project/ItemGroup/ClCompile/@Include" project.vcxproj
Segmentation fault (core dumped)

The reason this doesn’t work is down to the project file’s use of the default namespace. The .vcxproj file defines its default namespace as:

http://schemas.microsoft.com/developer/msbuild/2003

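This means that all of the XML tags in the .vcxproj file belong to this namespace. But there is no way in an XPath query to specify what the default namespace is: an unprefixed name such as Project in an XPath 1.0 expression only matches elements that are in no namespace at all. To match by name, you need to have mapped the namespace to a prefix within the tool first.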

xmllint can do this mapping in its shell mode, but there is currently no command line option to do the same. This means the above XPath query would need to test the namespace and local name of each element explicitly:

$ xmllint --xpath "/*[namespace-uri()='http://schemas.microsoft.com/developer/msbuild/2003' and local-name()='Project']/*[namespace-uri()='http://schemas.microsoft.com/developer/msbuild/2003' and local-name()='ItemGroup']/*[namespace-uri()='http://schemas.microsoft.com/developer/msbuild/2003' and local-name()='ClCompile']" project.vcxproj
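The repetition in that query can be factored out. As a minimal sketch in plain shell (the `step` helper and the `ns` and `query` variable names are made up for illustration and are not part of xmllint), each location step can be generated from the element name:

```shell
# Namespace URI declared as the default namespace in the .vcxproj file.
ns='http://schemas.microsoft.com/developer/msbuild/2003'

# step NAME: print one location step matching element NAME in that namespace.
step() {
    printf "/*[namespace-uri()='%s' and local-name()='%s']" "$ns" "$1"
}

# Assemble the same query as above, one step per element.
query="$(step Project)$(step ItemGroup)$(step ClCompile)"
printf '%s\n' "$query"
```

The assembled string can then be passed on as xmllint --xpath "$query" project.vcxproj.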

Within the make script itself, I can use variables or macro expansion to make the command a bit more manageable. A hackier alternative would be to strip out the default namespace declaration from the .vcxproj file. This can be done with the sed program:

$ sed -e "s/xmlns/ignore/" project.vcxproj | xmllint --xpath "/Project/ItemGroup/ClCompile/@Include" -
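To see why this works, it may help to look at what the substitution does to the root element. A minimal sketch, using a made-up one-line stand-in for the real project file (the `sample` and `stripped` variable names are only for illustration):

```shell
# Hypothetical stand-in for the root element of a .vcxproj file.
sample='<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"/>'

# Same substitution as above: rename the first "xmlns" on each line.
stripped=$(printf '%s\n' "$sample" | sed -e "s/xmlns/ignore/")
printf '%s\n' "$stripped"
# prints: <Project ignore="http://schemas.microsoft.com/developer/msbuild/2003"/>
```

The namespace declaration becomes an ordinary attribute called ignore, so the elements no longer belong to any namespace and the unprefixed query matches. Because the expression replaces the first occurrence of xmlns on every line, it would also rewrite a prefixed declaration such as xmlns:foo, or the text xmlns appearing anywhere else in the file.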

Another, shorter hack would be to relax the matching criteria and match on the local name alone, at any depth:

$ xmllint --xpath "//*[local-name()='ClCompile']/@Include" project.vcxproj

But however I implement it, the results need further parsing as they come back in the form of key="value" attributes:

 Include="file1.cpp" Include="file2.cpp"

I can use another sed expression to turn that into a simple whitespace-separated list. But I’ll leave that for another post.
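As a rough idea of what such an expression might look like, here is a minimal sketch. It assumes the output format shown above and file names that contain no spaces or double quotes; the `flatten_includes` function name is made up for illustration:

```shell
# flatten_includes: read xmllint attribute output on stdin and print the
# quoted values separated by single spaces.
flatten_includes() {
    # First expression: replace each Include="value" with "value ".
    # Second expression: drop the trailing space(s) at the end of the line.
    sed -e 's/ *Include="\([^"]*\)"/\1 /g' -e 's/ *$//'
}

printf ' Include="file1.cpp" Include="file2.cpp"\n' | flatten_includes
# prints: file1.cpp file2.cpp
```

In practice the printf would be replaced by the xmllint command itself, piped into the same function.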


Comments

8 responses to “XPath and the default namespace”

  1. Try xmllint --xpath "string(//*[local-name()='ClCompile']/@Include)" project.vcxproj

  2. Andreas Heim

    Sorry, should be:

    setns defns=http://schemas.microsoft.com/developer/msbuild/2003

  3. Andreas Heim

    A possible solution is to use the shell mode of xmllint and a command file that is used as stdin:

    xmllint --shell project.vcxproj < commands

    Content of commands file:

    setrootns
    xpath /defaultns:Project/defaultns:ItemGroup/defaultns:ClCompile

    You could also define a shorter namespace alias:

    setns defns
    xpath /defns:Project/defns:ItemGroup/defns:ClCompile

    Of course, the output of the command has to be parsed, but it seems to be simpler than parsing the output of

    xmllint --xpath …

  4. Hi Abika, I haven’t done any work parsing vcxproj files or XML files for a long time so I’m probably not the best person to ask. In general to access the n-th child of something you usually use n as an index: e.g. element[n] so I’d start my search there.

  5. Hi,

    How do I get the nth child in the case with the namespace? How do I use position in this type of query?

  6. Charlie Reitzel

    You can extract only the attribute values like so:

    xmllint --xpath "//*[local-name()='ClCompile']/@Include/text()"

  7. Thanks Emil, I’ll incorporate your correction.

  8. Emil Gelev

    Actually, to get any result you need local-name()='ElementName' instead of name()='ElementName'
