I am Hack Sparrow
Captain of the Internets.

How to manhandle XML with namespace in PHP

Processing namespaced XML in PHP - the hardcore way

You might agree that handling XML is already a fuckin pain in the ass! There are so many cumbersome ways of handling XML in PHP that you don't have a clue which one to use or how the fuck they work anymore.

Then one beautiful day you get an XML which looks like this monster:

<?xml version="1.0" encoding="UTF-8"?>
<aws:SitesLinkingInResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:OperationRequest>
<aws:RequestId>
ca282ec6-2d08-4341-9f1d-50f8c1e3652b
</aws:RequestId>
</aws:OperationRequest>
<aws:SitesLinkingInResult>
<aws:Alexa>
<aws:SitesLinkingIn>
<aws:Site>
<aws:Title>
Google
</aws:Title>
<aws:Url>
http://www.google.com:80/Top/Computers/Internet/On_the_Web/Web_Portals/
</aws:Url>
</aws:Site>
<aws:Site>
<aws:Title>
www.fotolog.com:80/TsR_BkR_TsR
</aws:Title>
<aws:Url>
http://www.fotolog.com:80/TsR_BkR_TsR
</aws:Url>
</aws:Site>
</aws:SitesLinkingIn>
</aws:Alexa>
</aws:SitesLinkingInResult>
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>
Success
</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:SitesLinkingInResponse>

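You will find that none of the usual PHP XML parsing snippets work on it out of the box. I tell you, the typical code samples you find on the Web are not going to help you. The shit you see above is XML with a namespace: every tag carries the aws: prefix, and that same prefix is even bound to two different URIs. SimpleXML can deal with namespaces, but as far as I know, the documentation on how to handle that kinda shit in PHP is thin.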
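
To see what goes wrong, here is a minimal sketch using a trimmed-down copy of the response above (cut short just for this example). The XML parses fine, but plain property access comes back empty because every element carries the aws: prefix:

```php
<?php
// Trimmed-down copy of the response above, kept short for the example.
$string = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<aws:SitesLinkingInResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>
Success
</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:SitesLinkingInResponse>
XML;

$xml = simplexml_load_string($string);

// Parsing itself succeeds: the document is well-formed.
$parsed = ($xml !== false);

// Plain property access does not see the aws:-prefixed children,
// so no Response element is found here.
$found = count($xml->Response);

var_dump($parsed, $found);
```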

I have come up with a surprisingly easy solution to this problem. All you need is str_replace and simplexml_load_string to show the piece of shit who the boss is and how much you own. I describe my technique below:

// Store the offending XML in a PHP string variable
$string = <<< XML
<?xml version="1.0" encoding="UTF-8"?>
<aws:SitesLinkingInResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:OperationRequest>
<aws:RequestId>
ca282ec6-2d08-4341-9f1d-50f8c1e3652b
</aws:RequestId>
</aws:OperationRequest>
<aws:SitesLinkingInResult>
<aws:Alexa>
<aws:SitesLinkingIn>
<aws:Site>
<aws:Title>
Google
</aws:Title>
<aws:Url>
http://www.google.com:80/Top/Computers/Internet/On_the_Web/Web_Portals/
</aws:Url>
</aws:Site>
<aws:Site>
<aws:Title>
www.fotolog.com:80/TsR_BkR_TsR
</aws:Title>
<aws:Url>
http://www.fotolog.com:80/TsR_BkR_TsR
</aws:Url>
</aws:Site>
</aws:SitesLinkingIn>
</aws:Alexa>
</aws:SitesLinkingInResult>
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>
Success
</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:SitesLinkingInResponse>
XML;

// Strip the "aws:" prefix from the opening tags
$string = str_replace('<aws:', '<', $string);
// Strip the "aws:" prefix from the closing tags
$string = str_replace('</aws:', '</', $string);
// For good measure, turn every xmlns:aws declaration into a harmless plain attribute
$string = str_replace('xmlns:aws', 'nonsense', $string);

// Load into simplexml_load_string(), which returns false if the XML is malformed
$xml = simplexml_load_string($string);
if ($xml === false) {
    die('Could not parse the XML');
}
// Echoes Success (trim() drops the newlines wrapped around the text)
echo trim((string) $xml->Response[0]->ResponseStatus[0]->StatusCode);
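
Once the prefixes are gone, the rest is ordinary SimpleXML. As a rough sketch of that, the snippet below wraps the three str_replace calls in a helper and loops over the Site elements. The helper name strip_aws_prefix and the $sites array are made up for this example, and the XML is a trimmed-down copy of the response:

```php
<?php
// Trimmed-down copy of the response, kept short for the example.
$string = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<aws:SitesLinkingInResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:SitesLinkingInResult>
<aws:Alexa>
<aws:SitesLinkingIn>
<aws:Site>
<aws:Title>
Google
</aws:Title>
<aws:Url>
http://www.google.com:80/Top/Computers/Internet/On_the_Web/Web_Portals/
</aws:Url>
</aws:Site>
<aws:Site>
<aws:Title>
www.fotolog.com:80/TsR_BkR_TsR
</aws:Title>
<aws:Url>
http://www.fotolog.com:80/TsR_BkR_TsR
</aws:Url>
</aws:Site>
</aws:SitesLinkingIn>
</aws:Alexa>
</aws:SitesLinkingInResult>
</aws:Response>
</aws:SitesLinkingInResponse>
XML;

// Hypothetical helper: the same three str_replace calls as above, in one place.
function strip_aws_prefix($string)
{
    $string = str_replace('<aws:', '<', $string);
    $string = str_replace('</aws:', '</', $string);
    return str_replace('xmlns:aws', 'nonsense', $string);
}

$xml = simplexml_load_string(strip_aws_prefix($string));

// Collect the title and URL of every linking site.
$sites = array();
foreach ($xml->Response->SitesLinkingInResult->Alexa->SitesLinkingIn->Site as $site) {
    $sites[] = array(
        'title' => trim((string) $site->Title),
        'url'   => trim((string) $site->Url),
    );
}

print_r($sites);
```

The trim() calls are there because the text in this response sits on its own line inside each tag.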

There you go! When you can't handle XML in PHP, you manhandle it!!!

Exercise

  1. How do you handle XML namespace in PHP?

12 Responses to “How to manhandle XML with namespace in PHP”

  1. Tolulope Adeagbo says:

    Thanks mehn you’ve really saved my ass, Fuck namespaces in php

  2. Silas says:

    You can strip out all namespaces with this:
    $string = preg_replace('/(<\/?)[^>:]+:/', '$1', $string);

    And you can get rid of the xmlns stuff with
    $string = preg_replace('/ xmlns[^=]*="[^"]*"/i', '', $string);

    As per the posts above, it’s better practice to register the namespaces, but if you just want a quick and dirty hack, here it is.
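
For contrast with the quick and dirty hack, the namespace registration mentioned in the comment above could look roughly like the sketch below. It uses SimpleXML's registerXPathNamespace() and xpath() on a trimmed-down copy of the response. The prefixes awis and alexa are made up for this example; only the URIs have to match the ones in the document:

```php
<?php
// Trimmed-down copy of the response, kept short for the example.
$string = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<aws:SitesLinkingInResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:SitesLinkingInResult>
<aws:Alexa>
<aws:SitesLinkingIn>
<aws:Site>
<aws:Title>
Google
</aws:Title>
</aws:Site>
<aws:Site>
<aws:Title>
www.fotolog.com:80/TsR_BkR_TsR
</aws:Title>
</aws:Site>
</aws:SitesLinkingIn>
</aws:Alexa>
</aws:SitesLinkingInResult>
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>
Success
</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:SitesLinkingInResponse>
XML;

$xml = simplexml_load_string($string);

// The document binds the aws prefix to two different URIs, so register
// one made-up prefix per URI for use in XPath expressions.
$xml->registerXPathNamespace('awis', 'http://awis.amazonaws.com/doc/2005-07-11');
$xml->registerXPathNamespace('alexa', 'http://alexa.amazonaws.com/doc/2005-10-05/');

// Site and Title live in the namespace declared on aws:Response.
$titles = array();
foreach ($xml->xpath('//awis:Site/awis:Title') as $title) {
    $titles[] = trim((string) $title);
}

// StatusCode lives in the namespace re-declared on aws:ResponseStatus.
$codes = $xml->xpath('//alexa:StatusCode');
$status = trim((string) $codes[0]);

print_r($titles);
echo $status, "\n";
```

It is more typing than the str_replace trick, but the XML is left untouched, and it keeps working if some text in the response happens to contain "<aws:".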
