DOMXpath, permuting namespace prefixes #18540

Closed
vavra opened this issue May 12, 2025 · 2 comments
vavra commented May 12, 2025

Description

The following code:

<?php
$str = '<?xml version="1.0" encoding="UTF-8"?><ns3:entDocumentResponse xmlns="http://domain.com/dictionary/technical/" 
     xmlns:ns2="http://domain.com/services/entityService/document/GetContentUrl_1_0" 
     xmlns:ns3="http://domain.com/services/entityService/document/1.0">
    <activities/>
    <ns3:response>
        <ns3:opGetContentUrl_1_0>
            <ns2:documentContentUrl>https://server.domain.com/ACS/servlet/someUrl</ns2:documentContentUrl>
        </ns3:opGetContentUrl_1_0>
    </ns3:response>
</ns3:entDocumentResponse>';

// situation 1
$xml = new \DOMDocument();
$xml->loadXML($str);
$xpath = new \DOMXPath($xml);
$xpath->registerNamespace('ns3', 'http://domain.com/services/entityService/document/1.0');
$xpath->registerNamespace('ns2', 'http://domain.com/services/entityService/document/GetContentUrl_1_0');
$res = $xpath->evaluate('/ns3:entDocumentResponse/ns3:response/ns3:opGetContentUrl_1_0/ns2:documentContentUrl');
if ($res instanceof \DOMNodeList && $res->length > 0)
  echo '[1] ', $res->item(0)->nodeValue, "\n";
else
  echo '[1] ', 'Not parsed for situation 1.', "\n";

// situation 2
$xml2 = new \DOMDocument();
$xml2->loadXML($str);
$xpath2 = new \DOMXPath($xml2);
$xpath2->registerNamespace('ns2', 'http://domain.com/services/entityService/document/1.0');
$xpath2->registerNamespace('ns3', 'http://domain.com/services/entityService/document/GetContentUrl_1_0');
$res = $xpath2->evaluate('/ns2:entDocumentResponse/ns2:response/ns2:opGetContentUrl_1_0/ns3:documentContentUrl');
if ($res instanceof \DOMNodeList && $res->length > 0)
  echo '[2] ', $res->item(0)->nodeValue, "\n";
else
  echo '[2] ', 'Not parsed for situation 2.', "\n";

// situation 3
$xml3 = new \DOMDocument();
$xml3->loadXML($str);
$xpath3 = new \DOMXPath($xml3);
$xpath3->registerNamespace('a', 'http://domain.com/services/entityService/document/1.0');
$xpath3->registerNamespace('b', 'http://domain.com/services/entityService/document/GetContentUrl_1_0');
$res = $xpath3->evaluate('/a:entDocumentResponse/a:response/a:opGetContentUrl_1_0/b:documentContentUrl');
if ($res instanceof \DOMNodeList && $res->length > 0)
  echo '[3] ', $res->item(0)->nodeValue, "\n";
else
  echo '[3] ', 'Not parsed for situation 3.', "\n";

Resulted in this output:

[1] https://server.domain.com/ACS/servlet/someUrl
[2] Not parsed for situation 2.
[3] https://server.domain.com/ACS/servlet/someUrl

But I expected this output instead:

[1] https://server.domain.com/ACS/servlet/someUrl
[2] https://server.domain.com/ACS/servlet/someUrl
[3] https://server.domain.com/ACS/servlet/someUrl

Swapping the namespace prefixes, while still mapping each prefix to the correct namespace URI, causes the XPath query to match nothing.
The problem only appears when the swapped prefixes are the same ones used in the parsed XML (situation 2).

PHP Version

PHP 8.3.14 (cli) (built: Nov 19 2024 15:53:36) (NTS Visual C++ 2019 x64)
Copyright (c) The PHP Group
Zend Engine v4.3.14, Copyright (c) Zend Technologies
    with Zend OPcache v8.3.14, Copyright (c), by Zend Technologies
    with Xdebug v3.4.1, Copyright (c) 2002-2025, by Derick Rethans

Operating System

Windows 11

@nielsdos
Member

There's a second argument in the DOMXPath constructor:

public function DOMXPath::__construct(DOMDocument $document, bool $registerNodeNS = true)

The $registerNodeNS argument is true by default, which means the node namespaces from the document are registered automatically. As a result, the XPath evaluator first tries to resolve the ns2 and ns3 prefixes using the document's own declarations when matching a node, if those namespaces are in scope. Only if they are not in scope will your overrides (via registerNamespace) be used.

So to make your example work, either always use the ns2 and ns3 prefixes as the document declares them and remove the calls to registerNamespace, or set $registerNodeNS to false so that your overrides are always used.

Here's a fixed example that makes use of the argument: https://3v4l.org/ZOANr
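
The linked snippet is not reproduced here. As a rough sketch of the two options described above, the following uses a trimmed copy of the XML from the report. The helper `firstValue()` and the variable names are made up for illustration and are not part of the DOM extension.

```php
<?php
// Trimmed copy of the document from the report (same namespaces and prefixes).
$str = '<?xml version="1.0" encoding="UTF-8"?><ns3:entDocumentResponse xmlns="http://domain.com/dictionary/technical/"
     xmlns:ns2="http://domain.com/services/entityService/document/GetContentUrl_1_0"
     xmlns:ns3="http://domain.com/services/entityService/document/1.0">
    <ns3:response>
        <ns3:opGetContentUrl_1_0>
            <ns2:documentContentUrl>https://server.domain.com/ACS/servlet/someUrl</ns2:documentContentUrl>
        </ns3:opGetContentUrl_1_0>
    </ns3:response>
</ns3:entDocumentResponse>';

// Hypothetical helper: returns the value of the first matched node, or null.
function firstValue(DOMXPath $xpath, string $expression): ?string
{
    $nodes = $xpath->evaluate($expression);
    if ($nodes instanceof DOMNodeList && $nodes->length > 0) {
        return $nodes->item(0)->nodeValue;
    }
    return null;
}

$doc = new DOMDocument();
$doc->loadXML($str);

// Option 1: keep the default ($registerNodeNS = true) and use the prefixes
// exactly as the document declares them, without calling registerNamespace().
$xpathDefault = new DOMXPath($doc);
$viaDocumentPrefixes = firstValue(
    $xpathDefault,
    '/ns3:entDocumentResponse/ns3:response/ns3:opGetContentUrl_1_0/ns2:documentContentUrl'
);

// Option 2: pass false so the document's in-scope namespaces are not
// registered automatically; only the mappings registered below apply.
// This is "situation 2" from the report, with the prefixes swapped.
$xpathOverride = new DOMXPath($doc, false);
$xpathOverride->registerNamespace('ns2', 'http://domain.com/services/entityService/document/1.0');
$xpathOverride->registerNamespace('ns3', 'http://domain.com/services/entityService/document/GetContentUrl_1_0');
$viaOverrides = firstValue(
    $xpathOverride,
    '/ns2:entDocumentResponse/ns2:response/ns2:opGetContentUrl_1_0/ns3:documentContentUrl'
);

echo '[option 1] ', $viaDocumentPrefixes ?? 'Not parsed.', "\n";
echo '[option 2] ', $viaOverrides ?? 'Not parsed.', "\n";
```

With either option, the query should resolve to the documentContentUrl element. The second option is likely the safer choice when the prefixes in incoming documents are not under your control, since the result then depends only on the mappings you register.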

nielsdos closed this as not planned on May 12, 2025
Author

vavra commented May 13, 2025

Quite unexpected behavior. The call to registerNamespace should override the namespaces picked up from the XML document. Perhaps the second parameter shouldn't have a default value - leaving it unset should be marked as deprecated.
Anyway, thanks for the clarification!
