How to Protect Your Web Applications From XXE Attacks

Tuesday, May 8, 2018 | Read Time: 4 min.

XML External Entities (XXE) attacks are now ranked the 4th greatest risk to web applications in the OWASP Top 10.

XML eXternal Entities attack, or XXE for short, is an old XML attack that has received more attention lately, since it was included in the new OWASP Top 10 2017 RC2 at the 4th position (A4:2017-XML External Entities (XXE)). Today we are going to talk a little bit about this attack. But first, let’s figure out what XML files are and what they contain.


XML (eXtensible Markup Language) files were designed to store and transport data. They use custom, user-defined tags to describe data and give it meaning within the application, while keeping the file both human and machine readable. Unlike HTML, which consists of pre-defined tags, XML files can be made of many different tags and elements, depending on which application is going to use them. Basically, an XML file is formed of four building blocks:

  • Document Type Definition (DTD)
  • XML Elements
  • XML Attributes
  • Entities

Document Type Definition (DTD)

According to W3Schools, a DTD defines the structure, the elements and the attributes of an XML document. With a DTD, people can agree on a standard for interchanging data, and an application can use it to verify that an XML document is valid. There are two types of DTDs:

  • Internal DTD - the declarations are written inside the XML file itself, within the <!DOCTYPE> definition.
    Example: <!ENTITY writer "Magno Logan" >
  • External DTD - the declarations live outside the XML file, and the <!DOCTYPE> definition contains a reference to the DTD file.
    Example: <!ENTITY writer SYSTEM "" >

XML Elements

An XML element is the main building block of an XML file. It consists of the opening tag, the closing tag and whatever sits between them. An element can therefore contain text, attributes and sometimes other elements. For example:

  <bookstore>
      <book category="web-security">
          <title>Learning XXE</title>
          <author>Magno Logan</author>
          <year>2018</year>
          <price>49.99</price>
      </book>
  </bookstore>

XML Attributes

Just like the attributes on HTML tags, XML attributes give additional information about a specific element. The same information could also be stored in a child element, but an attribute is more compact here. The category in the bookstore example above is an XML attribute.
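
To make the difference between elements and attributes concrete, here is a minimal sketch that reads the bookstore example with Python's standard-library xml.etree.ElementTree. The function name describe_books and the dictionary layout are only illustrative assumptions, not part of any particular application.

```python
import xml.etree.ElementTree as ET

# The bookstore document from the example above.
BOOKSTORE_XML = """
<bookstore>
    <book category="web-security">
        <title>Learning XXE</title>
        <author>Magno Logan</author>
        <year>2018</year>
        <price>49.99</price>
    </book>
</bookstore>
"""


def describe_books(xml_text):
    """Return one dict per <book>, mixing its attribute and child elements."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            # Attributes are read with .get() on the element itself.
            "category": book.get("category"),
            # Child elements are looked up by tag name.
            "title": book.findtext("title"),
            "author": book.findtext("author"),
            "year": int(book.findtext("year")),
            "price": float(book.findtext("price")),
        })
    return books


if __name__ == "__main__":
    for entry in describe_books(BOOKSTORE_XML):
        print(entry)
```

Note that the attribute is fetched from the book element directly, while title, author, year and price are separate child elements.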


XML Entities

XML entities are very similar to constants in programming languages. They can be internal or external. They are used to define blocks of data that are repeated throughout a document, which makes the document shorter and easier to maintain, and to represent special characters that the XML parser could otherwise misinterpret. There are four types of entities:

  • Named Entity - it is declared in the DTD and referenced by name in the document, for example &writer;
  • Parameter Entity - it is used only inside the DTD itself, typically to group element and attribute declarations in a single entity
  • Character Entity - it is used to specify Unicode characters
  • External Entity - it is used to represent content stored outside the document, such as an external DTD or another file or URL
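
As a small illustration of harmless entities, the sketch below declares the writer entity from the internal DTD example and lets Python's standard-library parser expand it, together with a character reference and the predefined &amp; entity. The note element and the helper name expand_note are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# An internal DTD declaring one named entity, reused in the document body.
INTERNAL_ENTITY_XML = """<!DOCTYPE note [
<!ENTITY writer "Magno Logan">
]>
<note>Written by &writer; &#169; 2018 &amp; friends</note>"""


def expand_note(xml_text):
    """Parse the document and return the text of the root element,
    with internal entities and character references already expanded."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(expand_note(INTERNAL_ENTITY_XML))
```

The parser substitutes the declared value wherever &writer; appears, which is exactly the mechanism that becomes dangerous once the value is allowed to come from an external location.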

I believe that by now you already have a feeling for where the problem is, right? If not, that’s okay too, we’ll explain everything in detail. The XXE attack basically happens because many older XML processors allow the specification of an external entity, a URI that is dereferenced and evaluated during XML processing. This flaw allows attackers to extract data, make requests from the server, scan internal systems and perform DoS attacks, among other things. According to OWASP, there are four risk factors involved, and if your application matches any of them you should take the appropriate measures:

  • The application parses XML documents.
  • Tainted data is allowed within the system identifier portion of the entity, within the document type declaration (DTD).
  • The XML processor is configured to validate and process the DTD.
  • The XML processor is configured to resolve external entities within the DTD.

Let’s analyze the code below so that we can understand the risks and the impact of this issue:

  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
  <foo>&xxe;</foo>

A request with the code above will reflect the contents of the passwd file back to the user. In some cases the application may not reflect the result back, but that does not mean it is not vulnerable; the situation is similar to Blind SQL Injection. But what if we change the code a little bit? Is it possible to probe the server's private network? Yes, it is! Here’s an example of how to do that:

  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "" >]>
  <foo>&xxe;</foo>
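
Whether such a payload works depends entirely on the parser and its configuration. As a rough sketch, the first payload above can be fed to the parser you actually use to see how it reacts. The example below assumes Python's standard-library xml.etree.ElementTree, which according to the Python documentation does not expand external entities and raises a ParseError instead; other parsers and languages may behave differently. The helper name parse_or_reject is hypothetical.

```python
import xml.etree.ElementTree as ET

# The file-disclosure payload from the article.
XXE_PAYLOAD = """<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<foo>&xxe;</foo>"""


def parse_or_reject(xml_text):
    """Return the root element's text, or None if the parser refuses the document."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        # The standard-library parser treats the external entity as undefined.
        return None


if __name__ == "__main__":
    result = parse_or_reject(XXE_PAYLOAD)
    print("rejected" if result is None else "parser returned: " + repr(result))
```

A parser that returned file contents here would be resolving external entities and would need to be reconfigured or replaced.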

Now that we understand the attack and how dangerous it is, let’s discuss how to protect against it. According to the OWASP Top 10, here are a few techniques that would protect your application against this kind of attack:

  • Use less complex data formats such as JSON. If you are creating a new app or service, that’s much easier to do, since APIs and microservices are all shifting to JSON instead of XML.
  • Patch all XML processors and libraries in use by the application. Then perform regression testing to make sure that nothing broke and the application still works properly.
  • Disable XML external entity and DTD processing in all XML parsers in the application, but first make sure your app does not rely on them anywhere.
  • Implement positive ("whitelisting") server-side input validation, filtering, or sanitization to prevent hostile data within XML documents, headers, or nodes. We’ve talked about this in the Proactive Controls articles, Part 1 and Part 2.
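
The third recommendation, disabling DTD processing, can be sketched in a few lines. The example below is a minimal illustration using only Python's standard library: it runs the input through an Expat parser whose DOCTYPE handler raises an exception, and only parses the document for real if no DTD was seen. The names DTDForbidden and parse_untrusted are assumptions made for this sketch; in production code a maintained package such as the third-party defusedxml is likely a better choice than rolling your own.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when an incoming document carries a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Expat calls this handler as soon as it encounters <!DOCTYPE ...>.
    raise DTDForbidden("DOCTYPE declaration is not allowed in this input")


def parse_untrusted(xml_text):
    """Parse XML from an untrusted source, refusing any document with a DTD."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    # The exception raised in the handler propagates out of Parse().
    checker.Parse(xml_text, True)
    # No DTD was found, so no entities can have been declared by the sender.
    return ET.fromstring(xml_text)


if __name__ == "__main__":
    print(parse_untrusted("<foo>ok</foo>").text)
    try:
        parse_untrusted(
            '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
            "<foo>&xxe;</foo>"
        )
    except DTDForbidden as error:
        print("blocked:", error)
```

Rejecting the DTD outright is stricter than only disabling external entities, which is why you should first confirm that no legitimate client sends documents with a DOCTYPE.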

For a quick guide on language-specific scenarios and code, please check the XML External Entity (XXE) Prevention Cheat Sheet. Also, if these controls are not entirely possible in your scenario, consider using a Web Application Firewall (WAF) with Virtual Patching or API Gateways to monitor, prevent and block XXE attacks. Don’t forget to check out our free Application Discovery service, and stay tuned for our next posts.
