@@ -15,12 +15,10 @@ XML Processing Modules
 
 Python's interfaces for processing XML are grouped in the ``xml`` package.
 
-.. warning::
+.. note::
 
-   The XML modules are not secure against erroneous or maliciously
-   constructed data. If you need to parse untrusted or
-   unauthenticated data see the :ref:`xml-vulnerabilities` and
-   :ref:`defusedxml-package` sections.
+   If you need to parse untrusted or unauthenticated data, see
+   :ref:`xml-security`.
 
 It is important to note that modules in the :mod:`xml` package require that
 there be at least one SAX-compliant XML parser available. The Expat parser is
@@ -47,46 +45,22 @@ The XML handling submodules are:
 * :mod:`xml.parsers.expat`: the Expat parser binding
 
 
+.. _xml-security:
 .. _xml-vulnerabilities:
 
-XML vulnerabilities
--------------------
+XML security
+------------
 
-The XML processing modules are not secure against maliciously constructed data.
 An attacker can abuse XML features to carry out denial of service attacks,
 access local files, generate network connections to other machines, or
 circumvent firewalls.
 
-The following table gives an overview of the known attacks and whether
-the various modules are vulnerable to them.
-
-=========================  ==================  ==================  ==================  ==================  ==================
-kind                       sax                 etree               minidom             pulldom             xmlrpc
-=========================  ==================  ==================  ==================  ==================  ==================
-billion laughs             **Vulnerable** (1)  **Vulnerable** (1)  **Vulnerable** (1)  **Vulnerable** (1)  **Vulnerable** (1)
-quadratic blowup           **Vulnerable** (1)  **Vulnerable** (1)  **Vulnerable** (1)  **Vulnerable** (1)  **Vulnerable** (1)
-external entity expansion  Safe (5)            Safe (2)            Safe (3)            Safe (5)            Safe (4)
-`DTD`_ retrieval           Safe (5)            Safe                Safe                Safe (5)            Safe
-decompression bomb         Safe                Safe                Safe                Safe                **Vulnerable**
-large tokens               **Vulnerable** (6)  **Vulnerable** (6)  **Vulnerable** (6)  **Vulnerable** (6)  **Vulnerable** (6)
-=========================  ==================  ==================  ==================  ==================  ==================
-
-1. Expat 2.4.1 and newer is not vulnerable to the "billion laughs" and
-   "quadratic blowup" vulnerabilities. Items still listed as vulnerable due to
-   potential reliance on system-provided libraries. Check
-   :const:`!pyexpat.EXPAT_VERSION`.
-2. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a
-   :exc:`~xml.etree.ElementTree.ParseError` when an entity occurs.
-3. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns
-   the unexpanded entity verbatim.
-4. :mod:`xmlrpc.client` doesn't expand external entities and omits them.
-5. Since Python 3.7.1, external general entities are no longer processed by
-   default.
-6. Expat 2.6.0 and newer is not vulnerable to denial of service
-   through quadratic runtime caused by parsing large tokens.
-   Items still listed as vulnerable due to
-   potential reliance on system-provided libraries. Check
-   :const:`!pyexpat.EXPAT_VERSION`.
+Expat versions lower than 2.6.0 may be vulnerable to "billion laughs",
+"quadratic blowup" and "large tokens". Python may be vulnerable if it uses such
+older versions of Expat as a system-provided library.
+Check :const:`!pyexpat.EXPAT_VERSION`.
+
+:mod:`xmlrpc` is **vulnerable** to the "decompression bomb" attack.
 
 
 billion laughs / exponential entity expansion
@@ -103,16 +77,6 @@ quadratic blowup entity expansion
   efficient as the exponential case but it avoids triggering parser countermeasures
   that forbid deeply nested entities.
 
-external entity expansion
-  Entity declarations can contain more than just text for replacement. They can
-  also point to external resources or local files. The XML
-  parser accesses the resource and embeds the content into the XML document.
-
-`DTD`_ retrieval
-  Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type
-  definitions from remote or local locations. The feature has similar
-  implications as the external entity expansion issue.
-
 decompression bomb
   Decompression bombs (aka `ZIP bomb`_) apply to all XML libraries
   that can parse compressed XML streams such as gzipped HTTP streams or
@@ -126,21 +90,5 @@ large tokens
   be used to cause denial of service in the application parsing XML.
   The issue is known as :cve:`2023-52425`.
 
-The documentation for :pypi:`defusedxml` on PyPI has further information about
-all known attack vectors with examples and references.
-
-.. _defusedxml-package:
-
-The :mod:`!defusedxml` Package
-------------------------------
-
-:pypi:`defusedxml` is a pure Python package with modified subclasses of all stdlib
-XML parsers that prevent any potentially malicious operation. Use of this
-package is recommended for any server code that parses untrusted XML data. The
-package also ships with example exploits and extended documentation on more
-XML exploits such as XPath injection.
-
-
 .. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs
 .. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
-.. _DTD: https://en.wikipedia.org/wiki/Document_type_definition
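
Note on the new "Check ``pyexpat.EXPAT_VERSION``" sentence: a minimal sketch of what such a check might look like in user code is given here for illustration only; it is not part of this change. The helper names `expat_version_tuple` and `expat_is_patched` are hypothetical, and the sketch assumes the version string contains a dotted `X.Y.Z` number (for example `expat_2.6.0`). The 2.6.0 threshold is the one stated in the added text.

```python
import re

import pyexpat


def expat_version_tuple(version_string):
    """Extract (major, minor, micro) from a string such as 'expat_2.6.0'.

    Assumption: the string contains a dotted X.Y.Z version number.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized Expat version: {version_string!r}")
    return tuple(int(part) for part in match.groups())


def expat_is_patched(version_string=pyexpat.EXPAT_VERSION, minimum=(2, 6, 0)):
    """Return True if the Expat version is at least `minimum`."""
    return expat_version_tuple(version_string) >= minimum


if __name__ == "__main__":
    # Reports the Expat library this interpreter is actually linked against.
    print(pyexpat.EXPAT_VERSION, "patched:", expat_is_patched())
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where `"2.10.0"` would sort before `"2.6.0"`.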