Deserialization

pickle, marshal, yaml — unsafe deserialization

All: 21, Source: 4, Sink: 12, Sanitizer: 3
PyAst
ast

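The ast module exposes Python's abstract syntax tree. ast.literal_eval evaluates only literal expressions (strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None), so it cannot run code, although very large or deeply nested input can still exhaust memory or the stack. The builtins eval() and exec() execute arbitrary Python code and are RCE sinks on user input. compile() produces code objects that can then be passed to exec() or eval().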

3 sinks, 1 sanitizer, 5 methods
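
A minimal sketch of using ast.literal_eval as the sanitizer in place of eval(). The helper name parse_literal is hypothetical and not part of the SDK; it only shows that literal input is parsed while an expression containing a call is rejected.

```python
import ast


def parse_literal(text):
    """Parse untrusted text as a Python literal; reject anything else.

    parse_literal is a hypothetical helper used for illustration only.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError) as exc:
        # literal_eval refuses calls, attribute access, names, and so on.
        raise ValueError(f"not a plain literal: {text!r}") from exc


# A literal is parsed into ordinary Python data.
config = parse_literal("[1, 2, {'a': 3}]")

# An expression that eval() would execute is rejected instead.
try:
    parse_literal("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```

The same string passed to eval() would import os and call getcwd(), which is why eval() and exec() are listed as sinks.
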
PyCsv
csv

The csv module reads and writes CSV files. Passing user-controlled cells to csv.writer's writerow() enables CSV formula injection when the recipient opens the file in a spreadsheet application such as Excel, which interprets cells starting with =, +, -, or @ as formulas. The stdlib provides no sanitizer; prefix such cells with a tab or an apostrophe.

2 sources, 4 methods
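
Since the stdlib has no sanitizer for formula injection, one possible approach is a small wrapper that prefixes risky cells with an apostrophe before they reach writerow(). This is a sketch under assumptions: neutralize_cell and write_rows are hypothetical names, and the prefix list is the one given in the description above.

```python
import csv
import io

# Leading characters that spreadsheet applications treat as formula starts.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralize_cell(value):
    """Return the cell as text, prefixed with an apostrophe if it looks like a formula."""
    text = str(value)
    if text.startswith(FORMULA_PREFIXES):
        return "'" + text
    return text


def write_rows(rows):
    """Write rows to an in-memory CSV, neutralizing every cell first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return buffer.getvalue()


output = write_rows([["name", "note"], ["alice", "=1+1"]])
```

Note that this simple rule also prefixes legitimate values such as negative numbers ("-5"), so a real pipeline may want to treat numeric columns separately.
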
PyDbm
dbm

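The dbm family (dbm.gnu, dbm.ndbm, dbm.dumb). dbm.open() on an untrusted file reads a DBM-format database. In current CPython, dbm.dumb parses its index file with ast.literal_eval, so it does not execute code the way pickle does, but a sufficiently large or complex entry can crash the interpreter; treat it as unsafe on untrusted input.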

1 sink, 1 method
PyDefusedXml
defusedxml

defusedxml is a hardened XML parser suite. It provides counterparts for xml.etree, xml.sax, xml.dom, lxml, and others with external-entity resolution disabled. Using the defusedxml counterparts is the recommended sanitizer for XML sources.

2 sanitizers, 2 methods
PyJson
json

The json module for JSON encode / decode. Unlike pickle, json is safe by default: it parses only primitives, lists, and dicts. It is still worth documenting because json.loads is a common source entry point, and json.dumps output embedded in an HTML response without escaping is a common origin of reflected XSS.

2 sources, 4 methods
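
The two roles described for json can be sketched as follows: json.loads as the point where untrusted data enters, and json.dumps as the point where that data can break out of an HTML script context. The helper names are hypothetical, and the escaping shown is one common mitigation rather than anything the SDK prescribes.

```python
import json

# Characters that can terminate or alter an inline <script> context.
_HTML_UNSAFE = {"<": "\\u003c", ">": "\\u003e", "&": "\\u0026"}


def parse_request_body(raw):
    """Source: everything returned here is attacker-controlled data."""
    return json.loads(raw)


def dumps_for_script_tag(value):
    """Serialize to JSON and replace HTML-significant characters with \\uXXXX escapes.

    The result is still valid JSON that decodes to the same value.
    """
    encoded = json.dumps(value)
    for char, escape in _HTML_UNSAFE.items():
        encoded = encoded.replace(char, escape)
    return encoded


payload = parse_request_body('{"q": "</script><script>alert(1)</script>"}')
embedded = dumps_for_script_tag(payload)
```

Plain json.dumps(payload) would keep the literal "</script>" sequence, which closes an inline script block when written into a page unescaped.
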
PyLxml
lxml

lxml is the C-backed XML / HTML parser. etree.parse / fromstring with a custom XMLParser(resolve_entities=True) is an XXE sink. Default behavior in recent lxml is safer but the API still allows unsafe configurations.

2 sinks, 4 methods
PyMarshal
marshal

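The marshal module serializes Python's internal objects, mainly the code objects stored in .pyc files. Its documentation warns that it is not secure against erroneous or maliciously constructed data: marshal.load() / marshal.loads() can crash the interpreter on crafted bytes and can return code objects that are later executed, so both are unsafe on untrusted data. The module is documented, but its format is version-specific and not intended for general persistence.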

2 sinks, 2 methods
PyPickle
pickle

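The pickle module for Python object serialization. pickle.load() and pickle.loads() can execute arbitrary code during deserialization: a payload's __reduce__ may name any importable callable, and that callable runs while the data is loaded. Always unsafe with untrusted input. Use json, or authenticate payloads with a signature (for example HMAC) before loading them.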

3 sinks, 3 methods
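
To illustrate why pickle.loads is a sink, the sketch below builds a payload whose __reduce__ makes the loader call a function chosen by the payload author. A harmless builtin (sorted) stands in for what would be os.system or similar in an attack. The second half sketches the "signed payloads" alternative with JSON plus HMAC; SECRET_KEY, sign, and verify are hypothetical names for illustration.

```python
import hashlib
import hmac
import json
import pickle


class Payload:
    def __reduce__(self):
        # The loader will call sorted([3, 1, 2]); an attacker would pick
        # a dangerous callable here instead of a harmless builtin.
        return (sorted, ([3, 1, 2],))


crafted = pickle.dumps(Payload())
result = pickle.loads(crafted)  # the call happens during loading

# Hypothetical demo key; a real key must come from secret storage.
SECRET_KEY = b"demo-key-not-for-production"


def sign(data):
    """Serialize data as JSON and prepend a hex HMAC-SHA256 tag."""
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + body


def verify(token):
    """Check the tag in constant time before parsing the JSON body."""
    tag, _, body = token.partition(b".")
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return json.loads(body)


token = sign({"user": "alice"})
```

Loading the crafted bytes returns the result of the call rather than a Payload instance, which shows that the bytes, not the receiving code, decide what runs.
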
PyPickletools
pickletools

Python stdlib module — pickletools. Auto-indexed from CDN. Method-level security roles have not been annotated; rule writers should inspect the source before use.

10 methods
PyPlistlib
plistlib

Python stdlib module — plistlib. Auto-indexed from CDN. Method-level security roles have not been annotated; rule writers should inspect the source before use.

9 methods
PyPyasn1
pyasn1

pyasn1 decodes ASN.1 structures. der_decoder.decode() on untrusted DER bytes can trigger denial-of-service via deep nesting. Typically used in certificate / LDAP contexts.

1 sink, 1 method
PyPyexpat
pyexpat

Python stdlib module — pyexpat. Auto-indexed from CDN. Method-level security roles have not been annotated; rule writers should inspect the source before use.

5 methods
PyShelve
shelve

The shelve module persists arbitrary Python objects — backed by pickle under the hood. shelve.open() on untrusted files is a deserialization sink (RCE via pickle's __reduce__).

1 sink, 1 method
PySimplejson
simplejson

Third-party Python package module — simplejson. Auto-indexed from CDN. Method-level security roles have not been annotated; rule writers should inspect the source before use.

10 methods
PyToml
toml

toml parses TOML configuration. toml.load() is a neutral data loader — values become sources when the config file is user-supplied. tomllib (stdlib, 3.11+) is the modern replacement.

2 sources, 2 methods
PyXmlDom
xml.dom

xml.dom.minidom for DOM-style XML parsing. Built on pyexpat which by default does not resolve external entities, but custom resolvers can reintroduce XXE. defusedxml.minidom is the hardened replacement.

2 sinks, 2 methods
PyXmlEtree
xml.etree.ElementTree

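xml.etree.ElementTree is the stdlib XML parser. According to the Python documentation, it does not expand external entities and raises ParseError when one occurs, but it is exposed to entity-expansion denial of service (billion laughs, quadratic blowup) when linked against Expat older than 2.4.1. Treat it as a sink for untrusted XML and prefer defusedxml.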

3 sinks, 3 methods
PyXmlSax
xml.sax

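xml.sax is the stdlib SAX parser. Before Python 3.7.1 it resolved external general entities by default, making it an XXE sink on untrusted XML; since 3.7.1 they are off by default, but parser.setFeature(feature_external_ges, True) turns them back on. Keep the feature disabled with parser.setFeature(feature_external_ges, False) or use defusedxml.sax.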

3 sinks, 3 methods
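
A minimal sketch of the setFeature call named in the xml.sax description, setting feature_external_ges to False explicitly so the parser's behavior does not depend on the interpreter version. TextCollector and parse_untrusted are hypothetical names for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects character data so the example has something to return."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_untrusted(xml_bytes):
    """Parse XML bytes with external general entities explicitly disabled."""
    parser = xml.sax.make_parser()
    # Do not fetch external general entities, regardless of the default.
    parser.setFeature(feature_external_ges, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


text = parse_untrusted(b"<a>hi<b>there</b></a>")
```

This covers external entity resolution only; it does not address entity-expansion attacks, which is one reason defusedxml.sax remains the suggested replacement.
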
PyXmlrpc
xmlrpc.client

Covers xmlrpc.client and xmlrpc.server. A ServerProxy call invokes whatever method name it is given on the remote server, so dispatching on untrusted method names is a sink. Using ServerProxy over HTTP instead of HTTPS transmits credentials in plaintext.

2 sinks, 2 methods
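
One way to keep method dispatch from becoming a sink is to expose only explicitly registered functions. The sketch below uses SimpleXMLRPCDispatcher from xmlrpc.server without starting a network server; add and call are hypothetical names, and _dispatch is the dispatcher's own lookup method, called directly here for illustration.

```python
from xmlrpc.server import SimpleXMLRPCDispatcher


def add(a, b):
    """The only operation this example exposes."""
    return a + b


dispatcher = SimpleXMLRPCDispatcher(allow_none=False, encoding=None)
# Only registered names are callable; no instance is registered, so there
# is no attribute lookup driven by the incoming method name.
dispatcher.register_function(add, "add")


def call(method_name, params):
    """Dispatch a request the way the server would, using the allowlist."""
    return dispatcher._dispatch(method_name, params)


total = call("add", (2, 3))

try:
    call("os.system", ("id",))
    blocked = False
except Exception:
    # Unregistered method names are refused by the dispatcher.
    blocked = True
```

Registering a whole object with register_instance widens the set of reachable methods, so an explicit list of functions is the more conservative choice.
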
PyXmltodict
xmltodict

xmltodict parses XML into nested dicts (uses expat under the hood). Entity expansion is disabled by default, but the module's parse() still exposes untrusted XML to the app. Not a full XXE defense.

1 source, 1 method
PyYaml
yaml

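PyYAML is the de facto standard YAML library for Python. yaml.load() with Loader=yaml.Loader or Loader=yaml.UnsafeLoader can instantiate arbitrary Python objects and is an RCE sink on untrusted input. PyYAML 6.0 and later require an explicit Loader argument; older releases fell back to a default loader when it was omitted. Use yaml.safe_load() or Loader=yaml.SafeLoader instead.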

3 sinks, 2 sanitizers, 5 methods