Insecure MD5 Hash (PyCryptodome)

MEDIUM

MD5 is cryptographically broken due to practical collision attacks since 2004. Use SHA-256 or SHA-3 via PyCryptodome instead.

Rule Information

Language: Python
Category: Cryptography
Author: Shivasurya
Last Updated: 2026-03-22
Tags: python, pycryptodome, md5, weak-hash, CWE-327, OWASP-A02
CWE References

Interactive Playground

Experiment with the vulnerable code and security rule below. Edit the code to see how the rule detects different vulnerability patterns.

pathfinder scan --ruleset python/PYTHON-CRYPTO-SEC-012 --project .

About This Rule

Understanding the vulnerability and how it is detected

Detects use of MD5 through the `Crypto.Hash.MD5.new()` or `Cryptodome.Hash.MD5.new()` constructor in PyCryptodome or the legacy PyCrypto library. MD5 produces a 128-bit digest and has been considered cryptographically broken since 2004, when Wang et al. demonstrated practical collision attacks; chosen-prefix collisions followed in 2007. In 2012 the Flame malware was found to have used an MD5 chosen-prefix collision to forge a Microsoft code-signing certificate, which let it pose as Windows Update and run attacker code on fully patched systems.

PyCryptodome is the maintained successor to the original PyCrypto library and is commonly used in Python projects for cryptographic operations. This rule covers both `Crypto.Hash.MD5` (the `pycryptodome` package, a drop-in replacement for PyCrypto) and `Cryptodome.Hash.MD5` (the standalone `pycryptodomex` package).

MD5 must not be used for digital signatures, data integrity verification, password hashing, or HMAC-based authentication. It remains acceptable for non-security checksums such as cache keys, file deduplication identifiers, or content-addressable storage where an attacker producing a collision confers no security benefit.

Security Implications

Potential attack scenarios if this vulnerability is exploited

How to Fix

Recommended remediation steps

  1. Replace `Crypto.Hash.MD5.new()` (or `Cryptodome.Hash.MD5.new()`) with the matching `SHA256.new()` constructor for all integrity and authentication use cases.
  2. For password hashing, do not use any raw hash function, including SHA-256; use Argon2 (argon2-cffi), bcrypt, or scrypt, which are deliberately slow and, in the case of Argon2 and scrypt, memory-intensive.
  3. For message authentication, use `Crypto.Hash.HMAC` with SHA-256 as the digest module instead of bare MD5.
  4. MD5 may remain in non-security contexts (cache keys, deduplication) where collision resistance carries no security consequence; add an explicit comment documenting this intent.
  5. When migrating stored MD5 checksums (e.g., in a database), rehash with SHA-256 on next verified access and deprecate the MD5 code path with a sunset date.

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

Matches any call to `PyCryptoHashMD5.method("new")`, where `PyCryptoHashMD5` is a QueryType that resolves the fully qualified names `Crypto.Hash.MD5` and `Cryptodome.Hash.MD5`. This covers both the PyCryptodome drop-in compatibility namespace (`Crypto.*`) and the standalone namespace (`Cryptodome.*`). The rule fires on the `.new()` constructor invocation, which is the standard PyCryptodome pattern for instantiating hash objects.

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

OWASP Top 10: A02:2021 - Cryptographic Failures
PCI DSS v4.0: Requirement 4.2.1 - use strong cryptography
NIST SP 800-131A: MD5 and SHA-1 disallowed for digital signatures
NIST SP 800-53: SC-13 - Cryptographic Protection

References

External resources and documentation

Similar Rules

Explore related security rules for Python

Frequently Asked Questions

Common questions about Insecure MD5 Hash (PyCryptodome)

MD5 is acceptable for non-security purposes such as cache keys, file deduplication, or content-addressable storage where an attacker benefiting from a collision is not a concern. It must not be used for signatures, integrity verification, password hashing, or authentication in security contexts.
PYTHON-CRYPTO-SEC-010 targets MD5 used via the `cryptography` library hazmat interface (hashes.MD5()). This rule (PYTHON-CRYPTO-SEC-012) targets MD5 in PyCryptodome/PyCrypto (Crypto.Hash.MD5.new() or Cryptodome.Hash.MD5.new()). The underlying weakness is identical; the rules differ by library.
PyCryptodome includes MD5 for legacy compatibility and protocol support, not because it is safe for new security-sensitive applications. The presence of an algorithm in a library does not imply it is recommended.
SHA-256 is fast by design, which is a property attackers exploit for brute-force and dictionary attacks. Use Argon2id (argon2-cffi), bcrypt, or scrypt, which are deliberately slow and memory-intensive to limit attacker throughput.
This rule specifically matches calls to `Crypto.Hash.MD5.new()` and `Cryptodome.Hash.MD5.new()`. If MD5 is used indirectly inside a higher-level PyCryptodome KDF, detecting it would require a separate rule targeting that KDF.
Run `code-pathfinder scan --ruleset python/cryptography/PYTHON-CRYPTO-SEC-012 --path ./src` in your pipeline. Add `--format sarif` to produce SARIF output compatible with GitHub Advanced Security and similar platforms.
The `cryptography` library is generally recommended for new Python projects due to its active maintenance and explicit hazmat separation. PyCryptodome is well-maintained and widely used, but either library is acceptable when strong algorithms (SHA-256, AES-GCM) are used correctly.
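To make the password-hashing answer above concrete, here is a minimal sketch using scrypt from Python's standard library (`hashlib.scrypt`) rather than a raw hash. The function names and cost parameters are illustrative assumptions and should be tuned for the deployment; Argon2id via argon2-cffi or bcrypt would be structured similarly.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions; tune for your hardware).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the salt is random per password."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected)
```

The per-password salt and the constant-time comparison are the parts that carry over regardless of which slow KDF is chosen.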
