Insecure MD5 Hash (cryptography)

MEDIUM

MD5 has been cryptographically broken by collision attacks since 2004. Use SHA-256 or SHA-3 instead.

Rule Information

Language: Python
Category: Cryptography
Author: Shivasurya
Last Updated: 2026-03-22
Tags: python, cryptography, md5, weak-hash, CWE-327, OWASP-A02
CWE References: CWE-327

pathfinder scan --ruleset python/PYTHON-CRYPTO-SEC-010 --project .

About This Rule

Understanding the vulnerability and how it is detected

Detects usage of MD5 via the `cryptography` library's hazmat primitives interface (`hashes.MD5()`). MD5 produces a 128-bit digest and has been considered cryptographically broken since 2004, when Wang et al. demonstrated practical collision attacks. In 2008, researchers forged a rogue CA certificate using an MD5 chosen-prefix collision. Today, identical-prefix MD5 collisions can be produced in seconds on commodity hardware.

MD5 must not be used for digital signatures, certificate validation, HMAC-based authentication, or data integrity verification in security contexts. It remains acceptable for non-security purposes such as cache keys, file deduplication, or content-addressable storage where collision resistance is not a security requirement.

This rule specifically targets instantiation of `cryptography.hazmat.primitives.hashes.MD5`. The hazmat (Hazardous Materials) layer assumes the caller understands the risks, but MD5 is dangerous regardless of the API used to reach it.
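
As a minimal sketch of the pattern described above, assuming the `cryptography` package is installed in a version where the `backend` argument is optional, the snippet below instantiates both algorithms through the hazmat interface and compares their digest sizes. The helper name `digest` is illustrative and not part of the rule or the library.

```python
from cryptography.hazmat.primitives import hashes


def digest(algorithm: hashes.HashAlgorithm, data: bytes) -> bytes:
    """Hash `data` with the given algorithm object (hypothetical helper)."""
    hasher = hashes.Hash(algorithm)
    hasher.update(data)
    return hasher.finalize()


# Flagged: instantiating hashes.MD5() is the pattern this rule describes.
weak = digest(hashes.MD5(), b"release-artifact")

# Preferred: SHA-256 through the same hazmat interface.
strong = digest(hashes.SHA256(), b"release-artifact")

# Digest sizes in bits: 128 for MD5, 256 for SHA-256.
print(len(weak) * 8, len(strong) * 8)
```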

How to Fix

Recommended remediation steps

  1. Replace `hashes.MD5()` with `hashes.SHA256()` or `hashes.SHA3_256()` for all integrity and signing use cases.
  2. For password hashing, do not use any raw hash function; use a memory-hard KDF such as Argon2 (argon2-cffi), bcrypt, or scrypt instead.
  3. For HMAC authentication, use HMAC with SHA-256 or SHA-3 (`cryptography.hazmat.primitives.hmac` with `hashes.SHA256()`).
  4. MD5 may remain in place for purely non-security uses (cache keys, file deduplication) where collision resistance carries no security consequence; document this explicitly.
  5. When migrating existing MD5-hashed data (e.g., stored checksums), re-hash with SHA-256 on first verified access and deprecate the MD5 path.
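
The integrity, HMAC, and migration steps above could be sketched roughly as follows, assuming the `cryptography` package is installed. All function names and the `record` dictionary layout are hypothetical, chosen only for illustration; the legacy branch uses `hashlib.md5` solely to check old checksums before upgrading them.

```python
import hashlib
from hmac import compare_digest  # stdlib constant-time comparison

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes, hmac


def sha256_digest(data: bytes) -> bytes:
    """SHA-256 in place of MD5 for integrity checks."""
    hasher = hashes.Hash(hashes.SHA256())
    hasher.update(data)
    return hasher.finalize()


def hmac_sha256_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC tag using SHA-256."""
    mac = hmac.HMAC(key, hashes.SHA256())
    mac.update(message)
    return mac.finalize()


def hmac_sha256_verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Return True if `tag` is a valid HMAC-SHA-256 tag for `message`."""
    mac = hmac.HMAC(key, hashes.SHA256())
    mac.update(message)
    try:
        mac.verify(tag)  # raises InvalidSignature on mismatch
        return True
    except InvalidSignature:
        return False


def verify_and_upgrade(record: dict, data: bytes) -> bool:
    """Check a stored checksum and upgrade legacy MD5 records on success.

    `record` is an assumed layout: {"algo": "md5" | "sha256", "digest": bytes}.
    """
    if record["algo"] == "sha256":
        return compare_digest(record["digest"], sha256_digest(data))
    # Legacy path: verify against the old MD5 checksum one last time.
    if not compare_digest(record["digest"], hashlib.md5(data).digest()):
        return False
    # Verified: re-hash with SHA-256 so the MD5 path is no longer needed.
    record["algo"] = "sha256"
    record["digest"] = sha256_digest(data)
    return True
```

In this sketch a record is upgraded in place the first time its data verifies, so the MD5 branch can be removed once no `"md5"` records remain.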

Detection Scope

How Code Pathfinder analyzes your code for this vulnerability

Matches any call to `CryptoHashes.method("MD5")` where `CryptoHashes` is a QueryType resolving fully-qualified names under `cryptography.hazmat.primitives.hashes`. This catches `hashes.MD5()` regardless of how the `hashes` module is imported or aliased. The rule fires on instantiation of the MD5 hash object, not on specific method calls made on the resulting digest object.
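
To illustrate the import forms mentioned above, the following sketch (assuming the `cryptography` package is installed) shows three ways of importing the `hashes` module that all resolve to the same `MD5` class. Based on the description, each instantiation would be expected to match; this has not been verified against the tool, and the alias names are arbitrary.

```python
import cryptography.hazmat.primitives.hashes as crypto_hashes
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives import hashes as h

# All three names refer to the same class under
# cryptography.hazmat.primitives.hashes, whatever the local alias is.
variants = [hashes.MD5, h.MD5, crypto_hashes.MD5]
same_class = all(v is hashes.MD5 for v in variants)

# The instantiation itself is the reported pattern, not later
# update()/finalize() calls on a hash object built from it.
algorithm = h.MD5()

print(same_class, algorithm.name, algorithm.digest_size)
```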

Compliance & Standards

Industry frameworks and regulations that require detection of this vulnerability

OWASP Top 10: A02:2021 - Cryptographic Failures
PCI DSS v4.0: Requirement 4.2.1 - use strong cryptography
NIST SP 800-131A: MD5 and SHA-1 disallowed for digital signatures
NIST SP 800-53: SC-13 (Cryptographic Protection)

Frequently Asked Questions

Common questions about Insecure MD5 Hash (cryptography)

MD5 remains safe for non-security checksums such as file deduplication, cache invalidation keys, or content-addressable storage, where a collision gives an attacker no advantage. It must not be used for digital signatures, certificate hashing, HMAC, password storage, or any context where collision resistance matters.
SHA-256 is not a suitable replacement for password hashing. Like all raw hash functions, it is designed to be fast, and speed helps an attacker performing brute-force or dictionary attacks. Password hashing requires a deliberately slow, memory-hard function: use Argon2, bcrypt, or scrypt. PBKDF2 with SHA-256 is acceptable when Argon2 is unavailable, but should use at least 600,000 iterations, as recommended by the OWASP Password Storage Cheat Sheet.
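
A minimal sketch of password hashing with scrypt, using only the Python standard library (`hashlib.scrypt` requires a Python build linked against OpenSSL 1.1 or later). The cost parameters and function names are illustrative assumptions, not recommendations from this rule; tune them for your own hardware.

```python
import hashlib
import hmac
import os

# Illustrative scrypt cost parameters (assumed values, about 16 MiB of memory).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def _derive(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from the password with scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        dklen=32,
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the salt is random per password."""
    salt = os.urandom(16)
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key and compare it in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)
```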
If MD5 is mandated by an external specification you cannot change, document it clearly, isolate the usage, and add compensating controls (e.g., an outer integrity layer using SHA-256 HMAC). Flag the dependency for removal when the protocol allows migration.
This rule does not detect `hashlib.md5()`. It targets only the `cryptography` library's hazmat primitives; for `hashlib.md5()` detection, see the hashlib-specific rules in this ruleset.
To run this rule in CI, run `code-pathfinder scan --ruleset python/cryptography/PYTHON-CRYPTO-SEC-010 --path ./src` in your pipeline. Add `--format sarif` to produce SARIF output compatible with GitHub Advanced Security and similar platforms.
The MEDIUM severity reflects that MD5's risk is context-dependent: collision attacks are practical but require attacker interaction at the point of signing or hashing. Rules targeting MD4 and MD2 are rated HIGH because those algorithms offer no practical security even in constrained scenarios.
The hazmat MD5 interface can technically be used for non-security checksums, but doing so adds unnecessary complexity. Prefer `hashlib.md5()` for checksums to make the non-security intent explicit. The hazmat interface signals cryptographic use, which increases the chance of future misuse.
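
For a non-security checksum, one way to make the intent explicit is the `usedforsecurity=False` keyword that `hashlib` constructors accept in Python 3.9 and later. The sketch below assumes that Python version; `cache_key` is a hypothetical helper name.

```python
import hashlib


def cache_key(content: bytes) -> str:
    """Non-security fingerprint for cache lookups (hypothetical helper)."""
    # usedforsecurity=False states that this digest is not relied on
    # for any security property.
    return hashlib.md5(content, usedforsecurity=False).hexdigest()


print(cache_key(b"hello"))  # 5d41402abc4b2a76b9719d911017c592
```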
