pathfinder scan --ruleset python/PYTHON-LANG-SEC-030 --project .
About This Rule
Understanding the vulnerability and how it is detected
MD5 (Message Digest Algorithm 5) was once widely used for cryptographic hashing but is now considered cryptographically broken. Practical collision attacks against MD5 were demonstrated in 2004. Identical-prefix collisions can now be generated in seconds on consumer hardware, and chosen-prefix collisions are also feasible on commodity hardware. MD5 should not be used for any security-sensitive purpose.
MD5 is unsuitable for: digital signatures (collision attacks allow forging signatures), certificate fingerprinting (collisions allow creating malicious certificates with the same fingerprint), password storage (MD5 is so fast that rainbow tables and GPU cracking can recover many hashed passwords in seconds), and file integrity verification when the attacker can choose file content.
MD5 remains suitable for non-security purposes such as checksums to detect accidental corruption (not adversarial modification), content-addressed storage keys where collision resistance is not required, and hash table keys. However, it must never be used where an adversary could craft colliding inputs.
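A minimal sketch of a non-security checksum, assuming Python 3.9 or later, where hashlib constructors accept the keyword-only `usedforsecurity` argument. The function name `corruption_checksum` is hypothetical. Passing `usedforsecurity=False` documents the intent in the code itself; whether this rule treats such calls differently is not stated here, so the call may still need review.

```python
import hashlib


def corruption_checksum(data: bytes) -> str:
    """Return an MD5 hex digest used only to detect accidental corruption.

    This must not be relied on where an adversary could craft colliding inputs.
    """
    # usedforsecurity=False (Python 3.9+) marks this as a non-security use.
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


if __name__ == "__main__":
    print(corruption_checksum(b"hello"))
```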
Security Implications
Potential attack scenarios if this vulnerability is exploited
Collision Attacks
An attacker can generate two different inputs with the same MD5 hash in seconds to minutes on modern hardware. This allows forging digital signatures, creating malicious files that match expected checksums, and bypassing integrity checks.
Password Cracking
MD5 is extremely fast (billions of hashes per second on GPUs) and has no salt or stretching by default. MD5-hashed passwords are trivially cracked using rainbow tables, dictionary attacks, or brute force.
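As a contrast to bare MD5, the sketch below stores passwords with a per-password random salt and a deliberately slow key-derivation function from the standard library. The helper names and the `ITERATIONS` value are assumptions for illustration; the work factor should be tuned to your own hardware and policy.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration; tune for your environment.
ITERATIONS = 600_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a slow, salted hash. Returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the derived key and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

The salt defeats precomputed rainbow tables, and the iteration count makes each guess far more expensive than a single MD5 call.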
Certificate Forgery
MD5 collisions have been used to forge X.509 certificates and create rogue CA certificates. Applications using MD5 for certificate fingerprinting or verification can be deceived by crafted certificates.
Integrity Check Bypass
File or message integrity checks using MD5 can be bypassed by an attacker who can influence the content being hashed. The attacker crafts a malicious file that has the same MD5 hash as the legitimate file.
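A hedged sketch of an integrity check that does not depend on MD5: the file is streamed through SHA-256 and the result is compared to a trusted digest. The function names and chunk size are hypothetical, and the expected digest is assumed to come from a channel the attacker cannot modify.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest to a trusted value in constant time."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```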
How to Fix
Recommended remediation steps
1. Replace hashlib.md5() with hashlib.sha256() or hashlib.sha3_256() for all security-sensitive hashing operations.
2. For password hashing specifically, use hashlib.pbkdf2_hmac(), bcrypt, scrypt, or argon2; never use bare MD5 or any other fast hash.
3. For file integrity verification against adversarial modification, use SHA-256 or SHA-512.
4. Reserve MD5 only for non-security checksums where collision resistance is not a security requirement, and document why it is acceptable.
5. Audit all places where MD5 digests are compared to expected values to determine if they are security-sensitive.
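The first step above can be sketched as a drop-in change. The function name `fingerprint` is hypothetical. Note that the digest length changes (32 hex characters for MD5, 64 for SHA-256 and SHA3-256), so stored digests and any fixed-width fields that hold them may need migrating.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex digest suitable for security-sensitive comparison."""
    # Before: return hashlib.md5(data).hexdigest()
    return hashlib.sha256(data).hexdigest()


def fingerprint_sha3(data: bytes) -> str:
    """Alternative using SHA3-256, which has the same 256-bit output size."""
    return hashlib.sha3_256(data).hexdigest()
```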
Detection Scope
How Code Pathfinder analyzes your code for this vulnerability
This rule detects calls to hashlib.md5() in Python source code. Every call site is flagged, because deciding whether a particular use is security-sensitive requires human review. Non-security uses may be suppressed, with documentation explaining why the use is acceptable.
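To make the idea of flagging call sites concrete, here is an illustrative sketch built on Python's standard `ast` module. It is not Code Pathfinder's implementation, and the function name `find_md5_calls` is hypothetical. It only matches calls written literally as `hashlib.md5(...)` and would miss aliased imports such as `from hashlib import md5`.

```python
import ast


def find_md5_calls(source: str) -> list:
    """Return sorted line numbers of calls written as hashlib.md5(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "md5"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "hashlib"
        ):
            hits.append(node.lineno)
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import hashlib\n"
        "weak = hashlib.md5(b'x')\n"
        "strong = hashlib.sha256(b'x')\n"
    )
    print(find_md5_calls(sample))
```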
Similar Rules
Explore related security rules for Python
Insecure SHA-1 Hash Usage
SHA-1 is cryptographically weak due to practical collision attacks. Use SHA-256 or SHA-3 for security-sensitive hashing.
Insecure Hash via hashlib.new()
hashlib.new() with an insecure algorithm name (MD5, SHA1, SHA-224) creates a cryptographically weak hash. Use SHA-256 or SHA-3.
SHA-224 or SHA3-224 Weak Hash Usage
SHA-224 and SHA3-224 provide only 112-bit collision resistance, which is below the 128-bit minimum recommended by NIST for new applications.