# PYTHON-CRYPTO-SEC-015: Insecure SHA1 Hash (PyCryptodome)

> **Severity:** MEDIUM | **CWE:** CWE-327, CWE-328 | **OWASP:** A02:2021

- **Language:** Python
- **Category:** Cryptography
- **URL:** https://codepathfinder.dev/registry/python/cryptography/PYTHON-CRYPTO-SEC-015
- **Detection:** `pathfinder scan --ruleset python/PYTHON-CRYPTO-SEC-015 --project .`

## Description

Detects SHA-1 usage through the PyCryptodome library via any of its SHA-1 module aliases:
`Crypto.Hash.SHA.new()`, `Cryptodome.Hash.SHA.new()`, `Crypto.Hash.SHA1.new()`, or
`Cryptodome.Hash.SHA1.new()`. PyCryptodome exposes SHA-1 under both the older `SHA` name
(for historical compatibility with PyCrypto) and the explicit `SHA1` name. The `Cryptodome`
namespace is the one installed by the `pycryptodomex` distribution of the same library.

SHA-1 produces a 160-bit digest. Its collision resistance was broken in practice in 2017,
when Stevens et al. at CWI Amsterdam and Google published the first known SHA-1 collision
(the SHAttered attack). In 2020, Leurent and Peyrin demonstrated a chosen-prefix collision
computed on roughly 900 GPUs over about two months, which makes targeted forgery feasible
for well-resourced adversaries.
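
To put the collisions described above in context: a generic birthday attack on an n-bit hash needs about 2^(n/2) evaluations, which is 2^80 for SHA-1. The sketch below compares that bound with the complexities reported in the two papers listed under References (about 2^63.1 SHA-1 computations for SHAttered and about 2^63.4 for the chosen-prefix attack). The helper name and the constants table are illustrative only and use the standard library's `hashlib`, not PyCryptodome.

```python
import hashlib

# Reported attack costs as log2 of SHA-1 computations (figures from the cited papers).
REPORTED_ATTACK_BITS = {
    "SHAttered identical-prefix collision (2017)": 63.1,
    "Chosen-prefix collision (2020)": 63.4,
}


def birthday_bound_bits(algorithm: str) -> float:
    """Return log2 of the generic birthday-attack cost for a hashlib algorithm."""
    digest_bits = hashlib.new(algorithm).digest_size * 8
    return digest_bits / 2


if __name__ == "__main__":
    generic = birthday_bound_bits("sha1")
    for name, bits in REPORTED_ATTACK_BITS.items():
        # The gap between the generic bound and the real attack is the "break".
        print(f"{name}: 2^{bits} vs generic 2^{generic:.0f} "
              f"(about 2^{generic - bits:.1f} times cheaper)")
    print(f"SHA-256 generic bound: 2^{birthday_bound_bits('sha256'):.0f}")
```

The point of the comparison is that both attacks cost far less than the 2^80 work a sound 160-bit hash would require, while SHA-256 retains a 2^128 generic bound.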

NIST deprecated SHA-1 for digital signatures in SP 800-131A, and FIPS 186-5 removes SHA-1
entirely from the approved signature algorithms. Browser vendors stopped trusting SHA-1 TLS
certificates in 2017.

SHA-1 must not be used for digital signatures, certificate hashing, HMAC authentication,
or data integrity in security contexts. It is sometimes encountered in PyCryptodome
codebases because older PyCrypto tutorials used `from Crypto.Hash import SHA` — this
rule catches both the legacy `SHA` alias and the explicit `SHA1` module name.


## Vulnerable Code

```python
# SEC-015: SHA-1 via PyCryptodome's legacy `SHA` alias
from Crypto.Hash import SHA
h_sha = SHA.new(b"data")

# SEC-015: the explicit `SHA1` module name is flagged as well
from Crypto.Hash import SHA1
h_sha1 = SHA1.new(b"data")
```

## Secure Code

```python
import os

from Crypto.Hash import HMAC, SHA256, SHA3_256, SHA512

# SECURE: SHA-256 for general integrity checking
h = SHA256.new()
h.update(b"data to hash")
digest = h.hexdigest()

# SECURE: SHA-3 when a design independent of SHA-2 is wanted
h = SHA3_256.new()
h.update(b"document bytes")
digest = h.hexdigest()

# SECURE: HMAC with SHA-256 for message authentication
key = os.urandom(32)
mac = HMAC.new(key, digestmod=SHA256)
mac.update(b"message to authenticate")
tag = mac.hexdigest()

# SECURE: SHA-512 when a larger digest is needed
h = SHA512.new()
h.update(b"data requiring 512-bit digest")
digest = h.hexdigest()
```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class PyCryptoHashSHA(QueryType):
    fqns = ["Crypto.Hash.SHA", "Cryptodome.Hash.SHA",
            "Crypto.Hash.SHA1", "Cryptodome.Hash.SHA1"]


@python_rule(
    id="PYTHON-CRYPTO-SEC-015",
    name="Insecure SHA1 Hash (PyCryptodome)",
    severity="MEDIUM",
    category="cryptography",
    cwe="CWE-327",
    tags="python,pycryptodome,sha1,weak-hash,CWE-327",
    message="SHA-1 is deprecated for security use. Use SHA-256 or SHA-3 instead.",
    owasp="A02:2021",
)
def detect_sha1_hash_pycrypto():
    """Detects SHA1 in PyCryptodome."""
    return PyCryptoHashSHA.method("new")
```
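
To illustrate what matching "both aliases" means in practice, here is a minimal standard-library sketch that resolves `from ... import ...` statements and reports `.new()` calls on any of the four module paths listed in `fqns`. It is an approximation for explanation only, not how Code Pathfinder is implemented: the function name `find_sha1_new_calls` is hypothetical, and the sketch ignores plain `import Crypto.Hash.SHA` statements and cross-file resolution.

```python
import ast

# The four module paths listed in the rule's `fqns`.
SHA1_MODULES = {
    "Crypto.Hash.SHA", "Cryptodome.Hash.SHA",
    "Crypto.Hash.SHA1", "Cryptodome.Hash.SHA1",
}


def find_sha1_new_calls(source: str) -> list[int]:
    """Return line numbers of `<name>.new(...)` calls on an imported SHA-1 module."""
    tree = ast.parse(source)

    # Map each local name to the SHA-1 module it was imported from.
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module:
            for item in node.names:
                fqn = f"{node.module}.{item.name}"
                if fqn in SHA1_MODULES:
                    aliases[item.asname or item.name] = fqn

    # Collect calls of the form `<alias>.new(...)`.
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "new"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id in aliases):
            hits.append(node.lineno)
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "from Crypto.Hash import SHA\n"
        "from Cryptodome.Hash import SHA1 as S\n"
        "from Crypto.Hash import SHA256\n"
        "SHA.new(b'a')\n"
        "S.new(b'b')\n"
        "SHA256.new(b'c')\n"
    )
    print(find_sha1_new_calls(sample))
```

In the sample, the `SHA.new` and `S.new` calls are reported while `SHA256.new` is not, which mirrors the intent of listing all four SHA-1 module names in the rule.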

## How to Fix

- Replace `Crypto.Hash.SHA.new()` and `Crypto.Hash.SHA1.new()` with `Crypto.Hash.SHA256.new()` for all integrity, signing, and authentication use cases.
- Use `Crypto.Hash.SHA3_256.new()` or `Crypto.Hash.SHA3_512.new()` when stronger collision resistance or independence from SHA-2 is needed.
- For password hashing, do not use SHA-1 or any raw hash; use Argon2id (`argon2-cffi`), bcrypt, or scrypt.
- Update any HMAC usage from HMAC with SHA-1 to HMAC with SHA-256: `HMAC.new(key, digestmod=SHA256)` instead of `HMAC.new(key, digestmod=SHA)`.
- If SHA-1 is required by a legacy file format or protocol you cannot modify, isolate the usage, document it explicitly, and apply compensating controls such as an outer SHA-256 HMAC over the data.
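
The compensating control in the last item can be sketched as follows. This is a minimal illustration, assuming a hypothetical legacy format that stores a SHA-1 value alongside the data. It uses the standard library's `hmac` and `hashlib` so that it runs without extra dependencies, and the function names are made up for the example.

```python
import hashlib
import hmac
import os


def legacy_sha1_id(data: bytes) -> str:
    """SHA-1 value required by the (hypothetical) legacy format; not trusted on its own."""
    return hashlib.sha1(data).hexdigest()


def seal(key: bytes, data: bytes) -> bytes:
    """Return an outer HMAC-SHA256 tag that actually protects the data."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Check the outer tag in constant time before relying on the data."""
    return hmac.compare_digest(seal(key, data), tag)


if __name__ == "__main__":
    key = os.urandom(32)  # keep this secret and separate from the stored data
    payload = b"record required by the legacy format"
    record = {
        "sha1": legacy_sha1_id(payload),  # kept only for compatibility
        "tag": seal(key, payload),        # the integrity decision uses this
    }
    print(verify(key, payload, record["tag"]))
```

The design choice is that the SHA-1 field stays purely for format compatibility, while every trust decision depends on the keyed SHA-256 tag.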

## FAQ

**Q: Is SHA-1 ever safe to use with PyCryptodome?**

SHA-1 is acceptable only for non-security legacy compatibility contexts where collision resistance is not a security requirement — for example, computing Git-style content identifiers in tooling that is not security-sensitive. It must not be used for digital signatures, HMAC, certificate hashing, or password hashing.
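
As an illustration of such a non-security use, the sketch below computes a Git blob identifier, which is the SHA-1 of a short header followed by the file content. It assumes Python 3.9 or later for the `usedforsecurity` flag and uses the standard library's `hashlib` rather than PyCryptodome; the function name is hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the Git object ID for a blob with the given content."""
    # Git hashes "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\x00" % len(content)
    # usedforsecurity=False documents that this is an identifier, not a security check.
    return hashlib.sha1(header + content, usedforsecurity=False).hexdigest()


if __name__ == "__main__":
    print(git_blob_id(b""))
```

Marking the call with `usedforsecurity=False` makes the intent explicit to reviewers.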

**Q: What is the difference between this rule and PYTHON-CRYPTO-SEC-011?**

PYTHON-CRYPTO-SEC-011 targets SHA-1 used via the `cryptography` library hazmat interface (`hashes.SHA1()`). This rule (PYTHON-CRYPTO-SEC-015) targets SHA-1 in PyCryptodome via `Crypto.Hash.SHA.new()` or `Crypto.Hash.SHA1.new()`. The underlying weakness is identical; the rules differ by library.

**Q: Why does this rule cover both `Crypto.Hash.SHA` and `Crypto.Hash.SHA1`?**

PyCryptodome inherited the `SHA` module name from the original PyCrypto library, where `from Crypto.Hash import SHA` was the canonical way to import SHA-1. PyCryptodome also provides the explicit `SHA1` name. Both resolve to the same SHA-1 algorithm, so the rule matches both to cover modern and legacy import styles.

**Q: Why not use SHA-256 for password hashing?**

SHA-256 is designed to be fast, which benefits attackers performing brute-force attacks. Use Argon2id, bcrypt, or scrypt for password hashing — these are deliberately slow and memory-intensive to limit attacker throughput to thousands of attempts per second rather than billions.
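
A minimal sketch of the scrypt option, using the standard library's `hashlib.scrypt` (which requires a Python build linked against OpenSSL 1.1 or later). The cost parameters and function names are illustrative assumptions, not a recommendation from this rule; tune them for your hardware and latency budget.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumed values, not a vetted policy).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32, "maxmem": 64 * 1024 * 1024}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the salt is random per password."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

Each verification deliberately costs memory and CPU time, which is what keeps an attacker's guessing rate low compared with a single SHA-256 call.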

**Q: Does this rule fire on HMAC-SHA1?**

Yes, if `Crypto.Hash.SHA` or `Crypto.Hash.SHA1` is used as the `digestmod` for an HMAC. While HMAC-SHA1 is not broken by collision attacks, it is deprecated in current standards and should be replaced with HMAC-SHA256.

**Q: How do I run this rule in CI/CD?**

Run `pathfinder scan --ruleset python/PYTHON-CRYPTO-SEC-015 --project .` (the command shown under Detection at the top of this page) in your pipeline. Add `--format sarif` to produce SARIF output compatible with GitHub Advanced Security and similar platforms.

**Q: I see SHA-1 used in a legacy codebase that I cannot fully migrate yet — how should I prioritize?**

Prioritize eliminating SHA-1 from: (1) digital signatures and certificate generation first, (2) HMAC authentication second, (3) data integrity checks third. Non-security uses such as content identifiers can be addressed last. Document all retained uses with a migration timeline and review compensating controls.
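
The ordering above can be captured in a small triage helper. This is only a sketch: the `Sha1Finding` type, the usage labels, and the example paths are hypothetical, and the usage category is assumed to be assigned by a human reviewer rather than produced by the scanner.

```python
from dataclasses import dataclass

# Lower number = migrate sooner, following the order given in the answer above.
PRIORITY = {
    "signature": 1,     # digital signatures and certificate generation
    "hmac": 2,          # HMAC authentication
    "integrity": 3,     # data integrity checks
    "non_security": 4,  # content identifiers and similar uses
}


@dataclass(frozen=True)
class Sha1Finding:
    path: str
    line: int
    usage: str  # one of the PRIORITY keys, assigned during manual review


def migration_order(findings: list[Sha1Finding]) -> list[Sha1Finding]:
    """Sort findings so the most security-critical SHA-1 uses come first."""
    return sorted(findings, key=lambda f: (PRIORITY[f.usage], f.path, f.line))


if __name__ == "__main__":
    backlog = [
        Sha1Finding("tools/cache.py", 12, "non_security"),
        Sha1Finding("api/auth.py", 40, "hmac"),
        Sha1Finding("pki/issue.py", 88, "signature"),
    ]
    for finding in migration_order(backlog):
        print(finding.usage, finding.path, finding.line)
```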

## References

- [CWE-327: Use of a Broken or Risky Cryptographic Algorithm](https://cwe.mitre.org/data/definitions/327.html)
- [CWE-328: Use of Weak Hash](https://cwe.mitre.org/data/definitions/328.html)
- [SHAttered: First SHA-1 Collision (Stevens et al., 2017)](https://shattered.io/)
- [Leurent & Peyrin 2020: SHA-1 is a Shambles — Chosen-Prefix Collisions](https://eprint.iacr.org/2020/014.pdf)
- [NIST SP 800-131A Rev 2: Transitioning the Use of Cryptographic Algorithms](https://csrc.nist.gov/publications/detail/sp/800-131a/rev-2/final)
- [FIPS 186-5: Digital Signature Standard — SHA-1 removed](https://csrc.nist.gov/publications/detail/fips/186/5/final)
- [OWASP Cryptographic Failures (A02:2021)](https://owasp.org/Top10/A02_2021-Cryptographic_Failures/)
- [PyCryptodome documentation: Crypto.Hash.SHA1](https://pycryptodome.readthedocs.io/en/latest/src/hash/sha1.html)

