# PYTHON-LANG-SEC-031: Insecure SHA-1 Hash Usage

> **Severity:** MEDIUM | **CWE:** CWE-327 | **OWASP:** A02:2021

- **Language:** Python
- **Category:** Python Core
- **URL:** https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-031
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LANG-SEC-031 --project .`

## Description

SHA-1 (Secure Hash Algorithm 1) produces a 160-bit digest. NIST has deprecated it for
digital signature generation since 2011. In 2017, the SHAttered attack demonstrated a
practical SHA-1 collision, and chosen-prefix collision attacks became feasible in 2019,
enabling forgery of X.509 certificates, PGP keys, and other structures that use SHA-1
signatures.

SHA-1 should not be used for digital signatures, certificate fingerprinting, HMAC in
new protocols, code signing, password hashing, or any context where collision resistance
is a security requirement. In December 2022, NIST announced that SHA-1 should be phased
out of all security applications by the end of 2030.

SHA-1 retains some non-security uses: it remains acceptable for non-adversarial integrity
checks, content-addressed storage where the threat model does not include adversarial input,
and HMAC-SHA1 in legacy protocol compatibility (where cryptographic analysis shows HMAC
construction still provides MAC security despite the hash's weaknesses).


## Vulnerable Code

```python
import hashlib

sha1_hash = hashlib.sha1(b"data")
```

## Secure Code

```python
import hashlib
import hmac

# INSECURE: SHA-1 for security-sensitive hashing
# digest = hashlib.sha1(data).hexdigest()


# SECURE: SHA-256 for general security purposes
def compute_signature_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# SECURE: SHA-3 for new protocols requiring stronger guarantees
def compute_document_hash(data: bytes) -> str:
    return hashlib.sha3_256(data).hexdigest()


# SECURE: BLAKE2b for high-performance secure hashing
def compute_fast_hash(data: bytes) -> str:
    return hashlib.blake2b(data, digest_size=32).hexdigest()


# SHA-1 in HMAC for legacy protocol compatibility (document clearly)
def legacy_hmac_sha1(key: bytes, message: bytes) -> bytes:
    # ACCEPTABLE: HMAC-SHA1 is cryptographically sound for MAC purposes
    # despite SHA-1's collision weakness. Only use for legacy protocol compatibility.
    return hmac.new(key, message, "sha1").digest()
```
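
The functions above hash data that is already in memory. For large files, a chunked read avoids loading the whole file at once. The following is a minimal sketch; the function name `sha256_file` and the 1 MiB default chunk size are illustrative assumptions, not part of the rule.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is identical to `hashlib.sha256(data).hexdigest()` over the full file contents, so it can replace a SHA-1 file checksum without changing the surrounding logic beyond the digest length.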

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, QueryType

class HashlibModule(QueryType):
    fqns = ["hashlib"]


@python_rule(
    id="PYTHON-LANG-SEC-031",
    name="Insecure SHA1 Hash Usage",
    severity="MEDIUM",
    category="lang",
    cwe="CWE-327",
    tags="python,sha1,weak-hash,cryptography,OWASP-A02,CWE-327",
    message="SHA-1 is cryptographically weak. Use SHA-256 or SHA-3 instead.",
    owasp="A02:2021",
)
def detect_sha1():
    """Detects hashlib.sha1() usage."""
    return HashlibModule.method("sha1")
```

## How to Fix

- Replace `hashlib.sha1()` with `hashlib.sha256()` or `hashlib.sha3_256()` for all security-sensitive hashing operations.
- Update certificate fingerprint verification to use SHA-256 fingerprints.
- Migrate code signing pipelines from SHA-1 to SHA-256 or stronger algorithms.
- For HMAC in new code, use HMAC-SHA-256; HMAC-SHA1 may be retained only for legacy protocol compatibility where migration is not feasible.
- Audit all SHA-1 usages and document whether each is security-sensitive or acceptable for non-adversarial use.
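
For the audit step, a quick syntactic pass over a codebase can produce a first list of call sites to review. The sketch below uses the standard `ast` module and is not a substitute for the Pathfinder rule; the function name `find_sha1_calls` is hypothetical, and the matching is purely name-based, so it can report false positives (any `.sha1(...)` attribute call) and miss aliased imports.

```python
import ast


def find_sha1_calls(source: str) -> list[int]:
    """Return sorted line numbers of calls that look like SHA-1 construction."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr == "sha1":
            # hashlib.sha1(...) written as an attribute call.
            hits.append(node.lineno)
        elif isinstance(func, ast.Name) and func.id == "sha1":
            # sha1(...) after "from hashlib import sha1".
            hits.append(node.lineno)
        elif isinstance(func, ast.Attribute) and func.attr == "new":
            # hashlib.new("sha1", ...) or hmac.new(key, msg, "sha1").
            values = list(node.args) + [kw.value for kw in node.keywords]
            for value in values:
                if (
                    isinstance(value, ast.Constant)
                    and isinstance(value.value, str)
                    and value.value.lower().replace("-", "") == "sha1"
                ):
                    hits.append(node.lineno)
                    break
    return sorted(hits)
```

Each reported line can then be annotated as security-sensitive or acceptable, as the last item above recommends.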

## Security Implications

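- **Chosen-Prefix Collision Attacks:** Chosen-prefix collisions allow an attacker to craft two documents with identical
SHA-1 hashes where each document starts with different attacker-chosen content. The
same technique was used against MD5 to forge a rogue CA certificate in 2008 and has
been demonstrated against SHA-1 with PGP key certifications. A collision does not let
an attacker match the checksum of an existing file they did not craft (that would
require a second-preimage attack), but it does let them prepare a benign file and a
malicious file in advance that share one SHA-1 hash. The attack cost is within reach
of well-funded attackers.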

- **Certificate Forgery:** X.509 certificates using SHA-1 signatures can be forged through collision attacks.
Major browser vendors and CA/Browser Forum deprecated SHA-1 certificates in 2016-2017.
Applications still accepting SHA-1 certificate fingerprints are vulnerable to
certificate impersonation attacks.

- **Code Signing Weakness:** Software signed with SHA-1 can potentially be replaced by malicious code with a
crafted collision that produces the same SHA-1 hash, undermining software supply
chain integrity.

- **PGP Key Forgery:** PGP/GPG uses SHA-1 for key fingerprints in older key formats. SHA-1 collision attacks
have been demonstrated against PGP key certification signatures, enabling key
impersonation in some configurations.
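
For the certificate fingerprint case, a pinned fingerprint can be checked with SHA-256 instead of SHA-1. The following is a minimal sketch that assumes the certificate is already available as DER-encoded bytes; the function names and the colon-separated fingerprint format are illustrative assumptions.

```python
import hashlib
import hmac


def sha256_fingerprint(der_cert: bytes) -> str:
    """Return the lowercase hex SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def fingerprint_matches(der_cert: bytes, pinned: str) -> bool:
    """Compare a certificate against a pinned fingerprint such as 'AB:CD:...'."""
    # Accept both 'abcd...' and 'AB:CD:...' spellings of the pinned value.
    normalized = pinned.replace(":", "").strip().lower()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(sha256_fingerprint(der_cert), normalized)
```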


## FAQ

**Q: Is SHA-1 completely broken or just weakened?**

SHA-1 has practical collision attacks (SHAttered 2017, chosen-prefix 2019) that
are within reach of determined attackers. It is broken for collision-resistance-dependent
uses (signatures, certificates, code signing). HMAC-SHA1 retains MAC security since
HMAC's security depends on pseudorandomness rather than collision resistance, but
HMAC-SHA256 is strongly preferred for new code.
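
For new code, HMAC-SHA256 is a drop-in replacement at the call site. A minimal sketch using only the standard library follows; the function names `sign` and `verify` are illustrative.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return the 32-byte HMAC-SHA256 tag for a message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time to avoid timing side channels."""
    return hmac.compare_digest(sign(key, message), tag)
```

Note that the tag length changes from 20 bytes (HMAC-SHA1) to 32 bytes, so any fixed-width storage or wire format needs to be updated alongside the code.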


**Q: Why is SHA-1 still in Git and some protocols?**

Git historically used SHA-1 for content addressing (not digital signatures), where
the threat model is accidental corruption rather than adversarial collision. Git is
transitioning to SHA-256 (SHA-256 repository support was added in Git 2.29). Legacy
protocols using HMAC-SHA1 (e.g., TOTP, some OAuth variants) retain security because
HMAC construction is not broken by SHA-1 collision attacks.
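
To make the content-addressing use concrete, the sketch below computes the object ID Git assigns to a blob in a SHA-1 repository: the hash covers a short header (`blob`, the content length, a NUL byte) followed by the content. The function name `git_blob_id` is illustrative, and `usedforsecurity=False` requires Python 3.9 or later.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    # Git hashes "blob <size>\0" followed by the raw file content.
    header = b"blob %d\x00" % len(content)
    # usedforsecurity=False marks this as an identifier, not a security check.
    return hashlib.sha1(header + content, usedforsecurity=False).hexdigest()
```

For an empty file this yields `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the well-known empty-blob ID.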


**Q: How is SHA-1 different from MD5 in terms of security?**

SHA-1 is stronger than MD5 (160-bit vs 128-bit digest, more complex structure) but
both have practical collision attacks and should not be used for security purposes.
SHA-1 collision attacks are more expensive than MD5 collision attacks but have been
demonstrated. The migration guidance is the same: use SHA-256 or stronger.


**Q: What is the recommended migration timeline for SHA-1?**

For new code: use SHA-256 immediately. For existing systems: prioritize certificate
fingerprints and code signing (highest risk), then integrity verification, then
content addressing. HMAC-SHA1 in legacy protocols can be lower priority if migration
cost is high, but should be documented with a migration plan.
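
One way to stage such a migration for stored integrity digests is to tag each digest with its algorithm, so legacy SHA-1 records can still be checked while being flagged for re-hashing. This is a sketch under stated assumptions: the `algorithm:hexdigest` storage format and the names `PREFERRED`, `ALLOWED`, `make_digest`, and `check_digest` are hypothetical, and accepting legacy SHA-1 records this way is only appropriate where the data is not attacker-controlled.

```python
import hashlib
import hmac

PREFERRED = "sha256"            # target algorithm for all new digests
ALLOWED = {"sha1", "sha256"}    # sha1 accepted only while legacy records remain


def make_digest(data: bytes, algorithm: str = PREFERRED) -> str:
    """Return a digest tagged with its algorithm, e.g. 'sha256:ab12...'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def check_digest(data: bytes, stored: str) -> tuple[bool, bool]:
    """Return (matches, needs_upgrade) for a tagged digest."""
    algorithm, _, expected = stored.partition(":")
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    actual = hashlib.new(algorithm, data).hexdigest()
    matches = hmac.compare_digest(actual, expected)
    # A matching legacy record should be re-hashed with PREFERRED by the caller.
    return matches, matches and algorithm != PREFERRED
```

Once no record reports `needs_upgrade`, `"sha1"` can be removed from `ALLOWED`.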


**Q: Does FIPS mode affect SHA-1 usage in Python?**

In FIPS 140-3 mode, SHA-1 is not approved for security applications. Python's `hashlib`
in FIPS mode may raise an error or require the `usedforsecurity=False` parameter
(available since Python 3.9) when creating a SHA-1 hash. Systems operating under FIPS
requirements must migrate away from SHA-1 in all security-sensitive contexts.
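
Where a SHA-1 call is genuinely non-security (for example, a cache key), the intent can be declared explicitly. A minimal sketch, assuming Python 3.9 or later; the function name `cache_key` is illustrative.

```python
import hashlib


def cache_key(data: bytes) -> str:
    """Return a SHA-1 hex digest used only as a non-security identifier."""
    # usedforsecurity=False tells hashlib this digest is not a security control,
    # which lets restricted (e.g. FIPS) builds permit the call.
    return hashlib.sha1(data, usedforsecurity=False).hexdigest()
```

The flag does not change the digest value; it only documents and signals intent, so it should not be added to calls that protect signatures, tokens, or other security-sensitive data.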


**Q: Is SHA-1 safe for non-cryptographic uses like hash tables?**

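SHA-1 is computationally heavy compared to purpose-built non-cryptographic hash
functions. For in-memory hash tables, Python's built-in `hash()` (SipHash-based for
`str` and `bytes`) is sufficient, but its output is randomized per process, so it is
not suitable for values that are stored or compared across runs. For persistent
checksums with no adversarial threat model, use CRC32 or xxHash instead. SHA-1 is
neither the safest nor the most performant choice for non-security hash uses.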
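
As a standard-library illustration of a non-cryptographic checksum, the sketch below uses `zlib.crc32`. CRC32 is chosen here as an assumption because it ships with Python; the function names are illustrative, and CRC32 detects accidental corruption only, not deliberate tampering.

```python
import zlib
from typing import Iterable


def crc32_checksum(data: bytes) -> int:
    """Return the CRC32 checksum of a byte string as an unsigned integer."""
    return zlib.crc32(data)


def crc32_of_chunks(chunks: Iterable[bytes]) -> int:
    """Return the CRC32 of concatenated chunks without joining them in memory."""
    value = 0
    for chunk in chunks:
        # Passing the running value continues the checksum across chunks.
        value = zlib.crc32(chunk, value)
    return value
```

Unlike `hash()`, the result is stable across processes and Python versions, so it can be stored and compared later.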


## References

- [CWE-327: Use of a Broken or Risky Cryptographic Algorithm](https://cwe.mitre.org/data/definitions/327.html)
- [SHAttered: First SHA-1 collision (2017)](https://shattered.io/)
- [Python docs: hashlib module](https://docs.python.org/3/library/hashlib.html)
- [OWASP Cryptographic Storage Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cryptographic_Storage_Cheat_Sheet.html)
- [NIST SP 800-131A Revision 2 - Transitioning the Use of Cryptographic Algorithms](https://csrc.nist.gov/publications/detail/sp/800-131a/rev-2/final)

---

Source: https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-031
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
