# PYTHON-LANG-SEC-062: Insecure urllib Request Object Usage

> **Severity:** MEDIUM | **CWE:** CWE-319 | **OWASP:** A02:2021

- **Language:** Python
- **Category:** Python Core
- **URL:** https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-062
- **Detection:** `pathfinder scan --ruleset python/PYTHON-LANG-SEC-062 --project .`

## Description

urllib.request.Request() creates a request object that encapsulates a URL, headers, and
request body. When the URL in the Request object uses http:// rather than https://,
all transmitted data, including authentication headers, API keys, request bodies, and
cookies, is sent in plaintext over the network.

urllib.request.OpenerDirector (obtained via urllib.request.build_opener()) provides
a customizable request opener. When configured without proper HTTPS handlers or used
with HTTP URLs, it has the same cleartext transmission risk.

This rule audits urllib.request.Request() and urllib.request.build_opener() usage to
ensure HTTPS URLs are used and no insecure handlers are installed.


## Vulnerable Code

```python
import urllib.request

# SEC-062: Request object built from a cleartext http:// URL
req = urllib.request.Request("http://example.com")
```

## Secure Code

```python
import urllib.request
import ssl

# INSECURE: urllib.request.Request with HTTP URL
# req = urllib.request.Request("http://api.example.com/data",
#                              headers={"Authorization": "Bearer token"})
# response = urllib.request.urlopen(req)

# SECURE: Use HTTPS URL with the Request object
def make_authenticated_request(url: str, token: str) -> bytes:
    if not url.startswith("https://"):
        raise ValueError("Only HTTPS URLs are permitted")
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )
    ctx = ssl.create_default_context()
    with urllib.request.urlopen(req, context=ctx, timeout=30) as response:
        return response.read()

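# PREFERRED: Use the third-party requests library instead
import requests

def api_call(url: str, token: str) -> dict:
    # requests does not enforce HTTPS on its own, so check the scheme here too
    if not url.startswith("https://"):
        raise ValueError("Only HTTPS URLs are permitted")
    response = requests.get(
        url,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()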

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, QueryType

class UrllibModule(QueryType):
    fqns = ["urllib.request"]


@python_rule(
    id="PYTHON-LANG-SEC-062",
    name="Insecure urllib Request Object",
    severity="MEDIUM",
    category="lang",
    cwe="CWE-319",
    tags="python,urllib,request-object,insecure-transport,CWE-319",
    message="urllib.request.Request() detected. Ensure HTTPS URLs are used.",
    owasp="A02:2021",
)
def detect_urllib_request():
    """Detects urllib.request.Request and OpenerDirector usage."""
    return UrllibModule.method("Request", "OpenerDirector")
```

## How to Fix

- Validate that all URLs in urllib.request.Request() objects use the https:// scheme before the request is executed.
- When using OpenerDirector, audit all installed handlers to ensure no insecure HTTP handlers or certificate bypass handlers are used.
- Pass an explicit ssl.create_default_context() to urlopen() when executing HTTPS Request objects for clarity and security.
- Consider migrating from urllib.request.Request/OpenerDirector to the requests library for simpler, more secure HTTP client code.
- Add URL scheme validation in any code that accepts URLs from configuration or user input before using them in urllib.request.Request().
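
The scheme validation recommended in this list could be centralized in a single helper. The following is a minimal sketch, not part of the rule itself: the function name `build_https_request` is hypothetical, and it assumes that rejecting every non-HTTPS URL is acceptable for the application. It parses the URL with urllib.parse.urlsplit() rather than testing a string prefix, so an uppercase scheme such as `HTTPS://` is still recognized.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlsplit


def build_https_request(
    url: str, headers: Optional[dict] = None
) -> urllib.request.Request:
    """Hypothetical helper: build a Request only for absolute https:// URLs."""
    parts = urlsplit(url)
    # urlsplit() lowercases the scheme, so "HTTPS://host" is treated as https.
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"Refusing non-HTTPS URL: {url!r}")
    return urllib.request.Request(url, headers=headers or {})
```

Callers would then pass the returned object to urlopen() or to an opener's open() method, so the scheme is checked before any data leaves the process.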

## Security Implications

- **Authentication Header Exposure:** Authentication headers added to Request objects (Authorization, X-API-Key, Cookie)
are transmitted in plaintext when HTTP URLs are used, exposing credentials to
network observers.

- **Request Body Disclosure:** POST request bodies containing passwords, form data, file contents, or API parameters
are transmitted without encryption, enabling data theft by network observers.

- **Custom Handler Security Bypass:** OpenerDirector with custom handlers can install insecure protocols or disable
default security behaviors. Handlers that bypass certificate verification or
add HTTP Basic Auth to HTTP (not HTTPS) URLs create additional security risks.

- **Redirect Security:** The default HTTPRedirectHandler accepts redirects to http://, https://, and ftp:// URLs,
so a request that starts on HTTPS can be redirected to a cleartext HTTP URL.
Custom openers may change redirect handling further and should be reviewed for
security-sensitive redirect scenarios.
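
One way to address the redirect concern is a redirect handler that refuses to leave HTTPS. This is a sketch under assumptions: the class name `NoDowngradeRedirectHandler` is hypothetical, and raising URLError on a downgrade is one possible policy among several.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


class NoDowngradeRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: refuse redirects from HTTPS to any other scheme."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # newurl has already been resolved to an absolute URL by the base class.
        if req.type == "https" and urlsplit(newurl).scheme != "https":
            raise urllib.error.URLError(
                f"Blocked redirect from HTTPS to non-HTTPS URL: {newurl}"
            )
        return super().redirect_request(req, fp, code, msg, headers, newurl)


# build_opener() uses this subclass in place of the default HTTPRedirectHandler.
opener = urllib.request.build_opener(NoDowngradeRedirectHandler())
```

Requests sent with `opener.open(req)` would then fail with URLError instead of silently following a redirect to an http:// URL.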


## FAQ

**Q: What is the difference between urllib.request.Request and urlopen()?**

urllib.request.Request() creates a request object that stores the URL, headers, data,
and method without making a network connection. urllib.request.urlopen() actually
executes the HTTP request. You can pass a Request object to urlopen() to execute it.
The security concern is the URL scheme stored in the Request object at the time it is executed.
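
Because a Request object only stores data, it can be inspected before anything is sent. The sketch below uses placeholder values (the URL, body, and token are illustrative only) and reads attributes that urllib.request.Request exposes.

```python
import urllib.request

# Constructing a Request performs no network I/O; it only parses and stores data.
req = urllib.request.Request(
    "http://example.com/login",  # placeholder URL
    data=b"user=alice",  # placeholder body
    headers={"Authorization": "Bearer placeholder"},
)

print(req.type)  # URL scheme parsed from the URL
print(req.host)  # host parsed from the URL
print(req.get_method())  # POST, because data was supplied

# Audit the scheme before handing the object to urlopen().
if req.type != "https":
    print("cleartext request: do not send")
```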


**Q: Is urllib.request.build_opener() inherently insecure?**

No. build_opener() creates an opener with configurable handlers. It becomes insecure
if handlers are installed that bypass certificate verification (HTTPSHandler with
an unverified context) or if HTTP URLs are used with credential headers. Review
all handlers installed via build_opener().
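
As an illustration of a reviewed configuration, the sketch below builds an opener whose HTTPSHandler uses a verifying context and then lists the installed handlers. It is a minimal example, not a complete hardening recipe.

```python
import ssl
import urllib.request

# The default context enables certificate verification and hostname checking.
ctx = ssl.create_default_context()

opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=ctx))

# Review which handlers ended up installed in the opener.
for handler in opener.handlers:
    print(type(handler).__name__)

# Both values should be True for a verifying context.
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A request is then executed with `opener.open(req, timeout=30)`. An unverified context would instead report ssl.CERT_NONE and check_hostname set to False in the final check.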


**Q: How do I add Basic Authentication with urllib.request securely?**

Use urllib.request.HTTPPasswordMgrWithDefaultRealm() and urllib.request.HTTPBasicAuthHandler()
with the opener. Ensure the URL uses https:// so credentials are not transmitted in
plaintext. Alternatively, add the Authorization header directly to the Request object.
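
The handler-based approach might look like the following sketch. The helper name `build_basic_auth_opener` is hypothetical, and the scheme check is an assumption about what the application should enforce.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit


def build_basic_auth_opener(
    base_url: str, username: str, password: str
) -> urllib.request.OpenerDirector:
    """Hypothetical helper: opener that answers Basic Auth challenges over HTTPS only."""
    if urlsplit(base_url).scheme != "https":
        raise ValueError("Basic Auth credentials must only be sent over HTTPS")

    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None registers the credentials for any realm under base_url.
    password_mgr.add_password(None, base_url, username, password)

    return urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(password_mgr),
        urllib.request.HTTPSHandler(context=ssl.create_default_context()),
    )
```

With this approach the credentials are sent only after the server responds with a 401 challenge for a URL under `base_url`, and the request is executed with the returned opener's open() method.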


**Q: What custom handlers might be security-sensitive?**

The following handlers deserve review:

- HTTPRedirectHandler: controls redirect behavior.
- ProxyHandler: routes traffic through a proxy (ensure the proxy uses HTTPS).
- HTTPSHandler: controls the SSL context (ensure it uses ssl.create_default_context()).
- UnknownHandler: raises URLError for URL schemes that no other handler claims (avoid replacing it with a handler that accepts arbitrary protocols).
- Any custom handler that bypasses authentication or certificate verification.


**Q: Should I use urllib.request.install_opener() globally?**

urllib.request.install_opener() sets a global default opener that affects all subsequent
urlopen() calls in the process. Avoid installing global openers with reduced security
settings as they affect all HTTP requests, not just the intended ones. urlopen() does not
accept an opener argument, so keep the opener local and call its open() method directly instead.


**Q: Does this rule detect urllib.request usage in Python 2 compatibility code?**

The rule targets Python 3 urllib.request patterns. Python 2 used the urllib and urllib2
modules, which Python 3 reorganized into the urllib package (urllib.request, urllib.parse,
urllib.error). Code that imports from six.moves.urllib or uses compatibility shims is not
directly detected by this rule.


## References

- [CWE-319: Cleartext Transmission of Sensitive Information](https://cwe.mitre.org/data/definitions/319.html)
- [Python docs: urllib.request.Request](https://docs.python.org/3/library/urllib.request.html#urllib.request.Request)
- [Python docs: urllib.request.build_opener()](https://docs.python.org/3/library/urllib.request.html#urllib.request.build_opener)
- [OWASP TLS Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Transport_Layer_Security_Cheat_Sheet.html)
- [OWASP Top 10 A02:2021 Cryptographic Failures](https://owasp.org/Top10/A02_2021-Cryptographic_Failures/)

---

Source: https://codepathfinder.dev/registry/python/lang/PYTHON-LANG-SEC-062
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
