Argument Matching

Match specific function arguments to create precise rules that reduce false positives. Learn when to use keyword vs. positional matching.

Why Argument Matching?

Not all function calls are dangerous. For example, app.run() is safe, but app.run(debug=True) is a security risk in production. Argument matching lets you detect the dangerous pattern while ignoring safe usage.

Key Benefits:

  • Dramatically reduce false positives
  • Detect specific vulnerability patterns
  • Match on configuration values
  • Catch insecure defaults

Keyword Arguments

Use match_name to match keyword (named) arguments. This is the most common and readable way to match arguments.

Basic Keyword Matching

from codepathfinder import rule, calls

@rule(id="flask-debug", severity="high", cwe="CWE-489")
def detect_flask_debug():
    """Detects Flask running in debug mode"""
    return calls("app.run", match_name={"debug": True})

This matches:

app.run(debug=True)      # Matched!
app.run(debug=False)     # Not matched
app.run()                # Not matched

Multiple Keyword Arguments

Match multiple keyword arguments (AND logic: every listed argument must be present with the given value):

@rule(id="insecure-server", severity="critical")
def detect_insecure_server():
    """Detects server with debug and public host"""
    return calls("app.run", match_name={
        "debug": True,
        "host": "0.0.0.0"
    })

This matches:

app.run(host="0.0.0.0", debug=True)  # Matched!
app.run(debug=True, host="0.0.0.0")  # Matched! (order doesn't matter)
app.run(debug=True)                  # Not matched (missing host)
app.run(host="0.0.0.0")              # Not matched (missing debug)
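
The AND semantics above can be sketched with Python's standard ast module. This is only an illustration of the matching behaviour described here, not how codepathfinder is implemented; keyword_args and matches_all are hypothetical helper names.

```python
import ast

def keyword_args(call_source: str) -> dict:
    """Return {name: value} for the keyword arguments of one call expression."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a single call expression")
    values = {}
    for kw in call.keywords:
        if kw.arg is None:  # **kwargs expansion: no name to compare against
            continue
        try:
            values[kw.arg] = ast.literal_eval(kw.value)
        except (ValueError, TypeError):
            # Not a literal (a name, a call, ...): keep its source text instead
            values[kw.arg] = ast.unparse(kw.value)
    return values

def matches_all(call_source: str, expected: dict) -> bool:
    """True only if every expected keyword is present with an equal value."""
    actual = keyword_args(call_source)
    return all(name in actual and actual[name] == value
               for name, value in expected.items())

expected = {"debug": True, "host": "0.0.0.0"}
print(matches_all('app.run(debug=True, host="0.0.0.0")', expected))
print(matches_all('app.run(debug=True)', expected))
```

Because the keywords are collected into a dictionary before comparison, argument order plays no role, which mirrors the "order doesn't matter" case above.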

String Value Matching

@rule(id="yaml-unsafe-loader", severity="critical", cwe="CWE-502")
def detect_unsafe_yaml():
    """Detects unsafe YAML loaders"""
    return calls("yaml.load", match_name={"Loader": "UnsafeLoader"})

This matches:

yaml.load(data, Loader=UnsafeLoader)     # Matched!
yaml.load(data, Loader=SafeLoader)       # Not matched
yaml.safe_load(data)                     # Not matched

Positional Arguments

Use match_position to match arguments by their position (0-indexed). Useful when arguments don't have names or for older APIs.

Basic Positional Matching

@rule(id="socket-bind-all", severity="medium", cwe="CWE-668")
def detect_socket_bind_all():
    """Detects socket binding to all interfaces"""
    return calls("socket.bind", match_position={0: "0.0.0.0"})

This matches:

socket.bind("0.0.0.0", 8080)    # Matched! (first argument is "0.0.0.0")
socket.bind("localhost", 8080)   # Not matched
socket.bind(("0.0.0.0", 8080))   # See tuple indexing below

Tuple Indexing

Many Python functions take a tuple as a single argument; socket.bind, for example, expects one (host, port) tuple for IPv4 sockets. Use bracket notation to index into tuples:

@rule(id="socket-bind-tuple", severity="medium")
def detect_socket_bind_all_tuple():
    """Detects socket binding to all interfaces (tuple form)"""
    return calls("socket.bind", match_position={"0[0]": "0.0.0.0"})

This matches:

socket.bind(("0.0.0.0", 8080))    # Matched! (first element of first argument)
socket.bind(("localhost", 8080))  # Not matched
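
To make the "0[0]" selector concrete, here is a minimal sketch of how such a selector could be resolved against a call, again using the standard ast module. The helper select_argument and its selector grammar (a position, optionally followed by one bracketed index) are assumptions for illustration; the library's real parser may accept more forms.

```python
import ast
import re

def select_argument(call_source: str, selector):
    """Resolve a selector such as 0 or "0[0]" to a literal argument value.

    Returns None when the position, the tuple, or the element is absent.
    """
    m = re.fullmatch(r"(\d+)(?:\[(\d+)\])?", str(selector))
    if m is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    position = int(m.group(1))
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call) or position >= len(call.args):
        return None
    node = call.args[position]
    if m.group(2) is not None:
        index = int(m.group(2))
        # Bracket indexing only makes sense when the argument is a tuple literal
        if not isinstance(node, ast.Tuple) or index >= len(node.elts):
            return None
        node = node.elts[index]
    try:
        return ast.literal_eval(node)
    except (ValueError, TypeError):
        return None  # non-literal argument, nothing to compare

print(select_argument('socket.bind(("0.0.0.0", 8080))', "0[0]"))
```

In this sketch a plain string in position 0 does not satisfy "0[0]", which is why the tuple form and the flat form need separate rules.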

IPv6 Pattern

@rule(id="socket-bind-ipv6-all", severity="medium")
def detect_socket_bind_ipv6():
    """Detects binding to all IPv6 interfaces"""
    return calls("socket.bind", match_position={"0[0]": "::"})

Multiple Values (OR Logic)

Pass a list of values to match any of them. This creates OR logic for a single argument.

@rule(id="yaml-any-unsafe", severity="critical", cwe="CWE-502")
def detect_any_unsafe_yaml_loader():
    """Detects any unsafe YAML loader"""
    return calls("yaml.load", match_position={
        1: ["Loader", "UnsafeLoader", "CLoader"]
    })

This matches:

yaml.load(data, Loader)          # Matched!
yaml.load(data, UnsafeLoader)    # Matched!
yaml.load(data, CLoader)         # Matched!
yaml.load(data, SafeLoader)      # Not matched

Multiple Insecure Values

@rule(id="weak-hash", severity="high", cwe="CWE-327")
def detect_weak_hashing():
    """Detects weak hashing algorithms"""
    return calls("hashlib.new", match_position={
        0: ["md5", "sha1", "md4"]
    })

This matches:

hashlib.new("md5")     # Matched!
hashlib.new("sha1")    # Matched!
hashlib.new("sha256")  # Not matched (secure algorithm)
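
The list form can be read as "equal to any of these". A minimal sketch of that comparison, with value_matches as a hypothetical helper name rather than part of the codepathfinder API:

```python
def value_matches(actual, expected) -> bool:
    """A list means "any of these"; a single value means plain equality."""
    candidates = expected if isinstance(expected, list) else [expected]
    return any(actual == candidate for candidate in candidates)

weak_algorithms = ["md5", "sha1", "md4"]
print(value_matches("md5", weak_algorithms))
print(value_matches("sha256", weak_algorithms))
```

Treating a single value as a one-element list keeps the scalar and list cases on the same code path.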

Wildcards in Arguments

Use * wildcards in argument values to match patterns.

Wildcard Strings

@rule(id="permissive-chmod", severity="medium", cwe="CWE-732")
def detect_permissive_permissions():
    """Detects modes that grant full access to both owner and group"""
    return calls("os.chmod", match_position={1: "0o77*"})

This matches:

os.chmod("file.txt", 0o777)  # Matched!
os.chmod("file.txt", 0o775)  # Matched!
os.chmod("file.txt", 0o755)  # Not matched (group digit is 5, not 7)

Match Any Value

Use "*" to match any value (useful for detecting presence of an argument):

@rule(id="any-password-arg", severity="high")
def detect_password_argument():
    """Detects any function with a password argument"""
    return calls("*.connect", match_name={"password": "*"})

This matches:

db.connect(password="secret")          # Matched!
db.connect(password="")                # Matched! (even empty string)
db.connect(password=get_password())    # Matched!
db.connect(user="admin")               # Not matched (no password arg)
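
If * behaves like a shell-style glob (an assumption; the text above does not spell out the exact rules), the standard fnmatch module gives a quick way to try a pattern against candidate values before putting it in a rule. wildcard_matches is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

def wildcard_matches(actual_text: str, pattern: str) -> bool:
    """Case-sensitive glob match of an argument's text against a pattern."""
    # Note: fnmatchcase also gives ? and [...] special meaning,
    # which may differ from the rule engine's wildcard support.
    return fnmatchcase(actual_text, pattern)

for candidate in ["/bin/sh", "bash -c ls", "python"]:
    print(candidate, wildcard_matches(candidate, "*sh*"))
```

A bare "*" matches every string, including the empty one, so under this reading it only tests that the argument is present; a call that omits the argument has nothing to compare and stays unmatched.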

Advanced Patterns

Combining Keyword and Positional

@rule(id="complex-pattern", severity="high")
def detect_complex_vulnerability():
    """Matches both positional and keyword arguments"""
    return calls(
        "subprocess.Popen",
        match_position={0: "*sh*"},        # Command contains "sh"
        match_name={"shell": True}         # AND shell=True
    )

This matches:

subprocess.Popen("/bin/sh", shell=True)     # Matched!
subprocess.Popen("bash -c ls", shell=True)  # Matched!
subprocess.Popen("/bin/sh", shell=False)    # Not matched (shell is False)
subprocess.Popen("python", shell=True)      # Not matched (command lacks "sh")

Detect Disabled Security Argument

Sometimes an argument that controls a security check is switched off or overridden. Combine several argument matches with OR logic to catch each variant:

from codepathfinder import rule, calls, Or

@rule(id="requests-no-verify", severity="high", cwe="CWE-295")
def detect_ssl_verification_disabled():
    """Detects HTTP requests that disable or override SSL verification"""
    return Or(
        # verify=False turns certificate validation off
        calls("requests.*", match_name={"verify": False}),
        # verify set to a value containing "cert" (a custom CA bundle worth reviewing)
        calls("requests.*", match_name={"verify": "*cert*"})
    )

Real-World: Dangerous Pickle

@rule(id="pickle-from-network", severity="critical", cwe="CWE-502")
def detect_pickle_deserialization():
    """Detects pickle deserialization, which is unsafe on untrusted (e.g. network) data"""
    return Or(
        # Direct pickle calls
        calls("pickle.loads"),
        calls("pickle.load"),

        # Unpickler constructed with any keyword argument
        calls("pickle.Unpickler", match_name={"*": "*"}),

        # Django pickle sessions
        calls("*.get_session", match_name={"serializer": "*Pickle*"})
    )

Best Practices:

  • Prefer match_name over match_position when possible (more readable)
  • Use wildcards sparingly to avoid over-matching
  • Test with both positive and negative examples
  • Combine with OR logic to catch multiple dangerous patterns
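
The advice to test with both positive and negative examples can be sketched as a small table-driven check. How a rule is executed against source code is not covered in this section, so the matcher here is a plain Python predicate standing in for a real rule run; check_examples and flask_debug_stand_in are hypothetical names.

```python
def check_examples(matcher, should_match, should_not_match):
    """Return a list of failure messages; an empty list means every sample behaved."""
    failures = []
    for source in should_match:
        if not matcher(source):
            failures.append(f"missed: {source}")
    for source in should_not_match:
        if matcher(source):
            failures.append(f"false positive: {source}")
    return failures

# Stand-in predicate for illustration only; a real check would run the rule itself.
def flask_debug_stand_in(source: str) -> bool:
    return "debug=True" in source.replace(" ", "")

failures = check_examples(
    flask_debug_stand_in,
    should_match=["app.run(debug=True)"],
    should_not_match=["app.run()", "app.run(debug=False)"],
)
print(failures)
```

Keeping the "Matched" and "Not matched" samples from each section in lists like these makes it easy to notice when a wildcard starts over-matching.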