Argument Matching
Match specific function arguments to create precise rules that reduce false positives. Learn when to use keyword vs. positional matching.
Why Argument Matching?
Not all function calls are dangerous. For example, app.run() is safe, but app.run(debug=True) is a security risk in production. Argument matching lets you detect the dangerous pattern while ignoring safe usage.
Key Benefits:
- Dramatically reduce false positives
- Detect specific vulnerability patterns
- Match on configuration values
- Catch insecure defaults
Keyword Arguments
Use match_name to match keyword (named) arguments. This is the most common and readable way to match arguments.
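To make these semantics concrete before the rule examples, the sketch below re-creates keyword matching in plain Python with the standard ast module. It is not how codepathfinder is implemented. The helper names keyword_args and keywords_match are hypothetical, and comparing non-literal values (such as a bare identifier) by their source text is an assumption made for illustration.

```python
import ast


def keyword_args(call_source: str) -> dict:
    """Return the keyword arguments of a single call expression.

    Literal values are evaluated; anything else (names, calls) is kept
    as source text. Hypothetical helper, for illustration only.
    """
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a call expression")
    result = {}
    for kw in call.keywords:
        if kw.arg is None:  # a **kwargs expansion has no name to match
            continue
        try:
            result[kw.arg] = ast.literal_eval(kw.value)
        except (ValueError, TypeError):
            result[kw.arg] = ast.unparse(kw.value)
    return result


def keywords_match(call_source: str, expected: dict) -> bool:
    """True when every expected keyword is present with an equal value."""
    actual = keyword_args(call_source)
    return all(name in actual and actual[name] == value
               for name, value in expected.items())
```

Under these assumptions, keywords_match("app.run(debug=True)", {"debug": True}) is true, while the same check against app.run() is false because the keyword is absent. Argument order plays no role, since the lookup is by name.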
Basic Keyword Matching
from codepathfinder import rule, calls

@rule(id="flask-debug", severity="high", cwe="CWE-489")
def detect_flask_debug():
    """Detects Flask running in debug mode"""
    return calls("app.run", match_name={"debug": True})

This matches:

app.run(debug=True)   # Matched!
app.run(debug=False)  # Not matched
app.run()             # Not matched

Multiple Keyword Arguments
Match multiple keyword arguments (all must be present):
@rule(id="insecure-server", severity="critical")
def detect_insecure_server():
    """Detects server with debug and public host"""
    return calls("app.run", match_name={
        "debug": True,
        "host": "0.0.0.0"
    })

This matches:

app.run(host="0.0.0.0", debug=True)  # Matched!
app.run(debug=True, host="0.0.0.0")  # Matched! (order doesn't matter)
app.run(debug=True)                  # Not matched (missing host)
app.run(host="0.0.0.0")              # Not matched (missing debug)

String Value Matching
@rule(id="yaml-unsafe-loader", severity="critical", cwe="CWE-502")
def detect_unsafe_yaml():
    """Detects unsafe YAML loaders"""
    return calls("yaml.load", match_name={"Loader": "UnsafeLoader"})

This matches:

yaml.load(data, Loader=UnsafeLoader)  # Matched!
yaml.load(data, Loader=SafeLoader)    # Not matched
yaml.safe_load(data)                  # Not matched

Positional Arguments
Use match_position to match arguments by their position (0-indexed). This is useful when arguments are not passed by name, or for older APIs.
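As a rough illustration of what a 0-indexed position key refers to, the sketch below resolves a key against a call using the standard ast module. It is not codepathfinder's implementation. The names MISSING, positional_value and position_matches are hypothetical, and the bracketed key form ("0[0]") follows the tuple indexing notation described later in this section.

```python
import ast
import re

MISSING = object()  # sentinel: the call has no argument at that position


def positional_value(call_source: str, key):
    """Resolve a key such as 0 or "0[0]" to the argument it points at."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a call expression")
    parsed = re.fullmatch(r"(\d+)(?:\[(\d+)\])?", str(key))
    if parsed is None:
        raise ValueError(f"unsupported position key: {key!r}")
    index = int(parsed.group(1))
    if index >= len(call.args):
        return MISSING
    node = call.args[index]
    if parsed.group(2) is not None:  # bracket form: index into a tuple argument
        element = int(parsed.group(2))
        if not isinstance(node, ast.Tuple) or element >= len(node.elts):
            return MISSING
        node = node.elts[element]
    try:
        return ast.literal_eval(node)
    except (ValueError, TypeError):
        return ast.unparse(node)  # non-literal: keep the source text


def position_matches(call_source: str, expected: dict) -> bool:
    """True when every expected position holds an equal value."""
    return all(positional_value(call_source, key) == value
               for key, value in expected.items())
```

With this sketch, key 0 against socket.bind("0.0.0.0", 8080) resolves to the string "0.0.0.0", whereas against socket.bind(("0.0.0.0", 8080)) it resolves to the whole tuple, which is why the tuple form needs the "0[0]" key.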
Basic Positional Matching
@rule(id="socket-bind-all", severity="medium", cwe="CWE-668")
def detect_socket_bind_all():
    """Detects socket binding to all interfaces"""
    return calls("socket.bind", match_position={0: "0.0.0.0"})

This matches:

socket.bind("0.0.0.0", 8080)    # Matched! (first argument is "0.0.0.0")
socket.bind("localhost", 8080)  # Not matched
socket.bind(("0.0.0.0", 8080))  # See tuple indexing below

Tuple Indexing
Many Python functions accept tuples. Use bracket notation to index into tuples:
@rule(id="socket-bind-tuple", severity="medium")
def detect_socket_bind_all_tuple():
    """Detects socket binding to all interfaces (tuple form)"""
    return calls("socket.bind", match_position={"0[0]": "0.0.0.0"})

This matches:

socket.bind(("0.0.0.0", 8080))    # Matched! (first element of first argument)
socket.bind(("localhost", 8080))  # Not matched

IPv6 Pattern
@rule(id="socket-bind-ipv6-all", severity="medium")
def detect_socket_bind_ipv6():
    """Detects binding to all IPv6 interfaces"""
    return calls("socket.bind", match_position={"0[0]": "::"})

Multiple Values (OR Logic)
Pass a list of values to match any of them. This creates OR logic for a single argument.
@rule(id="yaml-any-unsafe", severity="critical", cwe="CWE-502")
def detect_any_unsafe_yaml_loader():
    """Detects any unsafe YAML loader"""
    return calls("yaml.load", match_position={
        1: ["Loader", "UnsafeLoader", "CLoader"]
    })

This matches:

yaml.load(data, Loader)        # Matched!
yaml.load(data, UnsafeLoader)  # Matched!
yaml.load(data, CLoader)       # Matched!
yaml.load(data, SafeLoader)    # Not matched

Multiple Insecure Values
@rule(id="weak-hash", severity="high", cwe="CWE-327")
def detect_weak_hashing():
    """Detects weak hashing algorithms"""
    return calls("hashlib.new", match_position={
        0: ["md5", "sha1", "md4"]
    })

This matches:

hashlib.new("md5")     # Matched!
hashlib.new("sha1")    # Matched!
hashlib.new("sha256")  # Not matched (secure algorithm)

Wildcards in Arguments
Use * wildcards in argument values to match patterns.
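The sketch below shows one plausible reading of * wildcards, using glob-style matching from Python's standard fnmatch module against the argument's source text. This is an assumption for illustration only: the source does not say which matching engine codepathfinder uses, and fnmatch also gives special meaning to ? and [...], which may not apply to the real matcher. The names wildcard_match and describe are hypothetical.

```python
from fnmatch import fnmatchcase


def wildcard_match(argument_text: str, pattern: str) -> bool:
    """Case-sensitive glob match: * stands for any run of characters."""
    return fnmatchcase(argument_text, pattern)


def describe(pattern: str, candidates: list) -> dict:
    """Map each candidate argument text to whether the pattern matches it."""
    return {text: wildcard_match(text, pattern) for text in candidates}
```

Under this reading, "*sh*" matches any text containing "sh" (both "/bin/sh" and "bash -c ls"), and a bare "*" matches every value, including the empty string. A leading or trailing * widens the match considerably, so it is worth checking a pattern against values that should not match as well as those that should.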
Wildcard Strings
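@rule(id="world-writable", severity="medium", cwe="CWE-732")
def detect_world_writable_permissions():
    """Detects world-writable file permissions"""
    # A prefix pattern such as "0o7*" would also match 0o755, so match on the
    # last octal digit (the permissions for "others") instead.
    return calls("os.chmod", match_position={1: "0o*7"})

This matches:

os.chmod("file.txt", 0o777)  # Matched!
os.chmod("file.txt", 0o775)  # Not matched (others cannot write)
os.chmod("file.txt", 0o755)  # Not matched (not world-writable)

This pattern only covers modes whose last digit is 7. Other world-writable modes, such as 0o666, need their own patterns.

Match Any Value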
Use "*" to match any value (useful for detecting presence of an argument):
@rule(id="any-password-arg", severity="high")
def detect_password_argument():
    """Detects connect() calls that pass a password argument"""
    return calls("*.connect", match_name={"password": "*"})

This matches:

db.connect(password="secret")        # Matched!
db.connect(password="")              # Matched! (even empty string)
db.connect(password=get_password())  # Matched!
db.connect(user="admin")             # Not matched (no password arg)

Advanced Patterns
Combining Keyword and Positional
@rule(id="complex-pattern", severity="high")
def detect_complex_vulnerability():
    """Matches both positional and keyword arguments"""
    return calls(
        "subprocess.Popen",
        match_position={0: "*sh*"},  # Command contains "sh"
        match_name={"shell": True}   # AND shell=True
    )

This matches:

subprocess.Popen("/bin/sh", shell=True)     # Matched!
subprocess.Popen("bash -c ls", shell=True)  # Matched!
subprocess.Popen("/bin/sh", shell=False)    # Not matched
subprocess.Popen("python", shell=True)      # Not matched

Detect Missing Security Argument
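Sometimes the danger is a missing safeguard rather than a risky value, for example TLS certificate verification that has been switched off. Combine argument matchers with Or to cover each variant:

from codepathfinder import rule, calls, Or

@rule(id="requests-no-verify", severity="high", cwe="CWE-295")
def detect_ssl_verification_disabled():
    """Detects HTTP requests with SSL verification disabled or overridden"""
    return Or(
        # verify=False turns certificate verification off
        calls("requests.*", match_name={"verify": False}),
        # verify set to a value containing "cert", such as a custom bundle path
        calls("requests.*", match_name={"verify": "*cert*"})
    )

Real-World: Dangerous Pickle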
@rule(id="pickle-from-network", severity="critical", cwe="CWE-502")
def detect_pickle_deserialization():
    """Detects unpickling data from network sources"""
    return Or(
        # Direct pickle calls
        calls("pickle.loads"),
        calls("pickle.load"),
        # Pickle with specific protocols
        calls("pickle.Unpickler", match_name={"*": "*"}),
        # Django pickle sessions
        calls("*.get_session", match_name={"serializer": "*Pickle*"})
    )

Best Practices:
- Prefer match_name over match_position when possible (more readable)
- Use wildcards sparingly to avoid over-matching
- Test with both positive and negative examples
- Combine with OR logic to catch multiple dangerous patterns
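
One way to follow the advice on positive and negative examples is to keep a small table of call snippets with the expected outcome and check a matcher against it. The sketch below does this in plain Python with a stand-in predicate built on the standard ast module. It does not use codepathfinder's own test tooling, which the source does not describe, and has_keyword, CASES and failing_cases are hypothetical names.

```python
import ast


def has_keyword(call_source: str, name: str, value) -> bool:
    """Stand-in matcher: True if the call passes name=<literal equal to value>."""
    call = ast.parse(call_source, mode="eval").body
    for kw in call.keywords:
        if kw.arg == name:
            try:
                return ast.literal_eval(kw.value) == value
            except (ValueError, TypeError):
                return False  # non-literal value: treated as no match here
    return False


# Each case pairs a snippet with whether the rule is expected to match it.
CASES = [
    ("app.run(debug=True)", True),                   # positive
    ("app.run(host='0.0.0.0', debug=True)", True),   # positive, extra argument
    ("app.run(debug=False)", False),                 # negative, wrong value
    ("app.run()", False),                            # negative, argument absent
]


def failing_cases(predicate, cases):
    """Return the cases where the predicate disagrees with the expectation."""
    return [(src, expected) for src, expected in cases
            if predicate(src) != expected]
```

An empty result from failing_cases means the matcher agrees with every example. A naive check such as testing whether the text "debug" appears in the call would be reported as failing on app.run(debug=False), which is the kind of false positive that negative examples are meant to catch.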