Regex DoS in Flask

How Regex DoS Manifests in Flask

Regex Denial of Service (ReDoS) in Flask applications typically occurs when user-controlled input is matched against a backtracking-prone regex pattern without length limits or timeout controls. Python's built-in re module offers no timeout parameter, so a slow match runs until it finishes. Custom route converters, request parameter validation, and custom input processing are the common vectors for this vulnerability.

A common entry point is a path parameter that a view later feeds into a regex. Consider this Flask route:

from flask import Flask
app = Flask(__name__)

@app.route('/user/<username>')
def user_profile(username):
    return f'Profile for {username}'

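The route itself is safe: the default converter accepts any text that does not contain a slash. The risk arises when the view then matches username against a pattern that can backtrack exponentially. The classic example is a nested quantifier followed by an anchor, such as (a+)+$. Given an input like aaaaaaaaaaaaaaaaaaaaaaaa!, the match cannot succeed, and the engine tries every way of splitting the run of a characters between the inner and outer quantifier before giving up. Without the trailing anchor, re.match would succeed on the first attempt and no backtracking would occur.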
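To make the backtracking cost concrete, the following sketch times the nested-quantifier pattern ^(a+)+$ against failing payloads of increasing length. The helper name time_match and the chosen lengths are illustrative assumptions; absolute timings depend on the machine and Python version, but the time should grow by roughly a constant factor for every extra character.

```python
import re
import time

# Nested quantifier plus an end anchor: the classic exponential case
NESTED = re.compile(r'^(a+)+$')

def time_match(n):
    """Match n 'a' characters followed by '!' and return (result, seconds)."""
    payload = 'a' * n + '!'  # the trailing '!' guarantees the match fails
    start = time.perf_counter()
    result = NESTED.match(payload)
    return result, time.perf_counter() - start

# Keep n small: each additional character roughly doubles the work
for n in (12, 16, 20):
    result, elapsed = time_match(n)
    print(f'n={n:2d}  matched={result is not None}  {elapsed:.4f}s')
```

Pushing n much past the low twenties is not recommended on a shared machine, since the match can then run for minutes.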

Another Flask-specific manifestation appears in request validation. Developers often use regex to validate JSON payloads or form data:

import re
from flask import request, jsonify

@app.route('/submit', methods=['POST'])
def submit():
    data = request.get_json()
    email = data.get('email', '')
    
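    # Risky: no length limit on the input, and re.match has no timeout
    if not re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', email):
        return jsonify({'error': 'Invalid email'}), 400
    
    return jsonify({'status': 'success'})

This particular pattern contains no nested quantifiers, so an input like user@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!.com is rejected quickly rather than triggering catastrophic backtracking. The handler is still fragile: it accepts input of any length, and a later edit that wraps part of the pattern in a repeated group, in the style of (a+)+, would make the same endpoint exploitable with a long run of repeated characters.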

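Custom URL converters are another vector. Werkzeug builds each converter's regex into its routing rules, so a converter pattern with nested quantifiers can trigger catastrophic backtracking during route resolution, before any view code runs. A single character class like the one below is cheap on its own; the risk comes from more complex converter patterns: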

from flask import Flask
from werkzeug.routing import BaseConverter

class HexConverter(BaseConverter):
    regex = r'[0-9a-fA-F]+'  # Evaluated during routing; this simple class is cheap, nested quantifiers here would not be
    
app = Flask(__name__)
app.url_map.converters['hex'] = HexConverter

@app.route('/data/<hex:id>')
def data(id):
    return f'Data for hex ID: {id}'

The impact is severe: a single request with carefully crafted input can consume 100% CPU for seconds or minutes, effectively creating a denial of service condition that blocks legitimate users from accessing the application.

Flask-Specific Detection

Detecting Regex DoS in Flask applications requires examining both the application code and runtime behavior. The most effective approach combines static analysis with runtime monitoring.

Static code analysis should focus on these Flask-specific patterns:

import ast
import re

# Heuristic: a group that contains a quantifier and is itself quantified,
# e.g. (a+)+ or (.*)*. It misses some risky patterns and flags some safe ones.
NESTED_QUANTIFIER = re.compile(r'\([^()]*[+*][^()]*\)[+*]')
REGEX_FUNCS = {'match', 'search', 'fullmatch', 'compile'}

def find_vulnerable_regex_patterns(filepath):
    with open(filepath, 'r') as f:
        tree = ast.parse(f.read())
    
    vulnerable_patterns = []
    
    for node in ast.walk(tree):
        # Only consider calls of the form re.<func>(...)
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        target = node.func.value
        if not (isinstance(target, ast.Name) and target.id == 're'):
            continue
        if node.func.attr not in REGEX_FUNCS or not node.args:
            continue
        # Only string-literal patterns can be inspected statically
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            if has_vulnerable_regex(first.value):
                vulnerable_patterns.append({
                    'line': node.lineno,
                    'pattern': first.value,
                    'context': ast.unparse(node)  # Requires Python 3.9+
                })
    
    return vulnerable_patterns

def has_vulnerable_regex(pattern):
    # Inspect the pattern text itself for a quantified group
    # that already contains a quantifier
    return NESTED_QUANTIFIER.search(pattern) is not None

For runtime detection, monitoring request processing times can reveal ReDoS attempts. Flask applications should track request duration and flag suspicious patterns:

from flask import Flask, g, request
import time
import logging

app = Flask(__name__)
logger = logging.getLogger('werkzeug')
logger.setLevel(logging.ERROR)  # Reduce noise

@app.before_request
def start_timer():
    # perf_counter is monotonic, so clock adjustments cannot skew the result
    g.start_time = time.perf_counter()

@app.after_request
def log_request_time(response):
    # Runs only after the request finishes: this detects slow requests, it does not stop them
    start_time = g.get('start_time')
    if start_time is not None:
        duration = time.perf_counter() - start_time
        if duration > 1.0:  # 1 second threshold
            app.logger.warning(
                f'Long request: {request.path} took {duration:.2f}s '
                f'from {request.remote_addr}'
            )
    return response
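Request-level timing shows that a request was slow but not which pattern caused it. As a complementary sketch, individual regex calls can be wrapped so that slow matches are logged together with the pattern. The helper name timed_search, the logger name, and the 0.1 second threshold are assumptions chosen for illustration.

```python
import logging
import re
import time

log = logging.getLogger('regex_timing')  # hypothetical logger name

def timed_search(pattern, text, threshold=0.1):
    """Run re.search and log a warning if it takes longer than threshold seconds."""
    start = time.perf_counter()
    try:
        return re.search(pattern, text)
    finally:
        elapsed = time.perf_counter() - start
        if elapsed > threshold:
            # Log the pattern and input length, not the input itself
            log.warning('Slow regex %r on %d-char input: %.3fs',
                        pattern, len(text), elapsed)
```

Like the request timer, this only reports a slow match after it has completed, so it is a diagnostic aid rather than a defense.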

Automated scanning with middleBrick provides comprehensive detection by testing your Flask application's endpoints with known ReDoS payloads. The scanner tests route parameters, request bodies, and query parameters against patterns that trigger catastrophic backtracking, providing you with a security score and specific findings about vulnerable endpoints.

Flask-Specific Remediation

Remediating Regex DoS in Flask requires a defense-in-depth approach combining input validation, timeout controls, and safer regex patterns.

The first line of defense is input length validation before any regex processing:

import re
from flask import Flask, request, jsonify

app = Flask(__name__)

# Global request size limit; Flask rejects larger bodies with 413
app.config['MAX_CONTENT_LENGTH'] = 1 * 1024 * 1024  # 1MB

MAX_EMAIL_LENGTH = 254

# Bounded quantifiers and no nested groups keep matching time small
EMAIL_PATTERN = re.compile(
    r'^[a-zA-Z0-9._%+-]{1,64}@[a-zA-Z0-9.-]{1,253}\.[a-zA-Z]{2,63}$'
)

@app.route('/submit', methods=['POST'])
def submit():
    # Stricter per-endpoint limit on the request body
    if request.content_length and request.content_length > 100 * 1024:
        return jsonify({'error': 'Input too large'}), 413
    
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        return jsonify({'error': 'Expected a JSON object'}), 400
    email = data.get('email', '')
    
    if not is_valid_email(email):
        return jsonify({'error': 'Invalid email'}), 400
    
    return jsonify({'status': 'success'})

def is_valid_email(email):
    # Check type and length before the regex ever runs
    if not isinstance(email, str) or len(email) > MAX_EMAIL_LENGTH:
        return False
    return EMAIL_PATTERN.match(email) is not None
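Length limits cap the damage, but a pattern can often be rewritten so that the nested quantifier disappears entirely. The sketch below uses a hypothetical validator for words separated by single whitespace characters (it is not taken from the application above) and shows a flattened pattern intended to accept the same strings.

```python
import re

# Hypothetical validator: one or more words, each optionally followed by
# one whitespace character. The optional \s? inside a repeated group lets
# the engine split a run of word characters in many ways, so a failing
# match backtracks exponentially.
NESTED = re.compile(r'^(\w+\s?)+$')

# Flattened rewrite: every character can be consumed in only one way.
FLAT = re.compile(r'^\w+(?:\s\w+)*\s?$')

# Both patterns should agree on short inputs
samples = ['foo', 'foo bar', 'foo bar ', 'foo  bar', ' foo', '', 'foo!']
for text in samples:
    assert bool(NESTED.match(text)) == bool(FLAT.match(text))

# The flat pattern rejects a long hostile input quickly
print(FLAT.match('a' * 5000 + '!'))  # prints None
```

Running NESTED against the same 5000-character input is deliberately omitted, since it would not finish in any reasonable time.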

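For cases where a complex regex is necessary, add a timeout around the match. Python's re module has no timeout parameter, so this example uses a Unix alarm signal. It works only on Unix and only when the match runs in the main thread, so it does not suit threaded servers: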

import re
import signal
from flask import Flask, request, jsonify

app = Flask(__name__)

class RegexTimeout(Exception):
    pass


def regex_with_timeout(pattern, string, timeout=1):
    def handler(signum, frame):
        raise RegexTimeout()
    
    # Install the handler and start the timer (whole seconds only)
    previous = signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout)
    
    try:
        result = re.match(pattern, string)
    finally:
        signal.alarm(0)  # Disable the timer
        signal.signal(signal.SIGALRM, previous)  # Restore the previous handler
    
    return result

@app.route('/search')
def search():
    query = request.args.get('q', '')
    
    try:
        if regex_with_timeout(r'complex_pattern', query, timeout=2):
            return jsonify({'found': True})
        else:
            return jsonify({'found': False})
    except RegexTimeout:
        return jsonify({'error': 'Regex processing timeout'}), 503
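Where signals are not available (threaded workers, Windows), one portable alternative is to run the match in a short-lived child process and kill it if it overruns. The following is a minimal sketch using only the standard library; the function name search_with_timeout and the JSON hand-off over stdin are assumptions made for illustration, and starting a Python process for every match adds noticeable overhead.

```python
import json
import subprocess
import sys

# Script executed in the child process: reads [pattern, text] as JSON from
# stdin and prints true or false depending on whether re.search matches.
_CHILD_SCRIPT = (
    'import json, re, sys\n'
    'pattern, text = json.load(sys.stdin)\n'
    'print(json.dumps(re.search(pattern, text) is not None))\n'
)

def search_with_timeout(pattern, text, timeout=2.0):
    """Return True or False for a match, or raise TimeoutError on overrun."""
    try:
        proc = subprocess.run(
            [sys.executable, '-c', _CHILD_SCRIPT],
            input=json.dumps([pattern, text]),
            capture_output=True,
            text=True,
            timeout=timeout,  # subprocess.run kills the child when this expires
        )
    except subprocess.TimeoutExpired:
        raise TimeoutError('regex exceeded the time limit') from None
    if proc.returncode != 0:
        # For example, an invalid pattern raised re.error in the child
        raise RuntimeError(proc.stderr.strip())
    return json.loads(proc.stdout)
```

A view could call search_with_timeout(pattern, query) and map TimeoutError to a 503 response, in the same way the signal-based version handles RegexTimeout. A long-lived worker pool would reduce the per-call cost, at the price of more moving parts.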

For route parameter validation, use Flask's built-in converters or create safe custom converters:

from flask import Flask
from werkzeug.routing import BaseConverter, ValidationError

class SafeStringConverter(BaseConverter):
    # A single character class with no groups or nested quantifiers
    regex = r'[^/]+'
    
    def to_python(self, value):
        # Reject overlong values; a ValidationError makes the route not match
        if len(value) > 100:
            raise ValidationError()
        return value

app = Flask(__name__)
app.url_map.converters['safe'] = SafeStringConverter

@app.route('/user/<safe:username>')
def user_profile(username):
    return f'Profile for {username}'

Consider using alternative validation libraries that are resistant to ReDoS attacks, such as validators or pydantic with strict schemas:

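from flask import Flask, request, jsonify
# EmailStr requires the optional email-validator package
from pydantic import BaseModel, EmailStr, ValidationError

app = Flask(__name__)

class InputData(BaseModel):
    email: EmailStr
    name: str

@app.route('/submit', methods=['POST'])
def submit():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({'error': 'Expected a JSON object'}), 400
    try:
        data = InputData(**payload)
    except ValidationError as e:
        return jsonify({'error': str(e)}), 400
    return jsonify({'status': 'success'})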
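Where a full validation library is more than the application needs, a basic check can also be written with plain string operations, which cannot backtrack at all. The sketch below is a deliberately loose plausibility check, not a complete address validator; the function name looks_like_email and the specific limits are assumptions chosen for illustration.

```python
import string

# Characters allowed in a domain label for this simplified check
_LABEL_CHARS = set(string.ascii_letters + string.digits + '-')

def looks_like_email(value, max_length=254):
    """Loose plausibility check using only linear-time string operations."""
    if not isinstance(value, str) or not (3 <= len(value) <= max_length):
        return False
    if any(ch.isspace() for ch in value):
        return False
    # Split on the last '@' so the domain part is unambiguous
    local, sep, domain = value.rpartition('@')
    if not sep or not local or len(local) > 64:
        return False
    labels = domain.split('.')
    if len(labels) < 2:
        return False
    # Every label must be non-empty, at most 63 chars, letters/digits/hyphen only
    return all(
        label and len(label) <= 63 and set(label) <= _LABEL_CHARS
        for label in labels
    )
```

For example, looks_like_email('user@example.com') returns True, while looks_like_email('user@aaaaaaaa!.com') returns False because of the '!' in the domain. Every step here is a single pass over the input, so the running time stays proportional to its length.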

Related CWEs: Input Validation

CWE ID    Name                          Severity
CWE-20    Improper Input Validation     HIGH
CWE-22    Path Traversal                HIGH
CWE-74    Injection                     CRITICAL
CWE-77    Command Injection             CRITICAL
CWE-78    OS Command Injection          CRITICAL
CWE-79    Cross-site Scripting (XSS)    HIGH
CWE-89    SQL Injection                 CRITICAL
CWE-90    LDAP Injection                HIGH
CWE-91    XML Injection                 HIGH
CWE-94    Code Injection                CRITICAL

Frequently Asked Questions

How can I test my Flask application for Regex DoS vulnerabilities?
You can test your Flask application with automated security scanners like middleBrick, which tests for ReDoS vulnerabilities by sending crafted payloads to your endpoints. You can also test manually: send inputs with long runs of repeated characters to parameters that are validated by regex, and watch CPU usage and response time. Look for nested quantifiers such as (a+)+ in your code and test them with inputs that almost match, like aaaaaaaaaaaaaaaaaaaaaaaa!.

What's the difference between regex validation and safer alternatives in Flask?
Regex validation in Flask can be vulnerable to ReDoS if patterns contain nested quantifiers or overlapping alternatives that let the engine match the same text in many ways. Safer alternatives include Flask's built-in converters, third-party validation libraries like pydantic or marshmallow that provide maintained validators for common types, and simple string operations for basic validation. For email validation specifically, consider a dedicated library such as email-validator, which is designed to be safe and comprehensive.