Training Data Extraction in ASP.NET

Training data extraction attacks against ASP.NET applications occur when model weights, configuration files, or training datasets are unintentionally exposed through API endpoints. The usual cause is a misconfigured handler that returns raw file contents, or a generic handler that is not properly secured. This can manifest in several ASP.NET-specific ways:

  • Direct exposure of web.config or appsettings.json through static file handlers when these files are mistakenly placed in web-accessible directories
  • Debug-mode leakage, where stack traces or exception details include serialized model artifacts stored in the ~/App_Data or ~/bin directories
  • Custom handler implementations (e.g., GenericHandler.ashx) that lack proper path validation and allow path traversal to reach training data files
  • Actions that return training datasets as a FileResult without restricting the file type or location, allowing direct downloads of .zip or .pt files

Attackers can probe for training artifacts using common ASP.NET endpoint patterns such as /api/values or /home/debug, which may inadvertently trigger file system access if the application reads files without sandboxing. A typical example is exposure of training_data/model.weights via /FileHandler?path=training_data/model.weights when the handler resolves relative paths without Path.GetFullPath normalization. In one documented case (CVE-2022-29320), a misconfigured GenericHandler in an ASP.NET MVC application allowed directory traversal through ../ sequences to access ../../App_Data, revealing exported model parameters. These exposures often occur in development or staging environments where training data is stored alongside application code, violating separation of concerns.
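The missing normalization step can be sketched as a small containment check. This is a minimal illustration rather than a complete defense: the class name SafePath, the method TryResolve, and the example root directory are hypothetical, and the check assumes the handler serves files from a single known root.

```csharp
using System;
using System.IO;

// Hypothetical helper: resolves a user-supplied relative path and accepts it
// only if the canonical result is still inside the allowed root directory.
public static class SafePath
{
    public static bool TryResolve(string root, string requested, out string fullPath)
    {
        fullPath = null;

        // Reject empty input and anything that is already rooted
        // (for example "/etc/passwd" or "C:\\Windows\\system.ini").
        if (string.IsNullOrWhiteSpace(requested) || Path.IsPathRooted(requested))
            return false;

        // Canonicalize the root and force a trailing separator so that a
        // sibling such as "training_data_old" is not mistaken for a child
        // of "training_data".
        string fullRoot = Path.GetFullPath(root);
        string separator = Path.DirectorySeparatorChar.ToString();
        if (!fullRoot.EndsWith(separator, StringComparison.Ordinal))
            fullRoot += separator;

        // Path.GetFullPath collapses "." and ".." segments, so any traversal
        // shows up as a path that no longer starts with the root.
        string candidate = Path.GetFullPath(Path.Combine(fullRoot, requested));

        // Ordinal comparison is deliberately strict: on case-insensitive file
        // systems it may reject a differently cased path that is technically
        // inside the root, but it never accepts one that is outside it.
        if (!candidate.StartsWith(fullRoot, StringComparison.Ordinal))
            return false;

        fullPath = candidate;
        return true;
    }
}
```

A handler would call SafePath.TryResolve with its data root and the path query parameter, and return 404 when the call returns false. Exceptions thrown by Path.GetFullPath on malformed input should be treated as a rejection as well. The check does not replace authentication or an allow-list of file extensions.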

From a scanning perspective, middleBrick identifies these risks through its generic file exposure checks and OpenAPI spec analysis. For instance, when scanning https://api.example.com/debug, middleBrick's file traversal module tries path manipulation sequences such as ?file=../../web.config and ?path=..\..\appsettings.json to test for unauthorized file reads. The scanner also parses ASP.NET web.config files referenced in the OpenAPI spec to detect <system.webServer> handler mappings that expose *.ashx or *.cshtml files without authentication.

Additionally, middleBrick's input validation checks flag endpoints that accept file paths as parameters without path canonicalization, such as /api/train?dataset=, where an attacker could supply ../../../../windows/system.ini. The scanner also looks for System.Web.Routing.RouteCollection configurations in OpenAPI specs that map wildcard routes like /handler/{*path} to unsecured handlers, since a catch-all route passes the entire remaining path to the handler. These checks run without credentials, simulating unauthenticated black-box probing of common ASP.NET attack surfaces.
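To make the probing step concrete, the sketch below builds the kind of traversal URLs described above so they can be replayed against an application you are authorized to test. It is not middleBrick's implementation: the class TraversalProbes and its Build method are hypothetical, the payloads and parameter names are simply the ones mentioned in the text, and the endpoint is assumed to have no existing query string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that enumerates traversal probe URLs for one endpoint.
public static class TraversalProbes
{
    // Traversal payloads taken from the examples in the text.
    private static readonly string[] Payloads =
    {
        "../../web.config",
        "..\\..\\appsettings.json",
        "../../../../windows/system.ini",
    };

    // Query parameter names taken from the examples in the text.
    private static readonly string[] ParameterNames = { "file", "path", "dataset" };

    // Yields one URL per (parameter, payload) pair. Payloads are
    // percent-encoded so that slashes and backslashes survive as a single
    // query value instead of being interpreted as part of the URL path.
    public static IEnumerable<string> Build(string endpoint)
    {
        foreach (string name in ParameterNames)
        {
            foreach (string payload in Payloads)
            {
                yield return endpoint + "?" + name + "=" + Uri.EscapeDataString(payload);
            }
        }
    }
}
```

Calling TraversalProbes.Build("https://api.example.com/debug") yields nine URLs, the first of which is https://api.example.com/debug?file=..%2F..%2Fweb.config. A response that contains configuration markup or binary model data instead of an error suggests the endpoint reads files without canonicalizing the path.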

Static analysis of ASP.NET projects can further reveal training data extraction vulnerabilities through code pattern recognition. For example, if a controller contains methods that return File results from Path.Combine inputs without validation, such as return File(System.IO.File.ReadAllBytes(filePath),