Unicode Normalization in Gin
How Unicode Normalization Manifests in Gin
Unicode normalization attacks in Gin applications typically exploit the framework's handling of HTTP request parameters and path variables. Gin's c.Param(), c.Query(), and c.PostForm() methods don't automatically normalize Unicode input, creating a window for attackers to bypass security controls.
A common manifestation is path traversal via Unicode normalization. An attacker might send a request to /api/v1/users/%C3%A9ric (precomposed e-acute) versus /api/v1/users/%65%2B%CC%81ric (e + combining acute accent). These represent the same visual character but different byte sequences. If your Gin handler uses these parameters for file access or database queries without normalization, you might have inconsistent behavior:
func getUser(c *gin.Context) {
userID := c.Param("id")
// Attacker can craft userID with different normalization forms
// to bypass authorization checks or access unintended resources
user, err := db.GetUserByID(userID)
// ...
}Authentication bypasses are another critical vector. If your Gin application uses username parameters for lookup, an attacker can craft usernames that normalize to valid accounts. For example, admin versus ädmin (a with diaeresis) might hash differently or match different database entries depending on your collation settings.
Rate limiting and API key validation are also vulnerable. A Gin middleware that tracks requests by API key without normalization allows the same logical key in different forms to count as separate entities, potentially bypassing rate limits entirely.
Gin-Specific Detection
Detecting Unicode normalization issues in Gin requires both static analysis and runtime testing. Start by examining your handler functions for direct parameter usage without validation. Look for patterns like:
// Vulnerable: direct parameter usage without normalization
func getDocument(c *gin.Context) {
docID := c.Param("id")
// No normalization before database lookup
doc, err := storage.GetDocument(docID)
// ...
}middleBrick's black-box scanning can identify these vulnerabilities by sending requests with different Unicode normalization forms to the same endpoint and checking for inconsistent responses. The scanner tests both NFC (composed) and NFD (decomposed) forms, as well as NFKC and NFKD variants, to detect if your API treats logically identical inputs differently.
For comprehensive detection, implement logging that captures the raw request bytes and normalized forms. Compare how different normalization forms affect your application's behavior:
func normalizeInput(input string) string {
return strings.ToLower(
string(unicode.Normalize(unicode.NFC, []rune(input))),
)
}
// Test with different forms
func testNormalization() {
testCases := map[string]string{
"NFC": "éric",
"NFD": "éric",
"NFKC": "A", // Full-width A
"NFKD": "A", // Compatibility decomposition
}
for form, input := range testCases {
normalized := normalizeInput(input)
// Check if different forms produce same normalized output
// If not, you have a normalization vulnerability
}
}middleBrick's API security scanning automatically tests these normalization vectors across all 12 security categories, including authentication bypasses and authorization flaws that might result from inconsistent Unicode handling.
Gin-Specific Remediation
The most effective remediation in Gin is to normalize all user input at the earliest possible point in your request handling pipeline. Create a middleware that normalizes query parameters, path variables, and request bodies before they reach your handlers:
func unicodeNormalizationMiddleware() gin.HandlerFunc {
return func(c *gin.Context) {
// Normalize path parameters
for param := range c.Params {
c.Params[param].Value = normalizeString(c.Params[param].Value)
}
// Normalize query parameters
query := c.Request.URL.Query()
for key, values := range query {
for i, val := range values {
query.Set(key, normalizeString(val))
}
}
c.Request.URL.RawQuery = query.Encode()
// Normalize JSON body if present
if c.ContentType() == "application/json" {
var body map[string]interface{}
if err := c.ShouldBindJSON(&body); err == nil {
normalizedBody := normalizeMap(body)
// Re-encode normalized body
jsonBody, _ := json.Marshal(normalizedBody)
c.Request.Body = io.NopCloser(bytes.NewBuffer(jsonBody))
c.Request.ContentLength = int64(len(jsonBody))
c.Request = c.Request.WithContext(context.WithValue(
c.Request.Context(), "normalizedBody", normalizedBody))
}
}
c.Next()
}
}
func normalizeString(s string) string {
return string(unicode.Normalize(unicode.NFC, []rune(s)))
}
func normalizeMap(m map[string]interface{}) map[string]interface{} {
normalized := make(map[string]interface{})
for k, v := range m {
normalized[normalizeString(k)] = normalizeValue(v)
}
return normalized
}
func normalizeValue(v interface{}) interface{} {
switch val := v.(type) {
case string:
return normalizeString(val)
case map[string]interface{}:
return normalizeMap(val)
case []interface{}:
for i, item := range val {
val[i] = normalizeValue(item)
}
return val
default:
return v
}
}
// Use in main.go
func main() {
r := gin.New()
r.Use(unicodeNormalizationMiddleware())
// ... routes
}For database queries, ensure your ORM or database client uses consistent collation. With GORM, you can set the collation at the connection level or use binary collation for exact byte matching when needed:
dsn := "user:password@/dbname?charset=utf8mb4&collation=utf8mb4_unicode_ci"
db, err := gorm.Open(mysql.Open(dsn), &gorm.Config{})
// Or for exact byte matching when required
db.Exec("SET NAMES utf8mb4 COLLATE utf8mb4_bin")Always validate and sanitize input after normalization. Use libraries like github.com/go-playground/validator to enforce expected formats, and consider implementing a whitelist approach for known-good characters in critical parameters.