API Rate Abuse on Kubernetes
How API Rate Abuse Manifests in Kubernetes
API rate abuse in Kubernetes manifests through several attack vectors that exploit the platform's API server and controller patterns. The API server throttles requests through its API Priority and Fairness machinery, but misconfigured services inside the cluster frequently create weaknesses that attackers can chain together into amplification attacks.
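For the control plane itself, that throttling is tunable: API Priority and Fairness (the flowcontrol.apiserver.k8s.io API, v1 on recent clusters, v1beta3 on older ones) can isolate a noisy client into its own small priority level so it cannot starve other traffic. The sketch below is illustrative; the batch-jobs service account and the concurrency share are assumptions, not values taken from a real cluster:

apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: restricted-batch               # hypothetical name
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 5        # small slice of API server concurrency
    limitResponse:
      type: Reject                     # shed excess requests instead of queuing them
---
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: restrict-batch-client          # hypothetical name
spec:
  priorityLevelConfiguration:
    name: restricted-batch
  matchingPrecedence: 500
  distinguisherMethod:
    type: ByUser
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: batch-jobs               # assumed noisy client
        namespace: default
    resourceRules:
    - verbs: ["*"]
      apiGroups: ["*"]
      resources: ["*"]
      clusterScope: true
      namespaces: ["*"]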
One common pattern involves Horizontal Pod Autoscaler (HPA) abuse. When HPA controllers are configured without proper rate limits on their scaling decisions, attackers can trigger rapid scaling cycles by repeatedly hitting the application's endpoints. This creates a feedback loop where each attack request causes the HPA to scale up, consuming more cluster resources:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vulnerable-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vulnerable-app
  minReplicas: 1
  maxReplicas: 100
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 0   # No stabilization - enables rapid cycling
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 1              # Scale by 100% every second

The above configuration allows an attacker to cause the deployment to scale from 1 to 100 pods in seconds by sending rapid requests, then scale back down, consuming CPU and memory throughout the cycle.
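To illustrate how little it takes to trigger that cycle, here is a minimal load-generator sketch packaged as a Kubernetes Job for use in a test cluster. The image, target URL, and request count are assumptions for illustration only:

apiVersion: batch/v1
kind: Job
metadata:
  name: burst-load-test              # hypothetical test job
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: load
        image: busybox:1.36          # assumed image
        command: ["/bin/sh", "-c"]
        args:
        # Fire 5000 requests in a tight loop at the target service,
        # enough to push CPU-based HPA metrics past their threshold.
        - |
          i=0
          while [ "$i" -lt 5000 ]; do
            wget -q -O /dev/null http://vulnerable-app.default.svc.cluster.local/api || true
            i=$((i+1))
          done

Against the HPA above, scale-up begins almost immediately because there is no stabilization window to absorb the spike.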
Another Kubernetes-specific manifestation occurs through admission webhook abuse. Admission webhooks sit in the path of every matching API request, and a poorly configured webhook becomes a throughput bottleneck. An attacker who can create or update matching resources can flood the API server with requests that each block on the webhook, leaving the API server unresponsive to legitimate requests:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: vulnerable-webhook
webhooks:
- name: validate.example.com
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["*"]
    apiVersions: ["*"]
    resources: ["*"]
  clientConfig:
    service:
      name: webhook-service
      namespace: default
      path: /validate
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 30    # Long timeout enables DoS
  failurePolicy: Fail   # Blocks all requests on failure

Without proper timeout configuration and circuit breaking, this webhook becomes a single point of failure that an attacker can exploit through rate abuse.
Service mesh configurations in Kubernetes also present unique rate abuse opportunities. Meshes such as Istio and Linkerd can enforce their own rate limiting policies, but those policies are easily bypassed if they are not applied consistently across every ingress and egress point. An attacker who discovers a service without mesh protection can abuse it even while the rest of the cluster's traffic remains protected:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: unprotected-service
spec:
  hosts:
  - unprotected-service
  http:
  - match:
    - uri:
        prefix: /api/vulnerable
    route:
    - destination:
        host: vulnerable-service
    # Missing rate limit configuration

This configuration allows unrestricted access to the /api/vulnerable endpoints, while other services in the mesh might have proper rate limiting applied.
Kubernetes-Specific Detection
Detecting API rate abuse in Kubernetes requires monitoring both the control plane and application layers. Kubernetes provides several observability mechanisms that can reveal rate abuse patterns.
API server audit logs are the first line of detection. By enabling audit logging with appropriate levels, you can track request patterns that indicate rate abuse:
apiVersion: audit.k8s.io/v1
kind: Policy
metadata:
  name: rate-abuse-detection
omitStages: ["RequestReceived"]
rules:
# Skip noisy system namespaces
- level: None
  namespaces: ["kube-system", "kube-public"]
# Record request bodies for the resources most often abused
- level: Request
  verbs: ["get", "list", "create", "update", "patch", "delete"]
  resources:
  - group: ""
    resources: ["pods", "services"]
  - group: "apps"
    resources: ["deployments"]
# Record everything else at metadata level
- level: Metadata

These logs can be analyzed for burst patterns using tools like Falco or custom log processors. Look for request rates that exceed normal thresholds or unusual API endpoint access patterns.
Horizontal Pod Autoscaler metrics provide another detection vector. Monitoring HPA scaling events can reveal abuse:
apiVersion: v1
kind: ConfigMap
metadata:
  name: hpa-monitoring-config
data:
  config.yaml: |
    hpa:
      alert_threshold: 10     # Alert if scaling more than 10 times in 5 minutes
      burst_duration: 300
      max_replicas_alert: 50

Prometheus metrics from HPA controllers can trigger alerts when scaling occurs too rapidly or when deployments hit their maximum replica counts too frequently.
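If you run kube-state-metrics, its HPA series expose this signal directly. The rules below are a sketch; the thresholds mirror the ConfigMap values above and are assumptions to tune per workload:

groups:
- name: hpa-rate-abuse
  rules:
  - alert: HPAAtMaxReplicas
    # An HPA pinned at its configured maximum is the usual end state of request flooding.
    expr: kube_horizontalpodautoscaler_status_current_replicas >= kube_horizontalpodautoscaler_spec_max_replicas
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "HPA {{ $labels.horizontalpodautoscaler }} pinned at max replicas"
  - alert: HPAReplicaChurn
    # More than 10 replica-count changes in 5 minutes indicates rapid scaling cycles.
    expr: changes(kube_horizontalpodautoscaler_status_current_replicas[5m]) > 10
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "HPA {{ $labels.horizontalpodautoscaler }} is scaling rapidly"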
Service mesh telemetry offers rate abuse detection across every meshed service. Istio's Telemetry API (the successor to the deprecated Mixer component) emits standard request metrics that already carry source workload, destination workload, and response code labels, and it lets you add further dimensions:
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: rate-limit-telemetry
  namespace: istio-system     # applying it to the root namespace makes it mesh-wide
spec:
  metrics:
  - providers:
    - name: prometheus
    overrides:
    - match:
        metric: REQUEST_COUNT
      tagOverrides:
        request_host:
          value: "request.host"              # add the requested host as an extra label
        destination_port:
          value: "string(destination.port)"

These metrics can be analyzed for abnormal request patterns using tools like Kiali or custom dashboards.
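A simple alert on Istio's standard request metric can then flag a single source workload flooding a destination. This is a sketch; the 100 requests-per-second threshold is an assumption to adjust per service:

groups:
- name: mesh-rate-abuse
  rules:
  - alert: MeshClientFlood
    # istio_requests_total is the standard Istio request-count metric.
    expr: sum by (source_workload, destination_workload) (rate(istio_requests_total[5m])) > 100
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.source_workload }} is flooding {{ $labels.destination_workload }}"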
middleBrick scanning provides automated detection of rate abuse vulnerabilities without requiring cluster access. The scanner tests unauthenticated endpoints for rate limiting weaknesses and identifies services that lack proper protection:
middlebrick scan https://api.kubernetes.example.com \
  --output json \
  --test-rate-abuse \
  --test-hpa-configurations

The scan tests for common Kubernetes rate abuse patterns, including HPA misconfigurations, admission controller vulnerabilities, and service mesh bypass opportunities. Results include severity ratings and specific remediation guidance.
Kubernetes-Specific Remediation
Remediating API rate abuse in Kubernetes requires a layered approach using native Kubernetes features and best practices. The first layer is proper HPA configuration with stabilization windows and reasonable limits:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secured-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secured-app
  minReplicas: 2
  maxReplicas: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # 5 minute stabilization
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

This configuration prevents rapid scaling cycles by requiring sustained load before scaling decisions are made and by limiting the rate of scale operations.
Admission controller security requires proper timeout configuration and circuit breaking:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: secured-webhook
webhooks:
- name: validate.example.com
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["apps"]
    apiVersions: ["v1"]
    resources: ["deployments"]
  clientConfig:
    service:
      name: webhook-service
      namespace: default
      path: /validate
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 5        # Reduced from 30
  failurePolicy: Ignore    # Fail open instead of blocking
  namespaceSelector:
    matchLabels:
      admission-webhook: enabled   # Only watch specific namespaces

Shorter timeouts and fail-open policies prevent admission controllers from becoming DoS vectors.
Service mesh rate limiting provides application-layer protection. Istio does not expose rate limits on the VirtualService resource itself; the supported approach is to attach Envoy's local rate limit filter to the workload's sidecar with an EnvoyFilter:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: protected-service-ratelimit
  namespace: default
spec:
  workloadSelector:
    labels:
      app: protected-service
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.local_ratelimit
        typed_config:
          "@type": type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
          value:
            stat_prefix: http_local_rate_limiter
            token_bucket:
              max_tokens: 100        # burst capacity
              tokens_per_fill: 100
              fill_interval: 60s     # refill 100 tokens per minute
            filter_enabled:
              runtime_key: local_rate_limit_enabled
              default_value:
                numerator: 100
                denominator: HUNDRED
            filter_enforced:
              runtime_key: local_rate_limit_enforced
              default_value:
                numerator: 100
                denominator: HUNDRED

This configuration limits each protected-service pod to 100 requests per minute, rejecting excess requests at the sidecar before they reach the application while traffic within the limit passes through normally.
Network policies can limit traffic to API servers and critical services:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-protection
spec:
  podSelector:
    matchLabels:
      component: apiserver
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          access-apiserver: "true"
    ports:
    - protocol: TCP
      port: 6443

This network policy restricts API server access to workloads in labeled namespaces, reducing the attack surface for rate abuse on clusters where the API server is reachable through the pod network and the CNI enforces policy on that path.
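The same idea applies to application namespaces. The policy below is a sketch, assuming the mesh ingress gateway runs in the istio-system namespace and the workload carries an app: protected-service label; it prevents attackers from bypassing the mesh's rate limits by reaching pods directly:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: protected-service-ingress-only    # hypothetical policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: protected-service
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Only the namespace hosting the mesh ingress gateway may connect,
    # so the sidecar's rate limit cannot be sidestepped.
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: istio-system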
Finally, implement proper monitoring and alerting for rate abuse patterns:
apiVersion: v1
kind: ConfigMap
metadata:
  name: rate-abuse-alerts
data:
  alerts.yaml: |
    groups:
    - name: rate_abuse
      rules:
      - alert: RapidScaling
        expr: increase(hpa_scaling_events_total[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "HPA scaling too rapidly"
          description: "HPA {{ $labels.name }} scaled {{ $value }} times in 5 minutes"
      - alert: HighRequestRate
        expr: rate(requests_total[5m]) > 1000
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "High request rate detected"
          description: "Service {{ $labels.service }} receiving {{ $value }} requests/second"

These alerts trigger when scaling events or request rates exceed normal thresholds, enabling rapid response to rate abuse attempts.