Usage & Rate Limit Policies

Available on the Enterprise Self Hosting plan only. Requires Gateway version 1.17.0 or higher.

Policies allow you to create fine-grained controls over API usage and rate limits at the workspace level. You can define conditions to target specific requests and group usage/rate limits by various dimensions.

Policy Types

1. Usage Limit Policies

Control spend (cost) or token consumption with configurable limits and periodic resets.

2. Rate Limit Policies

Control request or token throughput with per-minute (rpm), per-hour (rph), or per-day (rpd) limits.

Policy Structure

Conditions

Conditions determine which requests a policy applies to. All conditions must match (AND logic). Each condition supports:

Field	Type	Description
`key`	string	The dimension to match against
`value`	string | string[]	Value(s) to match (OR logic for arrays). Use `"*"` for wildcard.
`excludes`	string | string[]	Value(s) to exclude from matching
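
The matching rules above can be sketched in a few lines of Python. This is only an illustration, not the Gateway's implementation: the helper names `condition_matches` and `policy_applies` and the flat `request` dictionary are hypothetical, and treating a missing dimension as a non-match is an assumption.

```python
from typing import Any, Dict, List, Union

Values = Union[str, List[str]]


def _as_list(v: Values) -> List[str]:
    """Normalise a single string or a list of strings to a list."""
    return [v] if isinstance(v, str) else list(v)


def condition_matches(condition: Dict[str, Any], request: Dict[str, str]) -> bool:
    """Check one condition against a request's dimensions (hypothetical helper)."""
    actual = request.get(condition["key"])
    if actual is None:
        return False  # assumption: a missing dimension never matches
    if actual in _as_list(condition.get("excludes", [])):
        return False  # excluded values are rejected before anything else
    allowed = _as_list(condition["value"])
    # OR logic within one condition: the wildcard or any listed value is enough.
    return "*" in allowed or actual in allowed


def policy_applies(conditions: List[Dict[str, Any]], request: Dict[str, str]) -> bool:
    """AND logic across conditions: every condition must match."""
    return all(condition_matches(c, request) for c in conditions)
```

For example, `policy_applies([{"key": "api_key", "value": "*", "excludes": ["key-1"]}], {"api_key": "key-1"})` evaluates to `False` under these assumptions, because the excluded value is checked first.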

Supported Condition Keys

Key	Description	Example Value	Version
`api_key`	Match by API key ID	`"uuid_of_api_key"`	1.17.0+
`metadata.*`	Match by request metadata	`"metadata._user"` with value `"user123"`	1.17.0+
`workspace_id`	Match by workspace	`"workspace id"`	1.17.0+
`virtual_key`	Match by virtual key slug	`"virtual key slug"`	2.0.0+
`provider`	Match by provider	`"openai"`, `"anthropic"`, `"azure-openai"`	2.0.0+
`config`	Match by config slug	`"config slug"`	2.0.0+
`prompt`	Match by prompt slug	`"prompt slug"`	2.0.0+
`model`	Match by model (with wildcard support)	`"@openai/gpt-4o"`, `"@anthropic/*"`	2.0.0+
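
To make the `model` wildcard concrete, here is a small sketch of how a pattern such as `"@anthropic/*"` could be compared with a model identifier. The function `model_matches` is hypothetical, and the sketch assumes the wildcard only appears as a trailing `/*` (or as a bare `"*"`); the Gateway may support other forms.

```python
def model_matches(pattern: str, model: str) -> bool:
    """Hypothetical matcher for '@provider/model' patterns with a trailing '*'."""
    if pattern == "*":
        return True
    if pattern.endswith("/*"):
        # "@anthropic/*" matches any model under the "@anthropic/" prefix.
        return model.startswith(pattern[:-1])
    # Without a wildcard, the comparison is an exact string match.
    return pattern == model
```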

Group By

Group by determines how usage/limits are bucketed. Each unique combination of group_by values gets its own counter. Example:

[
  {
    "key": "api_key"
  },
  {
    "key": "metadata._user"
  }
]
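
One way to picture the bucketing is a counter keyed by the tuple of group_by values. The sketch below is illustrative only: `bucket_key` is a hypothetical helper, the sample requests are made up, and recording a missing dimension as an empty string is an assumption.

```python
from collections import Counter
from typing import Dict, List, Tuple


def bucket_key(group_by: List[Dict[str, str]], request: Dict[str, str]) -> Tuple[str, ...]:
    """Build the counter key from the request's value for each group_by dimension."""
    # Assumption: a missing dimension is recorded as an empty string.
    return tuple(request.get(g["key"], "") for g in group_by)


group_by = [{"key": "api_key"}, {"key": "metadata._user"}]
counters: Counter = Counter()

# Three made-up requests: two from alice and one from bob, all on the same API key.
for req in [
    {"api_key": "key-1", "metadata._user": "alice"},
    {"api_key": "key-1", "metadata._user": "bob"},
    {"api_key": "key-1", "metadata._user": "alice"},
]:
    counters[bucket_key(group_by, req)] += 1
```

After the loop, `counters` holds two separate buckets, `("key-1", "alice")` and `("key-1", "bob")`, each counted independently.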

Authentication

All policy endpoints require authentication using:

API Key: Include it in the `x-portkey-api-key` header.

Permissions

Policies require the following RBAC permissions:

policies:create - Create policies
policies:read - Read policies
policies:update - Update policies
policies:delete - Delete policies
policies:list - List policies

Base URL

All policy endpoints are under:

/v1/policies
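
As a rough sketch of how the header and base path fit together, the snippet below prepares (but does not send) a request using Python's standard library. The host name is a placeholder for your own deployment, `build_policy_request` is a hypothetical helper, and the `path` argument stands in for whatever endpoint-specific suffix the API reference defines; none of these come from this page.

```python
import json
import urllib.request

# Assumption: placeholder host for a self-hosted deployment; replace with your own.
BASE_URL = "https://portkey.example.internal/v1/policies"


def build_policy_request(api_key: str, body: dict, path: str = "") -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the x-portkey-api-key header."""
    return urllib.request.Request(
        url=BASE_URL + path,  # path: endpoint-specific suffix, not specified here
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-portkey-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The prepared request can then be passed to `urllib.request.urlopen` once the host and path match your environment.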

Usage Limits Policies

Usage limits policies allow you to set maximum usage (cost or tokens) that can be consumed over a period. When the limit is reached, requests will be blocked until the limit resets.

When a usage limit is exceeded, Portkey returns a 412 Precondition Failed HTTP status code.

Policy Types

cost: Limit based on total cost (in dollars)
tokens: Limit based on total tokens consumed

Parameters

`credit_limit` (required)

The maximum usage limit that can be consumed before requests are blocked.

Type: Number (integer or float)
Minimum value:
- For cost type: 1 (represents $1.00)
- For tokens type: 100 tokens
Units:
- For cost type: Value is in USD (dollars)
- For tokens type: Value is in tokens
Behaviour: When the credit limit is reached, all matching requests will be blocked until the limit resets (if periodic_reset is configured)

`alert_threshold` (optional)

An optional threshold that triggers notifications before the credit limit is reached.

Type: Number (integer or float)
Minimum value: 1
Units:
- For cost type: Value is in USD (dollars)
- For tokens type: Value is in tokens
Validation: Must be less than credit_limit if provided
Behaviour:
- When usage reaches this threshold, email notifications are sent to configured recipients
- The API key continues to function normally until the full credit_limit is reached
- Useful for proactive monitoring and budget management
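
The relationship between `alert_threshold` and `credit_limit` can be summarised as a three-state check. This is a sketch under assumptions: `usage_state` is a hypothetical function, and both limits are treated as inclusive ("reaches" is read as greater than or equal).

```python
from typing import Optional


def usage_state(usage: float, credit_limit: float,
                alert_threshold: Optional[float] = None) -> str:
    """Classify current usage as "ok", "alert", or "blocked" (illustrative only)."""
    if usage >= credit_limit:
        return "blocked"  # limit reached: matching requests are rejected
    if alert_threshold is not None and usage >= alert_threshold:
        return "alert"    # notifications go out, requests still succeed
    return "ok"
```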

`periodic_reset` (optional)

Configures automatic reset of the usage limit at regular intervals.

Type: String (enum)
Valid values:
- "weekly" - Budget limits automatically reset every week
- "monthly" - Budget limits automatically reset every month
- Omitted/not provided - No periodic reset (limit applies until exhausted)
Reset timing:
- Weekly: Resets occur every Monday at 12:00 AM UTC
- Monthly: Resets occur on the 1st calendar day of each month at 12:00 AM UTC
Behaviour: When a reset occurs, the usage counter resets to zero and the limit becomes available again
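
The reset timing above can be worked out with standard date arithmetic. The following sketch computes the next reset instant for a given moment; `next_reset` is a hypothetical helper, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime, periodic_reset: str) -> datetime:
    """Return the next reset instant in UTC for "weekly" or "monthly"."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if periodic_reset == "weekly":
        # weekday(): Monday is 0. Always move forward to the next Monday 00:00 UTC.
        return midnight + timedelta(days=7 - midnight.weekday())
    if periodic_reset == "monthly":
        # Roll over to January of the following year when the current month is December.
        year, month = (now.year + 1, 1) if now.month == 12 else (now.year, now.month + 1)
        return midnight.replace(year=year, month=month, day=1)
    raise ValueError(f"unsupported periodic_reset: {periodic_reset!r}")
```

For instance, a weekly policy evaluated on Wednesday 15 January 2025 at 10:30 UTC would next reset on Monday 20 January 2025 at 00:00 UTC.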

Validation Rules

Conditions: Each condition must have key and value fields.
Group By: Each group must have a key field.
Valid Keys: For both conditions and group_by, valid keys include api_key, workspace_id, virtual_key, provider, config, prompt, model, or any key starting with the `metadata.` prefix.
Alert Threshold: Must be less than credit_limit if provided.
Workspace: Workspace ID is required (can be provided via API key or request body).
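
A client-side pre-check for these rules might look like the sketch below. It is not the server's validator: `validate_usage_policy` and the error strings are hypothetical, the minimums are taken from the `credit_limit` parameter description, and the workspace rule is left out because it depends on how the request is authenticated.

```python
from typing import Any, Dict, List

BASE_KEYS = {"api_key", "workspace_id", "virtual_key", "provider", "config", "prompt", "model"}
MIN_CREDIT_LIMIT = {"cost": 1, "tokens": 100}


def is_valid_key(key: Any) -> bool:
    """A key is valid if it is a known dimension or uses the metadata. prefix."""
    return isinstance(key, str) and (key in BASE_KEYS or key.startswith("metadata."))


def validate_usage_policy(policy: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    errors: List[str] = []
    for cond in policy.get("conditions", []):
        if "key" not in cond or "value" not in cond:
            errors.append("each condition needs key and value")
        elif not is_valid_key(cond["key"]):
            errors.append(f"invalid condition key: {cond['key']}")
    for group in policy.get("group_by", []):
        if not is_valid_key(group.get("key")):
            errors.append(f"invalid group_by key: {group.get('key')}")
    limit_type = policy.get("type")
    limit = policy.get("credit_limit")
    if limit_type not in MIN_CREDIT_LIMIT:
        errors.append("type must be cost or tokens")
    elif not isinstance(limit, (int, float)) or limit < MIN_CREDIT_LIMIT[limit_type]:
        errors.append(f"credit_limit must be at least {MIN_CREDIT_LIMIT[limit_type]}")
    threshold = policy.get("alert_threshold")
    if threshold is not None and isinstance(limit, (int, float)) and threshold >= limit:
        errors.append("alert_threshold must be less than credit_limit")
    if policy.get("periodic_reset") not in (None, "weekly", "monthly"):
        errors.append("periodic_reset must be weekly, monthly, or omitted")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.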

Rate Limits Policies

Rate limits policies allow you to control the rate of requests or tokens consumed per minute, hour, or day. When the rate limit is exceeded, requests will be throttled.

When a rate limit is exceeded, Portkey returns a 429 Too Many Requests HTTP status code.
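
Because a 429 clears on its own when the window resets while the 412 returned for usage limits does not clear until the budget resets, a client may want to treat the two differently. The sketch below shows one possible approach; `next_action`, `backoff_delays`, and the chosen delays are assumptions, not behaviour prescribed by Portkey.

```python
from typing import List


def next_action(status_code: int) -> str:
    """Map a response status to a client-side action (hypothetical policy)."""
    if status_code == 429:
        return "retry_later"  # rate limit: the window resets automatically
    if status_code == 412:
        return "stop"         # usage limit: blocked until the budget resets
    if 200 <= status_code < 300:
        return "ok"
    return "fail"


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Exponential backoff delays in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

With the defaults, `backoff_delays(4)` yields waits of 1, 2, 4, and 8 seconds between retries of a throttled request.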

Policy Types

requests: Limit based on number of requests
tokens: Limit based on number of tokens

Rate Units

rpm: Requests/Tokens per minute
rph: Requests/Tokens per hour
rpd: Requests/Tokens per day

Parameters

`type` (required)

The type of rate limit to enforce.

Type: String (enum)
Valid values:
- "requests" - Limit based on number of API requests
- "tokens" - Limit based on number of tokens consumed
Behaviour: Determines what metric is being rate-limited

`unit` (required)

The time interval unit for the rate limit.

Type: String (enum)
Valid values:
- "rpm" - Requests/Tokens per minute
- "rph" - Requests/Tokens per hour
- "rpd" - Requests/Tokens per day
Behaviour:
- Defines the time window over which the rate limit is calculated
- Limits reset automatically at the start of each time period
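
A fixed-window counter is one simple way to model limits that reset at the start of each period. The sketch below is illustrative and is not claimed to be the Gateway's algorithm: the class `FixedWindowLimiter` is hypothetical, and aligning windows to the Unix epoch is an assumption.

```python
import time
from collections import defaultdict
from typing import Dict, Optional, Tuple

WINDOW_SECONDS = {"rpm": 60, "rph": 3600, "rpd": 86400}


class FixedWindowLimiter:
    """Illustrative fixed-window limiter; old windows are never pruned here."""

    def __init__(self, value: int, unit: str) -> None:
        self.value = value
        self.window = WINDOW_SECONDS[unit]
        self.counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, bucket: str, amount: int = 1, now: Optional[float] = None) -> bool:
        """Count `amount` requests or tokens against `bucket`; False means throttle."""
        now = time.time() if now is None else now
        # Assumption: windows are aligned to the Unix epoch (UTC minute/hour/day).
        window_id = int(now // self.window)
        key = (bucket, window_id)
        if self.counts[key] + amount > self.value:
            return False
        self.counts[key] += amount
        return True
```

With `FixedWindowLimiter(2, "rpm")`, a third call for the same bucket within one minute returns `False`, and the first call in the following minute is allowed again.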

`value` (required)

The maximum number of requests or tokens allowed within the specified time unit.

Type: Number (integer)
Minimum value: 1
Units:
- For requests type: Value represents the number of API requests
- For tokens type: Value represents the number of tokens
Behaviour: When the rate limit is exceeded, subsequent requests are throttled/rejected until the time period resets

Validation Rules

Conditions: Each condition must have key and value fields.
Group By: Each group must have a key field.
Valid Keys: For both conditions and group_by, valid keys include api_key, workspace_id, virtual_key, provider, config, prompt, model, or any key starting with the `metadata.` prefix.
Value: Must be a numeric value (an integer of at least 1).
Workspace: Workspace ID is required (can be provided via API key or request body).
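
The three rate limit parameters can be checked locally before a policy is submitted. The following sketch covers only `type`, `unit`, and `value`; `validate_rate_limit_fields` and its messages are hypothetical and do not reproduce the server's responses.

```python
from typing import Any, Dict, List

RATE_TYPES = {"requests", "tokens"}
RATE_UNITS = {"rpm", "rph", "rpd"}


def validate_rate_limit_fields(policy: Dict[str, Any]) -> List[str]:
    """Return a list of problems with type, unit, and value (empty if none)."""
    errors: List[str] = []
    if policy.get("type") not in RATE_TYPES:
        errors.append("type must be requests or tokens")
    if policy.get("unit") not in RATE_UNITS:
        errors.append("unit must be rpm, rph, or rpd")
    value = policy.get("value")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        errors.append("value must be an integer >= 1")
    return errors
```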

Use Cases

Use Case 1: Global Workspace Rate Limit

Limit all requests in a workspace to 1000 requests per minute.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [{"key": "api_key", "value": "*"}],
    "group_by": [{"key": "workspace_id"}],
    "value": 1000,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 2: Per User Rate Limit

Limit each user (identified by _user metadata) to 100 requests per minute.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "metadata._user", "value": "*" }
    ],
    "group_by": [
      { "key": "metadata._user" }
    ],
    "value": 100,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 3: Per User Monthly Spend Budget

Limit each user to $50/month in API costs.

{
  "type": "usage_limits",
  "policy": {
    "conditions": [
      { "key": "metadata._user", "value": "*" }
    ],
    "group_by": [
      { "key": "metadata._user" }
    ],
    "credit_limit": 50,
    "type": "cost",
    "periodic_reset": "monthly",
    "status": "active"
  }
}

Use Case 4: Provider Specific Rate Limit

Limit OpenAI requests to 500 RPM, separate from other providers.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "provider", "value": "openai" }
    ],
    "group_by": [{"key": "workspace_id"}],
    "value": 500,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 5: Model Specific Token Rate Limit

Limit GPT-4o to 100,000 tokens per minute.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "model", "value": "@openai/gpt-4o" }
    ],
    "group_by": [{"key": "workspace_id"}],
    "value": 100000,
    "type": "tokens",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 6: Limit All Models from a Provider (Wildcard)

Limit all Anthropic models to 50,000 tokens per minute.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "model", "value": "@anthropic/*" }
    ],
    "group_by": [{ "key": "provider" }],
    "value": 50000,
    "type": "tokens",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 7: Per Virtual Key Weekly Budget

Track and limit spend per virtual key to $100/week.

{
  "type": "usage_limits",
  "policy": {
    "conditions": [
      { "key": "virtual_key", "value": "*" }
    ],
    "group_by": [
      { "key": "virtual_key" }
    ],
    "credit_limit": 100,
    "type": "cost",
    "periodic_reset": "weekly",
    "status": "active"
  }
}

Use Case 8: Config Specific Rate Limit

Limit requests using a specific gateway config to 200 RPM.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "config", "value": "production-config" }
    ],
    "group_by": [{"key": "config"}],
    "value": 200,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 9: Prompt Specific Usage Budget

Limit a specific prompt template to 1 million tokens per month.

{
  "type": "usage_limits",
  "policy": {
    "conditions": [
      { "key": "prompt", "value": "customer-support-v2" }
    ],
    "group_by": [ { "key": "prompt"}],
    "credit_limit": 1000000,
    "type": "tokens",
    "periodic_reset": "monthly",
    "status": "active"
  }
}

Use Case 10: Multiple Allowed Values (OR Logic)

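Rate limit a set of premium API keys when they call specific expensive models. Because the policy groups by api_key, each listed key gets its own counter of 100 requests per minute.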

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "api_key", "value": ["pk_premium_1", "pk_premium_2", "pk_premium_3"] },
      { "key": "model", "value": ["@openai/gpt-4o", "@anthropic/claude-3-5-sonnet-20241022"] }
    ],
    "group_by": [{ "key": "api_key"}],
    "value": 100,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 11: Exclude Specific Models

Apply rate limit to all OpenAI models EXCEPT GPT-4o.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "model", "value": "@openai/*", "excludes": "@openai/gpt-4o" }
    ],
    "group_by": [ { "key": "model"} ],
    "value": 1000,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 12: Per User, Per Model Budget

Track spend separately for each user AND model combination.

{
  "type": "usage_limits",
  "policy": {
    "conditions": [
      { "key": "metadata._user", "value": "*" }
    ],
    "group_by": [
      { "key": "metadata._user" },
      { "key": "model" }
    ],
    "credit_limit": 10,
    "type": "cost",
    "periodic_reset": "monthly",
    "status": "active"
  }
}

Use Case 13: Team Based Provider Quota

Limit each team (from metadata) to specific token quotas per provider.

{
  "type": "usage_limits",
  "policy": {
    "conditions": [
      { "key": "metadata._team", "value": "*" }
    ],
    "group_by": [
      { "key": "metadata._team" },
      { "key": "provider" }
    ],
    "credit_limit": 500000,
    "type": "tokens",
    "periodic_reset": "weekly",
    "status": "active"
  }
}

Use Case 14: Exclude Internal API Keys from Limits

Apply rate limits to all API keys except internal ones.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "api_key", "value": "*", "excludes": ["pk_internal_1", "pk_internal_2"] }
    ],
    "group_by": [
      { "key": "api_key" }
    ],
    "value": 50,
    "type": "requests",
    "unit": "rpm",
    "status": "active"
  }
}

Use Case 15: Combined Conditions - Premium Users on Specific Models

Rate limit premium tier users only when using expensive models.

{
  "type": "rate_limits",
  "policy": {
    "conditions": [
      { "key": "metadata._tier", "value": "premium" },
      { "key": "model", "value": ["@openai/gpt-4o", "@openai/o1-preview", "@anthropic/claude-3-5-sonnet-20241022"] }
    ],
    "group_by": [
      { "key": "metadata._user" }
    ],
    "value": 500,
    "type": "requests",
    "unit": "rph",
    "status": "active"
  }
}

Important Notes

Condition Matching: All conditions must match (AND). Within a condition, multiple values use OR logic.
Model Format: Models are specified as @provider/model-name. Use @provider/* for wildcard matching.
Metadata Keys: Use metadata. prefix followed by your metadata key name (e.g., metadata._user, metadata._team).
Periodic Reset Options: "weekly", "monthly", or null for no reset.
Rate Limit Units: "rpm" (per minute), "rph" (per hour), "rpd" (per day).
Usage Limit Types: "cost" (in dollars) or "tokens".
