
Every cloud provider has service quotas and limits. Learn the differences between AWS Service Quotas, Azure subscription limits, and GCP quotas—including how to monitor, request increases, and automate quota management across all three platforms.

You’ve architected a scalable system, tested it in staging, and you’re ready to launch. Then the deployment fails: “vCPU limit exceeded.” Or worse, it succeeds in one region but fails in another because quota limits vary by region.
Every cloud provider imposes limits on resources you can provision. Understanding these limits—and how to manage them proactively—is essential for production workloads.
This guide covers how quota management works on AWS, Azure, and GCP: viewing your current quotas, requesting increases, monitoring usage, and automating the whole process.
Before diving in, let’s clear up the vocabulary:
| Provider | Terms Used | Meaning |
|---|---|---|
| AWS | Quota = Limit (interchangeable) | Both refer to the same concept; some are adjustable, some are fixed |
| Azure | Limits | Can refer to adjustable quotas or hard caps; some are tier-dependent |
| GCP | Quota (adjustable) vs Limit (fixed) | Quotas can be increased; limits cannot |
This inconsistency causes confusion when working across clouds. Throughout this guide, we’ll use “quota” for adjustable values and “limit” for hard caps.
AWS has the most mature quota management system, with a dedicated service called Service Quotas that provides a unified view across all AWS services.
| Concept | Description |
|---|---|
| Account-level quotas | Apply to your entire AWS account |
| Resource-level quotas | Apply to specific resources (newer feature) |
| Regional quotas | Many quotas are per-region, not global |
| Default quotas | What you start with; usually lower than max |
| Applied quotas | Your current limit after any increases |
AWS Console: open the Service Quotas console, pick a service, and browse or search its quotas.
AWS CLI:
```bash
# List quotas for a specific service
aws service-quotas list-service-quotas \
    --service-code ec2 \
    --query 'Quotas[*].[QuotaName,Value,Adjustable]' \
    --output table

# Get a specific quota
aws service-quotas get-service-quota \
    --service-code ec2 \
    --quota-code L-1216C47A  # Running On-Demand Standard instances
```
| Service | Quota | Default | Notes |
|---|---|---|---|
| EC2 | Running On-Demand instances (vCPUs) | 5-64 vCPUs (varies by instance type) | Per region, per instance family |
| Lambda | Concurrent executions | 1,000 | Per region; soft limit |
| VPC | VPCs per region | 5 | Usually easy to increase |
| EBS | Snapshots per region | 100,000 | Higher than you’d expect |
| S3 | Buckets per account | 100 | Global, not regional |
| RDS | DB instances per region | 40 | Includes all engine types |
| IAM | Roles per account | 1,000 | Global quota |
| CloudFormation | Stacks per region | 2,000 | Can be limiting in complex setups |
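Several of the defaults above are soft limits. One way to see programmatically which quotas for a service are adjustable is to filter the JSON from `list-service-quotas`. A minimal sketch; the response fields (`QuotaName`, `QuotaCode`, `Value`, `Adjustable`) follow the real response shape, but the sample values below are illustrative:

```python
def adjustable_quotas(quotas):
    """Return (name, value) pairs for quotas marked adjustable, smallest first."""
    return sorted(
        ((q["QuotaName"], q["Value"]) for q in quotas if q.get("Adjustable")),
        key=lambda pair: pair[1],
    )

# Feed it the "Quotas" array from:
#   aws service-quotas list-service-quotas --service-code ec2 --output json
sample = [
    {"QuotaName": "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances",
     "QuotaCode": "L-1216C47A", "Value": 64.0, "Adjustable": True},
    # Hypothetical fixed quota, just to show filtering:
    {"QuotaName": "Example fixed quota", "Value": 5.0, "Adjustable": False},
]
print(adjustable_quotas(sample))
# → [('Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances', 64.0)]
```

Sorting smallest-first surfaces the quotas most likely to bite during a scale-up.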
Via Console: locate the quota in the Service Quotas console and submit an increase request from its detail page.
Via CLI:
```bash
aws service-quotas request-service-quota-increase \
    --service-code ec2 \
    --quota-code L-1216C47A \
    --desired-value 256
```
Processing Time: small increases are often approved automatically within minutes; larger requests are reviewed by AWS Support and can take hours to days.
AWS also supports proactive quota handling: Automatic Quota Management (launched in 2025) can adjust quotas based on your usage, and organization quota request templates pre-apply increases to new accounts:
```bash
# Add a Lambda concurrency increase to an organization quota request template
aws service-quotas put-service-quota-increase-request-into-template \
    --service-code lambda \
    --quota-code L-B99A9384 \
    --desired-value 5000 \
    --aws-region us-east-1
```
Two modes:
Set up alarms before you hit limits:
```bash
# Create an alarm for Lambda concurrent executions
aws cloudwatch put-metric-alarm \
    --alarm-name "LambdaConcurrentExecutionsHigh" \
    --metric-name ConcurrentExecutions \
    --namespace AWS/Lambda \
    --statistic Maximum \
    --period 60 \
    --threshold 800 \
    --comparison-operator GreaterThanThreshold \
    --evaluation-periods 3 \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts
```
For enterprise setups, AWS provides the Quota Monitor for AWS solution:
Deploy from the AWS Solutions Library.
Azure takes a different approach: limits are tied to subscriptions and often vary by service tier and region.
| Concept | Description |
|---|---|
| Subscription limits | Most limits apply at subscription scope |
| Regional limits | Many limits are per-region within a subscription |
| Tier-dependent limits | Some limits increase with higher service tiers |
| vCPU quotas | Compute limits are per VM family, per region |
Azure Portal: go to Subscriptions → Usage + quotas, or search for “Quotas” in the portal.
Azure CLI:
```bash
# List compute quotas for a region
az vm list-usage --location eastus --output table

# Get a specific quota
az quota show \
    --scope "/subscriptions/{sub-id}/providers/Microsoft.Compute/locations/eastus" \
    --resource-name "standardDSv3Family"
```
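For scripting, the same data is easier to consume as JSON (`az vm list-usage --location eastus --output json`). A sketch that flags VM families close to their cap; the `currentValue`/`limit`/`name` fields mirror the Compute usage API shape, and the sample values are made up:

```python
def near_limit(usages, threshold=0.8):
    """Flag usage entries at or above `threshold` of their limit."""
    flagged = []
    for u in usages:
        limit = u["limit"]
        if limit > 0 and u["currentValue"] / limit >= threshold:
            flagged.append((u["name"]["localizedValue"], u["currentValue"], limit))
    return flagged

# Illustrative entries mirroring `az vm list-usage --output json`:
sample = [
    {"name": {"value": "standardDSv3Family",
              "localizedValue": "Standard DSv3 Family vCPUs"},
     "currentValue": 18, "limit": 20},
    {"name": {"value": "cores", "localizedValue": "Total Regional vCPUs"},
     "currentValue": 30, "limit": 100},
]
print(near_limit(sample))
# → [('Standard DSv3 Family vCPUs', 18, 20)]
```

Running this per region as a scheduled job gives you an early-warning list before a deployment trips the limit.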
| Service | Limit | Default | Notes |
|---|---|---|---|
| Compute | Total Regional vCPUs | 20-100 | Per subscription, per region |
| Compute | VM family vCPUs | 10-20 | Per family (DSv3, FSv2, etc.) |
| Storage | Storage accounts per region | 250 | Per subscription |
| Networking | VNets per subscription | 1,000 | Global across regions |
| App Service | App Service plans | 100 | Per resource group |
| Azure Functions | Max instances (Consumption) | 200 | Per function app |
| AKS | Clusters per subscription | 5,000 | High default |
| Resource Manager | API reads per hour | 12,000 | Can throttle automation |
Via Portal: select the quota in Usage + quotas and request a new limit.
Via Azure CLI:
```bash
# Request a quota increase
az quota create \
    --scope "/subscriptions/{sub-id}/providers/Microsoft.Compute/locations/eastus" \
    --resource-name "standardDSv3Family" \
    --limit-object value=100 limit-object-type=LimitValue
```
Via Support Request: For large increases or non-adjustable limits:
Free and Trial subscriptions cannot request quota increases—you must upgrade to Pay-as-you-go or higher.
Regional differences: If you need 30 vCPUs in West Europe, you must specifically request 30 vCPUs in West Europe. Increases don’t apply globally.
No cost for quotas: Requesting a quota increase is free—you only pay for resources you actually provision.
Azure Monitor Alerts:
```bash
# Create an alert rule for quota usage
az monitor metrics alert create \
    --name "HighCPUQuotaUsage" \
    --resource-group myResourceGroup \
    --scopes "/subscriptions/{sub-id}" \
    --condition "total UsagePercentage > 80" \
    --action-group myActionGroup \
    --description "Alert when CPU quota usage exceeds 80%"
```
Azure Policy for Quota Governance:
```json
{
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Compute/virtualMachines"
      },
      {
        "field": "Microsoft.Compute/virtualMachines/sku.name",
        "like": "Standard_D*"
      }
    ]
  },
  "then": {
    "effect": "audit"
  }
}
```
GCP distinguishes clearly between quotas (adjustable) and limits (fixed), and applies them at the project level rather than account level.
| Concept | Description |
|---|---|
| Project-level quotas | Most quotas apply per GCP project |
| Regional quotas | Many quotas are per-region within a project |
| Quotas vs Limits | Quotas can be increased; limits are hard caps |
| Quota metrics | Quotas are tracked as metrics you can monitor |
GCP Console: go to IAM & Admin → Quotas and filter by service or metric.
gcloud CLI:
```bash
# List quotas for Compute Engine
gcloud compute project-info describe \
    --project=my-project \
    --format="table(quotas.metric,quotas.limit,quotas.usage)"

# List quotas for a specific region
gcloud compute regions describe us-central1 \
    --format="table(quotas.metric,quotas.limit,quotas.usage)"
```
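With `--format=json` instead of the table format, the per-region quotas come back as a list of `{metric, limit, usage}` objects, which makes headroom checks straightforward. A sketch (the metric names are real GCP quota metrics; the values are illustrative):

```python
def quota_headroom(region_info, metric):
    """Return (usage, limit, remaining) for one quota metric from
    `gcloud compute regions describe REGION --format=json` output."""
    for q in region_info["quotas"]:
        if q["metric"] == metric:
            return q["usage"], q["limit"], q["limit"] - q["usage"]
    raise KeyError(f"no quota metric {metric!r} in region payload")

sample = {"quotas": [
    {"metric": "CPUS", "limit": 24.0, "usage": 20.0},
    {"metric": "DISKS_TOTAL_GB", "limit": 20480.0, "usage": 512.0},
]}
print(quota_headroom(sample, "CPUS"))
# → (20.0, 24.0, 4.0)
```

A remaining value near zero is the cue to file an increase before the next deployment, not after it fails.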
| Service | Quota | Default | Notes |
|---|---|---|---|
| Compute Engine | CPUs per region | 24 | Per project, per region |
| Compute Engine | GPUs per region | 0 | Must request to get any |
| Compute Engine | Persistent disk (TB) | 20 TB | Per region |
| GKE | Nodes per cluster | 15,000 | High default |
| Cloud Functions | Max instances | 3,000 | Per function, per region |
| Cloud SQL | Instances per project | 100 | Across all regions |
| BigQuery | Concurrent queries | 100 | Per project |
| Pub/Sub | Topics per project | 10,000 | High default |
| Cloud Storage | Buckets per project | Unlimited | No quota, but other limits apply |
Via Console: select the quota on the IAM & Admin → Quotas page and click Edit Quotas to submit a request.
Via gcloud:
```bash
# Request a quota increase (newer Cloud Quotas method)
gcloud alpha quotas quota-preferences create \
    --service=compute.googleapis.com \
    --quota-id=CPUS-per-project-region \
    --preferred-value=100 \
    --project=my-project \
    --dimensions=region=us-central1
```
Processing: small requests are often approved within minutes; larger ones can take hours to days.
GCP uniquely offers Terraform resources for quota management:
```hcl
# Request a quota increase via Terraform
resource "google_cloud_quotas_quota_preference" "compute_cpus" {
  parent        = "projects/my-project"
  service       = "compute.googleapis.com"
  quota_id      = "CPUS-per-project-region"
  contact_email = "[email protected]"

  quota_config {
    preferred_value = 100
  }

  dimensions = {
    region = "us-central1"
  }
}

# Data source to check quota info
data "google_cloud_quotas_quota_info" "compute_cpus" {
  parent   = "projects/my-project"
  service  = "compute.googleapis.com"
  quota_id = "CPUS-per-project-region"
}

output "quota_increase_eligible" {
  value = data.google_cloud_quotas_quota_info.compute_cpus.quota_increase_eligibility
}
```
GCP offers an automatic quota adjustment feature:
Two modes:
Cloud Monitoring Alerts:
```yaml
# Alert policy for quota usage
displayName: "High CPU Quota Usage"
combiner: OR
conditions:
  - displayName: "CPU quota > 80%"
    conditionThreshold:
      filter: 'metric.type="compute.googleapis.com/quota/cpus_per_project/usage" resource.type="compute.googleapis.com/Project"'
      comparison: COMPARISON_GT
      thresholdValue: 0.8
      duration: "60s"
      aggregations:
        - alignmentPeriod: "60s"
          perSeriesAligner: ALIGN_MEAN
```
| Aspect | AWS | Azure | GCP |
|---|---|---|---|
| Primary Scope | Account + Region | Subscription + Region | Project + Region |
| Global Quotas | Some (S3 buckets, IAM) | Rare | Rare |
| Quota Console | Service Quotas | Usage + quotas blade | IAM & Admin → Quotas |
| API/CLI Support | Comprehensive | Good | Good |
| Terraform Support | Via AWS provider | Via AzureRM provider | Native quota resources |
| Automatic Management | Yes (new in 2025) | Limited | Yes (Quota Adjuster) |
| Aspect | AWS | Azure | GCP |
|---|---|---|---|
| Self-service Increases | Most quotas | Most quotas | Most quotas |
| Approval Time (Small) | Minutes | Minutes to hours | Minutes |
| Approval Time (Large) | Hours to days | Hours to days | Hours to days |
| Support Required | For very large increases | For large/fixed limits | For large increases |
| Cost to Request | Free | Free | Free |
| Feature | AWS | Azure | GCP |
|---|---|---|---|
| Native Monitoring | CloudWatch + Service Quotas | Azure Monitor | Cloud Monitoring |
| Pre-built Solution | Quota Monitor for AWS | Azure Advisor | Quota Adjuster |
| Multi-account View | With AWS Organizations | With Management Groups | With Resource Manager |
| Alerting | SNS, EventBridge | Action Groups | Notification Channels |
For organizations running across multiple clouds, consider these strategies:
Maintain a single source of truth for quota requirements:
```yaml
# quotas.yml
production:
  aws:
    us-east-1:
      ec2_vcpus: 256
      lambda_concurrent: 5000
      rds_instances: 20
    eu-west-1:
      ec2_vcpus: 128
      lambda_concurrent: 3000
  azure:
    eastus:
      compute_vcpus: 200
      storage_accounts: 50
    westeurope:
      compute_vcpus: 100
  gcp:
    us-central1:
      compute_cpus: 100
      cloud_sql_instances: 20
```
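A registry like this is only useful if it can be diffed against live values. One approach is to flatten it into rows keyed by provider and region; a sketch using an in-memory dict that mirrors a slice of `quotas.yml` (in practice you would load the real file with PyYAML):

```python
def flatten(registry):
    """Flatten {env: {provider: {region: {quota: value}}}} into rows."""
    return [
        (env, provider, region, quota, value)
        for env, providers in registry.items()
        for provider, regions in providers.items()
        for region, quotas in regions.items()
        for quota, value in quotas.items()
    ]

# Mirrors part of quotas.yml above:
registry = {
    "production": {
        "aws": {"us-east-1": {"ec2_vcpus": 256, "lambda_concurrent": 5000}},
        "gcp": {"us-central1": {"compute_cpus": 100}},
    }
}
for row in flatten(registry):
    print(row)
# e.g. ('production', 'aws', 'us-east-1', 'ec2_vcpus', 256)
```

Each row can then be compared against the matching provider API response, with any shortfall feeding the increase-request automation shown later in this guide.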
Before deploying to a new region, verify quotas are sufficient:
```bash
#!/bin/bash
# check-quotas.sh
REGION=$1
REQUIRED_VCPUS=${2:-50}

# Check AWS
aws_vcpus=$(aws service-quotas get-service-quota \
    --service-code ec2 \
    --quota-code L-1216C47A \
    --region "$REGION" \
    --query 'Quota.Value' --output text)

if (( $(echo "$aws_vcpus < $REQUIRED_VCPUS" | bc -l) )); then
    echo "AWS: Insufficient vCPUs ($aws_vcpus < $REQUIRED_VCPUS)"
    exit 1
fi

# (Add equivalent checks for Azure and GCP regions here.)
echo "All quota checks passed"
```
Deploy monitoring across all clouds:
AWS (CloudWatch):
```bash
aws cloudwatch put-metric-alarm \
    --alarm-name "EC2-vCPU-Quota-Warning" \
    --metric-name ResourceCount \
    --namespace AWS/Usage \
    --dimensions Name=Service,Value=EC2 Name=Resource,Value=vCPU \
    --statistic Maximum \
    --period 3600 \
    --threshold 80 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --evaluation-periods 1 \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:quota-alerts
```
GCP (Terraform):
```hcl
resource "google_monitoring_alert_policy" "quota_alert" {
  display_name = "CPU Quota Warning"
  combiner     = "OR"

  conditions {
    display_name = "CPU usage > 80%"
    condition_threshold {
      filter          = "metric.type=\"compute.googleapis.com/quota/cpus_per_project/usage\" resource.type=\"compute.googleapis.com/Project\""
      comparison      = "COMPARISON_GT"
      threshold_value = 0.8
      duration        = "60s"
    }
  }

  notification_channels = [google_monitoring_notification_channel.email.name]
}
```
For frequently needed increases, automate the request process:
```python
# quota_manager.py
import json
import subprocess

import boto3


def request_aws_quota_increase(service_code, quota_code, region, desired_value):
    client = boto3.client('service-quotas', region_name=region)
    try:
        response = client.request_service_quota_increase(
            ServiceCode=service_code,
            QuotaCode=quota_code,
            DesiredValue=desired_value
        )
        return response['RequestedQuota']['Status']
    except Exception as e:
        return f"Error: {e}"


def request_gcp_quota_increase(project, quota_id, region, desired_value):
    cmd = [
        'gcloud', 'alpha', 'quotas', 'quota-preferences', 'create',
        '--service=compute.googleapis.com',
        f'--quota-id={quota_id}',
        f'--preferred-value={desired_value}',
        f'--project={project}',
        f'--dimensions=region={region}',
        '--format=json'
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return json.loads(result.stdout) if result.returncode == 0 else result.stderr
```
Cloud quotas are a reality of operating in any public cloud. The providers differ in terminology and tools, but the core workflow is the same: learn your defaults, monitor usage against limits, request increases before you need them, and automate wherever possible.
The most mature teams treat quota management as part of their infrastructure-as-code practice—tracking requirements in version control, automating checks in CI/CD, and requesting increases as part of their deployment process.