Resource Hierarchy: the foundation of everything else
The most common mistake in companies starting with GCP is creating projects directly under the organization without planning the hierarchy. Without folders, it becomes very hard to apply consistent policies across teams, manage billing by business unit, or delegate access without compromising security.
Organization: company.com
├── Folder: production/
│   ├── Project: prod-networking   # Shared VPC, Cloud DNS, NAT
│   ├── Project: prod-data         # BigQuery, Cloud SQL
│   └── Project: prod-apps         # GKE, Cloud Run, APIs
├── Folder: non-production/
│   ├── Project: staging-apps
│   └── Project: dev-sandbox
└── Folder: shared-services/
    ├── Project: monitoring        # Centralized Cloud Monitoring
    └── Project: security          # Security Command Center, KMS
Separating into distinct projects by function (networking, data, apps) creates natural blast radius boundaries. An IAM incident in the apps project doesn't compromise KMS keys in security. Billing by folder enables cost reporting per business unit without manual tagging.
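To make the folder-level cost reporting concrete, here is a minimal sketch that rolls per-project costs up to their parent folder. The project-to-folder mapping mirrors the example tree; the function name and the cost figures are hypothetical, and in practice the per-project numbers would come from a billing export rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical mapping of project -> parent folder, mirroring the example tree.
PROJECT_FOLDER = {
    "prod-networking": "production",
    "prod-data": "production",
    "prod-apps": "production",
    "staging-apps": "non-production",
    "dev-sandbox": "non-production",
    "monitoring": "shared-services",
    "security": "shared-services",
}


def cost_by_folder(project_costs: dict[str, float]) -> dict[str, float]:
    """Sum per-project costs into per-folder totals."""
    totals: dict[str, float] = defaultdict(float)
    for project, cost in project_costs.items():
        # Projects missing from the hierarchy are surfaced instead of dropped.
        folder = PROJECT_FOLDER.get(project, "unassigned")
        totals[folder] += cost
    return dict(totals)


# Made-up monthly costs, for illustration only.
report = cost_by_folder({"prod-data": 100.0, "prod-apps": 50.0, "dev-sandbox": 10.0})
print(report)
```

Because the grouping comes from where a project sits in the hierarchy, a new project placed in the right folder shows up in the right total without anyone adding labels to it.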
Shared VPC: the correct network pattern for enterprises
With Shared VPC, a host project centralizes the network (subnets, firewall rules, Cloud NAT, VPC peering with on-premises) and service projects consume that network without managing their own network resources. This eliminates isolated VPC proliferation, simplifies connectivity, and centralizes traffic auditing.
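Centralizing subnets in one host project also means one team owns the address plan. As a rough sketch of the kind of check that team might run before allocating ranges, the snippet below flags overlapping CIDR blocks with Python's standard `ipaddress` module. Only `apps-subnet` and its range come from this article's Terraform; the other names and ranges are hypothetical.

```python
import ipaddress


def find_overlaps(subnets: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of subnet names whose CIDR ranges overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    names = sorted(parsed)
    overlaps = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if parsed[first].overlaps(parsed[second]):
                overlaps.append((first, second))
    return overlaps


planned = {
    "apps-subnet": "10.10.0.0/24",   # range used in this article's Terraform
    "data-subnet": "10.10.1.0/24",   # hypothetical
    "onprem-range": "10.10.0.0/16",  # hypothetical on-premises range
}
print(find_overlaps(planned))
```

With these inputs, both cloud subnets collide with the on-premises range, which is the kind of conflict that breaks routing once the two networks are connected.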
# Terraform: enable Shared VPC on the host project
resource "google_compute_shared_vpc_host_project" "host" {
  project = var.host_project_id
}

# Attach the apps project as a service project
resource "google_compute_shared_vpc_service_project" "apps" {
  host_project    = google_compute_shared_vpc_host_project.host.project
  service_project = var.apps_project_id
}

# Subnet with Private Google Access so VMs reach Google APIs without public IPs.
# Assumes google_compute_network.shared_vpc is defined elsewhere in the host project.
resource "google_compute_subnetwork" "apps_subnet" {
  name                     = "apps-subnet"
  project                  = var.host_project_id
  ip_cidr_range            = "10.10.0.0/24"
  region                   = "us-central1"
  network                  = google_compute_network.shared_vpc.id
  private_ip_google_access = true
}
GKE Autopilot vs. Standard: the right decision by use case
GKE Autopilot manages the data plane automatically: Google provisions and scales nodes, applies security patches, and optimizes pod packing. Pricing is based on the CPU, memory, and storage that pods request, not on nodes. GKE Standard gives full control over node pools but requires operational management of the data plane.
- Autopilot: recommended for most enterprise workloads. It eliminates node operations, enforces hardened security defaults automatically, and its per-pod billing reduces waste from idle node capacity.
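- Standard: necessary when you need node-level control that Autopilot restricts (privileged DaemonSets, direct node access, hardware configurations Autopilot does not offer), or have workloads with highly variable resource profiles that Autopilot can't optimize. Autopilot has added support for GPUs and other specialized hardware over time, so check its current limits before assuming Standard is required.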
- In practice: most enterprises should start with Autopilot and migrate to Standard only when they hit concrete limitations.
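The "per-pod billing reduces waste" argument can be illustrated with simple arithmetic. The sketch below estimates how much provisioned CPU in a node pool no pod has requested, which is capacity you pay for under node-based billing. It is a deliberately simplified model: the `Pod` class and the numbers are hypothetical, and it ignores memory, system overhead, and any unit price difference between the two modes.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    cpu_request: float  # vCPU requested by the pod


def idle_cpu_fraction(node_count: int, cpu_per_node: float, pods: list[Pod]) -> float:
    """Fraction of provisioned node CPU that no pod has requested."""
    provisioned = node_count * cpu_per_node
    requested = sum(pod.cpu_request for pod in pods)
    if provisioned <= 0:
        raise ValueError("node pool must have positive CPU capacity")
    if requested > provisioned:
        raise ValueError("pods request more CPU than the node pool provides")
    return (provisioned - requested) / provisioned


# Hypothetical node pool: 3 nodes x 4 vCPU, running four pods.
pods = [Pod("api", 0.5), Pod("worker", 1.5), Pod("cache", 1.0), Pod("batch", 3.0)]
print(f"{idle_cpu_fraction(3, 4.0, pods):.0%} of provisioned CPU is idle")
```

In this made-up pool, half of the provisioned CPU is unrequested. A real comparison would also need the actual prices for each mode, which this sketch does not attempt to model.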
Workload Identity: no credentials in code
The most dangerous anti-pattern in GCP: creating service account JSON keys and distributing them as environment variables or secrets. If that key leaks, the compromised access has no automatic expiration. Workload Identity lets GKE pods access GCP APIs with a service account identity, without static credentials.
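One practical way to find this anti-pattern in an existing environment is to scan configuration for values shaped like a service account key file. The following is a minimal sketch, assuming keys are stored as raw JSON strings; the function names are hypothetical, and a real audit would also need to cover base64-encoded values, mounted files, and secret managers.

```python
import json


def looks_like_service_account_key(value: str) -> bool:
    """Return True if value parses as JSON shaped like a service account key file."""
    try:
        data = json.loads(value)
    except (TypeError, ValueError):
        return False
    if not isinstance(data, dict):
        return False
    # Key files carry "type": "service_account" plus the private key material.
    return data.get("type") == "service_account" and "private_key" in data


def find_key_variables(env: dict[str, str]) -> list[str]:
    """Return the names of variables whose values look like service account keys."""
    return sorted(name for name, value in env.items()
                  if looks_like_service_account_key(value))


# Fake environment for illustration; the key content is a placeholder.
sample_env = {
    "GOOGLE_CREDS": json.dumps({"type": "service_account", "private_key": "REDACTED"}),
    "PATH": "/usr/bin",
}
print(find_key_variables(sample_env))
```

To check a live process, `find_key_variables(dict(os.environ))` would apply the same test to the current environment.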
# Kubernetes ServiceAccount with Workload Identity binding
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  namespace: production
  annotations:
    iam.gke.io/gcp-service-account: api-service@prod-apps.iam.gserviceaccount.com
# Bind the KSA to the GSA
gcloud iam service-accounts add-iam-policy-binding \
  api-service@prod-apps.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:prod-apps.svc.id.goog[production/api-service]"
Cloud Run for APIs without cluster management
For stateless HTTP APIs, Cloud Run is frequently better than GKE. No node management, no pod scheduling concerns, automatic scale-to-zero, and per-request pricing. The tradeoff: less control over the execution environment and cold starts for services with sporadic traffic.
# Cloud Run service with VPC connector and no direct public traffic
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: payment-api
  annotations:
    # Ingress is a service-level setting
    run.googleapis.com/ingress: internal-and-cloud-load-balancing
spec:
  template:
    metadata:
      annotations:
        # VPC access is a revision-level setting, so it belongs on the template
        run.googleapis.com/vpc-access-connector: projects/prod-networking/locations/us-central1/connectors/vpc-connector
        run.googleapis.com/vpc-access-egress: private-ranges-only
    spec:
      serviceAccountName: api-service@prod-apps.iam.gserviceaccount.com
      containers:
        - image: us-central1-docker.pkg.dev/prod-apps/api/payment-api:latest
          resources:
            limits:
              cpu: "2"
              memory: "1Gi"
Network security: VPC Service Controls
VPC Service Controls creates a security perimeter around GCP APIs (BigQuery, Cloud Storage, Cloud SQL) that prevents data exfiltration even if a service account is compromised. APIs inside the perimeter are only accessible from explicitly authorized networks and projects.
For enterprises handling regulated data (PCI-DSS, HIPAA, financial data), VPC Service Controls isn't optional: it's the control that prevents an IAM breach from becoming a data breach.
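The key idea is that the perimeter decision depends on where a request comes from, not on whether the caller holds valid credentials. Here is a toy model of that logic, not the actual VPC Service Controls evaluation (which also involves access levels and ingress/egress rules). The `Perimeter` class, project names, and IP ranges are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Perimeter:
    protected_projects: set
    allowed_networks: list = field(default_factory=list)  # CIDR strings

    def allows(self, target_project: str, source_project: Optional[str] = None,
               source_ip: Optional[str] = None) -> bool:
        # Resources outside the perimeter are not restricted by it (IAM still applies).
        if target_project not in self.protected_projects:
            return True
        # Callers inside the perimeter may reach protected APIs.
        if source_project in self.protected_projects:
            return True
        # Otherwise the caller must come from an explicitly authorized network.
        if source_ip is not None:
            address = ipaddress.ip_address(source_ip)
            return any(address in ipaddress.ip_network(cidr)
                       for cidr in self.allowed_networks)
        return False


perimeter = Perimeter({"prod-data"}, ["198.51.100.0/24"])  # hypothetical corporate range
# A stolen credential used from an unknown project and network is still rejected.
print(perimeter.allows("prod-data", "attacker-project", "203.0.113.7"))
```

Note that credentials never appear in the model: a compromised service account used from outside the perimeter fails the check regardless of its IAM roles.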