Section 1 of the Google Cloud Professional Cloud Architect exam lays the foundation for all architectural decisions in the cloud. This section is about translating business needs into effective, scalable, secure, and cost-efficient cloud solutions.
🎯 1.1 – Meeting Business Requirements and Strategy
Understanding and aligning technical solutions with business objectives is the architect’s first responsibility. This includes:
- Budget constraints
- Time-to-market pressures
- Regulatory needs
- Cost-performance trade-offs
💸 Cost Optimization Strategies in GCP
graph LR
A[Meet Business Requirements and Strategy]
subgraph Cost Optimization Goals
  B[Preemptible VMs]
  C[Committed Use Discounts]
  D[Custom Machine Types]
  E[Auto-scaling]
  F[Coldline or Archive Storage]
  G[Cloud Functions and Serverless]
  H[Object Lifecycle Rules]
end
B -->|Optimize short-lived, fault-tolerant workloads to reduce compute costs| A
C -->|Lower costs for predictable, sustained resource usage| A
D -->|Right-size compute resources to avoid over-provisioning| A
E -->|Scale resources based on demand to optimize usage| A
F -->|Store infrequently accessed data cost-effectively| A
G -->|Pay only for execution time for event-driven workloads| A
H -->|Tier data automatically based on access frequency| A
A --> I[Business Use Cases and Product Strategy]
A --> J[Supporting Application Design]
A --> K[Integration with External Systems – Network Costs]
A --> L[Movement of Data – Egress Costs and Transfers]
A --> M[Design Trade-offs – Cost vs Performance and Availability]
A --> N[Build, Buy, Modify, or Deprecate – Option Cost Analysis]
A --> O[Success Metrics – ROI and Cost Efficiency]
A --> P[Compliance and Observability – Control Costs]
K --> Q[Network options like VPN and Interconnect]
L --> R[Transfer methods like gsutil, Transfer Service, Appliance]
M --> S[Balance cost with high availability and failover]
M --> T[Balance cost with scalability and performance]
N --> U[Compare managed vs self-managed cost models]
O --> V[Include cost efficiency in KPIs]
P --> W[Account for security control costs like VPC SC]
GCP enables cloud cost control through features such as:
- Preemptible VMs – cheap, short-lived compute for stateless jobs
- Committed Use Discounts – lower prices in exchange for a one- or three-year commitment to predictable usage
- Coldline/Archive Storage – economical long-term data storage
- Cloud Functions – efficient for event-driven architectures
Feature | Benefits | Drawbacks | Ideal Use Cases |
---|---|---|---|
Preemptible VMs (PVMs) | Very low-cost compute with significant savings (up to 80%); well suited to short-lived, fault-tolerant batch jobs | Short lifespan: can be preempted at any time and are always terminated after 24 hours; no SLA, so not suitable for critical workloads; requires graceful shutdown handling | Cost-sensitive, non-critical batch processing and fault-tolerant workloads |
Committed Use Discounts (CUDs) | Deeply discounted prices; predictable costs for sustained resource usage; savings maintained even if instance configurations change | Requires a one- or three-year commitment; payment is fixed even if usage is lower than anticipated; resource-based commitments apply only in the region where they are purchased | Long-term workloads with predictable, sustained usage |
Coldline/Archive Storage | Extremely economical for long-term storage; very low at-rest storage costs | Optimized for infrequent access; higher costs for data retrieval; minimum storage durations (30 days for Nearline, 90 days for Coldline, 365 days for Archive); lower availability SLA than Standard storage | Archival storage for data that is rarely accessed |
Cloud Functions | Excellent for event-driven architectures; serverless with no infrastructure management; pay only for execution time; scales down to zero when idle | Designed for event-based, stateless tasks; limited runtime options and execution time limits; not suited to long-running or full-scale applications | Lightweight, event-driven tasks or microservices |
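The minimum storage durations in the table matter when objects are tiered automatically. The sketch below builds a lifecycle configuration in the JSON shape that Cloud Storage lifecycle rules use (a top-level `rule` list of `action`/`condition` pairs) and estimates how many days are still billed after an early deletion. The tiering schedule and the helper names `lifecycle_rule` and `early_deletion_days` are illustrative assumptions, not part of any Google library; check the generated JSON against the current documentation before applying it to a bucket.

```python
import json

# Minimum storage durations in days, taken from the table above.
MIN_STORAGE_DAYS = {"NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def lifecycle_rule(storage_class: str, age_days: int) -> dict:
    """Build one rule that moves objects to storage_class once they reach age_days."""
    return {
        "action": {"type": "SetStorageClass", "storageClass": storage_class},
        "condition": {"age": age_days},
    }


def early_deletion_days(storage_class: str, stored_days: int) -> int:
    """Days still billed when an object is deleted before the class minimum."""
    return max(0, MIN_STORAGE_DAYS[storage_class] - stored_days)


# Hypothetical schedule: Nearline at 30 days, Coldline at 90, Archive at 365.
config = {
    "rule": [
        lifecycle_rule("NEARLINE", 30),
        lifecycle_rule("COLDLINE", 90),
        lifecycle_rule("ARCHIVE", 365),
    ]
}

if __name__ == "__main__":
    print(json.dumps(config, indent=2))
    # A Coldline object deleted after 20 days is still billed for 70 more.
    print(early_deletion_days("COLDLINE", 20))
```

Keeping the durations in one dictionary makes it easy to sanity-check a schedule: moving data to a colder class earlier than it is likely to be deleted can cost more than leaving it where it is.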
Architect decisions should weigh:
- Egress and network costs
- Data movement strategies
- Build vs Buy vs Modify vs Deprecate decisions
- Operational KPIs including ROI, TCO, and compliance
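To make the cost side of these trade-offs concrete, here is a rough comparison of monthly compute cost under three pricing options. Every number is an assumption for illustration: the hourly rate is invented, the 80% preemptible discount is the "up to" figure from the table, and the 37% committed-use discount is a placeholder rather than a quoted price. The one structural point it models is that a commitment is billed for the whole month whether or not the VM runs.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation of hours in a month


@dataclass(frozen=True)
class PricingOption:
    name: str
    discount: float       # fraction off the on-demand rate (assumed values)
    pays_when_idle: bool  # True if billed for the full month regardless of use


OPTIONS = [
    PricingOption("on-demand", 0.0, False),
    PricingOption("preemptible", 0.80, False),   # upper bound from the table
    PricingOption("committed-use", 0.37, True),  # placeholder discount
]


def monthly_cost(option: PricingOption, on_demand_hourly: float, hours_used: float) -> float:
    """Monthly cost in the same currency as on_demand_hourly, rounded to cents."""
    billed_hours = HOURS_PER_MONTH if option.pays_when_idle else hours_used
    return round(billed_hours * on_demand_hourly * (1 - option.discount), 2)


def cheapest(on_demand_hourly: float, hours_used: float, fault_tolerant: bool) -> str:
    """Name of the cheapest option; preemptible is excluded unless the job tolerates preemption."""
    candidates = [o for o in OPTIONS if fault_tolerant or o.name != "preemptible"]
    return min(candidates, key=lambda o: monthly_cost(o, on_demand_hourly, hours_used)).name


if __name__ == "__main__":
    for hours in (200, 730):
        print(hours, cheapest(0.10, hours, fault_tolerant=False))
```

With these assumed figures, a VM that runs 200 hours a month is cheaper on demand, while one that runs all month is cheaper under a commitment, which is the "predictable, sustained usage" condition in practice.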
🔧 1.2 – Defining Technical Requirements
Once business goals are set, architects define a technical solution that withstands failure, adapts to growth, and operates within constraints.
🛡️ High Availability Design on GCP
graph TD
LB[Load Balancer] --> MIG[Managed Instance Group - Multi-zone]
MIG --> CEI[Compute Engine Instances]
LB --> SQLHA[Cloud SQL HA - Regional]
LB --> CSMR[Cloud Storage Multi-Region]
LB --> SPN[Cloud Spanner - Global Availability]
Design for failure across:
- Compute: multi-zone MIGs
- Storage: multi-region buckets
- Databases: regional Cloud SQL in an HA configuration, Cloud Spanner for global availability
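The reason redundancy across zones helps can be shown with basic availability arithmetic. The sketch below assumes component failures are independent, which real zones only approximate, and the availability figures are made-up inputs, not Google SLA values.

```python
from math import prod
from typing import Iterable

MINUTES_PER_30_DAYS = 30 * 24 * 60


def parallel(availabilities: Iterable[float]) -> float:
    """Redundant replicas: the system is up if at least one replica is up."""
    return 1 - prod(1 - a for a in availabilities)


def serial(availabilities: Iterable[float]) -> float:
    """Chained dependencies: the system is up only if every component is up."""
    return prod(availabilities)


def downtime_minutes(availability: float) -> float:
    """Expected downtime over a 30-day month."""
    return (1 - availability) * MINUTES_PER_30_DAYS


if __name__ == "__main__":
    one_zone = 0.99                         # hypothetical single-zone availability
    two_zones = parallel([one_zone, one_zone])
    print(two_zones, downtime_minutes(two_zones))
    # A request path through a load balancer, the MIG, and a database:
    print(serial([0.9999, two_zones, 0.9995]))
```

Two things follow from the formulas: adding a second zone turns 0.99 into roughly 0.9999, and a request path is never more available than its weakest serial dependency.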
🌍 Choosing the Right Load Balancer
graph TB
subgraph Load_Balancers
  LB1[Global HTTPS Load Balancer] -->|L7| CDN[Cloud CDN]
  LB2[Regional Internal Load Balancer] -->|L4| GKE[GKE Internal Services]
  LB3[External TCP or UDP Load Balancer] -->|L4| NONHTTP[Non-HTTPS Traffic]
end
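Choose a load balancer by scope (global vs. regional), exposure (external vs. internal), and layer: L7 for HTTP(S) traffic that benefits from content-aware routing and Cloud CDN, L4 for raw TCP or UDP traffic.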
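The three load balancers in the diagram can be read as a small lookup keyed on exposure and protocol. This is a deliberately reduced sketch: it covers only the combinations drawn above, and the `Traffic` type and `pick_load_balancer` function are hypothetical names for illustration. Google Cloud offers more load balancer variants than this table lists.

```python
from typing import NamedTuple


class Traffic(NamedTuple):
    external: bool  # True if clients reach the service from the internet
    http: bool      # True for HTTP(S), False for other TCP/UDP traffic


# Only the combinations shown in the diagram above.
LOAD_BALANCERS = {
    Traffic(external=True, http=True): "Global HTTPS Load Balancer (L7)",
    Traffic(external=True, http=False): "External TCP or UDP Load Balancer (L4)",
    Traffic(external=False, http=False): "Regional Internal Load Balancer (L4)",
}


def pick_load_balancer(external: bool, http: bool) -> str:
    """Return the diagram's load balancer for this traffic, if it lists one."""
    return LOAD_BALANCERS.get(Traffic(external, http), "not covered by this sketch")


if __name__ == "__main__":
    print(pick_load_balancer(external=True, http=True))
    print(pick_load_balancer(external=False, http=False))
```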
⚖️ Elasticity and Quota Management
graph TD
SSD[Scalable Solution Design] --> MIGS[Autoscaling Managed Instance Groups]
SSD --> HPA[GKE Horizontal Pod Autoscaler]
SSD --> SRVLESS[Serverless Services - Cloud Run]
SSD --> QINC[Request Quota Increases]
QINC --> QMON[Monitor Quota Usage via Cloud Monitoring]
Key patterns:
- Use autoscaling to adapt resources
- Track and manage quotas to avoid production issues
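Both patterns reduce to simple arithmetic. The first function below uses the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to configured bounds; the real controller also applies a tolerance band and stabilization windows, which are omitted here. The second is a hypothetical quota check with an assumed 80% alert threshold, standing in for what a Cloud Monitoring alert would do.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Replica count that would bring the per-replica metric back to its target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


def quota_alert(used: float, limit: float, threshold: float = 0.8) -> bool:
    """True once usage reaches the given fraction of the quota limit."""
    return used / limit >= threshold


if __name__ == "__main__":
    # 4 replicas at 90% CPU with a 60% target scale out to 6.
    print(desired_replicas(4, 90, 60))
    # 20 of 24 regional CPUs in use is above the 80% threshold.
    print(quota_alert(20, 24))
```

The two checks interact: an autoscaler capped at `max_replicas` still fails to scale if the project quota is lower than that cap, which is why quota headroom belongs in capacity planning.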
💽 1.3 – Choosing GCP Network, Storage, and Compute Resources
Choosing the right services often comes down to understanding patterns and trade-offs.
📦 Storage Decision Tree
graph TD
A[Data Type] --> AA{Compute Resource Attached}
AA -- Yes --> M[Local SSD - Ephemeral, High IOPS, Low Latency]
AA -- No --> B{Structured}
B -- Yes --> C{Strong Consistency Required}
C -- Yes --> D[Cloud Spanner - Global, Scalable, Strong Consistency]
C -- No --> E[Cloud SQL - Regional, Relational]
B -- No --> F{Large Objects or Blobs}
F -- Yes --> G[Cloud Storage - Object Storage]
G --> N{Long-Term Archival Needed}
N -- Yes --> O[Cloud Storage Archive or Coldline - Cost-Effective]
F -- Potentially for Analytics --> K[BigQuery - Serverless Data Warehouse]
F -- No --> L{Real-Time NoSQL Use Case}
L -- Yes --> P{Scalability and Document-Based}
P -- Yes --> J[Cloud Firestore - NoSQL Document for Mobile or Web Apps]
P -- No --> Q[Cloud Bigtable - Wide-Column NoSQL for Analytics or Ops]
L -- No --> H{POSIX Interface Needed}
H -- Yes --> I[Filestore - Managed NFS]
H -- No --> R[Cloud Memorystore - In-Memory Store]
Match services to use cases:
Use Case | Service |
---|---|
SQL, consistency | Cloud SQL |
Global consistency + scalability | Cloud Spanner |
Object storage + archival | Cloud Storage |
Serverless analytics | BigQuery |
Real-time + NoSQL | Firestore / Bigtable |
NFS interface | Filestore |
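As a study aid, the storage decision tree can be written as a chain of checks in the same order as the diagram. The `DataProfile` fields are names invented for this sketch, and the function encodes this article's simplified tree, not an official selection algorithm; real choices also depend on cost, latency, and data volume.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    attached_to_compute: bool = False       # scratch data living with a VM
    structured: bool = False                # relational data
    needs_global_consistency: bool = False  # global scale with strong consistency
    large_objects: bool = False             # blobs, media, backups
    archival: bool = False                  # long-term, rarely read
    analytics: bool = False                 # warehouse-style queries
    realtime_nosql: bool = False
    document_model: bool = False
    posix: bool = False                     # needs a file system interface


def choose_storage(p: DataProfile) -> str:
    """Walk the decision tree top to bottom and return the first match."""
    if p.attached_to_compute:
        return "Local SSD"
    if p.structured:
        return "Cloud Spanner" if p.needs_global_consistency else "Cloud SQL"
    if p.large_objects:
        return "Cloud Storage (Archive or Coldline)" if p.archival else "Cloud Storage"
    if p.analytics:
        return "BigQuery"
    if p.realtime_nosql:
        return "Firestore" if p.document_model else "Cloud Bigtable"
    return "Filestore" if p.posix else "Memorystore"


if __name__ == "__main__":
    print(choose_storage(DataProfile(structured=True, needs_global_consistency=True)))
    print(choose_storage(DataProfile(realtime_nosql=True)))
```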
🖥️ Compute Resource Decision Tree
graph TD
A[Workload Type] --> B{Stateless}
B -- Yes --> S{Event-Driven}
S -- Yes --> T[Cloud Functions - Serverless and Event-Based]
S -- No --> C[Cloud Run or App Engine - Serverless Containers or PaaS]
B -- No --> D{Containerized}
D -- Yes --> E[GKE - Kubernetes Orchestration]
D -- No --> F[Compute Engine - Infrastructure as a Service]
F --> U{Specialized Hardware Required}
U -- Yes --> V[TPUs - ML Hardware Acceleration]
F --> G{Short-Lived and Fault-Tolerant}
G -- Yes --> H[Preemptible VMs - Cost-Effective for Batch Jobs]
G -- No --> I[Standard VMs - Full Control and Persistence]
I --> W{Isolation Requirements}
W -- Yes --> X[Sole-Tenant Nodes - Dedicated Hardware]
I --> Y[Machine Types - General, Compute, Memory, GPU]
Y --> Z[Custom Machine Types - Tailored to Workload]
Key distinctions:
Scenario | Use |
---|---|
Event-driven, simple | Cloud Functions |
Containerized workloads | Cloud Run / GKE |
Full control or special hardware | Compute Engine |
ML acceleration | TPUs |
Dedicated hardware or licensing constraint (e.g., BYOL) | Sole-tenant nodes |
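The compute tree works well as an ordered rule table, where the first matching rule wins. The workload keys below are hypothetical labels chosen for this sketch, and the ordering mirrors the diagram in this article, not an official Google recommendation.

```python
from typing import Callable, Dict, List, Tuple

Workload = Dict[str, bool]

# Ordered rules: earlier entries take precedence, as in the decision tree.
RULES: List[Tuple[Callable[[Workload], bool], str]] = [
    (lambda w: w.get("stateless", False) and w.get("event_driven", False), "Cloud Functions"),
    (lambda w: w.get("stateless", False), "Cloud Run or App Engine"),
    (lambda w: w.get("containerized", False), "GKE"),
    (lambda w: w.get("needs_ml_accelerator", False), "Compute Engine with TPUs"),
    (lambda w: w.get("short_lived_fault_tolerant", False), "Preemptible VMs"),
    (lambda w: w.get("needs_isolation", False), "Sole-tenant nodes"),
]


def choose_compute(workload: Workload) -> str:
    """Return the first service whose rule matches, else standard VMs."""
    for matches, service in RULES:
        if matches(workload):
            return service
    return "Standard Compute Engine VMs"


if __name__ == "__main__":
    print(choose_compute({"stateless": True, "event_driven": True}))
    print(choose_compute({"short_lived_fault_tolerant": True}))
```

Keeping the rules in a list makes the precedence explicit: a stateless, containerized service lands on Cloud Run before GKE is ever considered, which matches the exam's usual preference for the most managed option that fits.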
🌐 GCP Network Services Map
flowchart LR
A[Cloud Networking] --> B[VPC Network - Global, Software Defined]
B --> C[Subnets - Regional, IP Address Ranges]
B --> D[Firewall Rules - Stateful Traffic Control]
B --> EE[Network Tiers - Premium Global or Standard Regional]
B --> MM[Container Networking - GKE Pods, Services, Policies]
B --> NNN[Cloud Load Balancing - Global, Regional, Internal or External]
B --> F[Private Access Options]
F --> FF[Private Google Access - VM to Google API]
F --> GG[Private Services Access - VPC to Managed Services]
F --> HH[VPC Service Controls - Perimeter for Managed Services]
B --> II[Cloud NAT - Managed Outbound Internet Access]
B --> JJ[Cloud DNS - Scalable and Reliable DNS]
B --> KK[Cloud Armor - Web Application Firewall]
B --> LL[Traffic Director - Service Mesh and Traffic Management]
A --> M[Hybrid Connectivity - On-Prem or Multicloud Integration]
M --> N[Cloud VPN - Secure IPSec Tunnel]
N --> P[Cloud Router - Dynamic Routing with BGP]
M --> O[Dedicated Interconnect - Physical, High Bandwidth]
O --> P
M --> Q[Partner Interconnect - Through Provider]
Q --> P
A --> R[VPC Peering - Private Connectivity Between VPCs]
Memorize how VPCs and services interconnect:
- Hybrid (VPN, Interconnect)
- Private access (Private Google Access for Google APIs, Private Services Access for managed services)
- Security controls (VPC Service Controls perimeters, Cloud Armor WAF)
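Because a VPC network is global while each subnet is regional with its own IP range, address planning is usually done up front so that ranges never overlap across regions, peered VPCs, or on-premises networks. The sketch below uses Python's standard `ipaddress` module to carve a planning supernet into equal regional blocks. The supernet, the region list, and the `plan_subnets` helper are assumptions for illustration; a VPC itself has no CIDR, so the supernet here is only a planning convention.

```python
import ipaddress
from typing import Dict, List


def plan_subnets(supernet: str, regions: List[str], new_prefix: int) -> Dict[str, ipaddress.IPv4Network]:
    """Assign each region the next free block of size /new_prefix from supernet."""
    blocks = ipaddress.ip_network(supernet).subnets(new_prefix=new_prefix)
    plan = {}
    for region in regions:
        block = next(blocks, None)
        if block is None:
            raise ValueError("supernet too small for the requested regions")
        plan[region] = block
    return plan


if __name__ == "__main__":
    plan = plan_subnets("10.0.0.0/16", ["us-central1", "europe-west1", "asia-east1"], 20)
    for region, block in plan.items():
        print(region, block, block.num_addresses)
```

Blocks produced this way are disjoint by construction, which is the property hybrid connectivity and VPC peering both depend on.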
🔄 1.4 – Designing a Migration Plan
Migration must be well-planned and well-tested.
🗺️ GCP Migration Services Map
flowchart LR
A("`**1** Assess Current IT Landscape and Workloads
**2** Identify Dependencies and Licenses
**3** Analyze Business and Technical Requirements`")
A --> D[Plan Migration Strategy]
D --> E[Choose Migration Approach: Rehost, Replatform, Refactor]
E --> F[VMware Engine - Rehost / Lift and Shift]
E --> G[Migrate for Compute Engine - Replatform / Lift and Optimize]
E --> H[Consider GKE, App Engine, Cloud Run - Refactor / Move and Improve]
D --> I[Plan Network Connectivity: VPN, Interconnect, Peering]
D --> J[Plan Data Migration]
J --> K[Estimate Data Size]
K --> L[Use gsutil for Less Than 1TB]
K --> M[Use Storage Transfer Service for 1TB to 10TB]
K --> N[Use Transfer Appliance for More Than 10TB]
J --> O[Use Database Migration Service]
J --> P[Target Systems: Cloud Storage, BigQuery, Cloud SQL, etc.]
D --> Q[Plan Resource Quotas and Capacity]
D --> R[Plan Cost Optimization: Discounts and Rightsizing]
D --> S[Plan Testing and Proof of Concept]
D --> T[Plan Security and Compliance]
D --> U[Migrate Applications and Data]
U --> V[Monitor Migration Progress]
U --> W[Optimize After Migration]
W --> X[Apply Cost Optimization]
W --> Y[Improve Performance]
W --> Z[Achieve Operational Excellence]
W --> AA[Harden Security and Ensure Compliance]
A --> BB[Plan Training and Enablement for Teams]
Key migration tools:
Use Case | Tool |
---|---|
VMs → GCP | Migrate for Compute Engine (now Migrate to Virtual Machines) |
VMware as-is | VMware Engine |
Refactoring to containers or serverless | GKE, App Engine, Cloud Run |
DB migration | Database Migration Service |
Data migration | gsutil, Storage Transfer Service, Transfer Appliance |
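Picking a data transfer method is mostly a question of data size against available bandwidth. The sketch below applies the size thresholds from the migration map above (under 1 TB, 1 TB to 10 TB, over 10 TB) and estimates online transfer time. The thresholds are this article's rules of thumb, and the 80% link efficiency is an assumed figure; in practice the Transfer Appliance decision depends on how long an online transfer would take over the link you actually have.

```python
SECONDS_PER_DAY = 86_400


def choose_transfer_tool(size_tb: float) -> str:
    """Map data size to a tool using the thresholds from the migration map."""
    if size_tb < 1:
        return "gsutil"
    if size_tb <= 10:
        return "Storage Transfer Service"
    return "Transfer Appliance"


def transfer_days(size_tb: float, bandwidth_mbps: float, efficiency: float = 0.8) -> float:
    """Days needed to move size_tb (decimal terabytes) over a link of bandwidth_mbps."""
    if bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8
    seconds = bits / (bandwidth_mbps * 1e6 * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for size in (0.5, 5, 200):
        print(size, choose_transfer_tool(size), round(transfer_days(size, 1000), 2))
```

Running the numbers early is useful for the cutover plan: 200 TB over a 1 Gbps link at 80% efficiency takes roughly three weeks, which is the kind of result that justifies shipping an appliance.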
🔮 1.5 – Planning for Future Improvements
Architects must build with modernization in mind.
📈 Cloud Modernization Journey
flowchart TD
A[VMs in Compute Engine] --> B[Containers in GKE]
B --> C[Microservices on Cloud Run]
C --> D[Event-Driven Architecture using Pub Sub]
D --> E[Integration with AI and ML using Vertex AI]
E --> F[Data Mesh or BigQuery Federation]
Evolve architecture:
- Start with VMs (IaaS)
- Shift to containers (GKE)
- Modernize with Cloud Run
- Add event-driven processing (Pub/Sub)
- Integrate AI/ML (Vertex AI)
- Unify data (BigQuery Federation)