Section 1 of the Google Cloud Professional Cloud Architect exam lays the foundation for all architectural decisions in the cloud. This section is about translating business needs into effective, scalable, secure, and cost-efficient cloud solutions.
🎯 1.1 – Meeting Business Requirements and Strategy
Understanding and aligning technical solutions with business objectives is the architect’s first responsibility. This includes:
- Budget constraints
- Time-to-market pressures
- Regulatory needs
- Cost-performance trade-offs
💸 Cost Optimization Strategies in GCP
graph LR
A[Meet Business Requirements and Strategy]
subgraph Cost Optimization Goals
B[Preemptible VMs]
C[Committed Use Discounts]
D[Custom Machine Types]
E[Auto-scaling]
F[Coldline or Archive Storage]
G[Cloud Functions and Serverless]
H[Object Lifecycle Rules]
end
B -->|Optimize short-lived, fault-tolerant workloads to reduce compute costs| A
C -->|Lower costs for predictable, sustained resource usage| A
D -->|Right-size compute resources to avoid over-provisioning| A
E -->|Scale resources based on demand to optimize usage| A
F -->|Store infrequently accessed data cost-effectively| A
G -->|Pay only for execution time for event-driven workloads| A
H -->|Tier data automatically based on access frequency| A
A --> I[Business Use Cases and Product Strategy]
A --> J[Supporting Application Design]
A --> K[Integration with External Systems – Network Costs]
A --> L[Movement of Data – Egress Costs and Transfers]
A --> M[Design Trade-offs – Cost vs Performance and Availability]
A --> N[Build, Buy, Modify, or Deprecate – Option Cost Analysis]
A --> O[Success Metrics – ROI and Cost Efficiency]
A --> P[Compliance and Observability – Control Costs]
K --> Q[Network options like VPN and Interconnect]
L --> R[Transfer methods like gsutil, Transfer Service, Appliance]
M --> S[Balance cost with high availability and failover]
M --> T[Balance cost with scalability and performance]
N --> U[Compare managed vs self-managed cost models]
O --> V[Include cost efficiency in KPIs]
P --> W[Account for security control costs like VPC SC]
GCP enables cloud cost control through features such as:
- Preemptible VMs – cheap, short-lived compute for stateless jobs
- Committed Use Discounts – lower prices in exchange for a one- or three-year usage commitment
- Coldline/Archive Storage – economical long-term data storage
- Cloud Functions – efficient for event-driven architectures
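The "Object Lifecycle Rules" item in the diagram above can be made concrete with a small sketch. It builds a lifecycle configuration in the `{"rule": [...]}` JSON shape that Cloud Storage lifecycle files use, moving objects to Coldline and then Archive as they age. The function name and the default ages are illustrative assumptions, not recommendations; check the retention requirements of your own data before picking thresholds.

```python
import json


def build_lifecycle_config(coldline_after_days=90, archive_after_days=365,
                           delete_after_days=None):
    """Return a lifecycle configuration as a plain dict.

    The ages are days since object creation. The default values are
    placeholders chosen for illustration only.
    """
    if archive_after_days <= coldline_after_days:
        raise ValueError("archive_after_days must exceed coldline_after_days")

    rules = [
        {   # Tier to Coldline once the object is older than the first threshold.
            "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
            "condition": {"age": coldline_after_days},
        },
        {   # Tier to Archive once it is older than the second threshold.
            "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
            "condition": {"age": archive_after_days},
        },
    ]
    if delete_after_days is not None:
        # Optional final rule: delete objects past the retention period.
        rules.append({"action": {"type": "Delete"},
                      "condition": {"age": delete_after_days}})
    return {"rule": rules}


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_config(delete_after_days=2555), indent=2))
```

The printed JSON could be saved to a file and applied to a bucket with the lifecycle tooling of your choice. Keeping the thresholds at or above the minimum storage duration of each class avoids early-deletion charges.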
| Feature | Benefits | Drawbacks | Ideal Use Cases |
|---|---|---|---|
| Preemptible VMs (PVMs) | Very low-cost compute; significant cost savings (up to 80%); ideal for short-lived, fault-tolerant batch jobs | Short lifespan; can be preempted at any time; automatically terminated after 24 hours; no SLA, so not suitable for critical workloads; requires graceful shutdown handling | Cost-sensitive, non-critical batch processing and fault-tolerant workloads |
| Committed Use Discounts (CUDs) | Deeply discounted prices; predictable costs for sustained resource usage; savings maintained even if instance configurations change | Requires a one- or three-year commitment; payment is fixed even if usage is lower than anticipated; regional discount limitations | Long-term workloads with predictable, sustained usage |
| Coldline/Archive Storage | Extremely economical for long-term storage; very low at-rest storage costs | Optimized for infrequent access; higher costs for data retrieval; minimum storage duration requirements (e.g., 30 days for Nearline, 90 days for Coldline, 365 days for Archive); lower availability (or no SLA for Archive) | Archival storage for data that is rarely accessed |
| Cloud Functions | Excellent for event-driven architectures; serverless with no infrastructure management; pay only for execution time; scales down to zero when idle | Designed for event-based, stateless tasks; limited runtime options and execution time limits; not suitable for full-scale applications that would otherwise run on VMs | Lightweight, event-driven tasks or microservices |
Architect decisions should weigh:
- Egress and network costs
- Data movement strategies
- Build vs Buy vs Modify vs Deprecate decisions
- Operational KPIs including ROI, TCO, and compliance
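The commitment trade-off noted above (a fixed payment even when usage drops) reduces to a break-even calculation. The sketch below assumes a simplified model: on-demand usage is billed only for the hours used, a commitment is billed at a discounted rate for every hour, and sustained use discounts are ignored. The hourly price and discount in the example are hypothetical placeholders, not published Google Cloud prices.

```python
def committed_vs_on_demand(on_demand_hourly, cud_discount, utilization, hours=730):
    """Return (on_demand_cost, committed_cost) for one month.

    utilization is the fraction of the month the resource actually runs.
    cud_discount is the fractional discount granted by the commitment.
    """
    if not 0 <= cud_discount < 1:
        raise ValueError("cud_discount must be in [0, 1)")
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be in [0, 1]")
    on_demand_cost = on_demand_hourly * hours * utilization
    # A commitment is paid for every hour, whether or not the resource runs.
    committed_cost = on_demand_hourly * (1 - cud_discount) * hours
    return on_demand_cost, committed_cost


def break_even_utilization(cud_discount):
    """Utilization above which the commitment is cheaper than on-demand."""
    return 1 - cud_discount


if __name__ == "__main__":
    # Hypothetical numbers: $0.10 per hour on demand, 30% commitment discount.
    for u in (0.5, 0.7, 0.9):
        od, cud = committed_vs_on_demand(0.10, 0.30, u)
        print(f"utilization={u:.0%} on-demand=${od:.2f} committed=${cud:.2f}")
    print(f"break-even utilization: {break_even_utilization(0.30):.0%}")
```

Under this model a commitment only pays off when expected utilization exceeds one minus the discount, which is why CUDs suit predictable, steady workloads.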
🔧 1.2 – Defining Technical Requirements
Once business goals are set, architects define a technical solution that withstands failure, adapts to growth, and operates within constraints.
🛡️ High Availability Design on GCP
graph TD
LB[Load Balancer] --> MIG[Managed Instance Group - Multi-zone]
MIG --> CEI[Compute Engine Instances]
LB --> SQLHA[Cloud SQL HA - Regional]
LB --> CSMR[Cloud Storage Multi-Region]
LB --> SPN[Cloud Spanner - Global Availability]
Design for failure across:
- Compute: multi-zone MIGs
- Storage: multi-region buckets
- Databases: regional SQL, global Spanner
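To see why spreading each tier across zones or regions matters, it helps to work through the arithmetic. The sketch below assumes failures are independent, which real zones only approximate, and uses hypothetical availability figures rather than published SLAs. Redundant replicas combine in parallel; tiers that all must be up combine in series.

```python
def parallel_availability(single, replicas):
    """Availability of N redundant replicas where any one can serve traffic."""
    if not 0 <= single <= 1 or replicas < 1:
        raise ValueError("single must be in [0, 1] and replicas >= 1")
    return 1 - (1 - single) ** replicas


def serial_availability(*components):
    """Availability of a chain where every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


if __name__ == "__main__":
    # Hypothetical: one zone of instances is up 99% of the time.
    one_zone = 0.99
    three_zones = parallel_availability(one_zone, 3)
    # Hypothetical: the database tier behind it is up 99.95% of the time.
    end_to_end = serial_availability(three_zones, 0.9995)
    print(f"compute, 1 zone:  {one_zone:.6f}")
    print(f"compute, 3 zones: {three_zones:.6f}")
    print(f"compute + db:     {end_to_end:.6f}")
```

The end-to-end figure is capped by the weakest serial tier, so redundancy in compute alone does not help if the database or storage layer remains single-zone.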
🌍 Choosing the Right Load Balancer
graph TB
subgraph Load_Balancers
LB1[Global HTTPS Load Balancer] -->|L7| CDN[Cloud CDN]
LB2[Regional Internal Load Balancer] -->|L4| GKE[GKE Internal Services]
LB3[External TCP or UDP Load Balancer] -->|L4| NONHTTP[Non-HTTPS Traffic]
end
Each load balancer differs in scope (global vs regional), exposure (external vs internal), and layer (L4 vs L7).
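As a memory aid, the three options in the diagram can be written as a tiny selection function. This is a deliberately simplified sketch that covers only those three load balancers; the function name and parameters are hypothetical, and Google Cloud offers further variants that are not modelled here.

```python
def choose_load_balancer(protocol: str, internal: bool) -> str:
    """Pick one of the three load balancers shown in the diagram.

    protocol: "http", "https", "tcp", or "udp" (case-insensitive).
    internal: True when clients live inside the VPC.
    """
    proto = protocol.strip().lower()
    if proto not in {"http", "https", "tcp", "udp"}:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    if internal:
        # Internal services, such as GKE internal services, stay regional.
        return "Regional Internal Load Balancer (L4)"
    if proto in {"http", "https"}:
        # External web traffic: global, L7, and can be fronted by Cloud CDN.
        return "Global HTTPS Load Balancer (L7)"
    # External traffic that is not HTTP(S).
    return "External TCP or UDP Load Balancer (L4)"


if __name__ == "__main__":
    print(choose_load_balancer("https", internal=False))
    print(choose_load_balancer("udp", internal=False))
    print(choose_load_balancer("tcp", internal=True))
```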
⚖️ Elasticity and Quota Management
graph TD
SSD[Scalable Solution Design] --> MIGS[Autoscaling Managed Instance Groups]
SSD --> HPA[GKE Horizontal Pod Autoscaler]
SSD --> SRVLESS[Serverless Services - Cloud Run]
SSD --> QINC[Request Quota Increases]
QINC --> QMON[Monitor Quota Usage via Cloud Monitoring]
Key patterns:
- Use autoscaling to adapt resources
- Track and manage quotas to avoid production issues
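Both patterns can be illustrated with a few lines of arithmetic. The first function applies the core Kubernetes Horizontal Pod Autoscaler formula, desired = ceil(current × currentMetric / targetMetric), clamped to a minimum and maximum; the real controller also applies a tolerance band and stabilization windows, which are omitted here. The second function is a hypothetical pre-flight check that the autoscaler's maximum would still fit under a regional vCPU quota. All numbers in the example are made up.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA scaling formula, clamped to [min_replicas, max_replicas]."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


def fits_quota(replicas, vcpus_per_replica, quota_limit, quota_used):
    """True if scaling to `replicas` stays within the remaining vCPU quota."""
    return quota_used + replicas * vcpus_per_replica <= quota_limit


if __name__ == "__main__":
    # 4 replicas at 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # Would the configured maximum of 10 replicas (4 vCPUs each) fit the quota?
    print(fits_quota(10, 4, quota_limit=48, quota_used=16))
```

Running the quota check against the autoscaler's maximum, not its current size, shows whether a quota increase needs to be requested before a traffic spike rather than during one.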
💽 1.3 – Choosing GCP Network, Storage, and Compute Resources
Choosing the right services often comes down to understanding patterns and trade-offs.
📦 Storage Decision Tree
graph TD
A[Data Type] --> AA{Compute Resource Attached}
AA -- Yes --> M[Local SSD - Ephemeral, High IOPS, Low Latency]
AA -- No --> B{Structured}
B -- Yes --> C{Global Scale with Strong Consistency Required}
C -- Yes --> D[Cloud Spanner - Global, Scalable, Strong Consistency]
C -- No --> E[Cloud SQL - Regional, Relational]
B -- No --> F{Large Objects or Blobs}
F -- Yes --> G[Cloud Storage - Object Storage]
G --> N{Long-Term Archival Needed}
N -- Yes --> O[Cloud Storage Archive or Coldline - Cost-Effective]
F -- Potentially for Analytics --> K[BigQuery - Serverless Data Warehouse]
F -- No --> L{Real-Time NoSQL Use Case}
L -- Yes --> P{Scalability and Document-Based}
P -- Yes --> J[Cloud Firestore - NoSQL Document for Mobile or Web Apps]
P -- No --> Q[Cloud Bigtable - Wide-Column NoSQL for Analytics or Ops]
L -- No --> H{POSIX Interface Needed}
H -- Yes --> I[Filestore - Managed NFS]
H -- No --> R[Cloud Memorystore - In-Memory Store]
Match services to use cases:
| Use Case | Service |
|---|---|
| SQL, consistency | Cloud SQL |
| Global consistency + scalability | Cloud Spanner |
| Object storage + archival | Cloud Storage |
| Serverless analytics | BigQuery |
| Real-time + NoSQL | Firestore / Bigtable |
| NFS interface | Filestore |
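The mapping in the table can be expressed as a small lookup function, which is a handy way to rehearse the decision order. This is a sketch only: the `StorageNeeds` fields, the function name, and the precedence of the checks (analytics first, then relational, then objects, NoSQL, and NFS) are assumptions made for illustration, and real designs often combine several services.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageNeeds:
    """Hypothetical requirement flags, one per row of the table."""
    analytics: bool = False        # serverless analytics / warehouse queries
    structured: bool = False       # relational data queried with SQL
    global_scale: bool = False     # global consistency plus scalability
    large_objects: bool = False    # blobs, media, backups, archives
    realtime_nosql: bool = False   # low-latency NoSQL access
    document_model: bool = False   # document-oriented (mobile or web apps)
    posix: bool = False            # needs an NFS / file interface


def choose_storage(needs: StorageNeeds) -> Optional[str]:
    """Return the service the table suggests, or None if no row matches."""
    if needs.analytics:
        return "BigQuery"
    if needs.structured:
        return "Cloud Spanner" if needs.global_scale else "Cloud SQL"
    if needs.large_objects:
        return "Cloud Storage"
    if needs.realtime_nosql:
        return "Firestore" if needs.document_model else "Bigtable"
    if needs.posix:
        return "Filestore"
    return None  # No row applies; revisit the requirements.


if __name__ == "__main__":
    print(choose_storage(StorageNeeds(structured=True, global_scale=True)))
    print(choose_storage(StorageNeeds(realtime_nosql=True)))
```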
🖥️ Compute Resource Decision Tree
graph TD
A[Workload Type] --> B{Stateless}
B -- Yes --> S{Event-Driven}
S -- Yes --> T[Cloud Functions - Serverless and Event-Based]
S -- No --> C[Cloud Run or App Engine - Serverless Containers or PaaS]
B -- No --> D{Containerized}
D -- Yes --> E[GKE - Kubernetes Orchestration]
D -- No --> F[Compute Engine - Infrastructure as a Service]
F --> U{Specialized Hardware Required}
U -- Yes --> V[TPUs - ML Hardware Acceleration]
F --> G{Short-Lived and Fault-Tolerant}
G -- Yes --> H[Preemptible VMs - Cost-Effective for Batch Jobs]
G -- No --> I[Standard VMs - Full Control and Persistence]
I --> W{Isolation Requirements}
W -- Yes --> X[Sole-Tenant Nodes - Dedicated Hardware]
I --> Y[Machine Types - General, Compute, Memory, GPU]
Y --> Z[Custom Machine Types - Tailored to Workload]
Key distinctions:
| Scenario | Use |
|---|---|
| Event-driven, simple | Cloud Functions |
| Containerized workloads | Cloud Run / GKE |
| Full control or special hardware | Compute Engine |
| ML acceleration | TPUs |
| Dedicated hardware or licensing constraint | Sole-tenant nodes |
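The same distinctions can be walked through in code. The sketch below follows the compute decision tree roughly from top to bottom; the keyword names and the choice to test hardware and isolation requirements first are assumptions for illustration, and the returned strings simply echo the labels used in this section.

```python
def choose_compute(*, event_driven=False, stateless=False, containerized=False,
                   needs_tpu=False, needs_dedicated_host=False,
                   fault_tolerant_batch=False):
    """Suggest a compute option from a handful of hypothetical flags."""
    # Hardware and isolation constraints narrow the choice the most,
    # so this sketch checks them first (an assumption, not a rule).
    if needs_tpu:
        return "TPUs"
    if needs_dedicated_host:
        return "Sole-tenant nodes"
    if stateless and event_driven:
        return "Cloud Functions"
    if stateless:
        return "Cloud Run or App Engine"
    if containerized:
        return "GKE"
    if fault_tolerant_batch:
        return "Preemptible VMs"
    return "Compute Engine (standard VMs)"


if __name__ == "__main__":
    print(choose_compute(stateless=True, event_driven=True))
    print(choose_compute(containerized=True))
    print(choose_compute(fault_tolerant_batch=True))
```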
🌐 GCP Network Services Map
flowchart LR
A[Cloud Networking] --> B[VPC Network - Global, Software Defined]
B --> C[Subnets - Regional, IP Address Ranges]
B --> D[Firewall Rules - Stateful Traffic Control]
B --> EE[Network Tiers - Premium Global or Standard Regional]
B --> MM[Container Networking - GKE Pods, Services, Policies]
B --> NNN[Cloud Load Balancing - Global, Regional, Internal or External]
B --> F[Private Access Options]
F --> FF[Private Google Access - VM to Google API]
F --> GG[Private Services Access - VPC to Managed Services]
F --> HH[VPC Service Controls - Perimeter for Managed Services]
B --> II[Cloud NAT - Managed Outbound Internet Access]
B --> JJ[Cloud DNS - Scalable and Reliable DNS]
B --> KK[Cloud Armor - Web Application Firewall]
B --> LL[Traffic Director - Service Mesh and Traffic Management]
A --> M[Hybrid Connectivity - On-Prem or Multicloud Integration]
M --> N[Cloud VPN - Secure IPSec Tunnel]
N --> P[Cloud Router - Dynamic Routing with BGP]
M --> O[Dedicated Interconnect - Physical, High Bandwidth]
O --> P
M --> Q[Partner Interconnect - Through Provider]
Q --> P
A --> R[VPC Peering - Private Connectivity Between VPCs]
Memorize how VPCs and services interconnect:
- Hybrid (VPN, Interconnect)
- Private access (Google APIs)
- Security perimeters (VPC SC, Cloud Armor)
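Because subnets are regional and both hybrid connectivity and VPC peering depend on non-overlapping address ranges, it can help to plan IP space before creating anything. The sketch below uses Python's standard `ipaddress` module to carve one subnet per region out of a planning supernet and to flag overlaps with an on-premises range. The supernet, prefix length, and function names are hypothetical; a custom-mode VPC has no CIDR of its own, so the supernet here is purely a planning device.

```python
import ipaddress


def plan_regional_subnets(supernet_cidr, regions, new_prefix=20):
    """Assign one equally sized subnet per region from a planning supernet."""
    supernet = ipaddress.ip_network(supernet_cidr)
    candidates = supernet.subnets(new_prefix=new_prefix)
    plan = {}
    for region in regions:
        try:
            plan[region] = next(candidates)
        except StopIteration:
            raise ValueError("supernet too small for the requested regions") from None
    return plan


def overlapping_regions(plan, other_cidr):
    """Return the regions whose subnet overlaps another range (e.g. on-prem)."""
    other = ipaddress.ip_network(other_cidr)
    return [region for region, subnet in plan.items() if subnet.overlaps(other)]


if __name__ == "__main__":
    plan = plan_regional_subnets("10.0.0.0/16", ["us-central1", "europe-west1"])
    for region, subnet in plan.items():
        print(region, subnet)
    # Hypothetical on-premises range reachable over VPN or Interconnect.
    print(overlapping_regions(plan, "10.0.16.0/24"))
```

Any region reported by `overlapping_regions` would need a different range before routes are exchanged through Cloud Router or a peering connection is created.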
🔄 1.4 – Designing a Migration Plan
A migration should be planned in phases and validated with a proof of concept before production cutover: assess, plan, migrate, then optimize.
🗺️ GCP Migration Services Map
flowchart LR
A("`**1** Assess Current IT Landscape and Workloads
**2** Identify Dependencies and Licenses
**3** Analyze Business and Technical Requirements`")
A --> D[Plan Migration Strategy]
D --> E[Choose Migration Approach: Rehost, Replatform, Refactor]
E --> F[VMware Engine - Rehost / Lift and Shift]
E --> G[Migrate for Compute Engine - Replatform / Lift and Optimize]
E --> H[Consider GKE, App Engine, Cloud Run - Refactor / Move and Improve]
D --> I[Plan Network Connectivity: VPN, Interconnect, Peering]
D --> J[Plan Data Migration]
J --> K[Estimate Data Size]
K --> L[Use gsutil for Less Than 1TB]
K --> M[Use Storage Transfer Service for 1TB to 10TB]
K --> N[Use Transfer Appliance for More Than 10TB]
J --> O[Use Database Migration Service]
J --> P[Target Systems: Cloud Storage, BigQuery, Cloud SQL, etc.]
D --> Q[Plan Resource Quotas and Capacity]
D --> R[Plan Cost Optimization: Discounts and Rightsizing]
D --> S[Plan Testing and Proof of Concept]
D --> T[Plan Security and Compliance]
D --> U[Migrate Applications and Data]
U --> V[Monitor Migration Progress]
U --> W[Optimize After Migration]
W --> X[Apply Cost Optimization]
W --> Y[Improve Performance]
W --> Z[Achieve Operational Excellence]
W --> AA[Harden Security and Ensure Compliance]
A --> BB[Plan Training and Enablement for Teams]
Key migration tools:
| Use Case | Tool |
|---|---|
| VMs → GCP | Migrate for Compute Engine |
| VMware as-is | VMware Engine |
| Refactoring to containers or serverless | GKE, App Engine, Cloud Run |
| DB migration | Database Migration Service |
| Data migration | gsutil, Transfer Service, Appliance |
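The data-size thresholds in the migration diagram are rules of thumb, and available bandwidth matters as much as volume. The sketch below encodes those thresholds and adds a rough transfer-time estimate so both can be considered together. The function names and the 80% link-efficiency default are assumptions for illustration; decimal terabytes (10^12 bytes) are used throughout.

```python
def choose_transfer_tool(size_tb):
    """Apply the size thresholds from the migration diagram."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if size_tb < 1:
        return "gsutil"
    if size_tb <= 10:
        return "Storage Transfer Service"
    return "Transfer Appliance"


def transfer_days(size_tb, bandwidth_mbps, efficiency=0.8):
    """Estimate days needed to move size_tb over a link of bandwidth_mbps.

    efficiency is the assumed fraction of the link usable for the transfer.
    """
    if bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8
    seconds = bits / (bandwidth_mbps * 1e6 * efficiency)
    return seconds / 86400


if __name__ == "__main__":
    for size in (0.5, 5, 50):
        days = transfer_days(size, bandwidth_mbps=100)
        print(f"{size} TB -> {choose_transfer_tool(size)}, about {days:.1f} days at 100 Mbps")
```

If the online estimate runs to weeks, an offline option such as Transfer Appliance may be worth considering even below the size threshold.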
🔮 1.5 – Planning for Future Improvements
Architects must build with modernization in mind.
📈 Cloud Modernization Journey
flowchart TD
A[VMs in Compute Engine] --> B[Containers in GKE]
B --> C[Microservices on Cloud Run]
C --> D[Event-Driven Architecture using Pub Sub]
D --> E[Integration with AI and ML using Vertex AI]
E --> F[Data Mesh or BigQuery Federation]
Evolve architecture:
- Start with VMs (IaaS)
- Shift to containers (GKE)
- Modernize with Cloud Run
- Add event-driven processing (Pub/Sub)
- Integrate AI/ML (Vertex AI)
- Unify data (BigQuery Federation)