Cloud Engineer Interview Prep

AWS, Azure, or GCP. Technical questions on services, networking, and security plus behavioural questions about projects.

15 questions · 45-60 min typical · 10 technical, 3 behavioural, 2 scenario

General tips for this role

  • Always draw on the whiteboard. Even if simple, visualising shows architectural thinking.
  • When unsure of an exact AWS service name, describe what it does. 'A managed message queue' is fine even if you cannot remember 'SQS'.
  • Mention cost. Senior engineers always think about cost.
  • Mention security. Always.
  • Ask clarifying questions before diving in. 'Is this a startup MVP or an enterprise migration?' The answer changes everything.

What is the difference between IaaS, PaaS, and SaaS?

easy · technical
Model answer

IaaS (Infrastructure as a Service) gives you raw virtual machines and storage; you install and manage everything on top. Example: AWS EC2. PaaS (Platform as a Service) gives you a platform to deploy code; you do not manage the OS. Example: Azure App Service. SaaS (Software as a Service) is ready-to-use software; you just log in. Example: Salesforce. The more of the stack the provider manages, the less control you have and the less you are responsible for.

Tip

Use the 'pizza analogy': IaaS is buying the ingredients, PaaS is take-and-bake, SaaS is delivered hot.

What is the difference between Availability Zones and Regions?

easy · technical
Model answer

A region is a geographic area (e.g. eu-west-2 = London). Each region has multiple Availability Zones (AZs), each made up of one or more physically separate data centres within the region. AZs let you build apps that survive the failure of a single data centre. For disaster recovery against a region-wide outage, you also need to think about multiple regions.

How would you design a highly available web app on AWS?

medium · technical
Model answer

Run the web tier on EC2 instances in an Auto Scaling group across at least two Availability Zones. Put an Application Load Balancer in front to distribute traffic. Use RDS Multi-AZ for the database. Store static assets in S3 with CloudFront as a CDN. Add Route 53 health checks for DNS failover. As a rough guide, a 99.99% target needs at least two AZs, and a 99.999% target usually needs multi-region.

Tip

Always mention BOTH compute AND data layers. Many candidates forget the database.
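
The AZ counts in the model answer are rules of thumb rather than guarantees. A minimal sketch of the arithmetic behind them, assuming every zone has the same availability and zones fail independently (both are simplifying assumptions, and the 0.99 figure is illustrative, not a published AWS number):

```python
def combined_availability(zone_availability: float, zones: int) -> float:
    """Availability of a service that stays up while at least one zone is up.

    Assumes zones fail independently, which real outages do not always respect.
    """
    if not 0.0 <= zone_availability <= 1.0:
        raise ValueError("zone_availability must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # The service is down only when every zone is down at the same time.
    return 1.0 - (1.0 - zone_availability) ** zones


# Illustrative input only: each zone is assumed to be 99% available.
for zone_count in (1, 2, 3):
    availability = combined_availability(0.99, zone_count)
    print(f"{zone_count} zone(s): {availability * 100:.4f}%")
```

Under these assumptions, each extra zone multiplies the chance of a total outage by the per-zone failure probability, which is why a second AZ is the cheapest large gain. Shared dependencies (a single database, a regional control plane) break the independence assumption, so real numbers come out lower.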

What is the difference between Security Groups and NACLs in AWS?

medium · technical
Model answer

Security Groups are stateful and applied at the instance (network interface) level: return traffic is automatically allowed. NACLs are stateless and applied at the subnet level: you must explicitly allow both inbound and outbound traffic, including the ephemeral ports used by replies. Security Groups are 'allow only'; NACLs can both allow and deny.

Tip

Most candidates remember the names but not which is stateful. Memorise: SG = stateful, NACL = not.
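
To make the stateful/stateless distinction concrete, here is a toy model in Python. It is a sketch only: the class names, addresses, and rule shapes are hypothetical, and real Security Groups and NACLs have far richer rule sets (protocols, CIDR ranges, rule numbers).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int

    def reply(self) -> "Packet":
        """The packet travelling back along the same connection."""
        return Packet(self.dst, self.dst_port, self.src, self.src_port)


@dataclass
class StatefulFilter:
    """Security-group-like: remembers connections it has let in."""
    inbound_ports: set
    _connections: set = field(default_factory=set)

    def allow_inbound(self, pkt: Packet) -> bool:
        if pkt.dst_port in self.inbound_ports:
            self._connections.add(pkt)  # track the connection
            return True
        return False

    def allow_outbound(self, pkt: Packet) -> bool:
        # A reply to a tracked connection needs no outbound rule.
        return pkt.reply() in self._connections


@dataclass
class StatelessFilter:
    """NACL-like: each direction is checked against its own rules."""
    inbound_ports: set
    outbound_ports: range

    def allow_inbound(self, pkt: Packet) -> bool:
        return pkt.dst_port in self.inbound_ports

    def allow_outbound(self, pkt: Packet) -> bool:
        return pkt.dst_port in self.outbound_ports


# Hypothetical HTTPS request from a client using ephemeral port 50000.
request = Packet("203.0.113.10", 50000, "10.0.1.5", 443)

sg = StatefulFilter(inbound_ports={443})
nacl_no_return_rule = StatelessFilter(inbound_ports={443}, outbound_ports=range(0))
nacl_with_return_rule = StatelessFilter(inbound_ports={443}, outbound_ports=range(1024, 65536))

print(sg.allow_inbound(request), sg.allow_outbound(request.reply()))
print(nacl_no_return_rule.allow_inbound(request), nacl_no_return_rule.allow_outbound(request.reply()))
print(nacl_with_return_rule.allow_outbound(request.reply()))
```

The stateless filter without an outbound rule accepts the request but drops the reply, which is the classic NACL misconfiguration: the ephemeral port range has to be opened explicitly.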

How does Auto Scaling work and what are its limitations?

medium · technical
Model answer

Auto Scaling automatically adds or removes EC2 instances based on metrics (CPU, network, custom CloudWatch metrics). It needs a launch template and an Auto Scaling group. Limitations: scaling out takes minutes, so sudden spikes are not absorbed instantly; new instances need time to boot and warm up before they can serve traffic; and you pay for new instances from the moment they launch. For faster reaction to spikes, consider Lambda or warm pools of pre-initialised instances.
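
The scaling decision can be sketched as a proportional rule: grow or shrink the group so the metric moves back towards its target. This is a simplified illustration, not AWS's actual target-tracking algorithm, and the function name and parameters are hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Proportional scaling: size the group so the metric returns to target.

    Simplified on purpose: no cooldowns, no warm-up time, no smoothing.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * metric / target)
    # The group never leaves its configured bounds.
    return max(min_size, min(max_size, raw))


# 4 instances at 90% average CPU with a 60% target.
print(desired_capacity(current=4, metric=90.0, target=60.0, min_size=2, max_size=10))
```

The sketch also shows where the limitations come from: the rule only reacts after the metric has already risen, and the new capacity is useless until the instances have booted.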

Explain how you would secure data at rest and in transit on AWS.

medium · technical
Model answer

At rest: use AWS KMS for key management; enable encryption on S3 (SSE-S3 or SSE-KMS), EBS volumes, RDS, and DynamoDB. In transit: use TLS 1.2+ everywhere, force HTTPS via ALB/CloudFront, use VPC endpoints to keep traffic off the public internet. Add IAM least-privilege policies and CloudTrail for auditing.

Tip

Encryption at rest and encryption in transit are different controls. Always mention both.
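
One common way to enforce the "in transit" half for S3 is a bucket policy that denies any request not made over TLS. The sketch below builds such a policy as a Python dictionary; the bucket name is a placeholder and the helper function is hypothetical, while the `aws:SecureTransport` condition key is the standard IAM one.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Bucket policy that rejects requests made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# "example-bucket" is a placeholder name for illustration.
print(json.dumps(deny_insecure_transport_policy("example-bucket"), indent=2))
```

An explicit Deny wins over any Allow, so this holds even if another policy grants broad access. Encryption at rest is configured separately, on the bucket or the volume.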

Walk me through how you would migrate a legacy on-premises application to the cloud.

hard · technical
Model answer

Use the AWS 6 Rs framework (newer AWS guidance adds a seventh R, Relocate). 1) Discover and assess: inventory apps, dependencies, performance, compliance needs. 2) Categorise each app into the 6 Rs: Rehost (lift-and-shift), Replatform, Repurchase (move to SaaS), Refactor, Retire, Retain. 3) Pilot a low-risk app first to validate tooling and process. 4) Build a landing zone (VPCs, IAM, logging, security baseline). 5) Migrate in waves, starting with non-critical apps. 6) Optimise after migration: right-size instances, adopt managed services, set up FinOps.

Tip

Mention 'landing zone': it shows enterprise experience.

What are the trade-offs between Lambda and EC2?

hard · technical
Model answer

Lambda: serverless, pay per execution, scales to zero, 15-min max runtime, cold starts. Great for event-driven, short-lived tasks. EC2: full control, runs continuously, predictable cost at scale, no cold starts. Better for long-running services or when you need a specific OS. Cost crossover: as a rough rule of thumb, once your workload is busy more than about 30% of the time, EC2 tends to become cheaper than Lambda; the exact point depends on memory size, instance type, and pricing model.

Tip

Mention cold starts: interviewers love it when you acknowledge Lambda is not always the answer.
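
The cost crossover can be estimated with simple arithmetic. The sketch below is an approximation under stated assumptions: it compares one always-on instance with one Lambda execution environment, ignores Lambda's per-request charge and any free tier, and takes prices as inputs because they change by region and over time. The function names are hypothetical and the numbers in the example are placeholders, not current AWS prices.

```python
SECONDS_PER_HOUR = 3600


def lambda_hourly_cost(price_per_gb_second: float, memory_gb: float,
                       busy_fraction: float) -> float:
    """Compute charge for one hour in which the function runs busy_fraction of the time."""
    return price_per_gb_second * memory_gb * SECONDS_PER_HOUR * busy_fraction


def break_even_busy_fraction(ec2_hourly_price: float, price_per_gb_second: float,
                             memory_gb: float) -> float:
    """Busy fraction above which the always-on instance is cheaper.

    A result above 1.0 means Lambda stays cheaper even at 100% utilisation
    under this simplified model.
    """
    always_busy = lambda_hourly_cost(price_per_gb_second, memory_gb, 1.0)
    return ec2_hourly_price / always_busy


# Placeholder prices chosen for round numbers; look up real ones before quoting.
fraction = break_even_busy_fraction(
    ec2_hourly_price=0.036, price_per_gb_second=0.00001, memory_gb=2.0
)
print(f"EC2 becomes cheaper above roughly {fraction:.0%} utilisation")
```

The takeaway for an interview is the shape of the trade-off rather than the exact figure: Lambda's cost scales with busy time, EC2's does not, so the break-even point moves with memory size and instance price.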

How would you monitor a microservices architecture?

hard · technical
Model answer

Three pillars: metrics, logs, traces. Metrics with CloudWatch/Prometheus + Grafana dashboards (CPU, latency, error rate, request rate). Centralised logs (CloudWatch Logs, ELK, Datadog). Distributed tracing (X-Ray, Jaeger, Datadog APM) to follow a request across services. Set up alerts on SLO breaches, not raw metrics. Implement health checks at the service and dependency level.
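
Alerting on SLO breaches rather than raw metrics is often done with a burn rate: how fast the error budget is being spent relative to plan. A minimal sketch, with hypothetical function names; the default threshold of 14.4 is a commonly cited value (it corresponds to spending 2% of a 30-day budget in one hour) and should be treated as an assumption to tune.

```python
def burn_rate(errors: int, requests: int, slo: float) -> float:
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    observed_error_rate = errors / requests
    allowed_error_rate = 1.0 - slo
    return observed_error_rate / allowed_error_rate


def should_page(errors: int, requests: int, slo: float,
                threshold: float = 14.4) -> bool:
    """Page only when the budget is burning much faster than planned."""
    return burn_rate(errors, requests, slo) >= threshold


# 50 failures in 10,000 requests against a 99.9% SLO: elevated, but not a page.
print(round(burn_rate(50, 10_000, 0.999), 2), should_page(50, 10_000, 0.999))
```

Compared with a fixed "error rate above X" alarm, this ties the alert to what the SLO actually permits, so a service with a looser SLO pages less often for the same raw error rate.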

What is Infrastructure as Code and why does it matter?

medium · technical
Model answer

IaC is defining infrastructure (servers, networks, databases) in code rather than clicking in a console. Tools: Terraform, AWS CloudFormation, Pulumi, Azure Bicep. Benefits: version control, code review, repeatable deployments, easy rollback, and identical environments (dev/staging/prod) on demand. It is the foundation of modern DevOps.
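
As an illustration of "identical environments from code", the sketch below generates a small CloudFormation template per environment from one Python function. The helper name, the logical ID `AssetsBucket`, and the tag key are hypothetical choices; the resource type and property names are standard CloudFormation. In practice you would usually write the template (or Terraform) directly rather than generate it this way.

```python
import json


def bucket_template(environment: str) -> dict:
    """CloudFormation template for one encrypted, versioned S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Static assets bucket for {environment}",
        "Resources": {
            "AssetsBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
                        ]
                    },
                    "VersioningConfiguration": {"Status": "Enabled"},
                    "Tags": [{"Key": "environment", "Value": environment}],
                },
            }
        },
    }


# Same definition, three environments: only the inputs differ.
templates = {env: bucket_template(env) for env in ("dev", "staging", "prod")}
print(json.dumps(templates["dev"], indent=2))
```

Because the definition lives in a file, a change to it goes through the same review and version history as application code, and every environment gets the same change.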

Tell me about a time you handled a production outage.

medium · behavioural
Model answer

Use STAR. Situation: which system went down, when, how you found out. Task: your role in the response. Action: triage, hypothesis, fix, communication. Result: how long it lasted, customer impact, what you learned. End with the postmortem and improvements you made.

Tip

Interviewers want to hear about ownership, calm under pressure, and learning. Never blame others.

How do you balance speed with reliability when shipping changes?

medium · behavioural
Model answer

Talk about your CI/CD pipeline, automated testing, blue-green or canary deployments, and feature flags. Mention error budgets if you can. Show you take BOTH seriously and have processes that catch issues early.

Tip

Mentioning 'error budgets' (the SRE concept) signals seniority.
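
If you do mention error budgets, it helps to be able to do the arithmetic on the spot. A small sketch, assuming a time-based availability SLO over a rolling window; the function names and the ship/hold rule are hypothetical simplifications of how teams actually use budgets.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of downtime the SLO permits over the window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def can_ship_risky_change(slo: float, downtime_minutes_so_far: float,
                          window_days: int = 30) -> bool:
    """Simplified policy: ship while budget remains, otherwise focus on reliability."""
    return downtime_minutes_so_far < error_budget_minutes(slo, window_days)


# A 99.9% SLO over 30 days allows about 43 minutes of downtime.
print(f"{error_budget_minutes(0.999):.1f} minutes")
```

This is the link between speed and reliability: while budget remains, the team can take deployment risk; once it is spent, the agreed response is to slow down and fix reliability first.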

Why do you want to work in cloud engineering?

easy · behavioural
Model answer

Honest answer + one technical reason. 'I love that cloud lets you build globally-scaled systems with a few API calls. The fact that I can spin up a multi-region database in 5 minutes still amazes me. I want to work with that scale.'

Tip

Avoid generic answers like 'It's the future'. Pick something specific.

Your team's AWS bill has doubled in the last month. How do you investigate?

hard · scenario
Model answer

Start with Cost Explorer to find which services drove the increase. Filter by service, then by tag, then by resource. Common culprits: unattached EBS volumes, idle EC2 instances, data transfer between AZs, NAT Gateway egress, unused EIPs. Set up Budgets to alert before it happens again. Implement a tagging policy so all resources can be attributed. Consider Reserved Instances or Savings Plans for steady workloads. Long-term: implement FinOps practices.
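
The first step, finding which services drove the increase, amounts to a month-over-month diff grouped by service. The sketch below does this on plain (service, cost) line items, such as rows from a billing export. The function name is hypothetical and the figures are made-up sample data for illustration only.

```python
from collections import defaultdict


def cost_increase_by_service(previous, current):
    """Return (service, increase) pairs, largest increase first.

    previous and current are iterables of (service, cost) line items.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # service -> [previous, current]
    for service, cost in previous:
        totals[service][0] += cost
    for service, cost in current:
        totals[service][1] += cost
    deltas = [(service, round(cur - prev, 2)) for service, (prev, cur) in totals.items()]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


# Made-up sample data, not real billing figures.
last_month = [("EC2", 1200.0), ("NAT Gateway", 150.0), ("S3", 300.0)]
this_month = [("EC2", 1250.0), ("NAT Gateway", 1400.0), ("S3", 310.0), ("EBS", 200.0)]

for service, delta in cost_increase_by_service(last_month, this_month):
    print(f"{service}: {delta:+.2f}")
```

Sorting by absolute increase rather than percentage keeps attention on the line items that actually explain a doubled bill. The same grouping can then be repeated by tag and by resource.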

A new junior joins your team. How do you help them ramp up on AWS?

medium · scenario
Model answer

Give them read-only access first. Pair them on the architecture diagram of one core service. Have them go through AWS Skill Builder fundamentals (1 week). Assign a small, low-risk Terraform PR. Code review with detailed comments. Set up a weekly 1:1 to ask questions. After a month, they should be able to deploy a basic service end-to-end with supervision.

Tip

Shows you can mentor, which is important for senior roles.
