DevOps Engineer Interview Prep

Linux, Docker, Kubernetes, CI/CD, Terraform, monitoring. Mix of dev and ops, mid-to-senior salaries.

11 questions · 60 min, often whiteboard + live troubleshooting · 9 technical, 1 behavioural, 1 scenario

General tips for this role

  • Master one cloud, one orchestration tool, one IaC tool, one CI/CD platform. Depth beats breadth.
  • Be ready to whiteboard a CI/CD pipeline end to end.
  • Use the right vocabulary: SLO, SLI, error budget, blast radius, blameless postmortem.
  • Mention security unprompted: secrets management, IAM, network policies.
  • Have a homelab story. Running Kubernetes at home (k3s, microk8s) signals genuine passion.

What is the difference between Docker and a virtual machine?

easy · technical
Model answer

VMs virtualise hardware: each VM runs a full guest OS, typically needs gigabytes of memory, and takes tens of seconds to minutes to boot. Containers virtualise the OS: they share the host kernel, typically need megabytes, and usually start in under a second. Containers are lighter, faster to start, and easier to orchestrate. VMs offer stronger isolation and can run a different OS or kernel from the host. Containers are the dominant choice for cloud-native apps.

Explain a typical CI/CD pipeline you have built.

medium · technical
Model answer

Trigger: developer pushes to a branch. CI: install dependencies, run linting, run unit tests, run integration tests, build artifact (Docker image). Push image to registry. CD: deploy to staging automatically. Run smoke tests. Promote to production on manual approval or after additional automated checks. Notify Slack on failure. Rollback strategy ready (blue-green, canary, or feature flags). Tools: GitHub Actions, GitLab CI, Jenkins, ArgoCD.
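
The stage ordering in the answer above can be sketched as a small fail-fast runner. This is only an illustration of the control flow, not how any of the named tools is implemented; the stage names and the `tests_pass` / `approved` flags are hypothetical stand-ins for real jobs and a manual approval gate.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a step that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure (fail fast)."""
    log: List[str] = []
    for name, step in stages:
        if not step():
            log.append(f"FAILED: {name}")
            return False, log
        log.append(name)
    return True, log


def build_stages(tests_pass: bool, approved: bool) -> List[Stage]:
    """Hypothetical stages mirroring the CI then CD flow described above."""
    return [
        ("lint", lambda: True),
        ("unit-tests", lambda: tests_pass),
        ("build-image", lambda: True),
        ("deploy-staging", lambda: True),
        ("smoke-tests", lambda: True),
        ("approval-gate", lambda: approved),  # manual approval before production
        ("deploy-production", lambda: True),
    ]
```

Calling `run_pipeline(build_stages(tests_pass=False, approved=True))` stops at the unit tests, so nothing is built or deployed. That early exit is the property worth pointing at on a whiteboard.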

What is Kubernetes and what problems does it solve?

medium · technical
Model answer

Container orchestration platform. Manages thousands of containers across many servers. Solves: scheduling (where to run each container), scaling (add more containers when busy), self-healing (restart failed containers), networking (containers can find each other), config and secrets management, rolling deployments. The standard way to run containerised workloads in production.
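
Scaling and self-healing both come from the same idea: a controller repeatedly compares desired state with actual state and corrects the difference. The sketch below is a toy model of that loop under simplifying assumptions (a single in-memory controller, pods as dictionary entries, made-up `web-N` names); it is not the Kubernetes implementation.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict


@dataclass
class Controller:
    """Toy reconciler: keeps the number of running pods equal to desired_replicas."""
    desired_replicas: int
    pods: Dict[str, str] = field(default_factory=dict)  # pod name -> "Running" or "Failed"
    _ids: count = field(default_factory=count)           # source of fresh pod numbers

    def reconcile(self) -> None:
        # Self-healing: forget pods that are no longer running.
        self.pods = {name: s for name, s in self.pods.items() if s == "Running"}
        # Scale up: create replacements until the desired count is reached.
        while len(self.pods) < self.desired_replicas:
            self.pods[f"web-{next(self._ids)}"] = "Running"
        # Scale down: remove surplus pods.
        while len(self.pods) > self.desired_replicas:
            self.pods.popitem()
```

Marking a pod as `"Failed"` and calling `reconcile()` again brings the count back to the desired number with a new pod, which is the behaviour the answer calls self-healing.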

Explain the difference between a Deployment, StatefulSet, and DaemonSet in Kubernetes.

medium · technical
Model answer

Deployment: stateless apps whose replicas are interchangeable. Pod names end in a random suffix. Use for web servers and APIs. StatefulSet: stateful apps that need a stable identity and stable storage (databases, queues). Pods have predictable, ordered names (mydb-0, mydb-1). DaemonSet: runs one pod on every eligible node. Use for node-level services such as log collectors (Fluent Bit), monitoring agents (Datadog), and network plugins.
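
The naming and placement differences can be made concrete with a small sketch. It only imitates the shape of the pod names; the suffix alphabet, the helper names, and the node names in the usage note are assumptions for illustration, not what Kubernetes generates character for character.

```python
import random
import string
from typing import Dict, List, Sequence

_ALPHABET = string.ascii_lowercase + string.digits  # illustrative alphabet only


def _suffix(rng: random.Random, length: int = 5) -> str:
    """Random suffix standing in for the one Kubernetes appends."""
    return "".join(rng.choice(_ALPHABET) for _ in range(length))


def deployment_pods(name: str, replicas: int, rng: random.Random) -> List[str]:
    """Interchangeable replicas: shared template hash plus a random suffix each."""
    template_hash = _suffix(rng, 10)
    return [f"{name}-{template_hash}-{_suffix(rng)}" for _ in range(replicas)]


def statefulset_pods(name: str, replicas: int) -> List[str]:
    """Stable identity: ordinal names that survive restarts."""
    return [f"{name}-{i}" for i in range(replicas)]


def daemonset_pods(name: str, nodes: Sequence[str], rng: random.Random) -> Dict[str, str]:
    """One pod per node: the pod count follows the node count, not a replica field."""
    return {node: f"{name}-{_suffix(rng)}" for node in nodes}
```

`statefulset_pods("mydb", 2)` returns `["mydb-0", "mydb-1"]`, matching the names in the answer, while `daemonset_pods("fluent-bit", ["node-a", "node-b"], random.Random(0))` returns exactly one entry per node.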

What is Terraform and how does state work?

medium · technical
Model answer

Terraform: an Infrastructure as Code tool. It is declarative: you describe the end state you want and Terraform works out the changes needed to reach it. State file: records what Terraform has created so it knows what to create, update, or destroy. Store the state remotely with locking (S3 + DynamoDB lock, or Terraform Cloud), never only on a local machine and never in Git, because state can contain secrets in plain text. Use workspaces or separate state files per environment.
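
Why state matters is easier to see with a simplified model of planning: compare what the configuration asks for with what the state says exists. The sketch below is a deliberate simplification and not Terraform's actual algorithm, which also refreshes against the provider, orders changes by dependency, and distinguishes in-place updates from replacements. The resource addresses in the usage note are hypothetical.

```python
from typing import Dict, List


def plan(desired: Dict[str, dict], state: Dict[str, dict]) -> Dict[str, List[str]]:
    """Diff desired configuration against recorded state, keyed by resource address."""
    create = sorted(addr for addr in desired if addr not in state)
    destroy = sorted(addr for addr in state if addr not in desired)
    update = sorted(
        addr for addr in desired
        if addr in state and desired[addr] != state[addr]
    )
    return {"create": create, "update": update, "destroy": destroy}
```

With `desired = {"aws_instance.web": {"type": "t3.small"}}` and an empty `state`, the plan proposes creating the instance. If the state file is lost, every resource looks new, which is why losing or corrupting state is so disruptive.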

How would you do a zero-downtime deployment?

hard · technical
Model answer

Several strategies. Rolling update: replace pods gradually, a few at a time, while the rest keep serving (the default strategy for Kubernetes Deployments). Blue-green: deploy the new version alongside the old and switch all traffic at once. Canary: route a small percentage of traffic to the new version and increase it if metrics stay healthy. Feature flags: deploy the code disabled, then enable it for users gradually. Best practice: combine any of these with backwards-compatible (additive-only) database migrations, so old and new versions can run against the same schema during the rollout.
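
The canary strategy reduces to a loop: raise the traffic share step by step and abort if the health metric crosses a threshold. The sketch below assumes a hypothetical `error_rate_at` callback standing in for a real metrics query, and an arbitrary 1% error threshold; real rollout tools add bake times and more signals.

```python
from typing import Callable, List, Tuple


def canary_rollout(
    steps: List[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> Tuple[int, List[str]]:
    """Walk through traffic percentages; return the final weight and a history log.

    The final weight is 0 if any step was unhealthy (full rollback),
    otherwise the last percentage in steps.
    """
    history: List[str] = []
    weight = 0
    for percent in steps:
        observed = error_rate_at(percent)
        if observed > max_error_rate:
            history.append(f"{percent}%: error rate {observed:.3f} too high, rolling back")
            return 0, history
        history.append(f"{percent}%: healthy")
        weight = percent
    return weight, history
```

A rollout over `[5, 25, 50, 100]` with a healthy metric ends at a weight of 100. If the error rate spikes at the 25% step, the function returns 0 after two history entries, so only a quarter of traffic ever saw the bad version.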

Production is down. Walk me through how you respond.

hard · technical
Model answer

(1) Acknowledge the alert. (2) Assess impact: how many users are affected and what is broken. (3) Communicate: post in the incident channel, page the secondary if needed. (4) Triage: what changed recently (deploys, config, infrastructure)? Check dashboards. (5) Hypothesise and test: roll back if a recent deploy is the likely cause. (6) Mitigate: get the service back up. Root cause can wait. (7) After recovery: write a blameless postmortem with action items and share it with the team. Process is more important than heroic individual fixes.

Tip

Interviewers want to hear about communication, not just technical fixes.

What is the difference between push and pull-based monitoring?

medium · technical
Model answer

Push: services send metrics to a central collector (StatsD, Datadog agent). Pull: the monitoring server scrapes metrics from services (Prometheus). Pull is more common in Kubernetes because service discovery makes scrape targets easy to find. Push is better for short-lived jobs (e.g. cron jobs that finish before they can be scraped).
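
The short-lived job problem is easy to demonstrate with two toy collectors. Everything here is an in-memory illustration: the class names, the `nightly-backup` job, and the metric value are made up, and no real StatsD or Prometheus API is being modelled.

```python
from typing import Callable, Dict


class PullCollector:
    """Pull model: the collector reads each registered target when it scrapes."""

    def __init__(self) -> None:
        self.targets: Dict[str, Callable[[], float]] = {}
        self.samples: Dict[str, float] = {}

    def register(self, name: str, read_metric: Callable[[], float]) -> None:
        self.targets[name] = read_metric

    def deregister(self, name: str) -> None:
        self.targets.pop(name, None)

    def scrape(self) -> None:
        for name, read_metric in self.targets.items():
            self.samples[name] = read_metric()


class PushCollector:
    """Push model: services send samples whenever they choose."""

    def __init__(self) -> None:
        self.samples: Dict[str, float] = {}

    def receive(self, name: str, value: float) -> None:
        self.samples[name] = value


def run_short_job(pull: PullCollector, push: PushCollector) -> None:
    """A job that starts and exits between two scrapes."""
    pull.register("nightly-backup", lambda: 42.0)
    push.receive("nightly-backup", 42.0)  # pushed just before exit
    pull.deregister("nightly-backup")     # gone before the next scrape
```

After `run_short_job(...)` followed by `pull.scrape()`, the push collector holds the job's sample and the pull collector does not, because the target disappeared before it was scraped.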

What metrics are most important to monitor?

medium · technical
Model answer

USE method (Utilisation, Saturation, Errors) for resources. RED method (Rate, Errors, Duration) for services. The Four Golden Signals: latency, traffic, errors, saturation. Plus: dependency health (DBs, APIs you depend on), business metrics (signups, orders), and 'is the customer happy' metrics. Avoid alerting on every metric; alert on SLO breaches.
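
Alerting on SLO breaches rests on simple arithmetic: an SLO of 99.9% leaves 0.1% of the window, or of the requests, as error budget. The helper names and the 30-day window below are assumptions chosen for illustration.

```python
def _check_slo(slo: float) -> None:
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")


def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of full downtime the SLO tolerates over the window."""
    _check_slo(slo)
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent (negative if overspent)."""
    _check_slo(slo)
    allowed_failures = (1.0 - slo) * total_requests
    if allowed_failures == 0:
        raise ValueError("total_requests must be positive")
    return 1.0 - failed_requests / allowed_failures
```

For a 99.9% SLO over 30 days the budget is about 43.2 minutes. With one million requests and 250 failures, roughly three quarters of the budget remains. A negative result means the SLO is already breached, which is the condition worth paging on.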

Describe a time you improved a process for your team.

medium · behavioural
Model answer

STAR. Example: 'Our deploys took 2 hours. I migrated CI from Jenkins to GitHub Actions, parallelised the test suite, and added caching. Deploys now take 12 minutes. Team ships 4x more often.' Quantify always.

Tip

Improving the team is what separates senior DevOps from mid-level.

A developer asks you to give them root access on production. How do you respond?

medium · scenario
Model answer

No, but politely and with alternatives. 'Production root is restricted to break-glass scenarios. What are you trying to do? Most needs can be met by read-only access for debugging, deploy permission for your service, or a Kubernetes namespace you can manage. Tell me what you need and I will set up the right access.' This shows security awareness AND collaboration.

Tip

Tests whether you balance security with not being a blocker. The wrong answer is 'sure, here you go'.
