Data Engineer
Build the data pipelines and infrastructure that power analytics and AI. Work with big data technologies and cloud data platforms.
What a typical day looks like
I work on the pipelines that move data from where it is created to where it is used. Most mornings start with checking that all the overnight jobs finished: the Airflow dashboard, the dbt run logs. If something failed, that is the first priority. Then I move to the day's main work, usually a new pipeline, a refactor, or a performance fix. Afternoons often have a coordination element: meeting with analysts whose queries are slow, meeting with engineers whose system needs to emit better events, meeting with product about new data they want to capture. The role is more relational than people expect. Strong data engineers are part SQL wizard, part diplomat.
Hour-by-hour
Skills you need
Required
Nice to have
Build these to stand out
Hands-on projects beat any CV bullet point. Pick one and finish it.
ETL Pipeline with Airflow + dbt
Pull data from a public API daily, transform with dbt, load to a warehouse (BigQuery free tier or DuckDB). Schedule with Airflow. Add tests. Document with dbt docs.
Demonstrates the modern data engineering stack. Solid portfolio piece.
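A rough idea of the core of such a project, as a minimal sketch rather than a reference implementation. It assumes the API returns a list of dicts with `id` and `value` fields (a made-up shape), uses Python's built-in `sqlite3` as a stand-in for BigQuery or DuckDB, and replaces dbt tests with one hand-written check. The names `transform`, `load`, `check_quality`, and `run_pipeline` are hypothetical; in the real project, Airflow would call something like `run_pipeline` once per day.

```python
import sqlite3
from datetime import date


def transform(raw_rows, run_date):
    """Drop malformed rows and stamp the rest with the load date."""
    cleaned = []
    for row in raw_rows:
        if row.get("id") is None or row.get("value") is None:
            continue  # skip records missing a required field
        cleaned.append((int(row["id"]), float(row["value"]), run_date.isoformat()))
    return cleaned


def load(conn, rows):
    """Idempotent load: re-running the same day overwrites, never duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_values ("
        "id INTEGER, value REAL, load_date TEXT, "
        "PRIMARY KEY (id, load_date))"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_values VALUES (?, ?, ?)", rows)
    conn.commit()


def check_quality(conn):
    """Hand-rolled version of a not-null test on the loaded table."""
    nulls = conn.execute(
        "SELECT COUNT(*) FROM daily_values WHERE id IS NULL OR value IS NULL"
    ).fetchone()[0]
    return nulls == 0


def run_pipeline(conn, fetch, run_date):
    """fetch is any callable returning raw records (assumed: list of dicts)."""
    load(conn, transform(fetch(), run_date))
    return check_quality(conn)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    fake_api = lambda: [{"id": 1, "value": "2.5"}, {"id": None, "value": 3}]
    print(run_pipeline(conn, fake_api, date(2024, 1, 1)))
```

The design point worth copying into the real project is idempotency: keying the table on `(id, load_date)` means a retried or backfilled run leaves the warehouse in the same state as a single clean run.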
Streaming Data Pipeline
Build a Kafka producer/consumer system. Produce events to one topic, consume and aggregate to another. Add windowed aggregations with Kafka Streams or Flink.
Streaming is harder than batch. Companies need senior engineers who can do it.
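To make "windowed aggregation" concrete, here is a minimal pure-Python sketch of a tumbling (fixed, non-overlapping) window count. It is not Kafka Streams or Flink code: it assumes events arrive as `(timestamp_seconds, key)` tuples already read from a topic, and it ignores late or out-of-order handling, which is where much of the real difficulty in streaming lies. The function name `tumbling_window_counts` is made up for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    events: iterable of (timestamp_seconds, key) tuples (assumed shape).
    """
    counts = defaultdict(int)
    for ts, key in events:
        # Align the timestamp down to the start of its window.
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [(0, "a"), (30, "a"), (59, "b"), (60, "a"), (125, "b")]
    for window_key, count in sorted(tumbling_window_counts(sample, 60).items()):
        print(window_key, count)
```

In the project itself, the same grouping would be expressed through the streaming framework's own windowing operators, and the aggregated results would be written to the second topic instead of returned as a dict.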
Open-Source Contribution
Pick a data engineering tool (dbt, Airflow, dlt, Meltano). Find a 'good first issue' on GitHub. Submit a PR. Get it merged.
A merged OSS PR is gold on a CV. It shows you can navigate someone else's codebase.