A good Data Engineer must be familiar with GUI-based tools and be able to develop efficient code for data engineering activities, such as developing and maintaining ETL/ELT pipelines, data transformations, feature engineering to support model training, and insights generation.

Delivery Mode: Online
Duration: 2 hours
Prerequisites: None
Includes Hands-On Session: Yes

Target Audience

This session is suitable for:

  • Junior Data Engineers
  • Data Analysts who wish to brush up their Python programming skills
  • Other data professionals who feel the need to brush up their Python programming skills
  • University students with an interest in Data Engineering

Who Is A Junior Data Engineer

While the definition of a Junior Data Engineer, and the specific requirements for the role, vary by organisation (and in some cases by geography), the bare minimum characteristics are:

  • Less than 3 years of relevant work experience as a Data Engineer.
  • Build and maintain scalable end-to-end data pipelines and ETL/ELT processes.
  • Good programming skills in one or more programming languages, such as Python, Scala, Rust, and Kotlin.
  • Development and maintenance of scripts and associated code for automating activities in the data pipelines.
  • Implement methods to improve data reliability and quality.
  • Awareness of the 18 DataOps principles, especially

Minimum Python Skills Expected

As of November 2023, the minimum Python skills expected from Junior Data Engineers are:

  • Generators
  • Iterators
  • Object-Oriented Programming
  • Regular Expressions
  • Threading and Multiprocessing
  • Unit Testing
  • Profiling Python Code
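
As a rough illustration of how two of these skills combine in a typical data engineering task, the sketch below uses a generator to stream records lazily and a regular expression to parse them. The line format, the `parse_lines` function, and the sample data are all hypothetical, chosen only for illustration.

```python
import re
from typing import Iterable, Iterator

# Hypothetical line format (an assumption): "<date> <level> <message>"
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)$"
)


def parse_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Lazily yield one dict per well-formed line, skipping malformed ones."""
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match:
            yield match.groupdict()


# Made-up sample input; in practice this could be an open file handle.
sample = [
    "2023-11-01 INFO pipeline started",
    "not a valid line",
    "2023-11-01 ERROR load failed",
]

records = list(parse_lines(sample))
error_count = sum(1 for r in records if r["level"] == "ERROR")
```

Because `parse_lines` is a generator, it processes one line at a time, so the same function can be pointed at a large file without loading it fully into memory.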

You can read our blog post for simple explanations of the skills listed above:
📰 “Minimum Python Expected From Junior Data Engineers”

What Will Be Covered

In this session, you will:

  • Get an overview of each of the minimum Python skills expected from Junior Data Engineers.
  • Develop (via iterative improvements) efficient Python code for a few data engineering tasks, such as data transformations, and feature engineering to support model training.
  • Learn how to write unit tests to improve the reliability of the code you develop.
  • Learn the basic concepts of mocking.
  • Learn how to speed up slow (inefficient) Python code.
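
To give a flavour of the unit testing and mocking topics listed above, here is a minimal sketch using the standard library's `unittest` and `unittest.mock`. The `load_active_users` function and the `fetch_users` method on the client are hypothetical names invented for this example; they are not taken from the session material.

```python
import unittest
from unittest.mock import Mock


def load_active_users(client):
    """Return lower-cased names of the active users fetched via `client`."""
    # `fetch_users` is a hypothetical method on a hypothetical client object.
    rows = client.fetch_users()
    return [row["name"].lower() for row in rows if row["active"]]


class LoadActiveUsersTest(unittest.TestCase):
    def test_filters_and_normalises(self):
        # The mock stands in for a real data source, so no network or
        # database access is needed to run the test.
        client = Mock()
        client.fetch_users.return_value = [
            {"name": "Alice", "active": True},
            {"name": "Bob", "active": False},
        ]
        self.assertEqual(load_active_users(client), ["alice"])
        client.fetch_users.assert_called_once_with()


# Run the test programmatically so the sketch also works inside a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoadActiveUsersTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Passing the client in as an argument is a deliberate design choice here: it lets the test substitute a `Mock` without patching any module-level names.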

Hands-On Session

We will use Google Colab notebooks to simplify the process by eliminating the need to install prerequisite libraries.

Upcoming Sessions

When: Saturday, 27 Jan 2024, 4 - 6 PM 🌏
Where: Online (Register)

Please note:

  • All times above are expressed in SGT. Click on the 🌏 icon next to a session of interest to see the corresponding date and time for your location.
  • For reference, 4 PM SGT is 1:30 PM (Bangalore), 3 PM (Jakarta), 1 PM (Lahore).

Other Dates & Times

👋 I am interested but don’t see a date or time that works for me.

You can indicate your preferred date & time when you register.

What’s Next?

What should I learn after attending this session?

Tracks

This session covers content that is part of the following tracks:

Full List

Further Reading

  1. Minimum Python Expected From Junior Data Engineers (Nov, 2023)