Scenario

Consider a scenario where we have an employee_work_location table with columns such as employee_name, work_location, start_date, and end_date. The goal is to extract the active employee for each month.

Challenges

Example Data

employee_namework_locationstart_dateend_date
emp_aOffice A2022-01-012022-12-31
emp_bOffice B2021-05-152022-02-28
emp_cOffice C2022-03-102023-01-15
emp_dOffice A2021-08-202022-11-30
emp_aOffice B2021-03-052022-09-15
emp_bOffice C2022-06-122023-03-01
emp_cOffice A2021-12-012022-06-30
emp_dOffice B2022-02-152022-12-15
emp_eOffice C2021-07-102022-08-25
emp_fOffice A2022-04-182023-02-28

To achieve our goal, we can use a single SQL query with the help of range/cartesian joins and a temporary date series. The query generates a series of months from the minimum start date to the maximum end date, performing a cross join with the employee_work_location table. This ensures that we capture all relevant combinations.

SQL Query:

WITH date_series AS (
  SELECT generate_series(
           (SELECT DATE_TRUNC('month', MIN(start_date)) FROM employee_work_location),
           (SELECT DATE_TRUNC('month', MAX(end_date)) + INTERVAL '1 month' FROM employee_work_location),
           '1 month'::interval
         )::DATE AS work_month
)
SELECT
  ds.work_month,
  ew.employee_name,
  ew.work_location
FROM
  date_series ds
CROSS join  employee_work_location ew
where DATE_TRUNC('month', ew.start_date) and DATE_TRUNC('month', ew.end_date)
ORDER BY
  ds.work_month,  ew.employee_name,   ew.work_location;

Note: We are truncating the month to get the interval from beginning of the month, and adding 1 month to the interval through + INTERVAL '1 month' to not miss the max end month

Result

Now output of the our query presents a comprehensive view of employee work locations for each month.

work_monthemployee_namework_location
2021-03-01emp_aOffice B
2021-04-01emp_aOffice B
2021-01-01emp_aOffice B
2021-06-01emp_aOffice B
2021-06-01emp_bOffice B
2022-12-01emp_bOffice C
2022-12-01emp_cOffice C
2022-12-01emp_dOffice B
2022-12-01emp_fOffice A
2023-01-01emp_bOffice C
2023-01-01emp_cOffice C
2023-01-01emp_fOffice A
2023-02-01emp_bOffice C
2023-02-01emp_fOffice A
2023-03-01emp_bOffice C

Further Enhancements: Separating Months and Years

For additional clarity, we can further refine the result by extracting the year and month using PostgreSQL’s EXTRACT() function.

WITH date_series AS (
  SELECT generate_series(
           (SELECT DATE_TRUNC('month', MIN(start_date)) FROM employee_work_location),
           (SELECT DATE_TRUNC('month', MAX(end_date)) + INTERVAL '1 month' FROM employee_work_location),
           '1 month'::interval
         )::DATE AS work_month
)
SELECT
  extract(year FROM  ds.work_month) as "year",
  extract(month FROM  ds.work_month) as "month",
  ew.employee_name,
  ew.work_location
FROM
  date_series ds
CROSS join  employee_work_location ew
where DATE_TRUNC('month', ds.work_month) between DATE_TRUNC('month', ew.start_date) and DATE_TRUNC('month', ew.end_date)
ORDER BY
  "year", "month",  ew.employee_name,   ew.work_location;

The result now includes separate columns for the year and month, providing a more organized view of the data.

yearmonthemployee_namework_location
20213emp_aOffice B
20214emp_aOffice B
20214emp_aOffice B
20215emp_aOffice B
20216emp_aOffice B
20231emp_cOffice C
20231emp_fOffice A
20232emp_bOffice C
20232emp_fOffice A
20233emp_bOffice C

With these SQL techniques, we can efficiently extract and organize data, gaining valuable insights into temporal relationships.