1. Dimensional modelling
  2. Difference between ETL and ELT, and the steps of each process
  3. Given a business problem, how would you construct the data model?
  4. How Spark or an MPP database joins data under the hood: shuffling, broadcasting, hash join, sort-merge join, nested loop join, etc.
  5. https://datalemur.com/
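
For topic 4, it can help to see the three local join algorithms side by side. The sketch below is a single-process illustration in plain Python, not how Spark or any MPP engine actually implements them. The function names, the `key` parameter, and the sample rows are all made up for this example. Shuffling and broadcasting are about how rows get moved between nodes before a local join like one of these runs, so they are not modelled here.

```python
from collections import defaultdict

def first_field(row):
    """Hypothetical default key extractor: join on the first tuple field."""
    return row[0]

def nested_loop_join(left, right, key=first_field):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if key(l) == key(r)]

def hash_join(left, right, key=first_field):
    """Build a hash table on one side (ideally the smaller), probe with the other."""
    table = defaultdict(list)
    for r in right:                      # build phase
        table[key(r)].append(r)
    # probe phase: .get() avoids creating empty entries for missing keys
    return [(l, r) for l in left for r in table.get(key(l), [])]

def sort_merge_join(left, right, key=first_field):
    """Sort both sides by key, then walk them in step with two cursors."""
    left = sorted(left, key=key)
    right = sorted(right, key=key)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == kl:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and key(left[i]) == kl:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out

# Made-up sample data: (customer_id, payload)
orders = [(1, "order-a"), (2, "order-b"), (1, "order-c"), (3, "order-d")]
customers = [(1, "alice"), (2, "bob"), (4, "dave")]

for join in (nested_loop_join, hash_join, sort_merge_join):
    print(join.__name__, sorted(join(orders, customers)))
```

All three return the same inner-join pairs; they differ in cost and in what they require of the input (nothing, a side that fits in memory as a hash table, or sorted input). That trade-off is the usual starting point for explaining why an engine picks one strategy over another.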