Top 3 Problems with Data Integrations
A few weeks back, I was discussing data products with Lan Nguyen and she talked about the top three types of problems with data integrations.
Lan has extensive experience working on data and logistics products at scale, particularly at Amazon.
According to Lan, the top 3 problems with data integrations are:
Definition (e.g. revenue vs. costs)
Labeling Data Clearly (number types)
Cadence of Data (daily vs weekly)
If you are interviewing for a data-heavy role (think logistics and just about any product in big tech), it is important to be well-versed in the problems you could encounter.
Let’s break down these concepts a bit more.
Definition
The best example here is seeing a dollar amount. Is it a positive or negative amount? Is it the revenue, profit, or cost? Getting that wrong will lead to horrible business decisions and very frustrated customers.
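To make the definition problem concrete, here is a minimal Python sketch. The names (AmountKind, Amount, profit) and the rule that revenue and cost are stored as non-negative magnitudes are assumptions made up for illustration, not something Lan prescribed. The idea is simply that a dollar amount never travels without saying what it is.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class AmountKind(Enum):
    # Hypothetical labels; a real integration would use its own vocabulary.
    REVENUE = "revenue"
    COST = "cost"
    PROFIT = "profit"


@dataclass(frozen=True)
class Amount:
    kind: AmountKind
    value: Decimal

    def __post_init__(self):
        # Assumed convention: revenue and cost are non-negative magnitudes,
        # so only profit is allowed to go below zero.
        if self.kind is not AmountKind.PROFIT and self.value < 0:
            raise ValueError(f"{self.kind.value} must be non-negative, got {self.value}")


def profit(revenue: Amount, cost: Amount) -> Amount:
    # Refuse to compute if the inputs are not what they claim to be.
    if revenue.kind is not AmountKind.REVENUE or cost.kind is not AmountKind.COST:
        raise TypeError("profit() expects a revenue amount, then a cost amount")
    return Amount(AmountKind.PROFIT, revenue.value - cost.value)


result = profit(
    Amount(AmountKind.REVENUE, Decimal("100.00")),
    Amount(AmountKind.COST, Decimal("30.00")),
)
```

With this in place, passing a cost where revenue is expected fails loudly instead of quietly producing a wrong number.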
Labeling
Imagine data is labeled as a number. You have all seen that error in a spreadsheet. Now imagine it happening automatically, at scale, for billions of entries in a database. (I can see you cringing from here.)
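One way to picture the labeling problem is a declared schema that every incoming field is checked against. The sketch below is a hypothetical example: the SCHEMA dictionary, the field names, and parse_field are all invented for illustration. It assumes the raw values arrive as strings.

```python
import re
from decimal import Decimal, InvalidOperation

# Hypothetical schema: each field states its type up front.
SCHEMA = {
    "order_id": "text",       # looks numeric, but leading zeros matter
    "quantity": "integer",
    "unit_price": "decimal",
}


def parse_field(name: str, raw: str):
    """Convert a raw string according to the declared type, or raise ValueError."""
    kind = SCHEMA[name]
    if kind == "text":
        return raw
    if kind == "integer":
        # Accept only an optional minus sign followed by ASCII digits.
        if not re.fullmatch(r"-?[0-9]+", raw):
            raise ValueError(f"{name}: {raw!r} is not an integer")
        return int(raw)
    if kind == "decimal":
        try:
            value = Decimal(raw)
        except InvalidOperation:
            raise ValueError(f"{name}: {raw!r} is not a decimal") from None
        # Decimal also parses "NaN" and "Infinity"; reject those here.
        if not value.is_finite():
            raise ValueError(f"{name}: {raw!r} is not a finite decimal")
        return value
    raise ValueError(f"{name}: unknown declared type {kind!r}")


order_id = parse_field("order_id", "00123")
quantity = parse_field("quantity", "12")
unit_price = parse_field("unit_price", "19.99")
```

Here the order ID stays text, so "00123" keeps its leading zeros, while a quantity of "12.5" is rejected rather than silently rounded.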
Cadence
How often does the data come in? Over what period is it calculated? Imagine one team produces weekly summaries and another produces daily ones, and you merge them without aligning the two.
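A small Python sketch of the cadence problem: before merging, bring both feeds to the same period. The numbers below are made up, the names (week_key, roll_up_to_weekly) are hypothetical, and using ISO weeks as the shared period is an assumption. A real pipeline would need to agree on its own week definition.

```python
from collections import defaultdict
from datetime import date


def week_key(day: date) -> tuple:
    # (ISO year, ISO week number); ISO weeks start on Monday.
    iso = day.isocalendar()
    return (iso[0], iso[1])


def roll_up_to_weekly(daily: dict) -> dict:
    """Sum daily values into ISO-week buckets."""
    weekly = defaultdict(int)
    for day, value in daily.items():
        weekly[week_key(day)] += value
    return dict(weekly)


# Made-up sample data: one team reports daily, the other weekly.
daily_orders = {
    date(2024, 1, 1): 10,
    date(2024, 1, 2): 12,
    date(2024, 1, 8): 7,
}
weekly_returns = {(2024, 1): 3, (2024, 2): 1}

weekly_orders = roll_up_to_weekly(daily_orders)

# Merge only after both sides share the same cadence.
merged = {
    week: (weekly_orders.get(week, 0), weekly_returns.get(week, 0))
    for week in sorted(set(weekly_orders) | set(weekly_returns))
}
```

Rolling the finer-grained feed up to the coarser one is the safe direction here. Going the other way would mean guessing how a weekly total splits across days.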
While these are simple concepts, we can easily forget them in the anxious moments of an interview. If you are preparing to interview with a data-focused team that might be quizzing you on issues with APIs, this is a nice list to review.