KonMari your data ETL strategy

Published July 19, 2021 | Team Crayon Data

Use the KonMari method for a data ETL strategy that sparks joy for your organization

To the uninitiated, Marie Kondo is the world’s most famous tidying up consultant (thanks to Netflix). Her trademarked KonMari method is all about how one can declutter their living spaces to live a better life. The philosophy is simple: do not start with a room, but rather, start with a category, like clothes. Follow a specific order. Inspect each item. If it does not spark joy, discard it.

The Extract-Transform-Load (ETL) strategy for an organization is strikingly similar. Most firms do not have a tidying up strategy and outsource this critical job. In this edition of #BankingOnVidhya, I look at how we can adapt the KonMari method to store data that sparks joy and adds value to our organizations.

Muri – Over-burden

The cost of data storage has crashed because enterprises now use distributed data platforms instead of single-server databases. That does not mean we should store every piece of data for posterity.

Organizations require clarity and acumen on

  • What to store forever
  • What needs temporary storage, and
  • What to simply observe as events (data points or streams) and react to.

This will lead to smarter data ETL investments, easier accessibility and faster downstream processes.
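The three storage categories above can be written down as an explicit policy per dataset. The following is a minimal sketch, not a prescribed design: the dataset names, the 90-day window and the helper names (`DatasetPolicy`, `should_store`, `is_expired`) are all hypothetical, and it assumes event-only data is never persisted.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Retention(Enum):
    FOREVER = "forever"        # store permanently
    TEMPORARY = "temporary"    # store for a fixed window, then purge
    EVENT_ONLY = "event_only"  # react to it as an event (assumed: not persisted)


@dataclass(frozen=True)
class DatasetPolicy:
    name: str
    retention: Retention
    keep_days: Optional[int] = None  # only used for TEMPORARY data


def should_store(policy: DatasetPolicy) -> bool:
    """Event-only data is reacted to but not written to storage."""
    return policy.retention is not Retention.EVENT_ONLY


def is_expired(policy: DatasetPolicy, loaded_on: date, today: date) -> bool:
    """Only temporary data can expire; it expires once older than keep_days."""
    if policy.retention is not Retention.TEMPORARY:
        return False
    if policy.keep_days is None:
        raise ValueError(f"{policy.name}: temporary data needs keep_days")
    return today - loaded_on > timedelta(days=policy.keep_days)


# Hypothetical examples, one per category.
POLICIES = [
    DatasetPolicy("customer_master", Retention.FOREVER),
    DatasetPolicy("web_session_logs", Retention.TEMPORARY, keep_days=90),
    DatasetPolicy("card_swipe_stream", Retention.EVENT_ONLY),
]
```

Writing the decision down per dataset makes it reviewable: anything without a policy has not yet been asked whether it sparks joy.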

Muda – Unnecessary efficiency

Data velocity & frequency of storage need not be faster than the speed of the underlying business process. We need real time data & analytics for reacting to currency fluctuations or credit card transactions.

There is a difference between the microseconds required for FX and taking 5 minutes to react to a card transaction. Spending 2x more to bring the reaction time down from 5 minutes to 30 seconds just means you have an overzealous tech team.
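One way to keep the tech team honest is to pick the cheapest pipeline that is still fast enough for the business process. The sketch below is illustrative only: the tier names, latencies and costs are made-up numbers, except that the 30-second tier is priced at 2x the 5-minute tier to mirror the example above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PipelineTier:
    name: str
    latency_seconds: float  # how quickly data becomes usable
    relative_cost: float    # cost relative to the 5-minute tier


def cheapest_sufficient_tier(
    tiers: Sequence[PipelineTier], required_latency_seconds: float
) -> Optional[PipelineTier]:
    """Return the lowest-cost tier that meets the latency the business needs."""
    fast_enough = [t for t in tiers if t.latency_seconds <= required_latency_seconds]
    return min(fast_enough, key=lambda t: t.relative_cost, default=None)


# Hypothetical tiers; only the 2x ratio between the middle two comes from the text.
TIERS = [
    PipelineTier("streaming", 0.0005, 10.0),
    PipelineTier("micro_batch_30s", 30, 2.0),
    PipelineTier("micro_batch_5min", 300, 1.0),
    PipelineTier("nightly_batch", 86_400, 0.2),
]

# A card-transaction reaction that can wait 5 minutes does not need the 30s tier.
card_tier = cheapest_sufficient_tier(TIERS, required_latency_seconds=300)
```

The point of the function is the direction of the question: start from what the business process needs, then buy only that much speed.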

Kata – Follow order

Banks & financial services typically need data for

  • Smarter, data-led outreach to clients
  • Data-driven decision making & reporting to regulators

It helps to stay in this order while prioritizing data ETL design. Faster processes take priority on extraction and slower processes re-use the data. Data pulled once then rolls up for different purposes.
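The "pull once, roll up many times" idea can be sketched as one extraction feeding several views. This is a toy illustration under stated assumptions: `extract_transactions` stands in for the single pull from a core system, and the sample rows and view names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Transaction = Tuple[str, str, float]  # (customer_id, category, amount)


def extract_transactions() -> List[Transaction]:
    """Stand-in for the one extraction from the core system (made-up rows)."""
    return [
        ("c1", "travel", 120.0),
        ("c1", "dining", 30.0),
        ("c2", "travel", 200.0),
    ]


def spend_by_customer(rows: List[Transaction]) -> Dict[str, float]:
    """Roll-up for client outreach: total spend per customer."""
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, _category, amount in rows:
        totals[customer_id] += amount
    return dict(totals)


def spend_by_category(rows: List[Transaction]) -> Dict[str, float]:
    """Roll-up for decision making: total spend per category."""
    totals: Dict[str, float] = defaultdict(float)
    for _customer_id, category, amount in rows:
        totals[category] += amount
    return dict(totals)


def total_volume(rows: List[Transaction]) -> float:
    """Roll-up for reporting: overall transaction volume."""
    return sum(amount for _customer_id, _category, amount in rows)


rows = extract_transactions()           # pulled once
outreach_view = spend_by_customer(rows)  # re-used three times
decision_view = spend_by_category(rows)
regulatory_total = total_volume(rows)
```

Because every view derives from the same rows, the totals agree by construction, which leaves less for siloed teams to reconcile.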

Reconciling data prepared by siloed teams is a favorite executive sport! You will find the CMO & CFO debating their data assumptions into eternity rather than discussing the ‘so what’.

Tokimeku – Sparks joy

The data ETL job is sandwiched between core systems on one side and downstream applications and users on the other. Mechanical transformation of data across this supply chain is a common sight.

Every piece of data needs a central decision-making hub that gives it a compact, unique and accessible place in the data ecosystem. Respecting other data loads and not creating data cousins (near-duplicate copies of the same data) is critical.
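As a rough illustration of such a hub, the sketch below keeps one home per logical dataset and refuses a second, differently located copy. `DataCatalog`, its methods and the name-normalization rule are hypothetical choices for this example, not a reference to any particular catalog product.

```python
from typing import Dict


class DataCatalog:
    """Toy central registry: each logical dataset gets exactly one location."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    @staticmethod
    def _normalize(name: str) -> str:
        # Assumed rule: names differing only in case, spaces or hyphens
        # refer to the same logical dataset.
        return name.strip().lower().replace("-", "_").replace(" ", "_")

    def register(self, logical_name: str, location: str) -> str:
        """Record where a dataset lives; reject a 'data cousin' elsewhere."""
        key = self._normalize(logical_name)
        existing = self._locations.get(key)
        if existing is not None and existing != location:
            raise ValueError(
                f"'{key}' already lives at {existing}; re-use it instead of "
                f"creating a copy at {location}"
            )
        self._locations[key] = location
        return key

    def locate(self, logical_name: str) -> str:
        """Look up the single home of a dataset (KeyError if unregistered)."""
        return self._locations[self._normalize(logical_name)]
```

For example, after `register("Customer Master", "warehouse.customer_master")`, an attempt to register `"customer-master"` at another location raises `ValueError`, nudging the second team to re-use the existing load.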

And, remember, if the data does not spark joy, have the maturity to let it go!

More from the #BankingOnVidhya series: Data ingestion