The Modern Data Stack (MDS) is an umbrella term for modern data management practices that make integrating, transforming and using data easier, faster and cheaper (if done well). As [Fivetran puts it](https://www.fivetran.com/blog/what-is-the-modern-data-stack), it is essentially a bunch of tools:

- A fully managed ELT pipeline
- Cloud based columnar data warehouse
- Data transformation tool
- BI tool for data visualisation

Here is a high level presentation I put together for some final year undergraduates in December 2022.

<div style="position: relative; width: 100%; height: 0; padding-top: 56.2500%; padding-bottom: 0; box-shadow: 0 2px 8px 0 rgba(63,69,81,0.16); margin-top: 1.6em; margin-bottom: 0.9em; overflow: hidden; border-radius: 8px; will-change: transform;"> <iframe loading="lazy" style="position: absolute; width: 100%; height: 100%; top: 0; left: 0; border: none; padding: 0;margin: 0;" src="https:&#x2F;&#x2F;www.canva.com&#x2F;design&#x2F;DAFU30395LU&#x2F;view?embed" allowfullscreen="allowfullscreen" allow="fullscreen"> </iframe> </div>

<a href="https:&#x2F;&#x2F;www.canva.com&#x2F;design&#x2F;DAFU30395LU&#x2F;view?utm_content=DAFU30395LU&amp;utm_campaign=designshare&amp;utm_medium=embeds&amp;utm_source=link" target="_blank" rel="noopener">Amrita Guest Talk - The MDS</a> by Ramshankar Yadhunath

## What was before the "Modern Data Stack"?

My data career, in a way, was "born" into the #modern-data-stack phase of data management. Therefore, I don't have much experience dealing with "legacy" or "traditional" data stacks.
Based on some observation and much deliberate discussion with my colleagues, what I have come to realise is that a "traditional" data stack is characterised by

- On-premise data storage solutions
- Code-heavy and complex ETL pipelines
- A more waterfall approach to implementing data solutions that will bring data closer to end users in a format they would like

All of this often leads to a slow pace of delivery and implementation of use cases.

A useful analogy for a traditional data stack (TDS) is offered by [Neptune.ai's MLOps Blog](https://neptune.ai/blog/modern-data-stack) -

> Imagine a TDS like a Christmas light decoration in a box. The lights are needed at the right time for the Christmas party but you discover that some of the bulbs need to be changed. You have to untangle the whole thing to find the broken bulb. After a while, you finally replace the bulb but by the time you are done, the party is over.

## Is it worth it, moving away from TDS to MDS?

This is definitely the million dollar question. I have a few observations based on my experience working at a fast paced data and AI consultancy in London.

### The Good - Choose an MDS for these reasons

- The #modern-data-stack brings a plug-and-play approach to building data systems -> Simplifies the process
- Easy to use BI tools and their self-service nature help get data to consumers fast -> Faster time to insights
- Lowers the barrier to building an end-to-end data system -> Anybody can get started with it[^1]

### The Dangers - Be wary of the MDS for these reasons

- Costs -> If not carefully considered and scrutinised, cloud spend can come as a surprise
- Sales-y pitches and faff -> With a lot of vendors competing for the same space, at times it becomes hard to understand the true value an MDS tool might bring to your firm[^2]

## Closing Thoughts

I really like the MDS approach. Well, I ought to - It is what helps me put food on the table 😁.
But, I would definitely exercise caution in going down the MDS path as a default because "everybody else is doing so". Like any great methodology, the #modern-data-stack is quite useful, but if implemented without the right considerations and prior research, it could easily be a step in the wrong direction.

## Update 17th Feb, 2024 - Some more thoughts on the MDS

When I started my professional career in data just over two years ago, the MDS (Modern Data Stack) was all the rage. Most "data-driven" companies wanted a piece of it, investors were lining up, consultants sharpened their sales pitches, and technology providers were in a frenzy to attract "new" customers to build their "modern" data stack on their platform.

As a first-time "data person," this was all really fancy to me. And it looked so cool!

But, it was also funny. Funny because I had never worked in the traditional data stack before. I was born into the MDS phase of data. An MDS baby, if you will. Here I was, with absolutely no applied experience with on-prem systems, finding myself amidst conversations about moving away from on-prem into the cloud.

> That's a bit like me at a nightclub. I know I am there, but I am not really sure how to strike up a convo without seeming stupid.

But the biggest challenge of starting your career in an industry during a boom of multiple SaaS vendors (the MDS) is the sheer amount of knowledge you need to get up to speed with in the quickest possible time.

It's hard because there is **just too much confusion**. Everybody is trying to sell you something. The SaaS products might be brilliant, but to a new professional, the overly enthusiastic sales pitches are boring. In fact, they just drown out the real bits that matter to a new engineer - *Concepts & Foundations*.

It also does not help that the MDS wave brought with it this magnanimous promise of *democratising data*.
The simpler it became to use a product to achieve X, the harder it became to understand the underlying challenge the product helped fix.

> It's easy to press a button and see data move. But, it takes experience to know which button needs to be pressed and what level of difficulty that button is helping alleviate.

#### Footnotes

[^1]: Of course, there would need to be a baseline understanding of data management and a clear vision of how it would help the business outcome.

[^2]: Especially when you meet a bunch of vendors at a conference or meetup. Those places are an introvert's nightmare 👀