Building a fit-for-purpose modern data stack, part 1.

Author: Moss Pauly

Published: February 13, 2023

In this multi-part blog post, Sydney-based Zipster Moss Pauly, Senior Manager, Data Products, introduces our ANZ data platform and the team’s journey over the last 18 months.

The broader Zip Data & Analytics and Engineering teams have led the transition from a fairly standard lakehouse implementation (S3, custom scripts and AWS Athena) to a modern data stack (dbt, Fivetran, Airbyte, Snowflake, Census, Snowplow, Airflow). While the platform machinery has changed extensively, our consumption layer has remained relatively constant throughout, with data scientists primarily using Databricks and our analytics team producing BI reporting in Tableau.

In this first part of the article - aimed at supporting others through the complex process of implementing a modern tech stack - Moss shares his insights on how to establish a framework to guide decision making on the various components that now make up Zip’s stack.

In part two, we take a deep dive into the six key decisions the team tackled, and the valuable experiences and key learnings Moss and the team gathered along the way. We hope you enjoy your read!

With the new year well underway, I’ve been reflecting on this journey from old to new. We were lucky enough to have a great team to navigate the many decisions needed, as well as access to a fantastic data community here in Sydney, Australia.

Even with a supportive community and an internet worth of reading material, I would have loved an in-depth article that went into the decision-making process of building a modern data stack and selecting components.

With that in mind, here’s the article I wish I’d had access to at the start of our journey.

Start with people’s experiences

Chances are that no matter what problem you’re looking at, you’re not the first to face it. One of the most valuable things you can do is reach out to your peers and learn from those who have tackled the problem before you.

What do you like about your stack and what would you do differently next time?

I can’t remember how many times I asked this question throughout this journey. It must have been pushing upwards of 30 across a lot of different companies and people. When you’re early on in the process, you cannot ask this question enough. Generally speaking, people will be far more diplomatic in writing than in conversation, and we found that in order to really get a deep understanding of their experiences with different tools, nothing beat having a chat over a beer or two. We found pretty quickly in these chats that there were common callouts that really helped guide our decision-making.

Another great resource for understanding people’s experience with different components of the stack is talking to vendors who integrate against them. Kick off conversations at the start with a number of the providers you’re considering.

Everyone in the modern data stack space I’ve talked to is really friendly and passionate about data and the challenges around data. If you’re talking to someone about egress, ask them what data warehouse most of their customers are using and what trends they’re seeing. If you’re talking to someone about a data warehouse, ask them what they’re seeing for transformations or egress. These vendors have a bird’s-eye view of the landscape, and that’s really valuable to tap into.

Decision framework and process

We didn’t want to make decisions on components in this stack lightly. Rigour is really important here. At this point, I’ll cover how we evaluated decisions and some additional considerations.

The biggest benefit of the modern data stack is tight integration, with each component solving its own domain excellently. Decisions become easier as you lock in more components, because you know exactly what each new one has to integrate with. At the start you don’t have this, so you need a clear vision of the problem spaces you’re solving and the players you might consider.

As a starting point, we considered the following:

  • Event Collection
  • Data Ingress
  • Data Warehousing
  • Data Transformation
  • Data Egress

Cost scalability is a key consideration for us. We’ve been burnt before with event volumes so we went into cost scalability with eyes wide open. We evaluated this in the following way:

  • SaaS that can migrate to open source is a massive plus. This means that we can reduce time to value initially and always have an option to control costs if we need to.
  • Any paid component is evaluated at 1x, 2x and 4x expected volumes. This gives us an idea of the economy of scale.
  • All decisions are made after we’ve done a POC, got our hands dirty and actually played with it. Some things are great on paper but the workflows can be sub-optimal.
  • Anything we want to deploy and manage ourselves has to run on a container platform and slot in with our operational tooling.
  • Look at the supporting community for each tool. The larger and more accessible it is, the better.
  • Listen to when people talk about how delighted they are with something and follow your gut.
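As a rough sketch of the 1x, 2x and 4x check above, the Python snippet below prices a component at each multiple of expected volume and reports the cost per million events. The price bands, the baseline of 150 million events per month, and the names Tier, PRICING, monthly_cost and scaling_report are all hypothetical, made up for illustration; swap in the vendor’s actual price list and your own volume estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One band of a hypothetical tiered price list."""
    up_to_events: float        # upper bound of the band, in events per month
    price_per_million: float   # price for each million events inside this band


# Hypothetical price list, invented for illustration only.
PRICING = [
    Tier(up_to_events=100_000_000, price_per_million=10.0),
    Tier(up_to_events=500_000_000, price_per_million=6.0),
    Tier(up_to_events=float("inf"), price_per_million=4.0),
]


def monthly_cost(events: int, tiers: list = PRICING) -> float:
    """Monthly cost of `events`, charging each band at its own rate."""
    cost = 0.0
    lower = 0.0  # lower bound of the band currently being priced
    for tier in tiers:
        if events <= lower:
            break
        in_band = min(events, tier.up_to_events) - lower
        cost += in_band / 1_000_000 * tier.price_per_million
        lower = tier.up_to_events
    return cost


def scaling_report(expected_events: int, multiples=(1, 2, 4)) -> list:
    """Cost and cost per million events at each multiple of expected volume."""
    rows = []
    for multiple in multiples:
        events = expected_events * multiple
        cost = monthly_cost(events)
        rows.append({
            "multiple": multiple,
            "events": events,
            "cost": cost,
            "cost_per_million": cost / (events / 1_000_000),
        })
    return rows


if __name__ == "__main__":
    # Assumed baseline: 150 million events per month.
    for row in scaling_report(150_000_000):
        print(f"{row['multiple']}x: {row['cost']:.2f} total, "
              f"{row['cost_per_million']:.2f} per million events")
```

With these made-up bands the cost per million events falls as volume grows, which is the economy-of-scale signal we were looking for. A flat or rising figure at 4x would be a prompt to look harder at an open source alternative.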

Lastly, document your decisions thoroughly. You’ve probably spent weeks researching, comparing and testing options here, so this is in your best interests.

  • Capture the options you considered with the pros and cons for each, and a clear articulation of your recommendation. It helps to clarify, compare, and take decision-makers on the journey.
  • You (or someone else) may need to return to a decision in the future, and it helps to restore a detailed understanding of the context available at the time.
  • It’s likely going to result in you asking for an investment from your business, so having an in-depth articulation is much more likely to get you money than a strong verbal suggestion.
  • If you have a process in your organisation for socialising and endorsing strategic decisions, use it. If you don’t, set up something lightweight that involves key stakeholders (and budget approvers).

In the next part of the article, I share the six key decisions we landed on, detailing our requirements, the path we chose, and some tips based on our experiences as a team that you may find useful.

Read part 2 of ‘Building a fit-for-purpose modern data stack’ now >
