The Curious Path from Marketing Chaos to Data Lakehouse Nirvana
I’ve been thinking a lot lately about how companies actually evolve their data infrastructure—not the sanitized case studies we see at conferences, but the messy, real-world journey from “we have too many dashboards” to “we have a composable data stack.” And here’s what’s interesting: the organizations that end up with elegant, modern architectures rarely started by trying to build elegant, modern architectures.
They started somewhere much simpler. Often with something like Datorama.
Why Start with Datorama? (Or: The Art of Beginning)
There’s this fascinating paradox in data maturity. The entire enterprise wants a single source of truth. But trying to build that on day one? That’s like trying to learn jazz piano when you haven’t yet figured out where middle C is.
Datorama does something subtle and powerful: it forces you to think about your data before you get fancy with your infrastructure. You start consolidating marketing sources, sure. But more importantly, you start asking questions like:
- What should we actually call this campaign across all these platforms?
- How do we want to organize our product taxonomy?
- Which metrics actually matter when we look across channels?
These aren’t technical questions. They’re business questions. And weirdly, you need to answer them whether you’re using a marketing-focused platform or building a custom data warehouse. Datorama just makes you answer them earlier, when the stakes are lower.
The Taxonomy Revelation (That Nobody Warns You About)
Here’s something I wish someone had told me years ago: your data infrastructure will never be better than your naming conventions. Never.
You can have the most beautiful lakehouse architecture in the world, but if your campaign names are chaos, your UTM parameters are ad hoc, and every team has their own way of tagging products… well, you’ve just built an expensive museum for messy data.
When you’re working with Datorama, this becomes obvious fast. You connect Google Ads and Facebook and LinkedIn, and suddenly you’re staring at three completely different naming schemes for what’s supposed to be the same campaign. The platform doesn’t solve this for you—it just makes it impossible to ignore.
And that’s actually a gift. Because once you establish those conventions—once you build that muscle of consistent taxonomy—it travels with you up the entire maturity curve.
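To make "consistent taxonomy" a little more concrete, here is a minimal Python sketch of checking campaign names against a convention. The convention itself (region_channel_objective_yyyymm, underscore-delimited), the allowed channel values, and the sample names are all hypothetical, invented for illustration. None of this is a Datorama feature, and your own convention will look different.

```python
import re
from typing import Optional

# Hypothetical convention: <region>_<channel>_<objective>_<yyyymm>
# e.g. "us_search_leadgen_202510". This is an assumption for illustration only.
CAMPAIGN_PATTERN = re.compile(
    r"^(?P<region>[a-z]{2})_"
    r"(?P<channel>search|social|display)_"
    r"(?P<objective>[a-z]+)_"
    r"(?P<period>\d{6})$"
)


def parse_campaign(name: str) -> Optional[dict]:
    """Return the convention's fields, or None if the name does not conform."""
    match = CAMPAIGN_PATTERN.match(name.strip().lower())
    return match.groupdict() if match else None


if __name__ == "__main__":
    samples = [
        "us_search_leadgen_202510",   # conforms
        "US_Social_LeadGen_202510",   # conforms once lowercased
        "Fall Promo - LinkedIn v2",   # does not conform
    ]
    for name in samples:
        print(name, "->", parse_campaign(name))
```

A check like this is cheap to run against every new campaign name, which is roughly what "enforcing" a convention amounts to in practice: the names that come back as None are the ones to fix before they reach a report.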
The Journey Up (And What Changes Along the Way)
So let’s say you’ve spent some time with Datorama. You’ve cleaned up your taxonomy. You’re getting decent cross-channel reporting. Your stakeholders are happy because they can finally see Google and Meta performance side-by-side without someone spending three hours in Excel.
What happens next?
This is where it gets interesting. Your needs start to evolve:
- Finance wants to blend marketing data with revenue data from Salesforce
- Product teams want to understand the full customer journey, not just the marketing touchpoints
- Data science wants to start building predictive models
- Every executive wants to see data in their preferred format (Tableau, Power BI, Looker—pick your poison)
Suddenly, a marketing-focused platform feels constraining. Not because it’s bad at what it does, but because what you need to do has expanded beyond its original scope.
Enter the Lakehouse (And Why Your Earlier Work Matters)
The natural evolution is toward something more flexible: a data lakehouse architecture. Snowflake, Databricks, BigQuery—take your pick. These platforms let you store everything, join anything, and process at scale.
But here’s what I find fascinating: the organizations that succeed with lakehouses aren’t the ones with the best data engineers. They’re the ones with the best data discipline. And where does that discipline come from? Often from those earlier days of cleaning up campaign names and standardizing UTM parameters.
When you migrate from Datorama to a lakehouse:
- Those naming conventions you established? They become your dimensional models
- That cross-channel harmonization? It becomes your data transformation logic
- Those business rules for campaign categorization? They become your dbt models
You’re not starting from scratch. You’re scaling up patterns you’ve already validated.
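As a rough illustration of how cross-channel harmonization turns into transformation logic, the Python sketch below maps platform-specific records onto one shared schema. The per-platform field names in FIELD_MAP, the ChannelRow shape, and the sample values are assumptions made up for this example; real platform exports use their own column names, and in a lakehouse this logic would more likely live in SQL or dbt than in Python.

```python
from dataclasses import dataclass

# Hypothetical per-platform column names; real exports will differ.
FIELD_MAP = {
    "google_ads": {"campaign": "campaign_name", "spend": "cost", "clicks": "clicks"},
    "meta": {"campaign": "campaign", "spend": "amount_spent", "clicks": "link_clicks"},
    "linkedin": {"campaign": "campaign_group", "spend": "total_spent", "clicks": "clicks"},
}


@dataclass(frozen=True)
class ChannelRow:
    """One record in the shared, platform-independent schema."""
    platform: str
    campaign: str
    spend: float
    clicks: int


def harmonize(platform: str, raw: dict) -> ChannelRow:
    """Map one platform-specific record onto the shared schema."""
    fields = FIELD_MAP[platform]
    return ChannelRow(
        platform=platform,
        campaign=str(raw[fields["campaign"]]).strip().lower(),
        spend=float(raw[fields["spend"]]),
        clicks=int(raw[fields["clicks"]]),
    )


def total_spend_by_campaign(rows):
    """Sum spend per harmonized campaign name across platforms."""
    totals = {}
    for row in rows:
        totals[row.campaign] = totals.get(row.campaign, 0.0) + row.spend
    return totals


if __name__ == "__main__":
    rows = [
        harmonize("google_ads", {"campaign_name": "US_LeadGen_202510", "cost": "120.50", "clicks": "30"}),
        harmonize("meta", {"campaign": "us_leadgen_202510 ", "amount_spent": 79.5, "link_clicks": 12}),
    ]
    print(total_spend_by_campaign(rows))
```

The point of the sketch is the shape, not the specifics: once every platform lands in the same columns with the same campaign key, the cross-channel rollup is a trivial aggregation.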
The Composable CDP Question (Or: Why Hightouch Makes Sense Now)
Traditional CDPs promise everything: data collection, identity resolution, segmentation, activation. But they’re also expensive, rigid, and often redundant when you already have a lakehouse storing all your customer data.
This is where composable CDPs like Hightouch get interesting. The pitch is simple: if you already have clean, modeled customer data in your lakehouse, why send it to another platform just to send it back out to marketing tools? Why not activate directly from your warehouse?
But notice the critical dependency: “if you already have clean, modeled customer data.” This is where those taxonomy lessons from your Datorama days become crucial. Hightouch doesn’t clean your data. It doesn’t solve identity resolution for you (at least not entirely). It assumes you’ve done that work.
If you have? Then you get something remarkably powerful: the ability to define audiences using SQL, sync them to any marketing platform, and keep your customer data in one place. It’s CDP functionality without the CDP baggage.
If you haven’t? Then you’re just moving chaos from one system to another, faster.
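For a sense of what "defining audiences using SQL" can look like, here is a small, self-contained Python sketch. It uses an in-memory SQLite database as a stand-in for the warehouse and does not call Hightouch or any other reverse ETL tool. The table names, column names, audience rule (opted-in customers with two or more orders), and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical audience: opted-in customers with at least two orders.
AUDIENCE_SQL = """
SELECT c.customer_id, c.email
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id
WHERE c.opted_in = 1
GROUP BY c.customer_id, c.email
HAVING COUNT(*) >= 2
ORDER BY c.customer_id
"""


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for modeled warehouse tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT, opted_in INTEGER);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customers VALUES
            (1, 'a@example.com', 1),
            (2, 'b@example.com', 1),
            (3, 'c@example.com', 0);
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2), (13, 3), (14, 3);
    """)
    return conn


def audience(conn: sqlite3.Connection):
    """Run the audience query and return (customer_id, email) rows."""
    return conn.execute(AUDIENCE_SQL).fetchall()


if __name__ == "__main__":
    print(audience(build_demo_warehouse()))
```

Notice how much the query leans on upstream discipline: it only works because there is one customer_id per customer and one agreed meaning for opted_in. A reverse ETL tool would sync the resulting rows to a destination, but it would not fix either of those for you.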
The Visualization Democracy (Or: Why Your Users Don’t Care About Your Stack)
Here’s a truth that took me too long to learn: nobody outside your data team cares what your data infrastructure looks like. They care about whether they can answer their questions.
The marketing team loves Tableau. Finance lives in Power BI. Product built everything in Looker. Executives just want a clean mobile dashboard they can check between meetings.
With a lakehouse as your foundation, you can support all of them. Your data models live in one place, but different teams can query them through their preferred tools. No more “sorry, we only support this one viz platform” conversations.
But again—and I really can’t stress this enough—this only works if your underlying data makes sense. If your product categorizations are consistent. If your campaign taxonomies align. If your customer identifiers are resolved.
All things you hopefully figured out back in your Datorama days.
So What’s the Actual Path Forward?
If I were mapping this journey for a marketing team today, here’s roughly how I’d sketch the phases:
Phase 1: Foundation (Datorama Era)
- Get your marketing sources in one place
- Establish and enforce naming conventions
- Build the discipline of consistent taxonomy
- Learn what cross-channel reporting actually teaches you
Phase 2: Expansion (Early Lakehouse)
- Migrate marketing data to a warehouse
- Start blending with non-marketing sources
- Formalize your transformations in code (dbt or similar)
- Discover what questions you couldn’t answer before
Phase 3: Activation (Composable CDP)
- Implement reverse ETL (Hightouch, Census, etc.)
- Start activating warehouse data directly to marketing tools
- Build sophisticated audience definitions in SQL
- Shorten the loop from insight to action
Phase 4: Democracy (Multi-Tool Viz)
- Support multiple visualization platforms
- Enable self-serve analytics for different teams
- Maintain governance without blocking progress
- Let the data serve the users, not the other way around
The Questions That Remain
I’m genuinely curious about a few things as I watch companies navigate this path:
Can you skip steps? Probably not entirely. The discipline has to come from somewhere, and I haven’t seen many teams successfully implement a lakehouse and composable CDP while simultaneously figuring out basic data hygiene. But maybe your starting point isn’t Datorama—maybe it’s something else that forces similar discipline.
Is there a “done” state? I doubt it. Every time I talk to someone who’s “finished” their data transformation, they mention three new things they’re exploring. The composable data stack keeps adding pieces, and what felt modern two years ago feels legacy today.
Does everyone need to follow this path? Definitely not. Small teams might skip straight to a simple warehouse setup. Enterprise companies might have different constraints. But there’s something universal about the progression from fragmented → consolidated → harmonized → activated → democratized.
Linking It All Together
I’ve written about various pieces of this journey before:
Datorama early stage: https://adman-analytics.com/2025/09/27/building-your-analytics-foundation-how-datorama-accelerates-your-journey-up-the-maturity-curve/
Building that data lake: https://adman-analytics.com/2025/10/14/cloud-data-warehouses-for-marketing-is-the-best-way-to-leverage-ai/
Choosing a data lakehouse platform: https://adman-analytics.com/2025/10/15/storing-semi-structured-and-unstructured-data/
Data storytelling: https://adman-analytics.com/2025/09/26/the-art-of-data-storytelling-transform-numbers-into-compelling-narratives/
Each of those posts explores specific decisions along this path. But the meta-pattern is what I find most interesting: how the habits you build early determine what’s possible later. How discipline at the foundation enables flexibility at the top.
Where Are You on This Journey?
I’d be curious to hear from others navigating this path. Are you still in the Datorama consolidation phase? Have you migrated to a lakehouse but are struggling with activation? Have you built the whole stack but wonder whether it’s actually better than what you had before?
Because here’s the thing: there’s no shame in being early in the journey. The companies I admire most aren’t the ones with the fanciest data stacks—they’re the ones who’ve thoughtfully evolved from where they were to where they needed to be, learning at each stage what they actually required versus what sounded cool at a conference.
The curious path isn’t about reaching a destination. It’s about asking better questions at each stage of the journey.
