When three different phone systems, two call tracking tools, and over 20 brands all push leads into the same HubSpot portal… things get messy fast.
In this episode of How I Fixed Your Data, I sat down with Chris Osantowski from Pyxis Growth Partners, a HubSpot consulting partner based in Chicago, to talk about one of their toughest data cleanup challenges yet, and how they reduced a client’s contact database from 700,000 to just over 400,000, without losing any marketing attribution data.
🎥 Watch the full episode below:
☎️ The Problem: Three Phone Integrations, One Giant Mess
Chris’s client was a home services company that managed over 20 brands inside a single HubSpot portal.
Each brand had its own way of capturing leads, and none of them spoke the same data language.
Here’s what the Pyxis team inherited:
- Three different phone systems, all connected to HubSpot
- Two call-tracking tools, CallRail and WhatConverts
- HubSpot Forms and Gravity Forms from multiple websites
- Legacy data uploads from individual brand teams
The result: chaos.
Phone calls, form submissions, and uploads were constantly creating duplicate contacts — each one holding a slightly different piece of marketing information.
At its peak, the HubSpot portal ballooned to over 700,000 contacts.
- Reporting broke.
- Sales teams lost trust.
- And no one was sure which data was right anymore.
🧩 The Fix: Merge the Duplicates, Keep the Attribution
Pyxis Growth Partners turned to Koalify to untangle the mess. Their goal wasn’t just to delete duplicates — it was to preserve the marketing data that mattered most.
Here’s how they approached it:
1️⃣ Phone Numbers as the Source of Truth
Since the client was a call-heavy business, many leads came without emails.
So instead of deduplicating by email, Chris’s team used phone numbers as the primary match field — treating them the same way HubSpot treats email addresses.
Koalify’s fuzzy matching handled phone numbers that were formatted differently (like “+1 312-555-0000” vs. “(312) 555-0000”), catching duplicates that HubSpot’s default logic missed.
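To make the idea concrete, here is a minimal sketch of how differently formatted phone numbers can be reduced to one comparable key. It is not Koalify's matching logic, which the episode does not describe; it is plain normalization followed by exact matching. The function names, the `phone` field, and the assumption that a 10-digit number is a US/Canada number missing its country code are all illustrative.

```python
import re

def normalize_phone(raw, default_country_code="1"):
    """Strip formatting so '+1 312-555-0000' and '(312) 555-0000' compare equal."""
    digits = re.sub(r"\D", "", raw or "")
    # Assumption: a 10-digit number is a national number without its country code.
    if len(digits) == 10:
        digits = default_country_code + digits
    return digits

def group_duplicates_by_phone(contacts):
    """Return {normalized phone: [contacts]} for phones shared by 2+ contacts."""
    groups = {}
    for contact in contacts:
        key = normalize_phone(contact.get("phone"))
        if key:  # skip contacts with no usable phone number
            groups.setdefault(key, []).append(contact)
    return {key: group for key, group in groups.items() if len(group) > 1}

# Hypothetical contacts, as they might arrive from different sources
contacts = [
    {"id": 1, "phone": "+1 312-555-0000"},
    {"id": 2, "phone": "(312) 555-0000"},
    {"id": 3, "phone": "312-555-0199"},
]
duplicates = group_duplicates_by_phone(contacts)
```

Here contacts 1 and 2 land in the same group because both reduce to the same digit string, while contact 3 stays on its own.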
2️⃣ Merge Rules to Preserve CallRail Data
The team then used Koalify merge rules to make sure the right data survived every merge.
They prioritized:
- CallRail’s UTM and source data (to keep accurate attribution)
- The most recently updated record (for freshness)
“If there were three versions of the same contact, we made sure the one with CallRail data won,” Chris said. “That’s the record sales and marketing could actually trust.”
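The two priorities can be sketched as a simple ranking: prefer a record that carries CallRail data, and break ties by the most recent update. This is an illustration of the merge logic Chris describes, not Koalify's actual rule engine or configuration. The field names (`source`, `updated_at`, `utm_source`) and the gap-filling step are assumptions.

```python
from datetime import datetime

def pick_primary(duplicates):
    """Pick the surviving record: CallRail data first, then most recently updated."""
    return max(
        duplicates,
        key=lambda c: (c.get("source") == "CallRail", c["updated_at"]),
    )

def merge_duplicates(duplicates):
    """Keep the primary's values; fill its empty fields from the other records."""
    primary = pick_primary(duplicates)
    merged = dict(primary)
    # Assumption: when filling gaps, newer records take precedence over older ones.
    others = sorted(
        (c for c in duplicates if c is not primary),
        key=lambda c: c["updated_at"],
        reverse=True,
    )
    for other in others:
        for field, value in other.items():
            if merged.get(field) in (None, "") and value not in (None, ""):
                merged[field] = value
    return merged

# Three hypothetical versions of the same contact
records = [
    {"id": 1, "source": "Gravity Forms", "updated_at": datetime(2024, 3, 1),
     "email": "a@example.com", "utm_source": ""},
    {"id": 2, "source": "CallRail", "updated_at": datetime(2024, 1, 1),
     "email": None, "utm_source": "google"},
    {"id": 3, "source": "CallRail", "updated_at": datetime(2024, 2, 1),
     "email": None, "utm_source": "bing"},
]
merged = merge_duplicates(records)
```

In this example record 3 wins: it has CallRail data and is newer than record 2. Its UTM value survives, and the missing email is filled in from record 1.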
3️⃣ Automate, Then Review
Once the rules were tested, merges ran automatically in the background — reducing manual cleanup time and eliminating the risk of human error.
The team also reviewed edge cases manually to ensure no valuable records were lost.
📉 The Results: From 700,000 to 400,000 Clean Contacts
The outcome was dramatic.
Within days, the contact database shrank from 700,000 to just over 400,000 contacts — a controlled, intentional cleanup that kept the marketing data intact.
“It was amazing to wake up and see the overall database shrink, not from deleting data, but from finally cleaning it up,” Chris recalls.
The benefits extended far beyond the numbers:
- Reliable attribution: Marketing could finally trust its UTM data again.
- Better reporting: No more duplicate counts or conflicting sources.
- Fewer marketing contacts: Saving on HubSpot license costs.
- Confident sales teams: No more double-calling the same lead.
With a clean CRM, automation rules became simpler, and nurture workflows finally worked as intended.
💡 Lessons Learned: Don’t Wait Until 700K
Chris’s main takeaway? Start deduplication early.
“We waited too long; 700,000 contacts made the rollout more careful and time-consuming than it needed to be,” he says. “If we had brought in Koalify sooner, we could have prevented the mess before it started.”
He also emphasizes the importance of merge rules:
“Most people overthink their primary rules. What really matters is your merge logic; that’s where you decide what data survives.”
⏱️ Setup Time: Just a Few Hours
Despite the scale, the cleanup wasn’t complex to configure.
“It took maybe two hours to set up the primary and merge rules, and another couple of hours to refine them,” Chris said. “The hard part was understanding the data; Koalify was the easy part.”
Once configured, the automation took care of the rest.
🧠 The Takeaway
When you’re managing multiple brands and call integrations in HubSpot, duplicates are inevitable — but data loss doesn’t have to be.
Koalify helped Pyxis Growth Partners:
- Consolidate three phone integrations
- Clean up roughly 300,000 duplicate records
- Retain CallRail attribution data
- Restore trust in the CRM
As Chris puts it:
“I’m not intimidated by messy data anymore. With Koalify, I can promise clean, reliable, attributed data, even when 20 brands share the same HubSpot portal.”