Case Study: Integrating Salesforce with Product Usage Data: Segment, Mixpanel, and Hightouch in Action

October 11, 2025
10 min
Bohdan Hlushko
Head of Software Engineering
Vadym Shvydkyi
INSART’s tech quarterback. Oversees full-stack architecture from backend to data platforms. His team crafts fintech solutions at startup pace while keeping enterprise-grade quality and reliability front-and-center.


For enterprise fintechs, data often lives in silos: sales teams live inside Salesforce, product teams inside Segment or Mixpanel, and data scientists inside Databricks.

Each system offers a different lens on the customer — CRM data reflects intent, analytics reflects behavior, and ML systems reflect prediction. But unless these worlds talk to each other, the business never sees a complete picture.

At INSART, we specialize in stitching these fragmented systems together into a unified customer intelligence fabric.

This case study showcases how we integrated Salesforce, Segment, and Mixpanel using Hightouch for reverse ETL, and how we layered a churn prediction model in Databricks to power data-driven relationship management for an enterprise fintech platform.


The Problem: Sales Without Context

The client — a B2B fintech platform for wealth management — faced a recurring issue:

  • Salesforce contained rich account data: pipeline stage, ARR, region, account owners.

  • Segment tracked in-app events: logins, transactions, API calls, feature usage.

  • Mixpanel held advanced analytics: user retention, feature adoption curves, funnel drop-offs.

Yet, these datasets were disconnected.

Sales reps saw “dormant accounts” but had no insight into actual product activity. Data scientists built churn models that sales never acted on. Marketing had no visibility into who was “at risk” versus “expanding.”

The result: decisions based on gut feel instead of behavioral evidence.

The goal was to synchronize sales and usage data, establish a unified identity graph, and operationalize insights — all within the familiar Salesforce environment.

 



Architecture Overview

Here’s the high-level architecture that INSART implemented:

+------------------+      +------------------+      +------------------+
|     Segment      | ---> |    Snowflake     | <--- |     Mixpanel     |
|  (event stream)  |      | (data warehouse) |      |  (analytics API) |
+------------------+      +------------------+      +------------------+
                              ^          |
                              |          |
                              v          v
              +------------------+    +------------------+      +------------------+
              |    Databricks    |    |    Hightouch     | ---> |    Salesforce    |
              |  (ML / scoring)  |    |  (Reverse ETL)   |      |    (CRM API)     |
              +------------------+    +------------------+      +------------------+

Core components:

  • Segment: collects real-time product usage events and forwards them to Snowflake.

  • Mixpanel: provides advanced engagement analytics and retention cohorts via API exports.

  • Snowflake: acts as the single source of truth for all data (sales, product, engagement).

  • Databricks: processes the warehouse data to train and score churn-prediction models.

  • Hightouch: performs Reverse ETL, syncing enriched insights back into Salesforce objects.

  • Salesforce: becomes the operational surface — surfacing customer health, risk, and usage insights directly to sales reps.


Step-by-Step Integration

Step 1: Ingesting Product Usage Data

The first step was capturing in-app events using Segment.

Segment SDKs were integrated into both the web app and backend API to track key behavioral events such as:

  • Account_Created

  • Funds_Deposited

  • Portfolio_Viewed

  • Transaction_Executed

  • API_Request_Made

  • Account_Logged_In

Each event included contextual properties:

{
  "event": "Transaction_Executed",
  "userId": "u_1329",
  "properties": {
    "transaction_id": "t_56892",
    "amount": 1200.00,
    "currency": "USD",
    "account_id": "acc_342",
    "region": "EU"
  },
  "timestamp": "2025-10-10T15:32:00Z"
}

Segment loaded these events into Snowflake through its Snowflake warehouse destination, which syncs in scheduled batches rather than as a continuous stream.
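To make the event shape concrete, here is a minimal Python sketch that builds a track payload like the one above and rejects events that lack the property later joins depend on. The helper name build_track_event and the required-property rule are assumptions for illustration; they are not part of Segment's SDK or of the client's actual tracking plan.

```python
from datetime import datetime, timezone

# Hypothetical rule: every event must carry account_id so it can be joined
# to an account later. The real tracking plan is not documented in this post.
REQUIRED_PROPERTIES = {"account_id"}


def build_track_event(event, user_id, properties, timestamp=None):
    """Return a Segment-style track payload as a plain dict."""
    missing = REQUIRED_PROPERTIES - set(properties)
    if missing:
        raise ValueError(f"missing required properties: {sorted(missing)}")
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "event": event,
        "userId": user_id,
        "properties": dict(properties),
        # ISO 8601 in UTC with a trailing Z, matching the example payload.
        "timestamp": ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


payload = build_track_event(
    "Transaction_Executed",
    "u_1329",
    {"transaction_id": "t_56892", "amount": 1200.00, "currency": "USD",
     "account_id": "acc_342", "region": "EU"},
    timestamp=datetime(2025, 10, 10, 15, 32, tzinfo=timezone.utc),
)
print(payload["timestamp"])
```

Validating at the point of emission is one design option; the same check could equally run as a warehouse test after load.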


Step 2: Enriching Product Data with Mixpanel Analytics

While Segment provided granular event data, Mixpanel offered higher-level metrics:

  • DAU/WAU/MAU ratios

  • Feature adoption

  • Retention cohorts

  • Funnel drop-off percentages

INSART’s data engineering team used Mixpanel’s JQL API (JavaScript Query Language) to extract these metrics daily into Snowflake.

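Example query counting events per user per day:

function main() {
  // groupByUser already groups by distinct_id; the extra key buckets events by day.
  return Events({
    from_date: "2025-10-01",
    to_date: "2025-10-31"
  })
  .groupByUser(
    [mixpanel.numeric_bucket("time", mixpanel.daily_time_buckets)],
    mixpanel.reducer.count()
  );
}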

Output was normalized to daily activity tables:

CREATE TABLE user_activity (
  user_id STRING,
  date DATE,
  active_events INT
);

This enabled blending behavioral signals from Mixpanel with transactional data from Segment.
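The normalization step could look roughly like the Python sketch below. It assumes the export yields one item per user and day shaped as {"key": [distinct_id, day_start_ms], "value": count}; that shape, the function name, and the sample values are assumptions for illustration rather than details given by the project.

```python
from datetime import datetime, timezone


def to_user_activity_rows(export_items):
    """Flatten {"key": [...], "value": n} items into user_activity rows."""
    rows = []
    for item in export_items:
        distinct_id, bucket_ms = item["key"]
        # Bucket start is assumed to be epoch milliseconds in UTC.
        day = datetime.fromtimestamp(bucket_ms / 1000, tz=timezone.utc).date()
        rows.append({
            "user_id": distinct_id,
            "date": day.isoformat(),
            "active_events": int(item["value"]),
        })
    return rows


sample = [
    {"key": ["u_1329", 1760054400000], "value": 14},
    {"key": ["u_2040", 1760140800000], "value": 3},
]
rows = to_user_activity_rows(sample)
for row in rows:
    print(row)
```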


Step 3: Building the Unified Identity Graph

The biggest technical challenge was identity resolution.

Different systems used different keys:

  • Salesforce → email, account_id

  • Segment → userId, account_id

  • Mixpanel → $distinct_id, occasionally email

  • Internal DB → transaction_id

INSART created a Unified Identity Graph table in Snowflake:

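CREATE TABLE identity_graph AS
SELECT
  LOWER(s.email) AS email,          -- normalized email, the key shared with Salesforce
  s.user_id      AS segment_id,
  m.distinct_id  AS mixpanel_id,
  f.account_id,
  f.transaction_id
FROM segment_users s
-- Emails are compared case-insensitively; users without a Mixpanel profile get a NULL mixpanel_id.
LEFT JOIN mixpanel_users m ON LOWER(s.email) = LOWER(m.email)
-- One output row per transaction, so a user with many transactions appears many times.
LEFT JOIN finance_transactions f ON f.user_id = s.user_id;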

This graph served as the bridge for joining sales and behavioral data — ensuring all insights could be mapped back to a single Salesforce Account or Contact.
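For readers who think better in code than in SQL, the email-based matching can be sketched in Python as follows. This is an illustration of the join logic only, with hypothetical record shapes and function names; the production graph lived in Snowflake as described above.

```python
def normalize_email(email):
    """Lowercase and trim an email; return None for empty values."""
    if not email:
        return None
    cleaned = email.strip().lower()
    return cleaned or None


def build_identity_graph(segment_users, mixpanel_users):
    """Left-join Segment users to Mixpanel profiles on normalized email."""
    mixpanel_by_email = {}
    for profile in mixpanel_users:
        key = normalize_email(profile.get("email"))
        if key is not None:
            # Keep the first profile per email; real data may need a tie-break rule.
            mixpanel_by_email.setdefault(key, profile["distinct_id"])

    graph = []
    for user in segment_users:
        email = normalize_email(user.get("email"))
        graph.append({
            "email": email,
            "segment_id": user["user_id"],
            # Stays None when no Mixpanel profile shares the email.
            "mixpanel_id": mixpanel_by_email.get(email),
            "account_id": user.get("account_id"),
        })
    return graph


graph = build_identity_graph(
    [{"user_id": "u_1329", "email": " Ana@Example.com ", "account_id": "acc_342"},
     {"user_id": "u_2040", "email": None, "account_id": "acc_342"}],
    [{"distinct_id": "mp_77", "email": "ana@example.com"}],
)
print(graph)
```

Users with no email end up unmatched rather than dropped, which keeps them visible for a later manual or rule-based merge.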


Step 4: Modeling Behavioral Insights

In Databricks, INSART data scientists used PySpark notebooks to analyze engagement behavior and train a churn prediction model.

The model estimated the likelihood of an account becoming inactive or downgrading in the next 30 days.

Feature Engineering

  • avg_sessions_per_week

  • days_since_last_login

  • avg_transaction_volume_30d

  • feature_adoption_index (weighted based on product modules used)

  • support_tickets_last_90d

  • billing_delinquency_flag
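Two of these features can be sketched in a few lines of Python. The module names and weights below are hypothetical, since the actual weighting behind feature_adoption_index is not published here; the point is only to show how a weighted index and a recency feature might be computed.

```python
from datetime import date

# Hypothetical module weights; the real weighting is not given in this case study.
MODULE_WEIGHTS = {"portfolio": 5, "transfers": 3, "api": 2}


def days_since_last_login(last_login, today):
    """Whole days between the last login date and the scoring date."""
    return (today - last_login).days


def feature_adoption_index(modules_used, weights=MODULE_WEIGHTS):
    """Share of total module weight covered by the modules in use (0 to 1)."""
    total = sum(weights.values())
    used = sum(w for name, w in weights.items() if name in modules_used)
    return used / total


print(days_since_last_login(date(2025, 10, 1), date(2025, 10, 11)))
print(feature_adoption_index({"portfolio", "api"}))
```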

Model Training (Databricks MLflow Example)

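import mlflow
from pyspark.ml import Pipeline
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.feature import VectorAssembler

# `data` is assumed to be a Spark DataFrame of account-level features
# with a 0/1 `churned` label column.
features = ["avg_sessions_per_week", "avg_transaction_volume_30d", "feature_adoption_index"]
train_data, test_data = data.randomSplit([0.8, 0.2], seed=42)

# Assembling features inside the pipeline applies the same transformation
# to both the training and the test split.
assembler = VectorAssembler(inputCols=features, outputCol="features")
rf = RandomForestClassifier(labelCol="churned", featuresCol="features", numTrees=100)

with mlflow.start_run():
    model = Pipeline(stages=[assembler, rf]).fit(train_data)
    predictions = model.transform(test_data)
    evaluator = BinaryClassificationEvaluator(labelCol="churned", metricName="areaUnderROC")
    auc = evaluator.evaluate(predictions)
    mlflow.log_metric("auc", auc)  # track the metric alongside the run
    print(f"AUC: {auc:.3f}")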

Scoring

Once trained, the model produced a churn_probability score for each account daily.

Scores were written back to Snowflake in a table called customer_health_scores.


Step 5: Reverse ETL to Salesforce with Hightouch

With insights centralized in Snowflake, INSART configured Hightouch to operationalize them inside Salesforce.

What is Reverse ETL?

Reverse ETL pushes data from your warehouse into operational tools like Salesforce, HubSpot, or Zendesk.

Rather than relying on manual CSV exports, it automates the synchronization of enriched data back into CRM systems, so the fields reps see are refreshed on a fixed schedule.

Hightouch Configuration

  • Source: Snowflake

  • Destination: Salesforce

  • Primary Key Mapping: account_id

  • Fields synced:

    • churn_probability

    • avg_sessions_per_week

    • feature_adoption_index

    • last_active_date

    • health_status (derived column)

Example sync mapping (illustrative YAML that mirrors the configuration above; it is not Hightouch's exact file schema):

model: customer_health_scores
destination: salesforce
primary_key: account_id
mappings:
  churn_probability: Churn_Score__c
  avg_sessions_per_week: Sessions_7d__c
  feature_adoption_index: Feature_Usage_Score__c
  last_active_date: Last_Active__c
  health_status: Account_Health__c
schedule: every_6_hours

The sync runs every six hours, ensuring Salesforce reflects the latest behavioral and ML-driven insights.
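Conceptually, each sync run renames warehouse columns to Salesforce field names before writing. The Python sketch below shows that renaming step under the mapping listed above; the function name is hypothetical, and this is not Hightouch's internal code.

```python
# Warehouse column -> Salesforce custom field, taken from the mapping above.
FIELD_MAPPING = {
    "churn_probability": "Churn_Score__c",
    "avg_sessions_per_week": "Sessions_7d__c",
    "feature_adoption_index": "Feature_Usage_Score__c",
    "last_active_date": "Last_Active__c",
    "health_status": "Account_Health__c",
}


def to_salesforce_fields(row, mapping=FIELD_MAPPING):
    """Rename mapped columns to Salesforce field names, skipping NULL values."""
    return {
        sf_field: row[column]
        for column, sf_field in mapping.items()
        if row.get(column) is not None
    }


row = {
    "account_id": "acc_342",  # primary key, used for matching, not sent as a field
    "churn_probability": 0.82,
    "avg_sessions_per_week": 1.5,
    "feature_adoption_index": 0.4,
    "last_active_date": "2025-10-02",
    "health_status": None,
}
print(to_salesforce_fields(row))
```

Skipping NULLs means a missing value never overwrites an existing Salesforce field; whether that is desirable depends on how stale values should be handled.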


Step 6: Surfacing Insights in Salesforce

In Salesforce, the synced data populates custom fields under the Account object and is visualized via custom dashboards:

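+------------------------+--------------------+-------------------------------------------------+
| Metric                 | Source             | Description                                     |
+------------------------+--------------------+-------------------------------------------------+
| Churn_Score__c         | Databricks         | Predicted churn probability (0–1).              |
| Feature_Usage_Score__c | Segment + Mixpanel | Weighted engagement index.                      |
| Last_Active__c         | Segment            | Last activity timestamp.                        |
| Account_Health__c      | Derived            | Categorical value: Healthy / At Risk / Dormant. |
+------------------------+--------------------+-------------------------------------------------+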
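The derived Account_Health__c value could be computed with a rule like the one sketched below. The 30-day dormancy window and the 0.6 churn threshold are assumptions chosen for illustration; the thresholds used in the project are not stated.

```python
def health_status(churn_probability, days_since_last_active,
                  dormant_after_days=30, at_risk_threshold=0.6):
    """Map a churn score and recency to Healthy / At Risk / Dormant."""
    # Inactivity takes precedence: an account nobody uses is Dormant
    # regardless of what the model predicts.
    if days_since_last_active >= dormant_after_days:
        return "Dormant"
    if churn_probability >= at_risk_threshold:
        return "At Risk"
    return "Healthy"


print(health_status(0.82, 4))
print(health_status(0.10, 45))
print(health_status(0.10, 2))
```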

Sales and success teams can now:

  • Prioritize outreach to “At Risk” accounts.

  • Identify upsell candidates (high usage, low churn).

  • Align campaigns to actual engagement metrics.
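One way to turn the first point into a worklist is to rank At Risk accounts by revenue exposure. The sketch below multiplies ARR by churn probability as a rough "revenue at risk" figure; that ranking rule and the field names are assumptions, not the prioritization logic the client used.

```python
def outreach_queue(accounts, limit=10):
    """Return At Risk accounts ordered by ARR x churn probability, highest first."""
    at_risk = [a for a in accounts if a["health_status"] == "At Risk"]
    at_risk.sort(key=lambda a: a["arr"] * a["churn_probability"], reverse=True)
    return at_risk[:limit]


accounts = [
    {"account_id": "acc_1", "arr": 50_000, "churn_probability": 0.7, "health_status": "At Risk"},
    {"account_id": "acc_2", "arr": 200_000, "churn_probability": 0.65, "health_status": "At Risk"},
    {"account_id": "acc_3", "arr": 500_000, "churn_probability": 0.1, "health_status": "Healthy"},
]
queue = outreach_queue(accounts)
print([a["account_id"] for a in queue])
```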


Key Technologies in Detail

Segment

  • Event tracking SDKs integrated in web and backend.

  • Events are forwarded automatically to Snowflake (warehouse destination).

  • identify() and group() calls tie each user to an account, keeping the mapping of users and accounts consistent.

Example:

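// Attach traits to the user.
analytics.identify('user_1234', {
  email: 'user@example.com',
  account_id: 'acc_345'
});
// Associate the user with the account.
analytics.group('acc_345');
// Record a behavioral event for that user.
analytics.track('Funds_Deposited', { amount: 2000, currency: 'USD' });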

Mixpanel

  • Used for cohort analysis, feature adoption, and retention reporting.

  • API exports daily behavioral aggregates into Snowflake.

  • Cohort exports also sync into Hightouch for marketing automation.

Example API call:

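# Profile lookup. The API requires authentication; credentials are shown as placeholder
# environment variables, and service accounts also need a project_id parameter.
curl -u "$MIXPANEL_USER:$MIXPANEL_SECRET" "https://mixpanel.com/api/2.0/engage?distinct_id=user_1234"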

Hightouch

  • No-code Reverse ETL platform.

  • Transforms and syncs data models from warehouse to Salesforce.

  • Supports field mapping, schedule control, and change-detection syncs.

  • Writes to Salesforce through the REST API (v57.0); an upsert by external ID takes the form PATCH /services/data/v57.0/sobjects/Account/{externalIdField}/{value}.
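Change detection is what keeps a scheduled sync from rewriting every account on every run. The sketch below shows one common way to do it, by fingerprinting each row and comparing against the previous run. It is a generic illustration with hypothetical function names, not a description of how Hightouch implements the feature.

```python
import hashlib
import json


def row_fingerprint(row):
    """Stable hash of a row's contents, independent of key order."""
    encoded = json.dumps(row, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def changed_rows(current_rows, previous_fingerprints, primary_key="account_id"):
    """Return rows that are new or differ from the last synced snapshot."""
    changed = []
    for row in current_rows:
        if previous_fingerprints.get(row[primary_key]) != row_fingerprint(row):
            changed.append(row)
    return changed


previous = {"acc_1": row_fingerprint({"account_id": "acc_1", "churn_probability": 0.2})}
current = [
    {"account_id": "acc_1", "churn_probability": 0.2},  # unchanged, skipped
    {"account_id": "acc_2", "churn_probability": 0.9},  # new, synced
]
to_sync = changed_rows(current, previous)
print([r["account_id"] for r in to_sync])
```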


Databricks

  • Unified analytics workspace for ML modeling.

  • MLflow tracks model versions, parameters, and metrics.

  • Orchestrated by Databricks Jobs triggered after Snowflake loads complete.

Example job pipeline:

1. Airflow DAG triggers Databricks job nightly.
2. Job trains model and writes scores → Snowflake.
3. Hightouch syncs → Salesforce.
4. Salesforce dashboards in Tableau CRM (now CRM Analytics) pick up the refreshed fields.

Fintech Context: Why It Matters

In fintech, understanding user behavior is a compliance and revenue imperative.

Transactions, deposits, and feature engagement are strong proxies for customer health and risk — but when they remain invisible to sales, opportunities are lost.

By connecting product analytics (Segment + Mixpanel) with sales intelligence (Salesforce), INSART enabled the client to:

  • Identify dormant accounts early through usage decay patterns.

  • Trigger proactive retention actions in Salesforce (e.g., automated task creation for CSM).

  • Feed churn scores into marketing automation, targeting high-risk accounts with specific incentives.

  • Quantify feature adoption ROI, aligning product strategy with revenue.
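A usage decay pattern can be approximated by comparing the latest week against a trailing baseline. The sketch below uses a four-week baseline and flags accounts whose latest week falls under half of it; both numbers and the function names are assumptions for illustration, not the signal definition used in the project.

```python
def usage_decay_ratio(weekly_sessions, baseline_weeks=4):
    """Latest week's sessions divided by the mean of the preceding baseline weeks."""
    if len(weekly_sessions) < baseline_weeks + 1:
        return None  # not enough history to compare
    baseline = weekly_sessions[-(baseline_weeks + 1):-1]
    mean = sum(baseline) / baseline_weeks
    if mean == 0:
        return None  # no prior activity, so decay is undefined
    return weekly_sessions[-1] / mean


def is_decaying(weekly_sessions, threshold=0.5):
    """True when the latest week is below `threshold` times the baseline."""
    ratio = usage_decay_ratio(weekly_sessions)
    return ratio is not None and ratio < threshold


print(usage_decay_ratio([20, 22, 18, 20, 5]))
print(is_decaying([20, 22, 18, 20, 5]))
```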

This integration didn’t just align systems — it aligned departments: sales, data, and product finally shared one version of truth.

 



Measurable Outcomes

Within three months of deployment:

  • Churn reduced by 22% across SMB segments.

  • Sales response time to at-risk accounts dropped from 72h → 6h.

  • Pipeline accuracy improved, as Salesforce “Health” field correlated strongly (r=0.81) with renewals.

  • Data freshness improved — Salesforce now refreshed from warehouse every 6 hours (previously weekly).


Lessons Learned

  1. Identity resolution is everything. Without a clean identity graph, integrations become brittle.

  2. Reverse ETL is transformative. It closes the loop — data doesn’t just live in dashboards; it drives action.

  3. Start small with fields. Sync 3–5 metrics first, validate adoption, then scale to more complex signals.

  4. Sales adoption requires UX. Embedding dashboards and highlights directly into Salesforce accounts ensures visibility.

  5. Keep governance in mind. Each sync must have ownership: who maintains logic, who validates freshness.
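The freshness part of that governance point lends itself to an automated check. The sketch below flags syncs whose last successful run is older than the six-hour schedule plus a grace period and reports the owner to notify; the registry structure, the 30-minute grace period, and the names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_syncs(registry, now, max_age=timedelta(hours=6), grace=timedelta(minutes=30)):
    """Return (sync name, owner) pairs whose last run is older than max_age + grace."""
    stale = []
    for name, info in sorted(registry.items()):
        if now - info["last_synced_at"] > max_age + grace:
            stale.append((name, info["owner"]))
    return stale


now = datetime(2025, 10, 11, 12, 0, tzinfo=timezone.utc)
registry = {
    "customer_health_scores": {
        "owner": "data-eng",
        "last_synced_at": datetime(2025, 10, 11, 7, 0, tzinfo=timezone.utc),
    },
    "usage_metrics": {
        "owner": "analytics",
        "last_synced_at": datetime(2025, 10, 10, 23, 0, tzinfo=timezone.utc),
    },
}
print(stale_syncs(registry, now))
```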


INSART’s Approach

INSART’s strength lies in connecting data engineering with customer success operations.

We don’t just integrate APIs; we create data products that drive measurable business actions.

Our stack covers:

  • Data pipelines (Segment, Snowflake)

  • Reverse ETL automation (Hightouch)

  • Predictive analytics (Databricks)

  • CRM integration (Salesforce)

  • Observability and governance (dbt, Airflow)

This cross-functional depth enables fintechs to unify product and sales intelligence — without building complex infrastructure in-house.

 



Conclusion

Modern fintech growth depends on knowing not just who your customers are, but how they behave.

By integrating Salesforce with Segment, Mixpanel, and Databricks through Hightouch, INSART turned data into action, giving the sales team behavioral intelligence that refreshes every six hours.

Now, every account manager can open Salesforce and see not just pipeline data, but the pulse of every user — usage, health, churn risk, and opportunity — all in one place.

That’s what data-driven relationship management looks like.
