Auto Parts Data Platform for a European B2B Marketplace

EU Auto Parts Group
Automotive aftermarket
Germany
Duration
9 months (Phase 1–5)
Team Size
7 people
Services
3 services

Client Context

A Bavaria-based aftermarket distributor sourced parts from 40+ Tier-1 and Tier-2 suppliers (Bosch, Mahle, Hella, Continental and several regional brands) and resold to ~3,800 independent garages and BMW / Mercedes / VW dealer networks across DACH. Its original catalog had grown organically inside a 12-year-old SQL Server monolith with no public API, and every supplier delivered data in its own Excel, CSV or proprietary XML schema. Wrong-fit returns were eroding margin, and the commercial team could not run targeted campaigns without weeks of manual SKU cleanup.

The Challenge

Business Challenge

Returns from wrong-fit orders had reached ~7.4% of B2B revenue. Each return cost the company €38 on average in logistics, restock and credit-note handling. Sales lost ground to two larger pan-European marketplaces because the distributor's own catalog search felt slow, fitment was unreliable, and EAN/OEM cross-references were missing for ~31% of SKUs.

Technical Challenge

10.2M SKUs across 40+ suppliers, no canonical product or fitment schema, no vehicle compatibility (KBA/TecDoc) enrichment, search p95 latency of 4.2s on a legacy full-text index, and no API surface — partners had to FTP CSVs and re-import nightly.

Signals Before We Started

  • 7.4% wrong-fit return rate (industry benchmark in DACH aftermarket is 2.5–3.5%)

  • 31% of SKUs missing OEM cross-references or EAN

  • 4.2s p95 search latency; 11% of search sessions abandoned before the first click

  • Supplier onboarding took 6–9 weeks per brand (manual mapping)

  • No structured fitment data — 100% of vehicle compatibility was free-text

  • 0 public API endpoints; partners integrated via nightly FTP

Our Solution

Overview

We delivered a modern aftermarket catalog platform with three pillars: (1) supplier ingestion pipelines that normalize heterogeneous feeds into a canonical schema, (2) a fitment enrichment engine that joins OEM/EAN cross-references with KBA / TecDoc-style vehicle data, and (3) an Elasticsearch-backed B2B search surface with REST and event APIs for partners and the existing ERP.

Architecture

.NET 8 microservices on Azure Kubernetes Service (AKS), PostgreSQL 15 for canonical catalog state, Elasticsearch 8 for search, Apache Kafka for ingestion / domain events, Azure Blob Storage for raw supplier payloads, Redis for read-through cache, Angular admin console, and Azure API Management as the external gateway. Identity via Azure AD B2C for partners; observability through Azure Monitor + OpenTelemetry to Grafana / Loki.

Approach

  1. Discovery + canonical schema design

  2. Ingestion pipelines per supplier with validation

  3. Fitment & cross-reference enrichment engine

  4. Search relevance tuning on Elasticsearch

  5. Partner REST + webhook APIs

  6. Phased rollout per supplier and per market

Platform Modules

The system was delivered as the following modules — each with its own owner, integration contract and rollout plan.

Supplier Gateway

Per-supplier adapters (CSV / XLSX / XML / JSON) with schema validation, dead-letter queue, and replayable raw payload storage on Azure Blob.
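
As a rough illustration of the validate-or-dead-letter pattern described above, the sketch below parses one CSV feed in Python. The column names in `REQUIRED_FIELDS`, the `;` delimiter and the `AdapterResult` shape are assumptions made for the example; the case study does not show the real canonical schema or adapter code.

```python
import csv
import io
from dataclasses import dataclass, field

# Hypothetical required columns; the real schema is not shown in the case study.
REQUIRED_FIELDS = ("supplier_sku", "brand", "description")


@dataclass
class AdapterResult:
    accepted: list = field(default_factory=list)     # normalized rows
    dead_letter: list = field(default_factory=list)  # (line_number, raw_row, reason)


def normalize_row(row: dict) -> dict:
    """Trim whitespace on every named column and upper-case the supplier SKU."""
    clean = {
        key: (value or "").strip()
        for key, value in row.items()
        if key is not None  # DictReader stores surplus columns under the key None
    }
    clean["supplier_sku"] = clean["supplier_sku"].upper()
    return clean


def ingest_csv(payload: str, delimiter: str = ";") -> AdapterResult:
    """Validate each row; rejected rows are kept with a reason, never dropped."""
    result = AdapterResult()
    reader = csv.DictReader(io.StringIO(payload), delimiter=delimiter)
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        missing = [name for name in REQUIRED_FIELDS if not (row.get(name) or "").strip()]
        if missing:
            result.dead_letter.append((line_number, row, "missing: " + ", ".join(missing)))
            continue
        result.accepted.append(normalize_row(row))
    return result
```

Because the raw payload stays untouched in object storage, a rejected row can be fixed in the mapping rules and the whole file replayed through `ingest_csv` again.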

Canonical Catalog Service

Source-of-truth Part, Brand, Category, Fitment and CrossReference aggregates in PostgreSQL, with full event history streamed onto Kafka topics.
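
One way to picture an aggregate that keeps full event history is an append-only log next to the current state. The Python sketch below is illustrative only: the `PartAggregate` class, the "non-empty value wins" merge rule and the event fields are assumptions, and only the `catalog.part.changed` topic name comes from the platform description.

```python
from dataclasses import dataclass, field


@dataclass
class PartAggregate:
    part_id: str
    attributes: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # append-only event log

    def apply_supplier_row(self, supplier: str, row: dict) -> list:
        """Merge a supplier row and return the events produced by this call."""
        changed = {}
        for key, value in row.items():
            if value in (None, ""):
                continue  # assumed rule: never overwrite a known value with an empty one
            if self.attributes.get(key) != value:
                changed[key] = {"old": self.attributes.get(key), "new": value}
                self.attributes[key] = value
        if not changed:
            return []  # replaying an identical row emits nothing
        event = {
            "topic": "catalog.part.changed",
            "part_id": self.part_id,
            "supplier": supplier,
            "changes": changed,
            "version": len(self.history) + 1,
        }
        self.history.append(event)
        return [event]
```

Returning no event for an unchanged row keeps replays of raw payloads cheap for downstream consumers such as the enrichment engine and the indexer.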

Fitment Enrichment Engine

Joins canonical parts with vehicle reference data; resolves engine-code → applicable-parts and confidence-scores ambiguous mappings for steward review.
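
The score-then-route idea can be sketched in a few lines of Python. The 0.7 review threshold is the one quoted in the Phase 3 outcomes; the three signals and their weights are hypothetical and stand in for whatever features the real engine uses.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.7  # matches below this go to steward review

# Hypothetical weights chosen for illustration; they sum to 1.0.
WEIGHTS = {"engine_code_exact": 0.5, "year_in_range": 0.2, "oem_cross_ref_found": 0.3}


@dataclass(frozen=True)
class FitmentCandidate:
    part_id: str
    engine_code: str
    engine_code_exact: bool    # supplier engine code equals the reference code
    year_in_range: bool        # vehicle build year inside the part's stated range
    oem_cross_ref_found: bool  # an OEM cross-reference links part and vehicle


def confidence(candidate: FitmentCandidate) -> float:
    """Sum the weights of the signals that are present."""
    score = sum(weight for name, weight in WEIGHTS.items() if getattr(candidate, name))
    return round(score, 2)


def route(candidates):
    """Split candidates into auto-accepted edges and a steward review queue."""
    accepted, review = [], []
    for candidate in candidates:
        target = accepted if confidence(candidate) >= REVIEW_THRESHOLD else review
        target.append(candidate)
    return accepted, review
```

Nothing is discarded in `route`: a low score only changes which queue the candidate lands in.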

Search & Discovery

Elasticsearch 8 cluster with custom analyzers for OEM numbers; relevance tuning combines brand tier, fitment exactness and inventory availability.
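
OEM numbers are typed with inconsistent spaces, dots and dashes, so a custom analyzer usually reduces them to one search key. The sketch below shows that normalization in Python, followed by index settings that would do the same inside Elasticsearch. `pattern_replace`, `keyword` and `lowercase` are built-in Elasticsearch components; the analyzer and filter names, and the exact set of separators stripped, are assumptions for the example rather than the project's actual configuration.

```python
import re

_SEPARATORS = re.compile(r"[\s.\-/]+")


def oem_search_key(number: str) -> str:
    """Strip spaces, dots, dashes and slashes, then lower-case."""
    return _SEPARATORS.sub("", number).lower()


# Index settings sketch mirroring oem_search_key (names are made up).
OEM_INDEX_SETTINGS = {
    "analysis": {
        "char_filter": {
            "strip_oem_separators": {
                "type": "pattern_replace",
                "pattern": "[\\s.\\-/]+",
                "replacement": "",
            }
        },
        "analyzer": {
            "oem_number": {
                "type": "custom",
                "char_filter": ["strip_oem_separators"],
                "tokenizer": "keyword",
                "filter": ["lowercase"],
            }
        },
    }
}
```

With this in place, `06A 115 561-B` and `06a115561b` resolve to the same key, which is what lets an OEM-exact match outrank fuzzy description hits.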

Partner API & Webhooks

REST + webhook surface behind Azure API Management; OAuth2 client credentials, throttling, and per-partner contract testing.
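
In this platform the throttling is handled by Azure API Management, not by hand-written code. To show what a per-partner limit means in practice, here is a conceptual token-bucket sketch in Python; the class names, rate and capacity are illustrative assumptions and do not describe how APIM implements its policies.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self._rate = rate_per_second
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._updated = clock()

    def allow(self) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class PartnerThrottle:
    """One independent bucket per partner ID."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self._args = (rate_per_second, capacity, clock)
        self._buckets = {}

    def allow(self, partner_id: str) -> bool:
        bucket = self._buckets.get(partner_id)
        if bucket is None:
            bucket = self._buckets[partner_id] = TokenBucket(*self._args)
        return bucket.allow()
```

Keeping a separate bucket per partner means one garage's bulk export cannot exhaust the quota of another.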

Steward Console

Angular admin app for the data team: handle ingestion exceptions, review low-confidence fitment matches, and audit cross-reference changes.

Data Flow

Supplier files arrive via SFTP or an S3 bucket and land in Azure Blob as immutable raw payloads. The matching Supplier Gateway adapter validates and normalizes each row, then emits a `catalog.supplier.row.received` event onto Kafka. The Canonical Catalog Service consumes these events, applies deduplication and conflict resolution, and writes the canonical Part aggregate to PostgreSQL while emitting `catalog.part.changed`. The Fitment Enrichment Engine reacts to `catalog.part.changed`, joins vehicle reference data, scores ambiguous matches and writes back fitment edges. A downstream indexer projects the enriched read model into Elasticsearch. Partner REST queries hit a Redis read-through cache in front of Elasticsearch, while webhooks notify subscribed partners of price, stock and fitment changes.
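
The last hop of this flow, a read-through cache in front of search, can be illustrated with a small in-memory stand-in. In the Python sketch below a dictionary plays the role of Redis and `loader` plays the role of the Elasticsearch query; the class name, TTL and invalidation hook are assumptions for the example.

```python
import time


class ReadThroughCache:
    """Tiny in-memory stand-in for the Redis layer in front of search."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader  # called on a miss, e.g. the search query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}     # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]            # hit: the backend is not touched
        value = self._loader(key)      # miss or expired: read through
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, for example when a catalog.part.changed event arrives."""
        self._entries.pop(key, None)
```

Callers only ever talk to `get`; whether the answer came from the cache or the backend is invisible to them, which is the defining property of read-through caching.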

Integrations

  • Bosch / Mahle / Hella / Continental supplier feeds (CSV, XLSX, XML)

  • TecDoc-style vehicle reference dataset

  • SAP S/4HANA ERP (orders, stock and pricing) via webhook + SAP IDoc bridge

  • DATEV-compatible accounting export

  • Azure AD B2C for partner identity

Delivery Timeline

Phased delivery — each phase had explicit goals, measurable outcomes and a checkpoint before progression.

  1. Phase 1 — Discovery & canonical model

    Week 1–4
    Goals
    • Map 40+ supplier feed formats
    • Define canonical Part, Brand, Fitment, CrossReference entities
    • Align with stakeholders on KPI tree (returns, search, onboarding)
    Outcomes
    • Canonical schema v1 signed off
    • Data quality baseline: 31% missing OEM, 18% missing EAN, 12% duplicate SKUs across suppliers
    • Prioritized rollout list: 8 hero suppliers covering ~62% of revenue first
  2. Phase 2 — Ingestion & cleansing

    Week 5–12
    Goals
    • Build supplier-specific adapters (CSV, XLSX, XML, JSON over FTP / S3)
    • Validation, deduplication, and conflict-resolution rules
    • Write canonical catalog to PostgreSQL with full history
    Outcomes
    • 8 hero suppliers ingested daily; ingestion success ≥99.2%
    • Dedupe collapsed 10.2M raw rows → 8.6M canonical SKUs
    • Steward console enabled staff to resolve ~400 daily exceptions
  3. Phase 3 — Fitment enrichment

    Week 9–18
    Goals
    • Integrate KBA / vehicle reference data
    • Build fitment-resolver service (engine code → applicable parts)
    • Confidence scoring for ambiguous mappings
    Outcomes
    • Fitment coverage rose from 47% to 91% of high-velocity SKUs
    • Confidence-scored matches surfaced to stewards for ambiguous cases (confidence < 0.7)
    • Vehicle compatibility queryable in <80ms p95
  4. Phase 4 — Search & partner APIs

    Week 14–24
    Goals
    • Reindex canonical catalog into Elasticsearch
    • Tune relevance: brand boost, OEM-exact priority, fitment filter
    • Expose REST + webhook APIs via APIM
    Outcomes
    • Search p95 latency dropped from 4.2s to 620ms
    • Partner sandbox onboarded 12 garages in first 3 weeks
    • First webhook integration with the existing SAP ERP went live week 22
  5. Phase 5 — Rollout & hardening

    Week 22–36
    Goals
    • Onboard remaining 32 suppliers in waves
    • Blue-green migration off legacy catalog
    • Run a 4-week parallel period before decommissioning
    Outcomes
    • Full supplier base live, average onboarding dropped from 6–9 weeks to 4–7 days
    • Legacy SQL Server catalog decommissioned in week 34
    • Returns dropped to 4.6% within the parallel period and 4.1% three months post-cutover

Technology Stack

.NET 8
PostgreSQL 15
Elasticsearch 8
Apache Kafka
Azure (AKS, APIM, AD B2C, Blob)
Angular 17
Redis
OpenTelemetry

The Results

Measurable impact delivered within 9 months (Phase 1–5).

Wrong-fit returns
From 7.4% to 4.1% of B2B revenue 3 months after full cutover
Search latency (p95)
4.2s → 620ms on 8.6M canonical SKUs
Supplier onboarding
From 6–9 weeks to 4–7 days per brand; adapter pattern + steward console replaces manual mapping
Fitment coverage
From 47% to 91% of high-velocity SKUs after the enrichment engine went live

Security & Compliance

  • GDPR-aligned data handling — partner personal data isolated in EU-West region only
  • OAuth2 client credentials + per-partner rate limits at APIM
  • Audit trail of every steward action (who changed which mapping, when, why)
  • Signed webhooks (HMAC-SHA256) and replay protection
  • Least-privilege service principals; no shared production credentials
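
For the signed-webhook item above, the following Python sketch shows one common way to combine HMAC-SHA256 signatures with replay protection. The signed message layout (`delivery_id.timestamp.body`), the 300-second tolerance and the in-memory set of seen IDs are assumptions for illustration; the case study does not specify the actual header names or scheme.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 300  # hypothetical tolerance window


def sign(secret: bytes, delivery_id: str, timestamp: int, body: bytes) -> str:
    """Return the hex HMAC-SHA256 over '<delivery_id>.<timestamp>.<body>'."""
    prefix = f"{delivery_id}.{timestamp}.".encode("utf-8")
    return hmac.new(secret, prefix + body, hashlib.sha256).hexdigest()


class WebhookVerifier:
    def __init__(self, secret: bytes, clock=time.time):
        self._secret = secret
        self._clock = clock
        self._seen = set()  # accepted delivery IDs (in-memory for the sketch)

    def verify(self, delivery_id: str, timestamp: int, body: bytes, signature: str) -> bool:
        if abs(self._clock() - timestamp) > MAX_SKEW_SECONDS:
            return False  # too old, or too far in the future
        expected = sign(self._secret, delivery_id, timestamp, body)
        if not hmac.compare_digest(expected, signature):
            return False  # ID, timestamp or body does not match the signature
        if delivery_id in self._seen:
            return False  # replay of an already accepted delivery
        self._seen.add(delivery_id)
        return True
```

Signing the delivery ID and timestamp together with the body matters: otherwise an attacker could resend a captured payload under a fresh ID. A production receiver would keep seen IDs in a shared store with an expiry at least as long as the tolerance window.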

Delivery & Operations

  • GitHub Actions CI with contract tests against partner OpenAPI specs
  • Blue-green deploys on AKS via Argo Rollouts
  • OpenTelemetry traces shipped to Grafana Tempo; SLOs alert on search p95 > 1s
  • On-call rotation shared between the German PO and the Vireon engineering lead
  • Quarterly fitment-quality drill: regression test on 1,000 golden vehicle/part pairs
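
A golden-pair drill like the one in the last item can be as simple as replaying curated vehicle/part pairs through the resolver and reporting mismatches. The Python sketch below assumes a `resolve(vehicle_id, part_id) -> bool` lookup and a 99% pass bar; both are hypothetical, as the case study does not describe the harness or its threshold.

```python
def fitment_regression(resolve, golden_pairs, min_accuracy=0.99):
    """Compare resolver output with curated expectations.

    resolve(vehicle_id, part_id) -> bool is the fitment lookup under test.
    golden_pairs is an iterable of (vehicle_id, part_id, expected_fit).
    """
    failures = []
    total = 0
    for vehicle_id, part_id, expected in golden_pairs:
        total += 1
        if resolve(vehicle_id, part_id) != expected:
            failures.append((vehicle_id, part_id, expected))
    accuracy = (total - len(failures)) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "failures": failures,
        "passed": accuracy >= min_accuracy,
    }
```

Returning the failing pairs, not just a percentage, gives stewards a concrete list to investigate after each drill.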
What we'd do again

Key Learnings

  • Front-load the canonical schema decision — every later integration pays the price of that schema for years.

  • Score, don't reject: surface low-confidence fitment matches to stewards instead of dropping them; a 0.6-confidence row resolved by a human is better than a missing SKU in search.

  • Treat supplier feeds as immutable raw payloads on object storage. We replayed 14 months of ingestion when we added a new cross-reference field — without that, the schema change would have shipped as a 9-month backfill project.

  • Search relevance is a product, not a config. We allocated a single engineer to relevance tuning for the last six weeks; that work alone closed ~40% of the 'cannot find a part' support tickets.
