Fulfillment KPIs: the 8 that really matter

Your 3PL has 30 KPIs in the WMS. It shows you four. The four that look good.

I know this from the other side of the table. You build a monthly report that pleases: “On-time shipped 99.2%”, “pick accuracy 99.8%”, green traffic lights, quarterly meeting passed. What's not in the report: that the on-time rate only counts orders that were picked at all — and not the 4% that came in after cut-off and quietly slipped into the next day. That pick accuracy is measured as a “complaint rate”, i.e. only the complaints the end customer raised.

Here are the eight KPIs you should demand. With the variant your 3PL shows you voluntarily — and the variant you need to know whether you're throwing money on the floor.

1. Same-day pick rate

Definition

Share of orders that arrive before the defined cut-off and leave the warehouse on the same working day.

Same-day pick rate = (orders before cut-off, shipped same day)
                     / (orders before cut-off)

Benchmark DACH 2026

Mid-tier DACH sites with a 14:00 cut-off should manage > 96%. The top quartile is > 98%. Anyone dropping below 94% has a control problem — either in wave planning or staff forecasting.

What your 3PL likes to show you

“Best-efforts obligation” — worded in the contract as a soft target, no penalty. Some providers also only count the orders that reached the WMS — i.e. after drop filters, after cancellations, after stock locks. That removes 2–3% of problem orders from the denominator.

How to really measure it

You take the order confirmation time from your shop (not from the WMS) and reconcile it against the carrier scan. If the carrier scanned the order by 24:00 on the same working day, it's “same-day”. Everything else doesn't count.
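The reconciliation above can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not a finished tool: the dictionary inputs, the function name `same_day_pick_rate` and the 14:00 default are invented for illustration, and weekends and public holidays are not handled.

```python
from datetime import datetime, time


def same_day_pick_rate(shop_orders, carrier_scans, cut_off=time(14, 0)):
    """Share of pre-cut-off orders whose first carrier scan is on the order date.

    shop_orders:   {order_id: order confirmation datetime from the shop}
    carrier_scans: {order_id: first carrier scan datetime}; an order without
                   a scan counts as missed and stays in the denominator.
    """
    eligible = {oid: ts for oid, ts in shop_orders.items() if ts.time() < cut_off}
    if not eligible:
        return None  # no pre-cut-off orders in the period
    same_day = sum(
        1
        for oid, ordered_at in eligible.items()
        if oid in carrier_scans and carrier_scans[oid].date() == ordered_at.date()
    )
    return same_day / len(eligible)


shop_orders = {
    "A1": datetime(2026, 3, 2, 9, 15),
    "A2": datetime(2026, 3, 2, 13, 59),
    "A3": datetime(2026, 3, 2, 15, 30),  # after cut-off, not in the denominator
    "A4": datetime(2026, 3, 2, 11, 0),   # never reached the carrier
}
carrier_scans = {
    "A1": datetime(2026, 3, 2, 17, 40),
    "A2": datetime(2026, 3, 3, 8, 5),    # scanned the next day
    "A3": datetime(2026, 3, 3, 17, 0),
}
rate = same_day_pick_rate(shop_orders, carrier_scans)
print(f"{rate:.1%}")  # 33.3%
```

The point of the design is the denominator: it comes from the shop, so an order that never made it into the WMS still counts against the rate.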

What it costs if it misses

The Amazon FBM late-shipment-rate threshold is 4%. Anyone who breaches it loses the Buy Box and damages account health. Otto Partner Connect rates delivery punctuality as a service-level KPI with a direct impact on merchant ranking in search results. On top of that, repeat buyers in DACH studies react more strongly than average to delivery time, and studies (Forrester et al.) show that around 44% of buyers abandon carts because of shipping costs or missing shipping options.

2. Cut-off time

Definition

The time from which incoming orders no longer leave the warehouse the same working day.

Benchmark DACH 2026

12:00 — weak, typical of small 3PLs with only one late shift
14:00 — lower mid-field, the most common standard in DACH
16:00 — solid, demanding in wave control
17:00–18:00 — top quartile, mostly exploiting the DHL cut-off

GLS SameDay sets cut-offs at individual providers up to 16:30. DHL usually collects until 17:30–18:00, depending on site.

What your 3PL likes to show you

The theoretical cut-off. Contractually it's 16:00, but the actual average of the last 30 days is 14:20: wave planning doesn't support it, a shift was off sick, the printer played up. Nobody volunteers that information to you.

How to really measure it

Actual cut-off = mean carrier handover time
                 over the last 30 working days
                 (source: carrier scan log)

You want to see the p90 cut-off, not the mean. It shows you when the shipping window closes on 9 out of 10 days.
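A small sketch of the difference between the mean and a "9 out of 10 days" figure. It rests on two assumptions that are not in the text. First, `daily_cutoffs` is a hypothetical list with one effective cut-off per working day, for example the latest order time that still got a same-day carrier scan. Second, the percentile is read as the latest time that was still met on at least 90% of the days.

```python
def to_minutes(hhmm):
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(minutes):
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def mean_cutoff(daily_cutoffs):
    minutes = [to_minutes(t) for t in daily_cutoffs]
    return to_hhmm(round(sum(minutes) / len(minutes)))


def cutoff_held(daily_cutoffs, share_pct=90):
    """Latest time of day that was still met on at least share_pct % of days."""
    minutes = sorted((to_minutes(t) for t in daily_cutoffs), reverse=True)
    days_needed = (share_pct * len(minutes) + 99) // 100  # integer ceiling
    return to_hhmm(minutes[days_needed - 1])


# Hypothetical sample: ten working days, contractual cut-off 16:00.
daily_cutoffs = ["16:00", "16:00", "15:45", "16:00", "14:20",
                 "16:00", "15:30", "16:00", "12:45", "16:00"]
print(mean_cutoff(daily_cutoffs))  # 15:26
print(cutoff_held(daily_cutoffs))  # 14:20
```

In this sample the mean still looks close to the contract, while the time you can actually promise on 9 out of 10 days is more than an hour and a half earlier.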

What it costs if it misses

Every hour of later cut-off is a CRO lever. A same-day promise on the product detail page (“Order in the next 2h 14m and get it tomorrow”) demonstrably affects conversion rate and basket value. Anyone at 14:00 while the competition runs 17:00 loses in the afternoon hours — and that's exactly where 30–35% of revenue sits.

3. Pick accuracy

Definition

Share of outbound orders that contain exactly the ordered items in the correct quantity.

Benchmark DACH 2026

The industry standard is 99.5% upward; best-in-class at 99.8–99.9%. Anyone running persistently below 99% (i.e. more than 1 error per 100 orders) has a structural process or scan problem. More than 35% of warehouses actually run at 1% error or worse.

What your 3PL likes to show you

The complaint-based accuracy. That means only the cases where an end customer actively complained are counted. In experience, only one in three to five customers gets in touch when something is missing; the rest keep the goods or send them back without comment. The dark figure in this measurement model is at least a factor of 2.

How to really measure it

A sample audit at the outbound gate. At least 100 random orders per week are opened and cross-checked against the delivery note before carrier handover. The result is the real pick accuracy. Second lever: outbound weighing — every carton is compared against the target weight from the ERP before the label. A deviation > 5% triggers a re-check.
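Both levers are simple enough to sketch. The function names and inputs below are assumptions for illustration: the audit result is one boolean per opened order, and the weights are in grams with the target weight taken from the ERP.

```python
def audited_pick_accuracy(audit_results):
    """audit_results: one bool per opened order, True if it matched the delivery note."""
    if not audit_results:
        raise ValueError("empty audit sample")
    return sum(audit_results) / len(audit_results)


def needs_recheck(actual_weight_g, target_weight_g, tolerance=0.05):
    """True if the carton deviates from the target weight by more than the tolerance."""
    if target_weight_g <= 0:
        raise ValueError("target weight must be positive")
    return abs(actual_weight_g - target_weight_g) / target_weight_g > tolerance


weekly_sample = [True] * 99 + [False]  # 100 audited orders, 1 error found
print(audited_pick_accuracy(weekly_sample))  # 0.99
print(needs_recheck(940, 1000))   # True  (6% under target)
print(needs_recheck(1040, 1000))  # False (4% over target)
```

A weight check will not catch a swap between two items of near-identical weight, so it complements the sample audit rather than replacing it.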

What it costs if it misses

Complaint handling costs €8–18 per case (return postage, re-shipment, customer-service time). Plus the NPS effect: a wrongly delivered order significantly lowers the repurchase probability. At a pick accuracy of 99.0% instead of 99.7%, on 200,000 orders/year that is 1,400 additional wrong orders, so you pay an extra ~€11,000–25,000 and lose customers on top.

4. Inventory accuracy

Definition

The match between WMS stock and actual, physical stock.

Inventory accuracy = |SKUs with correct stock|
                     / |all counted SKUs|

Benchmark DACH 2026

Good 3PLs reach > 99.5%. Best-in-class is 99.9%. Below 97% you no longer have plannable availability — every out-of-stock could be a WMS ghost.

What your 3PL likes to show you

The accuracy only over the top-50 SKUs, because those are picked daily anyway and the stock practically self-corrects. The long tail (ABC class C) is counted once a year at the cut-off date stocktake — and that's where the real ghosts sit.

How to really measure it

Cycle counting with ABC classification, reported separately:

A class (top-20% SKUs, 80% volume): counted weekly, target > 99.8%
B class: monthly, target > 99.5%
C class (long tail): quarterly, target > 98%

You demand a monthly cycle-count report per ABC class. If your 3PL reports “99.7% overall” but no split, you know the C class has a hole in it.
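To see why the split matters, here is a minimal sketch that computes the accuracy per ABC class from cycle-count records. The record layout `(sku, abc_class, wms_qty, counted_qty)` is an assumption; your 3PL's export will look different.

```python
from collections import defaultdict


def accuracy_by_class(count_records):
    """count_records: iterable of (sku, abc_class, wms_qty, counted_qty)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for _sku, abc_class, wms_qty, counted_qty in count_records:
        total[abc_class] += 1
        if wms_qty == counted_qty:
            correct[abc_class] += 1
    return {abc_class: correct[abc_class] / total[abc_class] for abc_class in total}


def overall_accuracy(count_records):
    records = list(count_records)
    return sum(1 for _s, _c, wms, counted in records if wms == counted) / len(records)


# Hypothetical counts: the fast movers are clean, the long tail is not.
records = [
    ("SKU-001", "A", 120, 120),
    ("SKU-002", "A", 80, 80),
    ("SKU-003", "A", 45, 45),
    ("SKU-004", "A", 60, 60),
    ("SKU-901", "C", 3, 3),
    ("SKU-902", "C", 7, 5),
    ("SKU-903", "C", 2, 0),
    ("SKU-904", "C", 1, 1),
]
print(overall_accuracy(records))   # 0.75
print(accuracy_by_class(records))  # {'A': 1.0, 'C': 0.5}
```

The single overall number hides exactly the class in which the deviations sit.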

What it costs if it misses

Phantom stock leads to out-of-stock sales you only notice when picking — then you have to cancel on the customer, which feeds back into your late-shipment and cancellation rate. Amazon FBM requires < 2.5% pre-fulfillment cancellation rate; anyone who breaches it risks the account. On the other side: ghost overstock you don't know about is dead capital. With 1,500 SKUs and 0.5% deviation, €30,000–80,000 of working capital is quickly in limbo.

5. Damage rate (inbound and outbound)

Definition

Share of parcels that arrive at the end customer with physical damage — split by carrier damage (transport) and warehouse damage (pack error, under-securing).

Benchmark DACH 2026

Target < 0.3% for standard e-commerce
DACH reality is 0.5–1.0% depending on carton mix and carrier
Industry-wide e-commerce damage rates are 1–3%; in DACH typically at the lower end, because the carrier network is denser and hubs are younger

What your 3PL likes to show you

Only the carrier damages, i.e. cases where the carrier scan documented damage. Pack damages (“glass in bubble wrap instead of box-in-box”) are usually not captured at all, because they come in as an end-customer complaint and are handled in customer service without 3PL feedback.

How to really measure it

Bring two data sources together:

Inbound inspection at goods-in (your own photo-documentation process)
End-customer complaints with a photo, categorised by “box intact, contents broken” (= pack error) vs. “box crushed” (= carrier)

The ratio of end-customer damage cases to shipped orders is your real damage rate. The carrier share and the 3PL share only become visible after categorisation.
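A short sketch of that categorisation, assuming each complaint has already been tagged by hand from the photo as either `pack_error` (box intact, contents broken) or `carrier` (box crushed). The tag names and the function are invented for illustration.

```python
from collections import Counter


def damage_rates(shipped_orders, complaint_categories):
    """complaint_categories: one tag per damage complaint, 'pack_error' or 'carrier'."""
    if shipped_orders <= 0:
        raise ValueError("shipped_orders must be positive")
    counts = Counter(complaint_categories)
    return {
        "total": sum(counts.values()) / shipped_orders,
        "pack_error": counts["pack_error"] / shipped_orders,
        "carrier": counts["carrier"] / shipped_orders,
    }


# Hypothetical month: 10,000 shipped orders, 80 damage complaints with photo.
complaints = ["pack_error"] * 30 + ["carrier"] * 50
rates = damage_rates(10_000, complaints)
print({name: f"{value:.2%}" for name, value in rates.items()})
# {'total': '0.80%', 'pack_error': '0.30%', 'carrier': '0.50%'}
```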

What it costs if it misses

Each damage case costs €12–35 operationally (replacement shipment, disposal, service time). NPS effect: a broken product in the unboxing moment is the worst conceivable first impression of a DTC brand. For beauty/glass/premium goods, a 1% damage rate can add up to a six-figure loss per year through the NPS effect.

6. Return turnaround time (RTT)

Definition

Time from the physical arrival of the return in the warehouse to re-booking into sellable stock.

Benchmark DACH 2026

Standard: < 3 working days
Premium / technology-supported: < 1 working day, sometimes < 24 hours
Industry reality: 3–5 working days average at classic 3PLs

What your 3PL likes to show you

Only the “completed” returns — i.e. those that have been through the full inspection and refurbish process. What's not shown: returns stuck in a triage backlog because no clear rule is in place (e.g. “damaged goods: back to brand or disposal?”). These hang in limbo for weeks.

How to really measure it

RTT = timestamp re-booking in WMS
      - timestamp goods-in returns acceptance

You want the p90 RTT, not the mean. The p90 shows you when 9 out of 10 returns are back in stock.
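A minimal sketch of the RTT formula and the p90, under stated assumptions: the timestamp pairs are hypothetical, the percentile uses the simple nearest-rank method, and the result is in calendar days (the benchmarks above are in working days, so a real report would need a working-day calendar).

```python
from datetime import datetime, timedelta


def rtt_days(received_at, rebooked_at):
    """RTT = re-booking timestamp minus returns goods-in timestamp, in days."""
    return (rebooked_at - received_at).total_seconds() / 86400


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct % at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


# Hypothetical sample: ten returns, two of them stuck in triage.
received = datetime(2026, 3, 2, 8, 0)
returns = [(received, received + timedelta(days=d))
           for d in (1, 1, 1, 1, 2, 2, 2, 3, 9, 21)]

rtts = [rtt_days(received_at, rebooked_at) for received_at, rebooked_at in returns]
mean_rtt = sum(rtts) / len(rtts)
p50_rtt = percentile(rtts, 50)
p90_rtt = percentile(rtts, 90)
print(mean_rtt, p50_rtt, p90_rtt)  # 4.3 2.0 9.0
```

Returns that are never re-booked have no end timestamp and drop out of this calculation entirely, which is the "completed returns only" trick described above; count them separately as open backlog.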

What it costs if it misses

Every return day in limbo is a day of out-of-stock risk. For a brand with a 25% returns rate and 5 days RTT instead of 1, a permanent 15–20% of your stock is “in processing” — capital that isn't selling. On €800,000 of stock, that's €120,000–160,000 tied up in working capital without you seeing it.

7. Carrier on-time delivery (OTD) and OTIF

Definition

OTD: share of shipments that arrive at the end customer within the service promise the carrier committed to.

OTIF (on-time in-full): the stricter B2B variant — on time and complete.

Benchmark DACH 2026

DHL Paket Germany: standard transit 1–2 working days, OTD rate 92–96%, depending on region and season
GLS in European tests: ~93% on-time
DPD in comparable tests: lower, depending on source 82–90%
OTIF B2B benchmark: 95–98% expected in grocery and retail; Walmart demands 98% and sanctions shortfalls with a 3% COGS penalty

What your 3PL likes to show you

“OTD 96%” — based on a definition in which “on-time” includes 24 hours of tolerance on the committed day. With this definition, Wednesday-promised-Thursday-delivered becomes “on-time”. From the end customer's perspective that's rubbish.

How to really measure it

Wire the carrier tracking API into your OMS. You compare the committed delivery day (per the carrier's service product, not an “estimate”) against the actual first delivery attempt. A strict definition. Plus: a breakdown by postcode cluster, because the urban-rural spread in DACH is up to 8 percentage points.
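The strict definition fits in a few lines once the tracking data is in your OMS. The sketch below assumes a hypothetical record layout `(postcode, committed_day, first_attempt_day)` and clusters by the first two postcode digits; both are illustrative choices, not a carrier API.

```python
from collections import defaultdict
from datetime import date


def strict_otd_by_cluster(shipments):
    """shipments: iterable of (postcode, committed_day, first_attempt_day or None).

    On time means: first delivery attempt on or before the committed day.
    No tolerance day; a shipment without any attempt counts as late.
    """
    total = defaultdict(int)
    on_time = defaultdict(int)
    for postcode, committed_day, first_attempt_day in shipments:
        cluster = postcode[:2]
        total[cluster] += 1
        if first_attempt_day is not None and first_attempt_day <= committed_day:
            on_time[cluster] += 1
    return {cluster: on_time[cluster] / total[cluster] for cluster in total}


shipments = [
    ("10115", date(2026, 3, 4), date(2026, 3, 4)),  # on the committed day
    ("10117", date(2026, 3, 4), date(2026, 3, 5)),  # one day late: not on time
    ("94032", date(2026, 3, 4), date(2026, 3, 3)),  # early counts as on time
    ("94034", date(2026, 3, 4), None),              # no delivery attempt yet
]
otd = strict_otd_by_cluster(shipments)
print(otd)  # {'10': 0.5, '94': 0.5}
```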

What it costs if it misses

The Amazon FBM OTDR minimum threshold is 90% (recommendation 95%, Buy-Box cutoff ~97%); Seller-Fulfilled Prime requires 93.5% with a stricter promise. Anyone who breaches it drops out of Prime status, which can mean a 25–40% revenue drop on Amazon. In B2B wholesale, an OTIF miss costs up to 3% COGS in penalties (see the Walmart standard, increasingly adopted in Europe by Metro, Rewe, Edeka).

8. Cost per order (CPO) / cost per parcel (CPP)

Definition

Full cost per shipped order, including all components: pick, pack, storage, surcharges, material, carrier, returns, IT onboarding fees, minimum-quantity penalties, stocktake costs.

CPO = (Σ all 3PL and carrier invoices in a month)
      / number of outbound orders in the same month

Benchmark DACH 2026

A typical range of €6.50–11.00 per order for standard goods at medium volumes (10,000–50,000 parcels/month). Details in the 3PL cost comparison DACH 2026. Below €6 is usually only achievable with full automation, low surcharges or a non-DACH location. Above €11 you either have small volume, a heavy assortment or a premium 3PL.

What your 3PL likes to show you

The “base price per pick” — i.e. €0.90. Looks cheap. What's missing: storage, labelling, stretch film, multi-item surcharge, returns handling, IT monthly, minimum-volume billing, peak surcharges, carrier surcharges (fuel, toll, island, metro). In total, that makes the real CPO 60–120% higher than the shop-window price.

How to really measure it

Monthly reconciliation. You take:

All 3PL invoices for the month
All carrier invoices for the same period (caution: cut-off offset; carriers often bill mid-to-mid)
Consumables (cartons, tape, filling — either via 3PL or own purchase)
IT/onboarding fees apportioned over 12 months

Divide by the number of outbound orders. That's your CPO. Everything else is marketing material.
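The monthly reconciliation as a minimal sketch. All figures are invented, and the function assumes you have already aligned the carrier invoices to the calendar month.

```python
def cost_per_order(three_pl_invoices, carrier_invoices, consumables,
                   it_onboarding_annual, outbound_orders):
    """Full monthly cost divided by outbound orders of the same month.

    it_onboarding_annual is apportioned over 12 months, as in the list above.
    """
    if outbound_orders <= 0:
        raise ValueError("outbound_orders must be positive")
    total = (
        sum(three_pl_invoices)
        + sum(carrier_invoices)
        + consumables
        + it_onboarding_annual / 12
    )
    return total / outbound_orders


# Hypothetical month with 15,000 outbound orders.
cpo = cost_per_order(
    three_pl_invoices=[48_000.00, 6_500.00],  # pick & pack, storage and surcharges
    carrier_invoices=[71_000.00],
    consumables=4_200.00,
    it_onboarding_annual=6_000.00,
    outbound_orders=15_000,
)
print(f"{cpo:.2f}")  # 8.68
```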

What it costs if it misses

The only number that counts for your P&L. If your CPO is €0.80 above benchmark, on 150,000 orders that's €120,000 of EBIT loss per year — directly, with no detour. You finance your 3PL's inefficiency out of your margin.

Bonus: KPIs your 3PL doesn't show voluntarily

Four KPIs that appear in no standard report, because they invite uncomfortable questions for the 3PL. You should demand them anyway:

First-pass yield (FPY). Share of orders shipped directly without a re-pick, re-pack or correction run. Target > 98%. Low FPY values show you process gaps not yet visible in pick accuracy — because they're corrected internally.
Storage utilisation per warehouse type. How full are your bin sizes? If 70% of your SKUs sit in “L” bins but only need 30% volume, you permanently pay 40% over-dimensioning.
KAM response time (escalation hotline). Hours between email-to-KAM and a qualified reply. The standard should be < 4h. Reality: often 24–48h. This KPI tells you how seriously you're taken as a customer.
Order-to-ship cycle-time distribution. Not the mean, but the distribution (p50, p90, p99). The mean lies: if 95% go out in 2h and 5% in 36h, the mean looks harmless — but it's exactly those 5% that are your customer-service tickets.
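The cycle-time example from the last point can be checked directly. A minimal sketch using the numbers from the text (95% of orders in 2h, 5% in 36h) and a simple nearest-rank percentile; the helper name is an assumption.

```python
def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct % at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


# 100 orders: 95 ship after 2 hours, 5 ship after 36 hours.
cycle_times_h = [2] * 95 + [36] * 5

mean_h = sum(cycle_times_h) / len(cycle_times_h)
print(mean_h)                         # 3.7
print(percentile(cycle_times_h, 50))  # 2
print(percentile(cycle_times_h, 90))  # 2
print(percentile(cycle_times_h, 99))  # 36
```

A mean of 3.7 hours looks harmless, and here even the p90 still does. Only the p99 exposes the slow 5%, which is why you ask for all three percentiles rather than one.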

How to enforce the KPIs

Four levers — no discussion, no “best effort”:

A KPI annex in the contract. Each KPI with a threshold, measurement method, data source and measurement frequency. Not “pick accuracy high”, but “pick accuracy measured by outbound sample audit, > 99.5%, reported monthly”.
Penalties from the first shortfall. The standard is 5–15% of monthly pick & pack volume on a shortfall of > 1 threshold. Important: penalties must be net, i.e. not offset by the 3PL's surcharge over-earnings.
Monthly reporting instead of quarterly. Quarterly, you only see problems 90 days after the event. Monthly, you can intervene before it becomes chronic.
Dashboard access instead of a PDF. PDFs are dressed-up snapshots. You want read-only access to the WMS reporting layer or a live dashboard (Looker, Power BI). Anyone who denies it has something to hide.

Conclusion: the real KPIs change the conversation

If you write these 8 (plus 4 bonus) into your contract and hold them against benchmarks monthly, the balance of power between you and your 3PL shifts. You're no longer the customer who accepts a quarterly PDF and nods — you're the operational authority that knows what it measures.

That may sound like hard work. It is. But between a 3PL that delivers its numbers and one that lulls you with green traffic lights, there are six-figure EBIT amounts at the end of the year.

Need someone to read the numbers with you? I do quarterly reviews of your KPIs in the sparring retainer — we take apart every monthly report, identify the glossed-over spots and write you the questions your KAM doesn't want to answer.

If there's a suspicion your 3PL structurally underperforms, the fulfillment audit is the right lever: 4 weeks, deep data analysis, a clear finding with negotiation ammunition.
