Why Rule-Based Product Recommendations Break Down at 10,000 SKUs

By Brianna Okafor

Every DTC brand starts with rules. "Show best-sellers first." "Pin new arrivals to the top row." "If a shopper browsed outerwear, recommend outerwear." These rules feel like common sense, and at catalog sizes under a few hundred SKUs, they produce reasonable results. Merchandising teams can review the outputs, tweak the priority order, and feel like they're in control.

Then the catalog grows. A DTC apparel brand goes from 400 SKUs to 4,000 as it expands into accessories and home goods. A beauty brand hits 10,000 active variants once you factor in shades, sizes, and kit configurations. The rules don't change, but the catalog they're meant to organize has grown by an order of magnitude. That's when the logic starts to decay.

What Rules Actually Do (And What They Don't)

A merchandising rule is a hard-coded if-then statement applied uniformly across your catalog. "If category = skincare, show top 8 by revenue." "If in-stock quantity < 5, push to the back of the grid." These rules are fast to write, easy to audit, and completely indifferent to individual shoppers.
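
To make that concrete, here is a minimal sketch of the two example rules above in Python. The Product fields, the function names, and the sample catalog are all hypothetical, chosen only for illustration. Notice that no function takes a shopper as an argument.

```python
from dataclasses import dataclass


@dataclass
class Product:
    sku: str
    category: str
    revenue: float
    stock: int


def top_by_revenue(catalog, category, n=8):
    """Rule: if category matches, show the top n products by revenue."""
    matches = [p for p in catalog if p.category == category]
    return sorted(matches, key=lambda p: p.revenue, reverse=True)[:n]


def demote_low_stock(products, threshold=5):
    """Rule: push items with stock below the threshold to the back of the grid."""
    # sorted() is stable, so the revenue order is preserved within each group.
    return sorted(products, key=lambda p: p.stock < threshold)


# Hypothetical catalog, for illustration only.
catalog = [
    Product("serum-01", "skincare", 9000.0, 40),
    Product("cream-02", "skincare", 7000.0, 3),
    Product("toner-03", "skincare", 5000.0, 25),
    Product("mug-04", "home", 8000.0, 12),
]

grid = demote_low_stock(top_by_revenue(catalog, "skincare"))
print([p.sku for p in grid])  # ['serum-01', 'toner-03', 'cream-02']
```

Every visitor to the skincare page gets this same grid, whatever they did earlier in the session.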

That last point is the one that matters. Rules operate on catalog attributes — category, stock level, margin, launch date — not on the person browsing. When your catalog is small enough that your bestsellers are genuinely appealing to most of your shoppers, this works fine. The bestseller overlap with individual preference is high. But as your catalog deepens and diversifies, the bestseller list becomes a worse and worse approximation of what any given shopper actually wants.

A brand with 10,000 SKUs might have 40 distinct product categories and thousands of sub-niches. The shopper who came from a Pinterest pin about minimalist kitchen organization has almost nothing in common with the shopper who arrived from a Reddit thread about camping gear — even if they're both browsing the same homepage. Your rules show them both the same "top sellers" grid. One of them is probably going to bounce.

The Coverage Problem

Here's a number that surprises most merchandising teams when they first look at it: at 10,000 SKUs, the share of your catalog that ever appears in a rule-driven recommendation slot is typically under 15%. The rules promote the same 200-300 products over and over, which is 2-3% of the catalog. The remaining 9,700 or more SKUs essentially don't exist from the recommendation layer's perspective.
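
If you want to check this against your own store, coverage is simple to measure: count the distinct SKUs that appeared in any recommendation slot and divide by catalog size. The sketch below assumes you can export a log of which SKUs filled each grid. The log here is synthetic, and the function name is hypothetical.

```python
def catalog_coverage(recommendation_log, catalog_size):
    """Fraction of the catalog that appeared in at least one recommendation slot."""
    shown = set()
    for slots in recommendation_log:
        shown.update(slots)
    return len(shown) / catalog_size


# Synthetic log (an assumption for illustration): 1,000 page views with
# 8 slots each, where the rules only ever rotate through the same 250 SKUs.
catalog_size = 10_000
log = [[f"sku-{(i * 8 + j) % 250}" for j in range(8)] for i in range(1_000)]

coverage = catalog_coverage(log, catalog_size)
print(f"{coverage:.1%}")  # 2.5%
```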

This creates a coverage collapse. Products in the long tail (the niche, specific items that may convert exceptionally well for a small subset of highly motivated shoppers) are invisible. The customer who would have purchased your limited-edition cast-iron skillet never sees it, because they browsed the "kitchen" category once and your rules surface only the top 8 by revenue, which are all basic utility items.

Worse, coverage collapse compounds over time. Products that don't get recommended don't accumulate sales velocity. Products without sales velocity don't move up in revenue-ranked rules. They become permanently stranded — not because shoppers don't want them, but because the rule system never gave them an opportunity to generate signal.
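
The compounding is easy to see in a toy simulation. The sketch below makes a deliberately generous assumption: every SKU sells equally well whenever it is shown. The starting sales figures and the function name are hypothetical. Even so, a sales-ranked rule with two slots never gives the other SKUs a single exposure.

```python
def simulate_sales_ranking(initial_sales, slots, rounds, sales_per_exposure=1):
    """Repeatedly show the top `slots` SKUs by cumulative sales."""
    sales = dict(initial_sales)
    exposures = {sku: 0 for sku in sales}
    for _ in range(rounds):
        # Rule: rank by cumulative sales and fill the available slots.
        shown = sorted(sales, key=sales.get, reverse=True)[:slots]
        for sku in shown:
            exposures[sku] += 1
            # Assumption: every exposed SKU converts at the same rate.
            sales[sku] += sales_per_exposure
    return sales, exposures


# Hypothetical starting point: small early differences in sales.
initial = {"a": 3, "b": 2, "c": 1, "d": 1, "e": 0}
sales, exposures = simulate_sales_ranking(initial, slots=2, rounds=50)

print(exposures)  # {'a': 50, 'b': 50, 'c': 0, 'd': 0, 'e': 0}
print(sales)      # {'a': 53, 'b': 52, 'c': 1, 'd': 1, 'e': 0}
```

The early leaders keep their slots forever, and the stranded SKUs end with exactly the sales they started with.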

The Maintenance Burden That Nobody Accounts For

Rules also have an invisible cost: they require ongoing human maintenance to stay accurate. Seasonal rules need to be swapped as inventory changes. New category expansions need new rules written. When a product's affinity profile shifts — say, a cooler that was mostly camping-adjacent starts getting purchased by tailgaters — the rules don't know. Someone has to notice, then update them.

In practice, rules don't get maintained at the frequency the catalog demands. We've seen merchandising teams running on rules written eight to twelve months prior, pointing to collections that have been restructured, categories that no longer exist, or products that have sold out. The lag between catalog reality and rule logic creates a quiet failure mode — recommendations that technically "work" but are serving stale signals as though they're current.

We're not saying rules are inherently bad. For high-level business constraints — "never show out-of-stock items," "always pin the current campaign's hero products to positions 1-3" — rules are exactly right. The problem is when rules try to do the job of personalization: inferring what an individual shopper wants based on their current session context. That's not what rules were designed for.

Where Scale-Specific Failure Shows Up in the Data

When we look at product grid CTR data from DTC stores that have grown their catalogs significantly, a few patterns emerge consistently.

First, position bias becomes extreme. The top-left and top-center tiles in a rule-based grid collect disproportionate click share — not because those products are most relevant, but because the rule always puts the same products there. A shopper who's seen that top tile three times already has a lower probability of clicking it on the fourth visit, but the rule doesn't know they've seen it before.

Second, new product failure rates spike. A new SKU entering the catalog has no sales history, so revenue-ranked rules push it to the back. It never gets seen by enough shoppers to build velocity. The product gets marked as a slow-mover and eventually discounted or discontinued — but the underlying reason it failed was that the recommendation layer systematically excluded it during the period when it needed exposure most.

Third, cross-category discovery drops off. When a shopper's session history suggests they're open to a category adjacent to their primary interest, rules have no mechanism to surface it. The affinity between buying minimalist kitchenware and responding well to a storage and organization collection exists in the behavioral data, but rules can't read behavioral data. They read catalog attributes.
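
One simple way to see that kind of affinity in behavioral data is to count how often two categories show up in the same session. This is a rough sketch under assumptions, not a production method: the session data is invented, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def category_cooccurrence(sessions):
    """Count how many sessions touched each pair of categories."""
    counts = Counter()
    for categories in sessions:
        # De-duplicate within a session so repeat views count once.
        for pair in combinations(sorted(set(categories)), 2):
            counts[pair] += 1
    return counts


# Invented sessions, each listing the categories a shopper viewed.
sessions = [
    ["kitchenware", "storage"],
    ["kitchenware", "storage", "kitchenware"],
    ["camping", "coolers"],
    ["kitchenware", "lighting"],
]

pairs = category_cooccurrence(sessions)
print(pairs.most_common(1))  # [(('kitchenware', 'storage'), 2)]
```

Nothing in the catalog attributes links kitchenware to storage. The link only appears once you look at what shoppers did.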

The Cold Start Isn't the Only Problem

Most discussion about recommendation system limitations focuses on cold start — the inability to personalize for new visitors with no history. Cold start is real, but it's not the primary failure mode of rules at scale. Rules fail even for returning customers.

Consider a shopper who has placed three orders over 18 months. They have a clear preference vector — moderate price range, specific aesthetic, strong affinity for one brand sub-line. But if your rule for returning customers is "show items in categories they've purchased before," you're ignoring everything they've browsed and hovered over but not yet bought. You're also ignoring the signal in their dwell time, their scroll patterns across the grid, the sequence in which they moved through your catalog in the last session.

Purchase history is the loudest signal but also the slowest-updating. A shopper can change their mind about what they want this week versus last quarter, and their session micro-behaviors will show that shift long before they make another purchase. Rules that rely only on purchase history are working with a months-old snapshot of a person who may have meaningfully different preferences today.
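
One common way to let recent behavior outweigh an old purchase is to decay each signal by its age. The sketch below uses exponential decay with a seven-day half-life. The event weights, the half-life, and the function name are all assumptions for illustration, not recommended values.

```python
from collections import defaultdict


def category_affinity(events, today, half_life_days=7.0):
    """Sum event weights per category, halving each weight every half-life."""
    scores = defaultdict(float)
    for category, weight, day in events:
        age_days = today - day
        scores[category] += weight * 0.5 ** (age_days / half_life_days)
    return dict(scores)


# Hypothetical events as (category, weight, day). A purchase is weighted
# five times as heavily as a product view, but it happened 90 days ago.
events = [
    ("outerwear", 5.0, 0),   # purchase, 90 days before today
    ("home", 1.0, 89),       # product view, yesterday
    ("home", 1.0, 90),       # product view, today
    ("home", 1.0, 90),       # product view, today
]

scores = category_affinity(events, today=90)
top = max(scores, key=scores.get)
print(top)  # home
```

A rule keyed on purchase history alone would still be recommending outerwear to this shopper.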

What Comes After Rules

The transition from rule-based merchandising to session-aware personalization isn't about replacing rules entirely. It's about being honest about what rules can and can't do. Rules handle constraints and business logic well. They don't handle individual relevance inference well, and the gap widens as the catalog grows.

At the point where your catalog has more SKUs than your merchandising team can manually evaluate, some form of automated relevance modeling becomes necessary — not as a philosophical upgrade, but as a practical operational response to a catalog your rules can't meaningfully index.

The brands that make this transition successfully tend to keep their high-level business rules intact — pinned promotions, out-of-stock suppression, margin floors — while replacing the middle and bottom layers of the recommendation stack with behavioral scoring. The rules handle the "what we won't show," and the model handles "what we think this person wants to see right now."
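
That layering can be sketched in a few lines. Here the rules filter and pin, and a score from some behavioral model fills the remaining slots. The model_scores dictionary stands in for whatever model you use. It, the catalog fields, and the function name are hypothetical.

```python
def recommend(catalog, pinned, model_scores, slots, min_margin=0.0):
    """Apply business rules first, then fill remaining slots by model score."""
    # Rules layer: out-of-stock suppression and the margin floor.
    eligible = {
        sku: p
        for sku, p in catalog.items()
        if p["stock"] > 0 and p["margin"] >= min_margin
    }
    # Rules layer: pinned promotions take the first positions.
    grid = [sku for sku in pinned if sku in eligible][:slots]
    # Model layer: rank everything else by per-shopper score.
    remaining = [sku for sku in eligible if sku not in grid]
    remaining.sort(key=lambda sku: model_scores.get(sku, 0.0), reverse=True)
    return grid + remaining[: slots - len(grid)]


# Hypothetical catalog and per-shopper scores, for illustration only.
catalog = {
    "hero": {"stock": 10, "margin": 0.40},
    "skillet": {"stock": 4, "margin": 0.50},
    "spatula": {"stock": 0, "margin": 0.60},  # out of stock
    "rack": {"stock": 20, "margin": 0.10},    # below the margin floor
    "bins": {"stock": 7, "margin": 0.35},
}
model_scores = {"hero": 0.1, "skillet": 0.9, "spatula": 0.95, "rack": 0.8, "bins": 0.6}

grid = recommend(catalog, pinned=["hero"], model_scores=model_scores,
                 slots=3, min_margin=0.2)
print(grid)  # ['hero', 'skillet', 'bins']
```

The highest-scoring item, the spatula, never appears because a rule says it can't. The hero product leads despite a low score because a rule says it must. The model decides only what falls between those constraints.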

That separation of concerns is what makes catalog growth feel manageable rather than chaotic. Your 10,000 SKUs are no longer a problem to be governed by increasingly complex rule trees. They're a resource — a deep inventory of relevant products waiting to be matched to the right shopper at the right moment.