A cause acting at one cross-hierarchy intersection (one region's slice of one product line) isn't even a visible object in most analytics tooling. FactorPrism® makes it visible, located, and reconciled; this study measures whether its answers are right. We ran the engine blind 280 times against known injected causes and report recall and precision in the open, including where performance degrades. To our knowledge, no other vendor in this category publishes accuracy against ground truth.
Suppose margin fell because of something specific to Northeast Outerwear: one region's slice of one product line. In the tooling most teams actually use, that location is not a row on any screen. Slice by region and the problem smears across everything Northeast sells; slice by product and it smears across every region's outerwear. The cross-hierarchy intersection where the cause actually acts, the thing you need to see, is structurally invisible to one-dimension-at-a-time analysis. Tree-based screening functions do touch such intersections, but they emit them as overlapping fragments with no location in your hierarchy and no reconciliation to the total change you are trying to explain.
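The smearing is easy to see with a toy grid. The sketch below uses made-up numbers: every region and product name other than Northeast and Outerwear, the flat baseline of 100, and the 20% drop are all illustrative assumptions, not figures from the study.

```python
# Hypothetical margin by (region, product line); all figures are illustrative.
regions = ["Northeast", "Southeast", "Midwest", "West"]
products = ["Outerwear", "Footwear", "Accessories", "Basics"]

before = {(r, p): 100.0 for r in regions for p in products}
after = dict(before)
after[("Northeast", "Outerwear")] = 80.0  # the only cell that changed: -20%


def pct_change(keys):
    """Percent change of the summed margin over the given cells."""
    b = sum(before[k] for k in keys)
    a = sum(after[k] for k in keys)
    return 100.0 * (a - b) / b


by_region = {r: pct_change([(r, p) for p in products]) for r in regions}
by_product = {p: pct_change([(r, p) for r in regions]) for p in products}
cell = pct_change([("Northeast", "Outerwear")])

print(by_region["Northeast"])   # -5.0
print(by_product["Outerwear"])  # -5.0
print(cell)                     # -20.0
```

Either one-dimensional view shows a diluted 5% decline spread over four cells; only the intersection shows the 20% drop where it actually happened.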
So the first question about any root cause tool is whether the answer is visible at all. And the second question follows immediately: when a tool does name a cause, how would you know if it's right? On real business data there is no answer key. The industry's track record here is poor, and documented: anomaly-detection platforms became famous for alert fatigue — hundreds of flagged segments, most of them noise — and automated insight tools surface "interesting" segments with no statement of accuracy at all.
This study answers both questions the only honest way: generate business data where the true causes are injected by construction — at known locations across the hierarchy lattice, including exactly the cross-hierarchy intersections ordinary tooling can't render — then run the production FactorPrism® engine against it blind, and score whether it finds each cause and the place where it acts. We also do something vendors rarely do: we report where performance degrades, and we publish the comparison against the naive method most tools (and most analysts) actually use.
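As a rough illustration of what "injected by construction" can mean, here is a minimal sketch of a two-period generator with one cause placed at a known intersection. It is not the study's generator: the function name `make_dataset`, the baseline range, the Gaussian noise model, and the single-cell injection are all assumptions made for the example (the study also injects causes at other levels of the hierarchy).

```python
import random


def make_dataset(noise_sd, effect, target, regions, products, seed=0):
    """Return (before, after, truth) for a two-period region x product grid.

    noise_sd : std dev of random period-over-period movement (0.01 = 1%)
    effect   : injected relative change at `target` (-0.05 = a 5% drop)
    target   : the (region, product) cell where the cause acts
    """
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    before, after = {}, {}
    for r in regions:
        for p in products:
            base = rng.uniform(50.0, 150.0)    # arbitrary baseline value
            drift = rng.gauss(0.0, noise_sd)   # noise applied to every cell
            shock = effect if (r, p) == target else 0.0
            before[(r, p)] = base
            after[(r, p)] = base * (1.0 + drift) * (1.0 + shock)
    # The answer key: where the cause acts, which way, and its noise-free size.
    truth = {
        "location": target,
        "direction": "down" if effect < 0 else "up",
        "impact": before[target] * effect,
    }
    return before, after, truth


before, after, truth = make_dataset(
    noise_sd=0.01,
    effect=-0.05,
    target=("Northeast", "Outerwear"),
    regions=["Northeast", "West"],
    products=["Outerwear", "Basics"],
)
```

Because `truth` is recorded at generation time, any tool's output can later be scored against it without judgment calls.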
A factor counts as a hit only if it names the injected location and the correct direction. Recall is the share of injected causes recovered; recall@top-5 is the share of injected causes ranked in the top five returned factors, where a reader will actually look; precision is the share of returned factors that correspond to a real injected cause; magnitude error is how far the estimated impact is from the injected one, expressed as a percentage of the injected impact.
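These definitions can be sketched in a few lines. The dictionary layout, the function names `is_hit` and `score`, and the choice to match each cause to its highest-ranked hit are assumptions for illustration; the study's own scoring harness may differ in such details.

```python
from statistics import median


def is_hit(factor, cause):
    """A hit must name the injected location and the correct direction."""
    return (factor["location"] == cause["location"]
            and factor["direction"] == cause["direction"])


def score(injected, returned, k=5):
    """Score a ranked factor list (best first) against the injected causes."""
    hits = []  # (cause, rank of its first matching factor, that factor)
    for cause in injected:
        for rank, factor in enumerate(returned, start=1):
            if is_hit(factor, cause):
                hits.append((cause, rank, factor))
                break
    n = len(injected)
    real = sum(1 for f in returned if any(is_hit(f, c) for c in injected))
    errors = [abs(f["impact"] - c["impact"]) / abs(c["impact"])
              for c, _, f in hits]
    return {
        "recall": len(hits) / n,
        "recall_at_k": sum(1 for _, rank, _ in hits if rank <= k) / n,
        "precision": real / len(returned) if returned else 0.0,
        "median_magnitude_error": median(errors) if errors else None,
    }


# Hypothetical answer key and tool output; "*" stands for "all" at that level.
injected = [
    {"location": ("Northeast", "Outerwear"), "direction": "down", "impact": -100.0},
    {"location": ("West", "*"), "direction": "up", "impact": 50.0},
]
returned = [
    {"location": ("Northeast", "Outerwear"), "direction": "down", "impact": -98.0},
    {"location": ("Midwest", "Footwear"), "direction": "down", "impact": -30.0},
    {"location": ("Southeast", "*"), "direction": "up", "impact": 20.0},
    {"location": ("West", "Basics"), "direction": "up", "impact": 15.0},
    {"location": ("*", "Accessories"), "direction": "down", "impact": -10.0},
    {"location": ("West", "*"), "direction": "up", "impact": 45.0},
]
result = score(injected, returned)
```

In this made-up case both causes are found, so recall is 1.0, but one sits at rank 6, so recall@top-5 is 0.5, and only two of the six returned factors are real, so precision is one third.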
| Period-over-period noise | Config | Factors returned (mean / max) | Recall | Recall@top-5 | Precision | Median magnitude error |
|---|---|---|---|---|---|---|
| 0.1% | FactorPrism® | 4.4 / 12 | 100% | 100% | 94% | 1.5% |
| 0.1% | magnitude-only | 6.8 / 52 | 100% | 100% | 94% | 1.4% |
| 0.5% | FactorPrism® | 5.5 / 12 | 100% | 100% | 80% | 2.0% |
| 0.5% | magnitude-only | 13.4 / 131 | 100% | 100% | 79% | 2.0% |
| 1% | FactorPrism® | 7.6 / 12 | 100% | 100% | 62% | 4.0% |
| 1% | magnitude-only | 27.9 / 175 | 100% | 100% | 48% | 4.2% |
| 2% | FactorPrism® | 10.0 / 12 | 98% | 95% | 43% | 8.2% |
| 2% | magnitude-only | 57.0 / 190 | 97% | 95% | 29% | 8.2% |
| 5% | FactorPrism® | 12.0 / 12 | 95% | 78% | 35% | 16.9% |
| 5% | magnitude-only | 106.2 / 226 | 97% | 77% | 21% | 23.1% |
| 10% | FactorPrism® | 12.0 / 12 | 73% | 58% | 30% | 36.1% |
| 10% | magnitude-only | 142.7 / 240 | 100% | 57% | 20% | 49.7% |
| 15% | FactorPrism® | 12.0 / 12 | 63% | 38% | 25% | 42.7% |
| 15% | magnitude-only | 159.1 / 249 | 100% | 40% | 19% | 55.8% |
1. In its operating regime, the engine is essentially exact.
At the noise levels typical of established business aggregates (≤0.5% random period-over-period movement per segment), FactorPrism® recovered 100% of injected causes in 40 of 40 datasets, ranked every one in the top 5, kept precision at 80–94%, and estimated impact with a median error of 2% or less of each cause's true contribution, while returning 4–6 factors on average and never more than 12. The answer isn't merely directionally right; it names the correct locations and gets the sizes right.
2. The factor list never floods.
Flooding is the category's documented failure mode, and avoiding it is the headline result. By 2% noise, magnitude-only thresholding returns 57 factors on average (peaking at 190): an alert flood in which the three real causes are buried among dozens of false ones. FactorPrism®'s significance gate returns at most 12, with comparable recall (98% vs 97%). The gate isn't discarding truth to look tidy; it's discarding segments whose movement carries no evidence of signal beyond their own volatility.
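To make the contrast concrete, here is a minimal sketch of the two filtering ideas. It is not FactorPrism®'s algorithm, which this document does not describe: the z-score test, the `z_min` cutoff, the 2% threshold, and the segment names are all assumptions for illustration. Only the cap of 12 is taken from the results table.

```python
def magnitude_only(changes, threshold=0.02):
    """Flag every segment whose absolute change meets a fixed threshold."""
    flagged = [s for s, c in changes.items() if abs(c) >= threshold]
    return sorted(flagged, key=lambda s: abs(changes[s]), reverse=True)


def volatility_gate(changes, volatility, z_min=3.0, max_factors=12):
    """Keep segments whose change is large relative to their own volatility.

    The z-score test and z_min are hypothetical stand-ins for a significance
    gate; max_factors mirrors the 12-factor cap seen in the results table.
    """
    scored = []
    for segment, change in changes.items():
        sd = volatility[segment]
        if sd > 0:
            z = abs(change) / sd
            if z >= z_min:
                scored.append((z, segment))
    scored.sort(reverse=True)  # strongest evidence first
    return [segment for _, segment in scored[:max_factors]]


# Hypothetical relative changes and historical volatility per segment.
changes = {
    "Northeast/Outerwear": -0.05,
    "West/Basics": 0.04,
    "Midwest/Footwear": -0.03,
    "Southeast/Accessories": 0.01,
}
volatility = {
    "Northeast/Outerwear": 0.005,
    "West/Basics": 0.03,
    "Midwest/Footwear": 0.02,
    "Southeast/Accessories": 0.01,
}

flood = magnitude_only(changes)
gated = volatility_gate(changes, volatility)
```

Here the fixed threshold flags three segments, two of which moved no more than they usually do, while the gate keeps only the one whose change is about ten times its normal movement.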
3. When data gets very noisy, no method can save a small effect — and honesty matters more than recall.
At 10–15% noise, a 5% cause sits at or below what two periods of data can mathematically distinguish. The magnitude-only arm still shows "100% recall" at 15% noise, but only because it returns ~160 factors; the truth is in the list the way a name is in a phone book (precision 19%, and recall@top-5 of 40%, no better than FactorPrism®'s 38%). FactorPrism®'s own recall falls to 73% and 63% at these noise levels. No tool should claim to work in this regime. What a tool should do there is hold the line at a bounded, ranked dozen candidates instead of handing a business user 250 alerts.
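The arithmetic behind that limit is short. The sketch below assumes the noise level is the standard deviation of a segment's random period-over-period movement, and takes the 5% cause size from the paragraph above; with only two periods there is a single observed change per segment, so nothing can be averaged to shrink the noise.

```python
# Size of the injected cause, as a relative change (taken from the text above).
effect = 0.05

# Noise levels from the results table, read here as standard deviations
# of period-over-period movement (an assumption about the study's setup).
noise_levels = [0.001, 0.005, 0.01, 0.02, 0.05, 0.10, 0.15]

# Signal-to-noise ratio: how many noise standard deviations the cause spans.
snr = {noise: effect / noise for noise in noise_levels}

for noise, ratio in snr.items():
    print(f"{noise:.1%} noise: a 5% cause is {ratio:.2f} standard deviations")
```

Under these assumptions the cause is about 50 standard deviations of noise at the low end of the table, 1 standard deviation at 5% noise, and a half to a third of one at 10–15%, where it cannot be told apart from a segment's ordinary movement.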
4. Ranking degrades before it breaks.
Even past the comfortable regime, recoverable causes stay near the top: at 5% noise — already very volatile for period-over-period segment data — 78% of injected causes ranked in the top 5 of a 12-item list.
Performance is governed by the period-over-period stability of your segments — not company size or row count. Quarterly revenue for an established business, aggregated to segment level, typically behaves like the left side of the table; daily data for tiny, sparse segments behaves like the right. FactorPrism®'s pre-flight checks and significance gating are built around exactly this: report what the data can support, and stay quiet past that line.
Two structural properties hold at every noise level, because they're guaranteed by construction rather than estimated: the returned factors always reconcile exactly to the total change being explained, and every factor is located — attached to the level of the business where it acts, separating broad-based forces from problems localized to one segment.
FactorPrism® is a Snowflake Native App: the analysis runs entirely inside your Snowflake account — your data never leaves. Install and run the built-in demo, no data connection required.
Get it on Snowflake Marketplace