YieldOpsAcademy
Zone 00 · Clean Room

The Wafer's Journey

Follow Wafer #3301 through final inspection. A supervised classifier just passed it. It should not have. Master Isolation Forest through the one failure mode that supervised models cannot see: the defect that has never happened before.

Today's Subject
Anomaly Detection
Unsupervised discovery of novel killer defects without labels

The Vocabulary Problem

Your supervised classifier was trained on three years of historical wafer data. It learned every known failure mode: gate oxide breakdown, copper hillock formation, lithography misalignment, etch non-uniformity. It is very good at recognizing the past. Wafer #3301 just passed inspection with 97.3% confidence. The downstream electrical test will tell a different story in four hours. The problem is not the model. The problem is that a robotic handler developed a micro-vibration pattern last Tuesday and no wafer has ever failed that way before. The classifier has no vocabulary for it. Isolation Forest does not need vocabulary. It only needs to know that Wafer #3301 looks different.

01
Build a Forest of Random Partitions
Construct 100 isolation trees, each typically grown on a random subsample of the wafers. At every node, the tree picks a random feature and a random split value within that feature's range, then recursively partitions the data until points are isolated or a depth limit is reached. No labels, no target variable. The trees learn nothing about defects. They only learn the shape of the data.
02
Measure Isolation Depth
For each wafer, record how many splits each tree needs to isolate it into its own partition. Normal wafers sit in dense clusters and require many splits to separate. Anomalies sit alone in sparse regions and isolate in very few splits.
03
Compute the Anomaly Score
Average the path length across all trees and normalize it against the average path length expected for a sample of that size. Scores near 1.0 mean very short paths: the point isolated easily and is likely anomalous. Scores around 0.5 or below mean the point took average or greater effort to isolate: normal behavior.
04
Flag and Route for Physical Inspection
Wafers above the score threshold are routed to metrology for physical inspection rather than continuing to packaging. The physical test result becomes the first label for this failure mode, expanding the fab's defect vocabulary for the next supervised model retrain.
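
The normalization in step 03 can be written out explicitly. The page does not give a formula; what follows is the standard formulation from the original Isolation Forest paper (Liu, Ting, and Zhou), where h(x) is the path length of wafer x in one tree, n is the number of points each tree was built on, and c(n) is the average path length of an unsuccessful binary-search-tree lookup over n points:

```latex
s(x, n) = 2^{-\frac{E[h(x)]}{c(n)}},
\qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n},
\qquad
H(i) \approx \ln i + 0.5772

% Limiting cases:
%   E[h(x)] \to 0     \Rightarrow s \to 1    (isolated almost immediately)
%   E[h(x)] = c(n)    \Rightarrow s = 0.5    (average isolation effort)
%   E[h(x)] \to n - 1 \Rightarrow s \to 0    (very hard to isolate)
```

As an illustration with assumed numbers: for trees built on n = 256 wafers, c(256) ≈ 10.24, so a wafer isolated after an average of 3 splits scores 2^(-3/10.24) ≈ 0.82.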

Isolation Forest does not define normal. It does not define defective. It only identifies points that the data itself treats as outliers by virtue of how easily they are partitioned away from everything else.
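
The four steps can be sketched in plain Python with no external libraries. This is a minimal illustration, not the course's or any library's implementation: the two wafer features, the synthetic data, the subsample size of 256, and the 0.6 routing threshold are all assumptions chosen for the example.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Step 01: recursively partition with a random feature and split value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}              # leaf
    f = rng.randrange(len(points[0]))             # random feature
    lo = min(p[f] for p in points)
    hi = max(p[f] for p in points)
    if lo == hi:                                  # simplification: stop here
        return {"size": len(points)}
    split = rng.uniform(lo, hi)                   # random value in range
    left = [p for p in points if p[f] < split]
    right = [p for p in points if p[f] >= split]
    return {"feature": f, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(tree, x, depth=0):
    """Step 02: splits needed to reach a leaf, plus c(size) for unsplit leaves."""
    if "feature" not in tree:
        return depth + c(tree["size"])
    branch = "left" if x[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi


def anomaly_score(trees, psi, x):
    """Step 03: normalized score; near 1.0 is anomalous, about 0.5 or less is normal."""
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c(psi))


# Synthetic data: 500 ordinary wafers described by two hypothetical
# standardized measurements, plus Wafer #3301 far from the cluster.
data_rng = random.Random(42)
wafers = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(500)]
wafer_ids = list(range(1, 501))
wafers.append((8.0, -8.0))
wafer_ids.append(3301)

trees, psi = fit_forest(wafers, n_trees=100, sample_size=256, seed=0)
scores = {wid: anomaly_score(trees, psi, x)
          for wid, x in zip(wafer_ids, wafers)}

# Step 04: route wafers above an illustrative threshold to metrology.
THRESHOLD = 0.6
flagged = sorted(wid for wid, s in scores.items() if s > THRESHOLD)
print("Wafer 3301 score:", round(scores[3301], 3))
print("Routed to metrology:", flagged)
```

No label appears anywhere in the sketch: the forest sees only feature values. In practice the threshold would be tuned against the fab's inspection capacity, since a lower cutoff also routes the outermost ordinary wafers to metrology.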

Continue the journey

Zones 01 through 04 cover the problem scenario, algorithm analysis, alternative comparisons, interview gauntlet, and production checklist for this journey.
