Company Deep Dives
Every semiconductor company has different priorities. This part tailors your preparation to ASML, Lam Research, Applied Materials, Intel/TSMC/Samsung, and startups.
Field Manual Layer 4 covers the vendor-specific telemetry details that sharpen these answers: ASML TwinScan alignment array timing, Lam RF complex impedance, Applied Materials throttle valve and PVD target health, and TEL cell architecture. Read it before your company-specific prep sessions.
ASML: Lithography and Metrology
The question archetype: EUV stochastic defect modeling
import numpy as np

def stochastic_defect_model(
    dose_mj_cm2: float = 20.0,
    pixel_size_nm: float = 3.0,      # 3nm pixel (illustrative; node names are not physical dimensions)
    photon_energy_ev: float = 92.0,  # EUV at 13.5nm
) -> dict:
    """
    At 3nm-scale pixels, photon shot noise is a first-order effect.
    Photon count per pixel follows Poisson distribution.
    Variance = mean: low mean photons = high relative noise.
    Key insight: this is not process variation we can optimize away.
    It is fundamental quantum physics. We model it, then
    trade off dose (throughput) vs. defect probability.
    """
    joules_per_photon = photon_energy_ev * 1.602e-19
    dose_j_m2 = dose_mj_cm2 * 1e-3 / 1e-4  # mJ/cm2 -> J/m2
    pixel_area_m2 = (pixel_size_nm * 1e-9) ** 2
    mean_photons = (dose_j_m2 * pixel_area_m2) / joules_per_photon
    # P(zero photons) = P(defect) under Poisson model
    prob_zero = np.exp(-mean_photons)
    # Relative CD variation: sigma_CD / CD_mean ~ 1/sqrt(N)
    relative_cd_variation = 1.0 / np.sqrt(mean_photons)
    return {
        'mean_photons_per_pixel': mean_photons,
        'defect_probability': prob_zero,
        'relative_cd_variation': relative_cd_variation,
        'stochastic_regime': mean_photons < 100,
        'interpretation': (
            'Shot noise dominated: stochastic defects are significant'
            if mean_photons < 100 else 'Shot noise manageable'
        )
    }

result = stochastic_defect_model(pixel_size_nm=3.0)
# With these defaults: mean_photons ~ 122 incident photons per pixel,
# relative noise ~ 9%, P(zero) ~ 1e-53, and the flag reads 'Shot noise manageable'.
# Only a fraction of incident photons is absorbed in the resist, so absorbed counts are lower.
# A mean of 10-50 photons gives defect_prob ~ 1e-5 to 1e-22.
# This drives EUV dose optimization and stochastic OPC
ASML vocabulary
| Term | What It Means | Why It Matters |
|---|---|---|
| Overlay | Misalignment between successive lithography layers (x, y error in nm) | 1nm overlay error can short contacts; tightest spec in advanced fab; drives scanner matching effort |
| CDU | Critical Dimension Uniformity: variation in feature size across a wafer | Transistor performance spread; tighter CDU = tighter clock speed distribution |
| Stochastics | Random variations from discrete photon count and resist molecule statistics | Fundamental limit at small scales; must be modeled and traded off, not optimized away |
| Scanner matching | Making multiple scanners produce identical patterning results | Fleet capacity: any scanner must process any lot; matching error is overlay error |
| Reticle heating | Thermal expansion of mask during exposure from absorbed EUV power | Pattern distortion; needs physics-based prediction and correction before each lot |
| Dose/focus window | Range of acceptable dose and focus parameters for a given process | Smaller window = more sensitive to variation; EUV stochastics shrink this window |
| Aerial image | Light intensity pattern projected at the wafer plane, before resist chemistry and development | Simulated for OPC; data science predicts actual CD from aerial image model |
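The Stochastics and Dose/focus window rows can be made concrete with a dose sweep. The sketch below is illustrative only: it assumes the same simple Poisson model of incident photons per pixel used above, and the function name `photons_per_pixel` and the dose values are hypothetical choices, not scanner parameters.

```python
import math

EV_TO_J = 1.602e-19  # joules per electronvolt

def photons_per_pixel(dose_mj_cm2: float, pixel_nm: float = 3.0,
                      photon_ev: float = 92.0) -> float:
    """Mean incident photons per pixel (absorbed counts would be lower)."""
    dose_j_m2 = dose_mj_cm2 * 10.0        # 1 mJ/cm2 = 10 J/m2
    area_m2 = (pixel_nm * 1e-9) ** 2
    return dose_j_m2 * area_m2 / (photon_ev * EV_TO_J)

# Hypothetical dose values, chosen only to show the scaling
for dose in (5.0, 10.0, 20.0, 40.0):
    n = photons_per_pixel(dose)
    print(f"dose={dose:5.1f} mJ/cm2  mean_photons={n:7.1f}  "
          f"relative_noise={1 / math.sqrt(n):.3f}")
```

Because relative noise scales as 1/sqrt(N), doubling the dose reduces it by only a factor of sqrt(2), at the cost of longer exposure time. That is the trade-off an interviewer is usually probing for.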
ASML system design question
"Design a model to predict overlay error for a wafer before it enters the scanner. Inputs: previous layer metrology, scanner alignment measurements, wafer thermal history, chuck fingerprint."
class OverlayPredictionModel:
    """
    Predict overlay error (dx, dy) per exposure field before exposure.
    Multi-timescale inputs require hierarchical model:
    - Slow (wafer level): previous overlay, flatness
    - Medium (thermal): temperature, time since last exposure
    - Fast (field level): alignment marks, chuck state, lens heating
    """
    def engineer_features(self, wafer_record: dict) -> dict:
        features = {}
        # Physics: thermal expansion dL = alpha * L * dT
        # Silicon alpha = 2.6e-6 per K, 300mm wafer; mm * 1e6 converts to nm
        features['thermal_expansion_nm'] = (
            2.6e-6 * wafer_record['wafer_diameter_mm'] * 1e6 *
            (wafer_record['wafer_temperature_pre_load'] - 22.0)  # offset from 22.0 reference
        )
        # Chuck fingerprint: systematic offset per slot (calibrate once, apply always)
        features['chuck_slot_x'] = wafer_record['chuck_slot'] % 3
        features['chuck_slot_y'] = wafer_record['chuck_slot'] // 3
        # Lens heating: accumulated dose increases lens temperature, shifts focus
        features['lens_dose_accumulated'] = (
            wafer_record['lens_heating_accumulated'] * wafer_record['dose_mj_cm2']
        )
        # Previous layer residual: carry forward uncompensated overlay
        features['overlay_residual_x'] = (
            wafer_record['previous_layer_overlay_x'] - wafer_record['target_overlay_x']
        )
        return features

    def model_architecture(self) -> dict:
        # Wafer-level: GBT (slow features, interpretable)
        # Field-level correction: Ridge (fast, stable, linear in alignment marks)
        # Combined: y_total = y_wafer + y_field_correction
        return {
            'wafer_model': 'XGBRegressor',  # xgboost.XGBRegressor
            'field_model': 'Ridge',         # sklearn.linear_model.Ridge; must complete before exposure starts
            'combination': 'additive',
            'latency_ms': 100,              # Hard constraint: before wafer loads
        }
Lam Research: Etch and Deposition
The question archetype: multi-sensor endpoint detection
"Design real-time endpoint detection for a high-aspect-ratio contact etch (10:1 aspect ratio holes through oxide to silicon). OES has interference from polymer deposition. RF impedance is noisy from matching network tuning. Detect endpoint reliably in under 50ms."
from collections import deque
import numpy as np

class RobustEndpointDetector:
    """
    Multi-sensor endpoint detection for high-aspect-ratio etch.
    Single-sensor detection fails when that sensor has interference.
    OES has polymer deposition noise. RF impedance has matching network noise.
    Solution: weighted voting across 3 sensors, require 2-of-3 agreement,
    confirm across two consecutive evaluations to prevent false triggers.
    """
    def __init__(self, window_ms=50, sample_rate_hz=1000):
        samples = int(window_ms * sample_rate_hz / 1000)
        self.oes_si = deque(maxlen=samples)  # Si I emission near 288nm
        self.oes_o = deque(maxlen=samples)   # O I emission near 777nm
        self.rf = deque(maxlen=samples)
        self.imp = deque(maxlen=samples)
        self.rf_baseline = None              # seeded from the first full window (assumed mid-oxide etch)
        self.state = 'ETCHING_OXIDE'         # state machine

    def update(self, reading: dict) -> dict:
        if 'oes' in reading:
            self.oes_si.append(reading['oes'].get(288, 0))
            self.oes_o.append(reading['oes'].get(777, 0))
        if 'rf' in reading: self.rf.append(reading['rf'])
        if 'imp' in reading: self.imp.append(reading['imp'])
        # Wait until every buffer holds a full window (maxlen is already an int)
        buffers = (self.oes_si, self.oes_o, self.rf, self.imp)
        if any(len(buf) < buf.maxlen for buf in buffers):
            return {'endpoint': False, 'confidence': 0.0, 'reason': 'filling_buffer'}
        # Sensor quality checks (Archetype 6: frozen sensor)
        for name, buf in [('OES_Si', self.oes_si), ('RF', self.rf)]:
            if np.std(list(buf)[-10:]) < 1e-6:
                return {'endpoint': False, 'confidence': 0.0,
                        'reason': f'FROZEN_{name}', 'action': 'SUSPEND_AND_ALARM'}
        votes, confidence = 0, 0.0
        # Signal 1: OES Si/O ratio rises at endpoint (Si exposed, O drops)
        si_mean = np.mean(list(self.oes_si)[-10:])
        o_mean = np.mean(list(self.oes_o)[-10:])
        ratio = si_mean / (o_mean + 1e-6)
        if ratio > 2.0: votes += 1
        confidence += min(ratio / 2.0, 1.0) * 0.40
        # Signal 2: RF harmonic content drops when plasma load shifts at endpoint
        # (bin 2 of the window spectrum stands in for the 2nd harmonic here)
        rf = np.array(self.rf, dtype=float)
        fft = np.fft.rfft(rf - rf.mean())
        h2 = float(np.abs(fft[2])) if len(fft) > 2 else 0.0
        if self.rf_baseline is None:
            self.rf_baseline = h2  # first full window defines the baseline
        elif self.rf_baseline > 0:
            drop = 1 - h2 / self.rf_baseline
            if drop > 0.3: votes += 1
            confidence += max(drop, 0) * 0.35
        # Signal 3: impedance slope turns positive approaching endpoint
        slope = float(np.polyfit(range(len(self.imp)), list(self.imp), 1)[0])
        if slope > 0.01: votes += 1
        confidence += min(max(slope * 100, 0), 1.0) * 0.25
        # State machine: require confirmation on two consecutive evaluations
        detected = votes >= 2 and confidence > 0.6
        if self.state == 'ETCHING_OXIDE' and detected:
            self.state = 'APPROACHING_ENDPOINT'
        elif self.state == 'APPROACHING_ENDPOINT' and detected:
            self.state = 'ENDPOINT_DETECTED'
            return {'endpoint': True, 'confidence': confidence,
                    'reason': 'multi_sensor_confirmed'}
        elif self.state == 'APPROACHING_ENDPOINT' and not detected:
            self.state = 'ETCHING_OXIDE'  # false alarm, reset
        return {'endpoint': False, 'confidence': confidence, 'state': self.state}
Lam vocabulary
| Term | What It Means | Why It Matters |
|---|---|---|
| Aspect ratio | Depth divided by width of an etched feature | High AR (>10:1) causes ion transport limitations, ARDE, and endpoint signal attenuation |
| Bosch process | Alternating etch/passivation cycles for deep silicon etch | Endpoint detection must work per cycle, not just at final depth |
| ARDE | Aspect Ratio Dependent Etching: etch rate slows in narrow features | Non-uniform depth across different feature sizes on same wafer |
| Microloading | Local etch rate depends on surrounding pattern density | Isolated features etch faster than dense arrays; affects uniformity |
| Chamber matching | Making multiple identical chambers produce identical etch profiles | Fleet capacity: any chamber must process any lot without yield difference |
| Seasoning | Running conditioning wafers after PM to rebuild chamber wall deposits | Fresh post-PM chamber behaves differently; seasoning stabilizes it |
| RF match | Impedance matching network between RF generator and plasma | Capacitor wear leads to impedance mismatch, arcs, and Archetype 1 failure mode |
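The RF match row can be quantified with the standard reflection coefficient, Gamma = (Z_load - Z0) / (Z_load + Z0), where the reflected power fraction is |Gamma|^2. The sketch below is a minimal illustration that assumes a 50 ohm reference impedance; the function name and the sample load impedances are hypothetical, not values from any Lam tool.

```python
def reflected_power_fraction(z_load: complex, z0: float = 50.0) -> float:
    """Fraction of forward RF power reflected by an impedance mismatch."""
    gamma = (z_load - z0) / (z_load + z0)  # complex reflection coefficient
    return abs(gamma) ** 2

# Hypothetical impedances seen by the generator as the match drifts
for z in (50 + 0j, 45 + 5j, 30 + 20j):
    print(f"Z_load={z}: reflected fraction = {reflected_power_fraction(z):.4f}")
```

A slow upward trend in the reflected fraction at a fixed recipe step is the kind of feature that can flag match degradation before it shows up as arcing.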
Applied Materials: Breadth and Integration
The question archetype: cross-step yield pattern detection
"We have 50 tools across etch, deposition, and CMP. Each generates 1GB per day of data. Design a system to detect yield-limiting patterns that span multiple process steps."
WITH wafer_journey AS (
SELECT
w.wafer_id, w.lot_id,
e.tool_id AS etch_tool,
e.rf_power_mean,
e.endpoint_detected,
c.tool_id AS cmp_tool,
c.removal_rate,
c.pad_life_at_run,
m.cd_measurement,
m.defect_count,
EXTRACT(EPOCH FROM (c.step_start - e.step_end)) / 3600
AS etch_to_cmp_hours -- Q-time check
FROM wafers w
LEFT JOIN etch_fdc e ON w.wafer_id = e.wafer_id
LEFT JOIN cmp_fdc c ON w.wafer_id = c.wafer_id
AND c.step_start > e.step_end -- causal: CMP after etch
LEFT JOIN final_metrology m ON w.wafer_id = m.wafer_id
WHERE e.step_start > CURRENT_DATE - INTERVAL '90 days'
)
SELECT * FROM wafer_journey
WHERE etch_to_cmp_hours < 12; -- Q-time limit: exclude violations
import pandas as pd

def cross_step_analysis(wafer_journey_df, q_time_limit_hours=8.0):
    """
    Look for yield-limiting patterns that span process steps.
    Three patterns consistently appear in advanced logic fabs.
    q_time_limit_hours must match the fab's Q-time spec
    (the query above filters at 12 hours).
    """
    findings = []
    # Pattern 1: incomplete etch endpoint -> polymer residue -> CMP polishing rate change
    # Physics: residual SiO2 polymer from incomplete etch hardens the pad contact
    corr = (wafer_journey_df['endpoint_detected'].astype(float)
            .corr(wafer_journey_df['removal_rate']))
    if abs(corr) > 0.3:
        findings.append({
            'pattern': 'etch_endpoint_affects_cmp_rate',
            'correlation': corr,
            'action': 'Feed endpoint confidence score as feedforward input to CMP R2R controller'
        })
    # Pattern 2: etch RF power x CMP pad life interaction
    # Physics: high RF power leaves harder etch byproducts; worn pad cannot clear them
    # Screening: 2x2 cell means; confirm with a two-way ANOVA interaction term
    df = wafer_journey_df.copy()
    df['rf_bin'] = pd.qcut(df['rf_power_mean'], 2, labels=['low_rf', 'high_rf'])
    df['pad_bin'] = pd.qcut(df['pad_life_at_run'], 2, labels=['fresh', 'worn'])
    interaction = (df.groupby(['rf_bin', 'pad_bin'], observed=False)['cd_measurement']
                   .mean().unstack())
    # Interaction contrast: zero if the RF and pad effects are purely additive.
    # If (high_rf, worn) is much worse than both main effects predict, interaction exists
    contrast = (interaction.loc['high_rf', 'worn'] - interaction.loc['high_rf', 'fresh']
                - interaction.loc['low_rf', 'worn'] + interaction.loc['low_rf', 'fresh'])
    findings.append({
        'pattern': 'etch_cmp_interaction',
        'table': interaction.to_dict(),
        'interaction_contrast': float(contrast),
        'action': 'Joint R2R control: optimize etch and CMP parameters together'
    })
    # Pattern 3: Q-time violation leakage corrupts training labels
    violations = wafer_journey_df[wafer_journey_df['etch_to_cmp_hours'] > q_time_limit_hours]
    if len(violations) > 50:
        findings.append({
            'pattern': 'q_time_label_corruption',
            'n_affected': len(violations),
            'action': 'Filter pm_sequence=0 AND scrap_reason NOT IN Q-TIME before any model training'
        })
    return findings
Intel / TSMC / Samsung: Yield at Scale
The question archetype: persistent fleet yield differences
"We have 100 identical etch chambers. Some consistently produce 2% higher yield than others, but we cannot identify why. The difference persists across multiple PM cycles. Design an analysis to find the root cause."
-- Step 1: find lots that were split across multiple chambers (natural experiment)
-- These eliminate product-mix confounding since same lot = same design
WITH split_lots AS (
SELECT lot_id,
COUNT(DISTINCT tool_id) AS n_tools,
ARRAY_AGG(DISTINCT tool_id) AS tools_used
FROM wafer_processing
WHERE step_name = 'CRITICAL_ETCH_01'
AND timestamp > CURRENT_DATE - INTERVAL '90 days'
GROUP BY lot_id
HAVING COUNT(DISTINCT tool_id) >= 2 -- split across at least 2 chambers
)
SELECT s.lot_id, w.tool_id, w.wafer_id, y.final_yield
FROM split_lots s
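JOIN wafer_processing w ON s.lot_id = w.lot_id
AND w.step_name = 'CRITICAL_ETCH_01' -- same step only; otherwise every other step's tool_id leaks in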
JOIN yield_data y ON w.wafer_id = y.wafer_id
ORDER BY s.lot_id, w.tool_id;
-- Analysis: within-lot yield difference = tool effect (product mix cancelled)
from statsmodels.regression.mixed_linear_model import MixedLM
import pandas as pd

def fleet_tool_effect_model(split_lot_data: pd.DataFrame) -> dict:
    """
    Linear mixed model: yield ~ tool_id + (1|lot_id)
    Fixed effect: tool_id (what we care about)
    Random effect: lot_id (nuisance - controls for product, recipe, timing)
    A within-lot (blocked) comparison: same lot, different tools.
    The lot random intercept absorbs all lot-level confounders.
    Each tool effect is relative to the reference tool dropped by drop_first.
    """
    # dtype=float: boolean dummies would be parsed as categorical and renamed by the formula
    # tool_id values must yield valid Python identifiers after the 'tool_' prefix
    tool_dummies = pd.get_dummies(split_lot_data['tool_id'], prefix='tool',
                                  drop_first=True, dtype=float)
    data = pd.concat([split_lot_data[['lot_id', 'final_yield']], tool_dummies], axis=1)
    formula = 'final_yield ~ ' + ' + '.join(tool_dummies.columns)
    model = MixedLM.from_formula(formula,
                                 groups=data['lot_id'],  # random intercept per lot
                                 re_formula='~1',
                                 data=data)
    result = model.fit()
    # Extract statistically significant tool effects
    # With ~100 tools, correct for multiple comparisons before acting on p < 0.05
    tool_effects = {
        col.replace('tool_', ''): {
            'yield_delta_pct': result.params[col],
            'p_value': result.pvalues[col],
            'significant': result.pvalues[col] < 0.05
        }
        for col in tool_dummies.columns
    }
    # Next step: KS test on sensor signatures of best vs. worst tools
    # to generate root cause hypotheses for physical investigation
    return tool_effects
Foundry vocabulary
| Term | What It Means | Why It Matters |
|---|---|---|
| DPW | Die per Wafer: number of die sites on one wafer; good die = DPW x yield | Revenue per wafer scales with DPW x yield; every yield improvement multiplies by DPW |
| Binning | Sorting chips by speed, power, leakage at electrical test | Same silicon, different prices; bin prediction is a classification problem with asymmetric costs |
| Scribe line | Area between dice used for test structures | Test structure data represents die performance; spatial alignment to die map required |
| Reticle-limited yield | Yield loss from mask defects rather than process variation | Archetype 7: Moran's I catches it; aggregate defect count misses it |
| Line yield | Wafers completing all steps divided by wafers started | Cycle time and WIP management; Little's Law applies |
| Kill ratio | Defects that cause die failure divided by total defects | Not all defects kill; critical area analysis required to translate defect count to yield loss |
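The Kill ratio and DPW rows combine into a simple yield estimate. The sketch below assumes the Poisson yield model, Y = exp(-lambda), with lambda the expected number of killer defects per die and defects spread uniformly across the wafer. The function name and all numbers are hypothetical, and real critical area analysis is considerably more detailed.

```python
import math

def poisson_yield(defects_per_wafer: float, kill_ratio: float, dpw: int) -> float:
    """Poisson yield from a wafer-level defect count (uniform-density assumption)."""
    killers_per_die = defects_per_wafer * kill_ratio / dpw
    return math.exp(-killers_per_die)

dpw = 500  # hypothetical die sites per wafer
for kill_ratio in (0.1, 0.3, 1.0):
    y = poisson_yield(defects_per_wafer=200, kill_ratio=kill_ratio, dpw=dpw)
    print(f"kill_ratio={kill_ratio:.1f}: yield={y:.3f}, good die={y * dpw:.0f}")
```

The same defect count maps to very different yield losses depending on the kill ratio, which is why raw defect counts alone are a weak yield predictor.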
Smaller Companies and Startups
This group includes KLA (inspection and metrology), Nova (process control metrology), Onto Innovation (process control), and various AI/ML startups targeting fab automation. The interview style differs significantly from that of the tier-one OEMs.
| Company | Focus Area | Key Technical Challenge | Your Differentiator |
|---|---|---|---|
| ASML | Lithography, imaging | EUV stochastics, overlay prediction | Physics-first modeling, Poisson statistics, thermal expansion features |
| Lam Research | Plasma etch, ALD | Real-time endpoint under 50ms | Multi-sensor fusion, frozen sensor detection, state machine design |
| Applied Materials | Breadth, integration | Cross-step yield pattern detection | Wafer journey SQL, interaction effects, Q-time label filtering |
| Intel/TSMC/Samsung | Yield at scale | Fleet monitoring, root cause attribution | Mixed effects models, split-lot experiments, cost-calibrated ROI |
| Startups | Speed, ownership | End-to-end with constraints | Full-stack capability, business impact quantification, frugality |
The Interview Day
Week-before checklist
Day-of strategy
The semiconductor industry is at an inflection point. 3nm production, EUV lithography, advanced packaging, AI-driven design: these create unprecedented demand for data scientists who understand both algorithms and atoms.
Your LeetCode preparation is not wasted. It gave you algorithmic fluency. This manual adds domain fluency: the constraints, the failure modes, the vocabulary, and the judgment that comes from understanding how physical systems generate data.
You are now prepared to walk into any semiconductor data science interview and demonstrate that you can think like a fab engineer from day one. Not because you memorized answers, but because you understand the underlying principles that make this domain unique.