Why Most Lean Assessments Fail in Aerospace
Every lean transformation begins with an assessment. And most assessments in aerospace manufacturing are worse than useless — they are actively misleading. Here is why.
The typical lean assessment is a checklist: “Do you have 5S? Do you have standard work? Do you have visual management?” The facility checks the boxes. Yes, there are 5S labels on the shelves. Yes, there is a standard work document in the binder. Yes, there is a whiteboard in the production area. The assessment scores the facility at Level 3. The leadership team celebrates. And nothing changes, because the labels are ignored, the standard work binder hasn’t been opened in six months, and no one reads the whiteboard.
A meaningful assessment does not ask “do you have the tool?” It asks “how deeply is the tool embedded in daily operations, and what happens when it breaks?” The difference between Level 2 and Level 4 is not the presence of a Pitch Board. It is what happens at 10:15 AM when the Pitch Board shows a red block — does anyone act on it within 15 minutes, or does it sit there until the end-of-shift meeting?
⚠️ The Checklist Trap
If your lean assessment can be completed by walking the floor and checking boxes, it is measuring the presence of artifacts, not the maturity of systems. A 5S label is an artifact. A process where every operator can find every tool within 30 seconds — because the 5S system is maintained daily and deviations are corrected within the shift — is a system. Artifacts score well on checklists. Systems score well on results.
The Four Management Questions
Every manufacturing facility — Make Shop or Assembly Shop — must answer four fundamental management questions every day. The maturity assessment evaluates how well the facility answers each one.
| Question | What It Means (Make Shop) | What It Means (Assembly Shop) |
|---|---|---|
| Schedule | “Are the right parts being worked on in the right sequence?” | “Is each station completing its work within the Takt period?” |
| Capacity | “Do we have enough machine hours to meet the production plan?” | “Do we have enough stations and labor hours to meet the production rate?” |
| Manpower | “Do we have the right number of people with the right skills at each work center?” | “Are operators cross-trained, balanced across stations, and not burning out?” |
| Cost | “What is the actual cost per part, and where is the waste?” | “What is the actual labor cost per unit, and where is the NVA?” |
These four questions form the columns of the maturity rubric; the five maturity levels form the rows. For each question, the assessment finds the row whose indicators best describe how the facility actually operates. The cell where they meet is the facility's maturity level for that question.
The Five Maturity Levels
Level 1: Reactive
The facility operates on tribal knowledge and firefighting. There are no reliable systems — schedules change daily based on who screams loudest, capacity is unknown until a machine breaks, manpower is allocated by gut feel, and cost is calculated only at the quarterly financial review. The dominant management mode is “walk the floor and fix what’s broken.” Heroes are celebrated. The same problems recur weekly because root causes are never addressed.
Level 2: Defined
Basic metrics exist and are tracked. The facility has a production schedule that is published weekly. Capacity is estimated. Standard work documents exist for some operations. But the systems are fragile — when a supervisor is absent, the system breaks. Metrics are collected but not acted upon. Data exists in spreadsheets that one person maintains. If that person leaves, the data disappears. This is the Black Book level.
Level 3: Managed
Systems are in place and function daily without depending on specific individuals. Pitch Boards are updated every Takt period. CONWIP limits are enforced. Standard work is followed and updated when improvements are made. Data drives decisions — not opinions, not politics, not whoever shouts loudest. The facility can answer all four management questions with current data within 15 minutes.
Level 4: Optimized
Systems are integrated and proactive. The constraint is identified and actively managed. Line balance is reviewed when rate changes occur. Kata coaching happens daily. The facility anticipates problems before they manifest — a rising trend in M-01 reason codes triggers a material flow investigation before it causes a line stop. Cross-functional teams solve problems using data, not blame.
Level 5: Self-Improving
The system detects and corrects its own gaps without management intervention. Operators update standard work when they find a better method. Team leads run Kata experiments without being told. The Pitch Board action loop closes within the shift, not the week. New employees are trained using TWI methods by certified trainers on the line, not by “shadowing” for two weeks. The organization improves continuously because improvement is how work is done, not a separate activity.
💡 Where Most Aerospace Facilities Actually Are
In our experience, approximately 70% of aerospace manufacturing facilities operate at Level 1–2 across all four management questions. Another 25% are at Level 2–3 in some areas but Level 1 in others. Fewer than 5% operate consistently at Level 3 or above. This is not a criticism — it is a realistic baseline. The value of the assessment is knowing where you actually are, not where you wish you were.
Make Shop Maturity Rubric
The Make Shop rubric evaluates machining, fabrication, and processing operations where parts flow through shared equipment with variable routings.
Schedule Management — Make Shop
| Level | Indicators |
|---|---|
| L1: Reactive | Schedule is a hot list maintained by the production manager. Priority changes daily based on customer complaints or downstream shortages. Expeditors outnumber planners. “What’s the schedule?” gets a different answer depending on who you ask. |
| L2: Defined | Weekly dispatch list published from MRP. Priority is based on due dates, but no sequencing logic accounts for setup optimization or constraint loading. Schedule adherence is measured monthly but not acted upon daily. |
| L3: Managed | CONWIP limits are in place. Work is released based on completion signals, not MRP push. The constraint work center has a visible dispatch queue sorted by priority. Schedule adherence is measured daily and deviations trigger same-day response. |
| L4: Optimized | Constraint scheduling includes setup optimization (family grouping) and buffer management. Buffer penetration is monitored in real-time. Non-constraint work centers are subordinated — they process whatever feeds the constraint, in the sequence the constraint needs it. |
| L5: Self-Improving | The scheduling system automatically detects buffer penetration trends and adjusts release timing. Operators at the constraint flag emerging issues (tool wear, material variation) before they cause schedule breaks. The system learns from past disruptions and pre-positions recovery actions. |
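The Level 3 behavior in the rubric above — releasing work on completion signals rather than MRP push — can be sketched as a simple CONWIP loop. This is an illustrative sketch, not a production scheduler; the WIP limit and job names are assumptions.

```python
from collections import deque

class ConwipLoop:
    """Release work into the shop only when a completion signal frees a WIP slot."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit     # CONWIP cap for the whole loop
        self.backlog = deque()         # priority-sorted dispatch queue
        self.in_process = set()        # jobs currently on the floor

    def add_to_backlog(self, job):
        self.backlog.append(job)
        self._release()                # release immediately if a slot is open

    def complete(self, job):
        """A finished job leaves the loop and pulls the next one in."""
        self.in_process.discard(job)
        self._release()

    def _release(self):
        while self.backlog and len(self.in_process) < self.wip_limit:
            self.in_process.add(self.backlog.popleft())

loop = ConwipLoop(wip_limit=2)
for job in ["A", "B", "C"]:
    loop.add_to_backlog(job)
# Only two jobs are on the floor; "C" waits in the dispatch queue.
loop.complete("A")                     # completion signal pulls "C" in
```

The key property is that nothing enters the floor on a calendar date: release is triggered only by a completion, so WIP can never exceed the cap no matter how the MRP plan changes.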
Capacity Management — Make Shop
| Level | Indicators |
|---|---|
| L1: Reactive | Capacity is unknown until a bottleneck appears. The response to missed delivery is “add overtime” or “add a shift.” No one can answer “how many machine-hours do we need next quarter?” without a week of spreadsheet work. |
| L2: Defined | Rough-cut capacity planning exists: machine hours available vs. planned load by work center. But routing data in the ERP is inaccurate (see ERP Data Integrity), so the capacity plan is ±30% at best. OEE is measured on some machines. |
| L3: Managed | Routing data is accurate within ±10% for the constraint and top-20% parts. Capacity loading is reviewed weekly with actual vs. planned comparison. Utilization targets are set below 85% to prevent queue explosion. OEE is measured on all critical machines with root-cause analysis for losses. |
| L4: Optimized | Capacity planning integrates rate ramp scenarios (see Production Ramp Management). Lead time is modeled using Kingman’s equation with validated variability inputs. Capital investment decisions are based on constraint throughput analysis, not department-level wish lists. |
| L5: Self-Improving | The system continuously refines routing accuracy using actual cycle time data fed back from the floor. Capacity forecasts auto-adjust based on observed OEE trends. The facility can model the impact of a rate change within 24 hours with ±5% accuracy. |
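The Level 4 row above references Kingman's equation. A minimal sketch of its VUT form (queue wait ≈ variability × utilization × time) shows why the Level 3 row caps utilization below 85%: wait time explodes as utilization approaches 1. The numbers are illustrative.

```python
def kingman_wait(utilization, ca, cs, service_time):
    """Approximate mean queue wait (Kingman / VUT approximation).

    utilization  -- rho, must be < 1
    ca, cs       -- coefficients of variation of arrivals and service
    service_time -- mean effective process time at the work center
    """
    v = (ca**2 + cs**2) / 2.0               # Variability term
    u = utilization / (1.0 - utilization)   # Utilization term (explodes near 1)
    return v * u * service_time             # wait = V * U * T

# Same work center, same variability (ca = cs = 1), 2-hour service time:
w85 = kingman_wait(0.85, 1, 1, 2.0)   # ~11.3 h of queue at 85% utilization
w95 = kingman_wait(0.95, 1, 1, 2.0)   # 38 h at 95% -- over 3x worse for +10 points
```

This is why "run every machine flat out" backfires: the last few points of utilization buy almost no throughput and an enormous queue.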
Manpower Management — Make Shop
| Level | Indicators |
|---|---|
| L1: Reactive | One operator per machine, fixed assignment. When an operator is absent, the machine sits idle. Cross-training is informal — “Joe showed me once.” Staffing decisions are based on headcount targets from finance, not production requirements. |
| L2: Defined | A skills matrix exists showing who can run which machines. Cross-training is planned but inconsistent. Staffing is based on machine count — “we need one person per machine” — regardless of utilization or cycle time. |
| L3: Managed | Staffing is based on load analysis: if a machine runs 60% utilized, the operator can tend a second machine during wait time. Cross-training ensures every critical machine has ≥3 qualified operators. TWI is used for operator qualification. |
| L4: Optimized | Labor is flexed weekly based on the production plan. Operators move between work centers based on where the load is, not where they “belong.” The constraint work center is always fully staffed first; non-constraint staffing flexes around it. |
| L5: Self-Improving | Operators self-organize around the constraint. Team leads allocate labor daily based on buffer status and dispatch priority without supervisor intervention. New operators reach qualification in 50% less time than industry average due to structured TWI programs. Retention exceeds 90% because operators see career progression through skill development. |
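The Level 3 indicator above — staffing from load analysis rather than machine count — is arithmetic on operator attention time, not machine count. A sketch with made-up utilization and attention figures:

```python
import math

def operators_needed(machines, shift_hours=8.0):
    """Staff to operator attention load, not machine count.

    machines: list of (utilization, attend_fraction) pairs, where
    attend_fraction is the share of run time the operator must be present
    (loading, unloading, gauging); the rest is machine-paced wait time.
    """
    attention_hours = sum(u * f * shift_hours for u, f in machines)
    return math.ceil(attention_hours / shift_hours)

# Five machines at 60% utilization, each needing the operator half the run time:
# attention load = 5 * 0.60 * 0.50 * 8 h = 12 h  ->  2 operators, not 5.
print(operators_needed([(0.60, 0.50)] * 5))   # 2
```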
Cost Management — Make Shop
| Level | Indicators |
|---|---|
| L1: Reactive | Cost is known only at the P&L level. Part-level costs are estimated using standard rates that haven’t been updated in years. “We’re profitable” or “we’re not profitable” is the extent of cost visibility. Waste is invisible because it is embedded in the standard. |
| L2: Defined | Standard costs exist per part. Variance reports are generated monthly. But the standards are based on ERP routing data that is inaccurate (see Level 2 Capacity), so variances are noise. No one trusts the cost data enough to make decisions from it. |
| L3: Managed | Standard costs are based on validated routing data. Variance analysis distinguishes between rate variance (we paid more per hour) and efficiency variance (we used more hours). Constraint throughput accounting is understood — decisions prioritize constraint throughput over local cost optimization. |
| L4: Optimized | Cost models integrate quality costs (scrap, rework, MRB), schedule delay costs, and inventory carrying costs (see Little’s Law). Make-vs-buy decisions are based on constraint impact, not unit cost comparison. The true cost of an expedited order is quantified. |
| L5: Self-Improving | Cost reduction is driven by continuous waste elimination at the process level. Operators identify and quantify waste as part of daily Kata practice. The facility can demonstrate year-over-year cost reduction through process improvement, not headcount reduction. |
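The Level 3 indicator above — prioritizing constraint throughput over local cost optimization — can be made concrete. Under throughput accounting, parts are ranked by throughput dollars per constraint minute, not by unit margin. All figures below are illustrative.

```python
def throughput_per_constraint_minute(price, material_cost, constraint_minutes):
    """Throughput accounting ranking: (price - truly variable cost) / constraint time."""
    return (price - material_cost) / constraint_minutes

# Part A: higher unit margin, but it hogs the constraint.
a = throughput_per_constraint_minute(price=500, material_cost=200, constraint_minutes=30)
# Part B: lower unit margin, light constraint load.
b = throughput_per_constraint_minute(price=300, material_cost=150, constraint_minutes=5)

# Unit margin says prefer A ($300 > $150); constraint throughput says
# prefer B ($30/min > $10/min), because constraint minutes are the scarce resource.
assert b > a
```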
Assembly Shop Maturity Rubric
The Assembly Shop rubric evaluates sequential assembly operations where labor flows through fixed stations with defined Takt time.
Schedule Management — Assembly Shop
| Level | Indicators |
|---|---|
| L1: Reactive | No Takt discipline. Work is pushed to the next station “when it’s done.” Station completion times vary by 2–3x. The schedule is a Gantt chart that is updated weekly and is already wrong by Tuesday. Overtime is the default recovery mechanism. |
| L2: Defined | Takt time is calculated and published. Pitch Boards exist but are updated inconsistently — sometimes at the end of the shift, sometimes not at all. When a station misses Takt, the response is “work harder” or “add overtime.” |
| L3: Managed | Pitch Boards are updated every Takt period. Misses are coded with a reason code taxonomy. The top 3 miss reasons are visible and being actively worked. An escalation path exists: if a station is red at the Takt boundary, the team lead responds within 15 minutes. |
| L4: Optimized | Pitch Board data drives weekly Pareto analysis. Systemic issues are addressed through Kata experiments. Line balance is reviewed monthly and adjusted when mix changes. Station work content is within ±5% of Takt target. Recovery plans exist for predictable disruptions (inspector unavailable, engineering hold, part shortage). |
| L5: Self-Improving | Operators flag emerging schedule risks before they cause misses (“this sealant batch is curing slower — Station 4 may need an extra 10 minutes”). The team adjusts in real-time without supervisor involvement. Schedule adherence exceeds 90% consistently, with miss patterns analyzed and root causes eliminated permanently. |
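The Level 3–4 behavior above — coding each miss and working the top reasons — is a simple Pareto over the Pitch Board log. The reason codes below are illustrative, following the M-01 (material flow) code mentioned earlier in this guide.

```python
from collections import Counter

# One week of Pitch Board miss entries (codes are illustrative:
# M-01 = material, Q-02 = quality hold, T-03 = tooling, E-01 = engineering).
miss_log = ["M-01", "Q-02", "M-01", "T-03", "M-01", "Q-02", "E-01", "M-01"]

def top_miss_reasons(log, n=3):
    """Pareto the miss log: the few reasons that cause most of the misses."""
    return Counter(log).most_common(n)

print(top_miss_reasons(miss_log))
# [('M-01', 4), ('Q-02', 2), ('T-03', 1)] -- material flow is the problem to work
```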
Capacity Management — Assembly Shop
| Level | Indicators |
|---|---|
| L1: Reactive | Capacity is expressed as “number of people” without connection to station count, Takt, or work content. “We need more people” is the default response to any capacity shortfall. No one can quantify how many stations or labor hours a rate increase requires. |
| L2: Defined | Station count and labor hours are calculated for the current rate. Takt time is used to size the line. But the calculation assumes ideal conditions — no allowance for NVA, rework, learning curve, or absenteeism. Actual capacity is 15–25% below the plan. |
| L3: Managed | Capacity planning includes realistic allowances: NVA factors, Water Spider coverage, quality rework rates, and absenteeism. Yamazumi charts show the actual breakdown of value-add vs. waste at each station. Capacity gaps are identified before they become crises. |
| L4: Optimized | Capacity models account for rate ramp scenarios with Little’s Law validation. The facility can model “what does it take to go from 4/month to 6/month?” in terms of stations, people, Takt change, and timeline — within 48 hours, with ±10% accuracy. |
| L5: Self-Improving | Capacity is continuously freed through waste elimination. Year-over-year, the same rate requires fewer labor hours because NVA is systematically removed. The facility can absorb moderate rate increases (10–15%) without adding stations or headcount, using freed capacity from previous improvement cycles. |
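The gap between Level 2 ("ideal conditions") and Level 3 ("realistic allowances") above is visible directly in the sizing arithmetic. A sketch with assumed demand, work content, and allowance figures:

```python
import math

def takt_minutes(available_minutes_per_day, units_per_day):
    """Takt time = available time / demand."""
    return available_minutes_per_day / units_per_day

def stations_required(work_content_min, takt_min, nva_fraction=0.0, absenteeism=0.0):
    """Size the line. Level 2 ignores the allowances; Level 3 includes them."""
    effective_takt = takt_min * (1 - nva_fraction) * (1 - absenteeism)
    return math.ceil(work_content_min / effective_takt)

takt = takt_minutes(available_minutes_per_day=450, units_per_day=1)   # 450 min
naive = stations_required(work_content_min=3200, takt_min=takt)
real = stations_required(3200, takt, nva_fraction=0.25, absenteeism=0.05)
print(naive, real)   # 8 vs 10 -- the "ideal" plan under-sizes the line by 2 stations
```

The two-station gap is exactly the "actual capacity is 15–25% below the plan" failure mode described in the Level 2 row.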
Manpower Management — Assembly Shop
| Level | Indicators |
|---|---|
| L1: Reactive | Operators are assigned to “their” station permanently. Cross-training is minimal. When someone is absent, the station is either skipped or overtime is used to cover. New operators learn by “shadowing” an experienced person for 2–4 weeks — with no structured curriculum and highly variable outcomes. |
| L2: Defined | A cross-training matrix exists. Training plans are documented. But cross-training is inconsistent — some operators qualify on 2–3 stations, others only on one. Training quality varies by trainer. There is no structured method for skill transfer. |
| L3: Managed | Cross-training ensures every station has ≥3 qualified operators. TWI Job Instruction is the standard training method. Training outcomes are validated through observation, not just sign-off. Operator workload is monitored — if the actual work content exceeds Takt, it is treated as a system problem, not a performance problem. |
| L4: Optimized | Labor is flexed daily based on Pitch Board patterns and absence management. Certified TWI trainers exist on every shift. New operators reach full qualification in a defined timeline (e.g., 6 weeks for basic stations, 12 weeks for complex). Burnout is actively monitored through Yamazumi analysis of work content vs. Takt. |
| L5: Self-Improving | Operators actively seek cross-training because it creates career progression. The training system produces consistently qualified operators regardless of who trains them. Experienced operators contribute to standard work improvements through structured Kata experiments. The line can absorb 15% absenteeism without schedule impact due to deep cross-training depth. |
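The ≥3-qualified-operators rule from Level 3 can be audited mechanically from the cross-training matrix. Operator and station names below are made up.

```python
def coverage_gaps(matrix, min_qualified=3):
    """Return stations that fall below the minimum qualified-operator depth.

    matrix maps operator -> set of stations they are qualified on.
    (Stations with zero qualified operators must be checked separately,
    since they never appear in the matrix at all.)
    """
    depth = {}
    for stations in matrix.values():
        for s in stations:
            depth[s] = depth.get(s, 0) + 1
    return sorted(s for s, count in depth.items() if count < min_qualified)

matrix = {
    "op1": {"S1", "S2"},
    "op2": {"S1", "S3"},
    "op3": {"S1", "S2", "S3"},
    "op4": {"S2"},
}
print(coverage_gaps(matrix))   # ['S3'] -- only two operators cover Station 3
```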
Cost Management — Assembly Shop
| Level | Indicators |
|---|---|
| L1: Reactive | Cost is tracked as “labor hours per unit” at the program level. No visibility into where hours are consumed. “We spent 18,000 hours on that aircraft” — but no one knows how many of those hours were value-add vs. walking, waiting, reworking, or searching for parts. |
| L2: Defined | Labor hours are tracked by station or zone. Cost per unit is calculated. But the breakdown between value-add and NVA is estimated, not measured. The NVA cost calculation has not been done rigorously. |
| L3: Managed | Value-add vs. NVA is measured at the station level through time studies and Yamazumi analysis. The cost of zone violations is quantified (see Water Spider). Rework hours are tracked by reason code. The facility knows exactly where its labor dollars go. |
| L4: Optimized | Cost targets are set based on theoretical minimum (value-add only) plus realistic allowances. Improvement targets are quantified: “Reducing NVA at Station 7 from 35% to 20% saves $X per unit.” Investment decisions (Water Spiders, tooling, fixtures) are justified by measured NVA reduction. |
| L5: Self-Improving | Cost per unit decreases year-over-year through systemic waste elimination, not headcount reduction. Learning curve improvements are tracked and verified against theoretical learning curve models. The facility can commit to cost reduction targets for future rate increases because improvement is predictable, not aspirational. |
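The Level 5 row above references theoretical learning curve models. The classic Wright model says unit hours fall by a fixed percentage each time cumulative volume doubles. A sketch assuming an 85% curve and the 18,000-hour first unit mentioned earlier:

```python
import math

def wright_hours(first_unit_hours, unit_number, learning_rate=0.85):
    """Wright learning curve: hours for unit n = T1 * n^b, b = log(rate)/log(2)."""
    b = math.log(learning_rate) / math.log(2)
    return first_unit_hours * unit_number ** b

t1 = 18_000                  # hours on unit 1 (assumed; rate is assumed too)
t2 = wright_hours(t1, 2)     # doubling cumulative volume -> 85% of unit 1
t4 = wright_hours(t1, 4)     # doubling again -> 85% of unit 2
print(round(t2), round(t4))  # roughly 15300 and 13005
```

Tracking actual unit hours against this curve, as the Level 5 row describes, tells you whether improvement is on pace or whether the "learning" has stalled.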
The IE Competency Matrix
A facility cannot advance its maturity level without the Industrial Engineering competencies to support it. The IE Competency Matrix maps the technical skills required at each maturity level. Use it to assess your IE team’s readiness and to plan skill development.
| Competency | L1–2 (Foundation) | L3 (Managed) | L4–5 (Optimized/Self-Improving) |
|---|---|---|---|
| Time Study & Work Measurement | Can conduct a basic time study with a stopwatch. Understands observed time vs. standard time. Can calculate a simple average. | Can build a Yamazumi chart from time study data. Distinguishes VA, NVA, and required NVA. Applies allowance factors correctly. | Can design and validate standard work from time studies. Trains others in work measurement. Uses time study data to drive Kata experiments. |
| Flow Analysis | Can map a basic value stream. Understands the concept of flow vs. batch. Can identify obvious bottlenecks by observation. | Can apply Little’s Law to calculate WIP limits. Understands Kingman’s equation and its implications for utilization targets. Can design a CONWIP system. | Can model multi-stage production systems with variability. Can predict the impact of rate changes on WIP, lead time, and queue behavior. Can design and validate DBR scheduling systems. |
| Line Design & Balancing | Can calculate Takt time from demand and available time. Understands the concept of station balance. Can identify overloaded stations. | Can build a complete line design: Takt → station count → staffing → Yamazumi balance. Can design Water Spider routes. Can set up and maintain Pitch Boards. | Can redesign lines for rate changes including learning curve impacts. Can model ramp scenarios (see Production Ramp Management). Can integrate Takt, balance, staffing, and material flow into a unified production system. |
| Data & Systems | Can pull reports from the ERP system. Maintains spreadsheets for tracking. Understands basic SQL or reporting tools. | Can audit and correct ERP routing data. Builds dashboards that drive daily decisions. Understands the difference between data accuracy and data precision. | Designs data systems that self-correct (e.g., actual cycle time feedback loops to routing standards). Integrates shop floor data with planning systems. Can quantify the cost of data inaccuracy. |
| Coaching & Change | Can explain lean concepts to operators. Participates in Kaizen events. Understands basic problem-solving (5-Why, fishbone). | Can facilitate Gemba walks. Coaches team leads using the Coaching Kata five questions. Understands ADKAR and applies it to improvement rollouts. | Develops other coaches. Designs the improvement system (who coaches whom, on what cadence, using what structure). Builds organizational capability that persists beyond any individual IE’s tenure. |
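The Flow Analysis row above expects an L3 IE to "apply Little's Law to calculate WIP limits." The law itself is one line (WIP = throughput × lead time), and a CONWIP cap follows directly. Numbers and the safety factor are illustrative.

```python
import math

def conwip_limit(throughput_per_day, target_lead_time_days, safety_factor=1.2):
    """Little's Law: WIP = TH * CT. Cap the loop slightly above the target
    WIP so the constraint never starves (the safety factor is a judgment call)."""
    target_wip = throughput_per_day * target_lead_time_days
    return math.ceil(target_wip * safety_factor)

# 6 parts/day through the cell, 5-day target lead time:
print(conwip_limit(6, 5))   # 36 jobs in the loop (30 target WIP * 1.2)
```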
💡 Assess the IE Team, Not Just the Floor
A facility cannot operate at Level 3 if its IE team has Level 1 competencies. The IE Competency Matrix is not a performance review — it is a gap analysis. If your facility needs to reach Level 3 in Schedule Management but your IEs cannot build a Yamazumi chart, the first investment is in IE skill development, not in more Pitch Boards. The tools are only as good as the people who design and maintain them.
Conducting Your Own Assessment
A meaningful maturity assessment takes 2–3 days and involves direct observation, not conference room discussion. Here is the protocol:
Day 1: Observe the Make Shop
Walk the floor for a full shift. For each of the four management questions, observe the indicators described in the rubric. Do not ask managers what they do — watch what actually happens. Is the dispatch list visible? Do operators know their priority? When a machine goes down, what happens in the first 15 minutes? Where do people go to find information? Write down specific observations, not impressions.
Day 2: Observe the Assembly Shop
Same protocol. Walk a full shift. Watch the Pitch Board (if one exists) — is it updated in real-time, at shift end, or never? When a station misses Takt, what happens? Who responds, how quickly, and with what information? Watch material flow — do assemblers leave their strike zone? How often? For how long? How do new operators learn their work?
Day 3: Score and Calibrate
Using the rubric, score each of the 8 cells (4 questions × 2 shop types) independently. For each cell, write a one-paragraph justification citing specific observations from Days 1–2. Then gather the leadership team, present the scores with the evidence, and calibrate. The goal is not to achieve consensus — it is to create a shared, honest picture of where the facility actually stands.
Worked Example: Assessment Scorecard
Facility: Mid-size aerospace assembly and machining operation. 200 employees. 3 aircraft programs.
| Question | Make Shop | Assembly Shop |
|---|---|---|
| Schedule | L1 — Hot list driven by expeditors. No CONWIP. Different answers depending on who you ask. | L2 — Takt is calculated. Pitch Boards exist but updated at shift end only. No real-time response. |
| Capacity | L2 — Rough-cut capacity planning exists but routing data is ±30% inaccurate. OEE on 2 of 15 machines. | L1 — Capacity = “number of people.” No Yamazumi. No NVA measurement. “We need more people” is the answer to every question. |
| Manpower | L1 — One operator per machine. No structured cross-training. “Joe showed me once.” | L2 — Cross-training matrix exists. Training is by shadowing. Quality varies wildly by trainer. |
| Cost | L1 — Cost known at P&L level only. No part-level cost accuracy. Standards not updated in 3 years. | L2 — Hours tracked by zone. VA/NVA split estimated at 60/40 but never measured. |
Overall assessment: Make Shop L1.25 / Assembly Shop L1.75. The facility has some Level 2 artifacts (Pitch Boards, cross-training matrix, capacity plans) but they are not embedded in daily operations. The systems break when the person who maintains them is absent.
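The overall scores are simply the mean of the four cell scores per shop, and are worth computing rather than eyeballing. A sketch reproducing the scorecard arithmetic:

```python
def shop_maturity(cell_scores):
    """Average the four management-question scores for one shop."""
    return sum(cell_scores.values()) / len(cell_scores)

make_shop = {"Schedule": 1, "Capacity": 2, "Manpower": 1, "Cost": 1}
assembly  = {"Schedule": 2, "Capacity": 1, "Manpower": 2, "Cost": 2}

print(shop_maturity(make_shop), shop_maturity(assembly))   # 1.25 1.75
```

Use the averages only as a summary; the improvement sequence should be driven by the individual cells, since an L1 Schedule cell hurts far more than the average suggests.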
Priority improvement sequence:
- Make Shop Schedule → L2: Implement weekly dispatch list with sequencing logic. Stop the hot-list chaos.
- Assembly Shop Schedule → L3: Upgrade Pitch Boards to real-time with reason code taxonomy. Implement 15-minute escalation response.
- Make Shop Manpower → L2: Build skills matrix. Start structured cross-training for constraint work centers first.
- Assembly Shop Capacity → L2: Conduct Yamazumi analysis at all stations. Measure actual VA/NVA split.
💡 Sequence Matters: Don’t Try to Move Everything at Once
The assessment will reveal gaps everywhere. The temptation is to launch improvement projects across all 8 cells simultaneously. Resist this. Pick the 2–3 cells that most directly impact schedule performance and focus there first. Schedule is the gateway: when the facility can reliably answer “are we on schedule?” with current data, every other improvement becomes easier because you have a feedback mechanism that tells you whether your changes are working.
🎯 The Bottom Line
A lean maturity assessment is not a score to put on a slide deck — it is a diagnostic tool that tells you where to invest your improvement energy. By evaluating your Make Shop and Assembly Shop separately across the four management questions (Schedule, Capacity, Manpower, Cost), you get a specific, actionable picture of where your facility actually stands. Most aerospace facilities are at Level 1–2. That is not a failure — it is a starting point. The rubric shows you what Level 3 looks like, the IE Competency Matrix shows you what skills you need to get there, and the other guides in this track (CONWIP, Pitch Boards, Standard Work, Kata) show you the tools. Next: Production Ramp Management — the capstone that integrates everything when the rate goes up.