Machine Learning in Forensic Investigation: Webinar Recap

When the data outpaces traditional inspection, how do forensic engineers find the signal? A recap of CEERISK's recent webinar on three complex industrial loss cases.

In the most recent CEERISK webinar, "Using Machine Learning in Forensic Investigation of Complex Industrial Losses", Mamoon Alyah (Managing Director and Senior Engineer) and Dr Amir Pourghorban (Senior Engineering Consultant and Head of Scientific Research) examined how machine learning is now being applied to support major loss investigations, covering:

What it can do
What it cannot
Where it fits within a defensible forensic process



The session covered three industrial loss cases drawn from CEERISK's previous work, each centred on a different type of loss: a gas turbine failure, a power transformer failure and a production interruption. Across all three, machine learning played the same essential role:

Organising vast quantities of operational data
Surfacing periods that warranted forensic attention
Supporting an engineering conclusion capable of withstanding scrutiny in a claims or dispute setting

This article summarises the key points raised during the session and the common principles that ran through every case.

Why machine learning, and why now

Modern industrial assets generate large, continuous streams of operational data. Sensors record temperatures, pressures, electrical parameters and process variables at intervals as short as seconds. Years of records can sit behind a single failure event.

When a major loss occurs, that data becomes potential evidence. The challenge is no longer whether the evidence exists, but whether it can be located, interpreted and presented within the timelines that claims and disputes demand. Manual review at this scale is rarely practical. This is where machine learning earns its place in the forensic engineer's toolkit.

A point Mamoon and Amir made early and returned to throughout is that the techniques in question are not generative AI. They are statistical and computational tools applied to time-series data, designed to flag deviations from expected behaviour. They organise and focus, but they do NOT conclude.

A common methodology across all cases

Each of the cases followed broadly the same workflow:

Data acquisition and cleaning: Multi-channel sensor data is collated, aligned on a unified timestamp, and corrected for missing values, misaligned readings and corrupted sections.

Baseline training: A clean, fault-free section of operational data is used to teach the model what stable behaviour looks like.

Anomaly detection: The remainder of the dataset is run against the baseline. Statistical deviations are flagged with confidence probabilities, producing a timeline of evidence rather than a single answer.

Engineering validation: The flagged periods are then investigated further and interpreted by the forensic engineer, cross-referenced with operational logs, physical evidence and metallurgical or electrical testing.

The model focuses the investigation. The engineer decides it. That division of work is the core principle CEERISK applies in every ML-assisted forensic engagement.
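The four steps above can be illustrated with a deliberately small sketch. It is not the tooling used in the cases: it assumes a single sensor channel, a simple mean and standard deviation baseline, and a normal-distribution confidence score. The sample readings, the function names and the 0.99 threshold are all hypothetical.

```python
from statistics import NormalDist, mean, stdev

def forward_fill(values):
    """Cleaning step: replace missing readings (None) with the last valid one.
    A leading None stays None, so the series should start with a valid reading."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def fit_baseline(clean_section):
    """Baseline step: learn mean and standard deviation from fault-free data."""
    return mean(clean_section), stdev(clean_section)

def flag_anomalies(values, mu, sigma, threshold=0.99):
    """Detection step: return (index, value, confidence) for deviating readings.
    Confidence is the share of the baseline normal distribution that lies
    closer to the mean than the reading does."""
    std_normal = NormalDist()
    flagged = []
    for i, v in enumerate(values):
        z = abs(v - mu) / sigma
        confidence = 2 * std_normal.cdf(z) - 1
        if confidence >= threshold:
            flagged.append((i, v, round(confidence, 4)))
    return flagged

# Hypothetical temperature channel (deg C); None marks a dropped reading.
raw = [70.1, 69.8, None, 70.3, 70.0, 69.9, 70.2, 70.1, 74.5, 75.2, 70.0]
series = forward_fill(raw)
mu, sigma = fit_baseline(series[:8])      # first 8 samples assumed fault-free
timeline = flag_anomalies(series[8:], mu, sigma)
for offset, value, conf in timeline:
    print(f"sample {8 + offset}: {value} C, confidence {conf}")
```

The output is a list of flagged samples for the engineer to examine, which mirrors the fourth step: the code produces candidates, and the validation happens outside it.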

Case 1: Gas turbine blade failure

The first case concerned a gas turbine at a Middle East power and desalination plant. During an operational run, the unit experienced an unexplained vibration shift, a cascade of alarms, and a trip. On disassembly, four rotor blades were found with liberated tips, and several others showed cracking.

Conventional forensic testing ruled out design, manufacturing and material defects. What it did identify, and could not immediately explain, was an ultra-fine gamma prime formation on the leading edges of the failed blades, indicating the microstructure had been re-solutioned during operation by extreme heat followed by rapid cooling. The metallurgy revealed what had weakened the blades, but not why or when.

The originating event lay months earlier in a dataset of recordings from more than seventy temperature and pressure sensors. The ML team applied an autoencoder model to compress nine months of multivariate sensor data and measure reconstruction error against a known-normal baseline. The output identified a clear inflection in the error curve approximately six months before the incident.

Cross-referenced with operational logs, that period coincided with a programme to recommission the turbines from heavy oil to natural gas. Repeated combustor housing failures during that programme had ejected unburnt fuel into the turbine, where it ignited on hot surfaces and prompted cold-water quenching. Each cycle re-solutioned the blade microstructure; cumulatively they generated the gamma prime fingerprint and a low-cycle fatigue failure mechanism.

The forensic conclusion was that the loss originated during commissioning rather than operation, with material implications for which insurance policy properly responded.
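The idea of measuring reconstruction error against a known-normal baseline can be sketched in a few lines. The example below is an assumption-laden stand-in, not the model used in the case: it replaces the neural autoencoder with a projection onto the first principal component, which is what a linear autoencoder with a single bottleneck unit converges to. The three simulated sensors, the drift of 10 units and the threshold margin of 10 are hypothetical.

```python
import math
import random

def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def leading_direction(centred, iterations=100):
    """Power iteration on the sample covariance matrix (first principal axis)."""
    n, d = len(centred), len(centred[0])
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w

def reconstruction_error(row, means, w):
    """Compress a reading to one latent value, rebuild it, return the mean squared error."""
    centred = [x - m for x, m in zip(row, means)]
    latent = sum(c * wi for c, wi in zip(centred, w))    # encode
    decoded = [latent * wi for wi in w]                  # decode
    return sum((c - r) ** 2 for c, r in zip(centred, decoded)) / len(row)

def simulate(rng, load, drift=0.0):
    """Three hypothetical sensors that all track plant load, plus small noise."""
    noise = [rng.uniform(-0.5, 0.5) for _ in range(3)]
    return [load + noise[0], 2.0 * load + noise[1], 0.5 * load + noise[2] + drift]

rng = random.Random(0)                                   # fixed seed for repeatability
baseline = [simulate(rng, rng.uniform(50, 100)) for _ in range(200)]
means = column_means(baseline)
w = leading_direction([[x - m for x, m in zip(r, means)] for r in baseline])
baseline_errors = [reconstruction_error(r, means, w) for r in baseline]
threshold = 10.0 * max(baseline_errors)                  # illustrative margin only

normal = [simulate(rng, rng.uniform(50, 100)) for _ in range(30)]
drifting = [simulate(rng, rng.uniform(50, 100), drift=10.0) for _ in range(10)]
errors = [reconstruction_error(r, means, w) for r in normal + drifting]
flagged = [i for i, e in enumerate(errors) if e > threshold]
print("first flagged sample:", flagged[0] if flagged else None)
```

While the sensors keep their learned relationship the error stays low. Once one sensor departs from it, the reading can no longer be rebuilt from the compressed form and the error rises. The first flagged sample plays the role of the inflection point, which is then taken to the operational logs.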

Case 2: Power transformer failure

The second case dealt with the question that arises in many transformer losses: was the failure the result of a sudden insured event, or the endpoint of progressive insulation degradation? The answer determines coverage, recovery options and, in disputed losses, the structure of the claim itself.

The dataset combined dissolved gas analysis (DGA) results (hydrogen, carbon monoxide, ethylene, acetylene and others) with operational electrical and thermal parameters captured over the asset's life.

The ML team applied a stacking ensemble classifier, trained to distinguish between normal states and specific fault conditions, including partial discharge and thermal overheating, on the basis of gas profiles and operational signatures. Rather than a binary anomaly flag, the output was a classified fault timeline: when the transformer first entered an anomalous state, the type of fault the data was consistent with, and the model's confidence in that classification.

That timeline allowed the forensic team to validate findings against post-incident physical inspection and to distinguish a discrete event from gradual deterioration with a precision that DGA review alone could not deliver.

Case 3: Business interruption and production loss

The third case sat outside the conventional "what failed and why" frame. It concerned a complex business interruption (BI) loss at a manufacturing plant where a major incident had reduced output of certain products without affecting others. The forensic accountants needed a defensible figure for the loss of production attributable to the insured event, one that separated event-driven impact from the noise of normal operational variation, ambient conditions and scheduled maintenance over multiple years.

The ML team built a Normal Behaviour Model (NBM) trained on five years of daily generation data, ambient conditions and maintenance logs. The model learned the expected production output of the plant under varying healthy operating conditions. Periods where measured output deviated from the model's predicted output were then isolated and assessed.

The result was a defensible BI calculation: a quantification of production loss strictly attributable to the insured event, separated from operational shortfalls caused by equipment degradation, external factors or scheduled activity. In a context where forensic accountants and engineers must work to a single narrative, that distinction is commercially significant.
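A Normal Behaviour Model of this kind can be reduced to a toy example to show the arithmetic. The sketch below assumes the simplest possible form: a straight-line fit of daily output against one ambient variable, trained on healthy days only. The webinar did not specify the model type, and all figures, names and the tolerance value here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def attributable_loss(days, intercept, slope, tolerance):
    """Sum the shortfall against expected output, skipping scheduled maintenance
    days and shortfalls small enough to be normal variation."""
    total = 0.0
    for ambient, actual, maintenance in days:
        if maintenance:
            continue                      # planned outage, not event-driven
        expected = intercept + slope * ambient
        shortfall = expected - actual
        if shortfall > tolerance:
            total += shortfall
    return total

# Hypothetical healthy history: ambient temperature (deg C) vs daily output (units).
healthy_ambient = [10, 15, 20, 25, 30]
healthy_output = [950, 925, 900, 875, 850]
intercept, slope = fit_line(healthy_ambient, healthy_output)

# Hypothetical claim period: (ambient, actual output, scheduled maintenance?)
claim_period = [(20, 900, False), (22, 700, False), (24, 0, True), (26, 650, False)]
loss = attributable_loss(claim_period, intercept, slope, tolerance=10.0)
print(f"attributable shortfall: {loss:.0f} units")
```

A real model would take many more inputs and would also need to separate gradual equipment degradation, which this sketch does not attempt. The comparison itself is the same: expected output under healthy conditions minus measured output, counted only where the gap is not explained by something else.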

Practical realities: data quality, limitations and admissibility

Part of the session was dedicated to the realities of doing this work to a defensible standard.

Data quality is the binding constraint

Real-world datasets contain missing readings, stuck sensors, misaligned timestamps and corrupted sections. In ML-assisted forensic work, the majority of project hours typically go on data cleaning and feature engineering, not on running the model. Underestimating that effort is the single most common cause of weak outputs.

Machine learning identifies correlation, NOT causation

A flagged anomaly is a candidate for examination, not a conclusion. Some fault types are difficult to detect from data alone. Others are obscured by competing operational variables. Engineering judgement is what converts a statistical signal into a defensible finding.

Defensibility matters

Outputs intended to support claims, arbitration or litigation cannot rely on black-box models. The methodology, the training data, the validation steps and the engineering reasoning must all be traceable. Mamoon and Amir's framing, “the model is a compass, not a conclusion”, captures the discipline required.

What the session set out to demonstrate

For insurers, reinsurers, brokers, lawyers and corporate risk owners working on complex industrial losses, the takeaway from the webinar was not that machine learning replaces traditional forensic engineering. It is that, applied with care, ML now allows forensic engineering to address questions of scale and timeline that were previously impractical to resolve (what changed in this asset's history, and when?) and to produce evidence that holds up under scrutiny.

The same methodology generalises to wind turbines, battery energy storage systems, and any asset class with sufficiently dense operational data. Where the question is data-heavy and the timeline matters, ML-assisted forensic investigation is increasingly the practical answer.

Watch the full webinar

Mamoon Alyah and Dr Amir Pourghorban take you through each case in detail, including the data, the methodology and the engineering validation behind every conclusion. Watch the full recording.

Working on a complex industrial loss?

If you would like to discuss how the approaches covered in the webinar might apply to a current matter, asset portfolio or claims dispute, CEERISK's forensic engineering and data sciences teams would be glad to help.