DataCafé
Changepoint Detection: Secret Weapon of the Data Scientist
How can we spot a change in a jet engine's vibration that might mean it's about to fail catastrophically? How can a service forecast adapt to unexpected changes brought about by a pandemic? How might we spot an increase in the rate of change of pollution in the atmosphere? The answer to all these questions is changepoints, or rather changepoint detection.
Common to all these systems is a set of ordered data, usually a time series of observations or measurements that may be noisy but follow some underlying pattern. When the world changes, the measurements can shift dramatically and the usual pattern is disrupted. Unless forecasts or failure-detection systems are updated quickly to account for such a shift in the measurement data, they will likely produce erroneous or unpredictable results.
Changepoints have many important applications in areas such as:
- Climatology
- Genetic sequencing
- Finance
- Medical imaging
- Forecasting in industry
We speak to statistician Dr. Rebecca Killick from Lancaster University about her work in changepoint detection and how it is a critical part of the statistical toolkit for analysing time series and other ordered data sets. In particular:
- In forecasting, where most methods work by extrapolating trends, it is essential to know whether a changepoint has occurred so that the model can be refitted on data from after the change.
- If a change in the underlying dynamics of a system causes a complex change in the observed output, this can often be detected as a changepoint. It might indicate a mechanical failure, an impending change in operation, or an unobserved event buried deep in a difficult-to-measure environment, like a nuclear reactor.
With interview guest Dr. Rebecca Killick, Associate Professor of Statistics at Lancaster University.
Further reading
- Rebecca Killick’s publications (via Lancaster University)
- Changepoints Overview Paper: changepoint: An R package for changepoint analysis (pdf via Journal of Statistical Software)
- R Package: changepoint: Methods for Changepoint Detection (R package via CRAN library)
- PELT algorithm paper: Optimal detection of changepoints with a linear computational cost (pdf via arXiv)
- Paper: Distinguishing Trends and Shifts from Memory in Climate Data (paper via American Meteorological Society)
- R Package: EnvCpt: Detection of Structural Changes in Climate and Environment Time Series (R package via CRAN library)
Some links above may require payment or login. We are not endorsing them or receiving any payment for mentioning them. They are provided as is. Often free versions of papers are available and we would encourage you to investigate.
Recording date: 10 June 2020
Interview date: 5 June 2020
Thanks for joining us in the DataCafé. You can follow us on Twitter @DataCafePodcast, and feel free to contact us about anything you've heard here or any topic you think would be interesting in the future.