| Title: | Deep Learning–Based Changepoint Detection with Local Neural Models |
|---|---|
| Description: | Implementation of a deep learning–based changepoint detection algorithm designed for time series with smooth local fluctuations. The method fits localized feed‑forward neural networks to approximate the underlying smooth component and constructs a residual‑based detector that isolates abrupt structural changes. A fully data‑adaptive thresholding rule based on the empirical cumulative distribution function (ECDF), together with refinement procedures, yields accurate changepoint localization without parametric assumptions on noise or trend structure. |
| Authors: | Arman Azizyan [aut, cre], Abolfazl Safikhani [aut] |
| Maintainer: | Arman Azizyan <[email protected]> |
| License: | GPL-2 |
| Version: | 0.1.0 |
| Built: | 2026-05-31 08:07:01 UTC |
| Source: | https://github.com/armanazizyan/scancp |
Given a sequence of labels in {1, 2}, this function finds the best split point that maximizes the number of correctly assigned labels on each side. The left and right segments are each assigned their majority class.
best_split_free(labels)
labels |
Integer vector containing only 1 and 2. |
For each possible split point \(i\), the function computes:
left correct = max(#1_left, #2_left)
right correct = max(#1_right, #2_right)
and selects the split maximizing their sum.
A list with:
The optimal split location (between index and index + 1).
Maximum number of correctly assigned labels.
Majority class on the left segment.
Majority class on the right segment.
Proportion correctly assigned.
labels <- c(1,1,1,2,2,2)
best_split_free(labels)
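The split search described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name split_sketch and the names of the returned list elements are hypothetical, since the element names of the real return value are not shown here.

```r
# Hypothetical helper illustrating the majority-class split search.
split_sketch <- function(labels) {
  stopifnot(all(labels %in% c(1, 2)), length(labels) >= 2)
  n <- length(labels)
  # Class counts in labels[1:i] for each candidate split i = 1, ..., n - 1
  ones_left  <- cumsum(labels == 1)[-n]
  twos_left  <- cumsum(labels == 2)[-n]
  # Class counts in labels[(i + 1):n]
  ones_right <- sum(labels == 1) - ones_left
  twos_right <- sum(labels == 2) - twos_left
  # left correct + right correct, as in the formulas above
  score <- pmax(ones_left, twos_left) + pmax(ones_right, twos_right)
  i <- which.max(score)  # first maximizer if there are ties
  list(
    split       = i,
    correct     = score[i],
    left_class  = if (ones_left[i] >= twos_left[i]) 1 else 2,
    right_class = if (ones_right[i] >= twos_right[i]) 1 else 2,
    accuracy    = score[i] / n
  )
}

res <- split_sketch(c(1, 1, 1, 2, 2, 2))
```

Using cumulative counts keeps the search linear in the length of labels rather than recounting each side for every candidate split.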
Computes the hybrid detector statistic used in scanCP. The detector combines a ratio component and a difference component derived from the residual sum of squares (RSS) of:
two adjacent small-window MLP fits (window size w)
one large-window MLP fit (window size 2w)
The detector highlights locations where the large-window fit performs substantially worse than the two small-window fits, indicating a potential changepoint.
calc_detector(y, fit_mlp_res, w = 100, ma_window = w, use_abs_det = TRUE)
y |
Numeric vector. Original signal. |
fit_mlp_res |
List. Output of fit_mlp. |
w |
Integer. Window size used for the small-window MLP. |
ma_window |
Integer. Moving-average smoothing window applied to the detector. Defaults to w. |
use_abs_det |
Logical. Whether to take the absolute value of the detector. |
For each index i, the detector uses:
rss1: RSS of the small-window MLP on [i, i+w-1]
rss2: RSS of the small-window MLP on [i+w, i+2w-1]
rss.tot: RSS of the large-window MLP on [i, i+2w-1]
The ratio component is:
The difference component is:
The final detector is a convex combination:
Internal parameters a, b, and scale_01 are kept
fixed at their defaults for simplicity and stability.
A numeric vector of detector values of length n - 2w.
y <- rnorm(300)
fit <- fit_mlp(y, w = 100)
det <- calc_detector(y, fit, w = 100)
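To illustrate why comparing one large-window fit against two small-window fits highlights a changepoint, here is a self-contained sketch. It makes two loud assumptions: straight-line fits via stats::lm() stand in for the MLP fits, and the quantity computed is simply rss.tot - (rss1 + rss2), which is not the package's ratio/difference combination (those formulas are not reproduced here). The helper rss_lm is hypothetical.

```r
set.seed(1)
y <- c(rnorm(100, 0, 0.1), rnorm(100, 3, 0.1))  # one jump after index 100
w <- 30
n <- length(y)

# RSS of a straight-line fit; a stand-in for the MLP fits (assumption).
rss_lm <- function(v) {
  t <- seq_along(v)
  sum(stats::residuals(stats::lm(v ~ t))^2)
}

starts  <- seq_len(n - 2 * w)
rss1    <- sapply(starts, function(i) rss_lm(y[i:(i + w - 1)]))
rss2    <- sapply(starts, function(i) rss_lm(y[(i + w):(i + 2 * w - 1)]))
rss_tot <- sapply(starts, function(i) rss_lm(y[i:(i + 2 * w - 1)]))

# How much worse the large-window fit is than the two small-window fits.
excess <- rss_tot - (rss1 + rss2)

# The window start i maximizing the excess puts the jump between
# index i + w - 1 and index i + w.
cp_hat <- starts[which.max(excess)] + w - 1
```

The excess is largest when the jump sits exactly at the boundary between the two small windows, because each small window is then free of the jump while the large window still has to bridge it.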
Combines several univariate detector vectors into a single joint detector using L1, L2, or max aggregation. Optionally applies moving-average smoothing and computes contribution weights.
combine_detectors(
  det.lst,
  method = c("L1", "L2", "max"),
  ma_window = NULL,
  circular = FALSE,
  scale_contributions = TRUE,
  scale_joint = TRUE
)
det.lst |
List of numeric detector vectors (all same length). |
method |
One of "L1", "L2", or "max". |
ma_window |
Optional integer. If provided, each detector is smoothed using ma with this window. |
circular |
Logical. Whether smoothing wraps around. |
scale_contributions |
Logical. Whether to compute contribution weights. |
scale_joint |
Logical. Whether to scale the joint detector to [0,1]. |
A list containing:
Joint detector vector.
Scaled detector matrix.
Raw detector matrix.
Contribution weights (or NULL).
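The three aggregation rules can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's code: the helper combine_sketch is hypothetical, absolute values are taken before aggregating, no smoothing or [0,1] scaling is applied, and the contribution weights are defined here as each detector's share of the row total, which may differ from the package's definition.

```r
# Hypothetical helper: aggregate equal-length detector vectors row-wise.
combine_sketch <- function(det_list, method = c("L1", "L2", "max")) {
  method <- match.arg(method)
  D <- abs(do.call(cbind, det_list))  # one column per detector
  joint <- switch(method,
    L1  = rowSums(D),                 # sum of absolute values
    L2  = sqrt(rowSums(D^2)),         # Euclidean norm
    max = apply(D, 1, max)            # largest single detector
  )
  total <- rowSums(D)
  # Share of each detector in the row total (assumed definition);
  # rows that are all zero get zero contributions.
  contrib <- D / ifelse(total > 0, total, 1)
  list(joint = joint, contributions = contrib)
}

det_list <- list(x1 = c(0, 1, 2), x2 = c(3, 0, 4))
out_l1  <- combine_sketch(det_list, "L1")
out_l2  <- combine_sketch(det_list, "L2")
out_max <- combine_sketch(det_list, "max")
```

L1 rewards changes that appear in many dimensions at once, whereas max reacts to a strong change in a single dimension; L2 sits between the two.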
Extracts changepoints from a detector statistic using ECDF thresholding, spacing-curve analysis, and local refinement. Produces the corrected signal and its piecewise-constant component. This function does not perform global MLP smoothing.
decompose_signal_core(
  y,
  detector,
  w = 200,
  ma_window = 100,
  right_tail_cutoff = 0.95,
  left_tail_cutoff = 0.6,
  threshold = "auto",
  use_abs_det = TRUE,
  min_cp_distance = NULL,
  circular = FALSE,
  margin = NULL
)
y |
Numeric vector. Original signal. |
detector |
Numeric vector. Detector statistic produced by calc_detector. |
w |
Integer. Window size used in detector construction. |
ma_window |
Integer. Moving-average smoothing window for the detector and spacing curve. Defaults to 100. |
right_tail_cutoff |
Numeric. Upper ECDF cutoff for automatic threshold selection. Defaults to 0.95. |
left_tail_cutoff |
Numeric. Lower ECDF cutoff for automatic threshold selection. Defaults to 0.6. |
threshold |
Either "auto" or a numeric ECDF threshold. |
use_abs_det |
Logical. Whether to use the absolute value of the detector. |
min_cp_distance |
Integer. Minimum distance between detected peaks.
Defaults to |
circular |
Logical. Whether moving-average smoothing wraps around. |
margin |
Integer. Refinement margin around each changepoint.
Default |
The function identifies local maxima of the smoothed detector, evaluates their significance using the ECDF of the detector, optionally computes a spacing curve for automatic threshold selection, and refines each changepoint using a two-cluster k-means split within a local window.
A list containing:
The refined, piecewise-corrected signal.
Estimated piecewise-constant component.
Cumulative correction vector.
Refined changepoint locations.
ECDF values of selected peaks.
Estimated shifts at each changepoint.
Smoothed detector statistic.
Spacing-curve values (if threshold = "auto").
Final threshold applied.
Matrix of detected local maxima.
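The local refinement step (a two-cluster k-means split inside a window around a raw changepoint) might look like the sketch below. It is a simplified stand-in under stated assumptions: cp_raw and margin are made-up values, the window is taken as cp_raw ± margin, and the split search is written inline rather than calling any package function.

```r
set.seed(42)
y <- c(rnorm(100, 0, 0.1), rnorm(100, 3, 0.1))  # true jump after index 100

cp_raw <- 92   # hypothetical raw changepoint, slightly off
margin <- 20   # hypothetical refinement margin
idx <- max(1, cp_raw - margin):min(length(y), cp_raw + margin)

# Two-cluster k-means on the values inside the local window.
km <- stats::kmeans(y[idx], centers = 2, nstart = 5)
labels <- km$cluster
m <- length(labels)

# Pick the split that maximizes the number of labels agreeing with the
# majority class on each side.
score <- sapply(seq_len(m - 1), function(i) {
  left  <- labels[1:i]
  right <- labels[(i + 1):m]
  max(sum(left == 1), sum(left == 2)) + max(sum(right == 1), sum(right == 2))
})
cp_refined <- idx[which.max(score)]
```

Because the clustering uses only the signal values, the refinement does not depend on which cluster k-means happens to label 1 or 2.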
Applies the ECDF + spacing-curve thresholding method to a detector vector and returns raw changepoints (no correction).
detect_cp_ecdf(
  diff,
  w = 200,
  ma_window = 100,
  right_tail_cutoff = 0.95,
  left_tail_cutoff = 0.6,
  threshold = "auto",
  circular = FALSE
)
diff |
Numeric detector vector. |
w |
Integer. Window size used in detector construction. |
ma_window |
Integer. Moving-average window for smoothing. |
right_tail_cutoff |
Numeric. ECDF upper cutoff. |
left_tail_cutoff |
Numeric. ECDF lower cutoff. |
threshold |
"auto" or numeric ECDF threshold. |
circular |
Logical. Whether smoothing wraps around. |
A list containing:
Raw changepoint indices.
ECDF values of selected maxima.
Final threshold.
Matrix of local maxima.
Smoothed detector.
Spacing curve.
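As a rough illustration of ECDF thresholding of local maxima, the following base-R sketch scores each peak of a toy detector by its ECDF value and keeps those above a fixed threshold. The toy vector, the 0.85 threshold, and the simple peak finder are assumptions for illustration; smoothing and the spacing-curve step are omitted.

```r
# Toy detector values (made up for illustration).
det <- c(0.1, 0.3, 0.2, 0.1, 0.9, 0.2, 0.15, 0.4, 0.1, 0.05)

# Empirical CDF of the detector values.
Fn <- stats::ecdf(det)

# Strict interior local maxima: slope changes from positive to negative.
peaks <- which(diff(sign(diff(det))) == -2) + 1

# ECDF value of each peak, i.e. the fraction of detector values <= the peak.
peak_ecdf <- Fn(det[peaks])

# Keep peaks whose ECDF value reaches a fixed threshold (assumed value).
threshold <- 0.85
selected <- peaks[peak_ecdf >= threshold]
```

Working on the ECDF scale makes the threshold a probability in (0,1), so the same cutoff can be reused across detectors with different magnitudes.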
Fits a global MLP model to a corrected signal and computes:
smooth trend estimate
residuals
This function does NOT compute a piecewise-constant component.
That is handled entirely in decompose_signal_core().
fit_global_mlp(corrected_signal)
corrected_signal |
Numeric vector. Output of |
A list containing:
Global MLP smooth fit.
Residuals from the smooth fit.
The fitted RSNNS MLP model.
Fits two multilayer perceptron (MLP) models over rolling windows of a
univariate signal. A small-window MLP (window size w) captures
local structure, while a large-window MLP (window size 2w)
captures broader trends. Their residual behavior is used by
calc_detector to construct the changepoint detector.
All MLP hyperparameters are supplied through a unified
mlp_control list for consistency with scan_cp.
fit_mlp(vec, w = 100, mlp_control = list(), parallel = FALSE)
vec |
Numeric vector. The input signal. |
w |
Integer. Window size for the small-window MLP. The large-window MLP automatically uses window size 2w. |
mlp_control |
A named list of MLP hyperparameters. Any subset may be supplied; unspecified values fall back to defaults. Supported fields:
|
parallel |
Logical. If TRUE, the rolling-window fits are run in parallel on a cluster; if FALSE, they are run serially. |
Each rolling window is standardized before fitting. The returned matrices contain three columns:
standardized input x
fitted values y_hat
original indices
When parallel = TRUE, a cluster is created using
parallel::makeCluster().
When parallel = FALSE, the function falls back to
serial execution using %do%.
A list with two elements:
small: List of fitted values for each rolling window of size w.
large: List of fitted values for each rolling window of size 2w.
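The "any subset may be supplied; unspecified values fall back to defaults" behavior of mlp_control is commonly implemented with utils::modifyList(). The sketch below shows that pattern only; the field names size and maxit and their default values are hypothetical placeholders, not the package's documented fields.

```r
# Hypothetical defaults; the real supported fields are not listed here.
default_control <- list(size = 5, maxit = 100)

# Overlay user-supplied fields on the defaults.
merge_control <- function(mlp_control = list(), defaults = default_control) {
  utils::modifyList(defaults, mlp_control)
}

ctrl_default <- merge_control()                  # all defaults
ctrl_custom  <- merge_control(list(maxit = 50))  # override one field only
```

One caveat of this pattern is that modifyList() silently accepts unknown names, so a misspelled field would be added rather than rejected.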
Applies a simple two-sided moving average smoother to a numeric vector. This is used throughout the scanCP pipeline for stabilizing detector statistics and spacing curves.
ma(x, n = 5, circular = FALSE)
x |
Numeric vector to smooth. |
n |
Integer. Total number of neighbors used in the moving average. Must be >= 1. |
circular |
Logical. If TRUE, the smoothing wraps around the ends. |
This function is a thin wrapper around stats::filter() with a symmetric
moving average kernel. It is intentionally lightweight and dependency-free.
A numeric vector of the same length as x containing the smoothed
values. Endpoints may be NA if circular = FALSE.
x <- rnorm(100)
y <- ma(x, n = 5)
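A two-sided moving average built on stats::filter() can be written as below. This is a sketch assuming a uniform kernel rep(1/n, n); the helper name ma_sketch is hypothetical and the package's ma may differ in detail.

```r
# Hypothetical equivalent of a centered moving average with uniform weights.
ma_sketch <- function(x, n = 5, circular = FALSE) {
  stopifnot(n >= 1)
  # sides = 2 centers the kernel on each point; as.numeric() drops the
  # time-series class that stats::filter() returns.
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2, circular = circular))
}

out_open <- ma_sketch(1:7, n = 3)                   # endpoints are NA
out_circ <- ma_sketch(1:7, n = 3, circular = TRUE)  # wraps around the ends
```

With circular = FALSE the first and last floor(n/2) values have no complete window and come back as NA, which matches the endpoint behavior described above.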
Creates an interactive Plotly visualization showing multiple rolling-window MLP fits over a univariate signal. A slider allows the user to move through different window positions and compare model fits dynamically.
plot_mlp_fits_interactive(
  y,
  fit_mlp_res,
  t.chp.ind = NA,
  step = 50,
  start = 75,
  w = 100
)
y |
Numeric vector. The original signal. |
fit_mlp_res |
A list containing rolling MLP fit results. Expected to be a list of two lists, each containing matrices with fitted values. |
t.chp.ind |
Optional numeric vector of changepoint indices to draw as vertical dashed lines. |
step |
Integer. Step size for the slider increments. |
start |
Integer. Starting index for the first model window. |
w |
Integer. Window size used in the rolling MLP fitting. |
A Plotly object.
Runs the complete changepoint detection pipeline on a univariate signal. This includes rolling-window MLP fitting, detector construction, thresholding, structural decomposition, and changepoint extraction.
scan_cp(
  y,
  w = 100,
  ma_window = w,
  threshold = "auto",
  threshold_tails = c(0.2, 0.95),
  min_cp_distance = 2 * w,
  margin = floor(w/2),
  use_abs_det = TRUE,
  mlp_control = list(),
  parallel = FALSE
)
y |
Numeric vector. The input signal. |
w |
Integer. Window size for the small-window MLP used in fit_mlp. |
ma_window |
Integer. Window size for the moving-average smoothing applied to the detector statistic. Defaults to w. |
threshold |
Character or numeric. If "auto", the threshold is selected automatically; otherwise a numeric ECDF threshold. |
threshold_tails |
Numeric vector of length 2. Left and right tail cutoffs used when estimating the automatic threshold. Defaults to c(0.2, 0.95). |
min_cp_distance |
Integer. Minimum separation (in indices) between detected changepoints. Defaults to 2 * w. |
margin |
Integer. Margin used during local refinement of changepoint locations. Defaults to floor(w/2). |
use_abs_det |
Logical. Whether to use the absolute value of the detector statistic when identifying changepoints. |
mlp_control |
A named list of hyperparameters passed directly to fit_mlp. |
parallel |
Logical. If TRUE, the rolling-window MLP fits are run in parallel. |
The pipeline proceeds in three main stages:
Rolling-window MLP fitting via fit_mlp.
Detector construction via calc_detector.
Structural decomposition and changepoint extraction via
decompose_signal_core.
Detected changepoints are printed for user convenience and returned as part
of the output list. Parallel computation is optional and controlled by the
parallel argument.
A list containing:
changepoints: Estimated changepoint locations.
detector: The detector statistic computed by calc_detector.
fit_mlp: Rolling MLP fit results returned by fit_mlp.
decomposition: Full structural decomposition returned by decompose_signal_core.
# Minimal example
set.seed(1)
y <- c(rnorm(200, 0), rnorm(200, 3))
# Full pipeline (parallel disabled for CRAN)
res <- scan_cp(y, w = 20, parallel = FALSE)
Applies the univariate scan_cp pipeline independently to
each column of a multivariate signal. Each dimension is processed
separately (asynchronously), producing a list of univariate results.
scan_cp_multi_async(
  Y,
  w = 100,
  ma_window = w,
  threshold = "auto",
  threshold_tails = c(0.6, 0.95),
  min_cp_distance = 2 * w,
  margin = floor(w/2),
  use_abs_det = TRUE,
  mlp_control = list()
)
Y |
Numeric matrix (n × p). Each column is a signal dimension. |
w |
Integer. Window size for rolling MLPs and detector construction. Defaults to 100. |
ma_window |
Integer. Moving-average smoothing window for the detector. Defaults to w. |
threshold |
Either "auto" or a numeric ECDF threshold. |
threshold_tails |
Numeric vector of length 2 giving tail cutoff values for automatic thresholding. Defaults to c(0.6, 0.95). |
min_cp_distance |
Integer. Minimum distance between detected changepoints. Defaults to 2 * w. |
margin |
Integer. Local refinement margin. Defaults to floor(w/2). |
use_abs_det |
Logical. Whether to use the absolute value of the detector statistic. |
mlp_control |
List of parameters passed to |
A named list of length p, where each element is the output
of scan_cp applied to the corresponding column of Y.
Y <- cbind(
  x1 = c(rnorm(200), rnorm(200, 3)),
  x2 = c(rnorm(200), rnorm(200, -2))
)
res <- scan_cp_multi_async(Y, w = 100)
res$x1$changepoints
res$x2$changepoints
Implements the synchronized multivariate changepoint pipeline:
Fit univariate MLP-based detectors for each dimension.
Combine detectors into a joint detector using L1/L2/max.
Detect changepoints on the joint detector.
Compute per-dimension contributions at each detected CP.
scan_cp_multi_sync(
  Y,
  w = 100,
  method = c("L1", "L2", "max"),
  mlp_params = list(),
  detector_params = list(),
  combine_params = list(),
  ecdf_params = list()
)
Y |
Numeric matrix (n × p). Each column is a signal dimension. |
w |
Integer. Window size for rolling MLPs and detector. |
method |
Character. Combination method: "L1", "L2", or "max". |
mlp_params |
List of parameters passed to fit_mlp. |
detector_params |
List of parameters passed to calc_detector. |
combine_params |
List of parameters passed to combine_detectors. |
ecdf_params |
List of parameters passed to detect_cp_ecdf. |
A list containing:
List of univariate detectors (one per dimension).
Joint detector vector.
Matrix of per-dimension contributions.
Detected synchronized changepoints.
Contribution of each dimension at each CP.
Full output of detect_cp_ecdf().
Identifies the most prominent spike in a spacing curve derived from the ECDF of a smoothed detector statistic. This is used to automatically determine a significance threshold for changepoint selection.
select_best_spike(s, right_tail_cutoff = 0.95, left_tail_cutoff = 0.6)
s |
Numeric vector. Smoothed spacing curve. |
right_tail_cutoff |
Numeric in (0,1). Exclude spikes with ECDF probability above this value (typically near 1). |
left_tail_cutoff |
Numeric in (0,1). Exclude spikes with ECDF probability below this value (to avoid trivial early spikes). |
Uses pracma::findpeaks() to identify local maxima. Prominence is computed
as:
The spike with the largest prominence within the allowed ECDF range is selected.
A numeric value in (0,1) representing the selected ECDF threshold,
or NA if no valid spike is found.
s <- runif(100)
select_best_spike(s)
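The idea of picking the most prominent spike inside an allowed ECDF range can be sketched without pracma as follows. Several things here are assumptions made for illustration: the helper spike_sketch is hypothetical, the spacing curve is assumed to sit on an evenly spaced probability grid i / length(s), and prominence is taken as peak height minus the higher of the minima on either side, which is a simplification and not necessarily the package's formula.

```r
# Hypothetical helper: most prominent spike within [left, right] cutoffs.
spike_sketch <- function(s, right_tail_cutoff = 0.95, left_tail_cutoff = 0.6) {
  m <- length(s)
  p <- seq_len(m) / m  # assumed ECDF probability attached to each entry
  # Strict interior local maxima (flat-topped peaks are not detected).
  peaks <- which(diff(sign(diff(s))) == -2) + 1
  keep <- peaks[p[peaks] >= left_tail_cutoff & p[peaks] <= right_tail_cutoff]
  if (length(keep) == 0) return(NA_real_)
  # Simplified prominence: height above the higher of the two side minima.
  prom <- sapply(keep, function(i) s[i] - max(min(s[1:i]), min(s[i:m])))
  p[keep[which.max(prom)]]
}

s <- rep(0.1, 10)
s[3] <- 0.9  # large spike, but below the left cutoff (p = 0.3)
s[7] <- 0.8  # spike inside the allowed range (p = 0.7)
thr <- spike_sketch(s)
no_spike <- spike_sketch(rep(0.1, 10))
```

In this toy curve the taller spike is ignored because it falls below left_tail_cutoff, so the threshold comes from the spike at probability 0.7.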
Generate a univariate signal composed of:
a smooth deterministic component,
a piecewise-constant step component defined by changepoint indices,
additive Gaussian noise.
simulate_piecewise_signal_idx(
  n = 1000,
  domain = c(-4, 4),
  changepoints_idx = c(300, 700),
  shift_sizes = c(0.5, -0.3, 0.7),
  noise_sd = 0.04,
  smooth_fun = function(x) 0.01 * (3 * x/2 - x^3/2),
  seed = NULL
)
n |
Integer. Length of the signal. |
domain |
Numeric vector of length 2 giving the range of the time axis. |
changepoints_idx |
Integer vector of changepoint locations (indices).
Must lie strictly inside |
shift_sizes |
Numeric vector giving the mean level of each segment. Must have length equal to length(changepoints_idx) + 1. |
noise_sd |
Numeric. Standard deviation of the Gaussian noise. |
smooth_fun |
Function. A function |
seed |
Optional integer. If supplied, sets the random seed for reproducibility. |
This function is useful for benchmarking changepoint detection algorithms and reproducing controlled simulation studies.
A list with components:
Time grid of length n.
Smooth component.
Piecewise-constant step component.
Final noisy signal.
Sorted changepoint indices.
Segment means.
List of simulation parameters.
sim <- simulate_piecewise_signal_idx(
  n = 1000,
  changepoints_idx = c(300, 700),
  shift_sizes = c(0.5, -0.3, 0.7),
  noise_sd = 0.05,
  seed = 123
)
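The three-part construction (smooth component, step component, Gaussian noise) can be reproduced in base R roughly as shown below. This is a sketch, not the package's code: the helper simulate_sketch and its returned element names are hypothetical, and it assumes each changepoint index is the last index of its segment.

```r
# Hypothetical re-implementation of the signal construction.
simulate_sketch <- function(n = 1000, domain = c(-4, 4),
                            changepoints_idx = c(300, 700),
                            shift_sizes = c(0.5, -0.3, 0.7),
                            noise_sd = 0.04,
                            smooth_fun = function(x) 0.01 * (3 * x / 2 - x^3 / 2),
                            seed = NULL) {
  stopifnot(length(shift_sizes) == length(changepoints_idx) + 1)
  if (!is.null(seed)) set.seed(seed)
  cps <- sort(changepoints_idx)
  x <- seq(domain[1], domain[2], length.out = n)  # time grid
  smooth <- smooth_fun(x)                         # smooth component
  # Segment lengths from consecutive changepoints; each changepoint index
  # is treated as the last index of its segment (assumption).
  step <- rep(shift_sizes, times = diff(c(0, cps, n)))
  y <- smooth + step + stats::rnorm(n, sd = noise_sd)
  list(x = x, smooth = smooth, step = step, y = y, changepoints_idx = cps)
}

sim_sketch <- simulate_sketch(seed = 123)
```

Keeping the smooth and step components in the output makes it straightforward to compare an estimated decomposition against the known truth.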