Title: | Methods for Fast Multiple Change-Point/Break-Point Detection and Estimation |
---|---|
Description: | A developing software suite for multiple change-point and change-point-type feature detection/estimation (data segmentation) in data sequences. |
Authors: | Andreas Anastasiou [aut], Yining Chen [aut, cre], Haeran Cho [aut], Piotr Fryzlewicz [aut] |
Maintainer: | Yining Chen <[email protected]> |
License: | GPL-2 |
Version: | 2.5 |
Built: | 2025-01-22 04:41:48 UTC |
Source: | https://github.com/cran/breakfast |
A developing software suite for multiple change-point detection/estimation (data segmentation) in data sequences.
The current version implements methods for detecting changes in a data sequence modelled as (i) a piecewise-constant function plus i.i.d. Gaussian noise, (ii) a piecewise-constant function plus autoregressive time series, (iii) a piecewise-linear and continuous function plus i.i.d. Gaussian noise, and (iv) a piecewise-linear and discontinuous function plus i.i.d. Gaussian noise. Change-point detection in breakfast is carried out in two stages: first the computation of a solution path, then a model selection step along the path. A variety of solution path and model selection methods are included, which can be accessed individually or through breakfast.
Currently supported solution path methods are: sol.idetect, sol.idetect_seq, sol.wbs, sol.wbs2, sol.not, sol.tguh and sol.wcm.
Currently supported model selection methods are: model.ic, model.lp, model.sdll, model.thresh and model.gsa.
Check back in future versions for more change-point models and the corresponding methods.
We would like to thank Shakeel Gavioli-Akilagun, Anica Kostic, Shuhan Yang and Christine Yuen for their comments and suggestions that helped improve this package.
The vignette, available via browseVignettes(package = "breakfast"), contains a detailed comparative simulation study of the various methods implemented in breakfast for models (i), (iii) and (iv).
This function estimates the number and locations of change-points in a univariate data sequence, which is modelled as (i) a piecewise-constant function plus i.i.d. Gaussian noise, (ii) a piecewise-constant function plus autoregressive time series, (iii) a piecewise-linear and continuous function plus i.i.d. Gaussian noise, or (iv) a piecewise-linear and discontinuous function plus i.i.d. Gaussian noise. This is carried out via a two-stage procedure combining solution path generation and model selection methodologies.
breakfast( x, type = c("const", "lin.cont", "lin.discont"), solution.path = NULL, model.selection = NULL )
x |
A numeric vector containing the data to be processed |
type |
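The type of change-point model fitted to the data; currently supported models are: piecewise-constant signals ("const"), piecewise-linear and continuous signals ("lin.cont") and piecewise-linear and discontinuous signals ("lin.discont") |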
solution.path |
A string or a vector of strings containing the name(s) of solution path generating method(s);
if individual methods are accessed via this option, default tuning parameters are used.
Alternatively, you can directly access each solution path generating method via the corresponding sol.* function.
|
model.selection |
A string or a vector of strings containing the name(s) of model selection method(s);
if individual methods are accessed via this option, default tuning parameters are used.
Alternatively, you can directly access each model selection method via the corresponding model.* function.
|
Please also take a look at the vignette for tips/suggestions/examples of using the breakfast package.
An S3 object of class breakfast.cpts, which contains the following fields:
x |
Input vector x |
A list containing S3 objects of class cptmodel; each contains the following fields:
solution.path |
The solution path method used |
model.selection |
The model selection method used to return the final change-point estimators object |
no.of.cpt |
The number of estimated change-points in the piecewise-constant mean of the vector cptpath.object$x |
cpts |
The locations of estimated change-points in the piecewise-constant mean of the vector cptpath.object$x. These are the end-points of the corresponding constant-mean intervals |
est |
An estimate of the piecewise-constant mean of the vector cptpath.object$x; the values are the sample means of the data (replicated a suitable number of times) between each pair of consecutive detected change-points |
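The relationship between the change-point locations and the fitted mean can be reproduced by hand. The following is a minimal base R sketch, not the package's own code: piecewise_mean is a hypothetical helper, and it assumes that each change-point is the last index of its constant-mean segment, as stated above.

```r
# Hypothetical helper (not part of breakfast): piecewise-constant fit built
# from change-point locations, each taken to be the last index of a segment.
piecewise_mean <- function(x, cpts) {
  n <- length(x)
  ends <- unique(c(sort(cpts), n))      # segment end-points
  starts <- c(1, head(ends, -1) + 1)    # segment start-points
  est <- numeric(n)
  for (i in seq_along(ends)) {
    idx <- starts[i]:ends[i]
    est[idx] <- mean(x[idx])            # sample mean replicated over the segment
  }
  est
}

x <- c(1, 2, 3, 10, 12)
est <- piecewise_mean(x, cpts = 3)      # segments 1..3 and 4..5
print(est)
```

With no change-points the same helper returns the overall sample mean repeated length(x) times.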
A. Anastasiou & P. Fryzlewicz (2022) Detecting multiple generalized change-points by isolating single ones. Metrika, 85(2), 141–174.
R. Baranowski, Y. Chen & P. Fryzlewicz (2019) Narrowest-over-threshold detection of multiple change points and change-point-like features. Journal of the Royal Statistical Society: Series B, 81(3), 649–672.
H. Cho & C. Kirch (2022) Two-stage data segmentation permitting multiscale change points, heavy tails and dependence. Annals of the Institute of Statistical Mathematics, 74(4), 653–684.
H. Cho & P. Fryzlewicz (2024) Multiple change point detection under serial dependence: Wild contrast maximisation and gappy Schwarz algorithm. Journal of Time Series Analysis, 45(3), 479–494.
P. Fryzlewicz (2014) Wild binary segmentation for multiple change-point detection. The Annals of Statistics, 42(6), 2243–2281.
P. Fryzlewicz (2018) Tail-greedy bottom-up data decompositions and fast multiple change-point detection. The Annals of Statistics, 46(6B), 3390–3421.
P. Fryzlewicz (2020) Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection. Journal of the Korean Statistical Society, 49(4), 1027–1070.
f <- rep(rep(c(0, 1), each = 50), 10)
x <- f + rnorm(length(f)) * .5
breakfast(x)
Return a solution with the given number of change-points or change-point-type features from the solution path
model.fixednum(cptpath.object, fixednum = NULL)
cptpath.object |
A solution-path object, returned by a |
fixednum |
The number of change-points or change-point-type features |
The model selection method which returns results with a given number of change-points or change-point-type features. If there are multiple such elements on the solution path, the one with the smaller residual sum of squares will be returned. On the other hand, if no such element exists, an empty set (i.e. with no change-points) will be returned.
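The selection rule described above can be sketched in a few lines of base R. This is an illustration under stated assumptions rather than the package's implementation: rss_for and pick_fixednum are hypothetical names, the candidate models are supplied as a plain list of change-point vectors, and the fit on each segment is the sample mean (the piecewise-constant case).

```r
# Residual sum of squares of a piecewise-constant fit with the given
# change-points (each change-point is the last index of its segment).
rss_for <- function(x, cpts) {
  ends <- unique(c(sort(cpts), length(x)))
  starts <- c(1, head(ends, -1) + 1)
  sum(mapply(function(s, e) sum((x[s:e] - mean(x[s:e]))^2), starts, ends))
}

# Among candidates with exactly `fixednum` change-points, return the one
# with the smallest RSS; return an empty model if there is no such candidate.
pick_fixednum <- function(x, candidates, fixednum) {
  ok <- Filter(function(cp) length(cp) == fixednum, candidates)
  if (length(ok) == 0) return(integer(0))
  rss <- vapply(ok, function(cp) rss_for(x, cp), numeric(1))
  sort(ok[[which.min(rss)]])
}

x <- c(rep(0, 5), rep(4, 5), rep(0, 5))
candidates <- list(c(5), c(5, 10), c(3, 12), c(5, 10, 13))
two <- pick_fixednum(x, candidates, 2)    # two candidates qualify
four <- pick_fixednum(x, candidates, 4)   # no candidate qualifies
```

Here the candidate c(5, 10) fits the noiseless signal exactly, so it is preferred over c(3, 12).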
An S3 object of class cptmodel, which contains the following fields:
solution.path |
The solution path method used to obtain |
type |
The model type used, inherited from the given |
model.selection |
The model selection method used to return the final change-point or change-point-type feature estimators object, here its value is |
no.of.cpt |
The number of estimated features in the mean of the vector |
cpts |
The locations of estimated features in the mean of the vector |
est |
An estimate of the mean of the vector |
sol.idetect, sol.not, sol.tguh, sol.wbs, sol.wbs2, sol.wcm, breakfast
x <- c(rep(0, 100), rep(1, 100), rep(0, 100)) + rnorm(300)
model.fixednum(sol.wbs(x), 2)
model.fixednum(sol.not(x), 2)
This function estimates the number and locations of change-points in the piecewise-constant mean of a noisy data sequence with autoregressive noise via the gappy Schwarz algorithm, from a candidate model sequence generated by sol.wcm.
model.gsa(cptpath.object, p.max = 10, pen = log(length(cptpath.object$x))^1.01)
cptpath.object |
A solution-path object, returned by a |
p.max |
The maximum AR order. The default is |
pen |
Penalty used for the Schwarz criterion. |
Moving from the largest to the smallest (i.e. empty) candidate model generated by sol.wcm, the gappy Schwarz algorithm locally evaluates the Schwarz criterion (SC, under the piecewise-constant signal + AR(p) noise model, with the AR order p determined adaptively) and its modification SC0 on each segment determined by the next smallest candidate model. It selects the larger model as the final model if, over each segment, all newly introduced estimators are deemed 'significant' according to SC and SC0; see Cho and Fryzlewicz (2024) for details.
An S3 object of class cptmodel, which contains the following fields:
solution.path |
The solution path method used to obtain |
model.selection |
The model selection method used to return the final change-point estimators object, here its value is |
no.of.cpt |
The number of estimated change-points in the piecewise-constant mean of the vector |
cpts |
The locations of estimated change-points in the piecewise-constant mean of the vector |
est |
An estimate of the piecewise-constant mean of the vector |
H. Cho & P. Fryzlewicz (2024) Multiple change point detection under serial dependence: Wild contrast maximisation and gappy Schwarz algorithm. Journal of Time Series Analysis, 45(3), 479–494.
set.seed(111)
f <- rep(c(0, 5, 2, 8, 1, -2), c(100, 200, 200, 50, 200, 250))
x <- f + arima.sim(list(ar = c(.75, -.5), ma = c(.8, .7, .6, .5, .4, .3)), n = length(f), sd = 1)
model.gsa(sol.wcm(x))
This function estimates the number and locations of change-points or change-point-type features in the mean of a noisy data sequence via the sSIC (strengthened Schwarz information criterion) method.
model.ic(cptpath.object, alpha = 1.01, q.max = NULL)
cptpath.object |
A solution-path object, returned by a |
alpha |
The parameter associated with the sSIC. The default value is 1.01. Note that the SIC is recovered when alpha = 1. |
q.max |
The maximum number of features allowed. If nothing or |
The model selection method for algorithms that produce a nested solution path is described in "Wild binary segmentation for multiple change-point detection", P. Fryzlewicz (2014), The Annals of Statistics, 42: 2243–2281. The corresponding description for those that produce a non-nested solution set can be found in "Narrowest-over-threshold detection of multiple change points and change-point-like features", R. Baranowski, Y. Chen and P. Fryzlewicz (2019), Journal of the Royal Statistical Society: Series B, 81(3), 649–672.
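For a nested solution path, the idea behind sSIC-based selection can be illustrated in base R. The sketch below is not the package's code: ssic_select is a hypothetical name, and it assumes the piecewise-constant model with the criterion written as n/2 * log(RSS_k / n) + k * log(n)^alpha, the form used in Fryzlewicz (2014). How parameters are counted in the penalty for the other model types is not covered here.

```r
# Hypothetical sketch of sSIC selection along a nested solution path.
# `path` lists candidate change-points in decreasing order of importance;
# the model with k change-points uses the first k entries of `path`.
ssic_select <- function(x, path, alpha = 1.01, q.max = length(path)) {
  n <- length(x)
  rss_for <- function(cpts) {
    ends <- unique(c(sort(cpts), n))
    starts <- c(1, head(ends, -1) + 1)
    sum(mapply(function(s, e) sum((x[s:e] - mean(x[s:e]))^2), starts, ends))
  }
  ks <- 0:min(q.max, length(path))
  ssic <- vapply(ks, function(k) {
    n / 2 * log(rss_for(path[seq_len(k)]) / n) + k * log(n)^alpha
  }, numeric(1))
  k_best <- ks[which.min(ssic)]          # model size minimising the criterion
  sort(path[seq_len(k_best)])
}

# One true change at 50, plus a small deterministic perturbation so that
# the residual sum of squares is never exactly zero.
x <- c(rep(0, 50), rep(3, 50)) + rep(c(-0.1, 0.1), 50)
selected <- ssic_select(x, path = c(50, 25, 75))
```

Setting alpha = 1 in this sketch gives the ordinary SIC penalty, in line with the note on alpha above.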
An S3 object of class cptmodel, which contains the following fields:
solution.path |
The solution path method used to obtain |
type |
The model type used, inherited from the given |
model.selection |
The model selection method used to return the final change-point or change-point-type feature estimators object, here its value is |
no.of.cpt |
The number of estimated features in the mean of the vector |
cpts |
The locations of estimated features in the mean of the vector |
est |
An estimate of the mean of the vector |
P. Fryzlewicz (2014). Wild binary segmentation for multiple change-point detection. The Annals of Statistics, 42(6), 2243–2281.
R. Baranowski, Y. Chen & P. Fryzlewicz (2019). Narrowest-over-threshold detection of multiple change points and change-point-like features. Journal of the Royal Statistical Society: Series B, 81(3), 649–672.
sol.idetect, sol.not, sol.tguh, sol.wbs, sol.wbs2, breakfast
x <- c(rep(0, 100), rep(1, 100), rep(0, 100)) + rnorm(300)
model.ic(sol.wbs(x))
model.ic(sol.not(x))
This function estimates the number and locations of change-points in the piecewise-constant mean of a noisy data sequence via the localised pruning method, which performs a Schwarz criterion-based model selection on the given candidate set in a localised way.
model.lp( cptpath.object, min.d = 5, penalty = c("log", "polynomial"), pen.exp = 1.01, do.thr = TRUE, th.const = 0.5 )
cptpath.object |
A solution-path object, returned by a |
min.d |
A number specifying the minimal spacing between change points; |
penalty |
A string specifying the type of penalty term to be used in Schwarz criterion; possible values are:
|
pen.exp |
Exponent for the penalty term (see |
do.thr |
If |
th.const |
A constant multiplied to |
Further information can be found in Cho and Kirch (2022).
An S3 object of class cptmodel, which contains the following fields:
solution.path |
The solution path method used to obtain |
model.selection |
The model selection method used to return the final change-point estimators object, here its value is |
no.of.cpt |
The number of estimated change-points in the piecewise-constant mean of the vector |
cpts |
The locations of estimated change-points in the piecewise-constant mean of the vector |
est |
An estimate of the piecewise-constant mean of the vector |
H. Cho & C. Kirch (2022) Two-stage data segmentation permitting multiscale change points, heavy tails and dependence. Annals of the Institute of Statistical Mathematics, 74(4), 653–684.
sol.idetect, sol.idetect_seq, sol.not, sol.tguh, sol.wbs, sol.wbs2, breakfast
f <- rep(rep(c(0, 1), each = 50), 10)
x <- f + rnorm(length(f)) * .5
model.lp(sol.not(x))
This function estimates the number and locations of change-points in the piecewise-constant or piecewise-linear mean of a noisy data sequence via the Steepest Drop to Low Levels method.
model.sdll( cptpath.object, sigma = stats::mad(diff(cptpath.object$x)/sqrt(2)), universal = TRUE, th.const = NULL, th.const.min.mult = 0.3, lambda = 0.9 )
cptpath.object |
A solution-path object, returned by a |
sigma |
An estimate of the standard deviation of the noise in the data |
universal |
If |
th.const |
Only relevant if |
th.const.min.mult |
A fractional multiple of the threshold, used to decide the lowest magnitude of contrasts from |
lambda |
Only relevant if |
The Steepest Drop to Low Levels method is described in "Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection", P. Fryzlewicz (2020), Journal of the Korean Statistical Society, 49, 1027–1070.
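The default value of sigma in the usage of model.sdll is stats::mad(diff(cptpath.object$x)/sqrt(2)). The base R sketch below illustrates why this is a sensible noise estimate for a piecewise-constant mean with i.i.d. Gaussian noise: differencing removes the mean everywhere except at the jumps, dividing by sqrt(2) undoes the doubling of the variance caused by differencing, and the MAD is robust to the few large differences located at the jumps. The simulated signal and the variable names are illustrative only.

```r
set.seed(1)
f <- rep(rep(c(0, 1), each = 50), 20)    # piecewise-constant mean with 39 jumps
x <- f + rnorm(length(f), sd = 0.5)      # true noise standard deviation is 0.5

# Difference-based robust estimate, as in the default `sigma` argument
sigma_hat <- stats::mad(diff(x) / sqrt(2))

# Naive estimate, inflated because it also measures the jumps in the mean
sd_naive <- stats::sd(x)

print(c(sigma_hat = sigma_hat, sd_naive = sd_naive))
```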
An S3 object of class cptmodel, which contains the following fields:
solution.path |
The solution path method used to obtain |
type |
The model type used, inherited from the given |
model.selection |
The model selection method used to return the final change-point estimators object, here its value is |
no.of.cpt |
The number of estimated change-points |
cpts |
The locations of estimated change-points |
est |
An estimate of the mean of the vector |
P. Fryzlewicz (2020). Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection. Journal of the Korean Statistical Society, 49, 1027–1070.
sol.idetect, sol.idetect_seq, sol.not, sol.tguh, sol.wbs, sol.wbs2, breakfast
f <- rep(rep(c(0, 1), each = 50), 10)
x <- f + rnorm(length(f))
model.sdll(sol.wbs2(x))
This function estimates the number and locations of change-points in the piecewise-constant or piecewise-linear mean of a noisy data sequence via thresholding.
model.thresh(cptpath.object, sigma = NULL, th.const = NULL)
cptpath.object |
A solution-path object, returned by a |
sigma |
An estimate of the standard deviation of the noise in the data |
th.const |
A positive real number used to define the threshold for the detection process. The default used in the piecewise-constant model is 1.15, while in the piecewise-linear model, the value is taken equal to 1.4. |
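To give a sense of scale for th.const, the sketch below computes a threshold and applies it to a few contrast values. It rests on an assumption that is not stated in this manual: that the threshold has the form th.const * sigma * sqrt(2 * log(n)), as in Anastasiou and Fryzlewicz (2022). The contrast magnitudes and candidate locations are made up for illustration.

```r
n <- 1000          # length of the data sequence
sigma <- 1         # noise standard deviation (assumed known here)
th.const <- 1.15   # default quoted above for the piecewise-constant model

# Assumed threshold form: th.const * sigma * sqrt(2 * log(n))
threshold <- th.const * sigma * sqrt(2 * log(n))

# Hypothetical candidate locations with their contrast magnitudes
cands <- c(300, 800, 500, 120)
contrasts <- c(12.3, 7.9, 4.1, 2.2)

# Keep only the candidates whose contrast exceeds the threshold
kept <- sort(cands[contrasts > threshold])
print(threshold)
print(kept)
```

A larger th.const raises the threshold and therefore retains fewer change-points.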
An S3 object of class cptmodel, which contains the following fields:
solution.path |
The solution path method used to obtain |
type |
The model type used, inherited from the given |
model.selection |
The model selection method used to return the final change-point estimators object, here its value is |
no.of.cpt |
The number of estimated change-points |
cpts |
The locations of estimated change-points |
est |
An estimate of the mean of the vector |
sol.idetect, sol.idetect_seq, sol.not, sol.tguh, sol.wbs, sol.wbs2, breakfast
f <- rep(rep(c(0, 1), each = 50), 10)
x <- f + rnorm(length(f))
model.thresh(sol.idetect_seq(x))
Plot method for objects of class breakfast.cpts
## S3 method for class 'breakfast.cpts'
plot(x, display.data = TRUE, ...)
x |
a |
display.data |
if |
... |
currently not in use |
f <- rep(rep(c(0, 1), each = 50), 5)
x <- f + rnorm(length(f)) * .5
plot(breakfast(x, solution.path = 'all', model.selection = 'all'), display.data = TRUE)
plot(breakfast(x), display.data = FALSE)
Print method for objects of class breakfast.cpts
## S3 method for class 'breakfast.cpts'
print(x, by = c("method", "estimator"), ...)
x |
a |
by |
if |
... |
currently not in use |
f <- rep(rep(c(0, 1), each = 50), 5)
x <- f + rnorm(length(f)) * .5
print(breakfast(x, solution.path = 'all', model.selection = 'all'), by = 'method')
print(breakfast(x), by = 'estimator')
Print method for objects of class cptmodel
## S3 method for class 'cptmodel'
print(x, ...)
x |
a |
... |
currently not in use |
f <- rep(rep(c(0, 1), each = 50), 5)
x <- f + rnorm(length(f)) * .5
print(model.ic(sol.idetect(x)))
This function arranges all possible change-points in the mean of the input vector, or in its linear trend, in the order of importance, via the Isolate-Detect (ID) method. It is developed to be used with the sdll and information criterion (ic) model selection rules.
sol.idetect( x, type = "const", thr_ic_cons = 0.9, thr_ic_lin = 1.25, points = 3 )
x |
A numeric vector containing the data to be processed. |
type |
The model type considered. |
thr_ic_cons |
A positive real number with default value equal to 0.9. It is used to create the solution path for the piecewise-constant model. The lower the value, the longer the solution path. |
thr_ic_lin |
A positive real number with default value 1.25. Used to create the solution path if |
points |
A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, as described in the Isolate-Detect methodology. |
The Isolate-Detect method and its algorithm are described in "Detecting multiple generalized change-points by isolating single ones", A. Anastasiou & P. Fryzlewicz (2022), Metrika, https://doi.org/10.1007/s00184-021-00821-6.
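The role of the points argument can be pictured with a small base R sketch. It is an assumption-laden illustration, not the package's code: expanding_endpoints is a hypothetical helper, right-expanding intervals are taken to start at 1 and left-expanding intervals to end at n, and n is assumed to be at least points.

```r
# Hypothetical helper: end-points of right-expanding intervals [1, e] and
# start-points of left-expanding intervals [s, n], spaced `points` apart.
expanding_endpoints <- function(n, points = 3) {
  list(
    right_ends  = unique(c(seq(points, n, by = points), n)),
    left_starts = unique(c(seq(n - points + 1, 1, by = -points), 1))
  )
}

ep <- expanding_endpoints(10, points = 3)
print(ep$right_ends)    # intervals [1, 3], [1, 6], [1, 9], [1, 10]
print(ep$left_starts)   # intervals [8, 10], [5, 10], [2, 10], [1, 10]
```

A smaller value of points gives a finer sequence of expanding intervals and therefore more intervals to examine.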
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Locations of possible change-points in the mean of |
solution.set |
Empty list |
x |
Input vector |
type |
The input parameter |
cands |
Matrix of dimensions length( |
method |
The method used, which has value "idetect" here |
A. Anastasiou & P. Fryzlewicz (2022). Detecting multiple generalized change-points by isolating single ones. Metrika, https://doi.org/10.1007/s00184-021-00821-6.
sol.idetect_seq, sol.not, sol.wbs, sol.wbs2, sol.tguh
r3 <- rnorm(1000) + c(rep(0,300), rep(2,200), rep(-4,300), rep(0,200))
sol.idetect(r3)
This function arranges all possible change-points in the mean of the input vector, or in its linear trend, in the order of importance, via the Isolate-Detect (ID) method. It is developed to be used with the thresholding model selection rule.
sol.idetect_seq(x, type = "const", points = 4)
x |
A numeric vector containing the data to be processed |
type |
The model type considered. |
points |
A positive integer with default value equal to 4. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, as described in the Isolate-Detect methodology. |
The Isolate-Detect method and its algorithm are described in "Detecting multiple generalized change-points by isolating single ones", A. Anastasiou & P. Fryzlewicz (2022), Metrika, https://doi.org/10.1007/s00184-021-00821-6.
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Locations of possible change-points, arranged in decreasing order of change-point importance |
solution.set |
Empty list |
x |
Input vector |
type |
The input parameter |
cands |
Matrix of dimensions length( |
method |
The method used, which has value "idetect_seq" here |
A. Anastasiou & P. Fryzlewicz (2022). Detecting multiple generalized change-points by isolating single ones. Metrika, https://doi.org/10.1007/s00184-021-00821-6.
sol.idetect, sol.not, sol.wbs, sol.wbs2, sol.tguh
r3 <- rnorm(1000) + c(rep(0,300), rep(2,200), rep(-4,300), rep(0,200))
sol.idetect_seq(r3)
This function arranges all possible features (e.g. change in the mean, change in the slope, etc.) of the input vector in the order of importance, via the Narrowest-Over-Threshold (NOT) method.
sol.not(x, type = "const", M = 10000, systematic.intervals = TRUE, seed = NULL)
x |
A numeric vector containing the data to be processed |
type |
The model type considered. |
M |
The maximum number of all data sub-samples at the beginning of the algorithm. The default is
|
systematic.intervals |
When drawing the sub-intervals, whether to use a systematic (and fixed) or random scheme. The default is |
seed |
If a random scheme is used, a random seed can be provided so that every time the same sets of random sub-intervals would be drawn. The default is |
The Narrowest-Over-Threshold method and its algorithm are described in "Narrowest-over-threshold detection of multiple change points and change-point-like features", R. Baranowski, Y. Chen and P. Fryzlewicz (2019), Journal of the Royal Statistical Society: Series B, 81(3), 649–672.
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Empty list |
solution.set |
Locations of possible change-points in the mean of |
solution.set.th |
A list that contains threshold levels corresponding to the detections in |
x |
Input vector |
type |
The model type used, which is given in the input. If not given, the default is |
M |
Input parameter |
cands |
Matrix of dimensions length( |
method |
The method used, which has value "not" here |
R. Baranowski, Y. Chen & P. Fryzlewicz (2019). Narrowest-over-threshold detection of multiple change points and change-point-like features. Journal of the Royal Statistical Society: Series B, 81(3), 649–672.
sol.idetect, sol.idetect_seq, sol.tguh, sol.wbs, sol.wbs2
r3 <- rnorm(1000) + c(rep(0,300), rep(2,200), rep(-4,300), rep(0,200))
sol.not(r3)
This function arranges all possible change-points in the mean of the input vector in the order of importance, via the Tail-Greedy Unbalanced Haar method.
sol.tguh(x, type = "const", p = 0.01)
x |
A numeric vector containing the data to be processed |
type |
The model type considered. |
p |
Specifies the number of region pairs merged
in each pass through the data, as the proportion of all remaining region pairs. The default is
|
The Tail-Greedy Unbalanced Haar decomposition algorithm is described in "Tail-greedy bottom-up data decompositions and fast multiple change-point detection", P. Fryzlewicz (2018), The Annals of Statistics, 46, 3390–3421.
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Locations of possible change-points in the mean of |
solution.set |
Empty list |
x |
Input vector |
type |
Input parameter |
p |
Input parameter |
cands |
Matrix of dimensions length( |
method |
The method used, which has value "tguh" here |
P. Fryzlewicz (2018). Tail-greedy bottom-up data decompositions and fast multiple change-point detection. The Annals of Statistics, 46, 3390–3421.
sol.idetect, sol.idetect_seq, sol.not, sol.wbs, sol.wbs2
r3 <- rnorm(1000) + c(rep(0,300), rep(2,200), rep(-4,300), rep(0,200))
sol.tguh(r3)
This function arranges all possible change-in-mean features of the input vector in the order of importance, via the Wild Binary Segmentation (WBS) method.
sol.wbs(x, type = "const", M = 10000, systematic.intervals = TRUE, seed = NULL)
x |
A numeric vector containing the data to be processed |
type |
The model type considered. Currently |
M |
The maximum number of all data sub-samples at the beginning of the algorithm. The default is
|
systematic.intervals |
When drawing the sub-intervals, whether to use a systematic (and fixed) or random scheme. The default is |
seed |
If a random scheme is used, a random seed can be provided so that every time the same sets of random sub-intervals would be drawn. The default is |
The Wild Binary Segmentation algorithm is described in "Wild binary segmentation for multiple change-point detection", P. Fryzlewicz (2014), The Annals of Statistics, 42: 2243–2281.
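WBS is built on the CUSUM statistic evaluated over many sub-intervals of the data. As a minimal base R sketch of that building block (the function name cusum is hypothetical and this is not the package's implementation), the statistic on a single interval [s, e] and the location of its maximum can be computed as follows.

```r
# CUSUM statistic of x over [s, e], evaluated at every split point
# b = s, ..., e - 1 (b is the last index of the left part).
cusum <- function(x, s, e) {
  n <- e - s + 1
  b <- s:(e - 1)
  left_sum <- cumsum(x[s:e])[seq_along(b)]   # sums of x[s..b]
  total <- sum(x[s:e])
  len_l <- b - s + 1                         # length of the left part
  len_r <- e - b                             # length of the right part
  sqrt(len_r / (n * len_l)) * left_sum -
    sqrt(len_l / (n * len_r)) * (total - left_sum)
}

x <- c(rep(0, 50), rep(1, 50))               # noiseless step at index 50
s <- 1
e <- length(x)
stat <- abs(cusum(x, s, e))
b_hat <- (s:(e - 1))[which.max(stat)]        # most likely change location
print(b_hat)
```

In WBS the same computation is repeated over many sub-intervals (systematic or random, up to M of them), and the candidate change-points are ranked by the size of their maximal CUSUM values.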
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Locations of possible change-points in the mean of |
solution.set |
Empty list |
x |
Input vector |
type |
The input parameter |
M |
Input parameter |
cands |
Matrix of dimensions length( |
method |
The method used, which has value "wbs" here |
P. Fryzlewicz (2014). Wild binary segmentation for multiple change-point detection. The Annals of Statistics, 42(6), 2243–2281.
R. Baranowski, Y. Chen & P. Fryzlewicz (2019). Narrowest-over-threshold detection of multiple change points and change-point-like features. Journal of the Royal Statistical Society: Series B, 81(3), 649–672.
sol.idetect, sol.idetect_seq, sol.not, sol.tguh, sol.wbs2
r3 <- rnorm(1000) + c(rep(0,300), rep(2,200), rep(-4,300), rep(0,200))
sol.wbs(r3)
This function arranges all possible change-points in the mean of the input vector in the order of importance, via the Wild Binary Segmentation 2 method.
sol.wbs2(x, type = "const", M = 1000, systematic.intervals = TRUE, seed = NULL)
x |
A numeric vector containing the data to be processed. |
type |
The model type considered. |
M |
The maximum number of data sub-samples drawn at each recursive stage of the algorithm. The default is
|
systematic.intervals |
Whether data sub-intervals for CUSUM computation are drawn systematically (TRUE; start- and end-points taken from an approximately equispaced grid) or randomly (FALSE; obtained uniformly with replacement). The default is TRUE. |
seed |
If a random scheme is used, a random seed can be provided so that every time the same sets of random sub-intervals would be drawn. The default is |
The Wild Binary Segmentation 2 algorithm is described in "Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection", P. Fryzlewicz (2020), Journal of the Korean Statistical Society, 49, 1027–1070.
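The two interval-drawing schemes controlled by systematic.intervals and seed can be sketched in base R. This is an illustration only: draw_intervals is a hypothetical helper, and the specific grid construction (the largest equispaced grid whose number of point pairs does not exceed M) is an assumption, since the text above only says that the end-points come from an approximately equispaced grid.

```r
# Hypothetical helper: draw at most M sub-intervals of 1..n (n >= 2, M >= 1),
# either from an approximately equispaced grid or uniformly at random.
draw_intervals <- function(n, M, systematic = TRUE, seed = NULL) {
  if (systematic) {
    k <- 2
    while ((k + 1) * k / 2 <= M) k <- k + 1   # largest k with choose(k, 2) <= M
    grid <- unique(round(seq(1, n, length.out = k)))
    ints <- t(combn(grid, 2))                 # all pairs of grid points
  } else {
    if (!is.null(seed)) set.seed(seed)        # reproducible random draws
    a <- sample.int(n, M, replace = TRUE)
    b <- sample.int(n, M, replace = TRUE)
    ints <- cbind(pmin(a, b), pmax(a, b))
    ints <- ints[ints[, 1] < ints[, 2], , drop = FALSE]  # drop empty intervals
  }
  colnames(ints) <- c("start", "end")
  ints
}

sys <- draw_intervals(100, M = 10)                               # fixed scheme
rnd <- draw_intervals(100, M = 10, systematic = FALSE, seed = 1) # random scheme
print(sys)
```

The systematic scheme returns the same intervals on every call, whereas the random scheme is only reproducible when a seed is supplied.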
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Locations of possible change-points in the mean of |
solution.set |
Empty list |
x |
Input vector |
type |
Input parameter |
M |
Input parameter |
cands |
Matrix of dimensions length( |
method |
The method used, which has value "wbs2" here |
P. Fryzlewicz (2020). Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection. Journal of the Korean Statistical Society, 49, 1027–1070.
sol.idetect, sol.idetect_seq, sol.not, sol.tguh, sol.wbs
r3 <- rnorm(1000) + c(rep(0,300), rep(2,200), rep(-4,300), rep(0,200))
sol.wbs2(r3)
This function arranges all possible change-points in the mean of the input vector in the order of importance, via the Wild Contrast Maximisation (WCM) method, which builds on the solution path generated by Wild Binary Segmentation 2.
sol.wcm( x, type = "const", M = 100, min.d = NULL, Q = floor(log(length(x))^1.9), max.iter = 5 )
x |
A numeric vector containing the data to be processed. |
type |
The type of change-point models fitted to the data; currently the class of piecewise constant signals ( |
M |
The maximum number of data sub-samples drawn at each recursive stage of the algorithm. The default is |
min.d |
The minimum distance between candidate change-point estimators;
if |
Q |
The maximum number of allowable change-points.
The default is |
max.iter |
The maximum number of candidate change-point models considered; if a model with the number of change-point estimators exceeding |
The Wild Contrast Maximisation (WCM) algorithm generates a nested sequence of candidate models by identifying large gaps in the solution path generated by WBS2, which aids the model selection step in the presence of large random fluctuations due to serial dependence. See Cho and Fryzlewicz (2024) for further details.
An S3 object of class cptpath, which contains the following fields:
solutions.nested |
|
solution.path |
Locations of possible change-points in the mean of |
solution.set |
A list of candidate change-point models. Each model contains possible change-points in the mean of |
x |
Input vector |
type |
The type of the change-point model considered, which has value "const" here |
M |
Input parameter |
cands |
Matrix of dimensions |
method |
The method used, which has value "wcm" here |
H. Cho & P. Fryzlewicz (2024) Multiple change point detection under serial dependence: Wild contrast maximisation and gappy Schwarz algorithm. Journal of Time Series Analysis, 45(3), 479–494.
set.seed(111)
f <- rep(c(0, 5, 2, 8, 1, -2), c(100, 200, 200, 50, 200, 250))
x <- f + arima.sim(list(ar = c(.75, -.5), ma = c(.8, .7, .6, .5, .4, .3)), n = length(f), sd = 1)
sol.wcm(x)$solution.set