Package: RCTS 0.2.4

RCTS: Clustering Time Series While Resisting Outliers

Robust Clustering of Time Series (RCTS) clusters time series using both the classical and the robust interactive fixed effects framework. The classical framework is developed in Ando & Bai (2017) <doi:10.1080/01621459.2016.1195743>; the implementation in this package omits the SCAD penalty on the estimates of beta. The robust framework is developed in Boudt & Heyndels (2022) <doi:10.1016/j.ecosta.2022.01.002> and is robust against different kinds of outliers. The algorithm iteratively updates beta (the coefficients of the observable variables), group membership, and the latent factors (which can be common and/or group-specific) along with their loadings. The number of groups and factors can be estimated if they are unknown.
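
For orientation, the interactive fixed effects model that this description refers to can be sketched as below. The notation is an assumption made for illustration, not taken from the package manual: i indexes the N time series, t the T time points, g_i is the group of series i, f_t and lambda_i are the common factors and their loadings, and f_{g_i,t} and lambda_{g_i,i} are the group-specific factors and loadings. Beta is written here as group-specific, as in Ando & Bai (2017); the package may parameterise it differently.

```latex
y_{it} = x_{it}^{\top}\beta_{g_i}
       + \lambda_i^{\top} f_t
       + \lambda_{g_i,i}^{\top} f_{g_i,t}
       + \varepsilon_{it},
\qquad i = 1,\dots,N,\quad t = 1,\dots,T .
```

In matrix shorthand this corresponds to the decomposition Y = X*beta + LF + LgFg + error that appears in the help page titles.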

Authors: Ewoud Heyndels [aut, cre]

RCTS_0.2.4.tar.gz
RCTS_0.2.4.zip (r-4.5), RCTS_0.2.4.zip (r-4.4), RCTS_0.2.4.zip (r-4.3)
RCTS_0.2.4.tgz (r-4.4-any), RCTS_0.2.4.tgz (r-4.3-any)
RCTS_0.2.4.tar.gz (r-4.5-noble), RCTS_0.2.4.tar.gz (r-4.4-noble)
RCTS_0.2.4.tgz (r-4.4-emscripten), RCTS_0.2.4.tgz (r-4.3-emscripten)
RCTS.pdf | RCTS.html
RCTS/json (API)

# Install 'RCTS' in R:
install.packages('RCTS', repos = c('https://eh-in-r.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/eh-in-r/rcts/issues

Datasets:
  • X_dgp3 - The dataset X_dgp3 contains the values of the 3 observable variables on which Y_dgp3 is based.
  • Y_dgp3 - Y_dgp3 contains a simulated dataset for DGP 3.
  • df_results_example - An example for df_results. This dataframe contains the estimators for each configuration.
  • factor_group_true_dgp3 - factor_group_true_dgp3 contains the values of the true group factors on which Y_dgp3 is based.
  • g_true_dgp3 - g_true_dgp3 contains the true group memberships of the elements of Y_dgp3.
  • lambda_group_true_dgp3 - lambda_group_true_dgp3 contains the values of the loadings to the group factors on which Y_dgp3 is based.

On CRAN:

36 exports | 0.63 score | 53 dependencies | 215 downloads

Last updated 1 year ago from: f6a1ba4cdd. Checks: OK: 7. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Sep 09 2024
R-4.5-win | OK | Sep 09 2024
R-4.5-linux | OK | Sep 09 2024
R-4.4-win | OK | Sep 09 2024
R-4.4-mac | OK | Sep 09 2024
R-4.3-win | OK | Sep 09 2024
R-4.3-mac | OK | Sep 09 2024

Exports: adapt_pic_with_sigma2maxmodel, add_configuration, add_metrics, add_pic, calculate_best_config, calculate_error_term, calculate_lambda_group, calculate_PIC, calculate_sigma2, calculate_VCsquared, check_stopping_rules, create_data_dgp2, define_C_candidates, define_configurations, define_kg_candidates, define_number_subsets, estimate_algorithm, estimate_beta, estimate_factor_group, fill_rc, fill_rcj, get_best_configuration, get_convergence_speed, get_final_estimation, initialise_beta, initialise_clustering, initialise_commonfactorstructure_macropca, initialise_df_pic, initialise_df_results, initialise_rc, initialise_rcj, iterate, make_subsamples, parallel_algorithm, plot_VCsquared, update_g

Dependencies: cellWise, cli, colorspace, cpp11, DEoptimR, dplyr, fansi, farver, generics, ggplot2, glue, gridExtra, gtable, isoband, labeling, lattice, lifecycle, magrittr, MASS, Matrix, matrixStats, mgcv, munsell, mvtnorm, ncvreg, nlme, pcaPP, pillar, pkgconfig, plyr, purrr, R6, rbibutils, RColorBrewer, Rcpp, RcppArmadillo, Rdpack, reshape2, rlang, robustbase, rrcov, scales, shape, stringi, stringr, svd, tibble, tidyr, tidyselect, utf8, vctrs, viridisLite, withr

Readme and manuals

Help Manual

Help page | Topics
Adapts the object that contains PIC for all candidate C's and all subsamples with sigma2_max_model. | adapt_pic_with_sigma2maxmodel
When running the algorithm with a different number of observable variables than the number that is available, reformats X. (Mainly used for testing.) | adapt_X_estimating_less_variables
Adds the current configuration (number of groups and factors) to df_results. | add_configuration
Adds several metrics to df_results. | add_metrics
Fills in df_pic: adds a row with the calculated PIC for the current configuration. | add_pic
Calculates the PIC for the current configuration. | add_pic_parallel
Helper function in create_true_beta() for the option beta_true_heterogeneous_groups. (This is the default option.) | beta_true_heterogroups
Returns, for each candidate C, the best number of groups and factors, based on the PIC. | calculate_best_config
Calculates the error term Y - X*beta_est - LF - LgFg. | calculate_error_term
Helper function for update_g(). Calculates the errors for one of the possible groups a time series can be placed in. | calculate_errors_virtual_groups
Returns the estimated group factor structure. | calculate_FL_group_estimated
Calculates the true group factor structure. | calculate_FL_group_true
Calculates the factor loadings of the common factors. | calculate_lambda
Calculates the factor loadings of the group factors. | calculate_lambda_group
Calculates the group factor structure: the matrix product of the group factors and their loadings. | calculate_lgfg
Calculates the objective function for individual i and group k in order to estimate group membership. | calculate_obj_for_g
Determines the PIC (panel information criterion). | calculate_PIC
Calculates the first term of the PIC (panel information criterion). | calculate_PIC_term1
Calculates the sum of squared errors, divided by NT. | calculate_sigma2
Calculates sigma2maxmodel. | calculate_sigma2maxmodel
Helper function. Calculates part of the 4th term of the PIC. | calculate_TN_factor
Calculates VC², to determine the stability of the found number of groups and factors over the subsamples. | calculate_VCsquared
Helper function used in update_g(). | calculate_virtual_factor_and_lambda_group
Calculates W = Y - X*beta_est. It is used in the initialization step of the algorithm, to initialise the factor structures. | calculate_W
Calculates (the estimated value of) the matrix X*beta_est. | calculate_XB_estimated
Calculates the product X*beta_true. | calculate_XB_true
Calculates Z = Y - X*beta_est - LgFg. It is used to estimate the common factor structure. | calculate_Z_common
Calculates Z = Y - X*beta_est - LF. It is used to estimate the group factor structure. | calculate_Z_group
Checks the rules for stopping the algorithm, based on its convergence speed. | check_stopping_rules
Puts individuals in a separate "class zero" when their distance to all possible groups is larger than a certain threshold. | clustering_with_robust_distances
Used in generating simulated data with non-normal errors. | create_covMat_crosssectional_dependence
Creates an instance of DGP 2, as defined in Boudt and Heyndels (2022). | create_data_dgp2
Creates beta_true, which contains the true values of beta (= the coefficients of X). | create_true_beta
Defines the candidate values for C. | define_C_candidates
Constructs a dataframe whose rows contain all configurations that are included and for which the estimators will be estimated. | define_configurations
Defines the set of combinations of group-specific factors. | define_kg_candidates
Returns a vector with the indices of the subsets. Must start with zero. | define_number_subsets
Defines the object that will be used to define an initial clustering. | define_object_for_initial_clustering_macropca
Determines the parameters of the rho-function. | define_rho_parameters
Helper function in estimate_beta() for estimating beta_est. | determine_beta
Helper function for return_robust_lambdaobject(). | determine_robust_lambda
An example for df_results. This dataframe contains the estimators for each configuration. | df_results_example
Helper function to shorten code: are common factors being estimated? | do_we_estimate_common_factors
Helper function to shorten code: are group factors being estimated? | do_we_estimate_group_factors
Wrapper around the initialization and the estimation part of the algorithm, for one configuration. It is only used for the serialized algorithm. | estimate_algorithm
Estimates beta. | estimate_beta
Estimates common factor(s) F. | estimate_factor
Estimates group factors Fg. | estimate_factor_group
Solves a very specific issue with MacroPCA. | evade_crashes_macropca
Evades floating point errors. | evade_floating_point_errors
factor_group_true_dgp3 contains the values of the true group factors on which Y_dgp3 is based. | factor_group_true_dgp3
Fills in the optimized number of common factors for each C. | fill_rc
Fills in the optimized number of groups and group-specific factors for each C. | fill_rcj
Filters the dataframe on the requested configuration of group-specific factors. | final_estimations_filter_kg
g_true_dgp3 contains the true group memberships of the elements of Y_dgp3. | g_true_dgp3
Generates the true group factor structure, to use in simulations. | generate_grouped_factorstructure
Generates panel data Y for simulations. | generate_Y
Finds the first stable interval after the first unstable point. It then defines the value of C for the beginning, middle and end of this interval. | get_best_configuration
Defines the convergence speed. | get_convergence_speed
Returns the final clustering, based on the estimated number of groups and of common and group-specific factors. | get_final_estimation
Makes a dataframe (called "grid") with data (individualindex, timeindex, XT and LF) available. | grid_add_variables
Helper function in robustpca(). | handle_macropca_errors
Takes a dataframe as input (this will be "Y" or "to_divide") and filters out rows with NA. | handleNA
Removes NAs in LG (in function calculate_virtual_factor_and_lambda_group()). | handleNA_LG
Initialises the estimation of beta (the coefficients of the observable variables). | initialise_beta
Clusters time series in a dataframe with k-means. | initialise_clustering
Initialises the estimation of the common factors and their loadings. | initialise_commonfactorstructure_macropca
Initialises a dataframe which will contain the PIC for each configuration and for each value of C. | initialise_df_pic
Initialises a dataframe that will contain an overview of metrics for each estimated configuration (for example the adjusted Rand index). | initialise_df_results
Initialises rc. | initialise_rc
Initialises rcj. | initialise_rcj
Creates X (the observable variables) to use in simulations. | initialise_X
Wrapper around estimate_beta(), update_g(), and the estimation of the factor structures. | iterate
Returns the set of combinations of group factors for which the algorithm needs to run. | kg_candidates_expand
lambda_group_true_dgp3 contains the values of the loadings to the group factors on which Y_dgp3 is based. | lambda_group_true_dgp3
Wrapper around lmrob. | LMROB
Makes a dataframe with the PIC for each configuration and each candidate C. | make_df_pic_parallel
Makes a dataframe with information on each configuration. | make_df_results_parallel
Selects a subsample of the time series, and of the length of the time series. Based on this it returns a list with a subsample of Y, the corresponding subsample of X, and of the true group membership and factor structures if applicable. | make_subsamples
Calculates the norm of a matrix. | matrixnorm
Helper function in OF_vectorized3(). | OF_vectorized_helpfunction3
Calculates the objective function for the classical algorithm: used in iterate() and in local_search. | OF_vectorized3
Wrapper of the loop over the subsets, which in turn use the parallelised algorithm. | parallel_algorithm
Plots VC² along with the corresponding number of groups (orange), common factors (dark blue) and group factors of the first group (light blue). | plot_VCsquared
Helper function: prepares the object to perform robust PCA on. | prepare_for_robpca
RCTS | RCTS
Randomly reassigns individual(s) if there are empty groups. This can happen if the total number of time series is low compared to the number of desired groups. | reassign_if_empty_groups
Restructures X, which is a 3D array of dimensions (N,T,p), to a 2D matrix of dimensions (NxT,p). | restructure_X_to_order_slowN_fastT
Calculates robust loadings. | return_robust_lambdaobject
Uses robust PCA to estimate robust factors and loadings. | robustpca
Wrapper around the non-parallel algorithm, to estimate beta, group membership and the factor structures. | run_config
Scales X. | scaling_X
Helper function in update_g(), to calculate solve(FG x t(FG)) x FG. | solveFG
Shows the configurations for potential C's of the first stable interval (begin point, midpoint and end point). | tabulate_potential_C
Estimates group membership. | update_g
The dataset X_dgp3 contains the values of the 3 observable variables on which Y_dgp3 is based. | X_dgp3
Y_dgp3 contains a simulated dataset for DGP 3. | Y_dgp3
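
Several help page titles above describe plain matrix computations: the error term Y - X*beta_est - LF - LgFg, the sum of squared errors divided by NT, and the restructuring of X from (N,T,p) to (NxT,p). The base R sketch below reproduces those three computations on made-up data. It does not call any RCTS function; all object names, the dimensions, the single factor per structure, and the choice of one coefficient column per series are assumptions made for illustration only.

```r
# Illustrative only: base R on simulated data, no RCTS functions are called.
set.seed(1)
N <- 6; T_len <- 5; p <- 3                 # hypothetical panel dimensions
g <- rep(1:2, each = 3)                    # hypothetical group membership

X    <- array(rnorm(N * T_len * p), dim = c(N, T_len, p))  # observables, (N, T, p)
beta <- matrix(rnorm(p * N), nrow = p, ncol = N)           # one column per series (assumption)

# X*beta: for each series i, its T x p slice times its coefficient vector -> N x T
XB <- t(sapply(seq_len(N), function(i) X[i, , ] %*% beta[, i]))

# Common factor structure LF: loadings (N x 1) times one common factor (1 x T)
lambda   <- matrix(rnorm(N), N, 1)
F_common <- matrix(rnorm(T_len), 1, T_len)
LF <- lambda %*% F_common

# Group factor structure LgFg: one factor per group, one loading per series
F_group      <- matrix(rnorm(2 * T_len), 2, T_len)   # row k = factor of group k
lambda_group <- rnorm(N)
LgFg <- lambda_group * F_group[g, ]                  # row i = loading_i * factor of group g[i]

eps <- matrix(rnorm(N * T_len, sd = 0.1), N, T_len)
Y   <- XB + LF + LgFg + eps

# Error term and its mean square, as described in the help page titles
E      <- Y - XB - LF - LgFg
sigma2 <- sum(E^2) / (N * T_len)

# (N, T, p) -> (N*T, p), with N varying slowly and T varying fast
X_long <- matrix(aperm(X, c(2, 1, 3)), nrow = N * T_len, ncol = p)
```

With this ordering, the row of X_long for series i at time t is (i - 1) * T_len + t; whether the package uses exactly this layout is inferred only from the function name restructure_X_to_order_slowN_fastT.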