Title: | Fitting Latent Class Vector-Autoregressive (VAR) Models |
---|---|
Description: | Estimates latent class vector-autoregressive models via EM algorithm on time-series data for model-based clustering and classification. Includes model selection criteria for selecting the number of lags and clusters. |
Authors: | Anja Ernst [aut, cre], Jonas Haslbeck [aut] |
Maintainer: | Anja Ernst <[email protected]> |
License: | GPL-2 |
Version: | 0.0.9 |
Built: | 2025-02-12 06:12:39 UTC |
Source: | https://github.com/aniebee/clustervar |
Extracts the coefficients of a given model from the output object of the LCVAR
function.
## S3 method for class 'ClusterVAR'
coef(object, Model, ...)
object |
An output object of the LCVAR function. |
Model |
An integer vector specifying the model for which coefficients should be shown. For example, |
... |
Pass additional arguments. |
Lags |
An integer vector specifying for which model the coefficients are shown. For example, |
Classification |
The crisp classification for each individual into a cluster. Crisp classifications are made based on an individual's modal cluster membership probabilities. |
VAR_coefficients |
The cluster-wise vector-autoregressive coefficients for each cluster. |
Exogenous_coefficients |
The cluster-wise exogenous coefficients for each cluster. The first column in each array indicates the (conditional) within-person mean in that cluster. If exogenous variable(s) were specified, the other columns indicate the influences of the exogenous variable(s) in that cluster. |
Sigma |
The cluster-wise innovation covariance matrix for each cluster. |
Proportions |
The mixing proportions for each cluster. These can be considered as the proportion of individuals belonging to the respective cluster. |
PredictableTimepoints |
The total number of time-points in this dataset that could be predicted because the previous time-point(s) were observed. In case of unequal lags across different clusters, the number of time-points for each person are weighted by their posterior cluster-membership probability. See also |
Converged |
A logical value that indicates whether this model converged. |
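As a rough illustration of how the Classification and Proportions components relate to posterior cluster-membership probabilities, the following base-R sketch uses a made-up probability matrix. The object `post` and its values are hypothetical and are not returned by the package, and averaging the membership probabilities is one common estimate of mixing proportions in mixture models, not necessarily the package's internal computation.

```r
# Hypothetical posterior cluster-membership probabilities:
# one row per individual, one column per cluster (each row sums to 1).
post <- matrix(c(0.90, 0.10,
                 0.20, 0.80,
                 0.55, 0.45,
                 0.05, 0.95),
               ncol = 2, byrow = TRUE,
               dimnames = list(paste0("person", 1:4),
                               c("cluster1", "cluster2")))

# Crisp classification: assign each individual to the cluster with the
# highest (modal) membership probability.
classification <- apply(post, 1, which.max)

# Mixing proportions (assumed estimator): the average membership
# probability per cluster.
proportions <- colMeans(post)

classification
proportions
```

In this toy example, persons 1 and 3 are assigned to cluster 1 and persons 2 and 4 to cluster 2, even though person 3's membership is far less certain than person 1's.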
Anja Ernst & Jonas Haslbeck
LCVAR_outExample <- LCVAR(Data = ExampleData, yVars = 1:4, ID = 5, Beep = 6,
                          xContinuous = 7, xFactor = 8, Clusters = 2, Lags = 1:2,
                          smallestClN = 3, Cores = 2, RndSeed = 3, Rand = 2, it = 25)
coef(LCVAR_outExample, Model = c(1, 1))
coef(LCVAR_outExample, Model = c(2, 2))
Function to fit a Latent Class VAR model with a given number of latent classes.
LCVAR(Data, yVars, Beep, Day = NULL, ID, xContinuous = NULL, xFactor = NULL,
      Clusters, Lags, Center = FALSE, smallestClN = 3, Cores = 1, RndSeed = NULL,
      Rand = 50, Rational = TRUE, Initialization = NULL, SigmaIncrease = 10,
      it = 50, Conv = 1e-05, pbar = TRUE, verbose = TRUE,
      Covariates = "equal-within-clusters", ...)
Data |
The data provided in a data.frame. |
yVars |
An integer vector specifying the position of the column(s) in dataframe |
Beep |
An integer specifying the position of the column in dataframe |
Day |
Optional argument. An integer specifying the position of the column in dataframe |
ID |
An integer specifying the position of the column in dataframe |
xContinuous |
Optional argument. An integer vector specifying the position of the column(s) in dataframe Data that contain the continuous exogenous variable(s), if present. Exogenous variables are also known as covariates or as moderators for the within-person mean. |
xFactor |
Optional argument. An integer vector specifying the position of the column(s) in dataframe Data that contain the categorical exogenous variable(s), if present. Exogenous variables are also known as covariates or as moderators for the within-person mean. |
Clusters |
An integer or integer vector specifying the numbers of latent classes (i.e., clusters) for which LCVAR models are to be calculated. |
Lags |
An integer or integer vector specifying the number of VAR(p) lags to consider. Must be a sequence of consecutive integers. The maximum number supported is |
Center |
Logical, indicating whether the data (i.e., the endogenous variables) should be centered per person before calculations. If |
smallestClN |
An integer specifying the lowest number of individuals allowed in a cluster. When during estimation the crisp cluster membership of a cluster indicates less than |
Cores |
A positive integer specifying the number of cores used to parallelize the computations. Using more of the available cores can speed up computation. Defaults to 1. |
RndSeed |
Optional argument. An integer specifying the value supplied to |
Rand |
The number of pseudo-random EM-starts used in fitting each possible model. For pseudo-random starts, K individuals are randomly selected as cluster centres. Individuals are then assigned to the cluster whose centre is closest to their individual VAR and individual covariate coefficients. High numbers (e.g., 50 and above) make it more likely that a global optimum is found, but take longer to compute. Defaults to 50. |
Rational |
Logical, indicating whether a rational EM-start should be used in addition to the other EM-starts. Defaults to TRUE. |
Initialization |
Optional argument. An integer specifying the position of a column in dataframe |
SigmaIncrease |
A numerical value specifying the amount by which every element of Sigma is increased when posterior probabilities of cluster memberships are reset. Defaults to 10. |
it |
An integer specifying the maximum number of EM-iterations allowed for every EM-start. After completing |
Conv |
A numerical value specifying the convergence criterion for the log-likelihood, used to determine whether an EM-start has converged. For details see Ernst et al. (2020). Defaults to 1e-05. |
pbar |
If |
verbose |
If |
Covariates |
Constraints on the parameters of the exogenous variable(s). So far only |
... |
Additional arguments passed to the function. |
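To make the Center argument more concrete, here is a minimal base-R sketch of person-mean centering on a toy data frame. The column names (`id`, `y1`) and the values are invented for illustration, and the package may implement centering differently.

```r
# Toy long-format data: two persons with three time-points each
# (hypothetical values, not part of the package).
toy <- data.frame(id = rep(c("a", "b"), each = 3),
                  y1 = c(1, 2, 3, 10, 20, 30))

# Person-specific means, repeated for every row of that person.
person_mean <- ave(toy$y1, toy$id,
                   FUN = function(v) mean(v, na.rm = TRUE))

# Subtract each person's own mean from that person's observations.
toy$y1_centered <- toy$y1 - person_mean

toy
```

After centering, every person's series has mean zero, so differences in level between persons no longer enter the clustering.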
This function estimates the latent class vector-autoregressive model to obtain latent classes (i.e., clusters) of individuals who are similar in VAR coefficients and (if specified) in within-person means and influences of exogenous variable(s).
The model for cluster k has three components. An m x 1 vector contains the cluster-wise conditional within-person mean for each y-variable in cluster k. An m x q matrix expresses the cluster-wise moderating influence of the q exogenous variables on the within-person means in cluster k. For each lag a, an m x m matrix contains the cluster-wise VAR coefficients for cluster k. See the references below for details.
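The three components described above can be combined into a single equation. The following is a sketch for individual i in cluster k at time t, with symbol names chosen here for illustration (an assumption, not necessarily the notation of Ernst et al., 2020) and with innovations assumed to be multivariate normal:

```latex
\mathbf{y}_{i,t}
  = \boldsymbol{\mu}_{k}
  + \mathbf{B}_{k}\,\mathbf{x}_{i,t}
  + \sum_{a=1}^{p_k} \boldsymbol{\Phi}_{k,a}\,\mathbf{y}_{i,t-a}
  + \boldsymbol{\varepsilon}_{i,t},
\qquad
\boldsymbol{\varepsilon}_{i,t} \sim \mathcal{N}\!\left(\mathbf{0},\, \boldsymbol{\Sigma}_{k}\right)
```

In this sketch, $\boldsymbol{\mu}_{k}$ is the m x 1 vector of conditional within-person means, $\mathbf{B}_{k}$ is the m x q matrix of exogenous effects acting on the q x 1 vector $\mathbf{x}_{i,t}$, $\boldsymbol{\Phi}_{k,a}$ is the m x m matrix of VAR coefficients at lag a, $p_k$ is the number of lags in cluster k, and $\boldsymbol{\Sigma}_{k}$ is the cluster-wise innovation covariance matrix.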
An object of class 'ClusterVAR' providing several LCVAR models. The details of the output components are as follows:
Call |
A list of arguments from the original function call. |
All_Models |
All LCVAR models across all numbers of clusters, lag combinations, and EM-starts. |
Runtime |
The runtime the function took to complete. |
Anja Ernst
Ernst, A. F., Albers, C. J., Jeronimus, B. F., & Timmerman, M. E. (2020). Inter-individual differences in multivariate time-series: Latent class vector-autoregressive modeling. European Journal of Psychological Assessment, 36(3), 482–491. doi:10.1027/1015-5759/a000578
head(SyntheticData)
LCVAR_outExample1 <- LCVAR(Data = SyntheticData, yVars = 1:4, ID = 5, Beep = 9,
                           Day = 10, xContinuous = 7, xFactor = 8,
                           Clusters = 1:2, Lags = 1, Center = TRUE,
                           Cores = 2, # Adapt to local machine
                           RndSeed = 123, Rand = 1, it = 25)
summary(LCVAR_outExample1)
summary(object = LCVAR_outExample1, show = "GNL", Number_of_Lags = 1)
coef(LCVAR_outExample1, Model = c(1, 1))

head(ExampleData)
LCVAR_outExample2 <- LCVAR(Data = ExampleData, yVars = 1:4, ID = 5, Beep = 6,
                           xContinuous = 7, xFactor = 8, Clusters = 1:2,
                           Lags = 1:2, Center = FALSE, Cores = 2, RndSeed = 123,
                           Rand = 1, it = 25, Conv = 1e-05)
summary(LCVAR_outExample2)
summary(object = LCVAR_outExample2, show = "GNL", Number_of_Lags = 1)
summary(object = LCVAR_outExample2, show = "GNC", Number_of_Clusters = 2)
coef(LCVAR_outExample2, Model = c(1, 1))
plot(LCVAR_outExample2, show = "specific", Model = c(1, 1))

LCVAR_outExample3 <- LCVAR(Data = ExampleData,
                           yVars = c("Item1", "Item2", "Item3", "Item4"),
                           ID = "Person", Beep = "Timepoint",
                           xContinuous = "ContiniousVariable",
                           xFactor = "CategoricalVariable",
                           Clusters = 1:2, Lags = 1:2, Center = FALSE, Cores = 2,
                           RndSeed = 123, Rand = 1, it = 25, Conv = 1e-05)
plot(LCVAR_outExample3, show = "GNL", Number_of_Lags = 1)
plot(LCVAR_outExample3, show = "GNC", Number_of_Clusters = 2)
numberPredictableObservations is a function to determine the number of observations in a given dataset that can be predicted based on the availability of previous observations, considering a specified time-lag.
numberPredictableObservations(Data, yVars, Beep, Day = NULL, ID,
                              xContinuous = NULL, xFactor = NULL, Lags, ...)
Data |
The data provided in a data.frame. |
yVars |
An integer vector specifying the position of the column(s) in dataframe |
Beep |
An integer specifying the position of the column in dataframe |
Day |
Optional. An integer specifying the position of the column in dataframe |
ID |
An integer specifying the position of the column in dataframe |
xContinuous |
Optional argument. An integer vector specifying the position of the column(s) in dataframe Data that contain the continuous exogenous variable(s), if present. Exogenous variables are also known as covariates or as moderators for the within-person mean. |
xFactor |
Optional argument. An integer vector specifying the position of the column(s) in dataframe Data that contain the categorical exogenous variable(s), if present. Exogenous variables are also known as covariates or as moderators for the within-person mean. |
Lags |
An integer or integer vector specifying the number of VAR(p) lags to consider. Must be a sequence of consecutive integers. The maximum number supported is |
... |
Additional arguments passed to the function. |
This function determines the number of observations in a given dataset that can be predicted based on previous observations. For instance, in a lag-1 model, if an observation is missing, the observation at the next time-point cannot be predicted. Similarly, in a lag-2 model, if an observation is missing, the observations at the next two time-points cannot be predicted. The output gives the number of predictable observations for each of the endogenous variables that were specified under yVars. The number of predictable observations is the same for all endogenous variables.
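The counting rule can be sketched in a few lines of base R for a single person and a single series. The helper `count_predictable()` and the toy vector are hypothetical and only illustrate the rule as described above. The sketch ignores Beep, Day, and ID, and it assumes that a time-point counts as predictable only when it is itself observed and all of the preceding `lag` time-points are observed.

```r
# Hypothetical helper: count time-points that can be predicted in one
# equally spaced series, given the number of lags.
count_predictable <- function(y, lag) {
  n <- length(y)
  if (n <= lag) return(0L)
  observed <- !is.na(y)
  predictable <- vapply((lag + 1):n, function(t) {
    # The target and all of its `lag` predecessors must be observed.
    observed[t] && all(observed[(t - lag):(t - 1)])
  }, logical(1))
  sum(predictable)
}

# Toy series with one missing observation at time-point 3.
y <- c(1.2, 0.7, NA, 1.1, 0.9, 1.4)

count_predictable(y, lag = 1)  # 3: time-points 2, 5 and 6
count_predictable(y, lag = 2)  # 1: only time-point 6
```

With one lag, the missing value removes itself and the next time-point from the count; with two lags, it removes itself and the next two time-points, which matches the description above.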
Predictable observations per subject |
The number of predictable observations for each endogenous variable per subject, considering a specified time-lag. |
Total predictable observations |
The total number of predictable observations summed over all subjects in the dataset for each endogenous variable, considering a specified time-lag. |
Anja Ernst & Jonas Haslbeck
head(SyntheticData)
Obs <- numberPredictableObservations(Data = SyntheticData, yVars = 1:4, Beep = 9,
                                     Day = 10, ID = 5, Lags = 1:3)
Obs
Obs$`Predictable observations per subject`$`1 Lag`
Creates a variety of plots summarizing fitted LCVAR models.
## S3 method for class 'ClusterVAR'
plot(x, show, Number_of_Clusters = NULL, Number_of_Lags = NULL, Model = NULL,
     mar_heat = c(2.5, 2.5, 2, 1), ...)
x |
An output object of the LCVAR function. |
show |
Indicate summaries to plot. Alternatively, the VAR matrices of a specific model can be visualized. |
Number_of_Clusters |
An integer. Specify the fixed number of clusters when using |
Number_of_Lags |
An integer. Specify the fixed number of lags when using |
Model |
An integer vector. Specify when using |
mar_heat |
A numeric vector. Optional when using |
... |
Pass additional arguments. |
Creates different plots showing either a fitted LCVAR model or fit indices for a specified set of LCVAR models.
No return value; the function only draws the figure.
Anja Ernst & Jonas Haslbeck
LCVAR_outExample <- LCVAR(Data = ExampleData, yVars = 1:4, ID = 5, Beep = 6,
                          xContinuous = 7, xFactor = 8, Clusters = 1:2, Lags = 1:2,
                          Center = FALSE, Cores = 2, RndSeed = 3, Rand = 2, it = 25)
plot(LCVAR_outExample, show = "GNL", Number_of_Lags = 1)
plot(LCVAR_outExample, show = "GNC", Number_of_Clusters = 2)
plot(LCVAR_outExample, show = "specific", Model = c(1, 1))
plot(LCVAR_outExample, show = "specific", Model = c(1, 1),
     labels = c("A", "B", "C", "D"))
plot(LCVAR_outExample, show = "specificDiff", Model = c(1, 1))
Takes an output object of class ClusterVAR and prints a brief overview of the fitted model(s).
## S3 method for class 'ClusterVAR'
print(x, ...)
x |
An output object of the LCVAR function. |
... |
Pass additional arguments. |
Prints an overview of the fitted model(s) in the console.
No return value; a summary is printed to the console.
Anja Ernst & Jonas Haslbeck
Overview of the parameters of a given model.
## S3 method for class 'ClusterVARCoef'
print(x, ...)
x |
An output object of the coef.ClusterVAR function. |
... |
Pass additional arguments. |
Prints an overview of the fitted model in the console.
Anja Ernst & Jonas Haslbeck
Prints a summary of the models to the console.
## S3 method for class 'ClusterVARSummary'
print(x, ...)
x |
An output object of the summary.ClusterVAR function. |
... |
Pass additional arguments. |
Prints the summary of the fitted models in the console.
Anja Ernst & Jonas Haslbeck
Prints a summary of the predictable observations to the console.
## S3 method for class 'PredictableObs'
print(x, ...)
x |
An output object of the numberPredictableObservations function. |
... |
Pass additional arguments. |
Prints a summary of the number of predictable observations in the console.
Anja Ernst & Jonas Haslbeck
Takes the output of the LCVAR function and creates a small summary of the fitted model(s).
## S3 method for class 'ClusterVAR'
summary(object, show = "BPC", TS_criterion = "SC", global_criterion = "BIC",
        Number_of_Clusters = NULL, Number_of_Lags = NULL, ...)
object |
An output object of the LCVAR function. |
show |
Indicate how models should be summarized, the possible choices are |
TS_criterion |
The information criterion to select the best model between models with a different number of lags but with the same number of clusters. The possible choices are |
global_criterion |
The information criterion to select the best model between models with different numbers of clusters but with the same number of lags. The possible choices are |
Number_of_Clusters |
An integer. Specify the fixed number of clusters when using |
Number_of_Lags |
An integer. Specify the fixed number of lags when using |
... |
Pass additional arguments. |
FunctionOutput |
A data frame containing summaries of the fitted models. |
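To illustrate how an information criterion such as the BIC (Schwarz, 1978) can be used to choose between models with different numbers of clusters, the sketch below applies the textbook formula to invented numbers. The log-likelihoods, parameter counts, and sample size are hypothetical, and how the package counts parameters and observations for its criteria is not shown here.

```r
# Hypothetical fit results for models with 1, 2 and 3 clusters
# (same number of lags); none of these numbers come from the package.
fits <- data.frame(clusters = 1:3,
                   logLik   = c(-1250.4, -1100.9, -1092.3),
                   npar     = c(30, 61, 92))
n_obs <- 500  # hypothetical sample size entering the penalty term

# Textbook BIC: -2 * log-likelihood + (number of parameters) * log(n).
fits$BIC <- -2 * fits$logLik + fits$npar * log(n_obs)

# The model with the lowest BIC is preferred.
best <- fits$clusters[which.min(fits$BIC)]

fits
best
```

Here the two-cluster model is preferred: moving from one to two clusters improves the log-likelihood by more than the penalty for the extra parameters, while adding a third cluster does not.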
Anja Ernst & Jonas Haslbeck
Hamilton, J. (1994), Time Series Analysis, Princeton University Press, Princeton.
Hannan, E. J. and B. G. Quinn (1979), The determination of the order of an autoregression, Journal of the Royal Statistical Society.
Lütkepohl, H. (2006), New Introduction to Multiple Time Series Analysis, Springer, New York.
Quinn, B. (1980), Order determination for a multivariate autoregression, Journal of the Royal Statistical Society.
Biernacki, C., Celeux, G., & Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics.
plot.ClusterVAR(), coef.ClusterVAR()
LCVAR_outExample <- LCVAR(Data = ExampleData, yVars = 1:4, ID = 5, Beep = 6,
                          xContinuous = 7, xFactor = 8, Clusters = 1:2, Lags = 1:2,
                          Cores = 2, RndSeed = 3, Rand = 2, it = 25)
summary(LCVAR_outExample)
summary(object = LCVAR_outExample, show = "GNL", Number_of_Lags = 1)
summary(object = LCVAR_outExample, show = "GNL", Number_of_Lags = 1,
        global_criterion = "ICL")
summary(object = LCVAR_outExample, show = "GNC", Number_of_Clusters = 2)
summary(object = LCVAR_outExample, show = "GNC", Number_of_Clusters = 2,
        TS_criterion = "HQ")