The Dataset

We use data from a ecological momentary assessment (EMA) study from a single individual diagnosed with major depression. The data set consits of 51 variables measured on 1476 time points on 238 consecutive days (Wichers et al. 2016). The variables include questions regarding mood, self-esteem, social interaction, physical activity, events and symptoms of depression.

The measurements were taken from at 10 pseudo-randomized time intervals with average length of 90 min between 07:30 and 22:30. During the measured time interval, a double-blind medication dose reduction was carried out, consisting of a baseline period, the dose reduction, and two post assessment periods. For a detailed description of this data set see Kossakowski et al. (2017).

In order to keep things clear, we focus on a subset of 12 continuous mood-related variables and two categorical contextual variables.

The Modeling Goal

The goal is to look into the interactions between all pairs of variables, while taking all other variables into account. This means that we can interpret each interaction as a unique interaction between two variables that cannot be explained away by other (mediating) variables in the data set.

The interactions in a Mixed Graphical Model (MGM) are parameters associated with products of two variables if the two variables are continous. If the two variables are categorical, there are \(m \times k - 1\) parameters associated to \(m \times k - 1\) products of indicator functions (dummy variables). For a detailed description of the model see Haslbeck and Waldorp (2017).

Data preprocessing

We begin by loading the mgm package and subsetting the 14 variables we use in this practical. The dataset is loaded automatically with the mgm package.

# Loading the mgm package
library(mgm)
## This is mgm 1.2-2
## Note that the syntax changed considerably from version 1.1-8 to 1.2-0.
## Please report any bugs: https://github.com/jmbh/mgm/issues
# Subset Mood variables and two behavioral variables
symptom_data_short <- symptom_data
selection_ind <- c(which(symptom_data$groups == "Mood"), 
                   which(symptom_data$colnames %in% c("Action", "Who with")))

# Subset
symptom_data_short$data <- symptom_data_short$data[,selection_ind]
symptom_data_short$data <- as.matrix(symptom_data_short$data)
symptom_data_short$type <- symptom_data_short$type[selection_ind]
symptom_data_short$level <- symptom_data_short$level[selection_ind]
symptom_data_short$colnames <- symptom_data_short$colnames[selection_ind]
symptom_data_short$groups <- symptom_data_short$groups[selection_ind]

# Look at first few rows
head(symptom_data_short$data)
##      mood_relaxed mood_down mood_irritat mood_satisfi mood_lonely
## [1,]            5        -1            1            5          -1
## [2,]            4         0            3            3           0
## [3,]            4         0            2            3           0
## [4,]            4         0            1            4           0
## [5,]            4         0            2            4           0
## [6,]            5         0            1            4           0
##      mood_anxious mood_enthus mood_suspic mood_cheerf mood_guilty
## [1,]           -1           4           1           5          -1
## [2,]            0           3           1           4           0
## [3,]            0           4           1           4           0
## [4,]            0           4           1           4           1
## [5,]            0           4           1           4           1
## [6,]            0           3           1           3           1
##      mood_doubt mood_strong soc_who1 act_what1
## [1,]          1           5       10        88
## [2,]          1           4        0        10
## [3,]          2           4       19        45
## [4,]          1           4       10        45
## [5,]          2           3       10        60
## [6,]          2           3       10        45
# Look at type of variables
symptom_data_short$type # 12 continuous ("g"), 2 categorical ("c")
##  [1] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "c" "c"
# Look at levels of variables
symptom_data_short$level # 1 for continuous variables by convention; >1 for categoricals
##  [1] 1 1 1 1 1 1 1 1 1 1 1 1 5 8

Fit MGM to Data

We provide the data and two vectors specifying the type and number of levels (or categories) for each variable.

The MGM is estimated using \(\ell_1\)-regularization, which biases all parameter estimates towards zero and sets very small parameter estimates to exactly zero. For an excellent treatment on \(\ell_1\)-regularized (LASSO) regression see Hastie, Tibshirani, and Wainwright (2015). The strength of the penalty is controlled by a tuning parameter lambda \(\lambda\). Via lambdaSel = "CV"| and lambdaFolds = 10 we specify that \(\lambda\) is selected using 10-fold cross-validation, k = 2 indicates that we only include pairwise interactions and pbar = FALSE switches off the default progress bar. We save the output of the estimation function in the object mgm_obj.

Note that the cross-validation scheme adds randomness to the estimation procedure. We therefore set a random seed to make the analysis reproducible.

set.seed(1)

mgm_obj <- mgm(data = symptom_data_short$data,
               type = symptom_data_short$type,
               level = symptom_data_short$level,
               lambdaSel = "CV",
               lambdaFolds = 10,
               k = 2,
               pbar = FALSE,
               scale = TRUE)
## Note that the sign of parameter estimates is stored separately; see ?mgm

Look at Output

Markov Random Field (MRF) of MGM

We first focus on the Markov Random Field (MRF) of the MGM, which indicates the conditional independence structure of the MGM, i.e. if any pair of variables are independent conditional on all other variables in the data set. The MRF is represented in a matrix, where a 0 at entry [i, j] indicates conditional independence of variables i and j, and a 1 at entry [i, j] indicates conditional dependence of variables i and j. Seen from a graph- or network perspective this is the adjacency matrix of the MRF.

(mgm_obj$pairwise$wadj!=0)*1
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
##  [1,]    0    1    1    1    1    0    1    1    1     1     0     1     1
##  [2,]    1    0    1    1    1    0    1    1    1     1     1     1     1
##  [3,]    1    1    0    1    0    1    1    1    0     0     1     0     1
##  [4,]    1    1    1    0    0    0    1    0    1     0     1     1     0
##  [5,]    1    1    0    0    0    1    0    1    1     1     0     0     1
##  [6,]    0    0    1    0    1    0    1    1    1     1     1     1     1
##  [7,]    1    1    1    1    0    1    0    0    1     0     0     1     1
##  [8,]    1    1    1    0    1    1    0    0    0     1     0     0     1
##  [9,]    1    1    0    1    1    1    1    0    0     0     1     1     0
## [10,]    1    1    0    0    1    1    0    1    0     0     1     1     1
## [11,]    0    1    1    1    0    1    0    0    1     1     0     1     1
## [12,]    1    1    0    1    0    1    1    0    1     1     1     0     0
## [13,]    1    1    1    0    1    1    1    1    0     1     1     0     0
## [14,]    1    1    1    1    1    1    1    1    1     1     1     1     1
##       [,14]
##  [1,]     1
##  [2,]     1
##  [3,]     1
##  [4,]     1
##  [5,]     1
##  [6,]     1
##  [7,]     1
##  [8,]     1
##  [9,]     1
## [10,]     1
## [11,]     1
## [12,]     1
## [13,]     1
## [14,]     0

The object mgm_obj$pairwise$wadj contains the absolute value of the strength of the dependencies. From a network perspective, it can be seen as a weighted adjacency matrix.

Visualize MRF

In order to get a better intuition of the structure in the adjacency matrix, we can visualize it with a force-directed graph drawing algorithm. Here we use the implementation of the qgraph package (Epskamp et al. 2012):

library(qgraph) # Load qgraph package

qgraph(input = (mgm_obj$pairwise$wadj!=0)*1, 
       nodeNames = symptom_data_short$colnames)