Network-based latent Dirichlet subtype analysis (ver. 20190920)
nebula( data, modtype, E, H, modeta, nu, alpha, lam, alpha_sigma = 1, beta_sigma = 1, alpha_p = 1, beta_p = 1, mu0 = 0, sig0 = 20, pr0 = 0.5, binit = NULL )
data | list of M data matrices, where matrix m is n samples by p_m features for modality m |
---|---|
modtype | length M vector of feature types for the M modalities. Currently supports continuous (= 0) and binary (= 1) |
E | e by 4 matrix in which each row (m1, j1, m2, j2) represents an edge: variable j1 of modality m1 is connected to variable j2 of modality m2 |
H | the number of clusters to be fit |
modeta | length M vector of sparsity parameters for M modalities |
nu | smoothness parameter for gamma's |
alpha | concentration parameter for the Dirichlet process |
lam | shrinkage parameter for means of selected continuous features |
alpha_sigma | shape parameter of the prior of the residual variance (sigma^2) (default = 1, i.e. noninformative) |
beta_sigma | rate parameter of the prior of the residual variance (sigma^2) (default = 1, i.e. noninformative) |
alpha_p | first shape parameter of the prior of the 'active' probabilities (p_hj) of binary features (default = 1, i.e. noninformative) |
beta_p | second shape parameter of the prior of the 'active' probabilities (p_hj) of binary features (default = 1, i.e. noninformative) |
mu0 | mean of the non-selected continuous features (default is 0) |
sig0 | variance of the non-selected continuous features (default is 20) |
pr0 | 'active' probability of the non-selected binary features (default is 0.5) |
binit | n by H initial matrix B, where exp(B_ih) is proportional to Pr(z_i = h). If NULL (default), B is filled with random numbers. |
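
The argument shapes described above can be illustrated with a small sketch. It only builds toy inputs; the call to `nebula` is left commented out because it requires the function to be loaded. The toy dimensions, the 1-based indices in `E`, and every hyperparameter value in the commented call are assumptions made for illustration, not recommended settings.

```r
set.seed(1)

n <- 20   # number of samples
H <- 3    # number of clusters to fit

# Two modalities: 5 continuous features and 4 binary features.
x_cont <- matrix(rnorm(n * 5), nrow = n, ncol = 5)
x_bin  <- matrix(rbinom(n * 4, size = 1, prob = 0.3), nrow = n, ncol = 4)
data    <- list(x_cont, x_bin)
modtype <- c(0, 1)   # 0 = continuous, 1 = binary

# Each row of E is (m1, j1, m2, j2); 1-based indexing is assumed here.
E <- rbind(
  c(1, 1, 1, 2),   # variable 1 of modality 1 -- variable 2 of modality 1
  c(1, 3, 2, 4)    # variable 3 of modality 1 -- variable 4 of modality 2
)

# Optional initial B. Since exp(B[i, h]) is proportional to Pr(z_i = h),
# normalising exp(B) by row gives the implied starting probabilities.
binit   <- matrix(rnorm(n * H), nrow = n, ncol = H)
init_pr <- exp(binit) / rowSums(exp(binit))

# Illustrative call only; the values below are placeholders.
# fit <- nebula(data, modtype, E, H, modeta = c(1, 1), nu = 1,
#               alpha = 1, lam = 1, binit = binit)
```

Each row of `init_pr` sums to 1, which is a quick way to check that a hand-built `binit` encodes the starting memberships intended.
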
A list containing clustering assignments, variable selection, and posterior probabilities
clustering | cluster assignments |
---|---|
defvar | list of M matrices; matrix m is p_m by H and indicates whether variable j in modality m is a defining variable for cluster h |
clus_pr | n by H matrix containing the probability that subject i belongs to cluster h |
defvar_pr | list of M matrices; matrix m is p_m by H and contains the probability that variable j in modality m is a defining variable for cluster h |
def_m | list of M matrices; matrix m is p_m by H and contains the mean of variable j as a defining variable for cluster h (continuous variables only) |
def_lpr | list of M matrices; matrix m is p_m by H and contains the log probability of variable j being 'active' as a defining variable for cluster h (binary variables only) |
iter | the number of iterations until the algorithm converges |
param | list of the input parameters used in the clustering solution |
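
The probability components of the returned list can be summarised as in the sketch below. It uses a mock list with the documented shapes of `clus_pr` and `defvar_pr` rather than a real fit. The 0.5 cutoff is an assumption, and the page does not state whether the returned `clustering` and `defvar` are derived in exactly this way.

```r
set.seed(2)
n <- 6          # subjects
H <- 3          # clusters
p <- c(4, 3)    # features per modality

# Mock of the returned list (hypothetical values, documented shapes only).
raw <- matrix(runif(n * H), nrow = n)
fit <- list(
  clus_pr   = raw / rowSums(raw),
  defvar_pr = lapply(p, function(pm) matrix(runif(pm * H), nrow = pm))
)

# Hard assignment: the most probable cluster for each subject.
hard <- max.col(fit$clus_pr, ties.method = "first")

# Flag variables whose defining probability exceeds the assumed cutoff of 0.5.
selected <- lapply(fit$defvar_pr, function(pr) pr > 0.5)

# Count of flagged variables per cluster, one vector per modality.
n_selected <- lapply(selected, colSums)
```
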
Changgee Chang