Greed - Bayesian greedy clustering

Greed enables model-based clustering of networks, count-data matrices, and more, using several types of generative models. Model selection and clustering are performed jointly by optimizing the Integrated Classification Likelihood (ICL). Details of the algorithms and methods used in this package can be found in Côme, Jouvin, Latouche, and Bouveyron (2021).

The following generative models are currently available:

  • Stochastic Block Models (sbm-class and misssbm-class)
  • Degree Corrected Stochastic Block Models (dcsbm-class)
  • Multinomial Stochastic Block Models (multsbm-class)
  • Degree Corrected Latent Block Models (co_dcsbm-class)
  • Mixture of Multinomials (mm-class)
  • Gaussian Mixture Model (gmm-class and diaggmm-class)
  • Multivariate Mixture of Gaussian Regression Model (mvmreg-class)

With the Integrated Classification Likelihood, the model parameters are integrated out, which provides a natural regularization for complex models. Because this criterion penalizes complex models, it automatically selects a “natural” value for the number of clusters K. The user only needs to provide an initial guess for K and values for the prior parameters (sensible defaults are used if no prior information is available). By default, the optimization combines a greedy local search with a genetic algorithm; several other optimization algorithms are also available.
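
As a sketch of what "integrating out the parameters" means, the criterion can be written as follows. The notation is an assumption made here for illustration and is not taken from the package: X is the observed data, Z the partition into clusters, θ the model parameters, and β the prior hyperparameters.

```latex
\mathrm{ICL}_{ex}(Z) \;=\; \log p(X, Z \mid \beta)
\;=\; \log \int p(X, Z \mid \theta)\, p(\theta \mid \beta)\, d\theta
```

Because θ is marginalized rather than estimated, the criterion depends only on the partition Z and the hyperparameters, so partitions with different numbers of clusters can be compared directly.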

Finally, the whole path of solutions from K clusters down to 1 is extracted. This gives a partial ordering of the clusters and makes it possible to evaluate simpler clusterings. The package also provides some plotting functionality.
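
The path of solutions described above can be illustrated with a small, self-contained Python sketch. It is not the package's implementation. It assumes a mixture of multinomials with symmetric Dirichlet priors (hyperparameters `alpha` and `beta`, chosen here), it keeps `alpha` fixed across merges, and it searches all cluster pairs exhaustively. The function names `log_icl` and `merge_path` are hypothetical.

```python
import math
from itertools import combinations


def log_icl(counts, labels, alpha=1.0, beta=1.0):
    """Integrated classification log-likelihood of a partition, up to a
    constant that does not depend on the labels (assumed model: mixture of
    multinomials with symmetric Dirichlet priors)."""
    clusters = sorted(set(labels))
    K, N, D = len(clusters), len(labels), len(counts[0])
    # log p(Z): cluster proportions integrated out under Dirichlet(alpha)
    score = math.lgamma(K * alpha) - K * math.lgamma(alpha)
    score -= math.lgamma(N + K * alpha)
    for k in clusters:
        rows = [counts[i] for i in range(N) if labels[i] == k]
        score += math.lgamma(len(rows) + alpha)
        # log p(X | Z): cluster profile integrated out under Dirichlet(beta)
        totals = [sum(col) for col in zip(*rows)]
        score += math.lgamma(D * beta) - D * math.lgamma(beta)
        score += sum(math.lgamma(c + beta) for c in totals)
        score -= math.lgamma(sum(totals) + D * beta)
    return score


def merge_path(counts, labels, alpha=1.0, beta=1.0):
    """Greedily merge the best pair of clusters until one cluster remains.
    Returns a list of (number of clusters, score, labels) tuples."""
    labels = list(labels)
    path = [(len(set(labels)), log_icl(counts, labels, alpha, beta), tuple(labels))]
    while len(set(labels)) > 1:
        best = None
        for a, b in combinations(sorted(set(labels)), 2):
            # Candidate partition: relabel every member of cluster b as a
            candidate = [a if z == b else z for z in labels]
            score = log_icl(counts, candidate, alpha, beta)
            if best is None or score > best[0]:
                best = (score, candidate)
        labels = best[1]
        path.append((len(set(labels)), best[0], tuple(labels)))
    return path


# Toy data: rows are observations, columns are count categories.
counts = [[9, 1, 0], [8, 2, 0], [9, 0, 1], [8, 1, 1], [0, 1, 9], [1, 0, 9]]
labels = [0, 0, 1, 1, 2, 2]
for k, score, z in merge_path(counts, labels):
    print(k, round(score, 2), z)
```

Each entry of the returned path is a coarser clustering than the previous one, and the order in which clusters are merged is what induces the partial ordering of the clusters.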