Regression between compositional data sets


Tolosana-Delgado, R.; van den Boogaart, K. G.

Abstract

Linear regression models in which both the explained and the explanatory variables form compositions are naturally tractable within the log-ratio framework. Fitting such models does not pose any difficulty: they can be fit in a standard way after applying any one-to-one log-ratio transformation to each compositional set. Problems arise when testing and displaying the model, due to the high dimension of the parameter space and the difficult interpretation of classical hypotheses in terms of the original components. This contribution proposes two graphical representations of the model: as a biplot, parallel to redundancy analysis, and as confidence ellipses on the parameters projected onto a set of subcompositions. Each of these representations also comes with an associated way to test certain subcompositional independence hypotheses. An exact, general, Scheffé-like test of independence (for the whole composition or any subcomposition) can be derived from a generalized eigenvalue problem involving the matrix of regression coefficients and its estimation covariance matrix. For certain hypotheses of independence, classical tests based on Hotelling's T² or χ² distributions can also be adapted. Any of these tests can be used to calculate the radii of confidence ellipses on the parameters, in order to visualize the corresponding tests. This provides a toolbox to reduce the complexity of compositional-to-compositional regression, and enables a structured way of exploring and testing which components of the explanatory set influence which components of the explained set.
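
The fitting step described in the abstract (transform each compositional set with a one-to-one log-ratio transformation, then fit in a standard way) can be illustrated with a minimal sketch. It is not the authors' implementation: the isometric log-ratio (ilr) transformation with a Helmert-type basis is only one possible choice of transformation, the data are simulated, and all names (`closure`, `helmert_basis`, `ilr`, `ilr_inv`, `B_true`, and so on) are assumptions made for this example.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def helmert_basis(D):
    """Orthonormal D x (D-1) basis of the zero-sum (clr) hyperplane."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j
        V[j, j - 1] = -1.0
        V[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return V

def ilr(x, V):
    """ilr coordinates: centred log-ratios projected onto the basis V."""
    logx = np.log(x)
    clr = logx - logx.mean(axis=1, keepdims=True)
    return clr @ V

def ilr_inv(z, V):
    """Map ilr coordinates back to closed compositions."""
    return closure(np.exp(z @ V.T))

# Simulated data (assumption): 4-part explanatory and 3-part explained sets.
rng = np.random.default_rng(0)
n = 200
Vx, Vy = helmert_basis(4), helmert_basis(3)
Zx = rng.normal(size=(n, 3))
B_true = rng.normal(size=(3, 2))      # hypothetical slope matrix in ilr coordinates
a_true = np.array([0.5, -0.2])        # hypothetical intercept in ilr coordinates
Zy = a_true + Zx @ B_true + 0.05 * rng.normal(size=(n, 2))
X, Y = ilr_inv(Zx, Vx), ilr_inv(Zy, Vy)

# Fit: ordinary multivariate least squares on the transformed data.
design = np.column_stack([np.ones(n), ilr(X, Vx)])
coef, *_ = np.linalg.lstsq(design, ilr(Y, Vy), rcond=None)
a_hat, B_hat = coef[0], coef[1:]

# Predictions are back-transformed so that they are compositions again.
Y_pred = ilr_inv(design @ coef, Vy)
```

Because the basis is orthonormal, the fitted model does not depend on which ilr basis is used, only its coordinate expression does. The testing and display steps discussed in the abstract are not covered by this sketch.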

  • Contribution to proceedings (Open Access)
    5th International Workshop on Compositional Data Analysis, 03.-07.06.2013, Vorau, Austria
    Proceedings of the 5th International Workshop on Compositional Data Analysis, 978-3-200-03103-6, 164-188

Permalink: https://www.hzdr.de/publications/Publ-18919