Compositional Non-parametric Regression


Tolosana-Delgado, R.; van den Boogaart, K. G.

Abstract

Compositional regression is concerned with modelling the dependence of a composition on one or more covariables, or vice versa. State-of-the-art methods typically assume that the dependence is linear and, for tests, that the errors follow an additive logistic normal distribution. For non-compositional data, several solutions are available for non-linear regression and for tests without a normality assumption. Based on them, this contribution derives non-parametric regression models and methods valid for compositional data.

With respect to non-linear dependence, some sort of regularisation assumption is always required. Different classical approaches can be adapted for compositional data. LOESS smoothing on pairwise log-ratios or log-ratio transforms would correspond to assuming some sort of smooth (compositional) derivatives. Regression splines and smoothing splines are already defined in a multivariate way and allow the degree of continuity and smoothness to be controlled by explicit parameters. Piecewise regression needs to be applied to log-ratio transforms and allows non-continuous dependence to be modelled. Geostatistical interpolation or, equivalently, reproducing kernel splines, allows precise control over the level of continuity and complexity through the variogram.
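As an illustration of smoothing on a log-ratio transform, the following minimal sketch applies a Gaussian kernel smoother (a simpler stand-in for LOESS, chosen here only for brevity) to centred log-ratio (clr) scores and maps the result back to the simplex. The function names, the bandwidth value and the synthetic data are assumptions made for this example and do not come from the contribution itself.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform of strictly positive compositions."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def clr_inv(z):
    """Map clr scores back to compositions on the simplex."""
    return closure(np.exp(z))

def kernel_smooth(t, Z, t_new, bandwidth):
    """Gaussian kernel (local constant) smoother applied to each column of Z."""
    w = np.exp(-0.5 * ((t_new[:, None] - t[None, :]) / bandwidth) ** 2)
    w /= w.sum(axis=1, keepdims=True)  # each row of weights sums to one
    return w @ Z

# Synthetic three-part composition depending non-linearly on a covariable t.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
signal = np.column_stack([np.sin(2 * np.pi * t),
                          np.cos(2 * np.pi * t),
                          np.zeros_like(t)])
Y = clr_inv(signal + 0.1 * rng.normal(size=signal.shape))

# Smooth in clr scores, then back-transform the predictions.
t_new = np.linspace(0.0, 1.0, 11)
Y_hat = clr_inv(kernel_smooth(t, clr(Y), t_new, bandwidth=0.08))
```

The predictions are compositions again because the back-transformation closes the exponentiated scores; the bandwidth plays the role of the regularisation assumption.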

All methods mentioned admit a multivariate extension which, by virtue of the principle of working in coordinates, automatically gives rise to compositional versions of those methods. Moreover, each of them is either affine equivariant or has a very slight restriction that is. Thus, the associated compositional versions deliver results which are invariant with respect to the choice of basis, scale invariant, and subcompositionally coherent (in the case of regression with compositional response).
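The invariance with respect to the choice of basis can be checked numerically for a simple linear smoother. The sketch below, under the assumption that a Gaussian kernel smoother is an acceptable example of an affine equivariant method, smooths the same data in two different randomly generated orthonormal log-ratio bases and compares the back-transformed predictions. The helper `random_ilr_basis` and all data are hypothetical and serve only this illustration.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform of strictly positive compositions."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def random_ilr_basis(D, rng):
    """Orthonormal basis (D x (D-1)) of the clr hyperplane, built by QR."""
    A = rng.normal(size=(D, D - 1))
    A -= A.mean(axis=0, keepdims=True)  # columns sum to zero
    Q, _ = np.linalg.qr(A)              # reduced QR: Q has shape (D, D-1)
    return Q

def smooth_in_basis(t, Y, t_new, V, bandwidth):
    """Kernel-smooth the coordinates of Y in basis V, then back-transform."""
    Z = clr(Y) @ V                      # orthonormal log-ratio coordinates
    w = np.exp(-0.5 * ((t_new[:, None] - t[None, :]) / bandwidth) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    Z_hat = w @ Z
    return closure(np.exp(Z_hat @ V.T))

rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 50)
Y = closure(rng.uniform(0.1, 1.0, size=(50, 4)))
t_new = np.linspace(0.0, 1.0, 7)

V1 = random_ilr_basis(4, rng)
V2 = random_ilr_basis(4, rng)
Y_hat_1 = smooth_in_basis(t, Y, t_new, V1, bandwidth=0.1)
Y_hat_2 = smooth_in_basis(t, Y, t_new, V2, bandwidth=0.1)

# Largest discrepancy between the two predictions (floating-point noise only).
max_abs_diff = float(np.max(np.abs(Y_hat_1 - Y_hat_2)))
```

Both bases span the same hyperplane, and the smoother acts linearly on the coordinates, so the two sets of predicted compositions agree up to rounding error.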

With regard to testing, there are some philosophical difficulties in a classical "zero slope hypothesis". A strict test for dependence could be very misleading when used for model selection in a non-parametric setting. As an alternative, we propose to check whether the prediction by the non-parametric model outperforms the prediction by a parametric (constant) one, by comparing the jackknifed residuals of the two models. This construction yields all meaningful tests of compositional dependence, namely: global lack of dependence, lack of dependence within a subcomposition, as well as restricted dependence within a subcomposition.
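The comparison of jackknifed residuals can be sketched as follows. The abstract does not specify the test statistic, so the quantity used here (the ratio of summed squared leave-one-out residual norms in clr scores) is an assumption for illustration only, as are the kernel smoother, the bandwidth and the synthetic data.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform of strictly positive compositions."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def loo_kernel_predictions(t, Z, bandwidth):
    """Leave-one-out kernel predictions: each point is predicted without itself."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    np.fill_diagonal(w, 0.0)            # drop the point's own weight
    w /= w.sum(axis=1, keepdims=True)
    return w @ Z

def loo_constant_predictions(Z):
    """Leave-one-out mean: the constant model refitted without each point."""
    n = Z.shape[0]
    return (Z.sum(axis=0, keepdims=True) - Z) / (n - 1)

def sum_squared_residuals(Z, Z_hat):
    """Sum of squared residual norms in clr scores (squared Aitchison norms)."""
    return float(np.sum((Z - Z_hat) ** 2))

# Synthetic three-part composition with a clear dependence on t.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 80)
signal = np.column_stack([np.sin(2 * np.pi * t),
                          -np.sin(2 * np.pi * t),
                          np.zeros_like(t)])
Y = closure(np.exp(signal + 0.2 * rng.normal(size=signal.shape)))
Z = clr(Y)

sse_smooth = sum_squared_residuals(Z, loo_kernel_predictions(t, Z, bandwidth=0.08))
sse_const = sum_squared_residuals(Z, loo_constant_predictions(Z))

# A ratio well below one suggests the non-parametric model predicts better.
ratio = sse_smooth / sse_const
```

Restricting the rows of `Y` to a subcomposition before applying `clr` would give the analogous comparison within that subcomposition; turning the ratio into a formal test would additionally require a reference distribution, which this sketch does not provide.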

  • Lecture (Conference contribution)
    CoDaWork 2017, 6-9 June 2017, Abbadia San Salvatore, Italy

Permalink: https://www.hzdr.de/publications/Publ-24846