Paper accepted to Electronic Journal of Statistics
Fabian Scheipl, Sonja Greven
Regression models with functional responses and covariates constitute a powerful and increasingly important model class. However, regression with functional data poses well-known and challenging problems of non-identifiability. This non-identifiability can manifest itself in arbitrarily large errors for coefficient surface estimates despite accurate predictions of the responses, thus invalidating substantive interpretations of the fitted models. We offer an accessible rephrasing of these identifiability issues in realistic applications of penalized linear function-on-function regression and delimit the set of circumstances under which they are likely to occur in practice. Specifically, non-identifiability that persists under smoothness assumptions on the coefficient surface can occur if the functional covariate’s empirical covariance has a kernel which overlaps that of the roughness penalty of the spline estimator. Extensive simulation studies validate the theoretical insights, explore the extent of the problem and allow us to evaluate its practical consequences under varying assumptions about the data generating processes. A case study illustrates the practical significance of the problem. Based on theoretical considerations and our empirical evaluation, we provide immediately applicable diagnostics for lack of identifiability and give recommendations for avoiding estimation artifacts in practice.
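
The kernel-overlap condition can be illustrated with a small numerical sketch. The code below is not the diagnostic proposed in the paper; it is a simplified, grid-based illustration under stated assumptions: the functional covariate is observed on a common equidistant grid, the roughness penalty is a plain second-order difference penalty, and the function names `null_space_basis` and `kernel_overlap` are hypothetical. It relies on the standard fact that a penalized least-squares criterion has a unique minimizer only if the null spaces of the design and of the penalty intersect trivially, and it measures overlap as the largest cosine of the principal angles between the two null spaces (a value near 1 indicates a shared direction).

```python
import numpy as np


def null_space_basis(a, tol=1e-10):
    """Orthonormal basis of the null space of a symmetric PSD matrix."""
    eigval, eigvec = np.linalg.eigh(a)
    return eigvec[:, eigval <= tol * max(eigval.max(), 1.0)]


def kernel_overlap(x, penalty, tol=1e-10):
    """Largest cosine of the principal angles between the kernel of the
    empirical covariance of x (rows = curves, columns = grid points) and
    the kernel of the penalty matrix. Returns 0.0 if either kernel is empty.
    """
    xc = x - x.mean(axis=0)            # center the curves
    cov = xc.T @ xc / x.shape[0]       # empirical covariance on the grid
    n_cov = null_space_basis(cov, tol)
    n_pen = null_space_basis(penalty, tol)
    if n_cov.shape[1] == 0 or n_pen.shape[1] == 0:
        return 0.0
    s = np.linalg.svd(n_cov.T @ n_pen, compute_uv=False)
    return float(s.max())


rng = np.random.default_rng(0)
p = 20
t = np.arange(p) / p

# Second-order difference penalty; its kernel is spanned by the
# constant and the linear vector on the grid.
d2 = np.diff(np.eye(p), n=2, axis=0)
penalty = d2.T @ d2

# Assumed low-rank covariate: curves built from three Fourier functions,
# all of which are orthogonal to the constant vector on this grid.
basis = np.column_stack(
    [np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), np.sin(4 * np.pi * t)]
)
x_low_rank = rng.normal(size=(50, 3)) @ basis.T
overlap_low_rank = kernel_overlap(x_low_rank, penalty)

# Assumed full-rank covariate: white noise on the grid, so the
# empirical covariance has an empty kernel.
x_full_rank = rng.normal(size=(100, p))
overlap_full_rank = kernel_overlap(x_full_rank, penalty)

print(overlap_low_rank, overlap_full_rank)
```

In the low-rank example the constant vector lies in both kernels, so a constant shift of the coefficient along the covariate's index is neither informed by the data nor penalized. In the full-rank example no such shared direction exists on the grid.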