Multi-target feature selection through output space clustering


A key challenge in information theoretic feature selection is to estimate mutual information expressions that capture three desirable terms: the relevancy of a feature with the output, the redundancy and the complementarity between groups of features. The challenge becomes more pronounced in multi-target problems, where the output space is multidimensional. Our work presents a generic algorithm that captures these three desirable terms and is suitable for the well-known multi-target prediction settings of multi-label/dimensional classification and multivariate regression. We achieve this by combining two ideas: deriving low-order information theoretic approximations for the input space and using clustering for deriving low-dimensional approximations of the output space.

European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN)