Multi-label machine learning algorithms for automated image annotation


This thesis has two main goals: to improve multi-label classification techniques, and to apply these techniques to automated image annotation. In the machine learning part, we examine the Ensemble of Classifier Chains (ECC) algorithm and modify it to improve its per-concept (per-label) performance and its Mean Average Precision (MAP). We also propose techniques for handling label constraints that exist in a data set: we introduce a post-processing step and suggest two different techniques for handling the different kinds of constraints. In the second part, we focus on the data set examined in this work, which is taken from the photo annotation task of the ImageCLEF 2010 contest, and give a short description of it. We then build models based on the two kinds of information available for every image in the data set: visual and textual. A further contribution of this work is an ensemble model that relies only on different kinds of textual information. Interest in automated image annotation is growing: many contests focus on this field, and some online applications for image annotation already exist. It is therefore worthwhile to study and evaluate multi-label algorithms on image annotation, in order to see how they perform compared to other machine learning algorithms.

MSc Thesis, School of Informatics, Aristotle University of Thessaloniki, Greece.
Thesis supervisor: Assistant Professor Grigorios Tsoumakas.