There is an increasing need to derive semantics from real-world observations to facilitate natural information sharing between machines and humans. Conceptual spaces theory is one possible approach and has been proposed as a mid-level representation between symbolic and sub-symbolic representations, whereby concepts are represented in a geometrical space that is characterised by a number of quality dimensions. Much of the existing work has demonstrated how conceptual spaces are created in a knowledge-driven manner, relying on prior knowledge to form concepts and identify quality dimensions. This paper presents a method to create semantic representations using data-driven conceptual spaces, which are then used to derive linguistic descriptions of numerical data. Our contribution is a principled approach to automatically construct a conceptual space from a set of known observations wherein the quality dimensions and domains are not known a priori. The novelty of the approach is the ability to select and group semantic features to discriminate between concepts in a data-driven manner while preserving the semantic interpretation that is needed to infer linguistic descriptions for interaction with humans. Two data sets, representing leaf images and time series signals, are used to evaluate the method. An empirical evaluation for each case study assesses how well linguistic descriptions generated from the conceptual spaces identify unknown observations. Furthermore, comparisons are made with descriptions derived from alternative approaches for generating semantic models.