General Supervised Learning Framework for Open World Classification

Sai Krishna Theja Bhavaraju
Mohammad Amin Basiri
Charles Nicholson

Abstract

In open-world supervised learning for classification, the training data is incomplete with respect to the full set of relevant classes in the application domain. Most existing research on this problem focuses on computer vision, and many of the proposed methodologies are intrinsically tied to specific machine learning algorithms or data types. However, open-world settings may arise in a wide array of real-world problem contexts, each with its own data type and classifier requirements. Although existing research emphasizes the identification of unknown sets or classes, it does not sufficiently address automatically categorizing these new classes and updating predictive models. In this work, we present a framework that addresses all aspects of the open-world classification pipeline. The proposed approach is data- and model-agnostic, making it versatile across different domains. Our framework automatically identifies unknown instances, categorizes them into distinct new classes, and dynamically updates predictive models without human intervention. We evaluate it on diverse data types, including images, text, and sensor data, and observe accuracy improvements ranging from 27 to 69 percentage points across experiments. To assess robustness and provide practical guidance, we conduct a comprehensive sensitivity analysis examining the impact of key parameters, including the number of known classes, the Chebyshev confidence parameter, the itemset size parameter, and base classifier quality. Additionally, we provide insights into practical applications through a case study on social media analytics for disaster response, highlighting the adaptability of the framework in real-world scenarios.
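
The abstract names a Chebyshev confidence parameter but does not say how the framework uses it. As a rough illustration only, and not the authors' method, the sketch below shows one common way such a parameter can become a rejection threshold for unknown instances. Chebyshev's inequality bounds the probability that a value lies k or more standard deviations from its mean by 1/k², so a tolerance alpha gives k = 1/sqrt(alpha). All names here (`chebyshev_k`, `fit_threshold`, `is_unknown`, `alpha`) and the idea of scoring instances by a distance to known classes are assumptions made for the example.

```python
import math
import statistics


def chebyshev_k(alpha):
    """Return k such that P(|X - mean| >= k * std) <= alpha.

    The bound is Chebyshev's inequality, which holds for any
    distribution with finite variance.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return 1.0 / math.sqrt(alpha)


def fit_threshold(known_scores, alpha):
    """Hypothetical helper: derive a rejection threshold from scores
    (for example, distances to a class centroid) of known-class data."""
    mean = statistics.fmean(known_scores)
    std = statistics.pstdev(known_scores)  # population standard deviation
    return mean + chebyshev_k(alpha) * std


def is_unknown(score, threshold):
    """Flag an instance as a candidate unknown if its score exceeds the threshold."""
    return score > threshold


# Example with made-up scores; alpha = 0.25 corresponds to k = 2.
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
threshold = fit_threshold(scores, alpha=0.25)
print(is_unknown(5.0, threshold), is_unknown(9.0, threshold))
```

In this sketch a smaller `alpha` raises the threshold, so fewer known-class instances are mistakenly rejected but more truly novel instances may slip through. This is the kind of trade-off a sensitivity analysis over such a parameter would examine.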
