General Supervised Learning Framework for Open World Classification

Sai Krishna Theja Bhavaraju; Mohammad Amin Basiri; Charles Nicholson

doi:10.1613/jair.1.20947

Published: Feb 25, 2026

Keywords:

machine learning, data mining, knowledge discovery, probabilistic reasoning

Sai Krishna Theja Bhavaraju

University of Oklahoma

Mohammad Amin Basiri

University of Oklahoma

https://orcid.org/0000-0002-2005-0393

Charles Nicholson

University of Oklahoma

https://orcid.org/0000-0002-7023-8802

Abstract

In open-world supervised learning for classification, the training data is incomplete with respect to the full set of relevant classes in the application domain. Most existing research on this problem focuses on computer vision, and many of the proposed methodologies are intrinsically tied to specific machine learning algorithms or data types. However, open-world settings may arise in practice across a wide array of problem contexts, each with its own data type and classifier requirements. Although existing research emphasizes the identification of unknown sets or classes, it does not sufficiently address the automatic categorization of these new classes or the updating of predictive models. In this work, we present a framework that addresses all aspects of the open-world classification pipeline. The proposed approach is data- and model-agnostic, making it versatile across different domains. Our framework performs automatic identification and categorization of unknown instances into distinct new classes while dynamically updating predictive models without human intervention. We evaluate it on diverse data types, including images, text, and sensor data, demonstrating effectiveness across experiments with accuracy improvements ranging from 27 to 69 percentage points. To assess robustness and provide practical guidance, we conduct a comprehensive sensitivity analysis examining the impact of key parameters, including the number of known classes, the Chebyshev confidence parameter, the itemset size parameter, and base classifier quality. Additionally, we provide insights into practical applications through a case study on social media analytics for disaster response, highlighting the adaptability of the framework in real-world scenarios.

Issue

Vol. 85 (2026)

Section

Articles
