College of Computer Science and Software Engineering, Shenzhen University

Learning with Open-world Noisy Data via Class-independent Margin in Dual Representation Space

AAAI Conference on Artificial Intelligence (AAAI)

Linchao Pan, Can Gao^*, Jie Zhou, Jinbao Wang

Shenzhen University

Abstract

Learning with Noisy Labels (LNL) aims to improve model generalization when facing data with noisy labels. Existing methods generally assume that noisy labels come from known classes, called closed-set noise. However, in real-world scenarios, noisy labels from similar unknown classes, i.e., open-set noise, may occur during both the training and inference stages. Such open-world noisy labels may significantly degrade the performance of LNL methods. In this study, we propose a novel dual-space joint learning method to robustly handle open-world noise. To mitigate model overfitting to closed-set and open-set noise, a dual representation space is constructed by two networks. One is a projection network that learns shared representations in the prototype space, while the other is a One-Vs-All (OVA) network that makes predictions using unique semantic representations in the class-independent space. Then, bi-level contrastive learning and consistency regularization are introduced in the two spaces to enhance the detection capability for data from unknown classes. To benefit from the memorization effects across different types of samples, class-independent margin criteria are designed for sample identification, which select clean samples, weight closed-set noise, and filter open-set noise effectively. Extensive experiments demonstrate that our method outperforms state-of-the-art methods, achieving an average accuracy improvement of 4.55% and an AUROC improvement of 6.17% on CIFAR80N.
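To make the idea of margin-based sample identification more concrete, the following is a minimal sketch and not the criteria used in the paper. It assumes that an OVA head produces one independent sigmoid score per known class, and it partitions a sample into clean, closed-set noise, or open-set noise using a simple margin rule. The function name `partition_sample`, the thresholds `clean_margin` and `open_threshold`, and the weighting formula are all hypothetical choices made for illustration.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def partition_sample(logits, label, clean_margin=0.5, open_threshold=0.5):
    """Return (kind, weight) for one training sample.

    logits: per-class one-vs-all logits, one independent binary score per class.
    label:  the given (possibly noisy) class index.
    clean_margin, open_threshold: hypothetical thresholds, not from the paper.
    """
    probs = [sigmoid(z) for z in logits]

    # Assumption: if no class head accepts the sample, treat it as open-set
    # noise and filter it out (weight 0).
    if max(probs) < open_threshold:
        return "open", 0.0

    # Margin between the score of the given label and the strongest
    # competing class.
    others = [p for k, p in enumerate(probs) if k != label]
    margin = probs[label] - max(others)

    if margin >= clean_margin:
        return "clean", 1.0

    # Assumption: remaining samples are closed-set noise, weighted by the
    # margin rescaled from [-1, 1] to [0, 1].
    weight = max(0.0, min(1.0, (margin + 1.0) / 2.0))
    return "closed", weight


if __name__ == "__main__":
    print(partition_sample([4.0, -4.0, -4.0], label=0))   # confident, agrees with label
    print(partition_sample([-4.0, 4.0, -4.0], label=0))   # confident, disagrees with label
    print(partition_sample([-4.0, -4.0, -4.0], label=0))  # rejected by every class head
```

Because each class is scored by its own binary head, a sample can be rejected by every class at once, which is what makes an explicit open-set decision possible in this sketch; a softmax classifier would always assign the sample to some known class.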

Figure 1: Illustration of the Learning with Open-world Noisy Data (LOND) setup and the effect of open-set noise. Left: In addition to closed-set noise, open-set noise is also present in the training and testing stages. Right: Open-set noise (w/ Open) significantly degrades performance on CIFAR80N with 80% symmetric noise.

Figure 2: Overview of our framework.

Figure 3: Visualization of learned feature representations on CIFAR80N at Sym-20%, where open-set noise is colored in black.

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 62476171, 62476172, 62076164, and 62206122), the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2024A1515011367), the Guangdong Provincial Key Laboratory (Grant No. 2023B1212060076), and the Shenzhen Institute of Artificial Intelligence and Robotics for Society.

Bibtex

@inproceedings{PanGZ25,

author = {Linchao Pan and Can Gao and Jie Zhou and Jinbao Wang},

title = {Learning with Open-world Noisy Data via Class-independent Margin in Dual Representation Space},

booktitle = {Thirty-Ninth {AAAI} Conference on Artificial Intelligence, {AAAI}, 2025, February 25-March 4, 2025, Philadelphia, Pennsylvania, USA},

pages = {1--9},

year = {2025},

url = {xx},

doi = {xx},

}

Downloads

Paper