Abstract: Multi-expert ensemble models for long-tailed learning typically either learn diverse generalists from the whole dataset or aggregate specialists on different subsets. However, the former is ...
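Since this abstract contrasts two ensemble strategies (diverse generalists trained on the whole dataset vs. specialists aggregated from different subsets), a minimal sketch may help fix the idea. The sketch below, assuming PyTorch, illustrates the generalist-style design: several expert heads share one backbone and their logits are averaged at inference. All names and dimensions (MultiExpertClassifier, num_experts, the toy MLP backbone) are illustrative assumptions, not the paper's method; the diversity-inducing training objective the abstract alludes to is omitted.

```python
# A minimal sketch (not the paper's method) of a multi-expert ensemble
# classifier, assuming PyTorch. Each expert is a separate linear head on a
# shared feature extractor; prediction averages the experts' logits.
import torch
import torch.nn as nn

class MultiExpertClassifier(nn.Module):
    def __init__(self, feature_dim: int, num_classes: int, num_experts: int = 3):
        super().__init__()
        # Shared backbone (a toy MLP here; real systems use a CNN/ViT).
        self.backbone = nn.Sequential(nn.Linear(feature_dim, 256), nn.ReLU())
        # One classification head per expert. In practice, diversity comes
        # from the training objective or data subsets, which this sketch omits.
        self.experts = nn.ModuleList(
            nn.Linear(256, num_classes) for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)
        expert_logits = torch.stack([head(feats) for head in self.experts])
        return expert_logits.mean(dim=0)  # uniform aggregation of experts
```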
Abstract: Knowledge distillation has become a crucial technique for transferring intricate knowledge from a teacher model to a smaller student model. While logit-based knowledge distillation has shown ...
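The logit-based knowledge distillation this abstract refers to is standardly implemented as a KL divergence between temperature-softened teacher and student logits (Hinton et al., 2015), mixed with the usual cross-entropy loss. Below is a minimal PyTorch sketch of that standard formulation, not this paper's specific contribution; the temperature T and mixing weight alpha are illustrative hyperparameters.

```python
# A minimal sketch of standard logit-based knowledge distillation
# (Hinton-style), assuming PyTorch; not this paper's specific method.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      targets: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Soften both distributions with temperature T; the T**2 factor keeps
    # gradient magnitudes comparable across temperatures.
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T ** 2)
    # Standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce
```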