The rapidly evolving world around us poses significant challenges for developing AI models. Models are often trained on longitudinal data in the hope that it will be representative of future inputs. However, not all training data stays equally relevant over time, which can lead to problems in practical applications.
Consider, for instance, the CLEAR non-stationary learning benchmark, in which the visual appearance of objects evolves significantly over a 10-year span. This slow concept drift is a substantial challenge for object categorization models.
Methods such as online and continual learning try to tackle this by updating the model with small amounts of recent data so that it stays current. However, different types of data lose relevance at different rates, which poses additional challenges.
Recent research on non-stationary learning proposes assigning an importance score to each instance during training. Weighting the training data by these scores yields significant gains over other learning methods on non-stationary learning benchmarks.
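To make the idea concrete, here is a minimal sketch of instance-weighted training. The `instance_weight` rule below simply decays with an example's age, whereas the actual method learns its scores from data; the sketch only illustrates how per-example scores enter the training loss.

```python
import numpy as np

def instance_weight(age_years: float, half_life: float = 2.0) -> float:
    """Toy importance score: exponentially downweight older examples.

    The real method learns scores from data; a fixed exponential decay
    is used here only to show how scores enter the loss.
    """
    return 0.5 ** (age_years / half_life)

def weighted_loss(per_example_losses: np.ndarray, ages_years: np.ndarray) -> float:
    """Average the per-example losses, weighted by importance scores."""
    weights = np.array([instance_weight(a) for a in ages_years])
    return float(np.sum(weights * per_example_losses) / np.sum(weights))

# Example: two recent and two old examples with equal raw losses.
losses = np.array([1.0, 1.0, 1.0, 1.0])
ages = np.array([0.5, 1.0, 6.0, 9.0])
print(weighted_loss(losses, ages))  # dominated by the two recent examples
```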
To see why this matters, consider the challenge that concept drift poses for supervised learning. In a recent photo categorization task, classifiers were trained on roughly 39M social media photos collected over a decade. Model accuracy was measured both during and after the training period, revealing the benefits as well as the drawbacks of forgetting older data.
To address slow concept drift effectively, we combine the benefits of offline learning, namely the reuse of all available data, with those of continual learning, which downweights older data.
In this new method, a helper model assigns a weight to each data point based on its contents and age, and these weights are used when training the main model. The goal is to improve the model's performance on future data.
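As a rough sketch of how such a helper model could plug into training, the snippet below maps each example's content embedding and age to a weight that scales its loss. The `WeightHelper` architecture, the sigmoid weighting, and the way the helper itself would be trained (for example, against performance on later data) are illustrative assumptions rather than the published design.

```python
import torch
import torch.nn as nn

class WeightHelper(nn.Module):
    """Toy helper model: maps (content embedding, age) to a per-example weight."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + 1, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, embeddings: torch.Tensor, ages: torch.Tensor) -> torch.Tensor:
        x = torch.cat([embeddings, ages.unsqueeze(-1)], dim=-1)
        return torch.sigmoid(self.net(x)).squeeze(-1)  # weights in (0, 1)

def weighted_training_step(model, helper, optimizer, x, y, embeddings, ages):
    """One training step for the main model, scaled by helper-provided weights."""
    criterion = nn.CrossEntropyLoss(reduction="none")
    per_example_loss = criterion(model(x), y)
    with torch.no_grad():  # treat helper weights as fixed for this step
        weights = helper(embeddings, ages)
    loss = (weights * per_example_loss).sum() / weights.sum().clamp_min(1e-8)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Holding the helper's output fixed within each step is a simplification for the sketch; how the helper itself is optimized is the key design decision in practice.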
Applying this approach has yielded promising accuracy improvements across a range of settings. For instance, our method consistently outperforms standard continual learning and other algorithms on photo categorization tasks.
The broader relevance of our approach is shown by significant gains on several non-stationary learning challenge datasets, spanning data sources and modalities such as photographs, satellite images, social media text, medical records, sensor readings, and tabular data.
Looking ahead, extending our approach to continual learning is an interesting opportunity. The limitations of continual learning remain, but the results so far are promising, and we hope this work will generate increased interest and new ideas for dealing with slow concept drift.
Lastly, we would like to thank Mike Mozer for his valuable insights and feedback that helped shape this work.
Disclaimer: The above article was written with the assistance of an AI. The original sources can be found on Google Blog.