ReVision

Retraining Neural Networks: The Hidden Complexity of Unlearning & Relearning

Synopsis: In a rapidly changing world, firms like Apple and Nokia must adapt quickly. This article explores why retraining neural networks can be more challenging than training them from scratch, using a perceptron to demonstrate the complexities of unlearning and relearning.
Monday, June 17, 2024
ANN
Source: ContentFactory

One of the fascinating aspects of artificial intelligence (AI) is its ability to shed light on human intelligence. Ironically, AI also puts our own cognitive abilities to a severe test: technologies like AI are transforming society at an unprecedented pace. In his book "Think Again," Adam Grant argues that in volatile environments, rethinking and unlearning may matter more than thinking and learning. This concept is particularly relevant for aging societies, where the adage "You can't teach an old dog new tricks" often seems to hold. But why is this the case?

The brain structure of younger people differs physiologically from that of older individuals, but these differences vary widely across individuals. Research by Creasey and Rapoport suggests that brain function can remain highly effective even in old age. Beyond physiology, factors such as motivation and emotion play critical roles in learning. Studies by Kim and Merriam highlight cognitive interest and social interaction as strong motivators for learning among older adults. Our article examines this issue from a mathematical and computer-science perspective, inspired by Hinton and Sejnowski. We conduct an experiment with an artificial neural network (ANN) to illustrate why retraining can be more challenging than initial training.

Artificial neural networks mimic the structure and behavior of biological neurons. Typically, an ANN consists of input units that receive signals from the external environment; by processing these signals, the network reaches a decision. Our experiment uses a perceptron, a simple variant of an ANN introduced by Rosenblatt in 1958. The perceptron multiplies its numeric inputs by learned weights, sums the results, and passes this weighted sum through an activation function, such as the binary step function or the sign function, to produce the output that represents the network's decision.
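As a minimal sketch (the function names, weights, and bias below are our own illustrative choices, not part of the original experiment), the perceptron's forward pass with these two activation functions might look like:

```python
import numpy as np

def binary_step(z):
    # Binary step: output 1 once the weighted sum reaches the threshold 0
    return 1 if z >= 0 else 0

def sign(z):
    # Sign function: outputs +1 or -1 (here, 0 maps to +1)
    return 1 if z >= 0 else -1

def perceptron(x, w, b, activation=binary_step):
    # Weighted sum of inputs plus bias, passed through the activation function
    return activation(np.dot(w, x) + b)

# Example: two inputs weighted 0.5 each, bias -0.7
perceptron([1, 1], [0.5, 0.5], -0.7)  # -> 1 (sum 0.3 clears the threshold)
perceptron([1, 0], [0.5, 0.5], -0.7)  # -> 0 (sum -0.2 falls short)
```

The only thing the network "knows" is stored in the weights and the bias; learning means adjusting these numbers.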

Our experiment is inspired by a real-world scenario. Imagine being a mobile phone manufacturer in 2000, training a perceptron to predict whether customers will buy a particular phone model. At that time, touchscreens were immature, and customers preferred devices with keypads and lower prices, making the Nokia 3310 a best-seller. We train the perceptron using a hypothetical dataset reflecting these preferences. The perceptron gradually adjusts its weights to minimize prediction errors, learning that customers prefer low-priced phones with keypads.
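Such a training run can be sketched with the classic perceptron learning rule. The dataset below is hypothetical, as in the article: two features (touchscreen, high price) and a label saying whether customers buy; only the keypad-and-cheap combination sells in 2000.

```python
import numpy as np

def train_perceptron(X, y, w=None, b=0.0, lr=0.1, max_epochs=100):
    # Perceptron learning rule: nudge weights after every wrong prediction
    w = np.zeros(X.shape[1]) if w is None else np.array(w, dtype=float)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in zip(X, y):
            pred = 1 if np.dot(w, x) + b >= 0 else 0
            update = lr * (target - pred)
            w += update * x          # adjust weights toward the target
            b += update              # adjust the bias the same way
            errors += int(update != 0)
        if errors == 0:              # a full error-free pass: converged
            break
    return w, b, epoch

# Hypothetical year-2000 data: features = [has_touchscreen, is_expensive],
# label = 1 if customers buy. Keypad + low price sells (the Nokia 3310 case).
X_2000 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_2000 = np.array([1, 0, 0, 0])

w, b, epochs = train_perceptron(X_2000, y_2000)
```

After convergence the learned weights are negative for both features: touchscreens and high prices each push the prediction toward "no sale".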

Fast forward to 2007, when touchscreens have become user-friendly, and customers now prefer them over keypads. Additionally, mobile phones have become status symbols, and customers are willing to pay higher prices. Apple's entry into the market with the iPhone exemplifies this shift. To adapt to these new preferences, we retrain the perceptron with updated data. The retraining process requires the network to unlearn old weights and adjust to new ones, which is more complex than initial training.

Our experiment shows that retraining the old perceptron takes more iterations than training a new one from scratch: the old network must first unlearn its previous weights before it can learn the new ones, whereas a fresh perceptron learns the new weights directly. This insight carries over to real-world scenarios, such as why established companies struggle to adapt to disruptive innovations while new market entrants do not.
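The comparison can be reproduced in a few lines (again a hedged sketch with hypothetical data, not the article's original code): train one network on the 2000 preferences and retrain it on the 2007 preferences, train a second network on the 2007 data from scratch, and count the epochs each needs to converge.

```python
import numpy as np

def train_perceptron(X, y, w=None, b=0.0, lr=0.1, max_epochs=100):
    # Perceptron learning rule; returns weights, bias, and epochs to converge
    w = np.zeros(X.shape[1]) if w is None else np.array(w, dtype=float)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in zip(X, y):
            pred = 1 if np.dot(w, x) + b >= 0 else 0
            update = lr * (target - pred)
            w += update * x
            b += update
            errors += int(update != 0)
        if errors == 0:
            break
    return w, b, epoch

# Features = [has_touchscreen, is_expensive]
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_2000 = np.array([1, 0, 0, 0])  # 2000: keypad + low price sells
y_2007 = np.array([0, 0, 1, 1])  # 2007: touchscreen sells, price matters less

# "Old" network: learn the 2000 market, then retrain on 2007 data
w_old, b_old, _ = train_perceptron(X, y_2000)
_, _, epochs_retrain = train_perceptron(X, y_2007, w=w_old, b=b_old)

# "New" network: learn the 2007 market from scratch
_, _, epochs_fresh = train_perceptron(X, y_2007)

print(epochs_retrain, epochs_fresh)  # retraining needs more passes
```

The exact epoch counts depend on the learning rate and the sample order, but the pattern holds: weights tuned to the old market point in the wrong direction and must first be driven back before the new preferences can be encoded.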

The experiment demonstrates that retraining an ANN can be more costly than training a new one. This principle can be extended to human learning, suggesting that unlearning and relearning are inherently more challenging. In "The Innovator's Dilemma," Christensen explains why market leaders often fail to adapt to new technologies, while new entrants succeed. Economic factors play a role, but our experiment suggests that mathematical reasons might also contribute. New entrants can learn from scratch, while established firms must first unlearn outdated practices.

Radical changes pose challenges not only for businesses but also for society. In "The Second Machine Age," Brynjolfsson and McAfee compare the digital age to the industrial revolution, highlighting the social upheaval caused by disruptive technologies. The struggle to adapt might be due to the difficulty of unlearning and relearning, as illustrated by our experiment. Understanding these challenges can help address the pain points of societal transformation.