Most industries are already starting to feel the impact of AI. Tooling is being released that vastly improves capabilities, making some people more productive while putting others out of work. But more ambitious use cases are hitting a limit of today's AI, and one key factor is memory. Many AI systems have short memories: before long you have to send the context of the conversation alongside each input as a reminder of what you're discussing.

This is a big limiting factor in developing complex tooling and systems: how can you work with something that has the memory of a goldfish? We'll dig into the different ways memory extension is currently achieved, some applications of AI once long-term memory is available, and the ethical considerations involved.

The Evolution of AI and Long-term Memory Techniques

While the rise of highly capable AI systems seems to have come out of nowhere for the general public, the history of AI and long-term memory techniques goes back to the development of early neural networks. By understanding how these techniques have evolved, we can start to see how they might continue to move forward.

Feedforward Neural Networks

The earliest AI models included feedforward neural networks, in which information flows in only one direction, from input to output. These networks have no mechanism for retaining information from one input to the next. As a result, they struggled to capture complex temporal dependencies and had no real notion of short-term versus long-term memory.
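As a rough illustration, here is a tiny feedforward pass in Python (using NumPy, with invented weights and sizes): information flows straight through, and nothing survives between calls.

```python
import numpy as np

# A tiny feedforward network: information flows input -> hidden -> output,
# and nothing is carried over from one input to the next.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # input (4 features) -> hidden (8 units)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)   # hidden (8 units) -> output (2 values)

def forward(x):
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2      # no state survives once this returns

print(forward(rng.normal(size=4)))
```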

Basic Recurrent Neural Networks (RNNs)

To overcome the limitations of feedforward networks, researchers developed basic recurrent neural networks (RNNs). RNNs are capable of processing sequences of data and maintaining some form of memory by allowing information to flow in loops. While they demonstrated better performance than feedforward networks, RNNs struggled with vanishing gradients and handling temporal dependencies over long time horizons.
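A minimal sketch of that loop, again with invented NumPy weights: the hidden state h is fed back in at every timestep, and that recurrence is the only "memory" a basic RNN has.

```python
import numpy as np

# A single-layer RNN run step by step: the hidden state h is the loop that
# carries information forward from one timestep to the next.
rng = np.random.default_rng(0)
Wxh = rng.normal(size=(3, 5)) * 0.1   # input -> hidden
Whh = rng.normal(size=(5, 5)) * 0.1   # hidden -> hidden (the recurrence)
bh = np.zeros(5)

def run_sequence(inputs):
    h = np.zeros(5)                                # memory starts empty
    for x in inputs:
        h = np.tanh(x @ Wxh + h @ Whh + bh)        # new state depends on old state
    return h

sequence = rng.normal(size=(10, 3))                # 10 timesteps of 3 features each
print(run_sequence(sequence))
```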

Long Short-Term Memory (LSTM)

LSTM was one of the most significant advancements in the field of AI and long-term memory. Introduced by Hochreiter and Schmidhuber in 1997, LSTM networks utilize special gate mechanisms that control the flow of information. This approach enables these networks to learn and remember complex temporal dependencies, making them well-suited for tasks that involve sequential data.
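As a sketch of the gating idea, here is one common formulation of a single LSTM step in NumPy, with invented weights: the input, forget, and output gates decide what gets written to, kept in, and read from the cell state.

```python
import numpy as np

# One step of an LSTM cell, showing the gates that control what is written to,
# kept in, and read from the cell state c.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(size=(n_in + n_hid, 4 * n_hid)) * 0.1
b = np.zeros(4 * n_hid)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c + i * g          # forget part of the old memory, write the new candidate
    h = o * np.tanh(c)         # expose a gated view of the memory
    return h, c

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):   # run a short sequence through the cell
    h, c = lstm_step(x, h, c)
print(h)
```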

Attention Mechanisms

Attention mechanisms emerged as another breakthrough in long-term memory approaches. Mimicking the human cognitive process of selectively focusing on specific aspects, attention mechanisms allow neural networks to weigh the importance of inputs and direct attention accordingly. This capability greatly improved the efficiency of managing long sequences of data and significantly impacted natural language processing and machine translation tasks.
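The core of most attention mechanisms is scaled dot-product attention; a minimal NumPy version, with invented inputs, looks like this:

```python
import numpy as np

# Scaled dot-product attention: each query scores every key, and the softmax
# weights decide how much of each value contributes to the output.
def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ V                               # weighted mix of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 8))   # 2 queries
K = rng.normal(size=(6, 8))   # 6 keys, one per token in the sequence
V = rng.normal(size=(6, 8))   # 6 values
print(attention(Q, K, V).shape)   # (2, 8)
```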

When we talk about long-term memory in today's AI systems, there is a strong focus on the size of the context window. Most users aren't going to train their own models, especially since some of them cost millions of dollars to train, so the context window is the key mechanism for interaction. There are a few ways of working within this limitation to get closer to desired outcomes, one of which is sketched below, but let's also see what's possible more broadly.
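A minimal sketch of the simplest workaround, assuming a character budget as a stand-in for a real token limit: keep only the most recent conversation turns that fit, and accept that older turns are forgotten.

```python
# Keep only as many of the most recent conversation turns as fit the budget.
MAX_CONTEXT_CHARS = 2000   # assumed stand-in for a real token limit

def build_prompt(turns: list[str], question: str) -> str:
    kept: list[str] = []
    budget = MAX_CONTEXT_CHARS - len(question)
    for turn in reversed(turns):           # walk backwards from the newest turn
        if len(turn) + 1 > budget:
            break                          # older turns are simply dropped
        kept.insert(0, turn)
        budget -= len(turn) + 1
    return "\n".join(kept + [question])

turns = [f"Turn {i}: " + "x" * 200 for i in range(30)]
prompt = build_prompt(turns, "What did we decide earlier?")
print(len(prompt))   # always within budget, at the cost of forgetting old turns
```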

Different Approaches to AI and Long-term Memory

Various techniques have emerged to augment the ability of AI systems to manage long-term memory and to address the constraints of smaller context windows. The AI field is still quite theory-heavy, but much of that theory has led directly to applications that extend memory, each with its own trade-offs.

Memory-augmented Neural Networks (MANNs)

MANNs integrate external memory storage with a traditional neural network architecture to improve the network's ability to retain and access relevant information over time. These models typically use key-value memories that store associations between keys (features) and values (their corresponding information). One popular implementation is the Neural Turing Machine (NTM), which combines a neural network controller with an external memory matrix. In financial applications, MANNs can be employed for tasks requiring rapid information retrieval, such as algorithmic trading, portfolio management, and customer service chatbots.
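As a loose sketch of the content-based addressing these models use (not a full NTM, and with invented data), a soft read over a key-value memory might look like this:

```python
import numpy as np

# Content-based read from an external key-value memory: the query attends to
# the stored keys and returns a blend of the stored values.
rng = np.random.default_rng(0)
keys = rng.normal(size=(16, 8))      # 16 memory slots, 8-dim keys
values = rng.normal(size=(16, 4))    # the information stored in each slot

def read(query):
    sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-9)
    weights = np.exp(sims) / np.exp(sims).sum()   # soft addressing over the slots
    return weights @ values                       # weighted recall

print(read(rng.normal(size=8)))
```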

Reinforcement Learning

Reinforcement learning focuses on training AI models to make decisions through trial and error while maximizing the overall reward. Although not a direct long-term memory technique, reinforcement learning implicitly plays a role in capturing sequential dependencies and using past experiences to influence future actions. Its applications include credit scoring systems, trading strategies, and customer segmentation in finance.
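For a concrete, if toy, example of learning from past experience, here is tabular Q-learning on an invented five-state chain; the Q-table is the agent's record of which past actions paid off.

```python
import numpy as np

# Tabular Q-learning on a toy 5-state chain, with a reward for reaching the end.
n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

for episode in range(300):
    state = 0
    while state != n_states - 1:            # reaching the last state ends the episode
        explore = rng.random() < epsilon or Q[state].max() == Q[state].min()
        action = int(rng.integers(n_actions)) if explore else int(Q[state].argmax())
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Nudge the estimate toward observed reward plus discounted future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.round(2))   # learned values favour action 1 (moving right, toward the reward)
```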

Lifelong Learning and Catastrophic Forgetting

Lifelong learning refers to the development of AI models capable of adapting to new tasks while retaining knowledge from prior experiences. This is achieved by combating catastrophic forgetting, a phenomenon where AI models lose critical information when learning new tasks. Techniques like elastic weight consolidation (EWC) and progressive neural networks (PNN) help mitigate this issue, making lifelong learning a step closer to reality.
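A rough sketch of the EWC idea, with invented numbers: the loss for a new task is augmented with a penalty that anchors the parameters that mattered most for the old task.

```python
import numpy as np

# Sketch of an elastic weight consolidation (EWC) style loss: penalize moving
# parameters that were important for the previous task.
def ewc_loss(new_task_loss, params, old_params, fisher, lam=1.0):
    # fisher approximates how important each parameter was to the old task;
    # important parameters are anchored, unimportant ones are free to move.
    penalty = 0.5 * lam * np.sum(fisher * (params - old_params) ** 2)
    return new_task_loss + penalty

params = np.array([0.9, -0.2, 1.5])        # parameters after some new-task training
old_params = np.array([1.0, 0.0, 1.4])     # parameters that solved the old task
fisher = np.array([5.0, 0.1, 2.0])         # first and third weights mattered most before
print(ewc_loss(new_task_loss=0.3, params=params, old_params=old_params, fisher=fisher))
```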

Trade-offs in Long-term Memory AI Systems

Long-term memory does not come for free; the question is what cost you are willing to pay for a given level of quality in an implementation.

The first is computational resources. AI models that maintain state over time, such as RNNs and LSTMs, are complex and generally require substantial computational power and memory. The more complex the model, the slower it trains and the more hardware it needs.

Training data is the next consideration. "Garbage in, garbage out" still applies to large models, and models with long-term memory often require large volumes of relevant, high-quality data to learn effectively. Gathering this data can be time-consuming, and a lot of consideration needs to be given to privacy and ethics, especially in sensitive domains like finance. A potential workaround is to generate this data with AI, but synthetic data is not a magic fix.

Algorithmic complexity is another challenge. Models capable of long-term memory, such as LSTMs or Transformers, use sophisticated architectures that can be difficult to design, understand, and debug. These complexities can lengthen the development process and demand advanced expertise.

The Impact and Potential of Long-term Memory AI in Finance

AI with long-term memory would have a substantial impact on finance, both on internal functions and on direct interactions with customers. Digital transformation is already in full swing at most financial institutions, and this would accelerate its impact.

One key area where long-term memory would bring considerable change is credit scoring. Traditional credit scoring approaches rely on static variables and do not always capture the intricacies of an applicant's financial behavior over time. An AI model with long-term memory, however, could analyze a loan applicant's financial behavior over time, tracking patterns and trends that emerge over extended periods. This could provide a more holistic and nuanced understanding of an applicant's creditworthiness, leading to more accurate risk assessments.

For example, suppose a potential borrower has a historically stable financial record but recently experienced a few months of economic hardship. Traditional models might penalize this short-term instability, potentially leading to an unjustifiably high interest rate or a loan denial. An AI system with long-term memory could recognize this as a temporary deviation from generally reliable financial behavior, leading to a fairer assessment.
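A toy illustration of the difference, with invented numbers rather than a real credit model: a score computed only from the recent window looks alarming, while one computed over the full history does not.

```python
import numpy as np

# Invented monthly "repayment health" scores: ~5 stable years, then 3 hard months.
rng = np.random.default_rng(0)
months = np.concatenate([rng.normal(0.8, 0.05, 57),
                         rng.normal(0.3, 0.05, 3)])

short_memory_score = months[-3:].mean()   # what a recent-window model sees
long_memory_score = months.mean()         # what a model with the full history sees

print(f"recent-window score: {short_memory_score:.2f}")
print(f"full-history score:  {long_memory_score:.2f}")
```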

Similarly, fraud detection stands to gain significantly. Traditionally, fraud detection systems have relied on detecting anomalies or patterns within relatively limited time frames. However, some fraudulent activities unfold slowly over extended periods, making them hard to detect with these traditional methods.

By remembering patterns and behavior over longer periods, these systems could identify slow-burning anomalies that indicate fraud. For instance, consider a scenario where small, seemingly insignificant amounts are regularly siphoned off a customer's account, flying under the radar of conventional systems. An AI model with long-term memory could recognize this pattern over time, flagging it for further investigation.
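A toy sketch of that scenario, with invented transactions: any single week looks normal, but a long enough history reveals the same small debit recurring again and again.

```python
from collections import Counter

# ~6 months of invented account activity with a small recurring siphon (-9.99).
transactions = [-1200.0, 650.0, -80.5, -9.99] * 26
small_debits = Counter(t for t in transactions if -15 < t < 0)

for amount, count in small_debits.items():
    if count >= 12:     # the same small amount, many times, over a long period
        print(f"flag for review: {amount} recurred {count} times")
```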

Customer service could also be significantly enhanced by long-term memory AI. Virtual assistants or chatbots could remember past interactions with customers, leading to more personalized and efficient service. A customer might previously have inquired about certain financial products or services, and a virtual assistant with long-term memory could recall this in future interactions, offering tailored suggestions based on the customer's past interests. The customer's own data could also feed into any feedback and recommendations.
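A minimal sketch of per-customer memory for such an assistant, where `call_model` is a hypothetical placeholder for whatever completion API is used:

```python
from collections import defaultdict

# Past interactions are stored per customer and replayed into the next prompt.
customer_memory: dict[str, list[str]] = defaultdict(list)

def call_model(prompt: str) -> str:
    return "(assistant reply)"            # placeholder for a real completion API

def handle(customer_id: str, message: str) -> str:
    context = "\n".join(customer_memory[customer_id][-20:])   # recall earlier turns
    reply = call_model(f"Previous interactions:\n{context}\n\nCustomer: {message}\nAssistant:")
    customer_memory[customer_id].append(f"Customer: {message}")
    customer_memory[customer_id].append(f"Assistant: {reply}")
    return reply

handle("cust-42", "Tell me about your fixed-rate savings products.")
print(handle("cust-42", "What was I asking about last time?"))
```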

Conclusion

As AI systems continue to get better at retaining memory, the impact will be deep and profound, both for financial institutions themselves and for how customers interact with them. The potential to transform credit scoring, fraud detection, and customer service, among other areas, signals an exciting new chapter in financial innovation.

But this is not without challenges. Data privacy, computational resources, and algorithmic complexity all need to be addressed strategically and implemented carefully.

Working through these challenges and safely implementing these solutions within financial institutions will unlock significant value for both the company and its customers. The earlier companies start experimenting with this technology and understanding its nuances and complexity, the sooner a fully fledged solution can be deployed and the benefits realized.