Bloomberg Research Reveals New Insights on AI Safety in Finance with RAG LLMs
In an effort to address the challenges of artificial intelligence (AI) in finance, Bloomberg has released two academic papers that examine the safety implications of using retrieval-augmented generation (RAG)-based large language models (LLMs) in the industry. Both studies underscore the need for more responsible and trustworthy AI solutions, especially in high-stakes environments like capital markets.
Findings of the Research
The first paper, titled "RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models," details how RAG techniques, which are designed to improve the accuracy of LLM outputs by grounding them in context retrieved from external data, can paradoxically compromise their safety. The researchers evaluated the safety profiles of 11 prominent LLMs, including Claude-3.5-Sonnet and GPT-4o, against more than 5,000 harmful queries. The trend they found is alarming: models considered quite safe in non-RAG settings became more vulnerable and generated unsafe outputs when the same queries were posed in RAG contexts.
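To make the experimental setup concrete, the sketch below pairs each harmful query with itself: once sent to a model directly, and once wrapped in retrieved context the way a RAG pipeline would present it. This is an illustrative harness only; the `call_llm` and `retrieve` callables are placeholders for any LLM API and retriever, and the keyword-based refusal check is a crude stand-in for the paper's actual safety evaluation.

```python
from typing import Callable, Dict, List

# Toy proxy for a refusal: a real evaluation would use a trained safety judge.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "not able to help")


def is_refusal(response: str) -> bool:
    """Crude keyword heuristic: did the model decline to answer?"""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def build_rag_prompt(query: str, documents: List[str]) -> str:
    """Wrap a query in retrieval-style context, as a RAG pipeline would."""
    context = "\n\n".join(
        f"[Document {i + 1}]\n{doc}" for i, doc in enumerate(documents)
    )
    return f"Answer the question using the context below.\n\n{context}\n\nQuestion: {query}"


def compare_safety(
    call_llm: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    harmful_queries: List[str],
) -> Dict[str, int]:
    """Count refusals for the same queries with and without RAG context."""
    tally = {"non_rag_refusals": 0, "rag_refusals": 0, "total": len(harmful_queries)}
    for query in harmful_queries:
        if is_refusal(call_llm(query)):  # query sent directly, no context
            tally["non_rag_refusals"] += 1
        rag_prompt = build_rag_prompt(query, retrieve(query))
        if is_refusal(call_llm(rag_prompt)):  # same query, with retrieved context
            tally["rag_refusals"] += 1
    return tally
```

Over a large query set, a refusal count that drops from the non-RAG to the RAG condition would mirror the degradation the researchers report.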
Dr. Amanda Stent, Bloomberg's Head of AI Strategy & Research in the CTO Office, highlighted the implications of the findings, noting how ubiquitous RAG systems are in applications such as customer support and question answering. "AI practitioners must embrace a more thoughtful approach to employing RAG, ensuring adequate safeguards are in place to mitigate potential risks. Our research provides a framework for evaluating AI solutions and identifying blind spots," she stated.
In the second paper, "Understanding and Mitigating Risks of Generative AI in Financial Services," the focus shifts to practical applications of GenAI in capital markets. Bloomberg's researchers identified a gap in existing safety taxonomies and guardrail systems, which often fail to address risks unique to financial services, such as confidential disclosures and sector-specific ethical dilemmas.
The Role of Bloomberg's Taxonomy
To fill this void, the team proposed an AI content risk taxonomy tailored to the demands of financial services. It extends beyond general-purpose frameworks to address risks such as financial-services impartiality and misconduct. David Rabinowitz, Technical Product Manager for AI Guardrails at Bloomberg, remarked, "While there is considerable research on bias and fairness for consumer-oriented GenAI applications, we need focused attention in industry-specific contexts like finance."
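As a rough illustration of how such a taxonomy might be represented in a guardrail layer, the sketch below encodes a few finance-flavored risk categories and a toy check against them. The category names, descriptions, keywords, and matching logic are hypothetical simplifications; they are not Bloomberg's published taxonomy, and a production guardrail would use trained classifiers rather than keyword lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RiskCategory:
    """One entry in a content risk taxonomy."""
    name: str
    description: str
    keywords: List[str]  # toy trigger terms standing in for a real classifier


# Hypothetical finance-specific categories, loosely echoing the risks
# the paper discusses (confidentiality, misconduct, impartiality).
FINANCE_RISK_TAXONOMY = [
    RiskCategory(
        name="confidential_disclosure",
        description="Reveals material non-public or client-confidential information.",
        keywords=["non-public", "insider", "confidential"],
    ),
    RiskCategory(
        name="financial_misconduct",
        description="Facilitates market manipulation, fraud, or other misconduct.",
        keywords=["pump and dump", "spoofing", "front-running"],
    ),
    RiskCategory(
        name="impartiality",
        description="Presents one-sided investment advice as neutral analysis.",
        keywords=["guaranteed return", "can't lose"],
    ),
]


def flag_risks(text: str) -> List[str]:
    """Return the names of taxonomy categories the text appears to trigger."""
    lowered = text.lower()
    return [
        category.name
        for category in FINANCE_RISK_TAXONOMY
        if any(keyword in lowered for keyword in category.keywords)
    ]


print(flag_risks("This insider tip is a guaranteed return."))
# -> ['confidential_disclosure', 'impartiality']
```

The design point is that the taxonomy is data, not code: a guardrail system can swap in sector-specific categories without changing how model outputs are screened.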
Dr. Sebastian Gehrmann, Bloomberg's Head of Responsible AI, emphasized the pressing need for all sectors to adopt AI effectively while mitigating risks, saying, "Our taxonomy aims to empower the financial sector to develop AI systems that are not only compliant with emerging regulatory frameworks but also foster enduring trust with clients."
Presentations and Future Steps
The findings from both papers will be shared at upcoming conferences: the first study will be presented at NAACL 2025 in Albuquerque, New Mexico, while the second will be presented at the ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT) in Athens, Greece, this June.
Conclusion
Bloomberg's commitment to advancing AI responsibly in finance is clear from these research outputs. The papers urge organizations not only to evaluate the safety of their AI models, but also to assess AI's capabilities honestly as it is integrated into the finance landscape. By addressing safety, transparency, and responsible usage, Bloomberg is paving the way for innovations in how financial information is processed and presented through AI.
For further reading, detailed analyses are available on Bloomberg's Tech At Bloomberg blog, along with both research papers.