lucas_mearian
Senior Reporter

Q&A: How Thomson Reuters used genAI to enable a citizen developer workforce

feature
Feb 01, 2024 | 19 mins
Artificial Intelligence, Augmented Reality, Chatbots

Thomson Reuters spent years building an AI platform to cull through massive troves of data and documents for its legal, global trade and compliance clients. But when generative AI came along, the company was forced to up its game.

Credit: Shutterstock/Beautrium

Over the past three decades, Thomson Reuters has relied on artificial intelligence (AI) to help its clients — and its own employees — sift through troves of digital documents to discover those most relevant to the issue at hand.

But when generative AI (genAI) burst onto the scene in late 2022, the company was forced to rethink its AI strategy to stay ahead of competitors and address an insatiable client demand for industry-specific information.

Thomson Reuters in November unveiled its genAI strategy and product rollout after its integration with Microsoft Copilot, along with a $650 million acquisition of genAI tech provider Casetext and a pledge to invest $100 million annually in new genAI tools for internal and client use.

The company’s genAI solution is a cloud-based, API-driven platform that leverages the full scale of the company’s content to enable employees and clients to build new AI skills with reusable components. The Toronto-based company also had to reskill all of its employees to understand how to best use the new genAI platform.

Along with a staff of more than 2,500 journalists and 6,500 photojournalists, the global content and technology company provides data and information to professionals across the legal, tax, and accounting industries, among others.

Shawn Malhotra, head of engineering at Thomson Reuters, led the creation of an “AI Skills Factory,” a low-code way to create new apps with minimal engineering support. The platform enables technologists to design, build, and deploy tools quickly, while allowing non-techies to experiment with genAI safely, making innovation and ideation faster and more inclusive. 

Because of the success of its genAI platform, Thomson Reuters has been able to roll out three AI-enabled solutions for attorneys and other clients over the past three months, and plans more in the near future.

Shawn Malhotra, head of engineering for Thomson Reuters.

Computerworld talked with Malhotra about how his company built its AI strategy and how it has helped workers internally and clients perform their jobs.

Tell me about Thomson Reuters and the problem it was facing that genAI addressed? “GenAI was a game changer. We serve legal professionals, tax professionals, risk and compliance professionals, and Reuters News. We have a history of using natural language processing prompts.

“We started by listening to our customers and realizing that many of the things they struggle with and spend a lot of their time on we can accelerate with these tools around large language models (LLMs). In fact, that created a secondary problem for us; there was so much opportunity across all these end markets that the problem we faced is how do we address all of this at the pace the customers required us to? The genAI platform allows us to innovate at the speed our customers need.

“In November, we launched the AI-Assisted Research on Westlaw Precision memo. We have another two [recent] product launches and being able to do that at pace is really enabled by this AI platform, which lets our developers quickly leverage building blocks that they can reuse across multiple products.”

You must have already had an AI team in place. When genAI arrived, who did you add to that team, and who would you suggest other organizations have on their AI teams? “I don’t think it’s much different from other development efforts. You need developers. You need designers. You need product managers. You need your legal team to ensure what you’re doing is what your customers expect it to be. All of the stakeholders you’d expect to have. The difference with AI is that, whether it’s legal, developers, or marketing, this is new to us. So it’s the same types of teams, but they have to have a greater depth of understanding of AI, which is why training is so important. And they have to be able to solve the new problems that are emerging.

“We had a great organization already focused on AI and then rapidly grew that team [for genAI]… One of the biggest value adds is even [with] technologists who aren’t AI experts and non-technologists, you need to find a way to get them to be able to add value for customers as well.

“If the solution to the problem is strictly hiring more AI experts to deliver for your customers, you won’t be able to deliver new products fast enough. You have to do it in a scalable way, and for us that’s taking that AI expertise and [using] that to build the building blocks in order to let the non-AI experts deliver AI value to our customers. That’s the only way I see this scaling.”

When genAI became available to the public in late 2022, how was that a game changer? “We’d been experimenting with the previous versions of [OpenAI’s] large language models, as well as other transformer-based models in the past. So, we’ve always had this on our radar.

“What happened in November 2022 was the size and quality of language models at that time passed an inflection point where problems that we weren’t capable of solving before, we’re now able to. We’d tried with previous models, like using them to help an attorney summarize a whole bunch of cases and get to the salient facts effectively. They couldn’t do that well. All of a sudden they were good enough. So what GPT did was accelerate something that was already in motion, and so we had to quickly react to that.”

Tell me about how you’ve been applying genAI in the workplace through your acquisition of Casetext and AI-assisted research tools such as Westlaw Precision and CoCounsel Core. “I’ll start with Westlaw. What we released in November last year was the AI-assisted research memo. One of the key problems an attorney faces is at some point virtually all of them have to do legal research. That often means entering search text into a search engine in Westlaw in order to find cases that may be relevant to the legal question you have.

“That requires complete, current, and correct content to make sure you’re searching over the right stuff. And it requires you as an attorney to read through all that content. Hopefully, it only surfaced the relevant content, but you still have to read through it. You’ve got to understand if it was actually relevant to the research you were doing. That’s the first step.

“Previously, we’d used AI to help surface the right information. That solved the search problem. Now, what genAI allowed us to do is not only does Westlaw Precision memo find the right information, it now summarizes all of the cases that might be relevant. It gives you the citations you might need so you know it’s coming from trusted content. And then it actually gives you a readable, understandable summary.

“That saves our customers time. And it helps them provide a higher quality product for their end customers.

“Casetext’s CoCounsel was created by some great work from the Casetext team. They were actually one of the first companies in the world to partner with OpenAI prior to the release of GPT. That gave them a head start in creating things they’d refer to as AI-skills: things that will help an attorney do things like summarizing documents, asking questions of a database of information and a variety of other skills.

“Rather than search for your own content, you just tell us what you want to do and we surface the right content back to you. So it’ll help get AI-assisted answers much more quickly.”

What use of genAI surprised you? “I think it’s just the quality of the result. What we weren’t surprised by was if you used generative AI without world-class, trusted content, you do have problems. But the scale of those problems wasn’t as big as it used to be. We used to say if you took those language models and graded them prior to the recent innovations, they might have been a D student or an F student; they were doing quite poorly.

“When you came to newer LLMs and tried them out of the box in the customer space, it might have graduated to a D or C student. So that was a big change. It was a considerable improvement.

“But what we learned is by applying techniques such as retrieval augmented generation [RAG], which allowed us to take our world-class content and the power of an LLM and combine them, now you can make it an A student. The thing to remember is that, for our customer base, it’s not OK to be right some of the time. They deal with very high-stakes situations where they have to know the content is trusted and current and complete.

“That RAG-based approach was a real ‘aha moment,’ where we found some real value for our customers.”
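As a rough illustration of the retrieval augmented generation idea described here, the following Python sketch retrieves the most relevant passages from a small trusted corpus and places them, with citation ids, into the prompt that would be sent to an LLM. It is not Thomson Reuters' implementation: the corpus, the document ids, the keyword-overlap scoring and the function names are all assumptions made for illustration, and the actual model call is left out.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for trusted, curated content.
CORPUS = {
    "case-001": "The court held that the lease was terminated for non-payment of rent.",
    "case-002": "The tribunal found the tax deduction was allowed for home office expenses.",
    "case-003": "The court held that the tenant may withhold rent when the premises are uninhabitable.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score(query, document):
    """Toy relevance score: number of query tokens that also occur in the document."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(document))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query, corpus, k=2):
    """Return the ids of the k best-scoring documents (ties broken by id)."""
    ranked = sorted(corpus, key=lambda doc_id: (-score(query, corpus[doc_id]), doc_id))
    return ranked[:k]


def build_prompt(query, corpus, k=2):
    """Combine retrieved passages and the question into one grounded prompt."""
    doc_ids = retrieve(query, corpus, k)
    sources = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}\n"
    )


if __name__ == "__main__":
    # The printed prompt is what would be handed to a language model.
    print(build_prompt("Can a tenant withhold rent?", CORPUS))
```

A production system would typically replace the keyword overlap with a proper search index or embedding-based retrieval, but the shape is the same: retrieve from trusted content first, then ask the model to answer from what was retrieved and cite it.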

Which of your data content systems did you have to plug into the LLMs and did vendors have the APIs needed or did you have to create them? “Some of our content is proprietary, so we’re not getting it from customers. So for years, we’ve built APIs that have made it very accessible. This goes back to that genAI platform. One of the components of the Thomson Reuters platform is a set of simple APIs to allow you to access content.

“If you’re a developer who wants to build an application and you want to access legal content, we have that genAI platform [to] give you easy and safe access to that content. So you can just worry about building the business logic for the application rather than build the APIs to get to the content.”

What were some of the challenges you faced? “I think it was just speed. There’s such an appetite to solve problems that genAI is capable of solving. It was ensuring we could deliver at the speed our customers needed but in a safe, reliable, and secure way. Those two things can sometimes be at odds with each other.”

What is the Thomson Reuters AI Platform? Is it simply your flavor of Microsoft 365 Copilot or is this a fully proprietary platform? “It is separate [from Copilot]. This is something Thomson Reuters developed. It’s a set of building blocks. Each one aims to make it easier for someone within Thomson Reuters to build a valuable genAI application for our customers. Some of the examples of building blocks…allow you to access content in a safe, secure way. Building blocks allow you to build a front-end experience that’s consistent across all our products, which means they’re easy to use.

“There are building blocks that are used to build the prompt for you. So the end developer doesn’t have to understand all the nuance of how to build a prompt for any given large language model. There are building blocks in there that allow you to experiment with different LLMs, so you can try them out and see which one will work well for you. Some of those proprietary models we built ourselves, others are third-party models we think produce good results.

“Some of those building blocks allow you to access the language model in what we call a low-code or no-code way. That means someone who isn’t a technologist, but who has an idea, can use it. So, say I’m an attorney editor at Thomson Reuters and I understand the law, and I wonder if an AI model could potentially do a good job summarizing a type of document. Our genAI platform allows them to experiment and answer that question without having to write any code. That’s really powerful, because going back to that fundamental problem with speed, it’s allowing everyone to take part in that innovation and help with ideas.”
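To make the building-block idea more concrete, here is a minimal Python sketch of two such blocks: one that builds the prompt so the end developer does not have to, and one that runs the same prompt against several interchangeable models for comparison. Everything in it is hypothetical, including the prompt wording, the function names and the two stub "models", which are plain functions standing in for real LLM calls.

```python
from typing import Callable, Dict

# A "model" here is any function from prompt text to response text.
Model = Callable[[str], str]

DOCUMENT_MARKER = "Document:\n"


def build_summary_prompt(document: str, audience: str = "attorney") -> str:
    """Prompt-building block: callers pass a document, not prompt wording."""
    return (
        "Task: Summarize the document below.\n"
        f"Audience: {audience}\n"
        f"{DOCUMENT_MARKER}{document}"
    )


def _document_part(prompt: str) -> str:
    """Helper for the stub models: recover the document text from the prompt."""
    return prompt.split(DOCUMENT_MARKER, 1)[1]


def first_sentence_model(prompt: str) -> str:
    """Stub model A: 'summarizes' by returning the first sentence."""
    return _document_part(prompt).split(".")[0].strip() + "."


def headline_model(prompt: str) -> str:
    """Stub model B: 'summarizes' by returning the first four words."""
    words = _document_part(prompt).split()
    return " ".join(words[:4]) + "..."


def run_experiment(document: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Experiment block: send one prompt to every model and collect the outputs."""
    prompt = build_summary_prompt(document)
    return {name: model(prompt) for name, model in models.items()}


if __name__ == "__main__":
    doc = "The lease ended in May. The tenant disputes the notice."
    results = run_experiment(
        doc,
        {"first_sentence": first_sentence_model, "headline": headline_model},
    )
    for name, output in results.items():
        print(f"{name}: {output}")
```

Because every model sits behind the same function signature, swapping in a different model means changing one dictionary entry, which is the kind of reuse a low-code front end could expose without asking the user to write any of this.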

AI-augmented code development has often been cited as low-hanging fruit for first-time users of genAI. What did you find? “There’s two things. Regardless of what function you’re in, and this doesn’t just apply to Reuters, genAI has the potential to help augment what we do — to make us more effective. So, if I look at my development team, we’re absolutely looking at generative AI tools that help them write better code and do that faster. That actually increased developer satisfaction. Again, that goes back to speed of developing new products for our clients.

“So, we’re definitely using that within our developers’ environment. Then you have that same kind of acceleration where someone at Thomson Reuters who really wasn’t capable of experimenting with AI, who wasn’t a technologist, by augmenting them with these low-code, no-code aspects of the genAI platform, they can take part in that ideation and experimentation process, too. So we’re helping them by almost removing the need for them to code in order to allow them access to experimentation with AI tools.”

What are some of those things your non-technical business users are experimenting with? “Whether it’s helping understand changes in tax law or it’s the ability to pull out salient facts from case law for research, we have so much expertise as an organization on tax, on risk, fraud and compliance, on the law, on the news. What we’ve done is make it possible for those SMEs, whatever business problem they’re trying to solve (and there are a lot of them), to figure out whether these genAI models are good at solving it.”

What security and privacy concerns do you have with genAI, especially since your LLMs are being run in the cloud or in a co-location facility? “Privacy and security have been at the forefront for us since the beginning. If you look at the markets we serve — legal professionals, tax professionals, risk-fraud-compliance professionals — they’re data sensitive. They have obligations to their customers that we have to help them respect and uphold. So security and privacy is embedded into every part of the development process.

“So, how do you do things at pace? What I didn’t want to do with the platform was have each development team figure out how to best safely and securely access the LLMs, and the content that feeds them, in a responsible way. Because even with the best of intentions, if they’re not doing it by design something could go wrong.

“This is where the genAI platform comes in. By having building blocks, we can ensure we are doing things in the right way. Building blocks ensure data residency and data privacy are respected. Building blocks ensure ethical concerns are being evaluated against the models we create. By doing all that in the platform, now all I have to do is tell these developers, ‘You have to use the platform.’ If they use the platform, then I know they have privacy, security, safety all built in by design.”

How have you trained your tech and business workers to use genAI? “Change management is just as important to us as it is to our customers. So we went through a process of creating foundational AI training for every member of our company. This is training we built with our Thomson Reuters data science experts and our technology experts geared toward a broad audience. So these were fundamentals that we thought everybody needs to know in order to best serve our customers. But then we created these spoke training programs for specific parts of the company.

“As an example, in my development organization, we rolled out a much more extensive AI training that was aimed at developers. These are people building products, so obviously AI training is going to go into greater depth than that foundational training. It’s geared toward someone who’s going to be building an AI application. We then rolled that out across the entire team and we track progress to ensure everyone gets through that training.

“And we have similar types of targeted training for other segments of the organization. What a salesperson in front of our customer is going to need to know about AI is going to be different from what a developer on my team will need to know about AI, but they both need to know about AI. So, we’re investing a lot in training and development.”

When did you start the AI training and how did you execute it? “We have dedicated training days. Last year, we had a dedicated AI training day. And that’s not the first time we’ve done that. While we change the material to adapt it to the AI world, AI training is not a new concept for us. We’ve had multiple times over the years where we’ve had our AI experts create training material for us. That’s done in the context of our customers and our business.

“Then we built new training modules to help with training on generative AI models. We created those at the beginning of 2023 as it became something with more broad interest. But the other AI training modules we’ve been using for years.”

How many training modules did you create? “There’s so many levels of AI training available to our employees, you could spend weeks trying to get through all of it. Some of it is optional and some of it is mandatory. It’s not optional [as a whole]. The only optional element to it is what level of your PhD do you want to achieve through the training?

“The AI Skills Factory goes back to the generative AI platform we have for clients. So, the Westlaw Research [tool] is part of the AI Skills Factory. They’re all built off this genAI platform. So, the Skills Factory – especially that low-code, no-code environment – [is] where we can develop these new AI-skills to rapidly bring them to market.”

What are the costs, the power requirements and the time that goes into building an AI platform like yours? Do you train up your own LLMs? “There are a variety of ways you can train an AI model. The largest models in the world are built by providers who invest the time and resources to build those gigantic models that can serve virtually any purpose in the world. That’s not something we’d build ourselves. We’d access those models ourselves just like any of our customers would.

“On us building our own models, we experiment with a variety of things. This goes back to R&D that I was talking about earlier. We experiment with using off-the-shelf models. We experiment with building our own models, which will not be as large as those gigantic models that come from the hyperscalers of the world.

“Again, all of it comes back down to the fact that there will not be a one-size fits all in the future. I imagine a future where depending on the customer problem, we’re going to employ a different kind of model. I suspect our content and subject matter expertise will allow us to provide unique value with custom-built models, but they’re not going to be the size of these gigantic models you see from hyperscalers.

“For us the sweet spot is discovering the smallest model that provides the best response to the customer’s problem. And the smaller the model, the more efficient it is in many aspects, like run time, like costs, like efficiency in all its forms. That’s what our R&D function does. It helps us identify how to use our content and subject-matter expertise to find the smallest model that will solve that problem for that customer.”
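One simple way to read "the smallest model that provides the best response" is as a selection rule: among the candidate models that clear a quality bar on an evaluation set, pick the smallest. The Python sketch below shows that rule only. The candidate names, sizes, quality scores and the threshold are invented for illustration and do not describe any real model or Thomson Reuters' R&D process.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    size_b_params: float  # hypothetical model size, in billions of parameters
    quality: float        # hypothetical evaluation score in [0, 1]


def smallest_adequate(
    candidates: Sequence[Candidate], min_quality: float
) -> Optional[Candidate]:
    """Return the smallest candidate whose quality meets the bar, or None."""
    adequate = [c for c in candidates if c.quality >= min_quality]
    return min(adequate, key=lambda c: c.size_b_params, default=None)


# Invented numbers, purely to exercise the selection rule.
CANDIDATES = [
    Candidate("small", 1.0, 0.71),
    Candidate("medium", 7.0, 0.88),
    Candidate("large", 70.0, 0.93),
]

if __name__ == "__main__":
    choice = smallest_adequate(CANDIDATES, min_quality=0.85)
    print(choice.name if choice else "no candidate meets the bar")
```

With these made-up scores, a bar of 0.85 selects the medium model rather than the large one, trading a small amount of measured quality for lower run time and cost.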

Your November announcement mentioned a multi-year strategy. Where do you see AI at Thomson Reuters going? “I think you’re going to see the pace at which we deliver go up. We had one product release in November, we have a couple more coming up and we had our acquisition of Casetext.

“I go back to speed again. I think what you’re going to see is that this isn’t something that’s ever going to be a point solution where we have one generative AI solution. In that diagram I shared, there’s something called AI Assistant, which is generative AI-powered, as something that exists across all of our products … where it’s helping you get more value, leverage those AI skills, and solve the problems you’re trying to solve more effectively. And we’re going to keep upping the pace at which we develop those skills and we integrate them into products.

“You’ll see more and more of that as we move ahead. That’s why that investment in the foundational platform was so important at this time, because we feel that will give us a differentiated advantage in the future.”

Your announcement stated Thomson Reuters will invest $100 million in AI. Is that this year or multiple years into the future?  “That’s what we’re organically investing in building our own AI solutions. That is a minimum of $100 million we’re going to put in, that’s the build part of our strategy. But I also want to talk about our recent acquisition of Casetext. If we see something that’s a great fit with Thomson Reuters in every way — culturally, technologically, and most importantly that can help solve customer problems — we’ll make acquisitions as part of our buy strategy.

“Then, there’s the partner strategy, as well. We recently announced a partnership with Microsoft where back during their Build Conference we were one of the first organizations to talk about what an integration with Microsoft Copilot could look like. So we have teams hard at work to show what that vision [is] to help lawyers more efficiently draft contracts in Microsoft Word by leveraging Copilot add-ins — we’re busy at work making that a reality.

“So, you’ll see us looking at all three of those angles: the $100 million to build; where we can buy; and we’ll continue to partner where we can use that to help service our customers.”