Decoding The Future
Decoding the Future, hosted by Fujitsu Uvance, dives into the transformative world of CX, data, and AI. Through conversations with top experts across Asia and Oceania, listeners explore groundbreaking trends like generative AI, vision AI, and data security. Discover how these innovations are reshaping industries like retail and healthcare and gain practical advice on leveraging technology to solve challenges, drive digital transformation, and stay competitive in today’s dynamic landscape.
EP.17 Cohere and Enterprise Language Models: The Future of Secure, Sovereign AI
Join Mahesh and Andrew Garkic as they explore enterprise AI and how Cohere is transforming large language models for businesses. Learn how secure, multilingual, and multimodal AI helps enterprises access their data without compromising privacy, compliance, or sovereignty.
💡 Topics Covered:
- Cohere’s Command & Embed models
- Retrieval Augmented Generation (RAG)
- Agentic AI with the North platform
- Use cases in finance, healthcare, legal, and government
- Key takeaway: Enterprise AI is sovereign, multimodal, and multilingual.
Thank you for listening!
Discover more content like this on Decoding the Future.
Learn more about Fujitsu's AI Solutions here.
Andrew (00:00.044)
You can't just put all your enterprise data into ChatGPT and send it off to OpenAI.
If you asked a question in Spanish, you can search over documents that are written in French. The response can be prompted to come back in English. So you can have like a trilingual conversation in one go.
Rather than taking data to the AI, we are taking the AI to the data.
North is the platform to be able to easily build these agents. It's a fairly low code environment.
We might read a book and skim through all the pages, but that's not the most efficient way for computers to be able to search for information.
Three words that summarise everything we covered in the podcast, right? Is it multimodal? Two words? It’s alright.
I guess: sovereign? Multimodal?
Andrew (00:41.723)
Fun.
Mahesh (00:48.590)
G'day everyone, welcome to another episode of Decoding the Future podcast. I've got Andrew Garkic with me today. Welcome back, Andrew. You're almost becoming a regular on the podcast.
Andrew (01:00.000 approx.)
Thank you, Mahesh. It's an honour to come back.
Mahesh (00:48.590 cont.)
Today we're going to be talking about Cohere and enterprise large language models. Models these days, Andrew, are coming out faster than your team goes out for coffee runs.
Cohere operates in a very niche space, I suppose, more enterprise large language models. So how does that fit in with the regular LLMs?
Andrew (01:22.722)
Yeah. Let's take coffee, since you mentioned it as the example. If you go to ChatGPT, you might ask: "Which city has the better coffee, Sydney or Melbourne?"
We're here in Sydney, but you're from Melbourne. So it'd be interesting to see what it has to say.
If the large language models say anything other than Melbourne, I seriously question the data it's been trained on.
So what about the enterprise context? I'm not sure, I'm not sure.
So in an enterprise context, it'd be like: maybe I took the team out for coffee, and I want to submit an expense claim for, you know, a hundred dollars for the coffee, but I don't know what the policy limits are.
So we'll get a chatbot to go look up the policy documents and then I'll understand how I can submit that claim.
Mahesh (01:55.000 approx.)
$100 for a coffee. I really want to know what kind of coffee our team is drinking.
OK, jokes aside, I think we'll talk about three things today:
- What enterprise large language models are
- Cohere and what they do in this space
- Use cases people can actually use this for
So Andrew, what are you seeing in the enterprise space when it comes to large language models?
Andrew (02:45.570)
Yeah. So if you think about ChatGPT, you go to the app and ask a question, it'll go to the internet to get information.
Where we're really seeing the value in the enterprise context is being able to take that same level of intelligence, but with enterprise data.
The challenge is that you can't just put all your enterprise data into ChatGPT and send it off to OpenAI. That's something that we're trying to tackle.
Mahesh (03:10.000 approx.)
Yeah, so the example you were talking about earlier, the coffee example, right? How do you claim the coffee expense? All of that is in the policy documents within the company. Obviously the internet doesn't know anything about it.
So you feed the data from the enterprise, like all the documents and everything else, ask the question.
Andrew (03:30.000 approx.)
Yeah. And so today you would need to, if you wanted to use ChatGPT, you'd need to get that PDF policy document, upload it into the chat window, and then have a conversation with that.
But we're trying to find more value to be able to do that in a more secure way in that enterprise context.
Mahesh (03:55.000 approx.)
So let's talk a little bit about Cohere.
Cohere focuses on the enterprise large language model space. It's a company that was founded in 2019, so they've been around for about six years.
They got funding for about $1.6 billion, I believe, and I think they’re currently valued around $7 billion.
Fujitsu has taken a stake in the company. We've got a small stake, along with the likes of Nvidia and Oracle. But more importantly, we've also signed a strategic partnership with them.
So we do a lot of work with them.
Andrew (04:08.716)
Yeah. And their CEO, Aidan Gomez, was actually one of the authors on the first paper, Attention Is All You Need, which was foundational for large language models.
He then left Google Brain and went on to build Cohere.
What’s interesting is that a lot of these companies are based in the US. Cohere is actually a Canadian company. I think they've got headquarters both in Canada and in San Francisco.
Mahesh (04:45.000 approx.)
So Andrew, let's talk about the problem space itself. What are the problems that we’re trying to solve when you talk about enterprise large language models?
I know you spoke about RAG and everything else, but what else are we trying to…
Andrew (05:05.000 approx.)
If you think about regulated industries, defence, state and federal government, we were talking earlier about data having to be loaded into ChatGPT. That definitely cannot happen.
A big concern for these sorts of clients is privacy and security of data, and also the sovereignty of the data.
These are all the things in play: compliance requirements, GDPR, privacy, security.
They want the capabilities, but at the moment the offerings are in the public cloud, and so that's the challenge.
Mahesh (05:35.000 approx.)
Enterprise large language models aren’t just bigger models. They focus more on security, compliance, and so many other things.
We used the strategic partnership to build a model on top of Cohere called Takane. I’ll come back to that.
But first, let’s talk about their models.
Andrew (05:48.942)
They have a whole family of models called the Command family.
They started with Command R, then Command R+, then Command R 7B. Different iterations.
Similar to OpenAI’s GPT versions.
Their latest flagship model at the moment is Command A, built for the agentic age, being able to plan and use tools.
A differentiator for Cohere is they also have models in the embedding space. They’ve got:
● Embed 4
● Rerank
A lot of frontier companies push models for chat or coding. Cohere is focused on enterprise.
Enterprise value is connecting to enterprise data. That’s where RAG comes in: Retrieval Augmented Generation.
If you ask a question like “How do I claim my coffee?”, it needs to retrieve the relevant info, bring it back to the model, and the model gives the answer.
To do that, enterprise data needs to be put into a format the model can search efficiently. We might read a book and skim pages, but that's not efficient for computers.
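The retrieve-then-answer flow Andrew describes can be sketched in a few lines. This is a toy illustration, not Cohere's actual API: `retrieve` uses keyword matching where a real system would use vector search, and `generate` is a stub standing in for a call to a hosted model such as a Command-family model. The policy text and function names are made up.

```python
# Minimal RAG sketch: retrieve the relevant policy text, then ground
# the model's answer in it. All names and documents are hypothetical.

POLICY_DOCS = {
    "expenses": "Coffee and meal claims are capped at $50 per person per day.",
    "leave": "Annual leave requests need two weeks' notice.",
}

def retrieve(question: str) -> str:
    """Toy keyword retrieval; a real system would use vector search."""
    words = question.lower().replace("?", "").split()
    scores = {
        name: sum(word in text.lower() for word in words)
        for name, text in POLICY_DOCS.items()
    }
    return POLICY_DOCS[max(scores, key=scores.get)]

def generate(prompt: str) -> str:
    """Stub for an LLM call: echoes the grounded context as the answer."""
    return prompt.split("Context: ")[1]

def answer(question: str) -> str:
    context = retrieve(question)                       # 1. retrieve
    prompt = f"Question: {question}\nContext: {context}"  # 2. augment
    return generate(prompt)                            # 3. generate

print(answer("How much can I claim for coffee?"))
```

The point of the pattern is the middle step: the model answers from retrieved enterprise context rather than from whatever it memorised during training.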
Andrew (07:31.298)
What the embed model does is convert characters and text in unstructured documents into numbers.
Then you use distance calculations to find the most relevant information.
Rerank is: once you’ve found, say, the top five most relevant pieces of information, another model sits over the top and says: based on the question and these five documents, rank them in order of what will be most useful.
So those are two additional models Cohere brings.
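The embed-then-rerank flow can be made concrete with toy vectors. Nothing here is Cohere's API: the hand-made 3-d vectors stand in for an embedding model's output, and the rerank function is a keyword-overlap stand-in for a trained reranker. Only the shape of the pipeline, distance search first, rerank over the shortlist second, reflects what's described above.

```python
import math

# Toy documents with hand-made embedding vectors (a real embedding
# model would produce these from the document text).
DOCS = {
    "expense policy": [0.9, 0.1, 0.0],
    "leave policy":   [0.1, 0.9, 0.0],
    "IT handbook":    [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: the 'distance calculation' over the numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, k=2):
    """Embed step: nearest documents by cosine similarity."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]

def rerank(question, candidates):
    """Stand-in for a rerank model: reorder the shortlist by usefulness."""
    words = question.lower().split()
    return sorted(candidates,
                  key=lambda d: sum(w in d for w in words),
                  reverse=True)

query_vec = [0.8, 0.2, 0.1]   # pretend this embeds the user's question
shortlist = top_k(query_vec)
print(rerank("what is the expense policy", shortlist))
```

The design point is that the cheap vector search narrows millions of documents to a handful, and the more expensive rerank model only ever sees that handful.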
The current Embed model is also multimodal. And multilingual.
We’re doing work with a company in Europe with complex schematic diagrams and engineering drawings. It can interpret all of that, which is powerful for field specialists.
Andrew (08:57.454)
A big part of Embed 4 is you can take images in those documents and also convert that to numbers.
So if I asked: “Find me this part manual”, and it had an image associated with it, that image will also show up in search results.
For multilingual: if you translate Spanish documents into those numbers, those numbers are universal to the computer. It doesn’t really matter what language it started in.
Mahesh (09:45.000 approx.)
One more thing we need to talk about, Andrew. Agentic is all the rage these days.
And of course, Cohere have got North.
Andrew (10:10.000 approx.)
Yeah. They’ve released an agentic platform where you can build multiple agents in the enterprise context.
An agent is taking a large language model, giving it the ability to plan, but also execute tools. It might call an API in an ERP system.
North is a fairly low-code environment. You can drag and drop workflows, define logic trees, and control what tools an agent can access.
You can run North on Cohere’s instance, but you can also deploy North in a local on-prem environment.
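The plan-then-execute loop Andrew describes, an LLM choosing a tool, the tool running, the result feeding the answer, can be sketched as a toy agent. This is not North's API: the `plan` function is a rule-based stub where a real agent would have the model choose the tool, and the tool names and dollar figures are invented for illustration.

```python
# Toy agent loop: a planner picks a tool, the tool executes, and the
# result becomes the answer. All names and values are hypothetical.

def lookup_policy(topic: str) -> str:
    """Pretend API call into a policy system."""
    policies = {"coffee": "Coffee claims are capped at $50 per person."}
    return policies.get(topic, "No policy found.")

def submit_claim(amount: float) -> str:
    """Pretend API call into an ERP system."""
    return f"Claim for ${amount:.2f} submitted for approval."

# The tools this agent is allowed to access (North lets you control this).
TOOLS = {"lookup_policy": lookup_policy, "submit_claim": submit_claim}

def plan(question: str):
    """Stub planner: a real agent would have the LLM pick the tool."""
    if "policy" in question or "limit" in question:
        return ("lookup_policy", "coffee")
    return ("submit_claim", 45.0)

def run_agent(question: str) -> str:
    tool_name, arg = plan(question)      # 1. plan
    result = TOOLS[tool_name](arg)       # 2. execute the chosen tool
    return f"[{tool_name}] {result}"     # 3. respond with the result

print(run_agent("What's the limit on coffee claims?"))
print(run_agent("Please file my $45 coffee claim."))
```

Restricting the agent to an explicit tool registry, rather than letting it call anything, is what makes the "control what tools an agent can access" point above enforceable.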
Andrew (10:52.738)
I was looking at my newsfeed this morning and there’s an announcement today that Cohere North is going to be integrated really well with SAP.
Tighter integrations into enterprise tools will keep getting stronger.
Cohere is also trying to build extremely capable models that are extremely efficient.
Their flagship model Command A can run on two H100 GPUs, which is a game changer for footprint and efficiency.
Mahesh (12:14.838)
This is hugely important. If you want inferencing done in-house, models need to be efficient.
You can’t invest millions in infrastructure just to run AI.
You also become dependent on cloud infrastructure, like Azure or AWS.
Now, multilingual capabilities. Can we double click on that?
Andrew (13:21.672)
Command A has 23 languages, and Embed 4 has over 100 languages.
A use case: ask a question in Spanish, search documents written in French, then prompt the model to respond in English.
So you’re having a trilingual conversation.
The key part is it’s not like we’re translating everything with an interim translation service. It’s embedded at the core.
In a global corporation, one region can interact with documents written in another language.
In defence, synthesising intel across languages becomes a native capability.
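The reason no interim translation service is needed is that a multilingual embedding model maps texts with the same meaning to nearby vectors regardless of language. The sketch below uses hand-made 2-d vectors as stand-ins for a real model's output; the documents, query, and `embed_query` mapping are all invented for illustration.

```python
import math

# French documents indexed by hand-made stand-in vectors. A multilingual
# embedding model would produce these from the raw French text.
EMBEDDINGS = {
    "politique de dépenses (FR: expense policy)": [0.9, 0.1],
    "politique de congés (FR: leave policy)":     [0.1, 0.9],
}

def embed_query(text: str):
    """Stand-in: a multilingual model would map the Spanish question
    '¿cómo reclamo un gasto?' close to the expense-policy vector."""
    return [0.85, 0.15]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

query_vec = embed_query("¿cómo reclamo un gasto?")
best = max(EMBEDDINGS, key=lambda d: cosine(query_vec, EMBEDDINGS[d]))
print(best)
```

Once everything lives in one vector space, the search step is language-blind; only the final generation step needs to be told which language to answer in.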
Mahesh (15:05.708)
Australia is multicultural. Even in government services like myGov, imagine asking a question in Vietnamese, searching English policies, and getting a response back in Vietnamese.
That’s powerful for accessibility.
Another favourite part is deployment flexibility: cloud, on-prem, private cloud.
This brings up sovereignty. How do you define sovereignty?
Mahesh (16:29.738)
To me there are three things:
- Data sovereignty: where data resides
- Infrastructure sovereignty: where inferencing is run
- Talent sovereignty
I don’t think we should be building large language models from scratch in Australia. It’s too expensive, and the cost is ongoing.
Better approach is to take an existing model and fine-tune it for national data and language. That’s what we’ve done with Takane.
Andrew (17:30.000 approx.)
That makes sense in Japan’s context. In Australia, Western models might be good enough, but it’s interesting to think about fine-tuning for Māori or Indigenous languages.
Mahesh (17:49.464)
Rather than training from scratch, fine-tune to support languages like Māori and Indigenous languages.
There isn’t a lot of content out there, but fine-tuning should be good enough for a lot of that.
Mahesh (18:25.000 approx.)
Before we wrap up, we touched on defence and public sector. Any other use cases?
Andrew (18:40.000 approx.)
Other regulated industries:
● Healthcare and clinical research
● Financial services
● Legal contracts
● Tender reviews
● Sensitive research
Secure, sovereign AI models can have a huge benefit there.
Mahesh (19:03.706)
Sovereignty and where inferencing happens isn’t a bolt-on. It has to come at an architecture level.
Rather than taking data to the AI, we are taking the AI to the data. The data stays where it is.
Multilinguality, multimodality, and deployment flexibility unlock industries and markets that previously couldn’t access this.
Andrew (20:33.742)
When you’re running models locally, the initial cost can be high due to infrastructure.
Cloud is pay-per-use, but when security, compliance and trust are paramount, there’s no way around it.
Cost becomes less important than those principles.
Mahesh (20:55.000 approx.)
Alright, thanks Andrew. Before we wrap up, I’m going to put you on the spot.
Three words to summarise everything we covered.
Andrew (21:05.000 approx.)
Sovereign. Multimodal. Multilingual.
Mahesh (21:15.000 approx.)
Thanks, Andrew. Thanks for tuning in everyone. Until next time, goodbye.