GenAI in the Enterprise with Matthias Pupillo, CEO at FluentC

Matthias Pupillo joins Zach on GenAI in the Enterprise this week. Matthias got into technology the curious way, starting when he was 8, reverse engineering how everything worked. He started his own web development and design firm right out of high school and has continued in that entrepreneurial, growth-oriented vein throughout his career.

GenAI has completely disrupted the language technology sector and is, therefore, integral to what Matthias and his team do now. Zach dives deeper into what that looks like, talking through the challenges Matthias has faced and how he sees Generative AI continuing to benefit the business.

Key Takeaways:

  1. Localization with GenAI: Using Generative AI for instant translations, enhancing global communication and accessibility.
  2. Multilanguage AI Challenges: AI models need diverse language data to overcome biases and improve accuracy.
  3. Edge Device AI: Running Generative AI on edge devices is crucial for real-time applications in healthcare and finance.
  4. GenAI in Customer Service: Using Generative AI to expand customer support in multiple languages, making services more accessible.

About The Generative AI In The Enterprise Series:

Welcome to Keyhole Software’s first-ever Podcast Series, Generative AI in the Enterprise. Chief Architect, Zach Gardner, talks with industry leaders, founders, tech evangelists, and GenAI specialists to find out how they utilize Generative AI in their businesses.

And we’re not talking about the surface-level stuff! We dive into how these bleeding-edge revolutionists use GenAI to increase revenue and decrease operational costs. You’ll learn how they have woven GenAI into the very fabric of their business to push themselves to new limits, beating out competition and exceeding expectations.

Partial Generative AI In The Enterprise Episode Transcript

Note: this transcript section was created using generative AI tools like YouTube automated transcripts and ChatGPT. There may be typos, slight content changes, or character limits for brevity!

Zach Gardner: Ladies and gentlemen, welcome to the Future. My name is Zach Gardner, the Chief Architect at Keyhole Software. Around October or November of last year, I set off on a little bit of a quest. I had talked with people about generative AI for a number of months. I had been learning a lot of things, but big apologies—I had never gone and recorded the things that I was talking about. I was talking with VPs, CEOs, and people from all walks of life about how they were using generative AI. Unfortunately, it didn’t occur to me until my tummy was full of turkey—you can tell that I have a nine and a six-year-old, I refer to it as a tummy—that I should really be putting these out for general consumption. This is a huge topic, this is something that is going to affect nearly every single industry that I can think of, and it’s time, ladies and gentlemen, that we get some of this stuff on the record.

Today, a man that I met at a conference at UMKC. Correct? It was Saturday?

Matthias Pupillo: Yep.

Zach Gardner: Was it before or after Thanksgiving? It’s all a blur.

Matthias Pupillo: Right after, right after. The first part of December.

Zach Gardner: Okay, very, very cool. So Matthias, you’re the CEO of FluentC, as they tell me. Welcome to the program.

Matthias Pupillo: It’s great to be here, Zach.

Zach Gardner: Awesome. And for our more litigious friends that like to pick bones with recordings, the views and opinions expressed in this program are the views and opinions of the participants. They do not reflect their employers, their trade organizations, any loyalty cards they have to grocery stores, or any book clubs they are members of. It’s just two people; we just happened to hit the record button when we were talking, that’s all. So with that out of the way, can you give the audience a little bit of a background on how you got into technology? What was your first step? Do you remember your very first computer or your very first program? That’s a popular one.

Matthias Pupillo: I do remember my first computer. It had an orange screen, it had one color. It was 1987; I was six years old. I started writing software probably by the time I was eight in one color. I still have my Windows 3.1 machine. It still runs with a blazing 61 megabytes of RAM and a 100 megabyte hard drive. I still have that. So I got into technology the curious way, and it really started by taking apart existing applications and really reverse engineering how everything worked. I started my own web development application design firm out of high school and ran my own firm for about 13 or 14 years. So I really got into it out of curiosity and the love of what technology can do. I’ve been here since the 90s, which, as my daughter has informed me, is last century. So it’s been a pretty fun ride. Since then, I’ve been doing large scale projects; I’ve been in the healthcare industry, IoT in oil and gas, mining, tons of different industries, and tons of e-commerce. I’m an original contributor to WordPress. So I’ve really, really been around technology for the last 25, 26 years in a professional capacity.

Right now, one of the consistent problems that kept happening is localization and globalization. Everyone builds an app for the language they’re in; everyone only thinks about the users in front of them. So that’s what really drove me to start FluentC. It had to be easier to find some way to let developers and product owners say, “Click, and now we can serve the rest of the world.” We can add billions of potential users, and all we did was hit click. When dealing with healthcare, it’s at the heart of patient outcomes. Say you go into the hospital and you don’t speak English; you’re Spanish-speaking. It says on the wall that you can have an interpreter. They bring a loving, caring person to the room to speak Spanish with you, work with the doctors, and get you great care. Then they hand you your output notes in English. They give you an app in English, and you have to continue the rest of your healthcare journey not in Spanish. It hurt me at the core that we were missing all of those patients, and all of the times you’re a tourist in a foreign country looking for healthcare. I have had experience with that, and you’re just lost because it’s not native to you. That was something I really had to take on, and AI is helping us lead that.

Zach Gardner: Very, very cool. I think accessibility, being able to widen your potential base of people that you serve, especially for hospitals, is such a no-brainer. The only barrier to entry is that you just don’t support Portuguese, you don’t support Tagalog. That’s a really dumb reason to keep someone from using your services, and it would pay for itself relatively quickly. Kind of putting my business hat on. So, no, that’s really cool. Talk to me a little bit about how you’re using generative AI in your personal life as well as your professional life, because I imagine that natural language processing, being able to take some input string and just say, “Hey, give this to me in Spanish,” that’s a pretty compelling use case.

Matthias Pupillo: Yeah, I use generative AI all the time. It really is useful in that quick translation processing. So, back to the question. Generative AI is important in all aspects of pretty much everything everyone does now. It has completely disrupted the language technology sector. It has completely changed the way we communicate because now we can at least get something that’s reasonably fast in our interpersonal communications. If someone sends you a Slack or text message, it’s relatively fast to use GenAI to get a translation of that. Those types of tools are really good when we’re trying to communicate with a global audience. Our customers are global; they come from all different languages. When we get support contacts, it’s relatively easy to use GenAI to have that basic communication.

It gets a lot harder when you’re talking about a place where you can’t type the text, like in your app. If you have a logo, if you have a back button and you want to translate the word “back,” are we talking about a spine or a direction? Back to your tummy story about the turkey: is it a country or is it food? GenAI helps us use some of that context, and we’ve built tools on top of it that really help make that distinction. You were talking about your stomach area where you put turkey, not the country. Those are actually two of the most commonly mistranslated things on the entire internet: “back” and “turkey.” It really is often translated as the food. It’s kind of funny and interesting.
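To make the “back” and “turkey” ambiguity concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the glossary, the context tags, and the `translate` function are all hypothetical, and a real system would likely infer the context with a model rather than require the caller to supply it.

```python
# Hypothetical glossary mapping (source word, context tag) to a Spanish translation.
# The context tags are invented for this sketch.
GLOSSARY_ES = {
    ("back", "navigation"): "atrás",    # the back button in an app
    ("back", "anatomy"): "espalda",     # the part of the body
    ("turkey", "food"): "pavo",         # the bird on the table
    ("turkey", "country"): "Turquía",   # the country
}


def translate(word: str, context: str, glossary: dict = GLOSSARY_ES) -> str:
    """Look up a word using both its text and its context tag.

    Raises KeyError when the (word, context) pair is unknown, rather than
    guessing a sense and silently producing the wrong translation.
    """
    key = (word.lower(), context)
    if key not in glossary:
        raise KeyError(f"no translation for {word!r} in context {context!r}")
    return glossary[key]


if __name__ == "__main__":
    print(translate("Back", "navigation"))
    print(translate("back", "anatomy"))
    print(translate("Turkey", "country"))
```

The point of keying on the pair instead of the word alone is that a bare string such as “back” carries no information about which sense is meant; the context has to come from somewhere, whether from the developer or from a model reading the surrounding screen.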

Zach Gardner: The week that we’re recording this is the week of Friday, March 22nd. Two interesting pieces of news have come out that relate to this (every single week there’s something interesting that comes out related to generative AI), and I wrote them down as you were talking. One of them was an article I read that actually posted on LinkedIn here in 20, no, 38, 39 minutes. It’s about someone running Mistral on the Groq LPU. Your computer with the single-color orange screen was running on a CPU, a central processing unit. GPUs are what these large language models have traditionally been run on; they were originally designed for graphics, which requires very fast, highly parallel, very math-intensive computations. People were using them first to mine Bitcoin, which has now jumped back up in price.

The other thing they found GPUs are really good at is running these large language models, but you still need massive GPUs, and they’re getting snatched up left and right. A lot of people thought that the tickets to the Taylor Swift concert were the hottest thing in town, but I think the shipment of the GPUs probably went quicker. What Groq came out with is a language processing unit, which is specifically designed for running these large language models. It’s not available directly to consumers like you and me yet. We have to go through their cloud, and you can get sub-second responses because the hardware has been finely tuned for this use case. That then leads me to the other thing that I saw, about Gemini. Google’s AI, Gemini, was in the news a couple of weeks ago for a variety of reasons.

Apple’s actually been looking at how far Google is outpacing them on having AI running on the edge device. Anytime you use Siri on this wonderful device, it doesn’t run on the device; it has to send the request out and wait for the response back. When it comes to running generative AI on your phone, Apple is way behind the times. So they’re looking at Google to be able to get Gemini running on their phones, which is real interesting. Apple is a direct competitor of Google when it comes to phones, but they’re like, “Hey, give me some of your technology.” In the work that you’ve done at FluentC and in the use cases that you’ve seen, how critical is it to be able to have offline support? Are edge devices even really at the point where we could run these large language models, or is internet connectivity going to have to be an MVP requirement?

Matthias Pupillo: That’s where we’re running into that fork. We had this in IoT, in the industrial IoT age, where we had to run it at the plant because we just didn’t have the bandwidth. We had no choice; we had to run it at the plant. We will run into this problem again in healthcare and financial data, things where you just can’t train those models on public data. All of those things have to happen on-prem. For models to be trained on financial data or healthcare data, it just doesn’t work because you can’t use that public data, so you have to train it yourself. FluentC is a cloud-based service; we run on the internet. We do not have these LPUs yet; it would be great if we did. But right now, it is really only for scoring, right? We need that edge in scoring to be fast.

All of this right now is for that quick response, but it just wouldn’t be possible with the current internet bandwidth and the local storage. Most of these data sets, like Llama, the latest open-source one, are in the six-terabyte range. There are clusters of GPUs; there are clusters of computing devices that just can’t run that. So you have to have these GPUs, which, again, back to the mining of bitcoins, it’s interesting how the computational power works. When a bunch of computational power gets there, the price of Bitcoin rises. It’s just kind of a little bit about the technology side of this. At FluentC, we don’t use edge devices today; we are cloud-based. But being able to have that in the future and see the direction that it could go, this is one of the areas that is very interesting, being able to have that storage. The internet is becoming more and more saturated. In the U.S., you have Google Fiber and AT&T Fiber and things that are gigabits, and that’s not even fast enough for some of these things.
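The bandwidth point can be checked with back-of-the-envelope arithmetic. The sketch below takes the six-terabyte figure quoted in the conversation at face value and assumes an ideal, fully sustained link with decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second); the `transfer_hours` helper is hypothetical, and real-world throughput would be lower.

```python
def transfer_hours(size_terabytes: float, link_gbps: float) -> float:
    """Estimate hours needed to move a payload over a link.

    Assumes decimal units and a perfectly sustained link with no protocol
    overhead, so the result is a lower bound rather than a prediction.
    """
    bits = size_terabytes * 1e12 * 8        # terabytes -> bytes -> bits
    seconds = bits / (link_gbps * 1e9)      # bits / (bits per second)
    return seconds / 3600                   # seconds -> hours


if __name__ == "__main__":
    # The six-terabyte figure from the conversation over a 1 Gbps fiber link.
    print(f"{transfer_hours(6, 1):.1f} hours")
```

Under these assumptions, six terabytes over a one-gigabit link works out to roughly 13 hours of uninterrupted transfer, which illustrates why gigabit fiber alone does not make that volume of data practical to ship to an edge device on demand.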

Zach Gardner: Interesting. So it really is a hot topic that I think more and more people are going to have to start thinking about: edge devices. This is a phone that came out in November; this thing is not powerful enough to do even half of what FluentC does. Maybe if you took all of the energy that is in my laptop, it’s still going to need that backend processing power. It’s kind of a classic cloud versus the client; you need that marriage between the two of them. Something that I think will be real interesting to keep track of is, as you’re seeing, Google and Apple, they’re direct competitors, but they still need to collaborate to get some of this stuff out to market.

I think the last thing that I would say, because I know that the 90s computer discussion had me thinking, just from a security point of view, and you can comment on this however much you want: if you’re a bad actor, one of the main things you’re trying to do is disrupt these AI models from being able to be secure. You talked a little bit about privacy; this is disrupting all sorts of different industries in the next 3-5 years. How do you keep it so that people can trust their data and trust the services that are being put out there? If I were to want to hire FluentC, I need to know that the things that you say are going to remain secure. How do you do that?

Matthias Pupillo: Yeah, so we definitely see that. We’re trying to work with a group out of New York, and one of the areas that is interesting is that human element of it. There is room in the multilanguage technology space to put some of that human element in there. The problem that you have with most of the large language models is the fact that there’s no non-English content in there. There are more languages, and there’s that social bias. It’s an 80-20 rule; 80 percent of the world speaks 20 languages, but that other 20 percent, those are the people that we’re hurting. In the large language model world, I think the GPT models use structured data. The newer language models, GPT-3.5 and GPT-4, are using more structured data, but they’re missing some of that social data. They’re missing social media posts, they’re missing Twitter, they’re missing the way that we interact on a daily basis.

That leads to that problem. How do you feed these models correctly? In the medical industry, it’s very important. You can’t just train a medical model; the U.S. government, for example, has a policy that anything that goes into the government cannot be trained on patient data. OpenAI has this approach of “ingest everything that you can possibly find and build a model.” That doesn’t work for healthcare, and it doesn’t work for financial. Building systems that are secure enough and complex enough to manage that is very hard. That’s where we have to see these models; we need to have these initiatives.

There’s an initiative in Kansas City that is trying to build that healthcare-specific language model. They’ve just barely started, and they’re at the beginning of it. FluentC is seeing the same thing. How do we build security that keeps these models from ingesting everything but still gives you a great outcome? That’s where it gets really, really hard. Like your daughter’s typing, that is the next step, that first step into it. This is kind of like having those first typewriters that had the LCD on it. It was the one line, you could type and erase it. That’s where we’re at. We just got to that point where we can type something and erase it. Now we need to be able to get to the point where we can have that trust. We can have that ability to build those large language models, get that security and privacy that everyone is after. How do you build privacy into AI? The security component is a little bit easier because you can build systems that say “deny, accept, allow.”

But that privacy and trust, that’s a real difficult problem. So to me, it is where we’re at. It is definitely where we’re at, and FluentC is looking at it, OpenAI is looking at it, all the big players are looking at it. We have to keep working. We’re only just now getting the public to understand what we’re doing. We have to keep working and building these systems.

Zach Gardner: Yeah, yeah, absolutely. I think that privacy question and security question is going to be a huge thing going forward. I know I’ve taken up a ton of your time, but if people wanted to learn more about you and your company, where should they go?

Matthias Pupillo: Yeah, so the main website is fluentc.io. We’re trying to get more active on social media, TikTok, LinkedIn. We’re on TikTok; we had one of our social media people tell us, “You have to do TikTok.” So we’re on there now. We’re very big into conferences, so if you see us out there, come up and shake my hand. I’m still one of those people that likes to go to conferences, even though I’m a technology person. I like to get to know people and shake their hand. So yeah, fluentc.io is the main website, and you’ll see our journey and where we’re going to help expand the multilanguage world for our customers.

Zach Gardner: Yeah, I dig it. I always look forward to talking with you. It’s a privilege to talk with you today. Thank you for coming on the program. Ladies and gentlemen, we’ll catch you in the future.