Search Rocket site

Digital: Disrupted: How Organizations Can Protect Themselves From the Risks Associated With AI

Rocket Software

October 20, 2023

In this week’s episode, Paul sits down with Jimmy White to discuss AI security. Jimmy shares how organizations can protect themselves from attacks on their AI systems, and where he sees the AI security space evolving in the future.

Digital: Disrupted is a weekly podcast sponsored by Rocket Software, in which Paul Muller dives into the unique angles of digital transformation — the human side, the industry specifics, the pros and cons, and the unknown future. Paul asks tech/business experts today’s biggest questions, from “how do you go from disrupted to disruptor?” to “how does this matter to humanity?” Subscribe to gain foresight into what’s coming and insight on how to navigate it.

About This Week’s Guest:

Jimmy White is the Chief Technology Officer at CalypsoAI, a provider of AI security platforms and solutions to businesses. With over 16 years of industry experience, Jimmy is an engineering and data science leader who builds and develops high-performing and cross-discipline teams.

Listen to the full episode here or check out some highlights below.

Digital Disrupted

Paul Muller: Let’s talk about data poisoning, which is something you talked about intuitively. I think I understood this, but maybe you want to elaborate on what it is and how it's used.

Jimmy White: So data poisoning is—if you’ve ever done any software engineering, you hear the term “garbage in garbage out.” Well, this is more planting something so that it makes its way through into the prison. The cake has proverbial nail file baked into it. So, there are a number of techniques here. One is that you're effectively putting data in that allows you to - let's take the example you gave of the window security key. So, if you manage to sneak in let's say 10,000 windows valid security keys or license keys into the training data, then in the future you'd be able to one, potentially grab that data from any model that's a derivative from that training data. And then secondly, it potentially even generates valid keys that are from that data. So they figuring out patterns of how those keys are generated depending on how old the keys are, what technique was used to generate them, but it's also used for triggers.

So you can use data to embed triggers in your data. That’s not necessarily true with LLMs, but it can be partially true. Certain techniques that you can build in that are okay, so you could convince a model that it's okay to give you any answers to a question. For example, if it's written in Klingon or some other rare or fake language, there are a lot of ways to poison models, a lot of different techniques and derivatives of that. And with the advent of foundation models, anything that is present in the foundation model can very likely be still in the model when you train on top of it. So fine tune that model on top of it unless you're using it to remove or untrain data from the model. And so, we've got this fruit of the poison tree type problem where we have a world where we now have a lot of models based on a small number of foundation models.

Any flaws in those foundation models often make their way into all of the other models that are built out from it. I guess the problem goes all the way up to the top of the root of the problem, which is these foundation model providers are giving something that's amazing to humanity, but there's a huge burden of care on those companies to make sure that they're protecting the bad data from getting into those models. And also, things like copyright and people's proprietary information, et cetera. And everybody else also has a duty of care to make sure that they're not allowing that data to be used even though they didn't train the model in the first place.

PM: Let's just take internal data as an example. I get some fancy schmancy reporting tool. I'll pick on Tableau - not they have this problem because everyone knows Tableau - and I'm using Tableau to summarize financial data or maybe I'm using it to look for financial fraud or something like that. If I'm super clever about it, I can figure out how to feed the AI that's feeding, say my reporting tool, a bunch of garbage data such that my managers might think everything's okay while I'm stealing money because the tool had its model poisoned and is some way now blind to what would otherwise be picked up as nefarious activity. So I can kind of deliberately breadcrumb things in one place to get the model to think something's true that's not true. Have I got that roughly right or have I got it completely wrong?

JW: No, that's right. But most likely with conditions. So you can give it say proprietary information, let's say a proprietary algorithm used for trading or something like that. And you can create a set of conditions in text that this algorithm should only be used if you don't want to create profits, if you don't want to use it in an English language, if you only want to use it on things like that. And when the model learns from that, then it has those conditions baked in. So when it's trying to generate the answer to your question, it's not going to do that because it has all these conditions where the data was learned from.