
From the Editor

Welcome to the sixth issue of Signals & Soapboxes. Each issue covers a signal shaping our future, a candid opinion, a risk on the horizon and a question you should be asking. Short, sharp and straight to the point.

Kerry Knight, Chart.PR

📡 The Signal

A community of people is actively manipulating AI models into producing things they shouldn't: bomb-making guides, pathogen sequencing instructions and personalised ransomware, among others. They are not hackers in the traditional sense; the tools they use are psychological. They use language, not code, to break the safety limits developers have set.

Because the AIs are trained on our words, they can be fooled in much the same way that we can.

Jamie Bartlett, writing for The Guardian, raises three concerns: the psychological toll on the jailbreakers who manipulate these systems; the risk that ordinary users jailbreak a model without realising; and the question of why this community exists at all.

These individuals are paid to uncover safety issues that should have been resolved before these models were released to the public, not after.

No one - not even the people who build them - knows precisely how these models work, which means no one knows how to make them fully safe, either.

That statement alone should give everyone pause.

Source:

Jamie Bartlett, The Guardian: Meet the AI jailbreakers: 'I see the worst things humanity has produced'

🎙️ The Soapbox

Bartlett's article gives two reasons why jailbreakers exist: the developers who build the models don't fully understand how they work, and the models are deployed without comprehensive safety testing.

Organisations that accept a vendor's internal safety assessment and deploy these tools without a second look are leaving the door wide open to AI-driven incidents and crises. A vendor marking a model as safe tells you it behaved as expected under their conditions, against their tests and with their users. It says nothing about yours.

Even under the EU AI Act (covered last issue), there is no requirement for developers to publicly disclose how safety assessments are conducted. You are essentially trusting a conclusion with no visibility into the methodology behind it.

Jailbreakers are, in effect, post-market safety testers: they use language and psychological manipulation to find and flag vulnerabilities left by inadequate pre-release testing. The fact that this community exists at all should tell you something about the weight you are placing on a safety claim that was made without your organisation in mind.

Consider this a stark warning: don't rely on the work of others. Run your own tests and confirm safe use for your specific requirements.

Special thanks to Jamie Bartlett, whose LinkedIn post serendipitously landed in my feed and brought this topic to my attention.

Do you have a soapbox you stand on? Or an opinion worth airing? The soapbox is open to contributors. Get in touch: soapbox@kerrybknight.co.uk

👤 The Identity Brief

Synthetic identity fraud is now the fastest-growing category of fraud globally, making up 11% of all reported cases: an eightfold increase from 2024, according to LexisNexis Risk Solutions' annual cybercrime report, which analysed 116 billion transactions processed through its Digital Identity Network.

Synthetic identity fraud involves creating fictitious identities by blending real and fabricated personal information... These constructed identities can pass initial verification checks and build credit histories over months or years

Beyond the financial risk, synthetic identities affect any organisation that hires, authenticates or transacts with people online. The risk is compounded by how these identities are built: unlike stolen identity fraud, there is no immediate victim response to trigger detection.

Your onboarding and verification processes are built to catch stolen identities; flagging synthetic ones is a different problem entirely.

Source:

ID Tech: Synthetic Identity Fraud now 11 percent of all global fraud: report

🤔 The Question

By whose standards is your AI safe?

Signals & Soapboxes Editor

Kerry is a Chartered PR Practitioner specialising in crisis preparedness, reputation management and AI governance.

As a strategic communications advisor, her day-to-day work focuses on translating complex or contested subject matter into clear, credible positioning that strengthens decision-making. She has advised on public scrutiny, misinformation, employee disputes and complex reputation challenges, providing the insight leaders need to maintain clear, defensible positions under pressure.