AI Models and Deception

Hosted on MSN · 17d

AI can strategically lie: From innocent errors to lying, manipulation, and deception

no one can reliably train large language models not to deceive.” Dr. Park points out the differing attitudes among engineers toward AI deception. Some actively implement strict safety measures ...

Hosted on MSN · 25d

OpenAI study says punishing AI models for lying doesn't help — It only sharpens their deceptive and obscure workarounds

AI models are reportedly great at covering their tracks, making it easy for the monitor to overlook their obscured deception. OpenAI's GPT-4o model was used to oversee an unreleased frontier ...

10h

AI will make the mind games of war much more risky

Successful military operations must now fool both human commanders and the AI that advises them. This creates an opening, two ...

11d

AI models get stuck 'overthinking.' Nvidia, Google, and Foundry could have a fix.

AI models are overthinking, leading to decreased accuracy. So Nvidia, Google, and Foundry researchers created something new.

Microsoft · 3d

Cyber Signals Issue 9 | AI-powered deception: Emerging fraud threats and countermeasures

Read the latest edition of Cyber Signals to learn how Microsoft is protecting its platforms and customers from AI-enhanced ...

2d on MSN

OpenAI partner says it had relatively little time to test the company’s o3 AI model

Metr, a frequent OpenAI partner, suggested in a blog post that it wasn't given much time to evaluate the company's powerful ...

2d on MSN

OpenAI updated its safety framework—but no longer sees mass manipulation and disinformation as a critical risk

OpenAI’s updated AI safety framework drops key pre-release testing requirements—including for persuasive or manipulative ...

CSO Online · 16d

AI disinformation didn’t upend 2024 elections, but the threat is very real

The next phase of AI disinformation won’t just target voters but target organizations, supply chains, and critical ...

Manufacturing.net · 2d

Research Aims to Boost AI Usage in Cybersecurity

A number of cutting edge industries are integrating AI models, as these systems exhibit high accuracy in analyzing data in ...

29d

Cloudflare turns AI against itself with endless maze of irrelevant facts

Cloudflare describes this as just "the first iteration" of using AI defensively against bots. Future plans include making the ...
