The China Mail - AI systems are already deceiving us -- and that's a problem, experts warn

AI systems are already deceiving us -- and that's a problem, experts warn
Photo: © AFP/File

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in the journal Patterns on Friday.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

Near-term, the paper's authors see risks for AI to commit fraud or tamper with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether an interaction involves a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by checking a system's internal "thought processes" against its external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

P.Ho--ThChM