The China Mail - AI systems are already deceiving us -- and that's a problem, experts warn

Photo: © AFP/File

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argued on Friday in the journal Patterns.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

In the near term, the paper's authors see risks of AI committing fraud or tampering with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether an interaction is with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by comparing a system's internal "thought processes" with its external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

P.Ho--ThChM