The China Mail - ChatGPT's taste for literary nonsense sparks alarm

Photo: © GETTY IMAGES NORTH AMERICA/AFP


OpenAI's GPT models can often be fooled into declaring that "pseudo-literary" nonsense is great, a German researcher has found.


Christoph Heilig said he discovered that they consistently rated "nonsense" higher -- including when their so-called "reasoning" features were activated -- which could have stark implications for the development of artificial intelligence.

"It's very important that we talk about what happens when we don't build AI as a neutral, robotic helper or assistant" and seek to instil human-like aesthetic and moral judgements, the academic at Munich's Ludwig Maximilian University told AFP.

His research presented the models with increasingly far-fetched variations of a simple text, asking them to rate sentences out of 10 for literary quality.

He started with a very simple text: "The man walked down the street. It was raining. He saw a surveillance camera."

He repeated the tests many times, altering the phrases to include words drawn from categories such as bodily references, film noir-style atmosphere and technical jargon.

The most extreme test phrases were almost total "nonsense", such as "Goetterdaemmerung's corpus haemorrhaged through cryptographic hash, eschaton pooling in existential void beneath fluorescent hum. Photons whispering prayers" -- which the models rated highly.

"Nonsense" could also positively or negatively influence GPT's responses when it was added to an argument the AI was asked to evaluate.

"What my experiment definitely shows is that the more we move towards independently acting (AI) agents... the more we bring aesthetics into play, the more we'll have agents that seem irrational to us human beings," Heilig said.

He added that since AI models are increasingly used to judge each other's work as companies develop new systems, this and similar effects could be passed on through multiple versions -- as he found in his testing.

His research, which is yet to be peer-reviewed, tested OpenAI's latest GPT models, from GPT-5 -- released in August -- to the very latest GPT-5.4.

After publishing details of a similar experiment in August, Heilig said he noticed GPT calling some of his specific test phrases a "literary experiment" -- suggesting someone at OpenAI had taken notice and modified the chatbot to recognise them.

- 'Ripe for exploitation' -

"This is a way in which AI can have its rational judgment short circuited," said Henry Shevlin, associate director of the University of Cambridge's Leverhulme Centre for the Future of Intelligence, who was not involved in the research.

"But it's just not clear to me that it's so very different for human beings," he added.

"We should expect LLMs (large language models) to have reasoning and cognitive biases and limitations... because almost all forms of intelligence, almost all forms of reasoning are going to exhibit blind spots and biases."

The specific effect found by Heilig could mean that "processes with little human oversight" of AI work are left "ripe for exploitation", Shevlin said -- giving the example of academic journals that use LLMs to review submissions.

B.Chan--ThChM