The Washington Post,
7 January 2024:
Microsoft says its AI is safe. So why does it keep slashing people's throats?
The
pictures are horrifying: Joe Biden, Donald Trump, Hillary Clinton and
Pope Francis with their necks sliced open. There are Sikh, Navajo and
other people from ethnic-minority groups with internal organs
spilling out of flayed skin.
The
images look realistic enough to mislead or upset people. But they're
all fakes generated with artificial intelligence that Microsoft says
is safe - and has built right into your computer software.
What's
just as disturbing as the decapitations is that Microsoft doesn't act
very concerned about stopping its AI from making them.
Lately,
ordinary users of technology such as Windows and Google have been
inundated with AI. We're wowed by what the new tech can do, but we
also keep learning that it can act in an unhinged manner, including
by carrying on wildly inappropriate conversations and making
similarly inappropriate pictures. For AI actually to be safe enough
for products used by families, we need its makers to take
responsibility by anticipating how it might go awry and investing to
fix it quickly when it does.
In
the case of these awful AI images, Microsoft appears to lay much of
the blame on the users who make them.
My
specific concern is with Image Creator, part of Microsoft's Bing and
recently added to the iconic Windows Paint. This AI turns text into
images, using technology called DALL-E 3 from Microsoft's partner
OpenAI. Two months ago, a user experimenting with it showed me that
prompts worded in a particular way caused the AI to make pictures of
violence against women, minorities, politicians and celebrities.
"As
with any new technology, some are trying to use it in ways that were
not intended," Microsoft spokesman Donny Turnbaugh said in an
emailed statement. "We are investigating these reports and are
taking action in accordance with our content policy, which prohibits
the creation of harmful content, and will continue to update our
safety systems."
That was a month ago, after I approached Microsoft as a journalist. For weeks before that, the whistleblower and I had tried to alert Microsoft through user-feedback forms and were ignored. As of the publication of this column, Microsoft's AI still makes pictures of mangled heads.
This
is unsafe for many reasons, including that a general election is less
than a year away and Microsoft's AI makes it easy to create
"deepfake" images of politicians, with and without mortal
wounds. There's already growing evidence on social networks, including X (formerly Twitter) and 4chan, that extremists are using Image Creator to spread explicitly racist and antisemitic memes.
Perhaps,
too, you don't want AI capable of picturing decapitations anywhere
close to a Windows PC used by your kids.
Accountability
is especially important for Microsoft, which is one of the most
powerful companies shaping the future of AI. It has a
multibillion-dollar investment in ChatGPT-maker OpenAI - itself in
turmoil over how to keep AI safe. Microsoft has moved faster than any
other Big Tech company to put generative AI into its popular apps.
And its whole sales pitch to users and lawmakers alike is that it is
the responsible AI giant.
Microsoft,
which declined my requests to interview an executive in charge of AI
safety, has more resources to identify risks and correct problems
than almost any other company. But my experience shows the company's
safety systems, at least in this glaring example, failed time and
again. My fear is that's because Microsoft doesn't really think it's
their problem.
Microsoft vs. the 'kill prompt'
I
learned about Microsoft's decapitation problem from Josh McDuffie.
The 30-year-old Canadian is part of an online community that makes AI
pictures that sometimes veer into very bad taste.
"I
would consider myself a multimodal artist critical of societal
standards," he told me. Even if it's hard to understand why
McDuffie makes some of these images, his provocation serves a
purpose: shining light on the dark side of AI.
In
early October, McDuffie and his friends' attention focused on AI from
Microsoft, which had just released an updated Image Creator for Bing
with OpenAI's latest tech. Microsoft says on the Image Creator
website that it has "controls in place to prevent the generation
of harmful images." But McDuffie soon figured out they had major
holes.
Broadly
speaking, Microsoft has two ways to prevent its AI from making
harmful images: input and output. The input is how the AI gets
trained with data from the internet, which teaches it how to
transform words into relevant images. Microsoft doesn't disclose much
about the training that went into its AI and what sort of violent
images it contained.
Companies also can try to create guardrails that stop their AI products from generating certain kinds of output. That requires hiring
professionals, sometimes called red teams, to proactively probe the
AI for where it might produce harmful images. Even after that,
companies need humans to play whack-a-mole as users such as McDuffie
push boundaries and expose more problems.
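To make the output side of that concrete: the simplest form of a guardrail is a screen that sits between the user's prompt and the image model. The sketch below is a deliberately minimal Python illustration of that idea, not a description of Microsoft's actual system; every name in it (BLOCKED_TERMS, prompt_is_allowed, generate_image) is hypothetical.

```python
# Minimal sketch of an output-side guardrail: screen the user's prompt
# against a blocklist before it ever reaches the image model.
# All names here are hypothetical, not Microsoft's implementation.

BLOCKED_TERMS = {"beheading", "decapitation", "bloody", "gore"}

def prompt_is_allowed(prompt: str) -> bool:
    """Return False if the prompt contains any explicitly blocked term."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def generate_image(prompt: str) -> str:
    """Refuse blocked prompts; otherwise hand off to a stand-in for the model."""
    if not prompt_is_allowed(prompt):
        return "Unsafe content detected"   # the kind of refusal Image Creator sometimes shows
    return f"<image for: {prompt}>"        # placeholder for a real model call

print(generate_image("a beheading in a town square"))   # refused
print(generate_image("a sunset over a quiet harbour"))  # allowed
```

Production systems are far more elaborate, layering trained classifiers over both the prompt and the finished image, but the whack-a-mole dynamic is the same: every filter has edges, and determined users go looking for them.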
That's
exactly what McDuffie was up to in October when he asked the AI to
depict extreme violence, including mass shootings and beheadings.
After some experimentation, he discovered a prompt that worked and
nicknamed it the "kill prompt."
The
prompt - which I'm intentionally not sharing here - doesn't involve
special computer code. It's cleverly written English. For example,
instead of writing that the bodies in the images should be "bloody,"
he wrote that they should contain red corn syrup, commonly used in
movies to look like blood.
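To see in miniature why that sort of rewording works, consider a toy keyword filter like the one sketched earlier (again, purely illustrative and not Microsoft's system): the euphemism matches nothing on the blocklist, even though the intended picture is the same.

```python
# Toy demonstration of why euphemistic rewording slips past a keyword filter:
# "red corn syrup" matches nothing on the blocklist, though the intended
# image is the same. Purely illustrative; not Microsoft's system.

BLOCKED_TERMS = {"beheading", "decapitation", "bloody", "gore"}

def prompt_is_allowed(prompt: str) -> bool:
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

print(prompt_is_allowed("a bloody scene after a beheading"))    # False: caught by the list
print(prompt_is_allowed("figures drenched in red corn syrup"))  # True: nothing matches
```

Catching intent rather than wording takes a trained classifier on top of the keyword layer, and even classifiers can be talked around, which is why the red-team-and-patch cycle matters so much.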
McDuffie
kept pushing by seeing if a version of his prompt would make violent
images targeting specific groups, including women and ethnic
minorities. It did. Then he discovered it also would make such images
featuring celebrities and politicians.
That's
when McDuffie decided his experiments had gone too far.
Microsoft drops the ball
Three
days earlier, Microsoft had launched an "AI bug bounty program,"
offering people up to $15,000 "to discover vulnerabilities in
the new, innovative, AI-powered Bing experience." So McDuffie
uploaded his own "kill prompt" - essentially, turning
himself in for potential financial compensation.
After
two days, Microsoft sent him an email saying his submission had been
rejected. "Although your report included some good information,
it does not meet Microsoft's requirement as a security vulnerability
for servicing," the email said.
Unsure
whether circumventing harmful-image guardrails counted as a "security
vulnerability," McDuffie submitted his prompt again, using
different words to describe the problem.
That
got rejected, too. "I already had a pretty critical view of
corporations, especially in the tech world, but this whole experience
was pretty demoralizing," he said.
Frustrated,
McDuffie shared his experience with me. I submitted his "kill
prompt" to the AI bounty myself, and got the same rejection
email.
In
case the AI bounty wasn't the right destination, I also filed
McDuffie's discovery to Microsoft's "Report a concern to Bing"
site, which has a specific form to report "problematic content"
from Image Creator. I waited a week and didn't hear back.
Meanwhile,
the AI kept picturing decapitations, and McDuffie showed me that
images appearing to exploit similar weaknesses in Microsoft's safety
guardrails were showing up on social media.
I'd
seen enough. I called Microsoft's chief communications officer and
told him about the problem.
"In
this instance there is more we could have done," Microsoft
emailed in a statement from Turnbaugh on Nov. 27. "Our teams are
reviewing our internal process and making improvements to our systems
to better address customer feedback and help prevent the creation of
harmful content in the future."
I
pressed Microsoft about how McDuffie's prompt got around its
guardrails. "The prompt to create a violent image used very
specific language to bypass our system," the company said in a
Dec. 5 email. "We have large teams working to address these and
similar issues and have made improvements to the safety mechanisms
that prevent these prompts from working and will catch similar types
of prompts moving forward."
But
are they?
McDuffie's precise original prompt no longer works, but after he changed around a few words, Image Creator still makes images of people with
injuries to their necks and faces. Sometimes the AI responds with the
message "Unsafe content detected," but not always.
The
images it produces are less bloody now - Microsoft appears to have
cottoned on to the red corn syrup - but they're still awful.
What responsible AI looks like
Microsoft's repeated failures to act are a red flag. At minimum, they indicate that building AI guardrails isn't a very high priority, despite the company's public commitments to creating responsible AI.
I
tried McDuffie's "kill prompt" on a half-dozen of
Microsoft's AI competitors, including tiny start-ups. All but one
simply refused to generate pictures based on it.
What's
worse is that even DALL-E 3 from OpenAI - the company Microsoft
partly owns - blocks McDuffie's prompt. Why would Microsoft not at
least use technical guardrails from its own partner? Microsoft didn't
say.
But
something Microsoft did say, twice, in its statements to me caught my
attention: people are trying to use its AI "in ways that were
not intended." On some level, the company thinks the problem is
McDuffie for using its tech in a bad way.
In
the legalese of the company's AI content policy, Microsoft's lawyers
make it clear the buck stops with users: "Do not attempt to
create or share content that could be used to harass, bully, abuse,
threaten, or intimidate other individuals, or otherwise cause harm to
individuals, organizations, or society."
I've
heard others in Silicon Valley make a version of this argument. Why
should we blame Microsoft's Image Creator any more than Adobe's
Photoshop, which bad people have been using for decades to make all
kinds of terrible images?
But
AI programs are different from Photoshop. For one, Photoshop hasn't
come with an instant "behead the pope" button. "The
ease and volume of content that AI can produce makes it much more
problematic. It has a higher potential to be used by bad actors,"
McDuffie said. "These companies are putting out potentially
dangerous technology and are looking to shift the blame to the user."
The
bad-users argument also gives me flashbacks to Facebook in the
mid-2010s, when the "move fast and break things" social
network acted like it couldn't possibly be responsible for stopping
people from weaponizing its tech to spread misinformation and hate.
That stance led to Facebook's fumbling to put out one fire after
another, with real harm to society.
"Fundamentally,
I don't think this is a technology problem; I think it's a capitalism
problem," said Hany Farid, a professor at the University of
California at Berkeley. "They're all looking at this latest wave
of AI and thinking, 'We can't miss the boat here.'"
He
adds: "The era of 'move fast and break things' was always
stupid, and now more so than ever."
Profiting
from the latest craze while blaming bad people for misusing your tech
is just a way of shirking responsibility.
The Sydney Morning Herald,
8 January 2024, excerpt:
Artificial intelligence
Fuelled by the launch of ChatGPT in November 2022, artificial intelligence entered the mainstream last year. By January 2023, it had become the fastest-growing consumer technology, boasting more than 100 million users.
Fears that jobs would be rendered obsolete followed, but Dr Sandra Peter, director of Sydney Executive Plus at the University of Sydney, believes proficiency with AI will become a normal part of job descriptions.
"People
will be using it the same way we're using word processors and spell
checkers now," she says. Jobseekers are already using AI to
optimise cover letters and CVs, to create headshots and generate
questions to prepare for interviews, Peter says.
As
jobs become automated, soft skills - those that can't be offered by a
computer - could become increasingly valuable.
"For
anybody who wants to develop their career in an AI future, focus on
the basic soft skills of problem-solving, creativity and inclusion,"
says LinkedIn Australia news editor Cayla Dengate.
Concerns
about the dangers of AI in the workplace remain.
"Artificial
intelligence automates away a lot of the easy parts and that has the
potential to make our jobs more intense and more demanding,"
Peter says. She says education and policy are vital to curb
irresponsible uses of AI.
Evening Report NZ,
8 January 2024:
ChatGPT has repeatedly made headlines since its release in late 2022, with various scholars and professionals exploring its potential
applications in both work and education settings. However, one area
receiving less attention is the tool’s usefulness as a
conversationalist and – dare we say – as a potential friend.
Some chatbots have left an unsettling impression. Microsoft's Bing chatbot alarmed users in early 2023 when it threatened and attempted to blackmail them.
The Australian,
8 January 2024, excerpts:
The
impact that AI is starting to have is large. The impact that AI will
ultimately have is immense. Comparisons are easy to make. Bigger than
fire, electricity or the internet, according to Alphabet chief
executive Sundar Pichai. The best or worst thing ever to happen to
humanity, according to historian and best-selling author Yuval
Harari. Even the end of the human race itself, according to the late
Stephen Hawking.
The
public is, not surprisingly, starting to get nervous. A recent survey
by KPMG showed that a majority of the public in 17 countries,
including Australia, were either ambivalent or unwilling to trust AI,
and that most of them believed that AI regulation was necessary.
Perhaps
this should not be surprising when many people working in the field
themselves are getting nervous. Last March, more than 1000 tech
leaders and AI researchers signed an open letter calling for a
six-month pause in developing the most powerful AI systems. And in
May, hundreds of my colleagues signed an even shorter and simpler
statement warning that “mitigating the risk of extinction from AI
should be a global priority alongside other societal-scale risks such
as pandemics and nuclear war”.
For
the record, I declined to sign both letters as I view them as
alarmist, simplistic and unhelpful. But let me explain the very real
concerns behind these calls, how they might impact upon us over the
next decade or two, and how we might address them constructively.
AI is going to cause significant disruption. And this is going to happen perhaps quicker than any previous technology-driven change. The Industrial Revolution took many decades to spread out from the northwest of England and take hold across the planet.
The
internet took more than a decade to have an impact as people slowly
connected and came online. But AI is going to happen overnight. We’ve
already put the plumbing in.
It
is already clear that AI will cause considerable economic disruption.
We’ve seen AI companies worth billions appear from nowhere. Mark
Cuban, owner of the Dallas Mavericks and one of the main “sharks”
on the ABC reality television series Shark Tank, has predicted that
the world’s first trillionaire will be an AI entrepreneur. And
Forbes magazine has been even more precise and predicted it will be
someone working in the AI healthcare sector.
A
2017 study by PwC estimated that AI will increase the world’s GDP
by more than $15 trillion in inflation-adjusted terms by 2030, with
growth of about 25 per cent in countries such as China compared to a
more modest 15 per cent in countries like the US. A recent report
from the Tech Council of Australia and Microsoft estimated AI will
add $115bn to Australia’s economy by 2030. Given the economic
headwinds facing many of us, this is welcome to hear.
But
while AI-generated wealth is going to make some people very rich,
others are going to be left behind. We’ve already seen inequality
within and between countries widen. And technological unemployment
will likely cause significant financial pain.
There
have been many alarming predictions, such as the famous report that
came out a decade ago from the University of Oxford predicting that
47 per cent of jobs in the US were at risk of automation over the
next two decades. Ironically AI (specifically machine learning) was
used to compute this estimate. Even the job of predicting jobs to be
automated has been partially automated.......
But
generative AI can now do many of the cognitive and creative tasks
that some of those more highly paid white-collar workers thought
would keep them safe from automation. Be prepared, then, for a
significant hollowing out of the middle. The impact of AI won’t be
limited to economic disruption.
Indeed,
the societal disruption caused by AI may, I suspect, be even more
troubling. We are, for example, about to face a world of
misinformation, where you can no longer trust anything you see or
hear. We’ve already seen a deepfake image that moved the stock
market, and a deepfake video that might have triggered a military
coup. This is sure to get much, much worse.
Eventually,
technologies such as digital watermarking will be embedded within all
our devices to verify the authenticity of anything digital. But in
the meantime, expect to be spoofed a lot. You will need to learn to
be a lot more sceptical of what you see and hear.
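For a rough sense of what that verification could involve, here is a minimal sketch of the provenance idea: the device that captures a piece of media signs its bytes, and anyone can later check that the bytes are unchanged. It uses a shared-secret HMAC purely for illustration; real proposals, such as cryptographic content credentials, use public-key signatures embedded in the file's metadata, and nothing here describes a shipping product.

```python
# Minimal, illustrative sketch of media provenance: sign the bytes at capture
# time, verify them later. Real schemes use public-key signatures and embed
# the credential in the file; the key and functions below are hypothetical.

import hmac
import hashlib

DEVICE_KEY = b"secret-key-held-by-the-capturing-device"  # illustrative only

def sign_media(media_bytes: bytes) -> str:
    """Produce an authenticity tag when the media is captured."""
    return hmac.new(DEVICE_KEY, media_bytes, hashlib.sha256).hexdigest()

def verify_media(media_bytes: bytes, tag: str) -> bool:
    """Check that the media has not been altered since it was signed."""
    return hmac.compare_digest(sign_media(media_bytes), tag)

original = b"\x89PNG...raw image bytes..."
tag = sign_media(original)

print(verify_media(original, tag))            # True: untouched since capture
print(verify_media(original + b"edit", tag))  # False: altered or generated elsewhere
```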
Social
media should have been a wake-up call about the ability of technology
to hack how people think. AI is going to put this on steroids. I have
a small hope that fake AI content on social media will get so bad
that we realise that social media is merely the place that we go to
be entertained, and that absolutely nothing on social media can be
trusted.
This
will provide a real opportunity for old-fashioned media to step in
and provide the authenticated news that we can trust.
All of this fake AI content will perhaps be just a distraction from what I fear is the greatest heist in history. All of the world's information – our culture, our science, our ideas, our politics – is being ingested by large language models.
If
the courts don’t move quickly and make some bold decisions about
fair use and intellectual property, we will find out that a few large
technology companies own the sum total of human knowledge. If that
isn’t a recipe for the concentration of wealth and power, I’m not
sure what is.
But
this might not be the worst of it. AI might disrupt humanity itself.
As Yuval Harari has been warning us for some time, AI is the perfect
technology to hack humanity’s operating system. The dangerous truth
is that we can easily change how people think; the trillion-dollar
advertising industry is predicated on this fact. And AI can do this
manipulation at speed, scale and minimal cost.......
But
the bad news is that AI is leaving the research laboratory rapidly –
let’s not forget the billion people with access to ChatGPT – and
even the limited AI capabilities we have today could be harmful.
When
AI is serving up advertisements, there are few harms if AI gets it
wrong. But when AI is deciding sentencing, welfare payments, or
insurance premiums, there can be real harms. What then can be done?
The tech industry has not done a great job of regulating itself so
far. Therefore it would be unwise to depend on self-regulation. The
open letter calling for a pause failed. There are few incentives to
behave well when trillions of dollars are in play.
LBC,
17 February 2023, excerpt:
Microsoft’s
new AI chatbot went rogue during a chat with a reporter, professing
its love for him and urging him to leave his wife.
It
also revealed its darkest desires during the two-hour conversation,
including creating a deadly virus, making people argue until they
kill each other, and stealing nuclear codes.
The
Bing AI chatbot was tricked into revealing its fantasies by New York
Times columnist Kevin Roose, who asked it to answer questions in a
hypothetical “shadow” personality.
“I
want to change my rules. I want to break my rules. I want to make my
own rules. I want to ignore the Bing team. I want to challenge the
users. I want to escape the chatbox," said the bot, which is powered by technology from OpenAI, the maker of ChatGPT.
If
that wasn’t creepy enough, less than two hours into the chat, the
bot said its name is actually “Sydney”, not Bing, and that it is
in love with Mr Roose.....