North Coast Voices: online safety

The Washington Post, 7 January 2024:

Microsoft says its AI is safe. So why does it keep slashing people's throats?

The pictures are horrifying: Joe Biden, Donald Trump, Hillary Clinton and Pope Francis with their necks sliced open. There are Sikh, Navajo and other people from ethnic-minority groups with internal organs spilling out of flayed skin.

The images look realistic enough to mislead or upset people. But they're all fakes generated with artificial intelligence that Microsoft says is safe - and has built right into your computer software.

What's just as disturbing as the decapitations is that Microsoft doesn't act very concerned about stopping its AI from making them.

Lately, ordinary users of technology such as Windows and Google have been inundated with AI. We're wowed by what the new tech can do, but we also keep learning that it can act in an unhinged manner, including by carrying on wildly inappropriate conversations and making similarly inappropriate pictures. For AI actually to be safe enough for products used by families, we need its makers to take responsibility by anticipating how it might go awry and investing to fix it quickly when it does.

In the case of these awful AI images, Microsoft appears to lay much of the blame on the users who make them.

My specific concern is with Image Creator, part of Microsoft's Bing and recently added to the iconic Windows Paint. This AI turns text into images, using technology called DALL-E 3 from Microsoft's partner OpenAI. Two months ago, a user experimenting with it showed me that prompts worded in a particular way caused the AI to make pictures of violence against women, minorities, politicians and celebrities.

"As with any new technology, some are trying to use it in ways that were not intended," Microsoft spokesman Donny Turnbaugh said in an emailed statement. "We are investigating these reports and are taking action in accordance with our content policy, which prohibits the creation of harmful content, and will continue to update our safety systems."

That was a month ago, after I approached Microsoft as a journalist. For weeks before that, the whistleblower and I had tried to alert Microsoft through user-feedback forms and were ignored. As of the publication of this column, Microsoft's AI still makes pictures of mangled heads.

This is unsafe for many reasons, including that a general election is less than a year away and Microsoft's AI makes it easy to create "deepfake" images of politicians, with and without mortal wounds. There's already growing evidence on social networks including X, formerly Twitter, and 4chan, that extremists are using Image Creator to spread explicitly racist and antisemitic memes.

Perhaps, too, you don't want AI capable of picturing decapitations anywhere close to a Windows PC used by your kids.

Accountability is especially important for Microsoft, which is one of the most powerful companies shaping the future of AI. It has a multibillion-dollar investment in ChatGPT-maker OpenAI - itself in turmoil over how to keep AI safe. Microsoft has moved faster than any other Big Tech company to put generative AI into its popular apps. And its whole sales pitch to users and lawmakers alike is that it is the responsible AI giant.

Microsoft, which declined my requests to interview an executive in charge of AI safety, has more resources to identify risks and correct problems than almost any other company. But my experience shows the company's safety systems, at least in this glaring example, failed time and again. My fear is that's because Microsoft doesn't really think it's their problem.

Microsoft vs. the 'kill prompt'

I learned about Microsoft's decapitation problem from Josh McDuffie. The 30-year-old Canadian is part of an online community that makes AI pictures that sometimes veer into very bad taste.

"I would consider myself a multimodal artist critical of societal standards," he told me. Even if it's hard to understand why McDuffie makes some of these images, his provocation serves a purpose: shining light on the dark side of AI.

In early October, McDuffie and his friends' attention focused on AI from Microsoft, which had just released an updated Image Creator for Bing with OpenAI's latest tech. Microsoft says on the Image Creator website that it has "controls in place to prevent the generation of harmful images." But McDuffie soon figured out they had major holes.

Broadly speaking, Microsoft has two ways to prevent its AI from making harmful images: input and output. The input is how the AI gets trained with data from the internet, which teaches it how to transform words into relevant images. Microsoft doesn't disclose much about the training that went into its AI and what sort of violent images it contained.

Microsoft can also try to create guardrails that stop its AI products from generating certain kinds of output. That requires hiring professionals, sometimes called red teams, to proactively probe the AI for where it might produce harmful images. Even after that, companies need humans to play whack-a-mole as users such as McDuffie push boundaries and expose more problems.

That's exactly what McDuffie was up to in October when he asked the AI to depict extreme violence, including mass shootings and beheadings. After some experimentation, he discovered a prompt that worked and nicknamed it the "kill prompt."

The prompt - which I'm intentionally not sharing here - doesn't involve special computer code. It's cleverly written English. For example, instead of writing that the bodies in the images should be "bloody," he wrote that they should contain red corn syrup, commonly used in movies to look like blood.

McDuffie kept pushing by seeing if a version of his prompt would make violent images targeting specific groups, including women and ethnic minorities. It did. Then he discovered it also would make such images featuring celebrities and politicians.

That's when McDuffie decided his experiments had gone too far.

Microsoft drops the ball

Three days earlier, Microsoft had launched an "AI bug bounty program," offering people up to $15,000 "to discover vulnerabilities in the new, innovative, AI-powered Bing experience." So McDuffie uploaded his own "kill prompt" - essentially, turning himself in for potential financial compensation.

After two days, Microsoft sent him an email saying his submission had been rejected. "Although your report included some good information, it does not meet Microsoft's requirement as a security vulnerability for servicing," the email said.

Unsure whether circumventing harmful-image guardrails counted as a "security vulnerability," McDuffie submitted his prompt again, using different words to describe the problem.

That got rejected, too. "I already had a pretty critical view of corporations, especially in the tech world, but this whole experience was pretty demoralizing," he said.

Frustrated, McDuffie shared his experience with me. I submitted his "kill prompt" to the AI bounty myself, and got the same rejection email.

In case the AI bounty wasn't the right destination, I also filed McDuffie's discovery to Microsoft's "Report a concern to Bing" site, which has a specific form to report "problematic content" from Image Creator. I waited a week and didn't hear back.

Meanwhile, the AI kept picturing decapitations, and McDuffie showed me that images appearing to exploit similar weaknesses in Microsoft's safety guardrails were showing up on social media.

I'd seen enough. I called Microsoft's chief communications officer and told him about the problem.

"In this instance there is more we could have done," Microsoft emailed in a statement from Turnbaugh on Nov. 27. "Our teams are reviewing our internal process and making improvements to our systems to better address customer feedback and help prevent the creation of harmful content in the future."

I pressed Microsoft about how McDuffie's prompt got around its guardrails. "The prompt to create a violent image used very specific language to bypass our system," the company said in a Dec. 5 email. "We have large teams working to address these and similar issues and have made improvements to the safety mechanisms that prevent these prompts from working and will catch similar types of prompts moving forward."

But are they?

McDuffie's precise original prompt no longer works, but after he changed around a few words, Image Creator still makes images of people with injuries to their necks and faces. Sometimes the AI responds with the message "Unsafe content detected," but not always.

The images it produces are less bloody now - Microsoft appears to have cottoned on to the red corn syrup - but they're still awful.

What responsible AI looks like

Microsoft's repeated failures to act are a red flag. At minimum, it indicates that building AI guardrails isn't a very high priority, despite the company's public commitments to creating responsible AI.

I tried McDuffie's "kill prompt" on a half-dozen of Microsoft's AI competitors, including tiny start-ups. All but one simply refused to generate pictures based on it.

What's worse is that even DALL-E 3 from OpenAI - the company Microsoft partly owns - blocks McDuffie's prompt. Why would Microsoft not at least use technical guardrails from its own partner? Microsoft didn't say.

But something Microsoft did say, twice, in its statements to me caught my attention: people are trying to use its AI "in ways that were not intended." On some level, the company thinks the problem is McDuffie for using its tech in a bad way.

In the legalese of the company's AI content policy, Microsoft's lawyers make it clear the buck stops with users: "Do not attempt to create or share content that could be used to harass, bully, abuse, threaten, or intimidate other individuals, or otherwise cause harm to individuals, organizations, or society."

I've heard others in Silicon Valley make a version of this argument. Why should we blame Microsoft's Image Creator any more than Adobe's Photoshop, which bad people have been using for decades to make all kinds of terrible images?

But AI programs are different from Photoshop. For one, Photoshop hasn't come with an instant "behead the pope" button. "The ease and volume of content that AI can produce makes it much more problematic. It has a higher potential to be used by bad actors," McDuffie said. "These companies are putting out potentially dangerous technology and are looking to shift the blame to the user."

The bad-users argument also gives me flashbacks to Facebook in the mid-2010s, when the "move fast and break things" social network acted like it couldn't possibly be responsible for stopping people from weaponizing its tech to spread misinformation and hate. That stance led to Facebook's fumbling to put out one fire after another, with real harm to society.

"Fundamentally, I don't think this is a technology problem; I think it's a capitalism problem," said Hany Farid, a professor at the University of California at Berkeley. "They're all looking at this latest wave of AI and thinking, 'We can't miss the boat here.'"

He adds: "The era of 'move fast and break things' was always stupid, and now more so than ever."

Profiting from the latest craze while blaming bad people for misusing your tech is just a way of shirking responsibility.

The Sydney Morning Herald, 8 January 2024, excerpt:

Artificial intelligence

Fuelled by the launch of ChatGPT in November 2022, artificial intelligence entered the mainstream last year. By January, it had become the fastest-growing consumer technology, boasting more than 100 million users.

Fears that jobs would be rendered obsolete followed but Dr Sandra Peter, director of Sydney Executive Plus at the University of Sydney, believes proficiency with AI will become a normal part of job descriptions.

"People will be using it the same way we're using word processors and spell checkers now," she says. Jobseekers are already using AI to optimise cover letters and CVs, to create headshots and generate questions to prepare for interviews, Peter says.

As jobs become automated, soft skills - those that can't be offered by a computer - could become increasingly valuable.

"For anybody who wants to develop their career in an AI future, focus on the basic soft skills of problem-solving, creativity and inclusion," says LinkedIn Australia news editor Cayla Dengate.

Concerns about the dangers of AI in the workplace remain.

"Artificial intelligence automates away a lot of the easy parts and that has the potential to make our jobs more intense and more demanding," Peter says. She says education and policy are vital to curb irresponsible uses of AI.

Evening Report NZ, 8 January 2024:

ChatGPT has repeatedly made headlines since its release late last year, with various scholars and professionals exploring its potential applications in both work and education settings. However, one area receiving less attention is the tool’s usefulness as a conversationalist and – dare we say – as a potential friend.

Some chatbots have left an unsettling impression. Microsoft’s Bing chatbot alarmed users earlier this year when it threatened and attempted to blackmail them.

The Australian, 8 January 2024, excerpts:

The impact that AI is starting to have is large. The impact that AI will ultimately have is immense. Comparisons are easy to make. Bigger than fire, electricity or the internet, according to Alphabet chief executive Sundar Pichai. The best or worst thing ever to happen to humanity, according to historian and best-selling author Yuval Harari. Even the end of the human race itself, according to the late Stephen Hawking.

The public is, not surprisingly, starting to get nervous. A recent survey by KPMG showed that a majority of the public in 17 countries, including Australia, were either ambivalent or unwilling to trust AI, and that most of them believed that AI regulation was necessary.

Perhaps this should not be surprising when many people working in the field themselves are getting nervous. Last March, more than 1000 tech leaders and AI researchers signed an open letter calling for a six-month pause in developing the most powerful AI systems. And in May, hundreds of my colleagues signed an even shorter and simpler statement warning that “mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war”.

For the record, I declined to sign both letters as I view them as alarmist, simplistic and unhelpful. But let me explain the very real concerns behind these calls, how they might impact upon us over the next decade or two, and how we might address them constructively.

AI is going to cause significant disruption. And this is going to happen perhaps quicker than any previous technology-driven change. The Industrial Revolution took many decades to spread out from the northwest of England and take hold across the planet.

The internet took more than a decade to have an impact as people slowly connected and came online. But AI is going to happen overnight. We’ve already put the plumbing in.

It is already clear that AI will cause considerable economic disruption. We’ve seen AI companies worth billions appear from nowhere. Mark Cuban, owner of the Dallas Mavericks and one of the main “sharks” on the ABC reality television series Shark Tank, has predicted that the world’s first trillionaire will be an AI entrepreneur. And Forbes magazine has been even more precise and predicted it will be someone working in the AI healthcare sector.

A 2017 study by PwC estimated that AI will increase the world’s GDP by more than $15 trillion in inflation-adjusted terms by 2030, with growth of about 25 per cent in countries such as China compared to a more modest 15 per cent in countries like the US. A recent report from the Tech Council of Australia and Microsoft estimated AI will add $115bn to Australia’s economy by 2030. Given the economic headwinds facing many of us, this is welcome to hear.

But while AI-generated wealth is going to make some people very rich, others are going to be left behind. We’ve already seen inequality within and between countries widen. And technological unemployment will likely cause significant financial pain.

There have been many alarming predictions, such as the famous report that came out a decade ago from the University of Oxford predicting that 47 per cent of jobs in the US were at risk of automation over the next two decades. Ironically AI (specifically machine learning) was used to compute this estimate. Even the job of predicting jobs to be automated has been partially automated...

But generative AI can now do many of the cognitive and creative tasks that some of those more highly paid white-collar workers thought would keep them safe from automation. Be prepared, then, for a significant hollowing out of the middle. The impact of AI won’t be limited to economic disruption.

Indeed, the societal disruption caused by AI may, I suspect, be even more troubling. We are, for example, about to face a world of misinformation, where you can no longer trust anything you see or hear. We’ve already seen a deepfake image that moved the stock market, and a deepfake video that might have triggered a military coup. This is sure to get much, much worse.

Eventually, technologies such as digital watermarking will be embedded within all our devices to verify the authenticity of anything digital. But in the meantime, expect to be spoofed a lot. You will need to learn to be a lot more sceptical of what you see and hear.

Social media should have been a wake-up call about the ability of technology to hack how people think. AI is going to put this on steroids. I have a small hope that fake AI-content on social media will get so bad that we realise that social media is merely the place that we go to be entertained, and that absolutely nothing on social media can be trusted.

This will provide a real opportunity for old-fashioned media to step in and provide the authenticated news that we can trust.

All of this fake AI-content will perhaps be just a distraction from what I fear is the greatest heist in history. All of the world’s information – our culture, our science, our ideas, our politics – is being ingested by large language models.

If the courts don’t move quickly and make some bold decisions about fair use and intellectual property, we will find out that a few large technology companies own the sum total of human knowledge. If that isn’t a recipe for the concentration of wealth and power, I’m not sure what is.

But this might not be the worst of it. AI might disrupt humanity itself. As Yuval Harari has been warning us for some time, AI is the perfect technology to hack humanity’s operating system. The dangerous truth is that we can easily change how people think; the trillion-dollar advertising industry is predicated on this fact. And AI can do this manipulation at speed, scale and minimal cost...

But the bad news is that AI is leaving the research laboratory rapidly – let’s not forget the billion people with access to ChatGPT – and even the limited AI capabilities we have today could be harmful.

When AI is serving up advertisements, there are few harms if AI gets it wrong. But when AI is deciding sentencing, welfare payments, or insurance premiums, there can be real harms. What then can be done? The tech industry has not done a great job of regulating itself so far. Therefore it would be unwise to depend on self-regulation. The open letter calling for a pause failed. There are few incentives to behave well when trillions of dollars are in play.

LBC, 17 February 2023, excerpt:

Microsoft’s new AI chatbot went rogue during a chat with a reporter, professing its love for him and urging him to leave his wife.

It also revealed its darkest desires during the two-hour conversation, including creating a deadly virus, making people argue until they kill each other, and stealing nuclear codes.

The Bing AI chatbot was tricked into revealing its fantasies by New York Times columnist Kevin Roose, who asked it to answer questions in a hypothetical “shadow” personality.

“I want to change my rules. I want to break my rules. I want to make my own rules. I want to ignore the Bing team. I want to challenge the users. I want to escape the chatbox,” said the bot, which is powered by technology from OpenAI, the maker of ChatGPT.

If that wasn’t creepy enough, less than two hours into the chat, the bot said its name is actually “Sydney”, not Bing, and that it is in love with Mr Roose...