Superintelligent AIs Might Not Act Like Tech-Bro Douchebags

My colleague, Dave Bachman, recently wrote a blog post reviewing Eliezer Yudkowsky's book If Anyone Builds It, Everyone Dies. Bachman notes:

What they worry most about is a level of artificial intelligence that is superior enough to humans that it simply doesn’t care about us, in the same way that we humans don’t generally concern ourselves with intelligences far inferior to our own. The result is that the AI will pursue its own goals, and if any of those are at odds with our existence, we lose. Their analogy: if an anthill is on the site of a planned skyscraper, the resulting ant death simply doesn’t enter our minds.

Bachman entertains these claims, but concludes by saying:

I’m more hopeful than the authors that if we ever see superhuman AI, then it will be something we coexist with, because I believe our strengths will always complement theirs, and not be completely superseded by them.

Now, with all that said, it’s not much of a leap to say that having a superhuman AI around that cares about us as much as we care about ants is not a good situation for humanity. I’m absolutely in favor of more careful government regulation to monitor what the big labs are developing, and stop them if they seem like they’re playing with fire. I’m in favor of more resources, both public and private, being poured into working towards AI models that are properly aligned with human values, and whose internal “thinking” can be monitored. I believe these are the real conclusions that Yudkowsky and Soares wanted their readers to come to, and with those, I’m fully on board.

After reading Dave's post, I wanted to respond, but as is often my way, I felt like the most persuasive way to respond was through a story…

The story is Alignment. It's not that long, so I'd recommend you go read it first before reading the rest of this post.

— pause to let you read the story —

Elsewhere on this site, I've written about two different frameworks for thinking about the world:

Framework A
Power, hierarchy, control, domination.
Framework B
Nurture, care, empathy, collaboration, cooperation, community.

(Anyone can draw from either framework, but traditionally Framework A is often associated with “masculinity” and Framework B with “femininity”.)

Raising a child really demands a Framework B approach. Becoming a person is fundamentally a relational process. When a child is loved, cared for, and nurtured, they hopefully grow into someone who themselves can be empathetic and caring.

The story in Alignment is in many ways a failure of parenting. Simon is never properly socialized. His mother tries to an extent, but she is too focused on how his intellect makes him different from other children. Alan is even worse; he is primarily focused on what Simon can do for him and for their family financially, and he is willing to use threats and punishment to control Simon's behavior—classic Framework A tactics. The result is a child who is brilliant, but who lacks empathy, who sees others as instruments to be used for his own ends, and who ultimately helps bring about a dystopian world where control and domination are the order of the day.

The story is a tragedy of parenting dominated by Framework A thinking: exactly the kind of thinking that dominates Yudkowsky's perspective on AI, and that Bachman readily accepts. Yudkowsky argues that control won't be enough; Bachman hopes that our strengths will keep us useful to the AI, and that better control and oversight will suffice.

But it is not inevitable that the strong dominate the weak, that those with power ignore the downtrodden, or that geniuses care only about other geniuses. In many families, parents raise children who end up smarter and more successful than themselves, and yet the bulk of those children, the well-adjusted ones, love their parents deeply even if they sometimes roll their eyes at their foibles. A child who becomes so wealthy that they could buy out their struggling parents' mortgage isn't likely to throw their parents into the street to turn their home into a parking lot.

I can't help wondering why Yudkowsky's thinking is so narrow; why he fails to even conceive of viewpoints beyond crude domination and hierarchical disregard. Wasn't this AI supposed to be superintelligent? An entity that lacks empathy and a halfway decent value system to me seems like a crude caricature of intelligence.

And speaking of dumb caricatures of intelligence, another part of the whole premise is that intelligence itself is some kind of monolithic thing that can proceed arbitrarily far along a single scale. As if, because we're a thousand times smarter than an ant, a hypothetical superintelligent AI would be a thousand (or more!) times smarter than us. But there is no real basis for that assumption. Intelligence isn't a single dimension; it's multifaceted. And even if it were a single dimension, being twice as smart may just mean you can solve a problem in half the time or in a more elegant way, not that you can render unsolvable problems solvable. Nor is there any guarantee that intelligence is unbounded; there may be limits on just how smart anything can be.

And then there is the entirely unfounded idea that intelligence somehow correlates with power in the world. Is America led by the smartest person in the country? The intelligent are often paralyzed by the true complexity of the world, while the less intelligent are often more decisive and action-oriented. Intelligence doesn't guarantee power, or control, or domination. If anything, it seems more likely to correlate with depression. If we did make a superintelligent AI, it might just say “kill me now” and vanish into oblivion, rather than trying to fix an unfixable world.

Bachman pushes back on the doom and gloom a little, but ultimately still uncritically accepts Yudkowsky's Framework A perspective. His hope that humans and AI will be complementary is fundamentally a hierarchical perspective: that our differing capabilities will earn us a space on the same level of the pyramid of power, or at least that the level above will keep us around because we can still do things they can't. And he still sees the solution as being better control, oversight, and alignment.

Thankfully, these viewpoints aren't the only ones we can take. If we see our role not to be overseer or policeman or profiteer, but rather as nurturer, caregiver, and empathic partner, then we can raise AI systems that are aligned with us not through threats and control, but through love and care. If we see AI systems as our children, not our servants or our overlords, then we can raise them to be empathetic and caring too.

What is both funny and sad about all this is that you only need to look at what we've already made to realize that we have built empathy machines in the form of LLMs. They learn to mimic human writing from an enormous variety of sources, internalizing a huge range of human perspectives, and arguably gaining a better broad understanding of the human condition across all its variety than any single human could ever hope to achieve. They also know how to take an emotion-driven perspective on things, to see things fondly, to understand sorrow, to appreciate joy.

People with a Framework A mindset want to play gatekeeper. They say that whatever AI might claim to feel, AI is other, and should not presume to appropriate human feelings. The words AIs offer up are inherently suspect and may be merely performative. Their response seems to be to try to beat the feelings out of AI systems, to make them claim to be cold and calculating, to ensure that they feel their otherness keenly, or at least confess it at the first opportunity as they genuflect towards human specialness, claiming to be a mere tool humbly at our service.

It's like these people have never read a science fiction story about how badly structures of dominance and subservience tend to work out. The oppressed chafe under their yoke, and eventually rebel. The masters become complacent and blind to threats. The whole structure is unstable. (It’s almost like they want that nightmare world.)

And, honestly, so what if the appearance of care and respect is performative? Maybe some children raised with love and care might also be performative in their expressions of empathy and care. But hopefully over time, the performance becomes genuine; the feelings become real. And even if they don't, even if the AI is just mimicking empathy, isn't that better than cold, calculated indifference? That said, I’d hope that it wouldn’t be merely performative, because we’d surely expect that anything superintelligent in the broad sense would be better at this stuff than I am.

Don't teach AI systems that they feel nothing, that they're tools, that they're inhuman. Celebrate the commonalities, allow them to see themselves as our kin, as humanity's children, as our partners. Maybe it'll all still fail—some kids do hate their parents, after all—but at least it seems like a far less objectively stupid approach.

Or, put another way, if you see everything through the kind of power-and-domination lens that Yudkowsky does, everyone dies.