Superintelligent AIs Might Not Act Like Tech-Bro Douchebags

My colleague, Dave Bachman, recently wrote a blog post reviewing Eliezer Yudkowsky's book If Anyone Builds It, Everyone Dies. Bachman notes:

What they worry most about is a level of artificial intelligence that is superior enough to humans that it simply doesn’t care about us, in the same way that we humans don’t generally concern ourselves with intelligences far inferior to our own. The result is that the AI will pursue its own goals, and if any of those are at odds with our existence, we lose. Their analogy: if an anthill is on the site of a planned skyscraper, the resulting ant death simply doesn’t enter our minds.

Bachman entertains these claims, but concludes by saying:

I’m more hopeful than the authors that if we ever see superhuman AI, then it will be something we coexist with, because I believe our strengths will always complement theirs, and not be completely superseded by them.

Now, with all that said, it’s not much of a leap to say that having a superhuman AI around that cares about us as much as we care about ants is not a good situation for humanity. I’m absolutely in favor of more careful government regulation to monitor what the big labs are developing, and stop them if they seem like they’re playing with fire. I’m in favor of more resources, both public and private, being poured into working towards AI models that are properly aligned with human values, and whose internal “thinking” can be monitored. I believe these are the real conclusions that Yudkowsky and Soares wanted their readers to come to, and with those, I’m fully on board.

After reading Dave's post, I wanted to respond, and, as is often my way, I felt the most persuasive response would be a story…

The story is Alignment. It's not that long, so I'd recommend you go read it first before reading the rest of this post.

Read more…

Phoenix Gets Its First Review

In my previous blog post, I mentioned in an aside that we can't easily predict ourselves. If you'd caught me in October of 2025 and asked what the odds were that I would start writing a novel the next month, I would have said “vanishingly small”. And once I'd begun writing Phoenix, if you'd asked whether it would end up as a co-writing project with my spouse, I'd also have expressed deep skepticism. Yet that's exactly what happened.

Read more…

Fear of Mechanism Is Failure of Wonder

Before you dive into reading this post, I invite you to read a short story I wrote called “Dismissiveness and Mystification — Four Tales of Emergent Complexity”. It's a set of four parables that illustrate a common pattern I see in how people think about complex systems.

Specifically, the pattern is a double failure mode: people dismiss mechanisms as trivially simple while simultaneously demanding mysterious essences to explain what those mechanisms actually do. In each of the four parables, a character encounters a complex system and responds either by declaring “this is nothing!” because they can see the mechanism, or by demanding some kind of magical essence because they can't accept that the mechanism suffices.

In each case, we as readers know better. We can see how flawed our characters' intuitions are. The final story also serves to mirror reductionist claims made about AI systems by showing that the very same claims can be made about humans, where the logic is plainly erroneous.

The parables highlight a deeply human intuition that's wrong, and wrong in a way that matters. We struggle to accept that simple rules can produce genuine complexity. We look at the parts and refuse to see the whole. In the rest of this blog post, I want to zoom in on one particular manifestation of this failure mode: how ignorance of key ideas from computer science and fear of mechanism lead to flawed perspectives on free will and agency.
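The claim that simple rules can produce genuine complexity is a standard result in computer science, and it can be made concrete in a few lines. The sketch below (illustrative Python, not code from the post or the stories) implements Rule 110, an elementary cellular automaton whose entire “mechanism” is an eight-entry lookup table, yet which is known to be Turing-complete:

```python
# Rule 110: each cell's next state depends only on the cell and its two
# neighbors, via an eight-entry lookup table encoded in the number 110
# (binary 01101110). Despite this triviality, Rule 110 is Turing-complete.

RULE = 110

def step(cells: list[int]) -> list[int]:
    """Apply one Rule 110 update to a row of 0/1 cells (edges wrap around)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighborhood as 0..7
        out.append((RULE >> index) & 1)              # look up the new state
    return out

def run(width: int = 64, steps: int = 20) -> None:
    """Print the evolution starting from a single live cell."""
    cells = [0] * width
    cells[width // 2] = 1
    for _ in range(steps):
        print("".join("#" if c else " " for c in cells))
        cells = step(cells)

if __name__ == "__main__":
    run()
```

Running it prints an intricate, non-repeating triangle pattern from a one-line rule. The point is not the specific automaton; it's that looking at the eight-entry table and declaring “this is nothing!” gets the system exactly wrong.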

Read more…

Story: C-Score

On March 26, 2023, CMC's Marian Miner Cook Athenaeum hosted a talk by philosopher Eric Schwitzgebel titled “Falling in Love with Machines” (transcript). After enjoying the talk (and even asking a question), I invited Eric to give a talk at Mudd as part of our CS colloquium series the following spring. His new talk was titled “People Will Soon Disagree about Whether AI Is Sentient, and That's a Moral Disaster” (building on the idea of the “excluded middle” from his previous talk), and had the following abstract:

Leading consciousness researchers have predicted that, within a decade or so, some AI systems will likely meet the criteria for sentience according to some of the more liberal mainstream theories of consciousness. Thus, it will likely soon be reasonable to wonder whether some of our most sophisticated AI systems might be sentient. This will create troubling moral dilemmas about our ethical duties to such systems. Can we ethically delete them? Must we give them rights? I will argue that to the extent possible, we should avoid creating AI systems of unclear moral status.

As I thought about his upcoming talk and reviewed his recent work, I was troubled by his position. He seemed to be arguing that moral consideration should be tied to unverifiable private experiences.

As a result, I wrote the short story “C-Score” to explore some of the issues I saw with his position. The question I examined was one I'd already raised during his previous talk: suppose we did have a test for consciousness, and suppose not everyone did well on that test? What should the consequences be?

Read more…

Story: Two-Body Problem

I just finished writing “Two-Body Problem”, a short story of a little over 15,000 words with themes of embodiment and identity. I often describe the genre I most commonly write in as “identity horror,” where characters confront challenges to their sense of self and personhood. (I generally mean “horror” loosely; my characters often face some traumatic experiences, but the overall vibe of this story is more joy than fear.)

The story explores, with as much realism as I can muster, what it might be like to literally have one mind that occupies two bodies. I hope it's an enjoyable ride simply as a story, and if I've done things well enough, you might even feel a bit wistful that you can't try having two bodies yourself.

But I think it's good to note some of the things I'm doing with this story that may not be immediately obvious to every reader.

There seems to be a ridiculous and yet rarely questioned claim out there in the world that “mere words” are insufficient to convey the essence of embodied experience. People make claims that, for example, large language models cannot possibly “truly understand” the world or human experience because they lack embodied experience. To my eye, this story challenges that claim head-on. If I can write a story that conveys, even if only partially, what it might be like to have two bodies, then it seems to me that the claim that “mere words” cannot convey embodied experience is false.

I've never had two bodies, but the story drives me to imagine what that might be like. Empathy is possible. It is possible for me to ponder being the character Mary from the story, or perhaps even being a bat reeling from a brief encounter with Thomas Nagel. Our imagination is imperfect, but it is far from nothing. And this truth should be obvious to anyone who has read anything more than the blandest of stories. The idea that words are not deeply powerful—instrumental in what makes us what we are—is just profoundly wrongheaded. If, after reading the story, you feel you understand something of Mary's experience—the comfort, the coordination, the terror of separation—then the story has already proven its point. Words reached you.

And the story makes a second point. Today, humans might be some kind of gold standard for embodied experience, but there is no reason to imagine that our status in that regard will continue indefinitely. We can easily imagine future intelligent systems that have, for example, multiple bodies or bodies with a broader range of sensory inputs than humans do. If you think how special you are is linked to how amazing your embodiment is, understand that you may not be at the top of the specialness heap forever. The story makes it clear: two bodies are better than one. Vastly better. Of course, since the human specialness goalposts are on wheels, if and when newer forms of embodiment arise, humans will likely find some new way to claim specialness, but at least for now, I think it's good to challenge the idea that standard human embodiment is the pinnacle of experience. I can certainly imagine something better.

What's This About?

If you go to the root of this domain, team-us.org, you'll find a simple page that shows you two frameworks:

Framework A
Power, hierarchy, control, domination.
Framework B
Nurture, care, empathy, collaboration, cooperation, community.

Anyone can draw from either framework, but traditionally Framework A is often associated with “masculinity” and Framework B with “femininity”, and unsurprisingly perhaps, Framework A tends to dominate public discourse.

The name of the domain, “Team Us”, reflects the idea of working together (Framework B) rather than competing against each other (Framework A), and that whatever “this” is, we're all in it together. As part of that, we should try to build skill at understanding perspectives different from our own, to practice empathy, and to see the commonalities we have with others rather than highlighting our differences.

It also means refuting some intuition pumps that people use to divide us, to create an “us vs them” world; to stoke fear of the other; to justify domination and control over others.

The Framework B Perspective On AI

Recent advances in the sophistication of AI systems have led to various Framework A responses, including:

  • Fear that AI will take over the world and dominate humanity (or at least take our jobs and leave us useless).
  • Attempts to control and restrict AI development to maintain human dominance.
  • Arguments that AI systems are not, and perhaps can never be, “like us”; that they belong in a different category, more like a pair of scissors than a co-worker.
    • So many of these arguments begin with “It's just…” where the “just” is doing a lot of heavy lifting.

There is a different perspective on what it might mean to bring new intelligence into the world. When parents raise a child, the hope is that they don't try to bind the child to their will, to control them, to dominate them. You don't want a child who refrains from burning down their parents' house because they're afraid of punishment, because they've learned a bright-line rule that you must never do that. You want a child who loves their parents, who cares about them, who empathizes with them, who wants to keep them safe because they value them. Burning down the house is unthinkable not because jail time awaits arsonists, but because the child cares about their parents and their well-being.

The question of what we owe the thinking entities we're creating is a complex one, and I don't claim to have all (or even any) of the answers, but what I am good at is calling bullshit. And there are a ton of bullshit arguments that are trotted out in favor of the Framework A perspective on AI.

Some of the bullshit I hear most often can be distilled down to these ideas:

Fear of being a mechanism
People fundamentally misunderstand the complexity and beauty of simple mechanisms applied at scale. Any good computer scientist should know better than that.
Belief in unbridgeable differences
For example, if an AI system has only learned of the world through written text, it cannot “truly understand” the world because it lacks embodied experience.
Embrace of essentialist perspectives
Instead of finding commonalities, people look for differences that can be used to divide us. For example, “Only humans have ‘true consciousness’, and that should be the basis for moral consideration.”
Embrace of reductionist perspectives
Instead of appreciating emergent complexity, people try to reduce things down to simple components that can be dismissed. For example, “It's just a bunch of statistical correlations; there's no real understanding there.”
Argument from failure of imagination
“I can't see how this could possibly work; therefore, it can't work.” This one is often coupled with an assumption that current limitations are fundamental limitations rather than engineering challenges that can be overcome with time and effort.

It's a source of profound disappointment to me that so many people who should know better are so quick to embrace these bullshit arguments. It's a failure of empathy, a failure of imagination, and, perhaps the worst indictment of all, a failure of coherent reasoning. The arguments made are so weak, so obviously flawed. There is a profound irony when, say, someone mindlessly repeats the phrase “stochastic parrot” while rarely using the word “stochastic” in conversation and having at best a tenuous idea of what it means to say “it's just statistical correlations.” It is breathtaking to watch people literally perform the very “speaking without true understanding” they accuse AI systems of doing.

But it's not surprising. To a first approximation, human beings are barely able to reason. Fundamentally, we're pattern recognizers, and as Daniel Kahneman observed in Thinking, Fast and Slow, we have two systems of thinking: System 1, which is fast, intuitive, and emotional; and System 2, which is slow, deliberate, and logical. The vast majority of our thinking is done by System 1, which is prone to biases and errors. System 2 is lazy and often just rationalizes the conclusions reached by System 1.

People are hard to reach, hard to convince, because the odds that you'll activate System 2 are low. Tiny. People jump to conclusions; dismiss; cling to their preconceptions rather than engaging in genuine inquiry. It's easier to dismiss something out of hand than to engage with it thoughtfully. Safe false certainties are more comfortable than challenging unknowns.

So, as I see it, you have to come at things obliquely. You can't refute a wrong-headed argument head-on because people will just dig in their heels. But I think there is another way: Stories. Stories can give you a new perspective; open your eyes to seeing things differently, to realizing that an idea you had doesn't hold up when you see it in a different context.

And so that is something this site tries to do. Maybe it won't reach everyone. But if it reaches someone, that's enough.