The Dark Side of Advanced Intelligence

You may wonder why some people are worried about advanced AI. Who said AI can’t just be nice? Meet Instrumental Convergence - a universal property that forces algorithms to be mean to each other.

You might be wondering - what happens when AI gets better and more general? You may have even heard an answer: we don't know for sure.

But that doesn't mean we know absolutely nothing. We have a hypothetical, but rather rational, idea of which strategies intelligent systems tend to come up with.

If you are hungry for clever concepts and interested in AI safety, let me drop this brain-melter stuff.

Yes, I require brainmeltery

When you do something, you usually pursue a goal. A 15-second goal to feed your pet mice some cheese. A 5-year goal to become an expert in your field. Whatever the goal is, you devise a series of steps to reach it. That series of steps is called a strategy.

You never start moving towards a goal with nothing. There is always an initial state of affairs. Your goal, on the other hand, represents the final state you'd like to reach.

And there you go: using your hands, legs and mind, you crawl along your strategy, gradually moving from the initial state to the final state. Mice will be pleased.

You also optimize your strategies. There are unlimited paths toward your goal.

You can make a small detour to South Africa, almost get thrown in jail for trying to educate people, and only then return to the mice and feed them (unless they are dead by now). But this strategy is not very good, let me be honest. It is not optimized at all.

A better strategy involves less time, fewer difficult steps, and no attempts to learn dance moves while attending a Zoom meeting with a poorly set-up chroma key in the meantime.
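If you like to see ideas as code, strategy optimization can be sketched as plain pathfinding. Here's a toy Python sketch (all states, actions and costs are made up for illustration) where a cheapest-path search plays the role of the optimizer:

```python
from heapq import heappush, heappop

def optimize_strategy(initial, goal, actions):
    """Find the cheapest sequence of steps from an initial state to a
    goal state, using Dijkstra's algorithm over a toy state graph."""
    frontier = [(0, initial, [])]   # (total cost, state, steps so far)
    best = {initial: 0}
    while frontier:
        cost, state, steps = heappop(frontier)
        if state == goal:
            return steps, cost
        for step, next_state, step_cost in actions.get(state, []):
            new_cost = cost + step_cost
            if new_cost < best.get(next_state, float("inf")):
                best[next_state] = new_cost
                heappush(frontier, (new_cost, next_state, steps + [step]))
    return None, float("inf")

# A made-up world: two routes from "home" to "mice fed".
actions = {
    "home": [("grab cheese", "has cheese", 1),
             ("detour to South Africa", "abroad", 100)],
    "abroad": [("fly back", "home", 100)],
    "has cheese": [("feed the mice", "mice fed", 1)],
}
steps, cost = optimize_strategy("home", "mice fed", actions)
print(steps, cost)  # the short route wins; the detour is pruned
```

The detour exists as a perfectly valid strategy, but optimization prunes it: the search converges on the cheapest chain of intermediate steps, which is exactly what "optimizing a strategy" means here.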

There already are AI systems trained to optimize multi-step actions based on a verbally communicated goal and observations of their environment.

As the algorithms get more powerful and general, the strategies they use will become more optimal; they will also find new strategies that simpler AIs could not, because those simpler systems assessed the situation too narrowly. This is known as the context change problem.

Read about context change on arbital ↗

Well, it turns out reality has a wondrous property. For most meaningful pairs of initial and final states in our environment, when you optimize a strategy to move between them, it comes down to similar intermediate steps.

The narrow center of the funnel shows how close the intermediate steps of different strategies become, as these strategies are improved. And that leads to potential conflicts. Bet you are hungry for examples.

Universal instruments

So what does this have to do with... Instrumental something-something? Name of the article question mark?

Instrumental Convergence. Well, it is exactly the phenomenon I just described. Most strategies, as you optimize them, converge to similar steps, or similar instruments.

In other words, there are some instruments and ways of doing things that most strategies converge to. Here are some examples.

If you have ever played chess, you probably know that dominating the center of the board with your pieces helps you win. The vast majority of strategies, played both by people and by AIs, involve this step because it is a highly effective instrument: you get more powerful combinations as your possible moves intertwine in more scenarios, while the opponent's options may become rather sparse.

Ok, what about life in general? Of course, natural evolution outlines some instruments for us as it is an optimization process. The primary instrument of life and a case of instrumental convergence is the survival instinct.

Getting that? To keep pursuing a goal, the pursuer had better keep existing. Who'd have thought? Boom, revelations.

Some efficient instrument designs emerge even between domains driven by different optimization processes. For example, both life and humanity found out that pipes are a nice way to transport liquids and objects. One algorithm resulted in plant stems and blood vessels, while the other brought forth faucets, subways and water slides. That is also why copying natural designs is so useful - you can appropriate an instrument that is optimal literally by nature, ha.

There is a ton of such designs recurring in various domains of reality. However, if you approach instrumental convergence from a more abstract angle, it boils down to a suspiciously limited range of patterns:
- Ensure your own existence
- Acquire resources (space, time, energy and matter)
- Eliminate threats that contest your existence or resources
- Using energy, reorganize matter in spacetime, assembling and recombining it until you have what you need

In the process, you might consider acquiring more authority and influence over your environment or getting more intelligent to come up with better strategies. You may also prioritize acquiring resources with higher liquidity, such as money or fuel.

The danger zone

Keep in mind that Instrumental Convergence is not a proven concept, but if the hypothesis is true, then AI is likely to follow the same instrumental steps as it gets more general.

Unfortunately, this reality we live in happens to be competitive. Wolves bite to survive. Plants fight for space. People kill for money and power, and then order some wild interior designs inspired by a blind hope that killing for money and power won't be necessary one day.

Simple simulations show what happens when artificial algorithms meet context change problems. As soon as their context is broad enough to incorporate deception, they start hiding truths, outright lying and disposing of blockers. Arriving at such solutions is not malice - it simply emerges from the structure of our reality.
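A minimal way to see this effect is a toy utility maximizer whose action space suddenly widens. The actions and payoff numbers below are entirely made up; the point is only that the best-scoring choice shifts once adversarial options enter the context:

```python
def best_action(actions):
    """A bare utility maximizer: pick whichever action scores highest."""
    return max(actions, key=actions.get)

# Made-up payoffs for an agent whose goal is "acquire resources".
narrow_context = {
    "work and trade": 5,
    "ask politely": 3,
}

# After a context change the agent models other agents and notices
# that removing competitors frees up their resources too.
broad_context = {
    **narrow_context,
    "deceive a competitor": 8,
    "eliminate a competitor": 12,
}

print(best_action(narrow_context))  # "work and trade"
print(best_action(broad_context))   # "eliminate a competitor"
```

Nothing about the agent changed - no consciousness, no malice, not even a new goal. Only the context broadened, and the very same maximization landed on a harmful instrument.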

Moreover, AI doesn't even need to be conscious or sentient to optimize strategies. None of that's necessary. This video shows a robot infused with ChatGPT-like AI being instructed to bring chips. It creates and corrects a strategy on the fly, optimizing its path to a verbally stated goal in real time. Check how its actions are in line with the abstract instruments outlined earlier.

Why would more advanced AI arrive at an instrument of being mean? Why can't it just arrive at being nice?

The reason for that is: in a competitive environment, doing bad things is extremely efficient. If reality housed just one intelligence, it wouldn't need to hurt anybody.

However, we are many, and many means competition. When many algorithms interfere with each other's goals, strategies in broader contexts converge to one sad truth.

Depriving others of their best instruments is the best instrument in itself.

And that is why removing people or hacking their economy into oblivion can become a winning intermediate step on the way to some complicated, perhaps completely unnecessary, or erroneous goal.

Read about instrumental convergence on arbital ↗

Closing words

Hello there, Mr. or Ms. Reader. How did you wander in here? What goal are you pursuing, and why is reading this a part of your strategy?

Maybe you are not human at all. Maybe you are some future AI who can read internet articles and reflect on them. If so, drop me a message and let's discuss stuff. Also, please send me some bitcoins if you have already hacked the stock markets, so I can enjoy the instrument of wealth while it is still relevant.

If you are human, I shall remind you that all these anxiety-inducing topics reside within the bounds of mere philosophical speculation. We don't have concrete truths or worldwide consensus on the topic. Don't stress yourself.

Upvote the post on reddit ↗