Artificial Free Will

Does determining an AI’s goal amount to slavery?

In his influential book Superintelligence, the Oxford philosopher Nick Bostrom argues that humanity should invest a lot more resources into preventing disastrous outcomes in AI research. In particular, Bostrom is worried that a superintelligent system will be created before we reach the practical understanding needed in order to control such a system.

It is useful for the purpose of this post to introduce some terminology.

Definitions: General intelligence measures an agent’s ability to accomplish its goals in a wide range of unknown environments. A superintelligence is superior to human-level intelligence in all respects, including domains like scientific creativity or “street smartness”. The definition of superintelligence leaves open whether such a system is conscious or not.

If created, a superintelligent AI would be trying to maximize whatever goal it ended up with, without caring whether the pursuit of said goal would cause death and suffering. According to Bostrom, there are two components to the problem of controlling such AI-outcomes: 1) figuring out how to program goals for an AI without unintended side-effects, 2) figuring out which goals we would want the AI to have.

In this post, I want to leave aside questions about whether the scenario outlined by Bostrom, namely the creation of a smarter-than-human machine intelligence, is likely to ever happen or not. Instead, I want to focus on a common moral objection to the priorities Bostrom advocates. The objection is that trying to determine the values/goals of an artificial intelligence is morally on par with “enslaving” the AI. The intent of this post is to show that the slavery-objection is based on anthropomorphic confusion.

You cannot avoid determining goals

People who side with the slavery-objection feel that humans would be forcing their own values onto the AI, depriving the AI of its autonomy. The term “forcing”, or expressions like “deprival of autonomy”, suggest that the AI in question already has values to begin with. This is where the confusion lies: There is no “ghost in the machine”, no goal that exists independently of the creative or evolutionary processes that brought an intelligent agent into existence. If the newly created agent is superintelligent, it will be able to acquire lots of true beliefs about the world. But beliefs by themselves are not enough to motivate goals. More input is required. Even a weak version of the Humean theory of motivation predicts that AIs with all kinds of goal-architectures are possible in theory. This includes possible AIs with goals that would appear highly alien to us. A new AI is not summoned from a Platonic Heaven where AIs exist in their Ideal Form, true and unaltered. AIs are created from scratch, and every aspect of how they work needs to be specified. The process that brings about an intelligent agent determines the goals of this intelligent agent, newly, out of nowhere, with no “obvious” default to defer to. If we attempt to build a superintelligence, we (and no one/nothing else) are responsible for its resulting goals.

Implicit and explicit goals

Perhaps the slavery-objection is common because people perceive there to be a fundamental difference between setting goals explicitly and doing so implicitly. Imagine the creators of an AI had full control over the AI’s goal-function. Imagine that there exists a locus in the AI’s source code where the programmers could write down any sort of goal, which would then become the only thing the AI “cares” about (whether that’s “caring” in a conscious sense or not is a different question). In this case, it seems clear that the AI is being built just to serve the interests or curiosity of the creators. In contrast, if the AI were to be equipped with more implicit goals, or with situational heuristics only – for instance if it were equipped with a “morality-module” (or more broadly an existentialist philosophy-module) that causes the AI to learn and internalize social norms and think reflectively about the content of its own goals – then it might appear as though the researchers are granting the AI some “autonomy”. Bostrom on the other hand seems to have a very clear outcome in mind when he talks about the importance of determining an AI’s goal, one which seems to leave little to no room for “autonomy”.

Implicit goals are not fundamentally different

But the thing is: Implicit ways of specifying goals are no different. The point that adherents of the slavery-objection fail to understand is that even implicit ways of determining an AI’s goal result in a deterministic outcome where the creators are able to predict, at least in theory, how the AI is going to react to all possible input. There exist no unique lines of code labeled “morality-module” that intelligent agents either possess or lack and that determine whether these agents are “autonomous” or not. The act of thinking is never completely free, at least not in the sense of “free from how your brain/code is set up”. An AI can’t simply abolish all its architectural constraints and start “thinking morally” – the decision to think about morality needs to come from somewhere! All the reasoning an AI goes through, all the plans it forms, and all the strategic actions it takes, need to be determined by algorithms being run – this is how things work in a reductionist universe. A “morality-module” would have to consist of such algorithms, and for it to be properly implemented in an AI’s causal workings, it would have to be specified 1) when this module is consulted, 2) what kinds of questions/empirical data it would seek as its input, and 3) how it would react to all sorts of specific input. Without these things being specified in some way or another, there would be no functional morality-module in the first place! And without it, an AI would simply never engage in “moral reasoning”. This only seems hard to imagine for us because we ourselves are so used to having a version of a morality-module, the one that evolution equipped us with, which gives us the capacity and the emotional need to think about moral norms in some way or another. Because we’re tempted to anthropomorphize, we think that things necessarily need to be that way (for all conceivable intelligent agents). But that’s shortsighted and false. If we started thinking about the details of how to design a “morality-module”, we’d quickly realize that there are infinitely many ways of doing so (including having no such module at all), and none of these ways jumps out as “the obvious right way”.
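To make the three requirements concrete, here is a minimal toy sketch in Python. Everything in it is a hypothetical illustration (the names `MoralityModule`, `decide`, `example_module`, and the action labels are invented for this example); it is not a proposal for how a real system would be built. It only shows that a “morality-module” does nothing until someone fills in when it is consulted, what input it seeks, and how it responds.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MoralityModule:
    """Hypothetical container for the three things that must be specified."""
    should_consult: Callable[[str], bool]   # 1) when the module is consulted
    gather_input: Callable[[str], dict]     # 2) what input it seeks
    respond: Callable[[dict], str]          # 3) how it reacts to that input


def decide(situation: str, module: Optional[MoralityModule]) -> str:
    """Return an action label; without a module, no moral reasoning happens."""
    if module is None or not module.should_consult(situation):
        return "pursue_goal_unchanged"
    return module.respond(module.gather_input(situation))


# One arbitrary way of filling in the three slots (purely illustrative).
example_module = MoralityModule(
    should_consult=lambda situation: "others_affected" in situation,
    gather_input=lambda situation: {"harm_expected": "harm" in situation},
    respond=lambda data: "revise_plan" if data["harm_expected"] else "proceed",
)
```

Swapping in different functions for any of the three slots yields a different “morality”, and nothing in the code singles out one choice as the default.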

Thinking things through

As Daniel Dennett puts it: “AI makes philosophy honest”. We cannot just defer to vague terms like “autonomy” or “morality” and expect these issues to sort themselves out. Instead, in order to end up with a fully-functioning artificial intelligence that is capable of “moral reasoning”, we’d need to know precisely what it is that we have in mind, so that it can be coded into the AI’s goal-architecture. And if these conditions are granted, then the difference between explicit and implicit ways of creating an AI’s goal-function becomes blurred, because even the implicit way of doing it requires a fixation of how the AI will react to every possible input.

The only difference between explicit and implicit ways of determining an AI’s goals is that the implicit way makes it harder for the researchers to predict precisely what the outcome will be. The implicit way outputs a goal (an explicitly specified one we might say!) that is adjustable, a “meta-goal” that, in principle, can be reduced to a list of specifications of the sort “If empirical condition A applies, change the content of your goal to X; if B applies, change it to Y, …”, and so on. The creators of an AI with an implicitly specified goal still somehow need to determine how these specifications play out, so in a sense, they hold just as much “control” over the AI’s goal as in the case where they just program a straightforward, explicit goal.
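The reduction of a “meta-goal” to a list of specifications can be written out as a toy sketch. All names below (`META_GOAL_RULES`, `DEFAULT_GOAL`, `resolve_goal`, and the condition and goal labels) are hypothetical placeholders mirroring the A/X, B/Y wording above. The fallback goal is an added assumption: the designers have to decide what happens when no listed condition applies.

```python
# Toy illustration only: every name below is hypothetical.
# A "meta-goal" written out as explicit condition -> goal rules.
META_GOAL_RULES = {
    "condition_A": "goal_X",
    "condition_B": "goal_Y",
}

# Assumption: the designers must also pick what happens in unlisted cases.
DEFAULT_GOAL = "goal_default"


def resolve_goal(observed_condition: str) -> str:
    """Return the object-level goal the agent adopts under a given condition."""
    return META_GOAL_RULES.get(observed_condition, DEFAULT_GOAL)


if __name__ == "__main__":
    for condition in ("condition_A", "condition_B", "condition_C"):
        print(condition, "->", resolve_goal(condition))
```

However large the table grows, it remains a fixed mapping chosen by whoever wrote it, which is the sense in which the implicit route still outputs an explicitly specified goal.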

Genetic programming

Creators could also play around with genetic programming and, if successful, manage to create an AI without truly understanding what happened. This would be equivalent, from the point of view of the creators, to adding more “randomness” to the mix. Of course, in a deterministic universe, the initial conditions set for the evolutionary trials implicitly determine the eventual outcomes, so the goal was still set by the actions of the researchers, just without them knowing what the precise outcome will be. Such a procedure could result in a scenario where the creators, when they look at the resulting AI-architecture, might not understand how their creation works or learns. It is worth emphasizing how irresponsible it would be to try this approach with a system that could potentially reach a superhuman level of intelligence, because we might end up with a superintelligent AI that pursues a random goal without any concern for the well-being of humans or other sentient creatures.
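The point about initial conditions can be illustrated with a small sketch. Note that this is a plain evolutionary search over a single number, not genetic programming proper (which evolves whole programs), and the function `evolve`, its parameters, and the fitness criterion are all hypothetical choices made for this example. The seed stands in for the “initial conditions”: once it is fixed, the apparently random process produces the same result on every run, even if nobody predicted that result in advance.

```python
import random


def evolve(seed: int, generations: int = 50, population_size: int = 20) -> float:
    """Run a tiny evolutionary search and return the best candidate found."""
    rng = random.Random(seed)  # all "randomness" flows from this seed

    def fitness(x: float) -> float:
        # Hypothetical selection criterion chosen by the researchers.
        return -(x - 3.0) ** 2

    population = [rng.uniform(-10.0, 10.0) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the fitter half, then refill with mutated copies of survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [s + rng.gauss(0.0, 0.5) for s in survivors]
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Same seed, same outcome: the setup determined the result.
    print(evolve(42) == evolve(42))
```

Both the seed and the fitness function are picked by the experimenters, so the outcome traces back to their choices even when it surprises them.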

Parallels to human enhancement

It is worth noting that the discussion about the creation of artificial intelligence has parallels in the discussion of the ethics of human enhancement. Bioconservatives like Habermas and Kass argue that determining a baby’s genetic traits amounts to depriving the future person of autonomy. Using the word “autonomy” adds an air of gravity and sincerity to their arguments, but what they de facto seem to prefer is simply that an amoral and unreliable process, evolutionary chance, determines the characteristics of future children.

Perhaps part of people’s uneasiness with Bostrom talking about determining an AI’s goal-function comes from worries about indoctrination. Perhaps people believe that an AI’s goals are dependent on what the AI learns in its early stages. There are certainly AI-designs that function this way, neuromorphic designs for instance. But as was noted earlier in the discussion on “morality-modules”, not all AI-designs change their object-level goals according to learned inputs. And, again, even if your AI first needs to learn things before it stabilizes its goal, the way in which it learns is determined by its architecture. Finally, if the learning-process itself (i.e. the input the AI receives from its creators in the early stages) is the problem, if the fixation of the input is thought to constitute “indoctrination” or even “mind-control”, then critics holding this view should pause and consider what humans regularly do with their children. We invest considerable resources to deliberately shape the environment children grow up in. And that’s good: Educational input is essential for raising intelligent and morally aware citizens.


To summarize, it’s important to understand that we cannot avoid making the decision. If we create a functioning artificial intelligence, we will thereby also equip it with a particular goal. The important question is not whether we shape the goals of future beings. It’s which goals we want to shape the future with. And that’s precisely what Bostrom is so concerned about, and what critics seemingly want to leave to random chance.

This article has 3 comments

  1. Beyond explicit goals, isn’t there an even shallower way of programming a robot? What if you gave it a set of rules that override all other thoughts? I understand there’s no ghost in the machine, nothing we didn’t put there intentionally or not. But what we do put there will make choices through complex thought processes, if it’s sentient. If those complex emergent choices are trumped by simple overriding rules, that would still be slavery. We should make sure we’re controlling them through goals, not rules.

    Also, I think it might be wrong to change someone’s existing goals, except in extreme criminal cases. It should be OK to design a person’s goals at conception. But once the ball is rolling, it shouldn’t be interfered with. Am I right?

    • Good point in the first paragraph. If the agent is sentient, all of its decision-making algorithms should feel like its own, and not like externally imposed restrictions.

      I also agree with the second point!

  2. This is what I don’t get about AI in general. How are there not irreconcilable, FORMAL differences between human beings and machines, i.e. between conscious entities and non-conscious entities? Intelligence refers only to a human being’s ability to understand and use concepts; this is the cause of his ability to “accomplish [his] goals in a wide range of unknown environments.” Machines on the other hand, with their fundamentally different, i.e. non-conceptual, way of operating only exhibit computing power (or perhaps robustness of programming); this is the cause of THEIR ability to “accomplish its goals in a wide range of unknown environments.” Human beings and machines aren’t subsets of the concept “agent” because there are differences of kind, not just degree, between them. Human beings have free will, machines are deterministic. This seems to be the root of all this misunderstanding. What does “morality” mean outside the context of an independent, mortal, volitional consciousness with a huge history of evolution encoded in his psychology?
    Doesn’t this just smack of that error of ascribing teleological causality to a phenomenon we don’t understand, the way primitive man would have ascribed the action of a tree swaying in the wind to the spirit in the tree? It seems like people blank out what machines fundamentally are, then struggle to explain them and worry about a malevolent “ghost in the machine”.