Agents have an interest in achieving their own goals as well as possible. When we act accordingly, we are acting rationally. But what does this mean in practice: how do we act so as to best achieve our goals?
When we are absolutely certain about the consequences of our actions, the case is clear. We should choose those options which lead to the outcome we value the most. If my goal is to earn as much money as possible and I have a choice to receive either $10 or $100, the choice is clear: I should pick the $100.
However, neither the world nor our senses are structured in such a way that we can predict the consequences of our actions with certainty. For example, if I’m considering whether or not to buy a lottery ticket, most likely I’ll simply lose the money I paid. But there is a small probability that I will win more than the cost of the ticket. I can calculate these probabilities from the payout structure.
In other cases, we have no such clear information to calculate the chances of winning or losing. Nevertheless, we still always need to place probabilities on possible outcomes by using our best judgment and all the information and experience that we have. Once we have a clear picture of the probabilities involved, we do best to choose the action that maximizes our expected utility.
What is expected utility?
“Utility” refers to how much we value a particular outcome; this depends on our goals. For instance, if someone’s goal is to get as much money as possible, then the utility of an outcome would simply be the monetary value the person could bring into their possession. People’s real goals are of course much more complex, and it would be difficult to put exact numbers on, say, how much people value friendship. Nevertheless, people prefer some outcomes to others, and this is what the concept of utility quantifies. If an option gives me twice as much utility as another option, it means that, fully informed and unbiased, I would find it twice as good.
The expected utility of an action is the sum of the utilities of all possible outcomes, with each outcome weighted by the probability of its occurrence. The probabilities of all outcomes must sum to 1. The formula is as follows:
EU(A) = p₁·U(o₁) + p₂·U(o₂) + … + pₙ·U(oₙ)
EU(A) stands for the expected utility of action A,
pᵢ stands for the probability that outcome oᵢ will occur,
and U(oᵢ) is the utility of said outcome, if it occurs.
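To make this concrete, here is a minimal sketch in Python (the function name and data layout are our own illustration, not part of the original text):

```python
def expected_utility(outcomes):
    """Expected utility of an action, given its possible outcomes.

    `outcomes` is a list of (probability, utility) pairs; the
    probabilities must sum to 1.
    """
    assert abs(sum(p for p, _ in outcomes) - 1) < 1e-9, "probabilities must sum to 1"
    return sum(p * u for p, u in outcomes)
```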
To use an easy example, we will calculate the expected monetary value (EV) for a game of dice. You are offered a bet where you can either
a) receive $10 right now, or
b) play a game where you receive $66 if a fair, six-sided die shows a six on the first throw, and lose one dollar if it does not.
The expected value of option a) is simply $10, as only one outcome is possible. The expected value of option b) consists of two components that we need to add up. It is calculated as follows:
The probability that the die does not show a six is multiplied by the value of that outcome:
(⅚)*(-$1) ≈ -$0.83
The probability that the die shows a six is multiplied by the value of that outcome:
(⅙)*$66 = $11
Then we calculate the sum of both:
-$0.83 + $11 ≈ $10.17
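As a quick check, the same calculation in Python (assuming, as above, that utility is simply the number of dollars won):

```python
# Option a): $10 with certainty.
ev_a = 1.0 * 10

# Option b): $66 if the die shows a six (probability 1/6),
# lose $1 otherwise (probability 5/6).
ev_b = (1/6) * 66 + (5/6) * (-1)

print(ev_a)  # 10.0
print(ev_b)  # ~10.17
```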
It follows that I gain $0.17 more in expectation if I opt to participate in the dice game. Does that mean that someone is irrational when she chooses the $10 with certainty? Not necessarily! It depends on her utility function, that is, on what is important to her, and how much.
The difference between value and utility
It may well be that, in the above example, we would be disproportionately disappointed in the five out of six cases where we lose, compared to how happy we would be if we won. If a player also includes the worth of her own well-being in her considerations, then it could easily be the case that the extra disappointment is not outweighed by the expected extra $0.17 gained from choosing the riskier option.
There are many reasons why one might rationally choose the safe option. It could for instance be that we are in a situation in which we urgently need $10, e.g. we are in town without money and absolutely must buy a particular birthday gift for someone before the shops close. In this situation, it would be very bad for us to choose the small chance of winning a lot of money (and thus to maximize the expected amount of money we will own), because in five out of every six cases, we would not be able to afford the present and thus would disappoint the person we hoped to buy it for – something we’d consider much worse than forfeiting a mere 17 cents in expectation.
No one simply wants to maximize the amount of money they own. Having twice as much money is not always twice as good. Money (and goods in general) usually has diminishing marginal utility.
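To illustrate, here is a sketch using a square-root utility function, a common textbook stand-in for diminishing marginal utility; the starting wealth of $100 is our own hypothetical:

```python
import math

def utility(wealth):
    # Square-root utility: each additional dollar adds less utility
    # than the previous one (diminishing marginal utility).
    return math.sqrt(wealth)

wealth = 100  # hypothetical starting wealth in dollars

# Option a): the guaranteed $10.
eu_sure = utility(wealth + 10)                                           # ~10.49

# Option b): the dice game from above.
eu_gamble = (1/6) * utility(wealth + 66) + (5/6) * utility(wealth - 1)   # ~10.44

print(eu_sure, eu_gamble)
```

Under this concave utility function, the guaranteed $10 yields higher expected utility than the dice game, even though the game has the higher expected monetary value.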
It is often assumed that maximizing utility always has to do with egoism or the accumulation of money. This is an unfortunate misconception. In principle, all sorts of things can count as utility; it comes down to what our goals are. Altruistic goals, too, can perfectly well fall under the term, since utility simply encompasses everything we care about.
Why maximize expected utility?
Why should we maximize expected utility and not any other relation between probability of occurrence of an outcome and its utility?
Assuming that we are completely altruistic and that our utility increases linearly with every person we help (i.e. it would be n times better to help n people than it would be to help one person), consider the following scenario:
An epidemic has broken out on a small island. All 20,000 inhabitants may die from a disease that leads to an agonizing death by suffocation within three days of infection. Experts estimate that, without intervention, the entire island’s population will become infected and suffocate within a few weeks. The inhabitants do not have medicine to cure themselves, so they depend on external aid. We find ourselves on a larger neighboring island and are coordinating a rescue strategy. We face a choice of medicine: we can use SafeRelieve and/or CheapRelieve. Both compounds are curative as well as preventive. SafeRelieve costs $2.04 per person treated, and treated patients recover in 100% of cases. CheapRelieve is a cheaper but less reliable product: it heals completely in only 50% of cases and is otherwise not effective at all. It costs $1 per person. The budget for the rescue is strictly limited to $10,000. The doses will be distributed so that each person in the population has an equal chance of receiving treatment. How should we proceed?
Suppose we spend all our money on SafeRelieve; then we save 10,000 / 2.04 ≈ 4,902 inhabitants. If we instead spend all of it on CheapRelieve, we treat 10,000 / 1 = 10,000 people, of whom 5,000 recover in expectation. CheapRelieve may seem risky, but it is the better choice in expectation.
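A sketch of this comparison in code; the prices, cure rates, and budget are taken from the scenario above:

```python
BUDGET = 10_000                       # dollars
PRICE_SAFE, CURE_SAFE = 2.04, 1.0     # SafeRelieve: $2.04 per person, cures 100%
PRICE_CHEAP, CURE_CHEAP = 1.00, 0.5   # CheapRelieve: $1 per person, cures 50%

def expected_saved(dollars_on_cheap):
    """Expected number of people cured for a given budget split."""
    treated_safe = (BUDGET - dollars_on_cheap) / PRICE_SAFE
    treated_cheap = dollars_on_cheap / PRICE_CHEAP
    return treated_safe * CURE_SAFE + treated_cheap * CURE_CHEAP

print(expected_saved(0))       # all SafeRelieve: ~4,902 people
print(expected_saved(BUDGET))  # all CheapRelieve: 5,000 people in expectation
```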
Theoretically, it is possible (the chance of this happening is actually about 2.3%; see endnote c of Tomasik’s essay) that we end up unlucky, with CheapRelieve failing in so many cases that fewer people are helped by it than would have been helped by SafeRelieve. However, luck can go the other way too, and we might save even more than 5,000 people.
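The ~2.3% figure can be sanity-checked with the binomial distribution: among 10,000 people treated with CheapRelieve at a 50% cure rate, how likely is it that fewer than 10,000/2.04 ≈ 4,902 recover? A sketch using scipy (the precise value depends on the threshold convention, so treat the result as approximate):

```python
from scipy.stats import binom

n_treated = 10_000           # people treated with CheapRelieve
cure_rate = 0.5
safe_saves = 10_000 / 2.04   # ~4,902 people saved by SafeRelieve for sure

# Probability that CheapRelieve ends up curing fewer people than that.
p_worse = binom.cdf(int(safe_saves), n_treated, cure_rate)
print(p_worse)  # roughly 0.02, i.e. about a 2% chance of doing worse
```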
Splitting
If we split the money fifty-fifty, buying both SafeRelieve and CheapRelieve, then we help about 4,950 people in expectation. The SafeRelieve doses we distribute reduce the variance; that is, they reduce the chance that extremely few people will be saved if we happen to be unlucky. The price we pay for this is that we also reduce the chance that a very large number of people will be saved, and the expected number of people saved goes down. If spending all our money on CheapRelieve is better than spending all our money on SafeRelieve, then there seems to be no argument for investing any money in SafeRelieve at all. And the more doses we can buy, the clearer it becomes that CheapRelieve is the better choice, because its actual outcome becomes ever more likely to stay close to its (higher) expected one.
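A sketch of how the budget split affects both the expectation and the spread; the standard deviation follows from the binomial variance of the CheapRelieve doses, and the three split fractions are our own illustration:

```python
import math

BUDGET, PRICE_SAFE, PRICE_CHEAP, CURE_CHEAP = 10_000, 2.04, 1.00, 0.5

for fraction_cheap in (0.0, 0.5, 1.0):
    dollars_cheap = BUDGET * fraction_cheap
    saved_safe = (BUDGET - dollars_cheap) / PRICE_SAFE   # saved with certainty
    treated_cheap = dollars_cheap / PRICE_CHEAP
    mean = saved_safe + treated_cheap * CURE_CHEAP
    # Binomial variance n*p*(1-p); the SafeRelieve part adds no variance.
    std = math.sqrt(treated_cheap * CURE_CHEAP * (1 - CURE_CHEAP))
    print(f"{fraction_cheap:.0%} on CheapRelieve: {mean:.0f} saved ± {std:.0f}")
```

The fifty-fifty split cuts the spread but also lowers the expectation; even the all-CheapRelieve strategy varies by only about ±50 people around 5,000.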
Let’s consider another scenario: again we coordinate the rescue package, but this time we find out that we have received an additional donation of $51, with which we can buy more drugs. If we buy SafeRelieve, we guarantee that 51 / 2.04 = 25 additional people will be saved. If we buy CheapRelieve, we save 25.5 people in expectation. This time, there is a 44%(!) chance that CheapRelieve will save fewer people than SafeRelieve. Shall we, unlike with the large main budget, decide to rely on the safer option? The answer is no, because we should be concerned not only about risks but also about opportunities. Even more often than 44%, CheapRelieve will cure more than 25 people.
The additional $51 is not isolated; it is part of the overall budget. If we had started with a budget of $10,051, there would not have been any reason to change the strategy we chose previously. As we have seen above, it is better not to distribute the money between the two drugs.
The law of large numbers
The expected value essentially measures the average outcome that would occur if we went through the same scenario a huge number of times. CheapRelieve is the drug with the higher expected utility. In addition, the larger the number of people we treat, the less likely it is that the actual outcome ends up far below the expected one.
An argument for always maximizing expected value is the law of large numbers: if we went through the same scenario again and again, the total utility we accumulate is virtually guaranteed to be highest when we maximize the expected value of each individual decision.
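A quick simulation of the dice game from earlier illustrates this convergence (the number of rounds is our own choice):

```python
import random

def play_dice_game():
    # $66 if the die shows a six, otherwise lose $1.
    return 66 if random.randint(1, 6) == 6 else -1

n_rounds = 100_000
average = sum(play_dice_game() for _ in range(n_rounds)) / n_rounds
print(average)  # converges to ~10.17, the expected value per round
```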
This argument may be persuasive, but one could still point out that we really make the decision only once. Why should we maximize expected utility in one-off decision situations?
Arbitrariness
To respond to this, it’s useful to ask: if we are not going to maximize expected value, what else would we do? Once we begin to value outcomes less as their risk increases, the question becomes: how much less do we want to weight them? In principle, any form of weighting is possible, and none of the infinitely many alternatives stands out. Weighting each outcome by exactly its probability, that is, maximizing expected utility, is the unique non-arbitrary choice.
Axiomatic approach
Another argument for expected utility maximization is the axiomatic approach, based on the von Neumann-Morgenstern utility theorem. If a person states her preferences over a set of lotteries/bets, and these preferences are in accordance with four intuitively very plausible axioms (completeness, transitivity, continuity, and independence), then it can be proven that this person acts as if she were maximizing her expected utility. In other words, rejecting expected utility maximization means violating at least one intuitively very obvious axiom.
The altruistic perspective
For altruistic utility functions there is, interestingly, an additional argument for maximizing expected value even in isolated decision-making situations. We can make decisions as if from behind a veil of ignorance, asking ourselves which decision we would prefer as a randomly selected member of the group of affected beings. We would choose the course of action under which our chance of being saved is greatest. In the island example, we would always use all our money to buy CheapRelieve, even if only $51 worth of drugs can be purchased.
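The veil-of-ignorance calculation is short: as one of the 20,000 islanders, your chance of being saved is the expected number of people saved divided by the population (figures from the scenario above):

```python
POPULATION = 20_000

# All $10,000 on SafeRelieve: ~4,902 people saved with certainty.
p_saved_safe = (10_000 / 2.04) / POPULATION          # ~24.5%

# All $10,000 on CheapRelieve: 10,000 treated, each cured with probability 0.5.
p_saved_cheap = (10_000 / 1.00) * 0.5 / POPULATION   # 25.0%

print(p_saved_safe, p_saved_cheap)  # CheapRelieve gives every islander better odds
```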
Attitude to risk
Always choosing the option that maximizes expected utility is called being risk-neutral. Preferring safer outcomes, even when the expected value of another option is higher, is called being risk-averse. Because people tend to feel losses more strongly than gains of the same size, they often make irrational, risk-averse decisions.
People who argue that risk aversion can be rational often miss the difference between value and utility. If someone offers me a bet in which I put my entire fortune on a coin toss and receive one extra dollar on top of doubling it if I win, this bet has an expected monetary profit of $0.50. In terms of utility, however, it would obviously be much worse for me to lose all my belongings than it would be good for me to double my wealth. Being “risk-averse” with money sometimes (but not always!) makes sense precisely because, ultimately, it is utility that we want to maximize.
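A sketch with a logarithmic utility function, a standard textbook model of diminishing marginal utility, makes this concrete. The $100,000 fortune is our own hypothetical, and we leave a residual dollar in the losing branch so that the logarithm stays defined:

```python
import math

wealth = 100_000  # hypothetical current fortune in dollars

# Expected monetary profit of the coin toss: +$0.50.
ev_profit = 0.5 * (wealth + 1) + 0.5 * (-wealth)   # = 0.5

# Expected utility under u(x) = log(x); $1 is left over in the losing branch.
eu_bet = 0.5 * math.log(2 * wealth + 1) + 0.5 * math.log(1)
eu_no_bet = math.log(wealth)

print(ev_profit)           # 0.5: the bet looks good in money terms
print(eu_bet, eu_no_bet)   # ~6.1 vs ~11.5: far worse in utility terms
```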
References
This piece is based on Brian Tomasik’s essay “Why Maximize Expected Value?”.
Peterson, Martin (2009). An Introduction to Decision Theory. Cambridge University Press.