## Bayesian Robustness

As an addendum to my previous post I would like to show how sensitive that type of analysis is to our prior assumptions. Why would we do this? Well, let’s say we’re not really sure what a good estimate is for our prior assumptions. We can instead choose a range of values and see what happens to our posterior analysis.

I used beta distributions to generate random values for $P(X|\theta)$, $P(X)$, and $P(\theta)$ (with means centered around 0.99, 0.2, and 0.0001, respectively). After generating those random values I obtained the following density for $P(\theta|X)$:

This shows a heavily weighted right tail. For this particular example the skewness comes from our assumption about $P(X)$: it sits in the denominator, so extremely small values can cause the ratio to blow up. Nevertheless, the average state of our beliefs ($E[P(\theta|X)]\approx 0.0004$) says that we can be comfortable about our assumptions.
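If you want to reproduce this kind of simulation, here is a minimal sketch. The beta shape parameters below are my own guesses, chosen only so that the means match the ones above (0.99, 0.2, and 0.0001); the shapes behind the original plot aren't specified, so your numbers will differ somewhat.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Shape parameters are illustrative guesses that hit the stated means:
# Beta(a, b) has mean a / (a + b).
likelihood = rng.beta(99, 1, n)     # P(X|theta), mean 0.99
evidence = rng.beta(2, 8, n)        # P(X),       mean 0.2
prior = rng.beta(1, 9999, n)        # P(theta),   mean 0.0001

# Samples from the induced distribution of P(theta|X)
posterior = likelihood / evidence * prior

print(f"mean posterior:   {posterior.mean():.6f}")
print(f"median posterior: {np.median(posterior):.6f}")
print(f"99th percentile:  {np.quantile(posterior, 0.99):.6f}")
```

The median sitting well below the mean is the heavy right tail showing up in summary-statistic form.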

We could stop here, but I would like to dive into a little risk analysis. Wouldn’t you say that even a 1% risk is too much of a risk to allow Jeff Bezos to walk freely in American society, knowing he has a large potential for harm? Let’s say that if Bezos is an upstanding citizen, he provides some arbitrary unit of benefit to society (say +1 utility). However, if he is more concerned with bringing about the New World Order than selling Amazon Prime memberships–and he certainly has the economic means to bring about mass destruction–then we can say his negative contribution is a hundred times worse than his positive contribution (say -100 utility).

So what is my belief about Bezos’ expected contribution to society after observing the evidence against him? Using the above distribution, he still, on average, provides 0.96 units of utility to society.
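The expected-utility arithmetic is easy to check by hand: with utility $+1$ at probability $1-p$ and $-100$ at probability $p$, the expectation is $1 - 101p$. The 0.0004 below is just an illustrative stand-in for the simulated posterior mean, chosen because it reproduces the 0.96 figure.

```python
def expected_utility(p, upside=1.0, downside=-100.0):
    """Expected contribution given probability p of the bad scenario."""
    return (1 - p) * upside + p * downside

# With p around 0.0004, the expectation is 1 - 101 * 0.0004 = 0.9596
print(f"{expected_utility(0.0004):.4f}")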

Even though his downside is disproportionately negative (based on my assumptions) compared to his upside, my beliefs suggest that it’s still not really worth entertaining.

## Bayes’ Theorem

I wanted to start off my blog with something simple to dip my feet in the water, and so I’m going to go with Bayes’ Theorem. Here it is:

$P(\theta|X) = \frac{P(X|\theta) P(\theta)}{P(X)}$

This guy is single-handedly responsible for creating an entire branch of statistics, and it is so simple that its derivation is typically done in an introductory class on the topic (usually in the first couple of weeks, when you’re going over the basics of probability theory). It’s not until you get into more advanced classes that you realize it has a lot more to say than how big the overlap is between circles on a Venn diagram. When I was taught Bayes’ Theorem I just thought of it as a nifty little trick for converting $P(A|B)$ statements to something that uses $P(B|A)$. And as a student I said to myself, “Cool, good enough to do my homework and pass a test; I’ll never see this again.”

Fast forward two years and I actually DO see this bullshit again. And surprise, it’s used for EVERYTHING. In fact, there is an entire field of mathematics dedicated to understanding its properties. It provides a new way of looking not just at statistics, or at probability, but at human knowledge! Bayes’ Theorem tells us to stop looking at our knowledge as some fixed property. The world may contain true and false statements about itself, but our knowledge about it is constantly fluctuating with new evidence, and we need to update our ideas about the world accordingly.

So let’s look back at the original theorem. I don’t like the way it’s usually written. Instead I’d like to make a small adjustment.

$P(\theta|X) = \frac{P(X|\theta) P(\theta)}{P(X)} = \frac{P(X|\theta) }{P(X)} P(\theta) = \text{BayesianAdjustment}(X,\theta) P(\theta)$

My little improvement on Bayes’ Theorem consists of just highlighting the adjustment factor given by $\frac{P(X|\theta)}{P(X)}$. I believe this little ratio hasn’t been given the credit it deserves. If we think of $P(\theta)$ as our confidence in some statement, and $P(\theta|X)$ as our updated confidence in that statement (where $X$ is some evidence for or against it), then the adjustment factor tells us exactly how much our beliefs should change.

So let’s think of $\theta$ as some statement about the world. This could be anything, like say… “Jeff Bezos is an Illuminati shill.” Now personally I don’t believe this is true, but I like to think of myself as an open-minded individual, so I won’t completely rule it out. I will assign the accuracy of this statement some small probability. Let’s say $P(\theta)=0.0001$. So there is one chance in 10,000 that Jeff Bezos is an Illuminati shill.

So how can you tell if someone is actually working for the Illuminati? Well, every once in a while they’ll throw out a hand signal (sort of like a low key gang sign). See exhibit A:

So one day Jeff Bezos is giving a keynote address and he decides to sit down. Lo and behold the camera gives him a quick glance and this creep is throwing out this ungodly hand sign, signaling his complicity in a hostile world takeover by our Satanic overlords. Or he could’ve just rested his hands there for no particular reason (like I said, I’m open-minded). Let’s assign values to these two possible explanations…

Let’s call the act of giving the hand signal our evidence $X$. Then the probability of Bezos giving the hand signal if he is an Illuminati member is $P(X|\theta)=1.0$, and the probability of him putting his hands there (for no particular reason) is $P(X) = 0.2$. Looking at these numbers by themselves, the evidence seems pretty damning, but we still haven’t considered our prior assumptions about Jeff Bezos. Initially we pegged his odds of being a devil-worshipper at 1 in 10,000. Let’s plug all of these into Bayes’ Theorem and see what our updated confidence in Jeff Bezos’ Illuminati complicity should be, given this new piece of evidence…

$P(\theta|X) = \frac{P(X|\theta)}{P(X)} P(\theta)= \frac{1.0}{0.2} \times 0.0001= 0.0005$
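The same update is a one-liner in Python, written here with the adjustment factor pulled out so it mirrors the rewritten form of the theorem:

```python
def bayesian_adjustment(p_x_given_theta, p_x):
    """The ratio P(X|theta) / P(X) that rescales the prior."""
    return p_x_given_theta / p_x

prior = 0.0001
posterior = bayesian_adjustment(1.0, 0.2) * prior
print(f"{posterior:.4f}")  # the adjustment factor is 5, so 5 * 0.0001 = 0.0005
```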

Well, I definitely think he’s more likely to bring about the New World Order than I did before, but not by a significant enough margin to spout apocalyptic nonsense via ham radio…

Now I’d like to ask the reader: When was the last time you changed your mind about something? Can you assign numbers to your beliefs? If you would like to go beyond what I discussed, try asking yourself how robust your inferences are. How much do your posterior beliefs change based on your prior assumptions? Change up your values and come up with a basic “sensitivity” analysis.
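If you want a starting point for that sensitivity analysis, here is a toy sketch: it holds the likelihood fixed at 1.0 and sweeps the prior and $P(X)$ over a few values I picked arbitrarily, so you can see how hard the posterior leans on each assumption.

```python
# Toy sensitivity analysis: vary the prior and the evidence probability,
# keeping the likelihood P(X|theta) fixed at 1.0.
p_x_given_theta = 1.0

for p_theta in (1e-5, 1e-4, 1e-3):
    for p_x in (0.05, 0.2, 0.5):
        posterior = p_x_given_theta / p_x * p_theta
        print(f"prior={p_theta:g}  P(X)={p_x}  posterior={posterior:.6f}")
```

Even this crude grid makes the pattern obvious: the posterior scales linearly with the prior, and a rarer piece of evidence (smaller $P(X)$) amplifies the update.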

Let me see what you come up with!

-Mason

## Introduction to Mathematical Interpretations

Allow me to wax poetic for a little bit. Mathematics is like poetry: it is the art of conveying an idea in as efficient and concise a manner as possible. A beautiful equation can convey mountains of ideas in a single line, and to read and understand all of those ideas can take a few seconds or it can take a lifetime.

The purpose of this blog will be to unravel some of the mathematical statements I have come across, and interpret them in a way that I hope will be enlightening and enjoyable to others. The math I tend to enjoy generally comes from the fields of analysis, statistics and applied mathematics (some ideas I have for initial equations I would like to dive into include Bayes’ Theorem, Fisher Information, and the Kelly Criterion).

One final note: unlike poetry, mathematics isn’t usually thought of as aesthetically pleasing. I hope that by writing out the ideas behind equations I can demonstrate, just like with a haiku, how beautiful a small statement can be.

-Mason McElroy