Sunday 6 November 2011

Laurence Tribe: Maths on Trial (2)


Tribe’s reaction to the Kaplan-Cullison model

Tribe’s most important article on the subject of maths on trial is divided into two parts. One of them discusses the use of Bayes’ theorem at trial, and the other discusses the merits, and lack thereof, of the Kaplan-Cullison model for jury decisions, which we explained in an earlier post.

Today we’ll look at Tribe’s reaction to the Kaplan-Cullison model. (Reference: pp. 1381-1389 of Trial by Mathematics, Harv. L. Rev. 1970-1971).

In the first sentence of this part of his article, Tribe puts his finger on the obvious problem in the Kaplan-Cullison model: assigning numerical answers to such vague questions as “How much would you regret the erroneous conviction of this defendant for armed robbery?” Indeed, as he points out, the answer to this question can depend largely on the consequences of such a conviction, for example the consequences to the defendant’s children if he is a single father. Yet these consequences may have absolutely nothing to do with judging the case at hand, and therefore the “quantity of regret” (if it is quantifiable at all, which is already doubtful) may not be relevant in coming to a decision. For these reasons, Tribe says, “any equation designed to compute the threshold probability above which conviction would be preferable to acquittal would have to be far more complex than Kaplan and Cullison have supposed”.

Indeed, Tribe points out that the Kaplan-Cullison model contains some properties which are in inherent contradiction with the law itself. For example, in order to properly numerically assess their own preference for acquitting the defendant if he is guilty over convicting him if innocent, the trier would naturally need to consider as much information as possible about the consequences of conviction, for example to the defendant’s family, to his reputation etc., or the proposed length of his sentence, and also the consequences of his acquittal if he is guilty, for example if it is known that he holds many prior convictions for the same type of crime or has been engaging in behavior that can appear relevant. But these facts are generally kept from the jury in a trial, given that their duty is restricted to the sole determination of whether or not the defendant is guilty of the crime charged.

For example, in the very recent conviction of Vincent Tabak for the strangling murder of Joanna Yeates in England, the jury (and the public) learned only after his conviction that he had spent an enormous amount of time during the days and weeks preceding the crime in searching out pornography sites showing images of women being choked, with a particular concentration on blonde women some of whom bore a resemblance to Joanna. It was considered that this information could not provide the jury with any factual knowledge about whether Tabak was guilty, and it was therefore withheld.

Tribe concludes his analysis by explaining that in any case, no such model can be considered in the absence of an absolute numerical decision about what kind of precision is aimed at in the trial process. If it is known, for instance, that convicting 60% of the guilty correlates to the unfortunate conviction of 1% of innocent people, and convicting 80% of the guilty correlates to the conviction of 1.2% of innocents (correlations which are obviously very difficult to establish at all), then the trial process must be designed with a specific fixed goal in general corresponding to one such level of precision.

But this specific goal then flies in the face of some of the basic tenets of justice: the “presumption of innocence” and “acquittal in all cases of doubt”, since, as Tribe says, “After deciding in a deliberate and calculating way that it is willing to convict twelve innocent defendants out of 1000 in order to convict 800 who are guilty – because that is thought to be preferable to convicting just 6 who are innocent but only 500 who are guilty – a community would be hard pressed to insist in its culture and rhetoric that the rights of innocent persons must not be deliberately sacrificed.”

In conclusion, Tribe rejects the use of a mathematical model because the rights threatened by fixing specific desired proportions of success versus failure at trial go deeper than a simple question of desirable outcomes.

“The presumption of innocence, the rights to counsel and confrontation, the privilege against self-incrimination, and a variety of other trial rights, matter not only as devices for achieving or avoiding certain kinds of trial outcomes, but also as affirmations of respect for the accused as a human being – affirmations that remind him and the public about the sort of society we want to become and, indeed, about the sort of society we are.”

Laurence Tribe: Maths on Trial (1)


Who is Laurence Tribe?



Laurence Tribe is the grandfather of all the lawyers who have written scholarly articles on the basic topic of this blog - the use of mathematics in trial situations. Born in 1941, Tribe majored in mathematics as an undergraduate before attending Harvard Law School, and served as a law clerk for two years before joining the faculty of Harvard Law. He has argued many cases in court, no fewer than 34 of them before the U.S. Supreme Court, and published a number of scholarly articles. Until recently, he was on leave from Harvard to work as "senior counselor for access to justice" in the Justice Department of the Obama administration (he calls Obama "the best student I ever had"). But he resigned from this position, ostensibly for medical and personal reasons, although he was also involved in a vigorous protest against the Obama administration's treatment of Private Bradley Manning of Wikileaks fame.

In my eyes, what makes Tribe outstanding is his contribution to the study of the role of mathematics in trials. By an astonishing coincidence, this former math major was involved in one of the seminal cases in which probability calculations were involved to identify the perpetrators of a robbery (details in our book Maths on Trial), and this perhaps sparked his abiding interest in the subject.

In the spring of 1971, a remarkable article about the use of mathematics in trials was published in the Harvard Law Review by M. Finkelstein and W. Fairley:

A Bayesian Approach to Identification Evidence, by M. Finkelstein and W. Fairley (83 Harvard Law Rev. 489, 1971).

In response, Tribe published the seminal article Trial by Mathematics, Precision and Ritual in the Legal Process, by Laurence Tribe (84 Harvard Law Rev. 1329, 1971).

I propose to investigate the contents of both articles, little by little, and subjectively and not in order. Both, and especially Tribe's, are simply packed with fascinating reflections on the nature of law, justice and the trial system, and the role and dangers of mathematics in that context.

Tribe's article contains a detailed discussion of the dangers of using mathematics (in particular, using them in the way that Finkelstein and Fairley recommend, using Bayes' theorem) in trials in two different ways. The lion's share of the article deals with the use of mathematics to weigh the probability of specific evidence in a specific trial; the second, smaller part concerns his reactions to mathematical models for jury decisions. We’ll go over the main ideas Tribe expresses in a series of upcoming posts.

Sunday 18 September 2011

Bayes' theorem - examples

The Cancer Test Problem

Remember Bayes' theorem:


P(A|B) = P(B|A) * P(A)/P(B)


The following problem is a very famous case of how to use the theorem.

We have a cancer-detecting test which gives a positive result for 90% of people who do have the cancer, but also gives a positive result for 10% of people who don’t actually have the cancer. A patient comes in and gets a positive result. How worried should they be? (In other words, what is the chance that they do actually have cancer?)

Well in fact, Bayes' theorem tells us that we don’t actually have all the information necessary to answer this question.

Indeed, let’s set:
Event A = person has cancer
Event B = test is positive

We are looking for P(A|B). We know P(B|A) (it’s 90%) and P(B|nonA) (it’s 10%). Knowing P(B|nonA) is actually as much information as knowing P(B), because we can calculate P(B) from this and P(A) using the total probability law – we’ll see this later. So we still need to know P(A).

Suppose the question now is: 2% of people have this cancer. We have a cancer-detecting test which gives a positive result for 90% of people who do have the cancer, but also gives a positive result for 10% of people who don’t actually have the cancer. A patient comes in and gets a positive result. How worried should they be? (In other words, what is the chance that they do actually have cancer?)

A lot of doctors were asked this question. Only 15% of them got it right (this article cites a few studies for this result – it’s also a very interesting and entertaining read). They generally estimated the chance that the person did indeed have the cancer to be very high, close to 90%.

But what is the correct number?

A more intuitive way of thinking about this problem is the following:

Take a pool of 1,000 people.
  
20 of them have cancer
    18 of these will have a positive reading on the test
980 of them do not have cancer.
    98 of these will have a positive reading on the test

So in total, 116 people will have a positive reading on the test, and 18 of these will actually have cancer. So the probability that a person with a positive reading does actually have cancer is 18/116 = 15.5% which is still relatively low. So it would be a far better approach to understand the maths involved in this, and not freak your patient out without good reason.
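For readers who like to check the head count by machine, here is a minimal sketch of the same pool-of-people reasoning. The percentages are the ones given in the problem; the pool size of 1,000 and the variable names are just convenient choices for the illustration.

```python
# Natural-frequency version of the cancer test problem.
# Rates come from the problem statement; the pool size is arbitrary.
pool = 1000
prevalence = 0.02            # P(A): share of people who have the cancer
true_positive_rate = 0.90    # P(B|A): positive test given cancer
false_positive_rate = 0.10   # P(B|nonA): positive test given no cancer

with_cancer = pool * prevalence                          # 20 people
without_cancer = pool - with_cancer                      # 980 people
true_positives = with_cancer * true_positive_rate        # 18 people
false_positives = without_cancer * false_positive_rate   # 98 people

all_positives = true_positives + false_positives         # 116 people
share_with_cancer = true_positives / all_positives

print(f"{true_positives:.0f} of {all_positives:.0f} positives have cancer "
      f"({share_with_cancer:.1%})")
# prints: 18 of 116 positives have cancer (15.5%)
```

Changing `pool` changes the head counts but not the final share, which is why the pool size does not matter.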

Let’s now calculate the probability using Bayes theorem – hopefully, we will get the same result.

Our first step is to calculate P(B). We do this using the law of total probability:

P(B) = P(B|A)P(A) + P(B|nonA)P(nonA)
        = 0.9*0.02 + 0.1*0.98
        = 0.116

Now Bayes’ theorem: P(A|B) = P(B|A)P(A)/P(B) = 0.9*0.02/0.116 = 15.5% (yes!!)
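The two steps above (total probability, then Bayes' theorem) can be wrapped in one small function. This is only a sketch; the function name `posterior` and its parameter names are ours, not standard notation. It also makes it easy to see how strongly the answer depends on the prior P(A).

```python
def posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|nonA)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|nonA)P(nonA)
    p_b = p_pos_given_a * prior + p_pos_given_not_a * (1 - prior)
    # Bayes' theorem: P(A|B) = P(B|A)P(A) / P(B)
    return p_pos_given_a * prior / p_b

# The cancer test problem: 2% prevalence, 90% / 10% test rates.
print(f"{posterior(0.02, 0.9, 0.1):.1%}")   # prints: 15.5%

# Same test, different priors (these extra priors are hypothetical).
for prior in (0.001, 0.02, 0.2, 0.5):
    print(prior, round(posterior(prior, 0.9, 0.1), 3))
```

With this particular test, the intuitive answer of 90% would only be right if half of all people had the cancer (prior 0.5).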

This example question is detailed in this video, which also presents some other very interesting counterintuitive statistical issues. The part we’re interested in is at 11 minutes, but the entire thing is worth a watch.


The Prosecutor's Fallacy

Bayes’ theorem often applies when considering the probability that a person is guilty of a crime. Indeed, misunderstanding it leads to what is called the prosecutor’s fallacy – when you interpret the small probability of someone fitting the evidence as a small probability that an accused who does fit the evidence is in fact innocent.

Let’s consider the following case:

In a murder case you have found a sample of the murderer’s DNA, and there is a 0.1% chance of a random someone’s DNA matching this sample. You have found a man whose DNA does match.

Then the correct interpretation is NOT that there is a 0.1% chance that this man is not the murderer, i.e. that there is a 99.9% chance that he is the murderer.

Bayes’ theorem tells us that in order to calculate this last probability – the probability that the man is guilty, given that he matches the DNA, one also needs to take into account the probability of a random person being a murderer, which is extremely low, say it is 0.01%.

Let’s use the following notation:

Event A = The man is guilty
Event B = The man’s DNA matches the one found

Then we have
P(B) = 0.1%
P(A) = 0.01%

The probability we are interested in is P(A|B): the probability that the man is guilty, given that his DNA matches the killer’s.

Bayes’ theorem gives us that P(A|B) = P(B|A)*P(A)/P(B)

Now P(B|A) is the probability that the man’s DNA would match the killer’s, if he is indeed the killer. Which should be pretty close to 1, if your DNA testing is any good! P(A) and P(B) are given above, so in the end:

P(A|B) = 10%

Most definitely not a cause for putting someone in prison!
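Here is the same calculation as a short sketch, set side by side with the fallacious reading. We take P(B|A) to be exactly 1, which is a simplification of "pretty close to 1". The last few lines are our own variation, not part of the example: they treat the 0.1% as the match chance for an innocent person and rebuild P(B) with the law of total probability, which moves the answer only slightly.

```python
p_guilty = 0.0001        # P(A): 0.01%, prior chance a random person is the murderer
p_match = 0.001          # P(B): 0.1%, chance a random person's DNA matches
p_match_if_guilty = 1.0  # P(B|A): assumed to be exactly 1 here

# The prosecutor's fallacy: reading 1 - P(B) as the chance of guilt.
fallacious = 1 - p_match
print(f"fallacious reading: {fallacious:.1%}")        # prints: 99.9%

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_guilty_if_match = p_match_if_guilty * p_guilty / p_match
print(f"Bayes' theorem:     {p_guilty_if_match:.1%}")  # prints: 10.0%

# Variation (our assumption): 0.1% is P(B|nonA), and P(B) is rebuilt
# from the law of total probability.
p_match_total = p_match_if_guilty * p_guilty + p_match * (1 - p_guilty)
variant = p_match_if_guilty * p_guilty / p_match_total
print(f"with total probability: {variant:.1%}")
```

Either way, the probability of guilt from the DNA match alone is around one in ten, nowhere near 99.9%.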

Obviously, in actual cases, this is not the only thing to take into account. If the man’s DNA matches the killer, and he also matches a description of the killer, and has no alibi, and some shoes were found in his house covered in blood, the odds would change somewhat.

Wednesday 7 September 2011

Bayes' Theorem - Introduction

Lawyers often talk about Bayesian analysis. It is, in fact, one of the major ways in which maths play a role in law: it is used for estimating the probative value of certain pieces of evidence in the presence of other evidence.

We will want to refer to Bayes' theorem quite often, so this post is devoted to a simple and complete explanation.

First of all, let’s introduce some notation. We write:


P(A) = probability of A = probability that the event A happens

P(A|B) = probability of A given B = probability that the event A happens, given that the event B has happened.

Bayes’ theorem relates P(A|B) to P(B|A). And it turns out that the relation involves what is called the prior probabilities – the probabilities P(A) (probability that A happens, without considering B at all) and P(B) (probability that B happens, without considering A at all).

The exact relationship is the following:

P(A|B) = P(B|A) * P(A) / P(B)
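One way to convince yourself of this relationship is to count cases. The sketch below uses invented counts for 100 hypothetical cases (the numbers are made up purely for illustration) and checks that P(A|B)P(B) and P(B|A)P(A) are both just the probability that A and B happen together, which is all the theorem really says.

```python
from fractions import Fraction

# Hypothetical counts for 100 cases, invented for this illustration.
both = 4        # A and B both happen
a_only = 1      # A happens, B does not
b_only = 6      # B happens, A does not
neither = 89    # neither happens
total = both + a_only + b_only + neither

p_a = Fraction(both + a_only, total)          # P(A)
p_b = Fraction(both + b_only, total)          # P(B)
p_a_given_b = Fraction(both, both + b_only)   # P(A|B)
p_b_given_a = Fraction(both, both + a_only)   # P(B|A)

# Both products equal P(A and B).
assert p_a_given_b * p_b == p_b_given_a * p_a == Fraction(both, total)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
print(p_a_given_b, "==", p_b_given_a * p_a / p_b)   # prints: 2/5 == 2/5
```

Exact fractions are used instead of decimals so that the equality holds without rounding error.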

Now, this may not mean anything to you right now, but it is actually very counter-intuitive. People generally feel like they have a good idea of what P(B|A) is if they know P(A|B) and just one of the prior probabilities, like P(B). But depending on what the prior probability P(A) is, that good idea may in fact be completely off.

For example, imagine you are a teacher, and you give your students a very difficult test, on which only 5% of your students get an A every year. Obviously a good (or bad) way for a student to increase his chances is to cheat – say that cheating gives him a 40% chance of getting an A. Now, imagine that you have a student who has got an A. As the teacher, you might be tempted to think that the student has cheated. Well, Bayes’ theorem tells you that you cannot know how likely it is that this is the case if you don’t consider the prior probability of cheating.

First, let’s figure out the information we do have:

Call A the event of getting an A on the test, and B the event that the student has cheated.
Then we know P(A|B) = 0.4 and P(A) = 0.05.

We want to know P(B|A), the probability that the student has cheated, given that he got an A.

Since we need to know the probability of cheating, P(B), let's consider two different situations.

First situation:
suppose this was a fairly relaxed test setting, where students were not supervised very carefully, and you estimate that 10% of them cheated, ie P(B) = 0.1.

Then Bayes’ theorem tells us that P(B|A) = P(A|B)*P(B)/P(A) = 0.4*0.1/0.05 = 0.8. In other words, if the student got an A, then there's an 80% chance that he did it by cheating.

Second situation: suppose this was a very controlled exam, where students were in individual rooms with individual supervisors. The chance of cheating is still not 0, but it’s definitely a lot smaller, say just 1%, or 0.01. Then Bayes’ theorem tells us that P(B|A) = 0.4*0.01/0.05 = 0.08, or 8%. So if you have a student who got an A, it's not that likely that he cheated, and you need not be too concerned.
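The two situations differ only in the prior P(B), so they are easy to compare in a few lines. This is a sketch; the function name `p_cheated_given_a` is ours, and the consistency check inside it is our own addition (the inputs cannot be allowed to produce a "probability" above 1).

```python
def p_cheated_given_a(p_a_given_cheat, p_cheat, p_a):
    """Bayes' theorem: P(B|A) = P(A|B) * P(B) / P(A)."""
    result = p_a_given_cheat * p_cheat / p_a
    if result > 1:
        # Cheaters who get an A cannot outnumber all students who get an A.
        raise ValueError("inconsistent inputs: P(A|B)*P(B) exceeds P(A)")
    return result

p_a_given_cheat = 0.4   # P(A|B): a cheater gets an A 40% of the time
p_a = 0.05              # P(A): 5% of all students get an A

relaxed = p_cheated_given_a(p_a_given_cheat, 0.10, p_a)   # 10% cheat
strict = p_cheated_given_a(p_a_given_cheat, 0.01, p_a)    # 1% cheat

print(f"relaxed exam: {relaxed:.0%}")   # prints: relaxed exam: 80%
print(f"strict exam:  {strict:.0%}")    # prints: strict exam:  8%
```

With these figures the cheating rate cannot exceed 12.5% (that is, 0.05/0.4), otherwise cheaters alone would account for more A grades than are actually awarded.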

Of course, this is a rather simple problem – it is a lot more probable that the student cheated if he was in an environment where it was easy to cheat, than if he wasn’t – not exactly surprising. But the problems can get a lot more confusing than that, as we will see next time!

Saturday 3 September 2011

Mathematical Model for Jury Decisions


Kaplan and Cullison: Jury Decision Model

The concept of “reasonable doubt” is one of the most difficult to quantify in the whole of legal theory. Suppose you’re on a jury and you’ve seen all the evidence, and you’ve come to some conclusion about the probability of the defendant's guilt: 75%, say, or 90%. Should you vote for a conviction?
 
In this post we are going to discuss a mathematical model proposed by John Kaplan and Alan Cullison, which is supposed to help jurors, who have already assessed the probability of guilt of the defendant, to decide what verdict to return. The novelty and importance of their idea is that in order to reach a decision, the juror must make some kind of measurement of his own personal degree of repugnance at the idea of acquitting a guilty person, a degree which can obviously vary greatly according to the crime being judged, and the danger of its being repeated if the culprit is acquitted.
 
Our explanation of the model comes from a seminal 1971 article by Laurence Tribe: Trial by Mathematics, Precision and Ritual in the Legal Process (84 Harvard Law Rev. 1329, 1971). Tribe’s rebuttal of the model, and his discussion of the use of mathematics in trials in general, is complex and fascinating, and will be the subject of a series of future posts.

Kaplan and Cullison’s model for jury decision-making.

Let's say that the trial is over, all the evidence has been seen, and the trier (i.e. the jury member) assesses the probability of guilt of the accused as some probability value P between 0 and 1.
Now the trier must decide on a verdict by choosing between two acts: convict or acquit.

There are four possibilities for the outcome of a trial:
C_G (conviction of a guilty person)
C_I (conviction of an innocent person)
A_G (acquittal of a guilty person)
A_I (acquittal of an innocent person).
The trier will assign numerical values between 0 and 1 to each of these possibilities. He begins by taking C_G=1 (most desirable) and C_I=0 (least desirable).
Next, the trier must assign values to A_G and A_I according to the following procedure. Start with A_G.

The trier asks himself the following question: "Would I rather have a result that I know to be A_G, or take a 1/2 - 1/2 chance between C_G and C_I?”


If he realizes that he would prefer A_G, then he knows that the value of A_G will be greater than 1/2. Now he will try to see if it's greater than 3/4 by asking himself: "Would I rather have a result that I know to be A_G, or take a 3/4 - 1/4 chance between C_G and C_I?”
And so on, until he closes in on an actual value for A_G. He then uses the same procedure for A_I.
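The questioning procedure is essentially a bisection search, and it can be sketched in a few lines. In the sketch, the trier is represented by a function that answers one question: "do you prefer the sure outcome to a gamble giving C_G with probability q and C_I with probability 1-q?" The function names, the number of rounds, and the simulated trier with a hidden value of 0.9 are all our own assumptions for the illustration.

```python
from typing import Callable

def elicit_value(prefers_sure_outcome: Callable[[float], bool],
                 rounds: int = 10) -> float:
    """Close in on the value of an outcome by repeated either/or questions.

    Since C_G = 1 and C_I = 0, a gamble giving C_G with probability q
    is worth exactly q, so each answer halves the range of possible values.
    """
    low, high = 0.0, 1.0
    for _ in range(rounds):
        q = (low + high) / 2           # first 1/2, then 3/4 or 1/4, ...
        if prefers_sure_outcome(q):
            low = q                    # the outcome is worth more than q
        else:
            high = q                   # the outcome is worth at most q
    return (low + high) / 2

# A simulated trier whose (hidden) value for A_G is 0.9.
hidden_value = 0.9
estimate = elicit_value(lambda q: hidden_value > q)
print(round(estimate, 3))   # close to 0.9
```

A real trier would of course answer from introspection rather than from a hidden number, and might not answer consistently, which is part of what Tribe objects to.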

There are many things to keep in mind when making the decisions of what one would prefer. The choice of a value for A_G is particularly delicate, because if the outcome of the trial is the acquittal of the culprit, then the jury may feel some responsibility if, for example, the crime was a type of brutal murder which then occurs again after the criminal's acquittal. In such a situation, the trier may well feel that A_G is not preferable to a 50-50 chance between C_G and C_I. The value for A_G is never likely to be less than 1/2, since few people if any would take a chance between C_G and C_I if the likelihood of C_I is actually perceived as greater than C_G. But 0.5 may be an acceptable value, or, if the crime is not likely to re-occur, A_G may end up being a high value such as 0.9, in order to minimize the chance of convicting an innocent.


The same procedure is used to determine a value for A_I, but the meaning is different. On the whole, A_I is a better outcome than A_G, because at least the trier has not erred in his work. Therefore the trier is unlikely to take much of a risk of convicting an innocent, in comparison to acquitting an innocent, and the value of A_I will tend to be significantly higher than 0.5. On the other hand, the trier may deeply dislike the outcome A_I because if the real culprit is in fact free, then there is a danger that he will continue his crimes, so he may not want to simply set A_I=1; he may prefer to have a good chance of having convicted the guilty party than to be certain of having acquitted someone innocent.
Still, on the whole, A_I is likely to be quite high, higher than A_G.

Now that the numbers P, A_G and A_I have been fixed subjectively by the trier, the Kaplan-Cullison model suggests the following calculation to decide between conviction and acquittal.
If the trier chooses to convict, he will get C_G with a probability of P, and C_I with a probability of 1-P. Define the "expected utility" UC of the choice “convict” by the standard weighted formula

UC = P C_G + (1-P) C_I, which in fact is always just equal to UC = P.


Similarly, if the trier chooses to acquit, there's a probability of P that he'll actually get A_G and (1-P) that he'll actually get A_I, so we can define the "expected utility" UA of the choice “acquit” by


UA = P A_G + (1-P) A_I.


Both UC and UA are numbers, and the model says all we have to do is compare them.


If UC > UA, then choose to convict. If UA > UC, then choose to acquit.


Examples

Suppose the trier is only 75% convinced of the guilt of the accused. The trier has done all of the above calculations and has fixed A_G at 0.5 and A_I at 0.9.

Then according to the formulas, UC = .75 and UA = .75 x .5 + .25 x .9 = .375+.225=.6.


Thus UC > UA so in this situation, the trier should vote to convict.
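The comparison is simple enough to write out as a short sketch. The function names are ours, and the "indifferent" result for an exact tie is our own assumption, since the model as described here only covers the cases UC > UA and UA > UC.

```python
def expected_utilities(p, a_g, a_i, c_g=1.0, c_i=0.0):
    """Return (UC, UA) for guilt probability p and outcome values."""
    uc = p * c_g + (1 - p) * c_i   # UC = P C_G + (1-P) C_I, which is just P
    ua = p * a_g + (1 - p) * a_i   # UA = P A_G + (1-P) A_I
    return uc, ua

def verdict(p, a_g, a_i):
    uc, ua = expected_utilities(p, a_g, a_i)
    if uc > ua:
        return "convict"
    if ua > uc:
        return "acquit"
    return "indifferent"           # exact tie: not covered by the model

uc, ua = expected_utilities(0.75, a_g=0.5, a_i=0.9)
print(round(uc, 3), round(ua, 3))            # prints: 0.75 0.6
print(verdict(0.75, a_g=0.5, a_i=0.9))       # prints: convict
```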


This result may seem really bizarre in view of the injunction to vote for a conviction only if convinced of guilt "beyond a reasonable doubt". Clearly a 75% probability of guilt is not beyond a reasonable doubt. Yet the Kaplan-Cullison model leads to a recommendation to convict.

On the other hand, the choices for A_G and A_I above are not necessarily the most normal choices. There are many other possible attitudes. In the case of an unimportant crime, the trier may set A_G=1 and A_I=1, meaning that he would rather acquit the accused, whether innocent or guilty, than accept even the smallest chance of convicting an innocent person. If so, then he will find that UA=1, meaning no matter what the value P of his conviction that the accused is guilty, even if P=99%, he should vote to acquit.
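Rearranging the inequality UC > UA gives a single threshold on P, which makes the effect of the trier's choices easier to see. The rearrangement is ours and is offered as a sketch: P > P A_G + (1-P) A_I is the same as P (1 - A_G + A_I) > A_I, so the model recommends conviction exactly when P exceeds A_I / (1 - A_G + A_I).

```python
def conviction_threshold(a_g, a_i):
    """Smallest guilt probability above which UC > UA (with C_G=1, C_I=0)."""
    denominator = 1 - a_g + a_i
    if denominator == 0:
        # Only happens when A_G = 1 and A_I = 0; then UC can never exceed UA.
        raise ValueError("no threshold: conviction is never preferred")
    return a_i / denominator

# First example: A_G = 0.5, A_I = 0.9.
print(round(conviction_threshold(0.5, 0.9), 3))   # prints: 0.643

# Second example: A_G = 1, A_I = 1.
print(conviction_threshold(1.0, 1.0))             # prints: 1.0
```

In the first case the trier would convict at anything above about 64% certainty, which is why 75% was enough. In the second case the threshold is 1, and since P can never exceed 1, the model never recommends conviction.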

In essence, what the model is suggesting is that the concept of "beyond a reasonable doubt" be replaced by the concept of "utility", with the trier taking into account the negative aspects of acquitting a guilty person or convicting an innocent.
This marks a deep rift with respect to the tradition underlying the way trials are conducted and jury decisions are made. It is an important question and one well worth considering.

In a series of upcoming posts, I will introduce the work of Laurence Tribe, and in particular explain his reasons for rejecting this model.
These reasons are deep and fascinating, and go far beyond the realm of mathematics or even legal theory, into the domain of psychology.

Monday 4 July 2011

Casey Anthony: Defense's logical fallacies

Closing arguments are finished in the Casey Anthony trial taking place in Orlando, Florida. The prosecution's rebuttal is finished. The jury is going to start deliberations today.

The closing arguments by the defense took four hours, as against the prosecution's hour and a half.

The prosecution's arguments were short and sweet. Attorney Jeff Ashton went through every single thing that Casey did and said during the 31 days following the disappearance of her little daughter Caylee. Every lie that she told her mother and her friends, every party, every date, every phone call, even her "Bella Vita" tattoo. Ashton talked about Casey's MySpace password "Timer55", and explained that she'd told her brother Lee that Timer55 referred to the 55 days between Caylee's disappearance and Caylee's 3rd birthday, when she knew that her mother, Cindy Anthony, would no longer accept being put off, and would require and demand to see her granddaughter. The way Ashton reconstructed it, Casey was hoping for just 55 days of freedom, 55 days to live with her lover, party and do what she wanted. She didn't flee across country or try to disappear. She just hung out with her friends in Orlando, enjoying herself. Casey Anthony had a short term mentality. She thought she could fob her mom off for those 55 days, "et après le déluge".

She didn't reckon with Cindy's persistence. After 31 days, Cindy had had enough, started calling Casey's friends, tracked Casey down, brought her home, discovered that her granddaughter was missing, and called the police. The web of lies that Casey had been building for a month, in fact for many many months - that she worked, that Caylee had a nanny, that Caylee was being well looked-after - fell apart under police investigation.

According to Ashton, on June 15, 2008, Casey fought with her mom, scooped up her daughter, left home, hopefully gave the child chloroform (the words "chloroform" and "how to make chloroform" were entered into Google search no less than 84 times from the Anthonys' computer), smothered her, left the little body in the trunk of the car, and went off to party. Two or three days later, she disposed of the body by placing it in a laundry bag and leaving it in a marshy swamp minutes from the Anthony home. But not before the car had become completely inundated with the smell of human decomposition.

The defense counters Ashton's depiction with another theory: that Caylee got out of the house by herself early on the morning of her disappearance - the defense proves by photographs that she was able to open the back door by herself - climbed the ladder to the above-ground pool - here again the defense proves by photographs that she was able to do this - and drowned. Then, according to the defense, George Anthony found her, and Casey was awoken by his screaming and shouting that she was guilty for what had happened. George, according to the defense, got rid of the body and forced Casey to keep completely silent about what had happened, under threat and also because, according to the defense, years of sexual abuse had taught Casey to lie, to deny reality, and to obey her father.

Unfortunately, the defense's arguments fall down in a few major places. The main one, of course, is that defense attorney Jose Baez couldn't get a single witness to admit to knowing anything whatsoever about any alleged abuse of Casey by George. In fact, when Baez drew from Casey's former boyfriend the admission that Casey had confided "a big secret" to him, and thunderingly asked whether that secret was that Casey had been abused by her father, the witness responded by a simple "No". It turned out that the secret was that her brother had once groped her. It also emerged that her father, a former police officer, sometimes hit her when he was angry.

Judge Belvin Perry, who to my mind has emerged as a star of the bench and an example of American common sense, plain thinking and straight justice, forbade Baez to even mention sexual abuse in his closing statements. You can't just invent stuff without even a shred of evidence.

Undeterred, Baez sketched out the rest of his theory in detail in his closing arguments, essentially substituting George's influence over Casey for the sexual abuse. He spent most of his time explaining that the prosecution didn't really prove anything. He concentrated intense attacks on George Anthony.

Both sides left out some very important things. Ashton mentioned the obvious question of why Casey should kill her child rather than simply leave Caylee with her parents and disappear to make a new life for herself elsewhere, but explained it with the simple "Cindy would never have permitted that". How many kids run away - and how many parents actually permit them to do so? I thought that was a flaw in his argument. There's a much better answer: Casey was wildly jealous of Cindy. Caylee shared her affection between her mom and her grandmother, and there's evidence that Casey sometimes thought Caylee preferred her grandmother. During one of her lengthy lies to police, she claimed that the nanny who had kidnapped Caylee had allowed her to talk to Caylee on the phone. She said something to the effect that Caylee seemed happy and didn't ask to see her, and then added some oddly significant words. "She wouldn't have been that way with my mom." Leaving Caylee with Cindy would in some sense have meant that Cindy won. Ashton stressed that Casey killed Caylee in order to have fun and freedom. He didn't go into the possibility that Casey killed Caylee to hurt her mother, perhaps because it's hard to find actual proof of such a statement. But it's real.

Ashton also avoided talking about how Casey was finally - after months - persuaded to admit that her entire story about a nanny called Zanny was fiction, and what she said next. This is something that apparently we will never get to know. Some say that she invented the name "Zanny the nanny" because of a children's book with that character; others say that she gave Caylee frequent doses of Xanax to put her to sleep while she partied, and "Zanny" was a nickname for "Xanax". Who knows?

As for Baez, what he failed to mention are some inconsistencies which, to my mind, are absolutely fatal to his argument.

If George Anthony hid the body and forced his daughter to remain silent about it, then some of his subsequent actions make no sense whatsoever.

Baez made much of his attempted suicide (for guilt?), his words "I have always let each of you down in more ways than I can remember" (abuse?), his use of duct tape around the house.

He thought he had an answer for everything. Why should there be duct tape on the face of a child who accidentally drowned? Well, the meter reader who found the body in the marsh took the skull home and put duct tape on it to hold the pieces together before putting it back in the woods and calling the police - duct tape that just happened to be the same rare and no longer produced brand that George Anthony had in the house. What the... ?

But hey, what about this?

When George Anthony picked up Casey's car from the tow yard, two days before Cindy got Casey to admit that Caylee had in fact disappeared, it smelled horribly and strongly of human decomposition. George stated that as a former police officer, he knew that smell, and he recognized it beyond a doubt. He was afraid to find the corpse of Casey or Caylee in the trunk, but instead, he found only a bag of trash.

George Anthony did not call the police that day, he says, because as far as he knew there was no reason to: he and Cindy weren't yet aware that anything was wrong, Casey was still fobbing them off with happy stories about how busy she and Caylee were and how much fun they were having. But he must have told Cindy that the car smelled like a dead body, because she stated this when she called 911 two days later, after learning that Caylee was missing, and she herself wouldn't have had any opportunity to recognize the odor.

But think about this. If George Anthony were guilty of getting rid of Caylee's body, why would he have told Cindy that he recognized the odor of human decomposition in the trunk of Casey's car? To get Cindy worried about the possibility of Caylee's death and to bring on an investigation would be the last thing he would have wanted. If he were guilty of covering up an accidental death, he wouldn't have wanted anything related to death to be noticed. He would have told her that the car stank of rotting trash.

And, what about the stain? Why would George Anthony ever have mentioned a new stain in the trunk that he had never seen before? How much wiser, if he were guilty (of anything at all), to have kept quiet.

This is just one of the logical fallacies in Jose Baez's arguments. There are others. It's actually an interesting mental exercise to pick them out. In fact, Baez's use of language already shows that he's not a logical thinker. Here's an example:

"Dr. Vass is not a chemist. He would like to be a chemist, but he isn't. I'd like to be a race car driver, and sometimes I drive pretty fast (heh, heh), but I'm not one. And he's not a chemist. And all the other chemists came to different conclusions."

You notice that pesky little "other" that crept in there? Maybe somewhere in Baez's unconscious, Dr. Vass is something of a chemist after all...

And here's another. "There isn't any stain there. The prosecution is trying to make you angry, so that if you stare hard enough and long enough at that picture, you'll just start believing that you see a stain. There isn't any stain. And the witness told us that not even professional cleaners could get that stain out. That car is ten years old and had another owner before."

There isn't any stain, but the car is old so it's normal for it to have stains, and in fact even professionals couldn't get the stain out. Yet there isn't any stain.

And what about Jose's "Freudian slips"? Did you notice him shouting "The truth stops here! - err, uhh, the truth starts here..." And what about when he's in the process of demolishing George Anthony and says "He can't lie at all! err, he can't tell the truth, at all." And when he says to the jury: "Right now as you sit here, you must certainly have more answers...uh, more questions than answers."

Freud didn't write volumes about these slips for nothing.

Sunday 3 July 2011

James Surowiecki: The Wisdom of Crowds

In the last post we talked about Intrade, which (along with other online betting and prediction sites such as Newsfutures) has a novel manner of predicting the future: letting the people decide.

A specific, factual prediction is made. If you believe it will happen, you buy shares; if you believe it won't happen, you sell shares. The prices at which you individually choose to buy and sell reflect the strength of your conviction. The price of the stock fluctuates with each trade: essentially, the price of the stock at any given moment is the price that the latest person who bought it paid for it. That price is always between $0.00 and $10.00, and the "collective perception" of the probability of the statement's truth, expressed as a percentage, is 10 times the price of the stock in dollars.
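
To make the conversion from price to probability concrete, here is a minimal Python sketch. The function name `implied_probability` and the $2.50 example price are made up for illustration; only the $0.00 to $10.00 price range comes from the description above.

```python
def implied_probability(share_price, max_price=10.0):
    """Convert a share price in dollars into the implied probability (0 to 1).

    Assumes, as described in the post, that prices run from $0.00 to $10.00.
    """
    if not 0.0 <= share_price <= max_price:
        raise ValueError("share price must lie between $0.00 and the maximum price")
    return share_price / max_price


# Hypothetical example: a share that last traded at $2.50.
probability = implied_probability(2.50)
print(f"Implied probability: {probability:.0%}")
```

Dividing by the maximum price is the same rule as "10 times the price" once the result is read as a percentage.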

The major question underlying this system is: why should we believe that a bunch of people laying bets and buying and selling stock knows anything about what's going to happen or not happen? They're not experts, what do they know?

Ah, but as it turns out, apparently they do know. Research has shown again and again that the dominant opinion of a crowd (if it's a yes or no question) or the average opinion (if it's a numerical question) tends to be closer to the truth than the opinions of individual experts, no matter how expert they may be.

For example, a 1984 study showed that the price of orange-juice futures was a more reliable weather predictor, in certain citrus-growing states, than the U.S. Weather Service forecast!

Much earlier, 19th century anthropologist Francis Galton was both surprised and impressed to discover that at a county fair guess-the-weight-of-the-ox competition, the average of the guesses of the 800 farmers present was astonishingly close to the actual weight of the ox - closer than all but a very few of the individual guesses, and closer than all of the guesses made by cattle experts present at the event. (It is said that Galton's observation greatly reinforced his faith in the system of democracy.)
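
Why might averaging work so well? A rough simulation gives the flavor. This is only a sketch under strong assumptions that are mine, not Galton's: every guess is the true weight plus independent, unbiased random noise, and the true weight (1,200 pounds) and noise level (100 pounds) are invented for illustration.

```python
import random
import statistics


def simulate_crowd(true_value, n_guessers, noise_sd, seed=0):
    """Simulate independent, unbiased guesses and compare the crowd to individuals.

    Returns (crowd_error, typical_individual_error, share_of_guessers_beaten).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_guessers)]

    # Error of the averaged guess versus the error of each separate guess.
    crowd_error = abs(statistics.mean(guesses) - true_value)
    individual_errors = [abs(g - true_value) for g in guesses]

    typical_error = statistics.median(individual_errors)
    share_beaten = sum(e > crowd_error for e in individual_errors) / n_guessers
    return crowd_error, typical_error, share_beaten


# Hypothetical numbers: 800 guessers, true weight 1200 lb, noise of 100 lb.
crowd_error, typical_error, share_beaten = simulate_crowd(1200, 800, 100)
print(f"Crowd average is off by {crowd_error:.1f} lb")
print(f"A typical individual is off by {typical_error:.1f} lb")
print(f"The crowd beats {share_beaten:.0%} of individual guessers")
```

The individual errors largely cancel out in the average, but only because the simulated guessers do not influence one another.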

But everyone knows that crowds can be swayed into wild, irrational behavior - stock market panics, lynchings, mass hysteria. So why should the wisdom of crowds be trusted?

One James Surowiecki, staff writer at The New Yorker, has developed a simple and striking theory to answer that question. According to Surowiecki, for a crowd to be a "wise crowd", it needs to have four qualities:

1) Diversity of opinion: Each person should have private information even if it's just an eccentric interpretation of the known facts.

2) Independence: People's opinions aren't determined by the opinions of those around them.

3) Decentralization: People are able to specialize and draw on local knowledge.

4) Aggregation: Some mechanism exists for turning private judgments into a collective decision.

[This table summarizing Surowiecki's theory was drawn directly from the Wikipedia entry on "The Wisdom of Crowds".]

In examples of collective error or misbehavior, at least one of the elements above is missing from the crowd, most often "independence", since the anger behind a lynching and the panic leading to a financial crash are both highly contagious.

But, based on many studies, Surowiecki demonstrates that the predictive ability of "wise crowds" is remarkable.

Estimating the probabilities of possible outcomes of factual matters is fundamental in developing strategy. So if a "wise crowd" can make some of those predictions for us, let us by all means pay attention.








Saturday 2 July 2011

Dominique Strauss-Kahn: situation reversal!

"Right now the odds of a conviction for a felony are hovering around 8.5% for DSK, according to Intrade, an online betting forum", I just read on a blog.

Intrade? Strauss-Kahn? An online betting forum for outcomes of current events?

Yes - it really exists! Chances for Strauss-Kahn to be convicted of at least one charge have just gone up a little, actually, from 8.5% to 9.1%.

Well, contrary to all expectation, Strauss-Kahn's accuser's credibility has just collapsed completely, and the case against him pretty much along with it. Gone are the ankle bracelet, the house arrest and the gigantic bail. Strauss-Kahn is free on his own recognizance, without a passport but able to travel within the U.S. His next court date is July 18.

Two posts ago, I was talking about developing strategy in a risky situation and using the Strauss-Kahn affair as an example. To develop strategy, it's necessary to be able to predict outcomes. That is exactly what Intrade is doing.

So what is their method? It's simple and brilliant, based on collective perception. Anyone who wants can place a bet of "Yes" or "No" on the statement "Dominique Strauss-Kahn to be found guilty of at least one charge", and each bet is associated with two quantities: the number of shares you buy (if you bet "Yes") or sell (if you bet "No"), and the price of those shares, which changes at each new bet. Every share price is between $0.00 and $10.00, and the probability of the "Yes" outcome at any given moment, expressed as a percentage, is equal to 10 times the price of the share in dollars. Shares can only be bought and sold by bettors, not by Intrade. It is possible to offer for sale shares you don't own: you name your own sale price, but you have to buy them back later. If you wait till the event is decided and your prediction is right, you buy them back at $0.00 so all the profit is yours, but if you were wrong, you have to buy them back at $10.00, which is more than what you sold them for. (In fact, you can buy or sell shares at the going price or name your own price at any time. For instance, you might want to sell at a lower price than the going price if you just want to get out of the whole story, or if you think that the situation is going to undergo a sudden reversal. But that will only work if you can find a buyer or seller who agrees with your proposed rate.)

How do predictions get started on Intrade?

A new statement is made, for example, "Dominique Strauss-Kahn to be elected President of France in 2012".

Shares for sale and purchase offers are visible. The site shows buyers offering to purchase a total of 10,179 shares at tiny prices ranging between $0.01 and $0.50, and 30 shares being offered for sale at about $9.99. This is indicative of a very low rate of belief that Strauss-Kahn will be the next French president.

(Personally, I might be willing to bet that he will, but making a $3.00 profit is not worth my trouble.)

None of these offers have found any takers yet, so there is no "going price" and no outcome prediction at present.

But once trades start - once someone offers a sale and another person agrees to purchase at that price - the share price is fixed, and the outcome probability, in percent, is taken to be 10 times the share price in dollars.

Back to the question of whether Dominique Strauss-Kahn will be convicted on any of the seven original counts: I've just joined Intrade, and sold 10 shares at $0.84 - I believe he's going to walk free.

My action brought the prediction that he'll be found guilty down from 9.1 % to 8.3 %.

If he goes free, I make $8.40 ! :D

If he doesn't, I lose $91.60. :(

What that means is that most people agree with my prediction. Intrade is using collective perception to make predictions and that is a very neat idea.
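
For readers who want to check the arithmetic of my bet, here is a small Python sketch. The function names are invented for illustration, and the sketch ignores any trading fees or margin requirements the site may apply (an assumption on my part).

```python
def short_sale_outcomes(shares, sale_price, payout_if_yes=10.0):
    """Return (profit_if_no, loss_if_yes) for selling shares you don't own.

    If the event does not happen, the shares are bought back at $0.00 and the
    sale proceeds are pure profit. If it does happen, they are bought back at
    the full payout price.
    """
    proceeds = shares * sale_price
    profit_if_no = proceeds
    loss_if_yes = shares * payout_if_yes - proceeds
    return profit_if_no, loss_if_yes


def break_even_probability(sale_price, payout_if_yes=10.0):
    """Probability of "Yes" at which the seller's expected gain is exactly zero."""
    return sale_price / payout_if_yes


profit, loss = short_sale_outcomes(shares=10, sale_price=0.84)
print(f"Profit if he goes free: ${profit:.2f}")
print(f"Loss if he is convicted: ${loss:.2f}")
print(f"Break-even probability: {break_even_probability(0.84):.1%}")
```

The lopsided payoff is the point: risking a large loss for a small gain only makes sense if you believe the chance of conviction is below the break-even figure.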

In my next post, I'll talk about the fact that collective perception from a diverse group of people works better for predicting outcomes than either polling or asking experts.


___________________________________________


Let me end this post by offering a white rose of respect and sorrow for Intrade's founder, Irishman John Delaney of County Kildare. A band at the top of Intrade's main page informs us that he died two weeks ago, 50 metres below the summit of Mount Everest, a summit he had always longed to climb. Delaney leaves behind his wife, two sons aged three and two and a tiny daughter, born just two days before his death. My heart goes out to them.

Thursday 30 June 2011

Amanda Knox Trial: Crazy Numbers

A very important day in the Amanda Knox appeal trial being held in Perugia, Italy.

Knox and her former boyfriend Raffaele Sollecito have been found guilty and sentenced to 26 and 25 years of prison respectively for the murder of British Erasmus student Meredith Kercher, at the cottage the two girls shared with two other roommates.

Amongst many troubling indications of guilt were a large kitchen knife found in the drawer of Raffaele's flat, a five-minute walk from the girls' cottage, and a small piece of Meredith's bra, the part of the clasp containing two hooks, which had been cut or torn off and separated from the rest of the bra during the attack on the unfortunate student.

The knife was tested for DNA, and while a sample of Amanda's was found on the handle, a tiny amount attributed to Meredith was found on the blade. As for the bra hooks, analysis in the forensic lab of the Scientific Police of Rome found a mixture of DNA coming from two people: a larger portion from Meredith and a small amount clearly identifiable as Raffaele's.

No other DNA of Raffaele was found in the room where Meredith was stabbed. Apart from the bra hooks, the clear evidence against him consists of a bloody footprint attributed to him found on the bathmat, and incontrovertible proof that he lied several times, in particular about his alibi.

Electropherograms (DNA graphs) of the sample found on the knife blade have been made public, and can be easily compared to available electropherograms of DNA known to be Meredith's. Although the peaks in the knife DNA are much lower, indicating a very tiny sample, the two graphs are obviously identical, with peak pairs located in precisely the same positions.

Amanda Knox's father, Curt Knox, and her mother, Edda Mellas, have commented in public at great length on the subject of the DNA found on the knife. Criticizing the work of the forensic biologist from Rome, Patrizia Stefanoni, the Knox/Mellas family has repeated in every media venue across the United States such messages as "the DNA on the knife has only a 1% chance of belonging to Meredith" or "the DNA on the knife could belong to half of Italy".

These statements are mathematical nonsense. Making statements that are so mathematically wrong undermines the family's efforts to prove their daughter's innocence via a media blitz. The DNA on the knife corresponds at all thirteen genetic loci with that of Meredith Kercher. The chance that a different person produced that DNA sample is 1 in trillions or quadrillions - a denominator far larger than the earth's population. If genetic analysis is used to identify human beings, it is because a match on a complete sample (usually considered to be thirteen clear genetic loci, each one consisting of a pair of peaks) is considered just about 100% certain.
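
To see where figures like "1 in trillions" come from, here is a rough Python sketch of the usual product rule: the chance of a coincidental match at each locus is multiplied across loci, assuming the loci are independent. The per-locus figure of 0.1 is an assumption chosen to give round numbers, not a value taken from the case file, and the world population is rounded to 7 billion.

```python
import math


def random_match_probability(per_locus_probabilities):
    """Multiply per-locus match probabilities (assumes independent loci)."""
    return math.prod(per_locus_probabilities)


# ASSUMPTION: a 1-in-10 chance of a coincidental match at each of 13 loci.
ASSUMED_PER_LOCUS = [0.1] * 13
WORLD_POPULATION = 7e9  # roughly 7 billion people in 2011

rmp = random_match_probability(ASSUMED_PER_LOCUS)
expected_coincidental_matches = rmp * WORLD_POPULATION

print(f"Random match probability: about 1 in {1 / rmp:.0e}")
print(f"Expected coincidental matches worldwide: {expected_coincidental_matches:.4f}")
```

Even with this deliberately generous per-locus figure, the expected number of other people on earth sharing the full profile is far below one, which is nothing like "half of Italy".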

To say that the knife DNA has a "1% chance of belonging to Meredith" is simply a flagrant misuse of "numberspeak" in order to influence a public assumed to be either too ignorant to notice, or simply to have no access to the documents (electropherograms) that prove the opposite.

What the Knox/Mellas family should have said, and would have said loud and clear all along if they had been correctly advised from a mathematical point of view, is that the DNA from the knife blade clearly belongs to Meredith, but there is a clear possibility of that electropherogram being the result of contamination. This is, indeed, what Amanda Knox's lawyers held in court. The suggestion was dismissed, however, by the judge and jury in the court of first instance, because it was never explained how and where contamination might have occurred. The knife was collected by gloved investigators and packaged in a new unused envelope before being transmitted to the Rome laboratory for analysis, and Meredith had never set foot in Raffaele's house, so it seems extremely unlikely that any contamination occurred during the collection.

Raffaele further complicated his own situation by reacting with a completely unlikely story when he learned about Meredith's DNA on his knife. Instead of saying "That's impossible, she was never at my house" - or preserving a wise silence - he spun a tale about how it was normal that her DNA was on his knife because once he pricked her with it by accident while they were cooking together. Although he had once or twice cooked at the flat the girls shared, he and Amanda both denied having ever taken the knife there, so his story sounded horribly like a lie.

Nevertheless, Raffaele and Amanda have had a break today. As a gesture of objectivity, the judge in their appeal trial handed the knife, the bra clasp, and all of the documents concerning them to a pair of objective, court-appointed experts from a university in Rome. Their report was handed in to the Court today, and leaked - already yesterday - to a number of sources.

The conclusions are that contamination may have occurred.

For the knife, it is pointed out in the report that given the tiny quantity of DNA on the blade, it should not have been analysed in the same laboratory where other samples of Meredith's DNA were already analysed, as the earlier samples can easily contaminate the later tinier ones. Proper protocol would have indicated taking the tiny knife sample to a different lab for analysis, but this was not done.

For the bra clasp, the report explains that there were any number of possibilities for contamination due to faulty collection procedures. It furthermore specifies that Dr. Stefanoni erred in attributing the DNA on the hooks to just two people, as there are clearly further contributors to the mix of DNA on the hooks, with more than one male. As no other male has been identified - neither the third accused perpetrator of the murder, Rudy Guede, nor Meredith's boyfriend, nor any other known person - this presence could be indicative of contamination.

There is more evidence against Amanda Knox and Raffaele Sollecito: their own lies, mixed traces of Amanda's blood or DNA and Meredith's about the murder cottage, a painfully obvious effort at staging a faked burglary. Acquittal is far from certain. But the experts' findings in the report turned in today will help, removing any proof of premeditation - such as bringing the knife from Raffaele's place to the cottage - and any actual guilty trace of Raffaele in the murder house, apart from a fuzzily outlined footprint.

The news reports that Amanda is singing and dancing today.

The experts' report will be presented in court on July 25.

Monday 6 June 2011

Dominique Strauss-Kahn: recommended strategy

When people are in a situation where they have to make tough decisions, taking a number of factors into account and evaluating the risks, they generally proceed by a combination of imagination, instinct and principle.

Actuaries, however, who deal with risk evaluation professionally, are used to placing monetary values on things which are generally considered to escape measurement in financial terms, such as the risk of accidental death.

What would make a lot of sense, and certainly makes quite an interesting exercise, would be to use an actuary-style approach when selecting strategy, except in terms of a unit of measure that corresponds to human feelings rather than currency.

There is no such unit of measure in our language: apparently it isn’t a concept that is used enough to have justified the creation of an actual word. For today, let’s use the expression catastrophe points.

You measure all the bad things that can happen to you in catastrophe points. The measurements are subjective, since obviously different people are not necessarily going to assign the same number of points to different catastrophes. One person might think it worse to lose a job than a lover, for example, while another might feel just the opposite. These measurements are individual, subjective figures, but when a person is deciding on a strategy in a difficult situation, they can be really helpful and shed quite a new light.

As an exercise in the use of catastrophe points in choosing strategy, let’s amuse ourselves by considering the case of Dominique Strauss-Kahn, currently under house arrest due to an accusation of attempted rape and sequestration against a hotel maid in New York.


From information leaked by police and prosecutors, Strauss-Kahn is said to have emerged naked from the shower in his luxury suite at the Sofitel to find the maid on her way to clean his bedroom, under the mistaken impression that he was not in the hotel. What is he accused of having done? Grabbed the woman, pressed his private parts against her mouth, tried to pull off her stockings, grabbed her again as she fled up the corridor to the bathroom, and prevented her from leaving the suite by closing the door which she had left open, as maids generally do while cleaning.

We don’t yet really have enough information to come to a clear conclusion about what actually happened in that hotel suite on that day. Why did the woman enter an occupied hotel room at all? How did she get out? Perhaps Strauss-Kahn didn’t intend to actually rape her; he may be aware within himself that he wouldn’t go beyond a certain point in forcing an unwilling woman. It’s hard to tell for now; we will learn more as the case unfolds. But it does seem pretty clear that Strauss-Kahn actually did much of what he is accused of doing. Yet he doesn’t appear to feel any trace of guilt or remorse. This may be as much a sign of mental trouble as his repeated aggressive sexual approaches towards women. Or else, maybe it just comes from a sense of entitlement due to his wealth and position, or from a profound conviction of his own irresistible charm.

Under the assumption - which seems pretty clear - that Strauss-Kahn does not feel real guilt or remorse for his acts, and that he doesn’t feel any particular inner or moral pressure to confess, he is in a perfect position to make a numerical assessment of the gains and losses of the strategies open to him. The goal of this post is to help him do that, while getting in a little practice about how one might think this way in front of one’s own personal dilemmas.


In order to help Mr. Strauss-Kahn with the task of choosing an optimal strategy, we’re going to have to make a guess at the values in terms of catastrophe points that certain negative events would have for him. This is obviously the point where our little exercise loses validity, since a person can only really make that evaluation for him- or herself. However, as we’ve said, something of Strauss-Kahn’s personality has filtered out recently through the news, and we think we can make a stab at a realistic assessment as follows.

First of all, we have to note that Strauss-Kahn is in possession of a considerable fortune (largely belonging to his wife) estimated at about 50 million dollars. Therefore, the loss of a million dollars will be worth far fewer catastrophe points for him than it would be for most of us. We can safely say that DSK would be more than happy to pay a couple of million dollars to stay out of prison for a year.

But he isn’t just risking prison: he stands to lose much more, in terms of his career, and also in terms of the difficult-to-measure advantage of “face”, something of little importance to many people, but of tremendous importance to those who choose to engage in public life. The major losses that DSK is facing in his present situation are: loss of money, loss of freedom (i.e. prison), loss of face and loss of career.

Let’s assign catastrophe points to these losses as we might imagine that DSK measures them.

Money: loss of $1,000,000 = 1 catastrophe point

Prison: loss of 1 year of freedom = 2 catastrophe points

Career: loss of (possibility of) important public office = 10 catastrophe points

Face: loss of face because guilty of sex attack = 3 catastrophe points
...........loss of face because proven to be a liar = 5 catastrophe points
...........loss of face because of illegally buying his way out = 8 catastrophe points

Admittedly, these quantities are selected more or less intuitively: we imagine that DSK regrets the impossibility of running for president more than he feels guilty, that he feels less ashamed that people should think of him as a sex attacker (i.e. sex bomb) than that they should think of him as a liar, etc. But when using our method in your own strategies, you’ll be able to choose the numbers that really make sense to you.

Now let’s go over DSK’s possible strategies and evaluate the risks and losses associated with each one. Naturally, our risk evaluations are also guesses, but they seem fairly plausible. Note that the victim in the case is a witness and not a plaintiff, therefore DSK cannot make any legal offer to negotiate with her; in fact he cannot even contact her as she is presently being sequestered in order to prevent any possibility of witness tampering. Any effort to influence her via her family in Africa, say, would be highly illegal, yet difficult to prove or to punish given the international nature of the situation.

I. Deny, deny, deny.
Strategy: Go to court and plead innocent.
Risk: guilty verdict, probability 90%.
Loss if successful: if DSK is acquitted, no loss.
Losses if unsuccessful:
..........freedom (say 5 years)
..........any possibility of a career in public life
..........money (say $1,000,000 in damages)

II. Cards on the table.

Strategy: Go to court and plead guilty, express remorse.
Risk: None. Guilty verdict certain.
Losses:
..........freedom (say 3 years)
..........any possibility of a career in public life.
..........face: guilty plus liar (denying crime until guilty plea)
..........money (say $1,000,000 in damages)

III. Sneaky sneaky.

Strategy: secretly offer money to the victim’s family
Risk: truth may come out: 70% (hard to hide)
Loss if strategy successful:
..........money (say $2,000,000)
Loss if strategy fails:
..........face (illegally buying his way out)
..........Has to return to strategy II (can hardly plead innocent after this)

Now, DSK wants to choose the best strategy based purely on the possible risks, gains and losses. Which strategy is the wisest?

To answer the question, we count up the numerical result of each strategy, using the “expected losses” formula given by the sum of the probability of each outcome multiplied by the catastrophe points of that outcome.

Strategy I:
....If successful: 0 (10%=0.1)
....If unsuccessful: 21 (90%=0.9)
....Expected result: .1x0 + .9x21 = 18.9 catastrophe points

Strategy II:
....25 (100%=1.0)
....Expected result: 1x25 = 25 catastrophe points

Strategy III:
....If successful: 2 (30%=0.3)
....If unsuccessful: 33 (70%=0.7)
....Expected result: .3x2 + .7x33 = 23.7 catastrophe points

So, it looks like DSK’s best -- or least catastrophic! -- option is to go to trial and deny everything. We’ll wager that’s what he’ll choose to do, because given his character, a guilty plea involves too much loss of face, and an effort at secret negotiation involves too high a risk.
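
For anyone who wants to redo the sums with their own numbers, the calculation can be sketched in a few lines of Python. The point values and probabilities are the guesses made earlier in this post; the function and variable names are invented for illustration.

```python
# Catastrophe-point values guessed earlier in the post.
POINTS_PER_MILLION_DOLLARS = 1
POINTS_PER_PRISON_YEAR = 2
POINTS_CAREER = 10
POINTS_FACE_SEX_ATTACK = 3
POINTS_FACE_LIAR = 5
POINTS_FACE_BUYING_WAY_OUT = 8


def expected_loss(outcomes):
    """Sum of probability * catastrophe points over (probability, points) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * points for p, points in outcomes)


# Strategy I: plead innocent. If convicted: 5 years, career, $1M in damages.
loss_i = 5 * POINTS_PER_PRISON_YEAR + POINTS_CAREER + 1 * POINTS_PER_MILLION_DOLLARS
strategy_i = expected_loss([(0.1, 0), (0.9, loss_i)])

# Strategy II: plead guilty. 3 years, career, face (guilty plus liar), $1M.
loss_ii = (3 * POINTS_PER_PRISON_YEAR + POINTS_CAREER
           + POINTS_FACE_SEX_ATTACK + POINTS_FACE_LIAR
           + 1 * POINTS_PER_MILLION_DOLLARS)
strategy_ii = expected_loss([(1.0, loss_ii)])

# Strategy III: secret payment. Success costs $2M; failure costs face and
# forces a fall back on strategy II.
loss_iii_success = 2 * POINTS_PER_MILLION_DOLLARS
loss_iii_failure = POINTS_FACE_BUYING_WAY_OUT + loss_ii
strategy_iii = expected_loss([(0.3, loss_iii_success), (0.7, loss_iii_failure)])

for name, value in [("I", strategy_i), ("II", strategy_ii), ("III", strategy_iii)]:
    print(f"Strategy {name}: {value:.1f} catastrophe points")
```

Changing any of the guessed point values or probabilities at the top is all it takes to see whether the ranking of the three strategies survives a different view of DSK's priorities.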

Top lawyer Ivan Fisher has expressed the opinion that pleading not guilty is a bad strategy, that if Strauss-Kahn confessed to having a psychological compulsion to attack women, expressed remorse, and sought professional help, he would garner widespread sympathy and probably less prison time, if any. That’s a very interesting analysis, in that Fisher is clearly equating a “bad strategy” with a strategy that will lead to a more severe sentence. In other words, Fisher is assigning a lot of catastrophe points to a prison term, and not so many to the loss of face involved in confessing to psychological problems, to needing help, to doing wrong. But that assessment corresponds to a defence lawyer’s professional task, not to Dominique Strauss-Kahn’s mindset. If DSK pleads not guilty, if he goes on declaring his innocence throughout the trial, then whether he wins or loses he’ll go on claiming innocence till the day he dies. It looks like that may just be the most important thing for him, in the end.


News Flash!
Today, June 6, 2011, Strauss-Kahn pled not guilty at his arraignment in New York. According to news reports, he pronounced only those two words and nothing else. We’ll see what the future holds, but it’s a fact that his attitude today bears out our analysis.