Maths on Trial: 2012

Thursday 4 October 2012

The Obama-Romney Debate and Geogaugy

Popular wisdom may say that Barack Obama lost the debate to Mitt Romney on eloquence, confidence and body language. But looking at the text shows that he won hands down on what we like to call "geogaugy": numerical estimation of real-life quantities. In this case, real-life money such as taxes and the federal deficit.

Here's one of the most striking exchanges between Obama and Romney. It's worth reading without regard to Romney's well-rehearsed posture, facial expressions and vocal effects, next to which Obama's performance was, in truth, paler.

MR. ROMNEY: I will not, under any circumstances, raise taxes on middle-income families. I will lower taxes on middle-income families... let’s get to the bottom line. That is, I want to bring down rates. I want to bring down the rates down, at the same time lower deductions and exemptions and credits and so forth so we keep getting the revenue we need.

Obama's response:

PRESIDENT OBAMA: Now, Governor Romney’s proposal that he has been promoting for 18 months calls for a $5 trillion tax cut on top of $2 trillion of additional spending for our military. And he is saying that he is going to pay for it by closing loopholes and deductions. The problem is that he’s been asked a — over a hundred times how you would close those deductions and loopholes and he hasn't been able to identify them.

Mr. Lehrer: All right.

PRESIDENT OBAMA: When you add up all the loopholes and deductions that upper income individuals can — are currently taking advantage of — if you take those all away — you don’t come close to paying for $5 trillion in tax cuts and $2 trillion in additional military spending.

BANG! Shouldn't this be totally obvious? Is Romney seriously fooling anyone? Loopholes are going to produce $7 trillion?

Romney tried to muddy the waters by claiming that he was not calling for a $5 trillion tax cut.

MR. ROMNEY: Let me - let me repeat - let me repeat what I said. I'm not in favor of a $5 trillion tax cut. That's not my plan. My plan is not to put in place any tax cut that will add to the deficit. That's point one. So you may keep referring to it as a $5 trillion tax cut, but that's not my plan.

But this is foozling. He certainly did propose a series of tax cuts that would add up to $5 trillion over a period of about 10 years; his disclaimer does not concern the existence of the $5 trillion tax cut, but only the claim that it was not going to increase the budget deficit. What Obama said is correct. (Sadly, in spite of the cute little "never mind" comment, Obama did not take enough advantage of Romney's confusing these two issues and thereby denying his own proposed tax cuts.)

Kudos to President Obama for bringing meaning to those numbers by a little wise geogaugy.

PRESIDENT OBAMA: The fact is that if you are lowering the rates the way you describe, Governor, then it is not possible to come up with enough deductions and loopholes that only affect high-income individuals to avoid either raising the deficit or burdening the middle-class. It's - it's math. It's arithmetic.

I would just say this to the American people. If you believe that we can cut taxes by $5 trillion and add $2 trillion in additional spending that the military is not asking for — $7 trillion, just to give you a sense, over 10 years that’s more than our entire defense budget — and you think that by closing loopholes and deductions for the well-to-do, somehow you will not end up picking up the tab, then Governor Romney's plan may work for you.

But I think math, common sense and our history shows us that's not a recipe for job growth.

Let's see for a minute; this tax cut is supposed to cost $5 trillion, so what would that mean for individual middle-class wage earners? Well, there are a roughly estimated 120 million households in the United States and a very roughly estimated half of them qualify as "middle-class". Let's suppose that a normal middle-class household has two wage-earners in it, so 60 million households times two means we're talking about 120 million middle-class taxpayers in the United States. Obviously, the tax cuts Romney is proposing are going to affect small businesses and not only households, but this is a rough estimation. So if tax cuts for 120 million individual middle-class wage-earners are going to cost the government $5 trillion, that means that each individual would get a tax break of about $40,000 over the 10-year period under discussion. So the tax cut would mean about $4,000 a year. It's nice, but it's not life-changing.
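For anyone who wants to redo the estimate with their own guesses, here is a minimal sketch of the arithmetic. It uses only the round figures quoted above; the function and variable names are ours, invented for illustration.

```python
# Back-of-the-envelope estimate using only the round figures from the text.
TAX_CUT_TOTAL = 5e12          # $5 trillion over the whole period
HOUSEHOLDS = 120e6            # rough number of U.S. households
MIDDLE_CLASS_SHARE = 0.5      # very rough share counted as middle-class
EARNERS_PER_HOUSEHOLD = 2     # assumed wage-earners per middle-class household
YEARS = 10                    # period under discussion


def tax_break_per_earner(total, households, share, earners_per_household, years):
    """Return (number of taxpayers, total break per person, break per person per year)."""
    taxpayers = households * share * earners_per_household
    per_person_total = total / taxpayers
    return taxpayers, per_person_total, per_person_total / years


taxpayers, per_person_total, per_person_yearly = tax_break_per_earner(
    TAX_CUT_TOTAL, HOUSEHOLDS, MIDDLE_CLASS_SHARE, EARNERS_PER_HOUSEHOLD, YEARS
)
print(f"{taxpayers:,.0f} taxpayers")
print(f"${per_person_total:,.0f} each over {YEARS} years")
print(f"${per_person_yearly:,.0f} each per year")
```

The exact quotient is a little over $41,000 per person, which the text rounds down to $40,000; changing any of the constants shows how sensitive the result is to the rough guesses.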

As Obama says, it's very hard to imagine that closing loopholes that allow the rich to become richer is going to produce enough money to finance the $7 trillion expense. But let's do Romney a favor, and suppose for a moment that it's true, or at least nearly true; maybe his tax cut will have to be a little more modest. Then everything would be fine, right?

But wait - what about the $16 trillion U.S. national debt? Isn't Romney also supposed to explain how he intends to find the money to deal with that?

Well, here's how he does it: he explains that his tax cut for the middle class will create such economic stimulation that there will be millions of new jobs, and all of these newly employed people will pay taxes, thence providing the money to balance the budget!

MR. ROMNEY: My plan is not like anything that's been tried before. My plan is to bring down rates but also bring down deductions and exemptions and credits at the same time so the revenue stays in, but that we bring down rates to get more people working. My priority is putting people back to work in America...

Mathematically there are - there are three ways that you can cut a deficit. One, of course, is to raise taxes. [OK but he's going to lower them instead.] Number two is to cut spending. [Right, cut spending on "Obamacare" and PBS, and close those loopholes, too. Oh no, I forgot, that's just supposed to balance out the tax cut.] And number three is to grow the economy because if more people work in a growing economy, they're paying taxes and you can get the job done that way.

Get the rates down, lower deductions and exemptions to create more jobs, because there’s nothing better for getting us to a balanced budget than having more people working, earning more money, paying — (chuckles) — more taxes. That’s by far the most effective and efficient way to get this budget balanced.

Oh, okay. Now, that makes sense: the tax cut will create jobs and these new working people will pay taxes and that will bring in lots of money to balance the budget. So let's see. Unemployment in the U.S. right now is at over 20 million and I don't know how many millions of jobs Romney sees his tax cut of $3,000 or $4,000 yearly actually creating, but let's be insanely optimistic and say his policies cut unemployment by half, creating 10 million jobs.

The 10 million newly employed people are all going to start paying taxes, which is great, because now in ten years these ten million people are going to pay off a $16 trillion debt that the rest of the country's taxpayers' money is not sufficing to reduce. That means that each of the 10 million newly employed people will be responsible for paying $1,600,000 worth of taxes to Romney's government in ten years - $160,000 a year each! Bingo - problem solved!

Well, if Romney reduces unemployment to 0%, then each new employee only has to pay $80,000 of taxes a year....

Right.
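The same sum, as a minimal sketch for the skeptical. The $16 trillion figure, the ten-year horizon and the 10 and 20 million job counts are the ones used above; the function name is ours.

```python
AMOUNT_TO_COVER = 16e12   # the $16 trillion figure used in the text
YEARS = 10                # horizon used in the text


def yearly_tax_per_new_worker(amount, new_workers, years):
    """Taxes each newly employed person would have to pay per year to cover `amount`."""
    return amount / new_workers / years


half_unemployment = yearly_tax_per_new_worker(AMOUNT_TO_COVER, 10e6, YEARS)
zero_unemployment = yearly_tax_per_new_worker(AMOUNT_TO_COVER, 20e6, YEARS)
print(f"10 million new jobs: ${half_unemployment:,.0f} per worker per year")
print(f"20 million new jobs: ${zero_unemployment:,.0f} per worker per year")
```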

Saturday 7 April 2012

Childhood obesity

New York "socialite" Dara-Lynn Weiss has taken a beating lately because she published an article in Vogue about putting her 7-year-old daughter on a diet.

The main criticisms that have been leveled at Dara-Lynn are, in summary, something like this:

(1) Children should be left to enjoy a happy childhood and not forced to become self-conscious about food at such a young age - teenage and adulthood will already bring enough psychological food-related problems to American women. Dara-Lynn's attitude is likely to produce an eating disorder in an innocent, happy child.

(2) In spite of the doctor's recommendation and Dara-Lynn's own remarks, her true reason for putting her child on a diet was not health-based, but appearance-based: what she really wanted was for the little girl to look and feel pretty. So really she was pandering to one of our shallower tendencies, namely to overrate the importance of looks and also the importance of what other people think of us.

(3) Dara-Lynn's methods, which contained many elements that may have been humiliating for the child, such as scolding her in front of others and publishing an article on her weight problems in a fashion magazine, may have created present and future psychological problems for the child concerning all kinds of aspects, such as her self-consciousness about her weight, her looks, her mother's love for her, or her relationship with her friends.

Reading over the dozens, even hundreds of angry comments reacting to her Vogue article, I want to break a lance in defence of Dara-Lynn.

To start with - and this is the easiest and simplest objection - those who say that children should be left free to enjoy their childhood and not be put on diets are in straightforward contradiction with every piece of medical information coming our way.

From Wikipedia (and Dara-Lynn quotes this in her article): "There are plenty of statistics available that prove child obesity in America is at epidemic levels. In the last 30 years the number of children who are overweight has tripled to 15%....The mixture of fast food diets along with sedentary lifestyles is creating a generation of children who are facing very adult health issues like high cholesterol, diabetes and heart disease." (If this blog entry is on the Maths on Trial blog at all, it's because this is STATISTICAL evidence that the problem faced by Dara-Lynn is a real and prevalent one!)

Is there really anyone out there who says that Dara-Lynn should have left her daughter to run straight into the jungle of health problems described here, when in fact, mother and daughter working together could actually do something about it? I don't know, but the "leave that poor child alone" attitude is really not convincing.

Secondly, there's the psychological aspect directly concerning food. Many commentators say that Dara-Lynn's behavior, especially in the kind of social circles - Greenwich Village intelligentsia - where she and her daughter live and move, is probably going to produce eating disorders in the child, and that in fact, these disorders are just as dangerous and actually more likely than the cholesterol and other problems cited above.

But these comments ignore the fact that an obese child already has an eating disorder! In fact, I really don't believe that there is any difference between children who become fat because they are constantly hungry and have grown over the years to develop habits of eating very large portions and a large amount of carbohydrates, and grownups who behave the same way. Chronic overeating and the inability to recognize the signals of fullness do form an eating disorder, whatever the age. Many adults fight with this disorder more or less all the time, and dream of some sadly non-existent simple and healthy form of appetite control to replace the missing awareness. Addiction to food, especially addiction to carbohydrates, exists. Everyone knows that little kids can suffer from this kind of eating disorder just as much as anyone else; most everyone probably even knows a kid who does. So a parent of a child with this problem is really caught between one eating disorder that already exists, and another one that may be more dangerous for the health, but on the other hand may not ever occur. Indeed, although some overweight people may go overboard in dieting and eventually become anorexic, the large majority will just manage to control their eating desires enough to reach a healthy weight, and then yo-yo up and down over the months and years as addiction battles with will-power. Yes, this is an eating disorder too, but it is less dangerous than obesity, not more, and it has its satisfactions, too, both when good things get eaten and when the diets reach their goal.

And this leads into the third reason for which I think that many of Dara-Lynn's critics are missing the beat: it is absurdly and even horridly wrong to pretend to believe or imagine that in a Western country at least, an obese child lives a happy, tranquil, psychologically undisturbed life until their parents cruelly interfere. If you don't believe me, have a look at the interview videos of Britain's new tenor star, Jonathan Antoine. He describes the suffering of growing up obese, and doesn't leave the slightest doubt that it is no enviable condition.

Wikipedia's page on obese children says it all: "Children who are obese also must confront the many psychological issues that being overweight creates. Overweight children often have low self-esteem, which is made worse when they are unable to participate in normal activities such as sports or on the playground... Obese children are teased, bullied and made to feel inferior on a number of levels."

So, maybe Dara-Lynn was not the one creating a worrying, humiliating psychological environment in the mind of her happy little girl. Maybe what she was actually trying to do was PROTECT the child from one. Yes, she started the diet, not just because the doctor said to, but because the little girl cried when a boy in school called her fat. But why is that a bad thing? All she was doing was waking up to the suffering her daughter was going to endure and was already starting to endure: exactly that psychological suffering from teasing and bullying expressed in the Wikipedia passage above.

How can a parent in Dara-Lynn's situation protect her little girl?

Should she just tell her that she has to live with it, ignore it, rise above it, and not listen to what other people say?

Should she teach her that being overweight is just part of "the way she is", and should be respected like any other physical or character trait?

What about teaching the girl that she can DO SOMETHING ABOUT IT? And helping her do it, and prove to herself that SHE CAN DO IT, that she can OVERCOME HER PROBLEMS?!

I have no connections to publishing in glossy magazines, but minus that aspect, if I were in Dara-Lynn's shoes, I think I would try to do the same thing that she did. I'd do it with love but also with strictness and a clear goal in mind. My own personal quirks, foibles and problems would play a role, as they do in all of my interactions with other people, all the time. I wouldn't be any more able than she was to predict the long-term outcome. But I'd still do it. Good for her.

Sunday 29 January 2012

Laurence Tribe: Maths on Trial (7)

Tribe’s reaction to the use of Bayes’ theorem at trial

In this last post of the series, we continue to summarize Tribe’s objections to the use of Bayes’ theorem at trial, with our responses. Last time we discussed his first objection:

1….The Distortion of Outcomes

Today, we continue with the other three objections.

2….The End of Innocence: A Presumption of Guilt?

“At least in criminal cases, and perhaps also in civil cases resting on allegations of moral fault, further difficulties lurk in the very fact that the trier is forced by the Finkelstein-Fairley technique to arrive at an explicit quantitative estimate of the likely truth at or near the trial’s start, or at least before some of the most significant evidence has been put before him.” Tribe’s argument in this section is that the idea of forcing a juror to arrive at some kind of mathematical figure for the prior probability of guilt, before applying Bayes’ theorem to update that prior according to some new piece of numerical evidence, is fundamentally against the presumption of innocence, and that this injustice is not rectified even by setting the prior probability to an unreasonably low value.

Tribe stresses the importance of the juror listening to all of the evidence before “reaching any judgment, even a tentative one, as to his probable guilt”; he terms this one of the “intangible aspects” of our commitment to the proposition that “a man who stands accused of crime is no less entitled than his accuser to freedom and respect as an innocent member of the community”. Tribe admits that in reality, a juror listening to a trial may not actually be considering the accused as certainly innocent until a complete proof has been laid out before him; he may, in fact, swing backwards and forwards in his estimate of guilt as the trial proceeds, thus holding some vague idea of “prior probability of guilt” at all times. However, according to Tribe, these impulses must not be spoken or expressed, let alone called to the fore, out of respect for the presumption of innocence. “Society ought to speak of accused men as innocent, and treat them as innocent, until they have been properly convicted after all they have to offer in their defense has been carefully weighed,” and “Jurors cannot at the same time estimate probable guilt and suspend judgment until they have heard all the defendant has to say.”

We feel that the proper method to counter Tribe’s worries and to allow jurors to keep all their thoughts about possible guilt, and their personal, intimate estimates of probability of guilt, silent and unspoken throughout the trial, is the presentation of a table such as the one given by Finkelstein and Fairley, in which many different values for the prior probability of guilt are entered, and the update according to the new evidence is calculated for each of them. In this way, the jury members are never required to explicitly formulate a probability of guilt at any stage; it is enough for them to place themselves loosely within the table, without being asked to specify where, or even to consider the entire set of output results and the meaning they indicate before coming to any final decision. This would certainly respect the duty of silence that Tribe describes.

We note as an aside that Tribe himself appears disturbed by the fact that the duty of silence is at the same time a call for lack of candor. In a few well-chosen words he expresses his obviously deeply held feeling that while a lack of candor should not be something that is required lightly, there are cases in which it serves a higher moral purpose. Lies or refusals to consider the full weight of the evidence are not required, merely a silence on the subject in respect for the presumption of innocence: “One need not say everything all at once in order to be truthful, and saying some things in certain ways and at certain times in the trial process may interfere with other more important messages that the process should seek to convey and with attitudes that it should seek to preserve.”

We respect and agree with this approach, but as explained above, Bayes’ theorem can be correctly presented and used at trial without undermining it.

3…The Quantification of Sacrifice

One extremely disturbing factor about the use of any mathematical method to determine guilt is that since the final probability, when all the evidence has been taken into account, is rarely likely to be equal to 1.0, it will, in cases of conviction, tend to be some concrete and convincing figure such as, for example, 0.98. Unfortunately, accepting such a figure as the probability of guilt can also be expressed as saying that one accepts that 2 people out of every 100 convicted are expected to be innocent. While there is no doubt that many innocent people are convicted each year, it is indeed disturbing to have an actual figure for the number of such people that one expects. “There is something intrinsically immoral about condemning a man as a criminal while telling oneself, ‘I believe that there is a chance of one in twenty that this defendant is innocent, but a 1/20 risk of sacrificing him erroneously is one I am willing to run in the interest of the public’s – and my own – safety.’” Tribe wishes for any justice system to be structured in such a way as to avoid ever having to make such a proclamation with a specifically published figure. He considers that it is morally superior for a juror to express himself as being “very sure” or “as sure as possible” of guilt than to give an actual figure, which would prove his readiness to accept such or such a percentage of innocent sacrificial victims.

Naturally, there are miscarriages of justice in any justice system, but Tribe points out the vast moral difference between society’s recognizing the necessity of tolerating them, and the fact of its actually embracing a policy that juries “ought to convict in the face of this acknowledged and quantified uncertainty”. He prefers sticking to the notion of “guilt beyond a reasonable doubt”, which represents “a subtle compromise between the knowledge, on the one hand, that we cannot realistically insist on acquittal whenever guilt is less than absolutely certain, and the realization, on the other hand, that the cost of spelling that out explicitly and with calculated precision in the trial itself would be too high.”

Our response to this argument is that Bayes’ theorem should probably never be used to compute the probability of guilt. It should be brought in to establish certain factual subsidiary questions, such as the probability of its being the defendant or another person who left a certain print, was seen in a certain place, was present at a certain time, transported a certain object, and so forth. The actual final decision of innocence or guilt should and must be made by jurors without recourse to any numerical calculation.

4…The Dehumanization of Justice

Tribe’s final argument against the use of mathematical methods at trial is simply that they “threaten to make the legal system seem even more alien and inhuman than it already does to distressingly many…The need now is to enhance the community comprehension of the trial process, not to exacerbate an already serious problem by shrouding the process in mathematical obscurity.” Tribe worries that “guided and perhaps intimidated by the seeming inexorability of numbers, induced by the persuasive force of formulas and the precision of the decimal points to perceive themselves as performing a largely mechanical and automatic role, few jurors could be relied upon to recall, let alone to perform, this humanizing function, to employ their intuition and their sense of community values to shape their ultimate conclusions.”

Our response to this is that with the advent of DNA analysis, mathematics at trial is here to stay. It is perhaps not yet fully understood that the statistical analyses used on DNA are no different than many other cases and situations where Bayes’ theorem can be applied. The public has grown used to seeing DNA analyses presented at trial over the decades since Tribe wrote his article, and juries deal with the situation competently enough in the main, having expert witnesses explain the issues to them in layman’s terms, and not, generally, forgetting to employ their common sense.

It seems to us that what is needed is a general education aimed at the public, so that little by little, the notions used there, which are not more difficult than much of the mathematics seen at school, become familiar and trustworthy to the public at large, from which juries are drawn. We believe that this is the only way in which Tribe’s profound moral and social concerns can be reconciled with the fact that mathematics at trial is here to stay. We also believe that such a general public education is a legitimate and reasonable aim to work towards, which is the whole purpose of this blog.

Friday 27 January 2012

Laurence Tribe: Maths on Trial (6)

Tribe’s reaction to the use of Bayes’ theorem at trial

We have finally reached the heart of Tribe’s article:

Fourthly, an emotional and deeply human explanation of his final decision to recommend the avoidance of mathematical methods altogether in the area of criminal law.

In this second to last post of the series, we want to explain Tribe’s objections to the use of Bayes’ theorem at trial, in examples such as the one given by Finkelstein and Fairley, because if we intend to support the use of Bayes’ theorem and Bayesian networks (under very specific conditions and criteria which are still to be developed), we need to understand the major objections first.

He gives four objections, described in detail. Today we discuss the first of the four.

1…The Distortion of Outcomes

Tribe points out the difficulty of settling on a prior probability of guilt of the defendant, before using Bayes’ theorem to update this probability in the light of new, numerical evidence. According to Tribe, “Because the Finkelstein-Fairley technique thus compels the jury to begin with a number of the most dubious value, the use of that technique at trial would be very likely to yield wholly inaccurate, and misleadingly precise, conclusions.”

We don’t believe that this is a serious problem, however, for the following important reason: if the input to Bayes is not a known statistical figure but merely a subjective evaluation, then a wide range of different possibilities should be input. If the outcomes are then very different, one can discard the use of Bayes in that particular situation. But it can happen that in spite of very different inputs, the outcomes are all quite similar, just as it turned out in Finkelstein and Fairley’s example. What this means is that the numerical evaluation of the knife palm print evidence is actually more indicative of guilt than one might intuitively believe.
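To make the idea of feeding in a wide range of priors concrete, here is a minimal sketch of such a table. The inputs are our own assumptions for illustration: a print that would turn up on an innocent person's knife one time in a thousand, a guilty defendant who would certainly leave it, and an arbitrary spread of priors. It is not a reproduction of Finkelstein and Fairley's published table.

```python
def posterior(prior, p_evidence_if_guilty, p_evidence_if_innocent):
    """Bayes' theorem for two hypotheses (guilty / innocent)."""
    numerator = prior * p_evidence_if_guilty
    return numerator / (numerator + (1 - prior) * p_evidence_if_innocent)


# Illustrative assumptions, not figures taken from any real case.
PRIORS = [0.01, 0.1, 0.25, 0.5, 0.75]
P_PRINT_IF_GUILTY = 1.0       # assumed: a guilty defendant always leaves the print
P_PRINT_IF_INNOCENT = 0.001   # assumed: one in a thousand

table = [
    (prior, posterior(prior, P_PRINT_IF_GUILTY, P_PRINT_IF_INNOCENT))
    for prior in PRIORS
]
for prior, post in table:
    print(f"prior {prior:.2f} -> posterior {post:.4f}")
```

Under these assumptions every prior in the list, even the lowest, leads to a posterior above 0.9, which is the kind of agreement among outcomes we have in mind. If the posteriors had instead been scattered widely, that would be the signal to leave Bayes out of that particular situation.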

Tribe also claims that in court, a figure like “one in a thousand”, and a table like that given by Finkelstein and Fairley, could unduly impress a jury. “The problem with the overpowering number, that one hard piece of information, is that it may dwarf all efforts to put it into perspective with more impressionistic sorts of evidence,” and “The problem – that of the overbearing impressiveness of numbers – pervades all cases in which the trial use of mathematics is proposed. And, whenever such use is in fact accomplished by methods resembling those of Finkelstein and Fairley, the problem becomes acute.” He particularly warns against this happening when the numerical evidence is not connected to the specific case at hand, like the palm print, but concerns a general situation of which the case at hand is just one instance, like the barrel falling out of the window and the information that 60% of the time such an incident is caused by a negligent act.

We believe that this kind of information, which could be used to help a jury fix its prior assumption of guilt, not to update it, is not suitable for courtroom use, as it says nothing about the particular case at hand. And we think that the type of numerical information that does pertain to the case at hand, such as in Finkelstein-Fairley’s example, will impress the jury in a reasonable manner; we see no reason to believe that the jury might be overpowered, if the matter is presented in a reasonable and non-dramatic manner, and if the use of Bayes’ theorem eventually becomes a common and well-recognized occurrence in court.

Tribe warns against the attempt to simplify events in order to apply Bayes’ theorem more fittingly. He gives as an example the fact that Finkelstein and Fairley assumed that the defendant would leave a palm print on the knife if he used it to kill, ignoring the possibility that he may have worn gloves, or that he might have wiped off his prints, or even that a perpetrator other than the boyfriend might have left a smudged print of his own that happened to resemble the boyfriend’s. He adds other forgotten possibilities for the palm print, such as, for example, that the defendant left his own palm print during an innocent use of the knife, which was subsequently used by someone wearing gloves to perform the killing; someone who left the palm print either because he did not see it, or with the conscious intention to frame the defendant. “Finkelstein and Fairley overlook the risk of frame-up altogether – despite the nasty fact that the most inculpatory item of evidence may be the item most likely to be used to frame an innocent man.”

All of these are highly unlikely events, but they do have non-zero probabilities, and Tribe considers that a result obtained by forgetting them all carries a real risk of being in error. We agree with this, particularly with the frame-up theory, which was also so remarkably forgotten in the case of Joe Sneed. Tribe adds that including all these possibilities into the presentation of Bayes’ theorem would make the formula extremely complicated and unrealistic to use, especially as all the probabilities of these events would have to be estimated.

Our response to this argument is that a Bayesian network would be more suited to the complicated situation than Bayes’ theorem; but in general, it is best to use Bayesian networks only when a fairly wide range of reasonable input probabilities can go in and the outcomes still point clearly in a certain direction. It would be worth making the attempt on the Finkelstein-Fairley example to see whether it is such a case. Such experiments could be made before a case was actually being tried in court, during the pretrial investigations, and Bayes would not be introduced into court at all unless the use of Bayes turned out to be convincing. This approach should answer the objection that Tribe expressed in the following words: “It simply does not follow that trial accuracy will be enhanced if some of the important variables are quantified and subjected to Bayesian analysis, leaving the softer ones – those to which meaningful numbers are hardest to attach – in an impressionistic limbo.”

Tribe also objects that even if the jury is convinced by the figures that the defendant held the knife and stabbed his girlfriend, they still have to take into account the state of mind of the defendant during the act before deciding whether he is actually guilty of murder. “One consequence of mathematical proof, then, may be to shift the focus away from such elements as volition, knowledge and intent, and toward such elements as identity and occurrence – for the same reason that the hard variables tend to swamp the soft.”

We are not convinced by this argument. We trust the defendant’s counsel to raise the question of his mental state in front of the jury.

Finally, Tribe points out that the jury must be absolutely ignorant of the new, numerical piece of evidence until the moment when it is introduced at trial together with the Bayesian calculation. Otherwise, if they have heard anything of it beforehand, they will have already factored it into their estimation of the prior probability of guilt, and its force will be used unfairly twice over.

This is a good argument. It is important to ensure that the jury does not hear about the new evidence until it appears within the Bayesian framework. For this reason, it may be necessary to exclude certain pieces of evidence which hint at the specific piece of numerical evidence, right up until that point.
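The danger of counting the same evidence twice is easy to see in numbers. The sketch below is only an illustration under made-up assumptions: an "honest" prior of 0.05 formed without knowledge of the evidence, and a likelihood ratio of 1000 for the numerical evidence. Neither figure comes from Tribe or from any real case.

```python
def update(prior, likelihood_ratio):
    """Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical figures chosen only to illustrate the effect.
HONEST_PRIOR = 0.05        # juror's prior before hearing anything about the evidence
LIKELIHOOD_RATIO = 1000.0  # strength of the numerical evidence

# Correct use: the evidence is applied exactly once.
counted_once = update(HONEST_PRIOR, LIKELIHOOD_RATIO)

# Contaminated use: the juror had already absorbed the evidence into the
# "prior", and the formal calculation then applies it a second time.
contaminated_prior = counted_once
counted_twice = update(contaminated_prior, LIKELIHOOD_RATIO)

print(f"evidence counted once:  {counted_once:.4f}")
print(f"evidence counted twice: {counted_twice:.6f}")
```

With these figures the second, unjustified application pushes the result from roughly 0.98 to something indistinguishable from certainty, which is exactly the unfair doubling of force that Tribe describes.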

Tribe also gives an example where Bayes’ theorem can give a seriously wrong result. This example is very simple: a robbery that took 15 minutes was committed in a certain place between 3:00 and 3:30 a.m., and the defendant was seen in a car a ½-mile from the scene at 3:10 a.m. Given this information, the jury will come to a prior probability of X that the defendant is guilty. Then a new piece of information is brought, saying that the defendant was also seen in a car a ½-mile from the scene at 3:20 a.m. If the probability of this second sighting, assuming guilt, is calculated and used to update the prior probability, it will increase the jury’s conviction of the defendant’s guilt. But in fact, the two times taken together show that the defendant must be innocent, since they leave him no uninterrupted 15 minutes at the scene within the half hour!
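Tribe's trap can be checked by brute force. The following is a minimal sketch under simplifying assumptions of our own: time is counted in whole minutes after 3:00 a.m., travel time to and from the scene is ignored, and a sighting half a mile away at minute t simply means the defendant was not at the scene at minute t.

```python
ROBBERY_MINUTES = 15
WINDOW_START, WINDOW_END = 0, 30   # minutes after 3:00 a.m.


def feasible_start_times(sightings):
    """Start minutes at which the defendant could have begun the robbery.

    A start time is feasible if the whole robbery fits inside the window and
    no sighting away from the scene falls during the robbery.
    """
    starts = range(WINDOW_START, WINDOW_END - ROBBERY_MINUTES + 1)
    return [
        s for s in starts
        if not any(s <= t <= s + ROBBERY_MINUTES for t in sightings)
    ]


only_310 = feasible_start_times([10])       # seen at 3:10 only
only_320 = feasible_start_times([20])       # seen at 3:20 only
both = feasible_start_times([10, 20])       # seen at both times

print("3:10 alone leaves start minutes:", only_310)
print("3:20 alone leaves start minutes:", only_320)
print("both sightings leave:", both)
```

Each sighting taken alone still leaves some possible start times, so each looks compatible with guilt; taken together they leave none. A naive update that treats the second sighting as independent of the first cannot see this.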

It seems to us that it should not be difficult to avoid such obvious traps when designing a Bayesian network to deal with the facts in a specific case. But we do appreciate Tribe’s intelligence and sense of humour in even inventing this example.

Thursday 26 January 2012

Laurence Tribe: Maths on Trial (5)

Tribe’s summary of Bayes’ Theorem in law

We’ve been describing the contents of Laurence Tribe’s seminal article on mathematics at trial. The first two parts we discussed were:

Firstly, a description of the kind of use of mathematics at trial that he is specifically going to discuss, with examples;

Secondly, a review of the traditional arguments that judges have used against mathematics at trial, together with Tribe’s reaction to these arguments;

Today we’re going to summarize the next part:

Thirdly, a sketch of the use of Bayes’ theorem at trial as proposed by Finkelstein and Fairley.

Tribe starts by giving an informal introduction to the relevance of Bayes’ theorem: “In deciding a disputed proposition, a rational factfinder probably begins with some initial, a priori estimate of the likelihood of the proposition’s truth, then updates his prior estimate in light of discoverable evidence bearing on that proposition, and arrives finally at a modified assessment of the proposition’s likely truth in light of whatever evidence he has considered. When many items of evidence are involved, each has the effect of adjusting, in greater or lesser degree, the factfinder’s evaluation of the probability that the proposition before him is true. If this incremental process of cumulating evidence could be given quantitative expression, the factfinder might then be able to combine mathematical and non-mathematical evidence in a perfectly natural way, giving each neither more nor less weight than it logically deserves.”

He then explains that such a mathematical expression for updating probabilities in the light of new evidence exists: in simple cases, it is precisely Bayes’ theorem, and in more complicated situations with many different factors, this theorem can be expanded into the theory of Bayesian networks. Tribe gives a mathematical explanation of Bayes’ theorem (which we’ve already explained in previous posts; as for Bayesian networks, we’ll be dealing with them in a series of future posts). Tribe goes on to give a few examples of applications of Bayes’ theorem to legal cases.

Suppose that a juror estimates the probability of guilt of a defendant during a trial, in light of all the evidence seen so far, as about 2/3. Then new evidence comes up to the effect that after the crime was committed, the defendant took the first plane out of town. The juror makes a guess that the probability of a guilty criminal taking the first flight out of town is maybe 0.2, whereas the probability that an innocent person might do so (in order to distance themselves from the crime, for example) might be about 0.1. A direct application of Bayes’ theorem then has the effect of updating the juror’s probability of guilt up to 4/5.
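For readers who want to check the arithmetic, here is a minimal sketch of the update using exact fractions. The function and parameter names are our own; the three input numbers are the ones in the example above.

```python
from fractions import Fraction


def bayes_update(prior, p_evidence_if_guilty, p_evidence_if_innocent):
    """Posterior probability of guilt after one new piece of evidence."""
    numerator = prior * p_evidence_if_guilty
    denominator = numerator + (1 - prior) * p_evidence_if_innocent
    return numerator / denominator


posterior = bayes_update(
    prior=Fraction(2, 3),                     # juror's estimate before the flight evidence
    p_evidence_if_guilty=Fraction(2, 10),     # chance a guilty person takes the first plane out
    p_evidence_if_innocent=Fraction(1, 10),   # chance an innocent person does so
)
print(posterior)  # 4/5
```

Note that only the ratio of the two flight probabilities matters here: fleeing is twice as likely for a guilty person as for an innocent one, so the odds of guilt double from 2:1 to 4:1.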

Shortly before Tribe’s article, law professor Michael Finkelstein and statistician William Fairley coauthored an article, “A Bayesian approach to identification evidence”, 83 Harvard Law Review 1970, in which they described a similar type of scenario. Here, a woman’s body is found in a ditch in an urban area. There is evidence that the deceased quarreled violently with her boyfriend the night before and that he struck her on other occasions. A palm print similar to the defendant’s is found on the knife that was used to kill the woman. However, experts can only say that such prints can belong to no more than one person in a thousand.

The question is, how should this figure be incorporated into the jury’s assessment of the defendant’s guilt? The figure of 1 in 1000 can be quite misleading in itself, and is often confused with the probability of the defendant’s innocence. In fact, it does not have much meaning taken out of context; the important thing is to know the size of the population of possible murderers. For example, if nothing is known of the murderer, then all males in the area could be considered possible suspects; if the male population in the area is 1 million, then about 1000 men could have a hand that leaves a palm print of the type found, which does not go a long way to identifying the boyfriend as the perpetrator. If more is known about the suspect, this can narrow down the relevant population and make the 1 in 1000 figure more telling.
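The dependence on population size can be made concrete with a short sketch. It rests on assumptions of our own that go beyond the text: exactly one person in the population is the killer, everyone is equally likely a priori, and the killer's hand certainly matches the print. The function names are hypothetical.

```python
from fractions import Fraction

PRINT_FREQUENCY = Fraction(1, 1000)   # "one person in a thousand"


def expected_matches(population, frequency=PRINT_FREQUENCY):
    """Expected number of people whose palm could have left such a print."""
    return population * frequency


def probability_given_match_only(population, frequency=PRINT_FREQUENCY):
    """Probability that a matching man is the killer when the print is the
    only evidence (assumption: uniform prior over the whole population)."""
    prior = Fraction(1, population)
    return prior / (prior + (1 - prior) * frequency)


matches = expected_matches(1_000_000)
chance = probability_given_match_only(1_000_000)
print(matches)         # number of men expected to match
print(float(chance))   # probability that a given matching man is the killer
```

With a million possible suspects, the print alone leaves the boyfriend at roughly one chance in a thousand of being the killer, which is the opposite of what the bare "1 in 1000" figure suggests to an unwary listener.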

Finkelstein and Fairley observe that if the boyfriend did use the knife to kill the woman, then he almost certainly left the print there, and conversely, that if he was not the one who used it, there would be only one chance in a thousand that a print similar to his would be found on the knife. They then use Bayes’ Theorem to calculate the updated probability of the defendant’s guilt for various values of the initial probability X that jurors might have in their minds before seeing the print evidence.

Bayes’ theorem leads to the following updates:

Probability before print evidence | Updated probability using print evidence
0.01 | 0.909
0.25 | 0.997
0.75 | 0.9996

So, for instance, the use of Bayes’ theorem means that a juror who only believes that there is about one chance in four that the defendant killed his girlfriend should revise his belief to the near-certainty of 99.7% after learning of the palm print evidence.
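The three updates can be reproduced with a few lines of code. This is a sketch under the assumptions stated above: a guilty boyfriend leaves the print with probability 1 (our reading of "almost certainly"), and an innocent one matches with probability 1/1000. The function name is ours.

```python
def updated_guilt(prior, p_print_if_guilty=1.0, p_print_if_innocent=0.001):
    """Probability of guilt after the palm print evidence, by Bayes' theorem.

    The default of 1.0 for p_print_if_guilty is an assumption standing in
    for "almost certainly".
    """
    numerator = prior * p_print_if_guilty
    return numerator / (numerator + (1 - prior) * p_print_if_innocent)


for prior in (0.01, 0.25, 0.75):
    print(f"{prior:.2f} -> {updated_guilt(prior):.4f}")
```

The computed values agree with the figures quoted above to within one unit in the last printed digit; the quoted figures appear to have been truncated rather than rounded.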

There is no doubt that a numerical approach like this to quantifying the importance of new evidence can be useful and surprising, especially if it respects the subjective judgment of each juror by showing the result for a range of prior estimates of guilt, formed before the introduction of the evidence that carries a numerical value (here, the frequency of the palm print).

In spite of this, Tribe expresses doubt about the usefulness of these methods. “In the next section…we examine the costs we must be prepared to incur if we would follow the path Finkelstein and Fairley propose. What will presently be identified as certain costs of quantified methods of proof might conceivably be worth incurring if the benefit in increased trial accuracy were great enough. It turns out, however, that mathematical proof, far from providing any clear benefit, may in fact decrease the likelihood of accurate outcomes.”

We (the authors of this blog) believe that while the use of mathematical methods in trials is full of danger, above all the danger of mathematics being misused by non-experts and the danger of even correct mathematics being misunderstood by judges and jurors, there is nevertheless a great deal that can be done in the way of making sure that mathematical methods are applied correctly in the courtroom, and yield improved and more accurate outcomes.

But in order to make a really deep investigation of the subject, we first need to explore these dangers in depth, as well as the most cogent theoretical arguments against the use of mathematics at trial: those expressed by Tribe.

Our next and last post on the subject of Tribe’s article will explain these. They make a lot of sense and should be taken seriously; in fact, they are so right and well-expressed that they convinced large numbers of people for three decades to keep mathematics out of the courtroom. Tribe’s ideas are not wrong. But we believe that it is possible to move beyond the problems he points out, by a carefully designed and controlled use of mathematics in trial.

Wednesday 25 January 2012

Laurence Tribe: Maths on Trial (4)

Tribe’s reaction to standard objections to math on trial

In the last post we gave a description of the first part of Tribe’s seminal article:

Firstly, a description of the kind of use of mathematics at trial that he is specifically going to discuss, with examples.

Today we will summarize his arguments in the second part.

Secondly, a review of the traditional arguments that judges have used against mathematics at trial, together with Tribe’s reaction to these arguments.

He lists objections that have been made by judges to the introduction of statistical evidence at trial, some of which have actually been written into law.

Objection 1: At first glance, probability concepts might appear to have no application in deciding precisely what did or did not happen on a specific prior occasion: either it did or it didn’t – period.

Tribe’s reaction: Although this is true in itself, the statistical knowledge can be very useful in cases where it is used in conjunction with sufficient further information.

Objection 2: Making use of the mathematical information available first requires transforming it from evidence about the generality of cases to evidence about the particular case; some feel that no such translation is possible.

Tribe’s reaction: This kind of information is important for the trier of fact in coming to a decision about the likelihood of certain events, for instance the “4/5” probability that the blue bus that hit the plaintiff belonged to the defendant, who was responsible for operating 4/5 of the blue buses in town.

Objection 3: In very few cases, if any, can the mathematical evidence, taken alone and in the setting of a completed lawsuit, establish the proposition to which it is directed with sufficient probative force to prevail.

Tribe’s reaction: But the fact that mathematical evidence taken alone can rarely, if ever, establish the crucial proposition with sufficient certitude to meet the applicable standard of proof does not imply that such evidence – when properly combined with other, more conventional, evidence in the same case – cannot supply a useful link in the process of proof…The real issue is whether there is any acceptable way of combining mathematical with non-mathematical evidence. If there is, mathematical evidence can indeed assume the role traditionally played by other forms of proof.

Now, it is a fact that many mathematicians and statisticians have proposed a way to integrate mathematics and traditional evidence, based on Bayes’ theorem. In fact, Tribe’s whole article was written as a response to an article on the use of Bayes’ theorem at trial, authored by Finkelstein and Fairley (we’ll investigate this and similar articles in future posts). In the third part of Tribe’s article, he briefly summarizes the point of view of those who advocate using Bayes’ theorem at trial. This will be the subject of the next post.

Tuesday 24 January 2012

Laurence Tribe: Maths on Trial (3)

Tribe’s reaction to probability at trial

The other main theme of Laurence Tribe’s article “Trial by Mathematics: Precision and Ritual in the Legal Process” (84 Harvard Law Review 1338 1970-71) is, to use his own words, “cases in which mathematical methods are turned to the task of deciding what occurred on a particular, unique occasion, as opposed to cases in which the very task defined by the applicable law is that of measuring the statistical characteristics or likely effects of some process or the statistical features of some population of people or events”.

This part of the article is divided into four main sections:

Firstly, a description of the kind of use of mathematics at trial that he is specifically going to discuss, with examples;

Secondly, a review of the traditional arguments that judges have used against mathematics at trial, together with Tribe’s reaction to these arguments;

Thirdly, a sketch of the use of Bayes’ theorem at trial as proposed by Finkelstein and Fairley;

Fourthly, the very heart of the article: an emotional and deeply human explanation of his final decision to recommend the avoidance of mathematical methods altogether in the area of criminal law.

Tribe’s opinions held such sway in the world of legal thinking that it has been said by those who strongly favour the properly thought out and properly controlled use of mathematics at trial that Tribe alone held back the development of that area of research for a good 30 years. If this is the case, then it seems worth exploring Tribe’s objections in some detail before going on to what we believe the future holds in the way of probability at trial.

Today’s post will summarize the first part of Tribe’s article: a description of the kind of mathematics at trial that he is aiming to discuss.

Tribe divides the use of mathematics in deciding what occurred on a particular occasion into three distinct possibilities:

(1) determining whether an event did or did not occur,

(2) determining the identity of the individual responsible for certain acts,

(3) determining the intention behind certain acts.

For each type, he gives a few examples of the kind of problem that may arise. For (1), one example is that of a man who is accused of leaving his car at a parking place for over one hour in defiance of the rules. The witness is the officer who testified that twice, at times separated by over an hour, he observed that car in that particular place, and that he noted the precise position of the front and back tires. The car owner’s defense is that he drove away during the hour and then came back later, and his wheel positions happened to be the same as they were the first time by chance. The probability of this happening is between about 1/12 and 1/144. But what role should this probability play in judging the car owner’s innocence or guilt?
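The post does not say where the two bounds come from. One reading that reproduces both, offered here purely as an assumption, is that each wheel's position was recorded as one of twelve clock-face positions: if the two wheels always turn together, a chance coincidence has probability 1/12, and if they turn completely independently it has probability 1/144.

```python
from fractions import Fraction

# Assumption (not stated in the text above): each wheel's position is
# recorded as one of 12 equally likely clock-face positions.
CLOCK_POSITIONS = 12

one_wheel = Fraction(1, CLOCK_POSITIONS)

# If both wheels always rotate together, the second wheel adds no information.
wheels_fully_dependent = one_wheel

# If the two wheels rotate independently, the chances multiply.
wheels_independent = one_wheel ** 2

print(wheels_fully_dependent, wheels_independent)
```

On this reading, the spread between the two figures reflects uncertainty about how independently the front and back wheels come to rest, which is exactly the kind of modelling question a court would have to settle before relying on the number.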

A second example is that of a barrel falling out of someone’s window onto another person’s head: the question is whether some negligent act was the cause of the fall. Supposing that it is statistically known that over 60% of such incidents are caused by a negligent act, should this fact be allowed in court?

For (2), he gives the example of a plaintiff negligently run over by a blue bus, who accuses the defendant of negligence on the grounds that the defendant is a bus operator who operates 4/5 of the blue buses in town. How important should that figure be in judging whether the blue bus in question did or did not belong to the defendant?

Another example is that of a man found shot to death in his mistress’ apartment, the question being whether she shot him. There is evidence to prove that in 95% of all known cases in which a man is killed in his mistress’ apartment, the mistress is the killer. Is this evidence of sufficient relevance to be introduced in trial? Does it have any role to play?

Finally, for (3), Tribe gives the example of a recently insured building burning down, with the owner insisting that the fire was an accident. If it is statistically known that fewer than one in 20 fires occurring shortly after an insurance purchase happens purely by chance, what role should such a statistic play in the investigation?

Tribe points out that courts faced with the emergence of this kind of evidence during trials have tended to deal with it in an entirely ad hoc manner, sometimes supporting the mathematical proof proffered by a lawyer, more frequently judging it to be improper. But, he points out, “as the number and variety of cases continues to mount, the difficulty of dealing intelligently with them in the absence of any coherent theory is becoming increasingly apparent.”

He is obviously right, even if the “coherent theory” that he develops in the rest of the article, which we’ll cover in the next posts, is far from satisfying everybody!
