In the case of Illinois v. Winfield, attorney Richard Gutierrez of the Cook County, Illinois, Public Defender’s Office asked a Chicago judge to hold a hearing on the scientific validity of forensic firearms analysis. This is the field that claims to be able to match a bullet or shell casing to the gun that fired it. Circuit court judge William Hooks agreed to hold the hearing, and after considering evidence from the state and defense, he issued a landmark opinion in February 2023 barring prosecutors from putting their analyst on the witness stand.
It was the first such ruling on forensic firearms analysis by any criminal court in the country. A handful of state and federal courts had previously put restrictions on the language these analysts sometimes use on the stand, citing the lack of scientific research to support their conclusions. But Hooks’s opinion was the first to bar an analyst’s testimony entirely. It was a big deal, because forensic firearms analysis is one of the most common types of expertise in the criminal legal system. Juries around the country rely on it daily to send thousands of people to prison each year.
But as of last month, Hooks’s ruling is no longer valid in Illinois. After a bizarre series of events, which began with an allegation of racism against Hooks and resulted in his retirement, the judge who replaced him then vacated the opinion, effectively erasing it from Illinois case law. Just like that, a small bombshell and long overdue win for science-based forensics was taken off the books.
A few months after Hooks’s ruling, I published a long piece at The Watch laying out the myriad problems with forensic firearms analysis, and in particular its analysts’ claim that they can match one bullet or casing to one gun to the exclusion of all other guns on the planet. But here’s a quick and dirty summary:
The first problem is that there’s just no scientific evidence to support the premise upon which the entire field rests — that every gun leaves unique marks on the bullets and shell casings it fires. It isn’t even clear that this premise could be proven or disproven. For example, with DNA we know exactly how often certain genetic markers occur in the population, so we can calculate the odds that a DNA sample was left by a particular person. We do not know how many guns are capable of leaving a particular mark on a bullet or shell casing. Moreover, deciding whether a series of marks constitutes a “match” to a specific gun is an entirely subjective process based on the judgment and “experience and expertise” of the analyst. We can’t even say that a single gun leaves the same unique marks over time; the marks likely change as the grooves in gun barrels wear down.
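To make the contrast with DNA concrete, here’s a toy version of that calculation (a minimal sketch; the loci and frequencies below are invented for illustration, not real population data):

```python
# Toy random match probability (RMP) calculation for a DNA profile.
# The loci and frequencies are invented; real casework uses measured
# population frequencies at many more loci.
marker_frequencies = {
    "locus_1": 0.12,  # fraction of the population carrying this marker
    "locus_2": 0.08,
    "locus_3": 0.05,
}

# Assuming the markers occur independently, the chance that a random
# person matches on all of them is the product of the frequencies.
rmp = 1.0
for frequency in marker_frequencies.values():
    rmp *= frequency

print(f"Random match probability: {rmp:.5f}")  # 0.00048, roughly 1 in 2,100
# No analogous calculation exists for toolmarks: nobody knows how many
# guns can leave a given mark, so there is nothing to multiply.
```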
This brings us to the second problem with this field: Even if the claim that every gun leaves signature marks could be proven, there’s no evidence that these analysts are good at matching those marks to the gun that left them. There’s even less evidence that they’re good enough at it for their testimony to be the reason someone goes to prison.
The interesting part here is that, unlike the first problem, we could assess how good these analysts are at matching bullets to the guns that fired them: We could simply test them. We could give them blind proficiency tests that accurately reflect how they perform their analysis in real-world cases. The best way to do this would be to occasionally intersperse test cases, for which the ground truth is known, among their day-to-day casework.
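For concreteness, here’s a minimal sketch of what that kind of blind insertion might look like (the function name and the 5 percent insertion rate are illustrative assumptions, not any lab’s actual procedure):

```python
import random

# Sketch of blind proficiency testing: comparisons with known ground
# truth are slipped into an analyst's ordinary case queue, so the
# analyst can't tell which comparisons are being scored.

def build_blind_queue(real_cases, test_cases, test_rate=0.05):
    """Return real casework with ground-truth test cases mixed in at random."""
    queue = list(real_cases)
    n_tests = min(len(test_cases), max(1, round(len(queue) * test_rate)))
    for test_case in random.sample(test_cases, k=n_tests):
        queue.insert(random.randrange(len(queue) + 1), test_case)
    return queue

# Only the inserted cases are later scored against their known answers;
# to the analyst, they look like ordinary work.
```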
But as with other controversial “pattern matching” fields of forensics, firearms analysts have resisted this sort of testing. After criticism from groups like the National Academies of Sciences, some practitioners did begin administering some testing, but those tests have tended to be laughably lax. Here’s an excerpt from my 2023 piece:
In many of the practitioner-administered forensic firearms tests participants were given two sets of bullets. Then, using identifying characteristics, they were asked to match each bullet in one set to a bullet in the other set fired by the same gun. The test takers knew that each bullet in Group A had a corresponding match in Group B.
There are few real-world scenarios in which an analyst would be asked to do this. More typically, an analyst is given a single bullet and asked to determine if it is a match to test bullets fired by a particular gun.
Practitioner-administered tests also tend to be easier. They avoid using varieties of guns known to have similar rifling, or using bullets fired by two guns of the same make and model. The ability to make such distinctions in a courtroom is precisely what makes these analysts valuable to prosecutors, so it’s the sort of thing any worthwhile competency test ought to cover.
One almost comical example cited by the defense in Cook County was a test administered by Todd Weller, the state’s own expert witness. When attorneys at the Bronx Public Defender Service obtained a copy of that test, they gave it to six attorneys in their office, none of whom had any training in firearms analysis. Every staffer who took the test passed it “with flying colors.”
As more scientific bodies began to criticize the field, firearms analysis groups and the FBI did put together a couple of more rigorous proficiency tests, one in 2016 and another in 2020. But there were still lots of problems.
The first problem is with how the tests were administered. Again, the ideal scenario would be to mix test cases in with analysts’ day-to-day work, so they wouldn’t know when they’re being tested. But these two tests were given to volunteers who took them outside the lab. When you know you’re being tested, it’s only human nature to adjust accordingly.
This brings us to a second, related problem, which is how these tests were scored. For each question, the analysts were asked to determine whether two bullets had been fired from the same gun, from different guns, or to give an “inconclusive” answer, which means there was insufficient information to say either way.
But any question for which an analyst answered “inconclusive” was scored as correct. In theory, then, a test-taker could achieve a “perfect” score by simply answering “inconclusive” for every question. This obviously defeats the purpose of a proficiency test.
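To see how thoroughly that rule breaks the test, consider a toy scorer (the answer set and scoring function are invented for illustration):

```python
# Toy illustration of the scoring flaw: "inconclusive" is never an error.
# Ground truth for ten hypothetical comparisons, invented for illustration.
truth = ["same", "different", "same", "different", "same",
         "different", "same", "different", "same", "different"]

def score(answers, truth):
    """Score the way the 2016 and 2020 tests reportedly did: a correct
    call counts as correct, and so does any 'inconclusive' answer."""
    right = sum(a == t or a == "inconclusive" for a, t in zip(answers, truth))
    return right / len(truth)

# An analyst who never commits to anything earns a "perfect" score.
print(score(["inconclusive"] * 10, truth))  # 1.0
```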
These test-takers, then, operated under starkly different incentives than they typically face at the crime lab. The test-takers were incentivized to be more cautious — to answer “inconclusive” if they had even the slightest bit of doubt. By contrast, at many crime labs, analysts face pressure from police and prosecutors to “find” matches, or more commonly, to downgrade an exclusion to “inconclusive” to avoid derailing a prosecution.
In fact, at the Illinois state crime lab — the very lab that participated in this case — it is a matter of policy to never “exclude” a given bullet from a given gun. Analysts always either find a match or call the evidence inconclusive.
I think it’s safe to say that most people think a crime lab’s purpose is to use science to find the truth — that analysts will testify to exculpatory evidence as readily as they’ll testify to incriminating evidence. But that often isn’t the case, and it definitely isn’t the case in Illinois when it comes to firearms analysis. Here the crime lab might implicate you, or it might say it doesn’t know. But it will not exonerate you.
This policy is especially problematic given that there’s substantially more scientific research to support an analyst telling a jury “this particular gun could not have fired this bullet” than there is to say, “this is the only gun on earth that could have fired this bullet.”
Perhaps not surprisingly, analysis of the 2016 and 2020 tests showed that test-takers were more likely to answer “inconclusive” than is typical in casework. They were also more likely to answer “inconclusive” on comparisons that should have been exclusions.
There is at least one crime lab that properly tests its firearms analysts by inserting test cases into their daily work. That lab is the Houston Forensic Science Center, operated by forensic reformer Peter Stout. I interviewed Stout a couple years ago. Here’s what those tests found:
For sensitivity tests — in which analysts are asked to determine if two bullets were fired by the same gun — the Houston lab’s firearm specialists had an error rate of 24 percent. For specificity tests — in which analysts are asked to determine if two bullets were fired by different guns — the error rate climbed to 66 percent.
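To pin down what those two numbers mean, here’s a small sketch of how such error rates are computed. The case counts are placeholders chosen only so the rates reproduce the figures above; they are not the Houston lab’s actual data:

```python
# Definitions of the two error rates. Case counts below are invented
# placeholders, chosen only to reproduce the quoted rates; they are NOT
# the Houston Forensic Science Center's actual numbers.

# Sensitivity cases: both bullets really were fired by the same gun.
same_gun_cases = 50
same_gun_correct = 38  # analyst correctly reported "same gun"
sensitivity_error = 1 - same_gun_correct / same_gun_cases

# Specificity cases: the bullets were fired by different guns.
diff_gun_cases = 50
diff_gun_correct = 17  # analyst correctly reported "different guns"
specificity_error = 1 - diff_gun_correct / diff_gun_cases

print(f"Sensitivity error rate: {sensitivity_error:.0%}")  # 24%
print(f"Specificity error rate: {specificity_error:.0%}")  # 66%
```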
So in the only proficiency testing thus far to assess analysts as they engage in their day-to-day work — and at a lab that presumably puts a premium on hiring analysts who are careful and conscientious — they produced error rates that should disqualify them from ever testifying about a “match” in front of a jury.
Getting back to the Chicago case, Judge Hooks’s opinion touched on all of these problems with forensic firearms analysis. He explained the lack of scientific foundation for the discipline’s core premises, and why the state witness’s testimony about the many procedures, professional organizations, and “peer-reviewed publications” was just a distraction from the fundamental problem at the core of the entire field: Analysts want to be able to tell juries that they can match one bullet to one gun to the exclusion of all other guns, but they have yet to demonstrate the ability to actually do that in properly controlled tests.
A few courts have prohibited analysts from using words like “match” or “scientific certainty” (to much anger and backlash from analysts and law enforcement groups). To justify his decision to prohibit the state’s analyst from testifying at all, Hooks pointed to studies showing that such distinctions in wording are often lost on juries. Jurors just tend to hear an expert matching a bullet to a gun. There are also no real objective criteria for when these various terms are used; usage varies from analyst to analyst and from lab to lab.
Hooks determined, then, that the only science-supported expertise an analyst could provide a jury consists of broad observations, such as whether a given gun is capable of firing a given caliber of bullet or, in some cases, whether a particular make and model of gun is capable of leaving certain marks. But because that sort of information doesn’t require specialized expertise, Hooks ruled there was little value in allowing the state’s expert to testify.
Then it all got weird. Several months after the opinion came down, a defense attorney in a separate case filed an ethics complaint against Hooks. The attorney, who was representing an Arab-American man, claimed that in a closed-door session between Hooks, prosecutors, and the attorney himself, Hooks made disparaging and racist comments about Arab men. That complaint prompted Hooks’s recusal from that case. It then got worse for Hooks. The prosecutors who were present during the alleged remarks claimed that Hooks contacted them to urge them to vouch for him. That’s a pretty bright line that judges can’t cross. Hooks was suspended, and he eventually retired.
Hooks already had a reputation as a bomb thrower, and in particular for his skepticism of police and prosecutors (of course, this is Chicago, so there’s good reason for that). He’s also known for being comparatively sympathetic to the accused. In 2018, Hooks was accused of creating a hostile working environment for reprimanding another judge over what he saw as her overly deferential treatment of police officers who lie on the stand. He was ordered to take anger management classes. Hooks was also criticized for scolding a prosecutor during the trials of police officers accused of torturing suspects under the direction of the notorious detective Jon Burge. Hooks accused the prosecutor of acting more like a defense attorney for the officers he was supposed to be prosecuting.
In other words, Hooks has made some enemies. That said, the comments he’s alleged to have made about Arab men are ugly and obviously of a distinctly different character than his outbursts of anger and frustration over police abuse and corruption.
Hooks has denied the racism allegations. I’m not really in a position to have an informed opinion on their veracity. But for the purpose of this discussion, the bottom line is that Hooks is no longer a judge. And at the time of his retirement, the state had a pending motion asking him to reconsider his ruling on firearms analysis in Winfield.
Judges rarely grant motions to reconsider. There’s a good reason for this — you’re asking the same judge who just ruled against you to second-guess themselves. You’re basically saying, “Maybe you didn’t fully think this through.”
But in Winfield, the state was asking a new judge to reconsider a groundbreaking ruling by a controversial now-former judge. Moreover, she was being asked to reconsider a ruling that, were it to be adopted by other courts, could call thousands of convictions into question. It would also make it more difficult for the state to win convictions going forward, at least in that particular court, and it was a ruling that could be cited by defense attorneys around the country.
That new judge, Jennifer Coleman, also happens to be a former prosecutor. In 2021, Coleman resigned from the Cook County State’s Attorney’s Office in the wake of criticism of how the office handled the police killing of 13-year-old Adam Toledo. Coleman was State’s Attorney Kim Foxx’s second in command at the time, and her resignation was widely seen as her taking a fall for the office. She was then appointed to the bench in March 2023.
Coleman ruled on the state’s motion to reconsider last month. She vacated Hooks’s historic opinion and replaced it with her own, allowing the state’s analyst to testify.
Coleman’s opinion does bar the state’s firearms analyst from using the phrase “reasonable degree of scientific certainty.” Otherwise, the two opinions couldn’t be more different. You’d also be hard-pressed to find a better example of the perils of asking judges to be the gatekeepers of good and bad science. Hooks’s opinion is an in-depth, science-steeped analysis of the dearth of scientific research to support firearms analysis and the lack of sufficient proficiency testing, and it includes a thorough discussion of the critiques of the field by scientific bodies. I suspect most firearms analysts would disagree with his conclusions — and most critics of the field from the world of science would agree with them. But what’s clear is that he at least wrestles with the science itself.
Coleman’s opinion, by contrast, reads like every other opinion we see when a judge rejects one of these challenges. Which is to say that it isn’t a scientific analysis, but a legal one. Instead of citing scientific studies, Coleman cites case law. She points out that courts all over the country have repeatedly upheld the validity of forensic firearms analysis in thousands of cases. But she largely avoids engaging with whether those prior opinions were right or wrong. She finds the case law so overwhelming, she argues, that her predecessor shouldn’t even have granted a hearing on the validity of such testimony, much less ruled for the defense in the wake of that hearing.
Coleman’s reasoning mirrors the gap between law and science we see over and over in these cases. While the law strives for consistency and predictability, science is constantly changing as we acquire new information. Those clashing priorities create huge problems when the two fields intersect. Under Coleman’s opinion, so long as there’s sufficient case law supporting it, no Illinois court short of the state supreme court can reconsider the reliability of any field of expertise, no matter how much scientific research shows it to be nonsense.
When Coleman does discuss actual science, she falls into the same traps that have ensnared judges in these cases for decades. Every state in the country applies one of two standards (Frye or Daubert) to assess expert testimony. Under both standards, expert testimony must be generally accepted within the “relevant scientific community.” That phrase has become critical, because these challenges almost always turn on how the courts define “relevant scientific community.”
If you limit the “relevant scientific community” to practitioners in the very field that’s under challenge, those practitioners tend to overwhelmingly believe their own field is scientifically sound. This of course isn’t surprising. If you set out to assess the legitimacy of palm reading, but limit the “relevant scientific community” for your assessment to people who read palms, you’re going to get near-unanimous agreement that palm reading is legitimate.
This seems obvious — no well-meaning person would practice a field they know to be illegitimate, and therefore the practitioners of a field facing scrutiny are not the most reliable sources to consult about that field’s legitimacy.
But the courts have consistently ruled the opposite, that merely being a practitioner in a given field of forensics confers the ability to objectively evaluate the reliability of that field. In fact, some courts (and many, many prosecutors) have gone further, arguing that only practitioners are capable of assessing a Frye or Daubert challenge to their own field.
The cognitive scientists, statisticians and others who critique dubious forensics have never done their own bullet comparison or bitemark analysis, the argument goes, so who are they to say if these fields are legitimate?
As the authors of an editorial in Scientific American pointed out a few years ago, this line of argument reveals a fundamental misunderstanding of how the scientific method works.
In the courts, firearms examiners present themselves as experts. Indeed, they do possess the expertise of a practitioner in the application of forensic techniques, much as a physician is a practitioner of medical tools such as drugs or vaccines. But there is a key distinction between this form of expertise and that of a researcher, who is professionally trained in experimental design, statistics and the scientific method; who manipulates inputs and measures outputs to confirm that the techniques are valid. Both forms of expertise have value, but for different purposes. If you need a COVID vaccine, the nurse has the right form of expertise. By contrast, if you want to know whether the vaccine is effective, you don’t ask the nurse; you ask research scientists who understand how it was created and tested.
We can actually test this theory. There are some fields of expertise once used in court that have now been universally discredited. We can look at how the courts treated challenges to those fields back when they were still accepted. One example is “voiceprint analysis,” or the claim that an analyst could match a voice on a recording to a specific person. And the case law here shows exactly what we’d expect. When the courts limited the “relevant scientific community” of these fields to practitioners, they overwhelmingly allowed the testimony. When judges listened to scientists from outside of the field, they were far more likely to bar or restrict the testimony.
In the Winfield case, the defense cited or included statements from research scientists, statisticians, psychologists, and cognitive scientists. These are people who specialize in cognitive bias, designing and administering proficiency tests, and designing systems with incentive structures that encourage objective, unbiased analysis. Judge Hooks defined the “relevant scientific community” broadly enough to consider these experts’ opinions. And they of course found the field of forensic firearms analysis to be lacking.
The state presented just two experts, both of whom work in the field of forensic firearms analysis. And in vacating Hooks’s ruling, Judge Coleman had no problem limiting the “relevant scientific community” to those two experts.
. . the prior judge, citing the Defendant’s Brief as support, rejected the State’s evidence simply because the witnesses also work, and make a living, in the relevant field. In so doing, he inaccurately defined the relevant scientific [sic] accordingly . . .
He . . . stated that ” … reliance on scientific practice, and the opinions of law enforcement agencies cannot carry the day under Frye.” No legal authority justifies this definition. Indeed, the relevant scientific community may be made up solely of forensic scientists.
Recently, a federal district court addressed whether the relevant community of ballistics examiners may be limited to those whose professional standing and financial livelihoods depend on the discipline. United States v. Graham, 4:23-cr-00006 (WD Va., 2024). The court found that the relevant community is indeed made up of firearm-toolmark examiners. Graham reasoned that the “overall scientific community” lacked the intellectual rigor that characterizes the practice of an expert in the field.
So under Coleman’s ruling — and under the federal case she cited, and indeed under most current case law around the country — the courts can determine the legitimacy of entire fields of forensics that have soundly been rejected by scientific bodies like the National Academies or the President’s Council of Advisors on Science and Technology by simply disregarding those bodies and considering only the opinions of experts who practice and make their living in those fields. (It’s worth noting that all but a few fields of forensics were developed in a law enforcement setting, not in a scientific one, and thus were never really grounded in science to begin with.)
The iconic defense attorney and civil rights activist Stephen Bright recently co-wrote a book with James Kwak called The Fear of Too Much Justice. The book’s premise is that our criminal legal system has become so reliant on systemic injustice that any real attempt to ensure truly just outcomes would grind it all to a halt. Put another way, politicians, attorneys, and public officials have become inured to the cruelty and unfairness of this system because it’s just too difficult to do anything about it. It’s a slog just to get the courts to acknowledge error in an individual case. It’s all but impossible to get them to acknowledge bigger, more consequential, more systemic problems.
Coleman’s reversal of Hooks’s ruling is an apt example of the Bright/Kwak book’s thesis. To let stand a ruling that there’s no science behind the idea that a particular bullet can be matched to a particular gun could reopen tens of thousands of convictions. It would make future gun prosecutions and convictions much more difficult. But perhaps more importantly, it would require the courts to concede that a common field of analysis used in courts around the country hundreds of times per day is, in fact, bullshit. It would raise profound questions about the legitimacy of how the courts have been evaluating expert testimony for decades. And that would call into question the legitimacy of the courts themselves.
It would also be the correct ruling, so any large-scale reevaluation such a ruling might prompt would ultimately make the courts more legitimate, not less. But that could be a long, painful, and expensive process. So it’s far easier to continue polishing the system’s veneer of legitimacy — to continue citing to wrongheaded case law and pretending everything is fine.
Judge Hooks’s ruling was historic and groundbreaking. But it was also a small victory in a single criminal court in a single city. To get even that, it took a uniquely well-funded, well-staffed public defender office with the resources to allow one of its attorneys to specialize in this area of forensics and mount a credible challenge. It also took a judge with a history of skepticism of police and prosecutors and a willingness to question the core legitimacy of the system in which he built his career and professional identity. Unfortunately, in his case, that same subversive, contrarian nature also won him plenty of enemies — and allegedly manifested in uglier ways. And that ultimately allowed his opinion to be retracted.
Our system is designed to punish and incarcerate. Because of that, when there are errors that benefit the state — even clear, egregious errors — it takes years, sometimes decades, to fix them. Meanwhile, the smallest, most limited, and hardest-fought wins — even righteous ones — can be erased in the blink of an eye.
This piece was first published at The Watch.