Jonathan Moreno-Medina, Bocar Ba, Aurelie Ouss, and Patrick Bayer speaking on Officer-Involved Shootings: The Media Language of Police Shootings.
The Hoover Institution hosts a seminar series on Using Text as Data in Policy Analysis, co-organized by Steven J. Davis and Justin Grimmer. These seminars will feature applications of natural language processing, structured human readings, and machine learning methods to text as data to examine policy issues in economics, history, national security, political science, and other fields.
Our 19th meeting features a conversation with Jonathan Moreno-Medina, Bocar Ba, Aurelie Ouss, and Patrick Bayer on Officer-Involved Shootings: The Media Language of Police Shootings on Thursday, May 25, 2023 from 9:00AM – 10:30AM PT.
>> Justin Grimmer: Hello everyone, and welcome to the Hoover Institution workshop on using text as data in policy analysis. In this workshop, we're gonna feature applications of natural language processing, structured human readings, and machine learning methods to text as data to examine policy issues in economics, history, national security, political science, and really any other related social science field.
I'm Justin Grimmer, I co-organize the workshop with Steve Davis, who unfortunately can't be here today because he's traveling. We are thrilled to have Jonathan Moreno-Medina here today from the University of Texas at San Antonio, presenting his work, Officer-Involved: The Media Language of Police Killings. Just some quick ground rules, Jonathan's gonna speak for 30 to 40 minutes.
If you have questions, please place them in the Q&A feature. If the question's really pressing, I might interject, but otherwise, after that 30 to 40 minutes, either I'll ask the question for you or I might recognize you and have you ask your question live. After about an hour of the presentation, we're gonna turn the recording off and we're gonna go to a more informal Q&A session to ask more nuts and bolts style questions.
And so with that, Jonathan, take it away.
>> Jonathan Moreno-Medina: All right, thank you so much, Justin, and thank you so much for the invitation, I'm very excited about presenting this project to y'all. So I'm gonna share my screen and please let me know if everything is looking good on your end, all right?
Well, again, I'm very excited about being here today. I'm gonna talk about the project called Officer-Involved: The Media Language of Police Killings. This is joint work with Aurelie Ouss, Bocar Ba, and Pat Bayer. Aurelie is at the University of Pennsylvania, and Bocar and Pat are at Duke University, all right?
So, let me start with this motivating plot: about 1,000 civilians are killed by the police each year in the US. The number has been fairly stable for the last few years, and there has been quite a big debate in policy and societal discussions about police violence in general, and how we should think about trade-offs in this area and so on.
But for all of the discussions that we've had on these issues, I think it's really important to consider what is the depiction of these killings in the media, as that is the main way in which people figure out and find out that these events have taken place. So, how does the media cover police killings?
And it's really important to think about language as a choice. So, the media is gonna have a choice in terms of how does it present specific incidents that involve police violence. And so here is one example in which you might present a story. You might say a police officer shot and killed a civilian, this is one version of presenting this story.
But now consider an alternative version of presenting the same story or incident in which you say a civilian died in an officer involved shooting. So, just think about how you feel and what type of mental image does each one of these sentences evoke. It's not necessarily clear that both are conveying the same mental image.
So, this is what we wanna get at: these differential ways of presenting these stories. Here are a few examples of how this has been presented on TV. When the police kill a civilian, sentences like "fatal officer-involved shooting" are very common.
"CPD calls officer-involved shooting justified," right? So, this notion of officer-involved shooting is quite prevalent. We're gonna have a little bit more to say about these types of incidents. But on the other hand, one possible benchmark for how this depiction of homicides occurs is to ask: how does the media cover homicides that involve only civilians, with no police involved?
So, here are a few cases of civilian violence or civilian killings. In the first one, you see that the language is quite direct and explicit; it says who did what to whom in a very direct fashion, which in the examples on the left was particularly rare. Of course, these are incidental or observational differences that do not necessarily imply that there's something systematically different between the coverage of police versus civilian killings in the media.
So, that's part of what we're actually trying to get at, and we can actually see if this is systematic or just anecdotal. We're gonna do four main things in this paper. The first is that we're gonna propose a method to measure obfuscation.
When we think about obfuscation, it's essentially the idea that you're covering a specific event, or transmitting information about an event, but in language that doesn't cover all the details, leaves some interpretation up in the air, and gives a little bit of a blurry picture of what has transpired.
And we're gonna build this method using both linguistic theory and natural language processing, and we're gonna merge the two together to propose a method to measure this type of obfuscation. The second is that then we're gonna move on and measure to what extent we see differential obfuscation in the coverage of police killings in the US, and compare it with similar civilian killings.
And similar here is in quotes, and I'll have again a lot to say about what similar means in our context. The third thing that we're gonna do is then, once we figure out to whatever extent we find that there's differential use of obfuscation in police versus civilian killings, the natural question is to ask if this actually matters at all for how these events are perceived by an audience member, right?
So, what we're gonna do is directly evaluate whether obfuscation has an effect on the assessment of moral responsibility, the demand for penalties, or preferences for police reform on the side of a person reading two versions of the same story. Lastly, we're gonna go back to our analysis of coverage at the national level, and we're gonna further elaborate on heterogeneity in obfuscation based on our experimental results.
Our experimental results are gonna suggest that a particular type of incident might be more likely to show an effect of obfuscation on how people perceive it. And we're gonna go back to the observational data and see what we find along that dimension. So in terms of a preview of the results, on the media analysis front, we're gonna find that obfuscation is more frequent in news stories about police killings.
And we find that there's more obfuscation if the perpetrator is an officer versus a civilian, about 25% more in any category of obfuscation. But also that obfuscation is particularly large as compared to civilian killings in headline sentences or in the first sentence that informs on this incident, right?
So it is well known that the headline sentence carries an important weight on how people remember stories, and the attention that they pay to these sentences is usually larger than in other ones. And this is where we find an even larger difference. When we move to the online experiment, we are gonna find that obfuscation indeed influences people's perception of events.
If you present a story with more obfuscation in it to a responder, that responder is less likely to think that the officer is responsible and demand for penalties on that side. They're also less likely to donate to police reform organizations, and this is going to be especially true if the victim appears to be unarmed in the description of the events.
Right, let me jump now into the bulk of the paper. So when we talk about obfuscation, it's gonna be important to lay out a framework for how to think about obfuscation, and here we're gonna consider different dimensions, possible dimensions of obfuscation. So Toolan, who's a linguist, presents four dimensions of obfuscation based on semantics or the meaning or representation that is conveyed through language to somebody receiving that information.
So on the one hand, we can think of a benchmark in terms of the active voice. So here you would have a sentence that is strong, direct, and clear in tone, so an example would be, in our setup, a police officer killed a person, right? So you know who did what action here in blue police officer, what was that action?
Here underscored is the verb kill, and who was the person who was affected by that action? A person, right? In red. Now, we can think of a possible increment on the level of obfuscation and is to use the passive voice, so in the passive voice, you're gonna background the agent.
So an example would be, a person was killed by a police officer. Now whoever did the action is in the background of the sentence, and that's gonna be the first transformation of the first sentence. A further increase in obfuscation would be to use some form of nominalization.
Now, nominalization means transforming an action into a noun. So, for example, you can say a person was killed in an officer-involved shooting, right? The act of shooting somebody now becomes a noun, and you can describe that noun with some form of adjective, here, officer-involved. So you're compressing a lot of this action by making it a noun and only talking about an officer-involved shooting.
Now, notice that in this depiction, a person was killed in an officer-involved shooting, the role of the police officer is muddled. It's not very clear if it was the officer who did the killing or not, right? You just know that there was an officer present while this action took place.
Then we can think of, again, another increment along that obfuscation dimension, which is to further transform the passive version of the sentence and drop the agent completely. By agent we mean whoever did the action, right? That is called the causal agent. So compare the sentence under the number one header, in the passive voice: it says a person was killed by a police officer.
Now, it is completely grammatically correct to drop whoever did the action from the sentence, so you can say a person was killed, right? And again, it's important to note that none of these versions of these sentences are lying. They're still depicting, in some sense, the truth of what transpired, but with different degrees of completeness of the information.
Now, lastly, you can change the verb that you're using and pass from a verb like kill to an intransitive verb like die. It has been studied in linguistics and in moral psychology and cognition that verbs like die, being intransitive, are not causal, right?
They're not causal verbs. Meaning, when you say that somebody died, that doesn't imply that there was a cause for it, right? So here you can say a person died, and that seems to present a mental image that is quite different from the other three,
where there is an implied cause of the event that took place. So again, the degree of obfuscation is increasing as you go down this list of possible sentences. And we're gonna also group the last two, the passive with no agent plus the intransitive verb use, into a "no explicit agent" category, just to talk about the most obfuscatory type of sentences.
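To make the taxonomy concrete, here is a toy, surface-pattern classifier over the four example sentences. This is purely illustrative, not the paper's BERT-based pipeline; the cue phrases are assumptions that only cover these specific examples.

```python
import re

# Toy classifier for the example sentences above; the real detection uses
# dependency parsing and semantic role labeling, not string cues.
def classify_obfuscation(sentence):
    s = sentence.lower()
    if re.search(r"\bdied\b", s):
        return "intransitive"            # "a person died ..."
    if "officer-involved shooting" in s or "officer involved shooting" in s:
        return "nominalization"          # "... in an officer-involved shooting"
    if re.search(r"\bwas (killed|shot)\b", s):
        if "by a police officer" in s:
            return "passive"             # agent backgrounded but present
        return "passive, no agent"       # agent dropped entirely
    return "active"                      # "a police officer killed a person"
```

Note the ordering: the intransitive and nominalization checks come first, so the most obfuscatory cue wins when several appear in one sentence.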
So then we're gonna move on to the news and see to what extent we find different degrees of usage of one versus the other type. The data comes from a couple of main sources. The first, on the police killing side, covers the years 2013 to 2019, using the Mapping Police Violence database for this period of time.
It has about 7,000 victims, and for those who don't know, Mapping Police Violence (MPV) is a research collective that collects and verifies data from other datasets on police killings, like Fatal Encounters. It includes information on the name, race, age, location, and so on of the victim, plus some other characteristics, like an indicator for whether the victim was reportedly unarmed, and whether a body camera was present.
And the accuracy of this database has been double-checked against other studies, and it's fairly good. On the civilian side, we're gonna take data from 2014 to 2018 from the Gun Violence Archive. This contains the quasi-universe of gun-related violent incidents in the US.
And again, we're gonna be able to see information on the name, age, and geolocation of the victim, and of the suspect in some cases. We're also gonna predict the probability that a person belongs to a specific racial group. Okay, now, the news story data comes from a couple of sources; the main one is gonna be News Exposure.
This is a data vendor that records the captions of most TV news stations across the country, along with some other information about ratings, station, affiliation, and so on. We're gonna match incidents to news stories as follows: we check whether a story contains the name of the victim, or the address plus a few key terms, and whether the story appears in the media within seven days of the homicide taking place.
So an example here for how we would match a specific incident to a story, and this is an example of what a story would be. Here we have, they would mention the suspect name is accused of killing, and then you have the victim name. Investigators say suspect name was driving Saturday night when he pulled out behind victims names, grandmother at an intersection, and so on.
And so here you have the description of the story; this is what we mean by a story. Now, we're gonna compare police killings with civilian killings, and I said something about similar civilian killings. So what do we mean by similar civilian killings? I think it's useful to think about our approach as paralleling the budget set approach in standard economics, but here it's a language budget set type of approach.
What we want to get at is: in which incidents of killings by civilians could the media have used the same language for the coverage as in the police instances, right? So, for example, transitive verbs like kill and shoot require knowing who did the action, who is the causal agent of that action.
We know that in the case of police killings, because it was the police, whereas the media might or might not know who the causal agent is, who did the killing, in a case of civilian killings. So in order to make a valid comparison group, comparing cases in which the media could have chosen among the same set of language options in depicting the story,
we're only gonna consider stories where the suspect appears explicitly somewhere in the story. And we're gonna have a bunch of different robustness checks in the paper for other types of samples, which actually go through. Another filter is that we're only gonna use gun deaths, which are about 90% of all police killings.
The news stories that we're gonna consider need to be classified as crime with relatively high probability, and we're only gonna consider sentences that refer to and convey information about the killing itself. So not every type of sentence is gonna be in the analysis, just the ones that inform on the killing.
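The sample filters just described can be summarized in a single predicate; the field names and the crime-probability threshold below are assumed for illustration, not taken from the paper.

```python
CRIME_PROB_THRESHOLD = 0.8  # "relatively high probability" -- assumed value

def in_sample(incident, story):
    """Illustrative combination of the sample filters described in the talk."""
    gun_death = incident["cause"] == "gun"            # ~90% of police killings
    crime_story = story["p_crime"] > CRIME_PROB_THRESHOLD
    # For civilian cases, the suspect must appear explicitly in the story.
    suspect_ok = incident["police_killing"] or story["suspect_named"]
    return gun_death and crime_story and suspect_ok
```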
So how are we gonna classify these different ways of presenting this story? So we're gonna use natural language processing, as I mentioned before, and we're mainly gonna use three different tasks and steps for our data. And we can get into the nitty gritty details in the Q&A or after the presentation, and I'll be happy to jump into the details.
The first step is gonna be to identify which news stories are about crime, so this is a text classification task; that's the first element. The second is gonna be to determine who did what action in a news story. That is gonna require us to do a coreference resolution task, which effectively means finding the different ways in which one agent can be referred to with different words, right?
So if I say, Jonathan Moreno is making a presentation, he's wearing a blue shirt, then "he" in that second sentence is Jonathan Moreno, right? We want to tie together all the ways in which the text can refer to the same individual by using coreference resolution. And the third task is gonna be identifying active and passive structures, nominalizations, and the use of transitive and intransitive verbs.
We're gonna define obfuscation in a sentence as measured by a dummy for each one of these categories. And the last part, in terms of natural language processing, is that we're gonna use semantic role labeling, which again implies that parsing of who did what to whom.
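As a toy illustration of the coreference step (the actual pipeline uses neural, BERT-era coreference models, not this heuristic), a crude last-seen-name rule for the "Jonathan Moreno ... he" example might look like:

```python
import re

# Toy heuristic: resolve "he"/"she" to the most recent capitalized
# name seen so far. Real coreference resolution is far more involved.
STOP = {"He", "She", "The", "A"}

def resolve_pronouns(text):
    last_name, out = None, []
    for tok in text.split():
        word = tok.strip(".,!?")
        if re.fullmatch(r"[A-Z][a-z]+", word) and word not in STOP:
            last_name = word
        if word in {"He", "he", "She", "she"} and last_name:
            out.append(tok.replace(word, last_name))
        else:
            out.append(tok)
    return " ".join(out)
```

This only picks up the most recent single capitalized token, so it resolves "He" to "Moreno" rather than "Jonathan Moreno"; enough to show the idea of tying mentions back to one agent.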
Okay, so that's, broadly speaking, what we're gonna do in the NLP space. And all of this uses large natural language processing models like BERT and the successors that everybody's been talking about as of late. Okay, so now, the empirical specification is gonna be relatively simple.
We're just gonna run a very simple regression with the measurement of obfuscation on the left-hand side as our outcome. On the right-hand side, we're gonna have an indicator for a specific homicide being a police killing or not. And we're gonna have a set of controls, the X's, that include the age, sex, and race of the victim, and so on.
We're also gonna include media market fixed effects, so we're gonna look at within media market variation in the coverage. And we're also gonna use potentially different ways of controlling for time trends or time fixed effects as well. So here the analysis is gonna be at the sentence level for individual i at time t in station s and media market d.
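Stripped of the controls and fixed effects, the coefficient on the police-killing indicator in this regression is just the difference in mean obfuscation rates between police and civilian stories; a minimal stdlib sketch of that bivariate case:

```python
# Univariate OLS slope: with a binary regressor, this equals the
# difference in group means (police minus civilian obfuscation rate).
def ols_slope(y, x):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# x = 1 for police-killing sentences, 0 for civilian; y = obfuscation dummy.
```

The paper's actual specification layers the victim controls and the media-market and time fixed effects on top of this basic comparison.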
So let me jump into the first set of results. So what we find is that there is more obfuscation where the police are the perpetrators of the killing than in the civilian cases. And so here, the way to interpret this figure is that this is in comparison to the average on the civilian side.
So overall, we find about 25% more obfuscation in the stories that are related to a police killing than in civilian killings. And when we focus on the more obfuscatory type of sentences, on the no explicit agent category that I mentioned before, that's about 20%, so slightly lower point estimate.
When we break that down by the different categories, the intransitive and the no-agent, again the two most obfuscatory forms, still appear more often than in civilian cases; nominalization and passive also appear much more than in the civilian cases. So overall, across the four dimensions that we're focusing on, we find that coverage of police killings uses more obfuscation than coverage of civilian ones.
Now, when we focus only on the first sentence in the story, which we're calling the headline, we find that these numbers are larger: more than 40% for any type of obfuscation, and closer to 50% for the no-explicit-agent category, which again is the most obfuscatory way of presenting these stories.
The intransitive is used even more, more than 60% more than in comparable civilian cases, and again, across the four dimensions, we see a very similar pattern. So this seems to point to the fact that these first sentences, which are critical to how people perceive the story, are the ones using more obfuscation.
I'm gonna transition now to talk about the experiment. And so, in the experiment, we wanted to find out if these differential semantic structures or different levels of obfuscation are affecting people's perceptions of these events. So, we run an online experiment, and we are trying to tackle three big questions.
We wanna know if these different semantic structures affect the perception of officer responsibility, demand for penalties for the officer, and perceptions of policing overall. So, we ran an online experiment with about 2,400 subjects recruited on Prolific, and we used a pre-registration document where our main hypotheses are laid out in advance.
Then for each one of these subjects, the respondents, we're gonna relate to them an incident in which a police officer killed a civilian. And we're gonna experimentally vary two dimensions of that experiment. First, we're gonna vary the narrative structure, the obfuscation measurements. And then we're also gonna vary whether or not we mention that the civilian was armed, okay, just to look at the differential responses with that variation.
So, just to give you an example of what the treatment arms are, we're gonna have as the benchmark, again, a version with the active voice. The active story presented to the respondent would be something like this: a police officer killed a 52-year-old man on Friday night, according to the police department.
An officer responded to a home near 21st Street and Avenue C for a report of domestic violence just before 9:30 p.m. As the officer arrived, he came into contact with a 52-year-old man; the police officer shot the man. The man was taken to a hospital, where he later died; no officer was hurt in the incident.
So, the two sentences in red appear in the active voice, and those are the main parts of the story that we're gonna switch across treatment arms. Just to give you an example, here is the passive version of the story, and now, in blue, you see what the transformation of the sentence was, right?
So, in the second case, we go from "the police officer shot the man" to "the man was shot by the police officer," and we're gonna have two more treatment arms. One is no agent plus nominalization, which implies, as the first sentence, "a 52-year-old man was killed in an officer-involved shooting on Friday night."
And the last one, using the intransitive, would be "a 52-year-old man died in an officer-involved shooting on Friday night," okay? So, these are the four different stories that we're gonna present to the respondents, and we're gonna see how the responses vary. And here we find that obfuscation decreases the perception of responsibility, which leads me to the second main set of results.
So again, this is the percent difference relative to the active sentence structure. If we present any type of obfuscation, we find about an eight-percentage-point lower probability of the respondent saying that they think the officer is morally responsible for the killing. When we use the most obfuscatory type of language, what we call the no-explicit-agent category, that number goes up to about 12%.
So, 12% less likely to say that the officer was morally responsible when we use even more obfuscation. When we focus on support for penalties, we find that any type of obfuscation again decreases the probability that the respondent says they would support any type of penalty by the police department, or any kind of penalty that is legal in nature as well.
And when we use the most obfuscatory form of these stories, that differential drop in support for penalties goes even higher, to about 7 to 8%. Lastly, we also have a question about donations to police reform organizations. We see the point estimate being lower for any of the obfuscation treatment arms than it is for the control, although it is not statistically significant at 5%.
But it is when we just focus on the most obfuscatory type of language. And again, this is a drop of about 5% in the donations that the respondent would give to police reform organizations. Okay, so now we can parse out the effects of obfuscation by whether we present the victim as explicitly having a weapon in the story or not.
So, we're gonna have the red bar representing the stories where we do not mention any weapon, and the blue bar the stories where we do mention a weapon. When we do not mention that the civilian has a weapon, you see that these effects of any type of obfuscation are larger in magnitude than when the civilian is presented as having had a weapon.
This is across the board, and we think it suggests that in scenarios in which the police might be on shakier ground to justify the killing, like when there's no depiction of the civilian having a weapon, that's when this kind of obfuscatory language has even more power, right?
Now, we also ask people to relate back to us what it is that they just read in the stories that we presented to them, okay? And then we classify whether they retell the story to us using the active voice or any type of obfuscation.
And what we find is that the mention of an explicit role for the police in the killing goes down as we increase the obfuscation measurements. For the most extreme type of obfuscation, which again is the intransitive plus nominalization, we see that individuals mention the explicit role of the police in the killing 16 percentage points less often.
They're also less likely to use the very clear active voice the more obfuscation we present to them. So if we present the story in the passive voice, they're 14 percentage points less likely to retell the story in the active voice. If we use the no agent plus nominalization, that's 27 percentage points less likely to use the active voice.
And if we use the intransitive plus nominalization, that's a 35-percentage-point lower probability of telling the story in the active voice. The flip side is the use of no explicit agent, which goes up as the obfuscation increases. So I think this is telling a story in which we need to think about the effects that media has on people's perceptions.
But not just only directly through the audience members, but how these audience members are retelling the story to their social networks and social circles, right? So there's an amplification effect here that might be important to consider as well, right? A few takeaways from our online experiment. We find that obfuscation indeed changes people's perception of these police killings, although the information can be thought about as the same.
And you might consider that there's only a difference on how it is presented. This actually affects how people perceive the incident and their preferences in the policy dimension as well. It also influences how a story is retold, so the effects can go beyond the viewers themselves. And these effects are particularly strong if we do not state that the civilian was armed, right?
So based on these results, given that obfuscation has stronger treatment effects when the civilian is unarmed, we ask what differential coverage we find in the observational data along those lines. So we go back to the observational data, and we find out what the differential obfuscation usage is in cases in which we find an armed description of the person getting killed,
and in cases in which there was a body-worn camera. What we're doing here is splitting the police killing incidents into two groups: ones that have a depiction of an armed civilian and ones that do not, according to MPV. And again, we're comparing with the same group of civilian killings.
And so here we see much larger obfuscation of any type when the civilian is perceived to be unarmed, and also more of the most obfuscatory type of language, than when the civilian is armed. In cases with body-worn cameras, again, we see the same pattern. So this is potentially pointing to a story where, the less justified the killing might be perceived to be on the side of the police,
the more likely we are to see even more obfuscation in the media portraying these events. Okay, so let me wrap up with a few takeaways from our paper. The first is that although the media can be slanted in filtering its coverage of certain events, and there's ample literature on that front,
or in the words used to describe them, where there's again a big literature on political slant or gendered language and so on, this paper contributes the idea that in certain instances we want to explicitly focus on semantic structures, and in these cases we can actually measure obfuscation.
Here we use the setup of police killings, but we believe that this framework for thinking about obfuscation in general, about how specific actions or events are described, is applicable to other dimensions and areas as well. The second is that natural language processing (NLP) techniques can be a powerful tool to expand the tool set that we have available as economists
and social researchers to understand political outcomes, firm performance, and firm exposure to political, social, and climate risk. Again, there's ample literature on this front. What we think we're adding here is that we can combine these powerful natural language processing tools with explicit linguistic frameworks.
Linguists have been thinking about some of these issues for a very long time. And sometimes they present a very nice implementable framework in which you can take their insights into the analysis that you want to do by merging together the NLP tools. And this provides testable hypotheses that, for the most part, have been a little bit lacking in the linguistics literature.
The third is that media coverage of crime and criminal justice can affect perceptions of crime, policing behavior, and housing markets. There's again ample work in this area, but here we think we're contributing by showing that these semantic differences in coverage can influence people's perception of events,
which in turn can affect preferences for policies, in particular for crime and police policies in our setup, right? So that's it. Thank you so much for your time. I'll be happy to take your questions and comments and hopefully have a great discussion.
>> Justin Grimmer: Thank you, so I'll kick off some questions and then we'll open it up to the floor.
Okay, so I have an overarching question that I think intersects with both the observational NLP work and the experiment, but I'll break it apart. I wanna think about what obfuscation is. One way to think about it is that it's a tactic to obscure who's responsible for an event.
Another way to think about it, though, is that maybe it's a tactic to ensure accuracy in reporting. So, for example, in your experiment, it could be very unclear who's responsible or exactly what happened. And as a reporter or news agency, they don't want to make an attribution without knowing all of the facts.
And so, for example, I would be curious if there is an evolution in language in reporting around a particular police killing. And of course, that requires there to be many stories per police killing, but one might then ask, as the facts become more clear or body camera footage is released.
And if it's the case that the shooting was, for example, in self-defense by the officer, do they continue to use the obfuscation-like language in order to, perhaps appropriately, ensure the officer isn't being accused of murder? In contrast to that, it was just the anniversary of the George Floyd murder, and that is correctly classified as murder, which is even more clarity in the language than even "killed," right.
So that not only attributes that the action was taken, but that it was inappropriate and criminal. So I'm curious if you're able to get at that sort of trend in language, perhaps across stories, and then I'll ask about the experiments as well.
>> Jonathan Moreno-Medina: Yeah, well, thank you for that, Justin, those are great questions.
So there's something that maybe I should be a little bit more explicit about here when we're thinking about our paper. We are focusing on documenting the extent to which this differential coverage is occurring and asking whether it matters or not. But again, as you correctly point out, there's a bit of a hole there when we think about what is causing this differential obfuscation.
So as to the why, we have a few hypotheses. One of them is the one that you point to, that it's a tactic for not presenting very clearly who did what action, right. And that might have to do with incentives on the side of the police, or it might have to do with incentives on the side of the journalists.
And there are several dimensions there that we're trying to think through a little bit more thoroughly as we progress, but we haven't landed one way or another. Now, there's the idea that there might be issues surrounding the type of language you wanna use just to cover yourself from potential liability or legal action on the part of any participant over how you portrayed the story.
I think that's a good observation, but I believe that the Associated Press, for example, has specific guidelines as to how to cover yourself from these potential issues while still presenting the story as we would expect it to be presented. So, for example, you can say according to or supposedly, and use these types of add-ons to cover yourself from potential issues on that front.
And we're not doing anything in that regard, meaning, even if they are or are not using this cushioning type of language against legal action, that should still appear in our differential usage. But I agree that this is something that is potentially quite important on the side of journalists, and it's something that we haven't done a lot on.
And I agree that it would be very interesting to look at how the presentation of each story evolves over time. Because we're using a seven-day window, we're gonna be a little bit more limited in following the whole arc of the story, especially if it goes viral.
But at least within those seven days, I think there's room to explore what you're suggesting, which I think is a very good idea, thanks.
>> Justin Grimmer: So yeah, I mean, it's a super fascinating question, and certainly it's not meant to undermine the analysis at all, which is spectacular, yeah, it's sort of more.
>> Jonathan Moreno-Medina: No, absolutely, I got that, I took that comment very well.
>> Justin Grimmer: Yeah, okay, cool. So on the experiment, I actually think there's a related interpretation of the experimental results, which is fascinating. Which is that individuals view the obfuscatory language as an indication that the facts remain unclear.
So if individuals have sort of read newspapers, they know that when newspapers are clear, when they use verbs like murder or kill, the facts of the matter are decided. And even if I don't read the rest, I know it's appropriate for the newspaper to make that point because they know what happened.
In contrast, if it's this intransitive, nominalized version of the sentence, perhaps that implies that what happened in this situation is still ambiguous, no one really knows. And I think that would be consistent with the results. Again, it does not undermine the finding, but offers, I think, a slightly different interpretation of what's going on.
>> Jonathan Moreno-Medina: Yeah, no, I agree, that's definitely one possible mechanism as to why we're finding what we're finding. Interestingly enough, I think that would explain the differential response by people reading about these events. Whereas there's still the issue of the journalists themselves choosing one version of the sentence versus another, right, and whether they actually know that this is how it's gonna be read by the audience.
But I agree with you, that's a possible interpretation as to how language is operating.
>> Justin Grimmer: Okay, so we're gonna open up to the floor a little bit, so can we let Erin Carter ask her question, please?
>> Erin Baggott Carter: Thank you, this is really fascinating and obviously a very important presentation.
So I was curious, just at the very beginning, about some of the details of the text corpus construction and the text analysis. So, one thing, starting at the very beginning: the terms that you used to construct the corpus were very specific and also present tense, versus something like shot, for example, in the past tense, or murder, murdered.
Those are also more sort of charged and have more direct attribution, so I was wondering if you ran any sort of validation checks with different terms or different methods of constructing the corpus. And then, relatedly, the question that I put in the chat was whether you standardize by the number of sentences.
For example, one form of bias might be that there are fewer sentences about police killings relative to more sentences about civilians killing police, or some sort of variation like that, or by the length of the sentence, if they're particularly terse when covering certain events. So curious about those sorts of things, thank you.
>> Jonathan Moreno-Medina: Yeah, thank you so much, Erin, great questions regarding the text corpus. So I gave kind of a quick rundown of the keywords that are being used in the paper. We have the full list; it's a little bit more extensive than the one that I presented.
So we were using essentially different tense versions of the verb, it can be present or it can be past. And so, yeah, we're trying to essentially match a story that we know refers to the incident that we know took place, and thinking about all the key possible words that can be attached there.
But then we do the further filtering of running a text classification algorithm just to double-check that the matched text we have actually refers to a story about a killing and not something else. But, yeah, that's a good observation. The second one related to the number of sentences or sentence length.
That's something that we actually haven't done. But I think it's a great idea, as there's a possibility that different ways of presenting the story would occupy more space, for example. So if you think that journalists are gonna be space-constrained in the presentation of the story and wanna make it as concise as possible, maybe that matters.
So I agree that doing some analysis on the number of sentences or sentence length is something that we definitely would like to do but haven't done, but that's a great suggestion. Thank you.
>> Justin Grimmer: Okay, could we let Elizabeth Elder ask her question, please?
>> Elizabeth Elder: Great, thank you, this is really, really interesting.
I had a question kind of related to what Justin mentioned on what is causing journalists to adopt one form of language versus another. And in particular, something I think I've noticed a lot of discussion about in the last couple years, is the extent to which journalists kind of uncritically adopt language that police departments use when describing these incidents versus kind of independently deciding how to characterize them.
So I'm curious the extent to which the media story about a killing maps on to the language that's used in a police release or report of what happened. And it seems like that would be something that's possible to test empirically, and maybe something that would be a separate paper would be whether that's changing over time.
It seems like journalists are maybe trying to be a little bit more cognizant about this. Maybe there would be more divergence, or less overlap, in the level of obfuscation as this has become more of a topic of discussion among journalists.
>> Jonathan Moreno-Medina: Yeah, thank you so much, Elizabeth. Well, first thing that I would like to say is that you are completely attuned to our referees.
So that's exactly the comment that we got, and I think it's a very good one, to essentially get a little bit beyond what we're doing now and try to find out the source of this language. So, yeah, we're in the process of doing that analysis, trying to map it onto the way that the police themselves are presenting these stories.
Now, whether they do or not, that again has fascinating potential mechanisms as to why a journalist would adopt the same language that they see from police departments or not. Essentially, they're in a repeated-game sort of interaction, where the journalists are gonna get the scoop on the story if the police department actually lets them know that an event took place.
So they might prefer not to upset the police officers and maybe adopt the same way that they present the story, or it might be something else. But I think that the issue of kind of this momentum, in which journalists are using specific language without special consideration as to what the language is, is something that has been shown in other papers.
So, for example, Milena Djourelova has shown that when the AP changed its guidelines on how to talk about undocumented immigrants, that fed into how people talk about these stories and had an effect on people's perceptions of immigrants. So I think it's a very interesting possibility that we're in the process of exploring.
But thank you.
>> Elizabeth Elder: Thank you.
>> Justin Grimmer: I'm gonna follow up with a question, and then we have some more questions in the chat. The comparison in the paper is this really nice comparison between police killings and civilian action. I'm curious, what are the characteristics of police killings that get active voice?
What goes on in those incidents? And relatedly, I know that you're including all the fixed effects that we wanna see for particular news agencies, stuff like that, right? But I'm also very interested in whether there's, like, a political decision agencies are making. And so just to put that on a spectrum, you could imagine every police killing being called a murder; that's, perhaps, how far-left outlets may call it.
You can imagine the other way, an outlet that gives a presumption that every police killing is 100% justified every time. And you might put that on the other end of the spectrum. And so you might think that there's some uncovering of politics there, very much related to what you were talking about with the repeated game and adopting police language.
>> Jonathan Moreno-Medina: Yeah, so two things that I wanna say. The first is that we do some analysis in terms of the differential obfuscation across other dimensions, some of which you mentioned. So, for example, we see that Democrat-leaning media markets are using slightly more obfuscation than the more right-leaning ones.
We believe that, again, fits with the story of accountability: wherever accountability pressure on police departments is higher, that's where we see more obfuscation being used, as in the case of the victim being unarmed or there being body-worn cameras. So if there is more accountability in more Democrat-leaning media markets, that would fit.
But yeah, we also explore what happens, for example, with national-level TV stations across the political spectrum, how that differential obfuscation is occurring. And we don't see any big patterns, to be frank. So this obfuscation, perhaps it's a little bit more muted around the middle, but it's nothing that is very, very clear.
We have some plots in the paper showing that essentially MSNBC and Fox News have the highest levels and the centrist TV stations have less differential obfuscation. So that's one part of the response. The second is that I think you point to something that we haven't done that is a very good idea, which is to try to get a sense of language regarding moral responsibility directly in how the story is presented, and measuring that differentially across TV stations.
And that's something that we haven't done. But you point to the question of whether the word murder is mentioned in the story or not, and maybe we can track that. And there might be different ways of doing that, but we haven't done it. That's a great idea, thanks.
>> Justin Grimmer: Excellent, okay, so we have a couple questions here in the Q&A to close out. Can we give David Gibbons a chance to ask his question, please?
>> David Gibbons: Yeah, a number of years ago, I observed that the LA Times injected into all their articles about police shootings some language implying that an investigation would be conducted to determine whether the police shooting was justified, as if the default was that it was unjustified.
In other words, the police officer was deemed guilty and we were required to judge him innocent. Is that the opposite of what your study has shown?
>> Jonathan Moreno-Medina: So I'm not exactly sure I understand the question, but we are not talking about anything related to the justification of the killing, right?
Whether the killing is justified or not, that's outside the purview of what we're doing. We're just focusing on how it's presented in the media. Yeah, so that would be the first thing that I would say. Now, I'm not exactly sure I understood how you would map that onto the articles that you mentioned.
>> David Gibbons: Well, the articles, regardless of how the titles and the subtitles were generated, made it sound as if the default was that the police shooting was unjustified, and they were investigating whether it could be justified; that was the default.
>> Jonathan Moreno-Medina: So I actually haven't seen that article.
It would be great if maybe you can send me that information.
>> David Gibbons: This is ongoing.
>> Jonathan Moreno-Medina: Okay, yeah, so I wasn't aware, and that's something that we actually haven't looked at. I think that falls into what Justin was suggesting before, to think about the justification language embedded in the story being presented; that's something that we haven't really captured.
But I agree, that would be quite fascinating to take a look at, yeah.
>> Justin Grimmer: Okay, so I think we have time for one last substantive question before we close it out. So Eric Pescinski, can we give him the chance to ask his question?
>> Eric: Well, sure, thank you.
I appreciate the study that you did and talking about it in general. I was just wondering, there is the fact that law enforcement is granted the lawful authority to use force up to and including deadly force. And maybe that is leading to some of the obfuscation that your baseline is looking at, right, as different from civilian encounters.
Where, I mean, all the examples that you provided, somebody followed somebody from the bar in their car and honked their horn and got into a shooting. It's like, okay, well, I think it's a little tough to try and compare those two and say, here is a law enforcement officer-involved shooting versus a civilian shooting.
So I think it's a little tough to try and compare those, especially when the underlying premise is that law enforcement does have that lawful authority in every state to use force. They're granted that authority. And so there is an underlying presumption that the officer is using force in their duty, in their capacity as law enforcement.
And combining that with some of the points brought up earlier about making sure that the media outlets have their facts first before jumping to a conclusion or assigning blame or justification or whatever might go along with that.
>> Jonathan Moreno-Medina: Yeah.
>> Eric: And the fact that those law enforcement shootings are generally investigated at a much higher level because of the public scrutiny that goes along with them,
and so many of the police accountability and transparency policies, laws, and guidelines that have been rolled out.
>> Jonathan Moreno-Medina: Yeah, so I appreciate that comment, and I agree with the sentiment of your comment insofar as we're not discounting the fact that the police have the presumed legal authority to engage in these actions.
Now, I think the important element and the underlying assumption that we are making is that the way that you present the story can be the same across the two types. Whether a person in uniform, a police officer, shot and killed somebody, or it was a civilian, the physical action, the event, is the same, right?
So you, as a journalist, can use both types of sentences to describe what happened. Now, there's the issue of how this is perceived, perhaps, by the media as being justified or not. But the underlying point is that the media could have used both types of sentences, right? One in which you say person A died versus one in which you say person A killed person B, right, both in the civilian and in the police killing cases.
So that is the underlying assumption that we're making, that the media could have used both types of sentences. And then we're just gonna document the extent to which they use one or the other. And I think a fascinating possibility is whether using one type is conveying something about the implied justification of the killing or not.
But that doesn't detract from the fact that they could have used the other type of sentence, if that makes sense.
>> Eric: Well, I would agree with absolutely what you said there. I think just the converse of everything that you just presented would be if you take that officer-involved shooting and use it solely in the active voice, right, just as that comparison, right?
The premise was that by obfuscating this and minimizing some of that through all the different language that's used, there are repercussions, right? There are fewer donations to police accountability, different things like that. Well, the converse would be true: if that language was always used in the active voice, then there is an inherent implication that there was wrongdoing or fault by the officer.
And I think it's about bifurcating that out; you kept using moral responsibility, right? And that's a whole different conversation, moral responsibility versus legal responsibility. And that does have something to do with it, right? I mean, you can get into all of that with police accountability or anything along those lines; it's what's morally correct or justifiable versus what's legally correct.
And with our legal system, it's based on legally correct, not necessarily morally correct.
>> Jonathan Moreno-Medina: Yeah, so I agree with you. And that gets at the why question, where we're not falling on one side versus the other. But we are fairly confident in the results in terms of what we're finding, the differential obfuscation, and in showing evidence that this does matter.
Now, those two things, I don't believe, were as clearly established before, and we think that's where our main contribution lies. As to the why, which I think relates to what you're talking about, that's a little bit more of an open-ended area, and I think it's worth exploring going forward.
>> Justin Grimmer: So on that, we're gonna have to cut off the recorded discussion. So if you wanna stick around and keep talking, we'll talk afterwards. But Jonathan, thank you so much for that fantastic presentation, incredible research.
>> Jonathan Moreno-Medina: Thank you so much, Justin and everybody who's been working on this.
And by the way, great workshop that you have going on.
>> Justin Grimmer: Well, thank you very much. See you all soon.
Jonathan Moreno-Medina is an assistant professor in the Department of Economics at the University of Texas–San Antonio. His research focuses on urban, public, and media economics. Moreno-Medina received his Ph.D. in economics from Duke University in 2021. He also earned an MA at Université catholique de Louvain and a BA from National University of Colombia.
Patrick Bayer’s research focuses on a wide range of subjects, including racial inequality and segregation, social interactions, discrimination, neighborhood effects, housing market dynamics, education, and criminal justice. His most recent work has been published in Econometrica, Review of Economic Studies, American Economic Review, and Quarterly Journal of Economics. He is currently working on various projects that examine unequal jury representation and its consequences, school spending, the intergenerational consequences of residential segregation, neighborhood tipping, gentrification, policing, and criminal justice. Bayer is currently the Gilhuly Family Distinguished Professor in Economics at Duke University and a research associate at the National Bureau of Economic Research. He served as chair of the Duke Economics Department from 2009 to 2015. Bayer received his PhD in economics from Stanford University in 1999 and his BA in mathematics from Princeton University in 1994. He served on the faculty of Yale University for seven years before joining Duke’s Economics Department in 2006.
Bocar Ba is an assistant professor of economics at Duke University and a faculty research fellow at the National Bureau of Economic Research. Using insights from the labor economics and political economy literature, he seeks to understand police use of force, overall police officer behavior, and what cities want from their local law enforcement. His recent work focuses on evaluating ways to reduce the scope of policing in our society and evaluating the demands of police abolitionists. With the Invisible Institute, he has built the Citizens Police Data Project website (https://cpdp.co), which collects information on Chicago police officers, including misconduct, use of force, and awards. He is also an academic advisor for Police Scorecard (https://policescorecard.org) and Mapping Police Violence (https://mappingpoliceviolence.org), websites collecting information on police performance and police killings in the United States. His research on policing and public safety has been published in Science, the Journal of Labor Economics, the Journal of Economic Perspectives, and the American Economic Journal: Economic Policy. He holds a BSc in economics from the Université du Québec à Montréal, an MA in economics from the University of British Columbia, and an MPP/PhD in public policy from the University of Chicago.
Aurélie Ouss is Assistant Professor of Criminology at the University of Pennsylvania and a Faculty Research Fellow at the National Bureau of Economic Research. Her research examines how good design of criminal justice institutions and policies can make law enforcement fairer and more efficient. Her work, conducted in collaboration with court actors in places like New York, Philadelphia, and Paris, has been published in journals such as Science and the Journal of Political Economy. She received her Ph.D. in economics from Harvard University, a master’s in economics from the Paris School of Economics, and a B.A. in econometrics and sociology from the École Normale Supérieure. She came to Penn after a postdoctoral fellowship at the University of Chicago Crime Lab.
Steven J. Davis is senior fellow at the Hoover Institution and professor of economics at the University of Chicago Booth School of Business. He studies business dynamics, labor markets, and public policy. He advises the U.S. Congressional Budget Office and the Federal Reserve Bank of Atlanta, co-organizes the Asian Monetary Policy Forum and is co-creator of the Economic Policy Uncertainty Indices, the Survey of Business Uncertainty, and the Survey of Working Arrangements and Attitudes.
Justin Grimmer is a senior fellow at the Hoover Institution and a professor in the Department of Political Science at Stanford University. His current research focuses on American political institutions, elections, and developing new machine-learning methods for the study of politics.