Our 29th workshop features a conversation with Andrew Hall on “Investing in Political Expertise: The Remarkable Scale of Corporate Policy Teams” on December 9, 2024, from 9:00AM – 10:30AM PT.
The Hoover Institution Workshop on Using Text as Data in Policy Analysis showcases applications of natural language processing, structured human readings, and machine learning methods to analyze text as data for examining policy issues in economics, history, national security, political science, and other fields.
WATCH THE WEBINAR
>> Erin Baggott Carter: Good morning and welcome to the Hoover Institution workshop series on using text as data in policy analysis. I am Erin Baggott Carter, a Hoover Fellow. And my co-organizer is Steve Davis, the Ford Senior Fellow and Director of Hoover Research at the Hoover Institution. Today we are delighted to have Andy Hall here to give a presentation on investing in political expertise, the remarkable scale of corporate policy teams.
As is our practice, we will have a formal presentation with Q and A for about an hour. And after that you are welcome to stick around for an informal discussion for which we'll turn off the recording. So Andy, please take it away.
>> Andrew Hall: Wonderful. Thank you for having me here.
I'm just gonna get my slides up and we'll get started. All right, so this is joint work with Anna Sun, who's my predoctoral research fellow here at Stanford at cpr. And this is very much work in progress. So very excited to get your feedback as we build this out into a full-blown paper.
Okay, so the motivation for this paper is an age-old question in the study of political economy, which is how do firms seek to influence politics? And one of the most important ways we think that firms do this is through lobbying. And if you speak to Americans, or you look at survey data from Americans, indeed, one of the things you'll hear is that there's a pervasive belief that lobbying is extremely important and in a really unpopular way, that special interests of various kinds seek to, in their view, distort the policy process.
And so I've selected one of about a million different cartoons I could have shown you here, just capturing this view that lobbyists have something to do with the dysfunction of Washington. And that there's something about lobbying that influences policy in a really important way. And I think a lot of people share this interest in understanding lobbying and how it works.
What's striking, and this is the jumping-off point for our paper, is that research on lobbying has really encountered some deep puzzles in trying to understand how it works, to what extent it works and in what ways, and why firms think they're doing it. And so one of the most provocative questions about lobbying, and I'm here paraphrasing a very famous paper by Stephen Ansolabehere, Jim Snyder, and others, which was called "Why Is There So Little Money in Politics?"
I'm asking the question here, why is there so little money in lobbying? And what I'm showing you here is just public data. This comes from OpenSecrets. This just comes from legally required disclosures about the total amount of money that's being spent on lobbying at the federal level in the US. And what you can see is, obviously, to the average person, and to me, four or five billion dollars is a lot of money.
But actually, relative to the budgets of many of the largest companies in America, this isn't even a drop in the bucket. And so there's been a lot of work to try to understand, if lobbying is so important and, through lobbying, you can get whatever you want from the federal government, then actually why are companies spending, quote, unquote, so little money doing it?
And so that's been one huge question in this literature. And then the second question is, well, how does lobbying actually get lobbyists what they want? What are they doing that gets them policy outcomes that they want to the extent that they're able to do that? And the literature really focuses here on two possible mechanisms, right?
One is informational lobbying. And the idea here is that firms or other groups are able to invest a lot in developing expertise around policy issues that allows them to credibly show policymakers, if you set up a policy this way versus that way, it's not going to work the way you want it to.
And that information, because it's valuable, influences ultimately where policy lands. And in exchange for providing that expertise, right, there's some sort of bias. That is the idea in these models. So I give you, I, as the firm, give you some information. It helps you, as the policymaker, craft a better, more effective policy.
But in exchange, I move policy a little bit towards where I would like it to be. So that's one big area in this literature. And then the second, which is, I think, much more what the American people picture is quid pro quo lobbying. And the idea here is that basically lobbyists are people who've spent a lot of time developing personal connections with people on the Hill, in Washington, and they're simply able to use this network of friends and favors to secure policy outcomes that they want.
Obviously, that second category has a particularly negative sort of normative loading attached to it. There's a lot of literature. I'm not gonna give you a full literature review, but there's a lot of literature on trying to understand both of these questions. First, why isn't there more money going into lobbying?
And second, how is lobbying working? And these are two really classic papers, both in the AER, both about lobbyists. And I don't wanna caricature this. There's certainly room for both mechanisms to be at play, but the literature has really repeatedly found the most evidence for something like quid pro quo lobbying, in the sense that both of these papers are finding a particularly strong link between a lobbyist's connections, that is their personal relationships with policymakers, and their value as lobbyists.
And so the implication of this research is maybe expertise and providing information are important, but personal connections seem to be especially important. My belief and the reason I started this project is that we've actually missed something really big that helps us to come up with different answers to both of these big questions, both to why we're not observing so much money going into lobbying, as well as to how lobbying works.
And I'm just gonna briefly motivate this by just telling you kind of what I've observed personally. So I'm a professor here at Stanford. I also do a range of advising work in tech. And starting in 2017, I had a really special experience where I got to work inside what's called the global affairs team at Meta.
And many, many large companies have these sort of global affairs teams which contain within them a variety of different teams that work on a variety of different topics around what they call policy. In addition to that personal experience that I've had, I've also had a chance, through that work and through other things I've been involved with, to actually interview a bunch of what are called policy leaders at a wide range of tech companies.
And through that experience, I feel like I learned a couple of really important things that really changed my view of this literature. The first is that, and I observed this directly and have spoken to a number of people about it, it's actually really hard for firms to figure out what their political strategy is.
And what I observed was that that part of the challenge is actually a lot harder than communicating to policymakers what you think they should do. Once you've decided what you think they should do, that part can be farmed out to lobbyists. They're really good at that. But they don't understand the nuances of your business and the realities of how the particular products that you make, maybe there are ways they can be compromised to accord with some political realities, but not others.
You're doing a very, very complicated optimization problem based on your internal understanding of the market and of your products before you're ready to even know what policy you want to advocate for. And what I observed, particularly through some of these interviews I did, is that inside these firms, a tremendous amount of mind share and a tremendous amount of resources are put towards figuring this question out.
So if you think about a new technology regulatory issue, here's one that I've never worked on, so I can just speak to it generically. Encrypted messaging. That's a very, very hard policy question, right? On the one hand, it's pretty clear why we might want encrypted messaging. It gives us all privacy.
On the other hand, from a law enforcement perspective, we might want it not to be encrypted. Each tech company, and there are many, is going to bump up against this encryption issue in different ways. And it's really not obvious from any tech company's perspective what the optimal policy around encryption is.
Maybe some of them think it should be encrypted for all, for everything, all the time. Others maybe are willing to consider these sort of backdoors. It's really a nuanced issue where it's not obvious what the right answer is. And almost every tech company has a team of people.
And in fact, I'll show you one working on this particular question. And none of that activity is observed when we just look at lobbying data. And so that was really, I think, the key thing that I experienced that I've become convinced is an important research question is what is going on with all this political activity that occurs inside firms?
Because if there's a lot of it, then that helps to answer this first question in the sense of maybe there's actually a lot more political spending going on than we realize, because lobbying is just the tip of this much larger iceberg. And to the second question, if that money is being spent internally to the firm, then in some sense, definitionally, it can't be spent on lobbying, because those are not people going out and talking to policymakers.
These are people doing work inside the firm. So I think it's a potentially really important new channel for us to understand about how lobbying and political influence really works. So the key challenge that this paper is intended to address is that we're really limited in this literature to studying publicly disclosed data.
So essentially, all research on lobbying in American politics starts from what's sometimes called the LDA data, which is the data that comes from the Lobbying Disclosure Act. And that is a very, very nice publicly available resource that shows you all of the people and all of the firms who have registered as lobbyists. And without getting into too much detail,
just speaking very roughly, for the purpose of this law, a lobbyist is anyone spending more than 20% of their time speaking to public officials on behalf of a paying client. That's a loose paraphrase of the real law. And similarly, companies are also required to disclose campaign contributions that they make to politicians.
And so those are two really important sources of possible firm influence, but they're also very, very limited, right? We're only observing the part of this spending and lobbying that occurs when the firms have to disclose it, which is when their lobbyist spends more than 20% of their time speaking to policymakers on their behalf.
We don't have any data. I'm not aware of, really, any paper that speaks to the possibility that there's a whole bunch of other people working inside these firms, working on politics and on influencing policy, but who are not regularly speaking to policymakers themselves and therefore don't have to register.
And our paper is, in some sense, extremely simple. As you'll see, the purpose of our paper is quite simply, for the first time, to get data on those people and try to speak to how important they are and how many of them there are and so forth. So I'm just going to give you a quick preview, in part because I'm not sure how timely I can manage to keep myself, but I'll try my best.
So just to make sure I give you a preview of what we find, we're going to build this first data set on who is a member of these corporate policy teams, which is what we're going to call these internal policy organizations. And we're gonna draw in about 100 million records of American workers from a data vendor that gets data from LinkedIn.
And what we're gonna find is that our estimate is that there are 15 times as many internal policy team members in corporate America as there are registered lobbyists. So for every lobbyist we observe registering in the federal database, we estimate there's 15 more people possibly working right alongside that lobbyist, but who don't register because they're not themselves speaking to policymakers.
Just to make this number very sharp for you: in the Fortune 100, firms retain an average of 15 lobbyists. That's a mix of in-house and external lobbyists. And they have, on average, we estimate, 189 members of their internal policy team. The interesting fact we're gonna show is that these internal policy people are much less likely to be revolving door hires than lobbyists.
So what I mean by that is they're much less likely to have previous experience having worked in the government at either the state or federal level, as best as we can tell. And I think that's an important observation because it again suggests that these people are doing something different.
It's not their job to work their relationships to convey policy ideas and influence the process that way. It's instead their job to study the policy environment and develop information that can persuade policymakers. We're also going to show you that these policy teams and lobbyists seem to be complements in the sense that when a firm grows one of them, they also grow the other.
And finally, I'll show you some very preliminary evidence that these policy teams, again consistent with the idea that they're doing something different than the lobbyists, are less partisan in their voter registration than the lobbyists. So there's some effort done to try to make your lobbying team speak to both parties.
And so the lobbyists get a little bit closer to 50/50 in their partisan split, and we don't observe that in the policy team, again, consistent with the idea that they're doing something other than speaking to policymakers on the Hill most of their time. Okay, there's going to be three quick sections to this talk.
I'm going to give you a little bit more background of what we mean by policy teams, why we think they're important. Then I'm hoping to spend most of my time in the second section here, which is building this data set. And that's going to be entirely about working with text.
And then finally, I'm going to just run you through those results, which I already told you verbally. Okay, so what are policy teams? Our working definition is that corporate policy teams are these internal teams seeking to influence the political landscape and advise the company on how to navigate the political environment.
If you talk to people who work in these jobs, and I'm going to show you some evidence for this in a moment, they talk at least about three major categories of work. One, which is the most obvious, is that they're constantly monitoring the global policy environment, especially at these large firms.
They're constantly watching for what policy issues are arising in what jurisdictions. And they're trying to keep an eye on whether they think those policy conversations are heading in directions that will be favorable for the firm rather than unfavorable. And that's kind of the most important thing they often talk about.
They do other things, though, that are really important to that role, and that I think help to emphasize how important their sort of internal role is relative to lobbyists. So one is that they also often advise the company on the plan for developing new products that might intersect with the policy environment in some way.
So this is maybe particularly salient for me, at least in tech, where a lot of tech companies, because they're software based, they're constantly rolling out new products. And policymakers might care a lot about some new product that say, YouTube is rolling out across the world or something like that.
And so they don't just launch the product and then work to lobby Washington or something like that. They're anticipating in advance what they think the political reception to their new products will be. And the policy team is very, very involved in that work. Finally, and this comes up a lot in tech, but it's true in other places as well.
They also sometimes help to develop the internal policies that govern these products, again with an eye towards politics. So you might think about privacy, user privacy in tech as an example of this, where there's both legal requirements, but companies will often choose to do things where there aren't legal requirements in anticipation of the political reaction to their products.
And that would be another thing that an internal policy team would work on. Okay, I wanna make this very concrete for you. So you see I'm not making this up. And so I'm gonna tell you more in a little bit about our data. But before I get there, I just pulled these myself manually.
This looks a lot like some of our data. I pulled a set of currently open job listings from LinkedIn. So LinkedIn has a service where you can apply for jobs right through LinkedIn. I just went through and manually searched and found a range of policy job openings on LinkedIn and pulled for you the descriptions of these jobs.
So this is how the companies themselves articulate publicly what they think policy teams are doing. So first one, off the bat, this is for a role, director and head of public policy at Netflix for UCAN, which is US and Canada. And here's the description of what they think this job is gonna do.
You're gonna work with business partners to understand all aspects of the business and its needs, explain the impact of public policy developments on the business, monitor and advance legislative and regulatory initiatives in the region, develop relationships. So you'll see here there's some of this idea of lobbying here in terms of developing these relationships, but it's third on the list.
It's not the key part of their job. And I just want to quickly show you they are actually listing out the specific policy topics on which they want this person to work. And it's pretty fascinating. And you can see it's a range of things. Netflix is a big company.
They're hitting up against a bunch of different global and US and Canada policy issues. And you can see this person's expected to be working on all of them to some extent. Okay, here's one from Meta on messaging policy. I'm not gonna read you the whole thing, but I just, I highlighted some key parts.
They want this person to develop our strategy, advise product teams on the hardest privacy and data use questions we face, engage with civil society, policymakers and other stakeholders. So again, there is some notion of engaging with the outside world, but it is not the priority for this job.
Same thing here. Here's one from Amazon: public policy efforts related to customer access to pharmacy care. I chose this one just to give you a sense that there's a wide range of these types of roles, and they tend to specialize in different segments of the policy landscape. In this one, they want you to monitor a broad range of stakeholders at the state and local levels.
Here's one, I wanted to pull one from a different kind of company. Here's Roblox. That's a game maker. You might be surprised to hear that a video game maker is engaged in this. But in fact, I'm gonna show you some evidence that basically all companies, if they're big enough, are involved in this kind of stuff.
So what does Roblox want this person to do here? They do highlight relationships with state policymakers and political figures, but also identify, monitor, analyze policy issues, and so forth. You get the point, I'm not gonna go through too many of these. Here's one from YouTube. It's similar. Now I wanna step back and just tell you quickly, sort of like what do these companies think these teams are accomplishing?
And then, we'll dive into the data. So this is a great document I found online. It comes from Charles River Associates, that's an economic consulting firm. This is targeted at the pharmaceutical industry. And indeed, it turns out the pharmaceutical industry is one of the largest employers of policy people, in addition to lobbyists.
And in this document, CRA is sort of walking through how they see the policy function operating in pharmaceutical corporations. And I don't actually understand this flowchart, so I'm not gonna walk you through too much. But basically, the point is you are constantly monitoring the global operating environment and mixing together your observation of what is going on in the policy landscape with how does your particular company fit in?
What are you good at? And how are you gonna balance your niche in the market with how the policy conversation is going? And they lay this out in three steps. This is what a policy team is supposed to do in a pharmaceutical company. Step one, determine the key policy issues.
So you have to be scanning the environment to figure out, of the various policy conversations going on, which are most important to your company. Where do you think the most value will come for you if a policy comes out in a way you think is good for your industry and your company, rather than bad?
Given those priorities, step two, figure out your particular strategy and the arguments you're gonna make, the data you're going to bring to bear, and so forth on that policy question. And I'll tell you from my personal experience, that matters a lot and takes a lot of work. So I was really surprised to discover this, particularly when I spoke with people at other tech companies.
They run, you know, very, very large surveys all the time. They have whole teams of people who study academic research to try to understand what is the best data that's out there on this question. They're constantly working on developing this evidence, right? And ultimately that goes into a binder, let's say, that goes to a lobbyist.
But the work that goes into building that binder is much, much harder than the work that goes into communicating it later. And indeed, the last step of the process, as CRA puts it, is disseminate the argument. So this is sort of the binder that I'm talking about. Okay, so I could talk at way greater length about this, but I wanna make sure to focus mostly on our method and data.
So that's the background on why we think policy teams are important for understanding lobbying. But they haven't been studied before, and they haven't been studied before because we need new data to understand them. So in order to do this, we're gonna start with what we got from this data vendor: relatively comprehensive data on roughly
100 million U.S. workers, spanning about 10,000 firms in the U.S. This data is built out from public LinkedIn profiles. And because it's from their LinkedIn profiles, it's going to give us things like every worker's name, their job title, their history of work, their education, sometimes their location, things of that nature.
We're also gonna combine that with a sample of about 100,000 job descriptions similar to the ones I just showed you. Those are gonna turn out to be really essential for how we're going to figure out if we're doing a good job of finding policy people. Finally, we're gonna pull in other kinds of public records, including the disclosure data on lobbyists, so that we can compare policy teams to lobbyists and a variety of firm level data for publicly traded firms and the like.
Okay, let me show you what one of these public LinkedIn profiles looks like. Brad Smith is, perhaps, the most famous policy leader in tech, he's vice chair and president, Microsoft. This is what the top of his public LinkedIn profile looks like. You can see you get his name, and his current title as well as where he works.
We see a little bit of an about title with him, and then we get to see his whole experience. And this is really, really valuable because obviously over our time period, more and more people are joining LinkedIn, and the norms around being on LinkedIn are becoming stronger in the areas that we're studying.
But what's really nice is even if you've only joined recently, we almost always get your work history because one of the most important things LinkedIn wants in your profile is all the places you used to work. So, that's very valuable for us. We also get his education, and we're going to get that for basically everyone in our sample, okay?
This is really the heart of the project. It's a very straightforward but quite challenging text-as-data problem. We have all these LinkedIn profiles, 100 million LinkedIn profiles, and now we wanna figure out which of these are what we call policy people. And so, we're gonna do this in three steps, roughly.
So, first, we're going to search job titles for words we think indicate something about policy. Our initial plan was just to stop there, but then as we inspected them, it became clear to us that some titles are very unequivocally about policy and others are definitely not. And we needed to do something much, much more heavy duty to get an accurate, or a relatively accurate estimate.
So, how are we gonna do that? We're gonna actually use the text from the job descriptions data to estimate probabilities that each job title actually connotes a real policy role. And we're going to use ChatGPT to do that, and I'll talk about how we did that, okay? Then once we have those estimated probabilities, we'll go back to our job title data, we'll put in those probabilities.
And then, we'll collapse down to the firm level with this sort of probability weighted estimate of the total number of policy team members. Okay, so, here's how we do it. We came up with a list of titles first, based on my own personal experience, and then fanning out from there, based on people we knew worked in policy and what their titles were.
And then, we kind of manually iterated for a long time. So, we went back, we went through, we searched for a bunch, we read their LinkedIn profiles, we realized a bunch of them weren't. We developed a list of words we're gonna exclude. Then we determined that wasn't even getting us close, and that's when we went to the job description data.
So, what are some of these words? Just to give you a sense, policy. So, in a lot of companies, you'll be like the head of policy. Sometimes it's called regulatory affairs. In the pharmaceutical industry, it usually seems to be called regulatory affairs. Sometimes it's called government affairs, or government something else.
These are all word stems, just to be clear. It could be external affairs, external relations, global affairs. And I'll just flag, this is kind of the key issue that we're gonna run into: when your title is something like Director of Legislative Affairs, we're done. We know you work in policy.
That's like a phrase that just unequivocally signals that you're working on, you're basically a lobbyist. You're all the way on one end of the spectrum. In the middle, we have something like policy, where you're very likely to be doing the kind of work we're talking about, but you could be doing something else.
There are other kinds of policy roles inside tech companies, including people who are, you know, writing compliance policies, and they're lawyers, they're not really trying to influence the political environment. And so, some of the titles are unambiguous and others are not. We started by excluding a set of titles that we see that hit on some of these words, but clearly weren't about what we mean by policy.
For example, security manager, but that clearly wasn't getting us far enough. So then, what we did is we took advantage of this very, very large data set that comes along from the same data vendor, which, in addition to all the LinkedIn profiles, gives you this other huge data set of job descriptions, open jobs over time.
They come from a variety of sources, it's primarily from LinkedIn, but also other online job sites. We pulled a sample of 100,000 of these just to make it tractable for now, later I think we can scale that up. And then, we came up with the following workflow. And this is really kind of the heart of the paper.
So, what we do is we, first of all, we search in those job descriptions for our titles. And then, that gives us sort of like if we just did our naive thing where we defined policy jobs based on these titles, here are all the jobs we would find as being policy jobs.
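As a rough illustration of this naive first pass, the title search could look something like the sketch below. The stems are the ones mentioned in the talk and `security manager` is the one exclusion named; the function name `matched_stem` and the matching rules (case-insensitive, stem anchored at the start of a word, more specific stems checked first) are assumptions for illustration, not the authors' code.

```python
import re

# Word stems named in the talk; the authors' full list is not shown here.
# More specific stems come first so "policy" only catches what is left over.
POLICY_STEMS = [
    "legislative affairs",
    "regulatory affairs",
    "government affairs",
    "external affairs",
    "external relations",
    "global affairs",
    "policy",
]

# Exclusion list; "security manager" is the only example given in the talk.
EXCLUDED_PHRASES = ["security manager"]


def matched_stem(title):
    """Return the first policy stem found in a job title, or None."""
    lowered = title.lower()
    if any(phrase in lowered for phrase in EXCLUDED_PHRASES):
        return None
    for stem in POLICY_STEMS:
        # \b anchors the stem at the start of a word; the end is left open
        # because these are stems, not whole words.
        if re.search(r"\b" + re.escape(stem), lowered):
            return stem
    return None


# Made-up titles for illustration only.
titles = [
    "Director of Legislative Affairs",
    "Head of Public Policy, UCAN",
    "Security Manager, Policy Compliance",
    "Software Engineer",
]
hits = {t: matched_stem(t) for t in titles}
```

Here `hits` maps each title to the stem that flagged it, or to `None` when the title is excluded or unmatched. Recording which stem fired matters because the later adjustment is done stem by stem.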
Then, we take those policy job descriptions without the titles, and we put them through the ChatGPT API. And we ask ChatGPT to read them and tell us whether it, I guess ChatGPT is an it, thinks it's a policy or a non-policy job. And I'm gonna show you the exact prompt instruction that we give to ChatGPT.
Once we have back from ChatGPT this 0/1 classification for policy job, we can use that to adjust our naive estimate based on the titles. In particular, where ChatGPT has told us, yeah, everyone with the regulatory affairs title is definitely in policy, we just leave that as is. But where we learned from ChatGPT,
yeah, no, in the policy category, 60% of the job descriptions look like policy jobs, then we'll adjust down by that percentage. Okay, and so that's how we'll use this estimated accuracy rate to, in some sense, debias the data. Okay, so here's the prompt we came up with for ChatGPT. I apologize, it's a little long.
But here's what we tell ChatGPT: Your job is to read a job description. First, remove all words that also appear in the job title. The reason we do that is we don't want it to become circular where it uses our search to decide everything that we search for is accurate.
Then, classify whether the job is a corporate policy team job or not, using only its job summary and responsibilities. Output one if you think the job is policy related, and zero if not. We define corporate policy jobs as jobs in which the person is tasked with influencing the political environment and complying with regulations on behalf of the firm.
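To make the anti-circularity step concrete, here is a minimal sketch of removing title words from a job description before classification. Note the difference from what the talk describes: there, the prompt itself asks ChatGPT to drop the title words, whereas this sketch does that step in code before any model call. The function names, the tokenization rule, and the paraphrased `INSTRUCTION` string are all assumptions, and no API call is shown.

```python
import re

# Paraphrase of the instruction quoted in the talk; the full prompt is longer
# and is not reproduced here.
INSTRUCTION = (
    "Classify whether the job is a corporate policy team job or not, using "
    "only its job summary and responsibilities. Output 1 if you think the "
    "job is policy related, and 0 if not."
)


def strip_title_words(description, title):
    """Drop every word of the description that also appears in the title."""
    title_words = set(re.findall(r"[a-z0-9']+", title.lower()))
    kept = []
    for token in description.split():
        # Compare on a normalized form so "Policy," still matches "policy".
        normalized = re.sub(r"[^a-z0-9']", "", token.lower())
        if normalized not in title_words:
            kept.append(token)
    return " ".join(kept)


def build_prompt(description, title):
    """Assemble the text that would be sent to the classifier."""
    cleaned = strip_title_words(description, title)
    return INSTRUCTION + "\n\nJob description:\n" + cleaned


# Made-up description and title for illustration only.
example = strip_title_words(
    "The Policy Manager will monitor policy developments.",
    "Public Policy Manager",
)
```

With the title words gone, the classifier has to rely on the responsibilities text, so a match on the search term cannot by itself produce a "policy" label.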
The prompt goes on, but I'll just leave it there, okay? So, that's what we ask ChatGPT to do, and this is what we get back. And I just want to emphasize before I go into this, right? The key reason we have to do this, I should have said this earlier:
we don't have this job description data for all 100 million people in LinkedIn. So if we want to make this leap from what we think the job descriptions tell us about which titles are or are not policy jobs, and apply that to all 100 million LinkedIn users, we need a bridge, because we can't just look at all the job descriptions for all 100 million.
We only have the sample of 100,000. So we have to use these 100,000 to, in some sense, train this model that we can then apply to just the job titles in the rest of the LinkedIn data. So here's what we get, this is sort of the main table that comes out of the classifications that ChatGPT does.
We've sorted these job titles that we search for by the rate at which ChatGPT thinks they are, in fact, policy jobs. So some of them at the very top, they're slam dunks, but they're not necessarily all that common. So this column where we show the accuracy rate from ChatGPT, that shows us of the jobs that we said based on the title were policy, what percent of them where we had job descriptions, did ChatGPT agree that it was a policy job?
And then this final column is of all the jobs in LinkedIn that we found with that search term. Like what percent of all the search terms that we found were found by this search term? Okay, so what we can see here is in the first row, if your title in LinkedIn is federal, has something to do with federal affairs, we're really, really confident that you work in policy.
But you're a tiny fraction of all the policy people, potential policy people, we found. So if you go down the list, you can see these top ones are really, really clear, but they're not always that common in the data. And really, we start to see more meaningful hits when we look at what we call government affairs, where we find a 92.5% rate at which ChatGPT thinks they indeed work in policy.
And that actually gets us 2.9% of all the potential hits. And if we keep going down, I'll just flag a couple of the rows here, regulatory affairs, that's one where ChatGPT estimates 88% of the job descriptions are about policy. And that's a huge chunk, that's 20% of everyone we find.
Go all the way down here to the policy row. Here, we're only 50/50 on whether the person works to influence politics, and that's a pretty big chunk of the data. That's 40% of the hits. Okay, so that's sort of the key approach we're taking: we're now going to take these accuracy rates that we get from ChatGPT and use them to down weight the number of people working in policy in our overall data.
So if your title in LinkedIn is policy, and we find a hundred people with that title at Amazon, let's say, we're gonna give Amazon credit for 50 people working in policy in that time period. And we're gonna do this over time, okay? Just speeding through, I don't wanna take up too much more time.
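The down-weighting arithmetic can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the tiny labeled sample are hypothetical, and only the "100 people with a policy title count as 50" example comes from the talk.

```python
from collections import defaultdict


def accuracy_by_search_term(labeled_sample):
    """Share of job descriptions per search term that were classified as policy.

    labeled_sample: iterable of (search_term, label), where label is 1 if the
    description was classified as a policy job and 0 otherwise.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for term, label in labeled_sample:
        totals[term] += 1
        hits[term] += label
    return {term: hits[term] / totals[term] for term in totals}


def estimated_policy_headcount(title_counts, accuracy):
    """Down-weight raw title counts by the estimated accuracy of each term."""
    return sum(count * accuracy[term] for term, count in title_counts.items())


# Hypothetical labeled sample: half of the "policy" descriptions are policy jobs.
sample = [("policy", 1), ("policy", 0),
          ("government affairs", 1), ("government affairs", 1)]
rates = accuracy_by_search_term(sample)

# 100 people with a "policy" title at a 50% accuracy rate count as 50.
print(estimated_policy_headcount({"policy": 100}, rates))  # 50.0
```

A search term with no labeled descriptions raises a `KeyError` here rather than being silently counted at full weight.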
This is just kind of a validation check to ask, after we do this whole exercise: now, looking at all the job descriptions, policy and not policy, in our sample, if we run a Lasso and predict ChatGPT's coding of policy versus not policy based on the words included in those job descriptions, which words seem to be doing the work?
Which words really seem to be most correlated with ChatGPT's decision about whether this is a policy job or not? And it's kind of exactly the words you would expect. Affair comes out really, really strong, along with regulatory, government, legislative, regulator, policy formula, and so forth. So what we've done here, just to be clear, is selected the coefficients from this Lasso on the words that are most predictive of being called a real policy job.
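To make the Lasso check concrete, here is a small pure-Python sketch of an L1-penalized linear regression fitted by coordinate descent on bag-of-words indicators. Everything in it is an assumption for illustration: the toy documents, the labels, the penalty `alpha`, and the function names are hypothetical, and the talk does not say which Lasso implementation or settings were used.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; exactly 0.0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_word_coefficients(documents, labels, alpha=0.05, n_passes=100):
    """Coordinate-descent Lasso of labels on centered word indicators."""
    tokens = [doc.lower().split() for doc in documents]
    vocab = sorted({word for doc in tokens for word in doc})
    n = len(documents)
    # Centering the columns and the outcome removes the need for an intercept.
    columns = {}
    for word in vocab:
        raw = [1.0 if word in doc else 0.0 for doc in tokens]
        mean = sum(raw) / n
        columns[word] = [value - mean for value in raw]
    label_mean = sum(labels) / n
    residual = [label - label_mean for label in labels]
    coef = {word: 0.0 for word in vocab}
    for _ in range(n_passes):
        for word in vocab:
            x = columns[word]
            scale = sum(v * v for v in x) / n
            if scale == 0.0:  # word appears in every document, so it carries no signal
                continue
            rho = sum(xi * (ri + xi * coef[word]) for xi, ri in zip(x, residual)) / n
            new = soft_threshold(rho, alpha) / scale
            residual = [ri - xi * (new - coef[word]) for xi, ri in zip(x, residual)]
            coef[word] = new
    return coef


# Toy data: "regulatory" tracks the label, "team" does not, "manage" is everywhere.
DOCS = ["manage regulatory team", "manage regulatory team",
        "manage regulatory", "manage regulatory",
        "manage team", "manage team", "manage", "manage"]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]
coefficients = lasso_word_coefficients(DOCS, LABELS)
```

In this toy setup the penalty sets the coefficient on the uninformative word to exactly zero while keeping a positive coefficient on the informative one, which is the property that makes the surviving words easy to read off.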
Okay, that was sort of a speed run through what is a relatively complex issue. Happy to tackle it more in the Q&A. I want to wrap up by just running you through what I already previewed for you, which were our results. So this is a really, really simple, but I think very powerful, plot where every point here is a binned average of many firms, and for each firm we're plotting on the X axis the log of their market cap.
So these are publicly traded firms, and on the Y axis, the average number of people they have as estimated through the procedure I just described for you working in policy. And we compare that to the number of lobbyists we know work for them from the disclosure data. And so what you can see is just like in lobbying, most of the action here is in the larger firms.
And as you look at those larger firms, you see a larger and larger gap where they're really investing heavily in building out these internal policy teams. And as you can see, in these larger firms, they have way more people working in policy than they have lobbyists. And that's sort of, I think, the absolute key finding from our paper is if you're trying to understand how much money companies are spending, influencing politics, and you only have the public lobbying disclosure data, you're missing a ton of the action.
Now, we don't have any good estimate of the exact cost of all these workers. We don't know how much they're being paid. But I think you can be relatively confident that the aggregate amount spent on these people, plus the capital required to build out the infrastructure for them inside these companies, is a lot more than is being spent on lobbying directly, okay?
Another interesting fact here is just that if you look at it over time, so this is just our average across all these companies over time, what you see is that the number of lobbyists retained by companies is pretty flat over time. Which is similar to what I showed you in that OpenSecrets plot at the beginning.
But over the same time period, the policy teams are growing in size quite noticeably, almost doubling in size by the end of the time period. So that suggests something kind of interesting that as politics is becoming more fraught in the US, firms seem to be responding by really growing their internal policy teams.
Just briefly here on the claim that they're complements. I wanted to keep the results section short, so I didn't. I can show you there's a strong cross-sectional correlation between the size of your lobbying team and the size of your policy team. But maybe more interestingly, when we run regressions predicting the size of your policy team based on the number of lobbyists, both in-house and external, that you have,
when we include company fixed effects, so we're looking within companies over time, or we even do a sort of diff-in-diff with year fixed effects like in column two here, we see this strong positive relationship where, with each new lobbyist that a company retains, they're also predicted to be growing their policy teams.
And we think that this indicates, and I have some other evidence for this that I didn't squeeze into the slides, that these are complements in the sense that when the external environment becomes more, in some sense, fraught, when you feel like you need more policy influence, you seem to respond both by hiring more lobbyists and by making your policy team larger.
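The "within companies over time" idea behind company fixed effects can be sketched with a simple within estimator: subtract each company's own averages, then regress the demeaned policy headcount on the demeaned lobbyist count. The panel below is invented purely for illustration, the function name is hypothetical, and the sketch covers company fixed effects only, not the year fixed effects or any other controls in the actual regressions.

```python
from collections import defaultdict


def within_slope(panel):
    """Slope of policy headcount on lobbyists after removing company means.

    panel: list of (company, lobbyists, policy_headcount) firm-year rows.
    """
    by_company = defaultdict(list)
    for company, lobbyists, policy in panel:
        by_company[company].append((lobbyists, policy))
    numerator = 0.0
    denominator = 0.0
    for rows in by_company.values():
        x_mean = sum(x for x, _ in rows) / len(rows)
        y_mean = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            numerator += (x - x_mean) * (y - y_mean)
            denominator += (x - x_mean) ** 2
    if denominator == 0.0:
        raise ValueError("no within-company variation in lobbyist counts")
    return numerator / denominator


# Hypothetical firm-year rows: very different levels, same within-firm slope.
HYPOTHETICAL_PANEL = [
    ("Firm A", 1, 10), ("Firm A", 2, 13), ("Firm A", 3, 16),
    ("Firm B", 5, 100), ("Firm B", 6, 103), ("Firm B", 7, 106),
]
print(within_slope(HYPOTHETICAL_PANEL))  # 3.0
```

Because each firm is compared only with itself, the large level difference between the two invented firms does not affect the estimated slope.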
And that's roughly consistent. Obviously, that's not dispositive. It's roughly consistent with the sort of model I have in mind for what's going on, which is the internal policy teams are developing the expertise on these policy issues. They're then handing those off to the lobbyists to communicate to policymakers, which makes them strong complements to one another.
Okay, this is, I think, perhaps the most important thing that really suggests the mechanism here is informational, not quid pro quo. If you look, what we do here is we go through your LinkedIn profile, and find whether you've previously worked for the government. And when we do that, we can do that for everyone.
And so, now we can compare lobbyists to these policy team members, to everyone else who works at the firm. And that's really cool because I think it gives us a sense of who are the policy team people closer to. Do they look like the lobbyists, or do they look like everyone else?
And the answer is they're in between, which makes perfect sense. But they're closer to everyone else than they are to the lobbyists. And so, if you look, you can see the government experience is really dominated by the lobbyists, who are at about 15%, which I will note is pretty similar to estimates in previous work.
I think the Trebbi paper that I showed you at the beginning, they were at like 20% with government experience. So, the lobbyists have quite a bit of government experience, which makes sense because they're in the relationship business to some extent. The policy people are down at 2.5% total, 1.5% in the executive, 1% in the legislative branch, which is kind of similar to the non-policy workers.
It's not very much, and I think that really emphasizes this point that firms are investing a lot in these policy teams. But it's not necessarily so much about having, or holding relationships with people you work with in Washington, it's about developing expertise. This is my last empirical slide, and then I'll wrap up.
And there's a lot of work that went into this one slide, we link everyone in the LinkedIn data to the voter file so that we can estimate your partisanship by how you register to vote, if you registered to vote. And then, we can look at, are you a registered Democrat, are you a registered Republican, are you a registered Independent, or are you registered with another party?
I'm subsetting here to only the people we find in the voter file. There are, of course, many people who don't register to vote, I'm not counting them here. And what we can see is if you look at the lobbyist category, which is in the middle here, they're 50% Democrats and 32% Republicans.
And then, if you compare that to the policy people, you see that they're 48% Democrats. So, they're roughly as much Democrats as the lobbyists are, but there's way fewer Republicans. There's 19% registered Republicans versus 32% for lobbyists. There's many possible explanations for this, but I would offer just one possibility is for big companies, it is absolutely required that they retain lobbyists from both sides.
And of course that's somewhat responsive to who's in power. But they have to be able to talk to both parties. To the extent the policy people aren't always talking to people in parties, it's perhaps less important for them to be balanced. And so, they end up looking quite a lot more similar to the rest of the workers, which I show you over here on the left, than they do to the lobbyists.
So they're not as balanced in terms of their partisanship as the lobbyists are. Now, the lobbyists are not balanced, but they're more balanced than the policy team members are. Okay, so let me just wrap up here. The purpose of this paper is to be the first paper to define and quantify corporate policy teams.
What we're estimating so far, this is again work in progress, is that these policy teams are 15 times the size of lobbying teams, roughly. They're significantly less likely to have government experience, and they seem less partisan than lobbyists. And we think that they're complements to lobbyists, not substitutes.
And we really think there's two key takeaways to this paper. One, the literature has really underestimated the total amount of corporate political spending by only being able to focus on these publicly disclosed expenditures that occur outside the firm through lobbying. And the reality is that much more money is probably spent inside the firm employing and resourcing these internal corporate policy teams.
Moreover, when we turn to the mechanisms of how lobbying works, we think that this evidence lets the pendulum swing a little back towards informational lobbying, and away from quid pro quo lobbying. Again, not to say that both are not important, they clearly are.
But what we're finding is that, of all this money being spent on lobbying that wasn't previously being measured by the literature, we think a lot of it is going towards studying and developing ideas, strategy and evidence, and less towards the relationship management, which is the part that you do observe in the publicly disclosed data.
We have some very tentative policy takeaways as my last slide and I don't want to lean on these too heavily because this is very much work in progress. But if it's true that most lobbying is informational in nature and occurs inside the firm, then policies to limit public lobbying, which is a favorite thing that people always talk about, are likely to have limited efficacy, and could even be counterproductive in the sense that most of the effort is going on inside the firm to develop this expertise.
And if all you do is try to intervene on the communication part, it seems hard to see whether, or how that's going to get you things that you want. Instead, what might be most effective, and this is obviously hard to figure out how to do, would be to get more information to be provided by sources with different biases, right?
So, if we're in an informational lobbying world, then a lot of the bias that we worry about comes from the fact that the people able to accumulate and use the information are on a particular side of the issue, with particular incentives. So, if we could get the government or other actors to have comparable levels of information investment, we could come to different policy outcomes.
I'll just note, having put this data together, it's possible that in the same way we have disclosure requirements around lobbyists, we might want to consider some kind of disclosure requirement for policy teams. I think it's very hard to figure out why, or how that would work in practice, but it's certainly something I anticipate to come up as a discussion from this paper.
So I'll leave it there, and looking forward to the discussion.
>> Erin Baggott Carter: Fantastic, thank you, Andy, for a really fascinating paper. It's wonderful to sort of see some of the trends with this data on a part of corporate lobbying that's clearly so important and utterly undocumented. So, thanks for sharing this with us.
So, to kick us off, I have a bunch of questions about the text, but I'll just start with one about the substance. So, you've argued that this suggests that this corporate policy lobbying is a complement, not a substitute. But I would question whether that's really the case. So in particular, you're looking at the number of lobbyists, right?
But what's potentially much more important is the value of the lobbying contracts. So, it was interesting to see this spread where this is really driven by the biggest firms, right? And so, I'm wondering if there could be a poaching mechanism going on here. So, the biggest firms poach their favorite lobbyists, bring them in house, and this is really about cost-cutting, right?
Instead of paying a firm these contracts, you can have someone who you like, who's really specialized, and you're paying them a large salary. But, you know, it's not nearly what you're doing with these huge contracts.
>> Andrew Hall: Yeah.
>> Erin Baggott Carter: So, the question is, and, you know, maybe the 14 other people are just supporting this person.
So, I'm curious if you have any intuition about that, or if you could look at that with other data that might be available. I'm not sure if the value of the contracts is in LDA the way it is in FARA, for example, but I'm curious what you think about that issue.
>> Andrew Hall: Yeah, a couple thoughts, so, we do have the ability to look at all the spending on lobbying. And we actually have plots where we compare the spending on policy teams to the spending on lobbying. The reason I didn't show them is I'm personally a little bit dubious on the way the policy team spend is estimated.
So basically the data vendor that we use estimates salaries for everyone, but it's not clear how. They estimate using data from Glassdoor, which is, I guess, a relatively common thing. It just seems like a very hard estimation problem. And so the headcount stuff seems safer for comparison, but we do have some loose back-of-the-envelope estimates on the amount spent.
And of course it's not surprising a lot more seems to be spent on the policy team people with any plausible guess of their salaries. And that's only their salaries. We're not including, you know, all the opex that goes on around these people, which I know is quite large.
In terms of the broader sense of your question, I guess one thing I would say is it's definitely possible that there's these particular kinds of substitution. I would say if someone switches from being an external lobbyist to being brought in house, in our analysis we'll actually continue to count that person as a lobbyist. And so, in some sense, that would get captured.
That wouldn't be counted as how we compare the policy team to the lobbyists. Even though those in house lobbyists sit on the policy team. We're kind of counting everyone. We take all the people who are in the public disclosure data as lobbyists and keep them as one bucket.
And everyone else in policy is the other bucket. Yeah, I don't know if that answer you-.
>> Erin Baggott Carter: I feel like you have a deeper kind of substance. I think that what emergency. So the LDA database must not give you like a contract value. So maybe an implication is that, like it might be interesting for you in the future to also look at the FARA data.
So the FARA data has the contract value. So lobby a salary sure. But like everything else is paid to the firm. So if you could find the same.
>> Andrew Hall: I see.
>> Erin Baggott Carter: There, you know, where you can actually those two things. That would be. That would be really.
>> Andrew Hall: Yeah, that's helpful for sure. Yeah, the downside will be we won't have the comparable money for the policy team. But yeah, that would be super interesting.
>> Erin Baggott Carter: Yep, I mean there might be some overlap in your data. So I'm thinking like a firm like Boeing, for example, which, yeah, many lobbyists working in China registered, etc., for those sorts of things, but also does a ton of lobbying domestically on other issues.
So with some lobbying some firms with some overlap.
>> Andrew Hall: Okay, that's super interesting.
>> Steven Davis: I've got some questions and comments. First, just back to your initial theme of broadening the lens through which we look at the activities that are designed to influence the policy process.
And I like the enlargement, but there's much more that's still outside the scope of what you're looking at. You hinted at it with the remarks about CRA. So CRA does a lot of corporate sponsored research that is designed to provide an informational, analytical basis for these companies to think about their policy environment, but also potentially to influence it.
And there's a lot of money spent on that.
>> Andrew Hall: Yes.
>> Steven Davis: In some areas there's enormous sums spent in the competition policy arena by companies that are subject to that. These are not small sums. Pharmaceuticals, you mentioned pharmaceuticals, that's another one. So there's an enormous amount and I think it's close to what you think of as the informational aspect of lobbying, in quotation marks.
It's really policy influence. So that's done in many ways. It's done by retaining academics directly. It's probably done more often by retaining firms like CRA. It'd be nice to try to quantify that if you could.
>> Andrew Hall: Yeah.
>> Steven Davis: At least get rough numbers. And then there's another set of think tanks and non-profit organizations that are also basically, many of them see themselves as in the policy influence, information provision business.
Many of them, not all of them, but many of them are funded by corporate sources to some extent or entirely. So I think if you could bring that in, it would just reinforce your overall theme that the policy influence in exchange for the information provision and analytical framework, provision in exchange for some impact on policy is even more important than what's suggested by your existing data.
>> Andrew Hall: Yeah, I totally agree. We thought about some directions to go. The dream would be detailed budgets of these companies. Even if we had a detailed budget for one company, I think it would be fascinating to lay some of that out. Of course, we can't get that. So what we've thought about along your lines is trying
to kind of landscape the consulting industry, the academic papers, the expert testimony. I think there are ways to broaden in that way.
>> Steven Davis: Yeah, exactly. Even one or two case studies like I know from personal experience that Microsoft went through its experience in the antitrust arena in the 90s, started to realize, we need to fund research that kind of tells our story and that explains what an operating system does.
I participated in some of that, but so did many others. I believe Google, Meta, they've all gone down that path to some extent, for sure. So you might do it that way. Another small comment, with a question. So you use ChatGPT. I've done similar classification exercises to what you're describing, using the text in job adverts to classify whether the job is suitable for work from home or the job on offer allows some work from home.
We used BERT and we tried ChatGPT. We got good performance with ChatGPT. We got better performance with BERT.
>> Andrew Hall: Okay.
>> Steven Davis: The reason is BERT's a much smaller LLM than ChatGPT is. But you can train it and tailor it to your own data set in a way that you can't do with ChatGPT.
So I'm just wondering, yeah, maybe ChatGPT is perfectly adequate for your purposes, but I suspect you could do better with more work by taking a BERT model and training it, tailoring it to your specific setting.
>> Andrew Hall: That's great.
>> Steven Davis: Which is a little different, in that ChatGPT has just been trained on huge amounts of data, but it's not tailored to your setting the way you could do with BERT.
>> Andrew Hall: And when you say it performed better, that's relative to your.
>> Steven Davis: It's an accuracy metric. I'm trying to remember. I was trying to dig up the key slide, but basically, our accuracy metric was, I think we got something like 99% accuracy with our preferred BERT model and 98% accuracy with ChatGPT.
So in an absolute sense the gain wasn't large, but in a relative sense it was pretty big.
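The "small in absolute terms, large in relative terms" point can be checked with a line of arithmetic on the error rates. The sketch below uses the approximate 98% and 99% figures as recalled in the discussion; the function name is hypothetical.

```python
def relative_error_reduction(baseline_accuracy, improved_accuracy):
    """Fraction of the baseline's errors eliminated by the improved model."""
    baseline_error = 1.0 - baseline_accuracy
    improved_error = 1.0 - improved_accuracy
    return (baseline_error - improved_error) / baseline_error


# Moving from roughly 98% to roughly 99% accuracy is one percentage point,
# but it removes about half of the classification errors.
reduction = relative_error_reduction(0.98, 0.99)
print(round(reduction, 3))  # 0.5
```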
>> Andrew Hall: And the ground truth, there was human ground.
>> Steven Davis: The ground truth there was human readings, against which we were benchmarking the outcomes of the LLMs.
>> Andrew Hall: Yeah, okay, that's super helpful, I mean, definitely what the next step for us before this becomes a full paper is exactly this kind of thing.
We've kind of done the first attempt at it. We know roughly what we're finding. I don't think it's gonna change that much, but we have a lot of work to nail this down. So I had the thought of, yeah, we're going to do some human codings to compare and then, armed with those, maybe the next thing we should do is the BERT version as well.
That's super helpful. All right, well, this is, Steve, please.
>> Steven Davis: Well, I was just also gonna say this might sound a little self-serving. The spirit of what you're doing is very much the same as what we did in the policy uncertainty paper. The setting's very different, the techniques are different.
But in that setting what we did was, you know, at the end of the day we have some simple set of keyword searches that we're going to have to implement because that's what the interfaces would allow us to do at scale. But we evaluated alternative combinations of term sets by going and taking the full text of many thousands of newspaper articles randomly selected, read by many readers and then asking, well, okay, we use that.
That became the source of our ground truth.
>> Andrew Hall: Got it, got it.
>> Steven Davis: We wanted to ask, well, okay, we can't do this at scale, we do something much simpler at scale. But how does the simple thing work? You're essentially doing the same thing here.
>> Andrew Hall: Yeah, it's like isomorphic.
>> Steven Davis: You don't have the full text of all the ads. So you want to ask, okay, that's where I get my ground truth from. But then I want to do something I can scale up readily and so I need another simpler thing. So this kind of, this is a very practical issue that comes up a lot and it's not sexy or anything, you know, but at the end of the day, often what makes these things work at scale is can I find a really simple scalable method that I can apply that does a pretty good job even though it doesn't do as well as I could do if I had everything I wanted?
>> Andrew Hall: Totally, totally, yeah, yeah, we'll take a look at that. Cuz those are the two things you're hitting on the two things that have been on my mind. One is like nailing the ground truth where we have the description data, and then the second is the simplest but clearest way to extrapolate that to the whole.
>> Steven Davis: Exactly, exactly.
>> Andrew Hall: Yeah, yeah.
>> Erin Baggott Carter: Great, so we have a ton of great questions in the Q and A. So let's move to a few of those. So Hoyt Bleakley, an economist, asks two questions. So first, how do industry associations figure into the analysis? Your results size might be explained by small firms outsourcing this function, something that they certainly do.
Two, have you considered using the education of the worker, having an MPP versus an MBA, predicting their trajectories?
>> Andrew Hall: Yeah, okay, on the second one, the second one's easier, so I'll tackle it first. We have looked at the education. What is funny, and this is consistent with my own experience in tech, is that MPAs or MPPs are not all that well represented in policy teams.
And in fact, it turns out lawyers are kind of everywhere. I'm married to a lawyer, so I'm used to it. But there are clear differences, of course, by sort of other functions of the company. We haven't quite figured out what more we want to do with that, but that's on our mind as something to do next.
On the industry associations or trade associations, we have not done much to look into that yet. It's definitely on my list of things we need to think about. I think it'll be easiest to tackle that with respect to lobbying, where we can get the data on how they lobby.
And so we can certainly look. When people have done that in the past, my impression is that it turns out that the large companies are shouldering most of the burden, consistent with kind of a, I don't know, Olson-style collective action type story.
From a policy team perspective, yeah, I have to think a little bit about how to interpret it, but in some sense it would be consistent with the idea that they don't employ any policy people themselves. But if they're making large contributions to the industry association, then I guess that's where they're getting the bang for their buck.
Yeah, I guess probably the best thing we can do is refer back to some of this research on who actually contributes the most to those associations. But yeah, I need to think more about it. It's a good flag.
>> Erin Baggott Carter: Fantastic, so I think a couple of people here have the capability to answer live.
Elizabeth, do you want to ask your question live?
>> Elizabeth: Sure, happy to. So when you were reading the descriptions, it struck me that some of these policy team employees seemed like extensions of lobbyists. Like these are just lobbyists that are not being counted in the disclosure data.
They're informing their strategy, but some of them seem like they're doing something else. They're not trying to influence government, they're trying to figure out how to change the company to comply with government or to anticipate government's demands. And those don't really seem like lobbyists. They seem like kind of compliance people or something else like that, and are engaged in kind of a fundamentally different task.
And you know, especially the high level job descriptions. That job includes both, it seems like. So is there a way to take what you have data wise and parse out either how many jobs or what share of jobs are being focused on, you know, changing the government versus changing the company?
>> Andrew Hall: Yeah, I wish there was. I don't know, maybe we can through the job descriptions or something like that. I think there's a couple different things on my mind based on what you said, that are all important. One, which is the positive side of this from a research perspective is sort of making the point that the way policy teams influence politics is not only by communicating directly with policymakers, but by actually influencing the firm and getting the firm to do different things as part of a political strategy.
And that part of it, I think is very, very important in my experience, is a super important part of what policy teams do. And I would love to be able to directly establish that more. I don't know exactly how to do that with the limited information that we have.
There's a second thing which you might also be getting at, which is that some people we're calling policy conceptually don't really fit in the sense that they're not really trying to influence politics as much as they're, as you put it, like simply trying to comply with rules and laws.
And we made the decision that we want to try to exclude those people. You could make an argument that exactly where you draw the line on what counts as pure compliance versus influencing policy is very challenging. We decided we wanted to be kinda conservative and try to exclude those people to the extent we could.
We did that through the way we defined and read the job descriptions. But I do think it's a major challenge for this project. It's definitely something I anticipate reviewers asking about is like, have we done that well enough? And I'll just say I didn't have time to put this in the slides.
But my biggest concern related to that is in the pharmaceutical industry. So one of the things we have these kind of follow up analyses. I don't know what we're gonna do with them. They may not fit in this paper. Maybe we'll write a whole paper on the pharma.
There are all these small to medium sized research companies. They tend to have maybe a single piece of intellectual property in the biopharma space. So they have 40, 50 employees and they'll have 10 people that we categorize as being policy. And so percentage wise, they're way off the map in terms of like investing in policy.
And when you look at what those people do, man, like manually, we went down a crazy rabbit hole reading up on these people. It's very hard to figure out whether they should count as policy or not by the letter of the words of what they do. They kind of like liaise with the FDA to manage compliance with these clinical trials.
And so that kind of makes me lean towards thinking we shouldn't count them. To the extent they're kind of like following a prescribed set of procedures for getting a drug approved, that's not really the same as what we're talking about. But then you look into it more and it turns out like, you know, like these trials are actually really complicated and there's a lot of judgment calls and a big part of their job is persuading the FDA that this should be approved.
It starts to sound more like. And if you read kind of the job listings, it talks a lot about monitoring the global pharmaceutical policy environment. And so that was one where we're nervous. Our general instinct is to be conservative, exclude anyone we think is doing straightforward compliance work.
But I think it's definitely gonna be one of the enduring challenges of this project is figuring out where to draw the line on those people. And pharma turns out to be one place where that's particularly difficult.
>> Erin Baggott Carter: Fantastic, well, sadly, we are out of time for the formal proceedings.
We still have a few really excellent questions. So Brett and Mefedia, and Tara, I hope you can all stick around after we turn off the recording. Andy, thank you for an absolutely fascinating conversation and a really promising project. Keen to see where this goes, thank you. And thank you all.
>> Steven Davis: Thank you.
>> Erin Baggott Carter: For the Hoover Institution workshop on using text as data in policy analysis, thank you.
ABOUT THE SPEAKERS
Andrew B. Hall is the Davies Family Professor of Political Economy at the Stanford Graduate School of Business and a Senior Fellow at the Hoover Institution. Hall’s research team uses large-scale quantitative data and cutting-edge methods from econometrics, statistics, and computer science to study American democracy and the design of democratic systems of governance in the online and offline worlds. Hall serves as an advisor to Meta Platforms, Inc and the a16z crypto research group.
Steven J. Davis is the Thomas W. and Susan B. Ford Senior Fellow at the Hoover Institution and Senior Fellow at the Stanford Institute for Economic Policy Research. He studies business dynamics, labor markets, and public policy. He advises the U.S. Congressional Budget Office and the Federal Reserve Bank of Atlanta, co-organizes the Asian Monetary Policy Forum and is co-creator of the Economic Policy Uncertainty Indices, the Survey of Business Uncertainty, and the Survey of Working Arrangements and Attitudes. Davis hosts “Economics, Applied,” a podcast series sponsored by the Hoover Institution.
Erin Baggott Carter is a Hoover Fellow at the Hoover Institution at Stanford University. She is also an assistant professor in the Department of Political Science and International Relations at the University of Southern California, a faculty affiliate at the Center on Democracy, Development and the Rule of Law (CDDRL) at Stanford University’s Freeman Spogli Institute, and a nonresident scholar at the 21st Century China Center at UC San Diego. She has previously held fellowships at the CDDRL and Stanford’s Center for International Security and Cooperation. She received a PhD in political science from Harvard University.