Nearly two decades of influential scholarship on how corporations are governed and valued is based on bad data, according to new research co-authored by Cathy Hwang of the University of Virginia School of Law.

The paper, “Cleaning Corporate Governance,” reveals that an index cited thousands of times by scholars to measure corporate governance and shareholder rights is riddled with errors.

Written by Hwang, Columbia Law School postdoctoral fellow Jens Frankenreiter, Wisconsin law professor Yaron Nili and Columbia law professor Eric L. Talley, the new research also offers a dataset with pilot data to rectify the problem. The dataset gives a clearer picture of the power dynamics that control corporations and what they might imply for profit potential, valuation and long-term prospects, among other business factors.

With data gathered by dozens of law students and scholars, the paper unveils a collection of three decades’ worth of corporate charters for thousands of public companies, revealing that faith in the accuracy and integrity of a small, specialized collection of corporate governance data has been misplaced.

Hwang discussed how her team’s dataset works and the role she expects it to play in corporate governance in the future.

What inspired you and your co-authors to write this paper?

For years, we lawyers have raised our eyebrows at some of the major findings in corporate finance. Attempts to examine the underlying data, however, have been unsuccessful, because empirical corporate governance data is hard to get and parse. The data that does exist, from commercial vendors, is super-expensive and of dubious provenance. To investigate the received corporate governance wisdom, whether from legal scholars or business and finance scholars, we knew that we had to start by collecting the data from scratch.

Our data, which is called the “Cleaning Corporate Governance,” or CCG, data, includes the past 25 years’ worth of historical charters from S&P 1500 companies, hand-labeled across numerous governance metrics. It’s supplemented with state-level panel data that tracks 16 statutory governance rules across 50 states and Washington, D.C., over nearly 30 years.

We decided very early on that we wanted to make CCG free and open-access. A lack of access has been a chief culprit in data inaccuracy. Moreover, we’re really cognizant of the fact that newer researchers don’t have the resources to buy or gather data — and we think their brilliance shouldn’t be hampered by cost. We want CCG to become the gold standard data in our field, and we want everyone to be able to use it and improve it. 

What did your research uncover about this disputed data?

With this first paper, we take one small bite: We revisited a famous 2003 paper by Paul Gompers, Joy Ishii and Andrew Metrick. This paper introduced the “Governance Index,” or “G-index,” a measure of how much firm governance rules protect shareholders. It has been cited nearly 10,000 times, and numerous other governance indices are based on the G-index. The paper’s most famous finding is that stronger shareholder rights are correlated with higher firm value, higher profits and higher firm growth.

But the data that underlies the G-index study (like almost all other studies in empirical corporate governance) comes from a private data source that is very expensive to obtain. The vendor’s data collection practices also aren’t very transparent. So we put together our own version of the data, and found that the original data is over 80% incorrect. We then replicated the G-index study using our own data, and their most famous finding is much more attenuated. 

Can you describe how the G-index became so canonical?

The G-index study was one of the first attempts to quantify corporate governance data, which usually comes in the form of written text in charters, bylaws, state statutes and other legal documents. Simple questions like “does this firm allow staggered voting of directors?” can only be answered by looking in many different places and thinking about how they interact. Because quantifying corporate governance data is hard — but corporate governance is clearly important — the first data that was able to quantify corporate governance took off like a rocket. The G-index study was pathbreaking and really opened the door to empirical corporate governance research. We are fans of the G-index project — but it’s time to update that data.

Can you give us a sense of the effort involved in obtaining this data and putting this research together, both as a resource in the database and as analysis for the article?

We started working on the project about half a year before the pandemic. But in the summer of 2020, we realized that we all knew a lot of students who’d be left at loose ends. We basically hired dozens of research assistants and spent the summer working with them to gather this data. We couldn’t have done it without them.

Our RAs painstakingly combed through the SEC’s online database to find all of the historical charters, and they coded those and the other resources that make up the CCG database.

UVA students who worked on this project include Nicole Banton ’21, Sean “Mike” Blochberger ’22, Matthew Cunningham ’21, Channing Gatewood ’21, Eli Jones ’21, Andrew Kim ’21, Doriane Nguenang Tchenga ’21 and Olivia Roat ’21.

If there isn’t a connection between shareholder rights and investment return, what questions does that raise?

There are so many questions that arise out of this! Just one that’s really interesting to me right now is the role of stakeholders in the firm. Existing datasets have focused on shareholder governance, but using CCG, a researcher could really figure out a way to measure stakeholder involvement and test whether it impacts factors like investment return. 

What kind of impact could this database have on the study of corporate governance?

The CCG data will be free and open-access. So we’re really hoping CCG will become the gold standard for empirical corporate governance research — both in law and business — going forward. It will allow researchers to revisit some of the most important findings of the last few decades.

We also include the underlying textual corpus (i.e., the actual charters we examined), which is ripe for new research techniques, such as machine learning and computational analysis.

Founded in 1819, the University of Virginia School of Law is the second-oldest continuously operating law school in the nation. Consistently ranked among the top law schools, Virginia is a world-renowned training ground for distinguished lawyers and public servants, instilling in them a commitment to leadership, integrity and community service.
