Copy, Paste, Legislate

Published — April 4, 2019

How we uncovered 10,000 times lawmakers introduced copycat model bills — and why it matters

(Andrea Brunty/USA TODAY NETWORK, and Getty Images)

This story was published in partnership with USA TODAY and The Arizona Republic. 


How do you find 10,000 needles in 50 haystacks?

That, in effect, is what journalists and developers with USA TODAY and The Arizona Republic set out to do two years ago: identify, among the roughly 100,000 bills introduced in the 50 states each year, those copied from drafts pushed by special interests.

Here’s how we did it.

Using data provided by LegiScan, which tracks every proposed law introduced in the U.S., we pulled in digital copies of nearly 1 million pieces of legislation introduced between 2010 and Oct. 15, 2018. The data included a limited number of bills from 2008 and 2009.

We then asked a dozen reporters covering state legislatures for USA TODAY Network newsrooms across the nation to build a list of model bills by searching special-interest groups’ websites, scouring news coverage and interviewing lobbyists and lawmakers. We identified more than 2,100 models, a list that is far from complete because many groups don’t make their models public.

We then used a computer algorithm designed to recognize similar words and phrases to compare each model in our database with the bills that lawmakers had introduced.

These comparisons were powered by the equivalent of more than 150 computers, called virtual machines, that ran nonstop for months.

How did we compare bills with model legislation?

Even with that computing power, we couldn’t compare every model in its entirety against every bill. To cut computing time, we used keywords — guns, abortion, etc. Some bills have 30 to 40 keywords associated with them.

The system compared a model with a bill only if they had at least one keyword in common.
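
As a rough illustration of that filter, here is a minimal Python sketch. It is not the code the newsrooms ran: the function names, the document IDs and the dictionary layout are all assumptions made for the example. It indexes bills by keyword so that each model is paired only with bills that share at least one keyword with it.

```python
def candidate_pairs(models, bills):
    """Yield (model_id, bill_id) pairs that share at least one keyword.

    `models` and `bills` are hypothetical dicts mapping a document ID to a
    collection of keywords; the real data layout is not described.
    """
    # Invert the bills into keyword -> bill IDs so that each model only
    # touches bills filed under one of its own keywords.
    index = {}
    for bill_id, keywords in bills.items():
        for kw in keywords:
            index.setdefault(kw, set()).add(bill_id)

    for model_id, keywords in models.items():
        matched = set()
        for kw in keywords:
            matched |= index.get(kw, set())
        # Sort only to make the output order predictable.
        for bill_id in sorted(matched):
            yield model_id, bill_id


# Illustrative data, not taken from the investigation.
models = {"m1": {"guns"}, "m2": {"abortion", "health"}}
bills = {"b1": {"guns", "taxes"}, "b2": {"health"}, "b3": {"education"}}
pairs = list(candidate_pairs(models, bills))
```

In this toy data set, bill b3 shares no keyword with either model, so it is never compared with anything. Skipping pairs like that is where the savings in computing time would come from.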

If there was a keyword match, the system compared the documents looking for strings of six or more words that appeared in both. For this search, the system used “stemmed” words, meaning they had been converted to their root. (For example, walk, walks, walked, and walking all become walk.)
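
To make the idea concrete, here is a small Python sketch of stemming followed by a search for shared six-word strings. It rests on stated assumptions: the suffix-stripping `stem` function is a toy stand-in for whatever stemmer the system used, which the article does not name, and all function names are hypothetical.

```python
import re

SUFFIXES = ("ing", "ed", "s")  # toy rule set, far cruder than a real stemmer


def stem(word):
    """Strip one common suffix; a stand-in for a proper stemming library."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stemmed_tokens(text):
    """Lowercase the text, split it into words and stem each one."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def shingles(tokens, n=6):
    """Return the set of every run of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shares_six_word_string(model_text, bill_text, n=6):
    """True if the two texts have at least one n-word run in common."""
    a = shingles(stemmed_tokens(model_text), n)
    b = shingles(stemmed_tokens(bill_text), n)
    return not a.isdisjoint(b)
```

With this sketch, "A person walks while carrying a concealed firearm" and "Any person walking while carrying a concealed weapon" count as a match. After stemming, both contain the six-word run "person walk while carry a conceal", even though the original wording differs.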

If a bill and a model shared at least one keyword and one six-word string, the system assigned a score reflecting how similar the two documents were.

How our scoring system worked

Our scoring system is based on three factors: the longest string of common text between a model and a bill; the number of common strings of five or more words; and the number of common strings of 10 or more words.

Based on those factors, bills received scores on a 100-point scale. The closer to 100, the more likely a bill was copied from model legislation.
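
The three factors can be computed with a short Python sketch like the one below. It is an illustration under assumptions, not the newsrooms' implementation: the function names are hypothetical, the inputs are lists of already-stemmed words, and a "common string" is taken to mean a maximal run of consecutive words that appears in both documents.

```python
def common_run_lengths(a, b):
    """Lengths of every maximal run of consecutive tokens found in both lists."""
    n, m = len(a), len(b)
    # dp[i][j] = length of the common run ending at a[i-1] and b[j-1].
    # The table has one extra row and column so dp[i+1][j+1] always exists.
    dp = [[0] * (m + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
    # A run is maximal when the next diagonal cell does not extend it.
    return [
        dp[i][j]
        for i in range(1, n + 1)
        for j in range(1, m + 1)
        if dp[i][j] and not dp[i + 1][j + 1]
    ]


def score_factors(a, b):
    """Return the three quantities the scoring description mentions."""
    runs = common_run_lengths(a, b)
    return {
        "longest": max(runs, default=0),
        "runs_5_plus": sum(r >= 5 for r in runs),
        "runs_10_plus": sum(r >= 10 for r in runs),
    }


# Illustrative tokens: a 12-word model, and a bill that repeats all of it
# and then the first five words again.
model = [f"w{i}" for i in range(12)]
bill = model + ["x"] + model[:5]
factors = score_factors(model, bill)
```

The sketch stops at the three factors on purpose. How they were weighted and combined into the 100-point scale is not spelled out here, so any formula for the final score would be a guess.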

For its analysis, USA TODAY/Arizona Republic used only bills that scored 80 or higher. At that level, substantial amounts of text have been duplicated.

Another estimated 10,000 bills below the 80-point threshold were likely copied from model legislation but matched less of the model’s text. Out of caution, USA TODAY/Arizona Republic cited in its investigation only bills with substantial portions copied from a model. In addition, if legislators copied an idea but not the precise language, a bill would not be flagged.

Joe Walsh, a former data scientist at the University of Chicago, used what’s known as the Smith-Waterman algorithm to create the Legislative Influence Detector, which also finds similarities between model legislation and bills. His system has been used by reporters around the country to find model bills.
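
For readers unfamiliar with it, Smith-Waterman is a local-alignment algorithm: it finds the best-matching stretch between two sequences while tolerating small insertions, deletions and substitutions. The Python sketch below applies it to lists of words. The scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration and are not the Legislative Influence Detector's settings.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score between token lists a and b.

    Uses a linear gap penalty and keeps only two rows of the table,
    so it returns the score but not the aligned text itself.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a fresh alignment here
                prev[j - 1] + step, # align a[i-1] with b[j-1]
                prev[j] + gap,      # skip a word in a
                curr[j - 1] + gap,  # skip a word in b
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Illustrative sentences, not drawn from any real bill.
model = "a person shall not carry a firearm".split()
bill = "no person shall not openly carry a firearm".split()
score = smith_waterman_score(model, bill)
```

Unlike an exact search for shared word strings, this approach still credits the match when a bill inserts a word such as "openly" in the middle of copied language, at the cost of a small gap penalty.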

Walsh reviewed USA TODAY/Arizona Republic’s investigation and findings and applauded its scoring system for showing when a bill has been substantially copied from model legislation.

“It’s really clear, the numbers are nice and round, and it’s easy to show and explain,” Walsh said. “I wish that we were able to do some of this stuff. I am glad someone is.”

Can I examine the results?

USA TODAY/Arizona Republic continues to search legislation and compare it with known model bills from around the country, furthering its investigation of outside influences on state lawmakers.

Initially, the system is being rolled out to USA TODAY Network journalists for use in reporting on state legislatures.

How were bills categorized?

Special-interest groups, both liberal and conservative, have for years crafted and lobbied for model bills. Generally, the organizations that craft the bills have a clear mission or ideological bent. The American Legislative Exchange Council, the best-known and one of the most prolific model-bill factories, supports conservative ideas and efforts. The State Innovation Exchange, once known as ALICE, is in effect ALEC’s liberal counterpart. We classified bills based on the mission or ideological orientation of the organizations that created each model. In some cases, groups with a conservative bent also push bills that benefit industry. We labeled each bill according to the most dominant characteristic.

Join the conversation

How do you stop this? A federal law mandating legislation source transparency? Or mandating that the full text of every proposed piece of legislation is published in full for a period of time before it is voted on to give citizens a chance to speak for or against it? Prevent the beneficiaries of any legislation in any one session from making political donations? Require sponsors of any bill to have to describe the bill from the floor to prove they know what’s in it?

Jon Michael Yeager

Interesting. I explained to Public Integrity, in great detail, a specific Washington State lien law “correction” done in the early ’90s. It is, of course, documented. A federal bank had financed a dairy processing company. Its new borrower defaulted in under a year, despite the processor having an 85-consecutive-year presence in the state. The bank sought protection from losses the new borrower would have laid upon it. The bank, using its own in-house counsel (one lawyer), wrote a “new and improved” lien law so that the bank could intercept checks that would have been paid to…

Dan Maraye

A critical analysis of World Bank and IMF recommendations to several developing countries smells of the same weaknesses.

Duncan McEwan

If an elected official has been in office for more than six years, they are on the take. Some group or another is paying them. And if you are concerned about the laws you should also be concerned about how the money is spent by government agencies. When a government agency needs outside help to properly conduct the business of the agency they are required to publish a Request For Proposal (RFP) in the Commerce Business Daily (CBD). The agency is also required to have an Acceptance Test Plan (ATP) to determine whether the work has been completed correctly. Frequently…