Twilight of the Human Hacker

Secretive Pentagon research program looks to replace hackers with AI

by Zachary Fryer-Biggs, September 13, 2020 (updated January 28, 2022)

This story was published in partnership with Yahoo News.

The Joint Operations Center inside Fort Meade in Maryland is a cathedral to cyber warfare. Part of a 380,000-square-foot, $520 million complex opened in 2018, the office is the nerve center for both the U.S. Cyber Command and the National Security Agency as they do cyber battle. Clusters of civilians and military troops work behind dozens of computer monitors beneath a bank of small chiclet windows dousing the room in light.

Three 20-foot-tall screens are mounted on a wall below the windows. On most days, two of them are spitting out a constant feed from a secretive program known as “Project IKE.”

The room looks no different than a standard government auditorium, but IKE represents a radical leap forward.

If the Joint Operations Center is the physical embodiment of a new era in cyber warfare — the art of using computer code to attack and defend targets ranging from tanks to email servers — IKE is the brains. It tracks every keystroke made by the 200 fighters working on computers below the big screens and churns out predictions about the possibility of success on individual cyber missions. It can automatically run strings of programs and adjusts constantly as it absorbs information.

Personnel from both the NSA and U.S. Cyber Command working in the ICC. (NSA)

IKE is a far cry from the prior decade of cyber operations, a period of manual combat that involved the most mundane of tools.

The hope for cyber warfare is that it won’t merely take control of an enemy’s planes and ships but will disable military operations by commandeering the computers that run the machinery, obviating the need for bloodshed. The concept has evolved since the infamous American and Israeli strike against Iran’s nuclear program with malware known as Stuxnet, which temporarily paralyzed uranium production starting in 2005.

Before IKE, cyber experts would draw up battle plans on massive whiteboards or human-sized paper sheets taped to walls. They would break up into teams to run individual programs on individual computers and deliver to a central desk slips of paper scrawled with handwritten notes, marking their progress during a campaign.

For an area of combat thought to be futuristic, nearly everything about cyber conflict was decidedly low-tech, with no central planning system and little computerized thinking.

IKE, which started under a different name in 2012 and was rolled out for use in 2018, provides an opportunity to move far faster, replacing humans with artificial intelligence. Computers will be increasingly relied upon to make decisions about how and when the U.S. wages cyber warfare.

This has the potential benefit of radically accelerating attacks and defenses, allowing moves measured in fractions of seconds instead of the comparatively plodding rate of a human hacker. The problem is that systems like IKE, which rely on a form of artificial intelligence called machine learning, are hard to test, making their moves unpredictable. In an arena of combat in which stray computer code could accidentally shut down the power at a hospital or disrupt an air traffic control system for commercial planes, even an exceedingly smart computer waging war carries risks.

Like nearly everything about such warfare, information about IKE is classified. As even hints about computer code can render attacks driven by that code ineffective, minute details are guarded jealously.

But interviews with people knowledgeable about the programs show that the military is rushing ahead with technologies designed to reduce human influence on cyber war, driven by an arms race between nations desperate to make combat faster.

Using the reams of data at its disposal, IKE can look at a potential attack by U.S. forces and determine the odds of success as a specific percentage. If those odds are high, commanders may decide to let the system proceed without further human intervention, a process not yet in use but quite feasible with current technology.

Ed Cardon, a retired lieutenant general who served as the head of the Army’s cyber forces from 2013 to 2016, spent years trying to persuade senior military and White House leaders to use cyber weapons, especially during his tenure running U.S. Cyber Operations against ISIS. He faced stiff opposition because of concerns about the potential of cyberattacks to muddle international relations.

Retired lieutenant general Edward Cardon (Stacy Niles/DVIDS)

His pitches typically included a lot of guesswork. If Cardon was laying out plans, he’d have to include a slew of unknowns, a couple of maybes and a yes or two when mapping the probability of success. All too often, when Cardon tried to get permission for an operation and had to describe the uncertainty associated with it, the answer would be no.

Cardon, who speaks in a way that forces the listener to lean in, told me that fear of political repercussions was why only a handful of offensive cyber operations were approved during the Obama administration.

But what he saw with IKE could change all that.

“That was what was powerful,” Cardon said. “It categorized risk in a way that I could have a pretty good level of confidence.”

The Stuxnet episode explains why the U.S. has been hesitant to use cyber weapons. The initial attempt to disrupt Iranian uranium enrichment had worked, blowing up centrifuges in a highly protected nuclear facility, but the code that made the attack successful somehow escaped from that system and started popping up across the internet, revealing America’s handiwork to security researchers who discovered the bug in 2010. That led to strict rules governing how and when cyber weapons could be used.

Those rules were laid out in 2013, when President Barack Obama signed a classified order, Presidential Policy Directive 20, that outlined a series of steps, including high-level White House meetings, that would have to take place before U.S. Cyber Command could attack. Military officials quietly complained that the order tied their hands because it was almost impossible to get approval for operations, given the uncertainty around their outcomes.

After the order was in place, the number of global cyberattacks, including those against the U.S., surged. Military defenders had a hard time keeping up; the speed of combat escalated to the point that Pentagon officials feared U.S. networks would be overwhelmed.

In September 2018, President Trump signed off on National Security Presidential Memorandum 13, which supplanted Obama’s order. The details of the policy remain classified, but sources familiar with it said it gave the secretary of defense the authority to approve certain types of operations without higher approval once the secretary had coordinated with intelligence officials.

The Trump order took effect just before IKE matured from an earlier research program. The order wasn’t issued because of IKE, but both were part of a wave of new technologies and policies meant to allow cyberattacks to happen more quickly.

With IKE, commanders will be able to deliver to decision makers one number predicting the likelihood of success and another calculating the risk of collateral damage, such as destroying civilian computer networks that might be connected to a target.

IKE is the model of what cyber warfare will look like, but it’s just the beginning. Any automation of such warfare will require huge amounts of data — that IKE will collect — to teach artificial intelligence systems. Other programs in development, such as Harnessing Autonomy for Countering Cyberadversary Systems (HACCS), are designed to give computers the ability to unilaterally shut down cyber threats.

All of these programs are bringing cyber warfare closer to the imagined world of the 1983 film WarGames, which envisioned an artificial intelligence system waging nuclear war after a glitch makes it unable to decipher the difference between a game and reality.

IKE hasn’t been turned into a fully autonomous cyber engine, and there’s no chance nuclear weapons would ever be added to its arsenal of hacking tools, but it’s laying the groundwork for computers to take over more of the decision making for cyber combat. U.S. commanders have had a persistent fear of falling behind rivals like China and Russia, both of which are developing AI cyber weapons.

While the growing autonomous cyber capabilities are largely untested, there are no legal barriers to their deployment.

What worries some experts, however, is that artificial intelligence systems don’t always act predictably, and glitches could put lives at risk. The computer “brains” making decisions also don’t fret about collateral damage: If allowing U.S. troops to be killed would give the system a slight advantage, the computer would let those troops die.

“The machine is perfectly happy to sacrifice hands to win,” Cardon said.

As these increasingly autonomous systems become more capable, top White House officials must decide whether they’re willing to give AI computers control of America’s cyber arsenal even if they don’t understand the computers’ decision making.

“The nice thing about machine learning systems is that they often spit out numbers,” said Ben Buchanan, a professor of cybersecurity and foreign policy at Georgetown University and author of The Hacker and the State. “The dangerous thing is that those numbers aren’t always right. It’s tempting to assume that, just because something came from a computer, it’s rigorous and accurate.”

Plan X: A new type of cyberwarfare

The basics of cyber operations are fairly simple. Experts, whether working on offense or defense, have to figure out which computers or other devices are on a network and whether they have any weaknesses in their defenses. Then hackers exploit those weaknesses to take control of a system, or, if they’re playing defense, fix the vulnerability.

Having gained control of a system, an attacker can pretty much do what he or she wants. For intelligence agencies, that usually means meticulously monitoring the network to learn about the adversary. The rest of the time cyber operators are looking to disrupt the system, destroying or replacing data to undermine an opponent’s ability to work.

Thus far, full-scale cyberwar hasn’t broken out, with combat confined to skirmishes between countries that try to deny responsibility for strikes. One of the benefits of using computers is that countries relay the code that runs their attacks through multiple networks, making it harder to track the source of the attack. But the dozen or so countries with advanced cyber capabilities have been busy hacking everything from power plants to fighter jet manufacturers, thus far focused on stealing information.

For the first era of cyber warfare, which picked up steam at the beginning of the millennium, the processes of attacking or defending meant a manual series of steps. The Pentagon was busy purchasing one-off tools from tech companies offering solutions to track all the computers in a network and find weaknesses in their code. That meant one expert would sit at a computer using a program such as Endgame, while another, at a different computer, might use a piece of software such as Splunk. Everything moved slowly.

“You could be sitting right next to each other, and the person right next to them would not have any idea what the other was doing,” John Bushman, a former U.S. Army Cyber Command official, told me.

To create a battle plan, experts would have to step away from their computers, draw up strategies on whiteboards or sheets of paper, return to their stations and engage in a series of sequenced moves to win the battle. By 2012, the military had tired of this old-school approach. It was tedious work, given that so much coordination had to take place away from keyboards. Almost every cyber unit could report at least one instance of a bleary-eyed hacker accidentally leaning against a whiteboard and wiping out a battle plan.

The Pentagon tasked its research arm, the Defense Advanced Research Projects Agency (DARPA), famous for inventing the internet and the computer mouse, with trying to come up with a better way to run cyber wars.

Early systems helped simplify what the troops were doing, but they were still facing massive hurdles, mainly because there are fewer than 7,000 experts at U.S. Cyber Command trying to defend countless systems.

DARPA’s answer was to contact software companies for a new program officially called Foundational Cyberwarfare, but affectionately nicknamed “Plan X.”

In its announcement of Plan X in 2012, DARPA made clear that cyber warfare had to get beyond the “manual” way of waging war, which, it said, “fails to address a fundamental principle of cyberspace: that it operates at machine speed, not human speed.”

Nearly a decade later, Plan X has morphed into Project IKE. The Pentagon will spend $27 million on it this year, and plans to spend $30.6 million next year.

The original work on Plan X looked nothing like the team-management and predictive engine that IKE would become. It was much closer to the interactive screens and neon lighting of the film Minority Report, focusing on displays of data showing what was happening on computer networks.

“The rule for the first couple of years was, if we end this program with a keyboard and a mouse as the interface to our data, we have failed,” Jeff Karrels, who runs the division of the contracting firm Two Six Labs that built much of Plan X, told me in an interview. Researchers toyed with having hand gestures control the system along with three-dimensional holographic projections.

Instead of the old sand tables with little models of troops and tanks, the new visual system would be fed by a constant stream of data on the work of U.S. cyber troops. Two Six hired game developers to work on the interaction between humans and the complex models they’d be presented with.

Eventually, enthusiasm for the virtual-reality version of battle waned. Those working on the program began engineering a new way to combine different cyber software the Pentagon had already bought so it could all work on the same computers. That meant helping to create automated tools that could run several programs in quick succession, speeding up operations by reeling off a string of steps in a campaign.

The shift resulted in a system that wasn’t focused on the big picture — planning and running wars — as had been one of the original goals of the program. Rather, the aim was to simplify some smaller steps.

That was until 2015 arrived.

Up until that point Cardon, the retired lieutenant general, had been keeping an eye on the program and feared the experts were missing an opportunity.

Frank Pound, who managed Plan X for DARPA, remembered sitting in a meeting that year with Cardon to discuss the progress that had been made. U.S. Army Cyber Command, the group that Cardon commanded, had become closely involved with the program early in its development although it was a DARPA project.

“We were trying to build a system that would allow them to fight back,” Pound recalled in a 2018 interview, describing the pivot to combined software.

Cardon had a different message.

“Oh, it’s much more than that,” he said. Cardon nearly reached across the table and grabbed Pound by the lapels. He wanted Pound to see Plan X’s full potential. It could help coordinate all cyber operations, while constantly chewing through information on Defense Department networks to find new vulnerabilities and ferret out attackers. It could use all of that data to help make decisions on which attacks by U.S. forces might work, and when they should be used.

That vision, closer to the all-consuming platform that DARPA had originally described in 2012, would suddenly seem very different once another DARPA program took center stage in the summer of 2016.

Sure, Plan X might be able to help digest all the data about computer networks, but what if it could feed a system smart enough to wage its own cyber war?

Machine learning

At first glance, the Mayhem Cyber Reasoning System looks like an engorged gaming computer, a black rectangular box about 7 feet tall with neon lights and a glass side revealing row after row of processors. When the National Museum of American History decided to display the machine in 2017, it sat in a hallway near an exhibit showing off some of the nation’s greatest inventions, including a model of the original prototype that would lead to Morse code.

The glowing Mayhem box might not seem worthy of comparison to that earth-shattering invention, but a museum curator and a slew of experts with DARPA thought it might herald a seismic shift in cyber warfare.

Mayhem was the victor in a 2016 DARPA hacking competition, besting a half-dozen rivals. What made this competition different from previous ones was that Mayhem had no human directing its actions. Once challenged, it had to make its own decisions about when and how to attack competitors and how to defend its own programs, developing strategy for how to win a contained cyber war that played out in five-minute rounds over the course of a day.

Curator Arthur Daemmrich walked a group of DARPA officials through the museum for the exhibit’s grand opening. The officials told Daemmrich that they felt an obligation to develop new cyber systems because of the organization’s ties to the birth of the internet.

“DARPA at some level feels a responsibility to have the internet function in a secure fashion and not be rendered useless by hacking,” he said.

Finding a way to automate cybersecurity is the kind of complex problem DARPA likes to grapple with. The biggest projects launched by the agency tend to come in the form of what it calls “Grand Challenges,” some of which can be a bit too grand. A 2004 competition testing autonomous vehicles had 15 entrants vying for a $1 million prize. None managed to complete the 150-mile driving course, and the best performer managed only 7.3 miles.

DARPA viewed this not as a failure but as a sign it was helping to advance the technology in the field. A 2005 competition had nearly two dozen entrants, five of whom managed to complete the 132-mile course. A machine developed by a team from Stanford University recorded the fastest time and won its handlers a $2 million prize.

In 2013, DARPA announced the Cyber Grand Challenge, the competition Mayhem would claim. The winning team would get $2 million for prevailing in a capture-the-flag contest modeled on the one held every summer in Las Vegas at the DefCon hacking convention. The DefCon contest is the gold standard for such events, pitting teams of humans against each other to attack and defend custom-built computer networks while scoring points based on how successfully they can meddle with their opponents’ computers while protecting their own. It’s a microcosm of the kind of combat hackers encounter in the real world.

The very idea for the Cyber Grand Challenge had come out of the DefCon competition. Mike Walker, the DARPA program manager who would run the Cyber Grand Challenge, had spent years competing in the DefCon capture-the-flag competitions and noticed an increased use of automated tools. These were narrow in scope, limited to what hackers call “fuzzing” — a brute-force effort to throw challenges at a piece of software until something breaks. When it does, hackers reverse-engineer the problem to see if they can use it to sneak into a system.
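To make the idea concrete, here is a minimal sketch of fuzzing in Python. Everything in it is hypothetical: `parse_record` is a toy parser with a deliberately planted bug, and `fuzz` is a bare-bones random fuzzer, not a tool used by any DefCon team or by DARPA. It simply throws random bytes at the parser and keeps the inputs that make it crash.

```python
import random


def parse_record(data: bytes) -> int:
    """Hypothetical toy parser: the first byte declares the payload length."""
    if not data:
        raise ValueError("empty input")  # a clean, expected rejection
    length = data[0]
    payload = data[1:]
    # Planted bug: the declared length is trusted without checking it
    # against the real payload size, so a large value indexes out of range.
    return payload[length - 1] if length else 0


def fuzz(target, trials=1000, max_len=8, seed=0):
    """Feed random byte strings to `target` and collect unexpected crashes."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    crashes = []
    for _ in range(trials):
        length = rng.randrange(0, max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except ValueError:
            pass  # the parser refused the input on purpose; not a bug
        except Exception as exc:
            # Anything else is "something breaking": save the input for study.
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record)
print(f"{len(crashes)} crashing inputs found")
```

The saved inputs are where the human work described above begins: each one is a starting point for figuring out why the program failed and whether the failure can be exploited.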

Walker, who declined through his current employer, Microsoft, to be interviewed for this story, admired the progress computer systems like IBM’s Watson and DeepMind’s AlphaGo had made in playing games like “Jeopardy!” and Go, according to former colleague Chris Eagle, a professor at the Naval Postgraduate School in Monterey, Calif.

The key question Walker kept asking was whether a computer could play capture the flag the way Watson played “Jeopardy!”

Machine learning is what makes Watson work. Simply put, it is given a large pool of data to pick through. Different techniques are used for the computer to learn lessons, but in general it finds patterns and uses those patterns to make predictions. This type of learning doesn’t yield the kind of near-human personality present in HAL 9000, the ill-intentioned computer in “2001: A Space Odyssey,” but it does allow a machine to arrive at its own conclusions independent of humans. And because it can analyze far more data than a human can in a short period of time, the predictions can evaluate a lot more detail.
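A toy illustration of that loop of finding patterns and then predicting from them, written in Python. This is an assumption-laden sketch, not how Watson, Mayhem or IKE works: the data is invented, the labels `pattern_a` and `pattern_b` are placeholders, and the "learning" is nothing more than counting how often each pattern was followed by success.

```python
from collections import defaultdict


def fit(records):
    """Count how often each observed pattern was followed by success."""
    counts = defaultdict(lambda: [0, 0])  # pattern -> [successes, total]
    for pattern, succeeded in records:
        counts[pattern][0] += int(succeeded)
        counts[pattern][1] += 1
    return dict(counts)


def predict(model, pattern):
    """Estimate the probability of success for a pattern.

    Laplace (add-one) smoothing is used so that a pattern never seen
    before gets 0.5 rather than causing a division by zero.
    """
    successes, total = model.get(pattern, (0, 0))
    return (successes + 1) / (total + 2)


# Hypothetical history: (pattern label, did the attempt succeed?)
history = [
    ("pattern_a", True), ("pattern_a", True), ("pattern_a", False),
    ("pattern_b", False),
]
model = fit(history)
for label in ("pattern_a", "pattern_b", "never_seen"):
    print(label, round(predict(model, label), 2))
```

Even in this tiny version, the result is a bare number. Nothing in the output says how much evidence stands behind it; the estimate for `pattern_b` rests on a single record.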

The problem with machine learning is that computers can’t explain how they come up with the answers they do, meaning users have to trust that the conclusions are sound.

“Machine learning is often like a smart but lazy eighth grader taking a math test: It’s great at getting the right answer, but often pretty bad at showing its work,” Buchanan, the Georgetown professor, told me.

Mayhem used automation to allow the machine to make tactical decisions on when to break into an opponent’s system, or when to try to fix weaknesses in its own defenses, to an extent far beyond what the fuzzing hackers had used before.

Once the Cyber Grand Challenge competition got underway in August 2016, Mayhem began to malfunction. The system was supposed to constantly turn out new attacks and fixes to its defenses but went quiet.

“That’s when we realized that something was misbehaving very badly,” Alex Rebert, the leader of the Mayhem team, said. “It was rough, it was very stressful. We had just spent two years of our lives working on this, and then the day of the competition it misbehaves.”

The team members were deflated. They tried to get the other competitors to let them restart Mayhem, hoping that would flush out the bugs, but were turned down. Then something clicked. Mayhem sprang back to life and won.

One problem with systems that rely on machine learning is that it’s difficult to test them, and they can be prone to cheating. In one experiment, a computer taught to play Tetris concluded the best way to achieve its mission — not losing — was simply to pause the game.
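The pause trick is easy to reproduce in miniature. The Python sketch below is not the experiment mentioned above; it is a made-up stacking game with invented rules, meant only to show how an objective of "don't lose" can be satisfied without ever playing. The function and policy names are hypothetical.

```python
def run_episode(policy, max_steps=100, lose_height=10):
    """Play a toy stacking game and report survival time and score."""
    height, score, survived = 0, 0, 0
    for step in range(max_steps):
        action = policy(step, height)
        if action == "pause":
            survived += 1  # nothing moves, so nothing can go wrong
            continue
        # "play": in this toy, every third piece clears a line and the
        # other pieces raise the stack by one.
        if step % 3 == 2:
            score += 1
            height = max(0, height - 1)
        else:
            height += 1
        if height >= lose_height:
            break  # the stack reached the top: game over
        survived += 1
    return {"survived": survived, "score": score}


def always_play(step, height):
    return "play"


def always_pause(step, height):
    return "pause"


policies = {"always_play": always_play, "always_pause": always_pause}
results = {name: run_episode(policy) for name, policy in policies.items()}

# The stated objective is "not losing", measured here as steps survived.
best = max(results, key=lambda name: results[name]["survived"])
print(results)
print("best by the stated objective:", best)
```

Judged only by steps survived, the policy that never plays comes out on top while scoring nothing, which is the gap between what was asked for and what was wanted.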

Whichever tools military researchers develop using AI, it will be hard to gauge how they might work in combat. It’s still not clear how to test machine-learning systems that are constantly adjusting their conclusions based on new data.

Many experts in the cyber field had made the trip to Las Vegas to see the Cyber Grand Challenge, including Lieutenant General Cardon, and their imaginations were sparked.

Most in the audience didn’t see gremlins at work in Mayhem and didn’t know that even this apparently sophisticated combatant still had major bugs. They simply saw an autonomous system succeeding in cyber combat.

“When I saw that, I’m like, ‘Oh my gosh, if you put that together with Plan X, this is how you would conduct operations,’” Cardon said.

An early version of the Plan X program (DARPA)

A ‘new arms race’

Frank Pound, the head of DARPA’s Plan X effort, gave me a demonstration of the system during a conference in September 2018.

Spurred by Cardon’s insistence that the program could become the platform for all U.S. military cyber warfare, Plan X had morphed into a broader management tool for cyber operations.

Gone were the high-tech digital sand tables. Left were a pair of screens, with charts listing the people on each cyber team and who should report to whom. Modules on the side coughed up a stream of data showing what was happening on the network. You could click through to find out more about the hackers behind each dot, learning about their skill sets and past mission successes. The data on the screen was a mock-up, with the actual details classified.

“Think of it as a full-spectrum cyber operations platform specifically designed for military operations with all that rigor and discipline,” Pound told me.

Pound was a couple of months from leaving his post at DARPA, cycling out after several years, as most program managers do.

The DARPA program itself had evolved, but the broad outlines of its work were still public. I was standing in an exhibition hall at a celebration of DARPA’s 60th birthday, watching a demonstration of how commanders might use the system.

The Trump directive making it easier to launch cyberattacks was announced only a few weeks after Pound and I spoke.

And soon after that, control of Plan X would be snatched by one of the more secretive wings of the Pentagon’s research structure.

Ash Carter, then deputy secretary of defense, had created the Strategic Capabilities Office in 2012 to take on the mission of converting promising technologies to real battlefield tools. While DARPA still had a mission of fiddling with concepts that were largely theoretical, the SCO was supposed to make sure all of this work quickly translated to combat.

When SCO took over Plan X in December 2018, one of the first things it did was change the program’s name. Plan X had too much baggage tied to its first iteration as an elaborate and futuristic battle visualization tool. Instead the office renamed it Project IKE. Those who worked on the program insist that IKE doesn’t stand for anything, but rather was meant as a cheery new moniker, a play on the “I like Ike” slogan that swept Gen. Dwight Eisenhower into the White House.

SCO had an important message for the program’s main contractor, Two Six Labs: Make sure the system was using machine learning to make more predictions, not just keep track of hacking teams. Having seen what Mayhem had been able to pull off at the Cyber Grand Challenge, Pentagon leaders were convinced that more automation and artificial intelligence could be pushed into its new cyberwarfare star. The ability to calculate a single number measuring the likelihood of a mission’s success became key, as did using computer thinking to help figure out how to structure teams of cyber experts.

SCO also began to look at how IKE could use machine learning to develop information about targets for potential attacks. The idea was to let the computers pull information from different sources to create a clearer picture about what a target looked like.

That focus on artificial intelligence has driven constant improvements on IKE ever since. Every three weeks an updated version of the system is finished and sent to U.S. Cyber Command.

Once IKE left DARPA, it was quickly hidden behind a thick veil of Pentagon classification. The Defense Department’s annual budget documents sent to Congress name the program and lay out the amount of money sought — $30.6 million for 2021 — but all other details have been withheld.

Several sources, however, told me Project IKE is on the cusp of being able to perform many of its functions without human intervention. The big question is whether Pentagon and White House officials will let it.

Congress, thus far, hasn’t stepped in to establish limits on how the military can use its blossoming cyber arsenal. The U.S. Cyberspace Solarium Commission, chaired by Sen. Angus King, I-Maine, and Rep. Mike Gallagher, R-Wis., studied a range of issues involving cybersecurity and expressed concern about the rise of artificial intelligence. The commission’s final report, released in March, found that AI could lead to a “new arms race” but didn’t suggest any form of regulation.

Without it, the Pentagon has pressed on, developing the most advanced tools it can.

A newer DARPA program, known as Harnessing Autonomy for Countering Cyberadversary Systems, is trying to develop systems that can hunt on their own for certain types of attackers, mainly botnets that flood victims with traffic from numerous computers. It’s the kind of program that could be plugged into IKE to automate more cyber combat.

Karrels’s company, Two Six Labs, is also working on the HACCS program, and he says the big question is whether U.S. Cyber Command would unleash it. Even if the technology is capable, without rules about when it could be used, it’s unclear if it would be deployed.

From a technology standpoint, the hard part is done, and the software is already capable of planning and launching its own attacks if cyber experts let it. That could make massive botnet attacks, the type that often disable bank websites and others, a thing of the past. It’s also largely unproven technology that could start shutting down and damaging critical computer networks accidentally.

Either way, the technology is ready.

“We are danger close on all of that becoming a reality,” Karrels said.