After having deployed to as many as six locations in Africa and the Middle East, personnel associated with the Defense Department’s hallmark automation effort say they are learning valuable lessons in how to use algorithms for war.
Known as Project Maven, the DoD initiative aims to accelerate the integration of big data and machine learning, first focusing on processing of full motion video from tactical drones and from medium altitude sensors.
Here’s what officials are saying:
It’s critical to retrain algorithms aimed at analyzing full motion video
“It’s not all that different if you think about it than a young airman coming onto my ops floor. On their first day of work it’s going to take them a little while to figure out what’s going on, to understand the mission, to understand the [area of responsibility] in which we’re operating,” Lt. Col. Garry “Pink” Floyd, deputy chief of the algorithmic warfare cross functional team, also known as Project Maven, said at the Modular Open Systems Summit in Washington May 1. “You see a similar version of that with [artificial intelligence], with these algorithms.”
Analysts are currently swimming in troves of data that are nearly impossible for humans to sift through.
To illustrate the problem, military personnel at the annual GEOINT symposium in Tampa, Florida, in late April said that in fiscal 2017, 127 terabytes of captured enemy data were collected. In addition, the video that Central Command collects annually is equivalent to 325,000 feature films, and the signals intelligence it collects annually is equal to roughly 5.5 million songs.
Floyd said that in Africa Command, where the first algorithm was deployed, staff retrained the machine six times in five days.
“Maybe they’re trained off of data from one region and we deploy it to a new region and maybe makes a few silly mistakes at first, but we’ve developed some tools to help do that quickly,” he said, noting these algorithms have been deployed to garrison processing, exploitation and dissemination sites.
Floyd explained that the team built a button into the user interfaces that literally says “train AI.”
“If you see the algorithm misidentify a palm tree as a person or something like that, the analyst, the operator can hit that train AI button, it captures the number of frames to the left of that instant, it captures the frames to right. Take that out of the theater and get it into our data labeling pipeline,” he said. “We relabel the data, we rapidly get that to our algorithm developer … they retrain optimize … and then we redeploy it to the field. We were thinking this would take a long time, but again, we were able to do this in like say a day or so or even less.”
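The capture step Floyd describes, grabbing frames on either side of the flagged instant, can be illustrated with a minimal Python sketch. This is not Project Maven's code: the names `FlaggedClip` and `capture_clip` and the 30-frame window sizes are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlaggedClip:
    """Frame range pulled around the instant an analyst flagged a mistake."""
    flagged_index: int
    start: int  # first frame index in the clip (inclusive)
    end: int    # last frame index in the clip (inclusive)
    note: str


def capture_clip(total_frames: int, flagged_index: int,
                 frames_before: int = 30, frames_after: int = 30,
                 note: str = "") -> FlaggedClip:
    """Return the window of frame indices around a flagged frame.

    The window sizes are hypothetical defaults. The window is clamped to
    the bounds of the video so flags near the start or end still work.
    """
    if not 0 <= flagged_index < total_frames:
        raise ValueError("flagged_index is outside the video")
    start = max(0, flagged_index - frames_before)
    end = min(total_frames - 1, flagged_index + frames_after)
    return FlaggedClip(flagged_index, start, end, note)


# Hypothetical example: a flag raised 10 frames into a 1,000-frame video.
clip = capture_clip(total_frames=1000, flagged_index=10,
                    note="palm tree labeled as person")
print(clip.start, clip.end)
```

In a workflow like the one described, a clip of this kind would then be handed to the labeling pipeline for relabeling before retraining.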
Data labeling is a necessity
The algorithms must have labeled, or what is often referred to as structured, data to work with as opposed to raw data or unstructured data. Without it, the algorithms struggle to complete their work.
Floyd said one approach to the data labeling issue is to have airmen who are waiting on their security clearances do the labeling.
“Some of our best pockets of data labelers are folks that are waiting orders or waiting to get their clearance and it’s just been tremendous,” he said. “That becomes their service to their country for those first three or four months while they’re waiting to get in the front door at Fort Meade or some other place like that.”
Adaptive interfaces
User interfaces for the operators and analysts leveraging the algorithms must be tailorable and adaptable, Floyd said.
“We need to give the user interface to the operators to where they can choose the algorithm they want to use for the mission that they’re on. An algorithm that’s tailored for this area of the world won’t be perfect for another area of the world,” he explained. “We know that we’ll have tailored algorithms for different geographic regions.”
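One way to picture that choice is a simple lookup from region to algorithm, with room for the operator to override it. The following Python sketch is purely illustrative: the region keys, model names and the `choose_model` function are invented, not drawn from the program.

```python
from typing import Optional

# Hypothetical catalog of region-tailored algorithms (names are invented).
MODELS_BY_REGION = {
    "region_a": "detector-region-a",
    "region_b": "detector-region-b",
}
# Hypothetical general-purpose fallback for regions without a tailored model.
DEFAULT_MODEL = "detector-general"


def choose_model(region: str, operator_choice: Optional[str] = None) -> str:
    """Pick the algorithm for a mission.

    An explicit operator choice always wins. Otherwise the region-tailored
    model is used, falling back to the general model for unknown regions.
    """
    if operator_choice is not None:
        return operator_choice
    return MODELS_BY_REGION.get(region, DEFAULT_MODEL)


print(choose_model("region_a"))
print(choose_model("region_c"))
print(choose_model("region_a", operator_choice="detector-region-b"))
```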
One of the most important aspects of deploying algorithms in conflict is going to be how operators use them, Floyd said, describing a “slide rule” of confidence regarding the algorithm’s performance.
Some operators might want the algorithm to cue only the detections in which its confidence is 80 percent or higher, he said. Others might slide the confidence threshold down to 20 percent because that might return hits the human wouldn’t normally see.
“We have seen that a couple of times already in the deployed environment where the algorithm picked up things the human eye missed,” Floyd said.
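The confidence “slide rule” amounts to a threshold on the algorithm's scores. A minimal Python sketch, assuming each detection carries a label and a confidence score between 0 and 1 (the `Detection` type, the `cue_detections` function and the sample scores are all hypothetical):

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str
    confidence: float  # model score between 0.0 and 1.0


def cue_detections(detections: List[Detection],
                   min_confidence: float) -> List[Detection]:
    """Keep only detections at or above the operator-chosen threshold."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    return [d for d in detections if d.confidence >= min_confidence]


# Invented sample scores for illustration only.
detections = [
    Detection("vehicle", 0.92),
    Detection("person", 0.45),
    Detection("person", 0.22),
]

strict = cue_detections(detections, 0.80)  # high bar: fewer cues
broad = cue_detections(detections, 0.20)   # low bar: more cues to review
print(len(strict), len(broad))
```

Lowering the threshold trades more false alarms for a better chance of surfacing objects an analyst might otherwise miss, which is the trade-off Floyd describes.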
Future of Project Maven
Floyd said DoD is still in the early stages of Project Maven, adding that while the technology has been deployed in six locations, more are on the way. There’s a pathway for additional sites in the U.S. and overseas throughout the rest of the year.
He alluded to additional use cases or applications beyond full motion video for Project Maven.
“It might be a little early to get too specific on those. But there might be another use case or two, another sensor type or two that we might go after soon,” he said.
Mark Pomerleau is a reporter for C4ISRNET, covering information warfare and cyberspace.