The intelligence community’s research arm announced it has awarded research contracts to four organizations under a language processing software program.
The Intelligence Advanced Research Projects Activity awarded Johns Hopkins University, Raytheon BBN Technologies, Columbia University and the University of Southern California Information Sciences Institute research contracts for the Machine Translation for English Retrieval of Information in Any Language, or MATERIAL, program.
Originally announced in January 2017, MATERIAL seeks to let users quickly develop and deploy systems that allow English-only speakers to accurately and efficiently identify foreign-language documents of interest across social media, news wires and broadcasts.
“The collection and analysis of information required to accomplish a specific intelligence task has increasingly become a multilingual venture,” said Dr. Carl Rubino, IARPA program manager.
“For most languages, there are very few or no automated tools available for cross-lingual data mining and analysis. The MATERIAL Program aims to investigate how current language processing technologies can most efficiently be developed and integrated to respond to specific information needs against multilingual speech and text data.”
According to a notice announcing a proposer’s day in August, MATERIAL will retrieve relevant data from a large, multilingual repository and display it in English as query-biased summaries. Each query will consist of two parts: a domain specification and an English word or phrase that captures the information an English speaker needs, such as “zika virus” in the world of government vs. “zika virus” in the health care world. The English summaries would then convey where the retrieved information was relevant, the notice said.
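To make the two-part query concrete, here is a minimal Python sketch. It is illustrative only and is not drawn from the IARPA notice or from any awardee’s system: the `Query`, `Document` and `retrieve` names, the toy repository and its language codes are all hypothetical, and the sketch assumes each document already carries a domain label and an English gloss, which is precisely the hard part the program is meant to solve.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Query:
    """Hypothetical two-part query: a domain plus an English word or phrase."""
    domain: str
    phrase: str


@dataclass(frozen=True)
class Document:
    """Hypothetical repository entry (all fields are assumptions)."""
    doc_id: str
    language: str
    domain: str          # assumed to be labeled in advance
    english_gloss: str   # assumed stand-in for a machine translation


def retrieve(query: Query, repository: List[Document]) -> List[str]:
    """Return one-line English summaries for documents matching both query parts."""
    summaries = []
    for doc in repository:
        in_domain = doc.domain == query.domain
        mentions_phrase = query.phrase.lower() in doc.english_gloss.lower()
        if in_domain and mentions_phrase:
            summaries.append(
                f"{doc.doc_id} ({doc.language}, {doc.domain}): {doc.english_gloss}"
            )
    return summaries


# Toy data, invented for illustration.
repository = [
    Document("doc-1", "sw", "health", "Clinics report new Zika virus cases."),
    Document("doc-2", "tl", "government", "Ministry funds Zika virus response."),
    Document("doc-3", "sw", "health", "Hospital opens a maternity ward."),
]

health_hits = retrieve(Query("health", "zika virus"), repository)
government_hits = retrieve(Query("government", "zika virus"), repository)
```

In this sketch the same phrase returns a different document depending on the domain part of the query, which mirrors the “zika virus” example in the notice.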
Moreover, the notice said current methods for producing similar technologies require a substantial investment in training data and/or language-specific development and expertise. The program seeks to drastically shrink the time and data needed to field systems capable of fulfilling an English-in, English-out task.
MIT Lincoln Laboratory, the University of Maryland Center for Advanced Study of Language, the National Institute of Standards and Technology and Tarragon Consulting make up the MATERIAL test and evaluation team, which will assess the performance of a variety of complex end-to-end solutions developed by the contract awardees, IARPA said.
Mark Pomerleau is a reporter for C4ISRNET, covering information warfare and cyberspace.