Jeff Sorenson is president and partner, A.T. Kearney Public Sector & Defense Services, and former Army CIO/G6.
As stated in my last post, Big Data has the potential to change the way the Department of Defense accomplishes its mission. However, to take advantage of this potential, the DoD must overcome a few challenges.

First among these challenges is building a smart data “haystack”: in other words, a complex yet comprehensive and interconnected source of enterprise data.

The DoD has multiple systems for collecting reams of data — data that, unfortunately, is often duplicative, incomplete, and not interconnected in any structured manner. As a result, data cannot be mined by decision makers into a single, clear view of day-to-day business operations, or of the enemy. In the field environment, critical data needed to understand enemy actions is collected, but can’t be found. This data is a needle buried in a big haystack. Thus, it is of no value.

To manage all this data, the DoD needs to start by intelligently designing and building a haystack that has each data set readily available when needed to assist the department in fulfilling its mission. This haystack must achieve two specific goals. First, it must address the DoD’s operational requirements for managing and integrating two distinct kinds of data: structured (e.g., data stored in relational databases, spreadsheets) and unstructured (e.g., videos, blogs, text files, social media). This step is critical to derive key insights from detailed analytics. Second, the DoD must create data governance models that enable better management of data growth and the sharing of data across the value chain in a sustainable, coordinated manner.

While it will be challenging, the DoD must improve its data management capabilities to be more effective. Today, leading companies build smarter data haystacks because they have evolved their ability to gather, store, and mine data through four stages of data management. They have progressed from (1) enhancing the use of structured data, through (2) gathering internal unstructured data and (3) integrating external unstructured data, to (4) amalgamating and sharing data. The DoD, however, seems stuck between stages two and three.

Equally important, leading companies have advanced their Big Data capabilities by following four leading practices that are key to building a smarter data haystack — practices that ensure organizations have the technology and tools necessary to manage and integrate structured and unstructured data:

Optimize structured data processing. Leading companies actively manage and organize traditional sources of structured data with a master data management (MDM) strategy, making certain they consider all sources of transactional data residing in multiple systems within the organization.

Test semi-structured and unstructured data. Leading companies experiment with Hadoop-based appliances offered by leading vendors to integrate data across the structure spectrum. Such experimentation helps them better understand the facets of unstructured data and makes them mindful of how to build the capabilities needed to capture it.

Design architectures for data analytics. When leading companies design an enterprise architecture and infrastructure, they consider upgraded analytics platforms that can scale for new uses. This design helps them plan for the future, including the potential uses and corresponding data needed to support future-state analytics.

Embrace open source platforms. Leading companies take advantage of major innovations in data analytics that occur mostly in the open-source community. They use the platforms’ greater flexibility to better manage costs as they test and design different aspects of the data infrastructure.
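
To make the first of these practices concrete, here is a minimal sketch of what a master data management step might look like for structured data held in more than one system. Everything in it is an assumption made for illustration: the system names (logistics_db, finance_db), the record fields, and the rule of keeping the first non-missing value are hypothetical, not a description of any DoD or vendor system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    source: str                      # hypothetical name of the originating system
    part_number: str                 # identifier as typed into that system
    description: Optional[str] = None
    unit_cost: Optional[float] = None


def normalize_key(part_number: str) -> str:
    """Strip formatting differences so the same item matches across systems."""
    return part_number.replace("-", "").replace(" ", "").upper()


def build_master(records: list[Record]) -> dict[str, dict]:
    """Merge duplicative, incomplete records into one master entry per item."""
    master: dict[str, dict] = {}
    for rec in records:
        key = normalize_key(rec.part_number)
        entry = master.setdefault(
            key, {"description": None, "unit_cost": None, "sources": []}
        )
        # Illustrative merge rule: keep the first non-missing value per field.
        if entry["description"] is None:
            entry["description"] = rec.description
        if entry["unit_cost"] is None:
            entry["unit_cost"] = rec.unit_cost
        entry["sources"].append(rec.source)
    return master


# Made-up sample data: the same item appears in two systems, each incomplete.
records = [
    Record("logistics_db", "ab-1001", description="Radio battery"),
    Record("finance_db", "AB 1001", unit_cost=42.5),
    Record("finance_db", "CD-2002", description="Antenna mast", unit_cost=310.0),
]
master = build_master(records)
```

In this sketch the two partial entries for the same part collapse into a single master record that carries both the description and the cost, along with a list of the systems that contributed to it. A real MDM effort would need far more careful matching and conflict-resolution rules than the simple key normalization shown here.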

With these practices for integrating structured and unstructured data in place, the difficult part of maintaining the integrity and quality of the haystack comes into play. A comprehensive data governance process looks at several key elements, including maintenance policies, data ownership/stewardship, change control processes, management of a data dictionary and a standard toolkit. Attention to these areas enables the adoption of the following four commercial best practices:

Drive standardization and automation. Each company has agreed-upon and transparent standards for the definition of data. It uses clearly understood calculations to derive critical metrics. And it uses active and ongoing monitoring of the validity of the data and the usage of standard reports.

Enforce a single information source. Each company has a single agreed-upon source of business information that is consistent across all views, and it does not accept reports or information from any other source. It uses mandated and centrally maintained tools.

Build analytics capabilities. Each company has a common suite of reports and a publishing schedule that supports harmonized stakeholder requirements. It also has a reporting tool/group that allows users to perform analysis and drill down without compromising the integrity of the source data or requiring change requests.

Manage demand for ad-hoc analysis. Each company has robust demand management processes for prioritizing ad-hoc requests that cannot be addressed as standard, and it monitors ad-hoc reporting to determine if new standardization is needed.
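
As a rough illustration of the standardization and single-source practices above, the sketch below shows a small data dictionary, a validity check against it, and one shared calculation for a metric. The field names, the "readiness rate" metric, and the sample reports are all hypothetical and chosen only to show the idea; they are not drawn from any actual DoD standard.

```python
# Hypothetical data dictionary: one agreed-upon definition per field.
DATA_DICTIONARY = {
    "unit_id": {"type": str, "required": True},
    "vehicles_assigned": {"type": int, "required": True},
    "vehicles_operational": {"type": int, "required": True},
}


def validate(record: dict) -> list[str]:
    """Return the problems found when checking a record against the dictionary."""
    problems = []
    for field, rule in DATA_DICTIONARY.items():
        if field not in record or record[field] is None:
            if rule["required"]:
                problems.append(f"missing {field}")
        elif not isinstance(record[field], rule["type"]):
            problems.append(f"{field} should be {rule['type'].__name__}")
    return problems


def readiness_rate(record: dict) -> float:
    """The single shared calculation for this illustrative metric."""
    return record["vehicles_operational"] / record["vehicles_assigned"]


# Made-up sample reports: one clean, one with a wrong type, one incomplete.
reports = [
    {"unit_id": "A1", "vehicles_assigned": 20, "vehicles_operational": 15},
    {"unit_id": "B2", "vehicles_assigned": "20", "vehicles_operational": 18},
    {"unit_id": "C3", "vehicles_assigned": 10},
]

valid = [r for r in reports if not validate(r)]
rejected = {r["unit_id"]: validate(r) for r in reports if validate(r)}
rates = {r["unit_id"]: readiness_rate(r) for r in valid}
```

Only the report that conforms to the dictionary feeds the metric; the other two are set aside with a stated reason, which is the kind of ongoing validity monitoring the first practice describes. Because every consumer calls the same readiness_rate function, the metric is derived the same way in every view.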

Building a smart data haystack is a difficult but necessary step if the DoD is to take advantage of Big Data’s potential. Only after the haystack is designed and built with data governance best practices will the department be ready to mine this data into actionable, predictive insights, as leading commercial companies do today.

The next blog post will cover how to mine that data for actionable, predictive insights.

