Thoughts on analytics, data management, visualization and collaboration

White House: “Big Data is indeed a Big Deal”

Posted by Brett Sheppard on April 2, 2012

The U.S. federal government announced over US$200 million in funding for a wide range of both new and already-in-progress initiatives related to Big Data. Dr. John Holdren, Director of the White House Office of Science and Technology Policy, spoke on Thursday, March 29, 2012, in a panel presentation and webcast at the American Association for the Advancement of Science (AAAS) auditorium in Washington, D.C.

“Big data is indeed a big deal” — John Holdren, Assistant to the President and Director, White House Office of Science and Technology Policy, at the Thursday March 29, 2012 panel presentation and webcast

At the forum, each official announced one or more initiatives that his or her federal agency was undertaking to embrace the opportunities and address the challenges of what the event described as the “Big Data Revolution.”

  • The National Science Foundation and National Institutes of Health: NSF announced new awards under its “Cyberinfrastructure for the 21st Century” framework and “Expeditions in Computing” programs, as well as awards that expand statistical approaches to address big data. The agency anticipates opportunities for cross-disciplinary efforts under its Integrative Graduate Education and Research Traineeship program and an Ideas Lab for researchers on using large datasets to enhance the effectiveness of teaching and learning. Among other NSF Big Data solicitations, NSF released “Core Techniques and Technologies for Advancing Big Data Science & Engineering,” or “Big Data,” jointly with the National Institutes of Health. This program aims to extract and use knowledge from collections of large data sets in order to accelerate progress in science and engineering research. Specifically, it will fund research to develop and evaluate new algorithms, statistical methods, technologies, and tools for improved data collection and management, data analytics and e-science collaboration environments.
  • Department of Defense: DOD is investing $250 million annually (with $60 million available for new research projects) across the military departments in a series of programs led by the Defense Advanced Research Projects Agency (DARPA). Among these, the XDATA program seeks to develop computational techniques and software tools for analyzing large volumes of semi-structured and unstructured data. Central challenges to be addressed include scalable algorithms for processing imperfect data in distributed data stores and effective human-computer interaction tools that are rapidly customizable to facilitate visual reasoning for diverse missions. The program envisions open source software toolkits for flexible software development that enable processing of large volumes of data for use in targeted defense applications.
  • National Security Agency: The U.S. intelligence community has identified a set of coordination, outreach and program activities to collaborate with a wide variety of partners throughout the U.S. government, academia and industry, as well as make its perspective accessible to the unclassified science community. For example, “Vigilant Net: A Competition to Foster and Test Cyber Defense Situational Awareness at Scale” will explore the feasibility of conducting an online contest for developing data visualizations in the defense of massive computer networks, beginning with the identification of best practices in the design and execution of such an event.
  • Department of Homeland Security: Among other Big Data projects, DHS is working with Rutgers University and Purdue University (with three additional partner universities each) on large, heterogeneous data that first responders could use to address issues ranging from manmade or natural disasters to terrorist incidents; law enforcement to border security concerns; and explosives to cyber threats.
  • Department of Energy: As part of the DOE Office of Science, the Office of Advanced Scientific Computing Research works with the data management, visualization and data analytics communities on technologies including the Kepler scientific workflow system; a Storage Resource Management standard; a variety of data storage management technologies, such as BeStMan, the Bulk Data Mover and the Adaptable IO System (ADIOS); FastBit data indexing technology (used by Yahoo!); and two scientific visualization tools, ParaView and VisIt.
  • National Aeronautics & Space Administration: Among other NASA projects, the Advanced Information Systems Technology (AIST) awards seek to reduce the risk and cost of evolving NASA information systems to support future Earth observation missions and to transform observations into Earth information as envisioned by NASA’s Climate Centric Architecture. Some AIST programs seek to mature Big Data capabilities to reduce the risk, cost, size and development time of Earth Science Division space-based and ground-based information systems and increase the accessibility and utility of science data.
  • Health and Human Services: Among other HHS projects, CDC’s Special Bacteriology Reference Laboratory (SBRL) identifies and classifies unknown bacterial pathogens for effective, rapid outbreak detection.
  • Department of Veterans Affairs: One of nine Veterans Affairs projects related to Big Data, the Genomic Information System for Integrated Science (GenISIS) is a program to enhance health care for Veterans through personalized medicine. The GenISIS consortium serves as the contact for clinical studies with access to the electronic health records and genetic data to conduct clinical trials, genomic trials and outcome studies.
  • Food and Drug Administration: The FDA “Virtual Laboratory Environment” will combine existing resources and capabilities to enable a virtual laboratory data network, advanced analytical and statistical tools and capabilities, crowdsourcing of analytics to predict and promote public health, document management support, and telepresence capability for worldwide collaboration. The aim is to make any location a virtual laboratory with advanced capabilities in a matter of hours.
  • U.S. Geological Survey: The John Wesley Powell Center for Analysis and Synthesis at USGS announced eight new research projects for transforming big data sets and big ideas about earth science theories into scientific discoveries.

For more on the funding for specific Big Data initiatives, see the White House fact sheet.

The federal officials were followed by a panel discussion moderated by Steve Lohr of The New York Times with:

  • Dr. Daphne Koller, Stanford University (machine learning and applications in biology and education)
  • James Manyika, McKinsey & Company (co-author of major McKinsey report on Big Data)
  • Dr. Lucila Ohno-Machado, UC San Diego (National Institutes of Health’s “Integrating Data for Analysis, Anonymization and Sharing” initiative)
  • Dr. Alex Szalay, Johns Hopkins University (big data for astronomy)

The federal funding announcements were particularly newsworthy in an election year marked by partisan gridlock and concerns about the U.S. federal budget deficit. According to Dr. Holdren, the overall objective is to accelerate the creation and adoption of better tools to access, store, search, visualize and analyze Big Data to extract insights, discover new patterns and make new connections across disciplines, and thereby contribute to economic growth and national competitiveness.
