The Importance of FAIR Data in Earth Science
Data’s valuation as an enterprise asset is most acutely realized over time. When properly managed, the same dataset supports a plurality of use cases, becomes almost instantly available upon request, and is exchangeable between departments or organizations to systematically increase its yield with each deployment.
These benefits of treating data as an enterprise asset are the foundation of GO FAIR’s Findable, Accessible, Interoperable, Reusable (FAIR) principles, which are profoundly influencing data management practices in geological science. Numerous organizations in this space have embraced these tenets to swiftly share information across a diversity of disciplines and to safely guide the stewardship of the earth.
According to Dr. Annie Burgess, Lab Director of Earth Science Information Partners (ESIP), the “most pressing global challenges cannot be solved by a single organization. Scientists require data collected across multiple disciplines, which are often managed by many different agencies and institutions.” As numerous members of the earth science community are realizing, the most effective means of managing those disparate data according to FAIR principles is to use the semantic standards underpinning knowledge graphs.
These uniform approaches to managing metadata, data models, and terminology are the crux of the FAIR data movement, ensuring data’s place as a prized asset of the scientific community.
Communal Science
The semantic standards supporting knowledge graphs are designed for uniquely identifying, immediately accessing, and sharing data in a machine-readable format. They’re the same standards responsible for these advantages on the World Wide Web, and they are immensely beneficial for reusing data within the geological science field. This field is one of the more challenging scientific areas because it is so extensive, encompassing marine life, atmospheric concerns, land masses, and subterranean developments. The ability to rapidly share data across these specializations is integral to advancing the field as a whole, as are the ability to uniquely identify data and the ability to access them quickly via machine-readable techniques.
Observed Dr. Lewis McGibbney, data scientist for the California Institute of Technology’s Jet Propulsion Laboratory and co-chair of the NASA ESDSWG Search Relevance Working Group, “We are at an exciting stage for where there is a critical mass of experts and organizations around the globe with similar goals as well as the realization that we need knowledge-intensive applications. The semantic technology stack is a crucial piece for building intelligent apps for knowledge-intensive use cases within the geoscience area.” Moreover, semantic standards enable those organizations to publish data and findings in a reusable format so different organizations directly benefit from each other’s labor.
Linking Humans and Machines
The FAIR approach revolves around linking different pieces of data in a knowledge graph. Those knowledge graphs can in turn be linked between different organizations or ‘published’ on the web for universal access, which is critical for interoperability. This approach requires not only that each individual datum have its own unique identifier, but also that it carry a rich metadata description based on standardized vocabularies and taxonomies that machines can readily access and interpret. Semantic data models (ontologies) reconcile the inherent differences in the schemas that different organizations use for different applications, further aiding the interoperability of IT systems embracing FAIR principles.
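The idea can be illustrated with a minimal sketch in plain Python. It assumes two hypothetical organizations that record sea temperature under different local field names. Every identifier, field name, and vocabulary term here is invented for illustration; a real deployment would use a published ontology and a graph database rather than in-memory sets. Each datum receives a unique identifier, local fields are mapped to shared vocabulary terms, and the resulting graphs can then be merged and queried uniformly.

```python
# Hypothetical shared vocabulary terms (illustrative IRIs, not a real ontology).
SHARED = "https://example.org/vocab/"
SEA_TEMP = SHARED + "seaSurfaceTemperature"
OBSERVED_AT = SHARED + "observedAt"

# Each (hypothetical) organization maps its local field names to shared terms.
ORG_A_MAPPING = {"sst_c": SEA_TEMP, "time": OBSERVED_AT}
ORG_B_MAPPING = {"water_temp": SEA_TEMP, "timestamp": OBSERVED_AT}


def to_triples(record_id, record, mapping, base_iri):
    """Turn one local record into a set of (subject, predicate, object) triples."""
    subject = base_iri + record_id  # unique identifier for this datum
    return {(subject, mapping[key], value) for key, value in record.items()}


def find_by_predicate(graph, predicate):
    """Return sorted (subject, object) pairs for every triple with this predicate."""
    return sorted((s, o) for s, p, o in graph if p == predicate)


graph_a = to_triples(
    "obs-1",
    {"sst_c": 14.2, "time": "2020-01-01T00:00:00Z"},
    ORG_A_MAPPING,
    "https://example.org/org-a/",
)
graph_b = to_triples(
    "obs-7",
    {"water_temp": 13.8, "timestamp": "2020-01-02T00:00:00Z"},
    ORG_B_MAPPING,
    "https://example.org/org-b/",
)

# Because both graphs use the same predicates, merging is a plain set union.
merged = graph_a | graph_b
temps = find_by_predicate(merged, SEA_TEMP)
```

Once both sources are expressed against the shared terms, a single lookup on `SEA_TEMP` returns observations from both organizations, even though neither uses that name in its local schema.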
Monterey Bay Aquarium Research Institute Senior Software Engineer Carlos Rueda commented that “the Marine Metadata Interoperability Project developed the MMI Ontology Registry and Repository (ORR), which leverages AllegroGraph to provide powerful interoperable semantic services that make the content on the web interconnected in a meaningful way for both humans and machines to consume.” By enabling different scientific organizations in the Marine Metadata Interoperability Project to register ontologies of their myriad repositories in this standardized manner, data integration and accessibility are expedited.
Unified Diversity
Perhaps the chief advantage of implementing FAIR principles with knowledge graphs within the earth science community is the ability to standardize the assortment of diverse data relevant to scientists. The sheer number of different specializations in this field requires data of seemingly infinite varieties. Sources include sensor data from water, aerial, and terrestrial sources, in addition to satellite data and data from physical samples. Furthermore, these data are characterized by many different spatial and temporal resolutions, adding to the overall complexity of managing them homogeneously. In this respect, semantic data models are considerably helped by uniform vocabularies to describe data. Dr. Burgess alluded to the merit of “the ESIP Community Ontology Repository, a community platform to manage and exchange terms and vocabularies that assist scientists to publish, discover, and reuse data.”
Long Term Propagation
As the abundant use cases within the geological science community reveal, data’s true worth is based on its enduring reusability and immediate accessibility. These priorities spawned the FAIR movement, which depends on semantic technologies for implementation. This approach delivers the same benefit when applied to contemporary organizations: an increase in data’s value as an enterprise asset.
About the Author
Jans Aasman is a Ph.D. psychologist, expert in Cognitive Science and CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of AllegroGraph, the leading Semantic Graph Database. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as U.S. and foreign governments. Dr. Aasman has spent a large part of his professional career specializing in applied Artificial Intelligence projects, intelligent user interfaces, and telecommunications research. He has gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was a professor in the Industrial Design department of the Technical University of Delft and a noted conference speaker at events such as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Global Graph Summit, Text Analytics, and TTI Vanguard.