Cutting Ocean Data Processing Time Fivefold
The growing fleet of robotic ocean sensors, coupled with the emergence of new and affordable monitoring technology, has exponentially increased the amount of data collected from the world’s oceans. This puts the decision-makers and researchers who work with these data in an entirely new situation.
The challenge is how to benefit from the abundance of ocean data while keeping the budgets for data acquisition, management, and processing within reasonable limits.
In early 2011, Rainer Sternfeld was working on manufacturing profiling data buoys. Sternfeld, who has experience in enterprise software, remote sensing, and product development, had built a prototype data buoy with his team in 2009. The buoy collected data properly, but processing and analyzing the data was slow and laborious.
“I realized then that the bottleneck of the ocean data market was not in collecting the data, but in processing the data,” said Sternfeld. That realization led him to conceive the idea for Marinexplore.
As Sternfeld discovered, most public ocean data is disconnected, often archived, and sometimes never used again. He found that professionals who rely on ocean data are isolated from each other and spend most of their exploration time on data processing. For example, when he spoke separately to two researchers on opposite coasts of the small Baltic Sea, both of whom were making measurements in the same area, he learned that they had never heard of each other.
Sternfeld had previously worked for ABB as the Baltic States Business Development Manager, leading it to win and establish the world’s first nationwide fast-charging infrastructure for electric cars. He was surprised that the methods and tools used for ocean exploration were outdated and failed to take advantage of the latest technologies.
“I asked people why there are no fast and intuitive software solutions for working with ocean data, and they all just answered that this is the way it’s always been,” said Sternfeld.
“After all that insight, I got a few important things figured out. First, it was clear that intelligent mobile devices are becoming the new standard of the instrumentation industry. Further, cloud computing and big data are disrupting the computing and collaboration paradigm in the world. And end-user expectations of software usability have changed dramatically during the last 10 years,” said Sternfeld, summarizing his findings. “What is needed is a solution that cuts data processing time at least fivefold and enables secure data management between partners, subcontractors, and federal agencies, for both open and proprietary data. We are going to solve this problem.”
With a vision for bringing ocean data exploration and analysis into the 21st century, Sternfeld launched his new business, Marinexplore, in February 2012. By July he had an alpha version of the first product.
By the beginning of October, the company had secured funding from investors in the U.S., Norway, Singapore, and Estonia. In October the company also released a significant product upgrade, integrating several satellite-based data sources and adding tools for working with data overlays.
Sternfeld has two co-founders. André Karpištšenko, the Technology Lead of Marinexplore, came from Skype, the technology flagship of Estonia, where he founded the Data Research Team. Kalle Kägi, the company’s Product Data Manager, previously worked on the data buoy with Sternfeld.
Sternfeld has assembled a team of top technology and science experts in the U.S. and in Estonia. The company has already started attracting top talent in the industry: in September it hired Roberto de Almeida, creator of the widely used open source libraries Pydap and scipy.io.netcdf.
Organizing Ocean Data
Marinexplore addresses the two biggest issues around managing marine-related big data: how to access the vast amount of data that reside in isolated silos, segregated and disconnected from each other; and how to make the time-consuming handling and processing of all this data more efficient.
“To solve these problems, Marinexplore is creating the world’s first ocean big data platform, dramatically cutting processing costs for offshore energy, fisheries and environmental analysis industries. To date there is no universal tool that organizes ocean data, and enables users to share public information securely across the globe,” said Sternfeld. Making data management significantly more efficient means someone will have less work to do. Sternfeld sees that as an opportunity, not a problem: “Marinexplore will empower the professionals with cutting-edge tools to focus on their real expertise. Now, people can finally focus on making really useful conclusions based on data instead of spending days with Excel and Matlab.”
The company has already aggregated over 1.2 billion in-situ measurements from more than 24,000 ocean-borne devices and two satellite products, presented through an easy-to-use interface. Oceanographic measurements from a growing list of public sources include NASA GHRSST model data, NASA Aquarius salinity data, NOAA NDBC stations, GTS buoys and drifters, Argo floats, and Liquid Robotics wave gliders.
Focus on Design and Visualization
The first thing that grabs the attention after diving into the Marinexplore tool is the design. While most map-based ocean data tools are designed alike, Marinexplore’s tools incorporate the latest technology: the simplicity and look and feel of today’s web applications.
The menu is minimalistic and the interface is dark, bringing the map and the data on it into focus. Sternfeld confirms that this was intentional, adding, “We just wanted to eliminate everything unnecessary to save space for what users are looking for – ocean data. Design, usability and data quality are key to our approach, because ultimately we are designing a process, not a tool.”
The ocean community has so far relied on traditional data serving solutions, like FTP servers and other data access protocols, which in usability terms mostly means the data can be accessed as a list of files. Marinexplore’s solution is based on visual search. Each oceanographic measurement comes with coordinates indicating the depth and location, so each data point can be tied to a GIS system. Sternfeld explains, “It is much more intuitive for humans to search geographic measurements using a map rather than browsing an endless table consisting of numbers.”
The system includes four filters for finding the necessary data on the map: location, parameter, data source, and device type. There’s also a dynamic time filter that can be accessed with a slider or calendar. Besides the default locations, a polygon tool can be used to choose any custom area according to the user’s preferences.
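To make the filtering idea concrete, here is a minimal Python sketch of how the four filters, the time window, and a custom polygon might be combined. It is an illustration only, not Marinexplore’s implementation: the `Measurement` record, the `filter_measurements` function, the bounding polygon, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Measurement:
    # Hypothetical record layout; the real platform's schema is not described.
    lon: float
    lat: float
    time: datetime
    parameter: str
    source: str
    device_type: str
    value: float


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_measurements(measurements, polygon=None, parameter=None,
                        source=None, device_type=None, start=None, end=None):
    """Keep measurements that pass every filter that was supplied."""
    selected = []
    for m in measurements:
        if polygon is not None and not point_in_polygon(m.lon, m.lat, polygon):
            continue
        if parameter is not None and m.parameter != parameter:
            continue
        if source is not None and m.source != source:
            continue
        if device_type is not None and m.device_type != device_type:
            continue
        if start is not None and m.time < start:
            continue
        if end is not None and m.time > end:
            continue
        selected.append(m)
    return selected


# Made-up sample values, for illustration only.
custom_area = [(18.0, 56.0), (24.0, 56.0), (24.0, 60.0), (18.0, 60.0)]
sample = [
    Measurement(20.0, 58.0, datetime(2012, 10, 1), "temperature", "Argo", "float", 11.2),
    Measurement(20.5, 58.5, datetime(2012, 10, 2), "salinity", "Argo", "float", 7.1),
    Measurement(-70.0, 40.0, datetime(2012, 10, 1), "temperature", "NDBC", "buoy", 16.4),
    Measurement(21.0, 57.0, datetime(2012, 6, 1), "temperature", "NDBC", "buoy", 14.0),
]
selected = filter_measurements(
    sample,
    polygon=custom_area,
    parameter="temperature",
    start=datetime(2012, 9, 1),
    end=datetime(2012, 12, 31),
)
print(len(selected))
```

Each filter is optional, so leaving one out simply widens the selection, much as switching a filter off in a map interface would.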
After filtering the desired dataset, the user can switch to the table view, which lists all the selected devices and makes it possible to check any necessary details related to the measurements. The filtered dataset can be downloaded with one click in either CSV or NetCDF format.
In addition, Marinexplore can display satellite product overlays and in-situ devices simultaneously.
The data tool also includes graphs. Clicking on a device on the map opens a popup that contains data about the device and plots graphs for a number of parameters. These graphs enable easy pre-screening of the data a user is interested in, which helps to avoid unnecessary downloads.
Sternfeld states that the most powerful feature of Marinexplore is streamlined data aggregation. This allows a user to flexibly select a combination of oceanographic data sources to work with, creating unified datasets literally within seconds. Up until now, assembling a custom marine dataset was very time-consuming. For example, an aquafarming operator looking to expand to a new location needs a broad spectrum of data. Each fish species has specific ranges for the key oceanographic parameters that support its habitat, like temperature, salinity, and chlorophyll. With Marinexplore, these data can now be analyzed in a fraction of the time it used to take.
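The aquafarming example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform’s method: the tolerance ranges, the site observations, and the `site_suitability` helper are all hypothetical, and a real assessment would use species-specific ranges and far more data.

```python
from statistics import mean

# Hypothetical tolerance ranges for an unnamed species (not real biology).
SPECIES_RANGES = {
    "temperature": (8.0, 14.0),   # degrees Celsius
    "salinity": (30.0, 35.0),     # practical salinity units
    "chlorophyll": (0.5, 5.0),    # mg per cubic meter
}


def site_suitability(observations, ranges):
    """Return {parameter: (mean_value, within_range)} for each required parameter.

    A parameter with no observations is reported as (None, False).
    """
    report = {}
    for parameter, (low, high) in ranges.items():
        values = observations.get(parameter, [])
        if not values:
            report[parameter] = (None, False)
            continue
        average = mean(values)
        report[parameter] = (average, low <= average <= high)
    return report


# Made-up observations for a candidate site, for illustration only.
site = {
    "temperature": [9.5, 10.5, 11.0],
    "salinity": [28.0, 29.0],
    "chlorophyll": [1.0, 2.0],
}
report = site_suitability(site, SPECIES_RANGES)
suitable = all(within for _, within in report.values())
print(report)
print("suitable:", suitable)
```

In this made-up case the mean salinity falls below the assumed range, so the site would be flagged as unsuitable even though temperature and chlorophyll pass.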
Streamlined aggregation also means that if the polygon tool is used to select a specific area on the map while all data sources are switched on, the system can aggregate both in-situ and satellite sources.
Because data quality is one of the main concerns for people working with ocean data, Sternfeld says the company plans to introduce automatic validation rules and data quality checks that will be improved collaboratively with the ocean community relying on Marinexplore.
Bringing all ocean data sources together raises the question of access and data rights. Although a huge amount of oceanographic measurements are collected with the help of public funding, many researchers still resist making their datasets freely available because of ongoing research or copyright concerns. According to Sternfeld, Marinexplore has a solution for that: “We are working on a private data management layer, and the data owners will have full control over their rights and interests.”
Ocean Data Community
What sets Marinexplore apart from other ocean data tools is its community and co-creation features. As an ocean data collaboration platform, Marinexplore wants to bring together the whole ocean community.
This is an important component of the company’s charter, according to Sternfeld. “Today people working on a specific task usually collaborate with a couple of colleagues from around the world. For the benefit of all of us on this planet, and considering that 90% of the oceans are unexplored, we need to be much more open and collaborative with information and communications related to the oceans.”
Much like LinkedIn, Marinexplore creates a virtual community for people working on the oceans. Users can add work history, skills, and expertise. But Sternfeld says there is more to it than that.
“The public profile is the tip of the iceberg. When I say ‘collaboration’ I mean collaboration. People will be able to explore and discuss data privately or publicly, sharing results in real time and co-creating new content. It has happened elsewhere in the business software world. There’s no reason why it shouldn’t be so with ocean data and the ocean community.”
The main channel for the Marinexplore ocean community is the co-creation section of its website. Sternfeld believes co-creation will eventually be a good means for the ocean community to make discoveries and develop new models. “And it will also help to create context around the data the users are looking for. Current registered users include members of the oceanographic and marine technology community, environment, offshore industry, and more,” said Sternfeld.
A Look into the Future
“Marinexplore is a system integrator that creates tools to manage different types of data on top of existing modern analytical data computing platforms. This is how we can increase the productivity and quality of end applications. So we are not actually creating something completely new but just putting the right pieces together,” said Sternfeld.
The company is currently choosing pilot project partners for the API it is developing. “We are looking for ways to improve decision-making applications and to utilize other benefits that will be possible thanks to a well-organized spatial data repository,” said Sternfeld.
While many businesses are planning to shift their focus to ocean resources, tough competition and rising compliance requirements do not make conquering the new frontier any easier.
Sternfeld doesn’t want to go into detail about Marinexplore’s other future developments, but he’s convinced of one thing: “We’ve entered into an era of big data. The tools used for ocean data have not yet entered this era. Marinexplore is going to introduce this era to the ocean community.”
(As published in the November/December 2012 edition of Marine Technology Reporter - www.marinetechnologiesnews.com)