Marine Geospatial Ecology Tools: An integrated framework for ecological geoprocessing with ArcGIS, Python, R, MATLAB, and C++

https://doi.org/10.1016/j.envsoft.2010.03.029

Abstract

With the arrival of GPS, satellite remote sensing, and personal computers, the last two decades have witnessed rapid advances in the field of spatially-explicit marine ecological modeling. But with this innovation has come complexity. To keep up, ecologists must master multiple specialized software packages, such as ArcGIS for display and manipulation of geospatial data, R for statistical analysis, and MATLAB for matrix processing. This requires a costly investment of time and energy learning computer programming, a high hurdle for many ecologists. To provide easier access to advanced analytic methods, we developed Marine Geospatial Ecology Tools (MGET), an extensible collection of powerful, easy-to-use, open-source geoprocessing tools that ecologists can invoke from ArcGIS without resorting to computer programming. Internally, MGET integrates Python, R, MATLAB, and C++, bringing the power of these specialized platforms to tool developers without requiring developers to orchestrate the interoperability between them.

In this paper, we describe MGET’s software architecture and the tools in the collection. Next, we present an example application: a habitat model for Atlantic spotted dolphin (Stenella frontalis) that predicts dolphin presence using a statistical model fitted with oceanographic predictor variables. We conclude by discussing the lessons we learned engineering a highly integrated tool framework.

Introduction

The Duke University Marine Geospatial Ecology Laboratory specializes in the development of spatially-explicit ecological modeling techniques and the application of those methods in marine ecology studies and conservation projects. By publishing reusable and interoperable software tools in addition to journal articles, we hope to allow other researchers, managers, and conservation practitioners to repeat our analyses without reengineering them from scratch, to integrate them into larger scientific and management workflows, and ultimately to leverage them in an operational context. To maximize our pool of potential users, we target ecologists and analysts with moderate expertise in geographic information systems (GIS) and little experience in computer programming.

Developing tools for this community is hard. The community requires that tools be easy to install and operate, with graphical user interfaces (GUIs), ideally integrated with a GIS. For simple tools, such as a tool for building polygons for the regions traversed by drifting longline fishing gear (Dunn et al., 2008), we found we could satisfy these requirements by writing geoprocessing tools for ArcGIS using the Python programming language (ESRI, 2008; Python Software Foundation, 2008). ArcGIS is widely used: an ongoing survey of nearly 40,000 GIS professionals found that ArcGIS is the dominant GIS platform, with 78% of respondents reporting that they used ArcGIS or related ESRI products and only 27% reporting that they used the next most popular GIS product (GISJobs.com, 2008). Python is a modern, open-source language that has been integrated into ArcGIS. Developing Python-based geoprocessing tools for ArcGIS is straightforward: all tools share a common GUI provided by ArcGIS, and developers must implement only the geospatial analysis tasks performed by the tool.

For more complicated tools, we found that ArcGIS and Python did not provide all of the analytic functionality we needed. For predictive modeling tools, such as a system for predicting marine animal habitats from oceanographic conditions (Best et al., 2007), or a tool for predicting hard bottom substrate from coarse-resolution bathymetry (Dunn and Halpin, 2009), we needed multivariate statistical modeling functions. The platform we preferred for this was R, a popular statistics programming language (R Development Core Team, 2008). For math-intensive modeling tools, such as a hydrodynamic simulation of the dispersal of larvae between coral reefs (Treml et al., 2008), we preferred MATLAB, a popular programming platform for numeric processing (MathWorks, 2008). Unfortunately, R and MATLAB proved unsuitable as user interfaces for our target community because they both require programming expertise to operate effectively.

To provide our target community with an acceptable user interface and our developers with sufficient analytic functionality, we concluded that we must integrate ArcGIS and Python with R and MATLAB. To avoid reengineering this integration on a tool-by-tool basis, we decided to rewrite all of our tools under a common software framework and release them as a unified collection called Marine Geospatial Ecology Tools (MGET).

Besides providing engineering benefits, this approach also improved our tool development process. Our development process essentially follows Argent’s (2004) four-level process for developing environmental models, in which a model (or tool) is first developed by a researcher for a specific problem, then generalized and tested for a range of similar problems, then reengineered, repackaged, and documented for widespread operational use, and finally adopted by planners and policy makers as a reliable “black box”. Most of our tools start out as rough prototypes developed for specific research projects, but when they have high potential utility, our goal is to take them to at least the third level of Argent’s process. To successfully transition to the third level, in which a tool is of suitable quality to be used operationally, we have found it is almost always necessary to throw away the prototype and rewrite the tool from scratch. By developing a unified framework and release vehicle for all of our tools, we reduced both the temptation to release low-quality prototypes and the effort required to rewrite them as high-quality members of a consistent collection.

Software architecture of MGET

Although the ArcGIS/Python/R/MATLAB integration was a key requirement for MGET, it was not the only one. Here, we enumerate the other important requirements, present the architecture of MGET tools, and describe MGET’s code-generation functionality, a key component of the architecture that facilitates the integration.
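
As a rough illustration of the code-generation idea, the Python sketch below introspects a tool function and emits wrapper source that converts string arguments, as a GUI host might pass them, into typed values before calling the function. It is not MGET's actual generator: `buffer_distance`, `generate_wrapper`, and the convention of passing all parameters as a list of strings are hypothetical and assumed only for this example.

```python
import inspect


def buffer_distance(depth_m: float, factor: float = 2.0, label: str = "reef") -> str:
    """Hypothetical analysis function standing in for a real tool."""
    return f"{label}:{depth_m * factor:.1f}"


def generate_wrapper(func):
    """Return Python source for a wrapper that takes a list of strings,
    converts each one to the type declared in the function's annotations,
    and calls the function. Optional parameters fall back to their defaults
    when the argument is missing or empty."""
    sig = inspect.signature(func)
    lines = [f"def {func.__name__}_wrapper(argv):"]
    call_args = []
    for i, (name, param) in enumerate(sig.parameters.items()):
        type_name = param.annotation.__name__  # assumes simple built-in types
        if param.default is inspect.Parameter.empty:
            lines.append(f"    {name} = {type_name}(argv[{i}])")
        else:
            lines.append(
                f"    {name} = {type_name}(argv[{i}]) "
                f"if len(argv) > {i} and argv[{i}] != '' else {param.default!r}"
            )
        call_args.append(f"{name}={name}")
    lines.append(f"    return {func.__name__}({', '.join(call_args)})")
    return "\n".join(lines)


# Generate the wrapper source, compile it, and call it with string inputs.
source = generate_wrapper(buffer_distance)
namespace = {"buffer_distance": buffer_distance}
exec(source, namespace)
wrapper = namespace["buffer_distance_wrapper"]
result = wrapper(["10.5", "3"])
```

The point of the sketch is that the developer writes only the typed analysis function; the repetitive argument-conversion code is derived from its signature rather than written by hand for each tool.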

Tools in the MGET collection

At the time of this writing, MGET included over 180 tools, grouped into seven categories: Conversion, Data Management, Spatial Analysis, Oceanographic Analysis, Connectivity Analysis, Statistics, and Data Products. In this section, we briefly summarize each category and highlight some of the tools they contain.

The Conversion tools convert geospatial data from one format to another, allowing ArcGIS users to convert oceanographic data from popular formats such as HDF, NetCDF, and binary flat

Example application: predictive habitat modeling using ArcGIS

The introduction of advanced GIS software and statistical modeling techniques has spawned a burgeoning assortment of spatially-explicit statistical approaches to ecology. One such approach is predictive habitat modeling, in which the investigator attempts to relate spatiotemporal observations of a species to environmental conditions using statistics or other quantitative techniques and then predicts the distribution of the species across a region, timeframe, and range of environmental
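
The general workflow of predictive habitat modeling (fit a statistical model to presence/absence observations, then predict over environmental conditions) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the model used in the paper: it uses a single-predictor logistic regression as a simple stand-in, and the sea surface temperature values, presence records, and function names are synthetic or hypothetical, invented purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_habitat_model(sst, presence, lr=0.1, epochs=5000):
    """Fit p(presence) = sigmoid(b0 + b1 * (sst - mean_sst)) by batch
    gradient ascent on the log-likelihood. Returns (b0, b1, mean_sst)."""
    mean_sst = sum(sst) / len(sst)
    xs = [s - mean_sst for s in sst]  # center the predictor for stability
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, presence):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1, mean_sst


def predict(sst_value, model):
    """Predicted probability of presence at a given temperature."""
    b0, b1, mean_sst = model
    return sigmoid(b0 + b1 * (sst_value - mean_sst))


# Synthetic observations (not real survey data): SST in degrees C and
# whether the species was recorded (1) or not (0) at each location.
sst_obs = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
presence_obs = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

model = fit_habitat_model(sst_obs, presence_obs)

# Predict over a few hypothetical grid cells with different temperatures.
grid_sst = [19.0, 23.5, 28.0]
grid_prob = [predict(s, model) for s in grid_sst]
```

In a real application the predictions would be written back to a raster covering the study region, and the model would typically include several oceanographic predictors rather than one.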

Lessons learned

The principal challenge we faced in creating MGET was an engineering problem, not a science problem: how do we build a suite of geospatial ecology tools that are accessible to non-programmers, that are modular and highly interoperable, and that are cheap to build and deploy? Our solution was to integrate a number of different programming platforms and software packages, both commercial and free, into a unified framework, and develop our tools on top of the framework. Here, we discuss some of

Conclusion

In this paper, we presented Marine Geospatial Ecology Tools (MGET), a collection of modular software tools designed for ecologists who are familiar with GIS but have little experience with computer programming. By integrating ArcGIS, Python, R, MATLAB, and C++ through interoperability modules and code generation, MGET allows tool developers to easily access the power of several popular scientific programming platforms without having to write tedious integration code. At the time of this

Acknowledgements

We thank Dave Ullman and Stephanie Hansen for providing example code for detecting SST fronts and geostrophic eddies. Several MGET tools are based on those examples. We thank Michelle Sims for providing example code and suggestions for improving several statistical tools. We thank four anonymous reviewers for providing suggestions that improved this manuscript. The development of MGET was made possible by a grant from the David and Lucile Packard Foundation, with additional support from NASA.

References (40)

  • K.S. Casey et al. Global AVHRR 4 km SST for 1985–2005, Pathfinder v5.0, NODC/RSMAS (2008)
  • J.-F. Cayula et al. Edge detection algorithm for SST images. Journal of Atmospheric and Oceanic Technology (1992)
  • D.C. Dunn et al. Filling a marine spatial planning data gap: rugosity as a mesoscale proxy for hard-bottom habitat. Marine Ecology Progress Series (2009)
  • ESRI. ArcGIS – A Complete Integrated System (2008)
  • G.C. Feldman et al. Ocean Color Web, SeaWiFS Reprocessing 5.2
  • GISJobs.com. Salary Survey (2008)
  • P.N. Halpin et al. OBIS–SEAMAP: developing a biogeographic research data commons for the ecological studies of marine mammals, seabirds, and sea turtles. Marine Ecology Progress Series (2006)
  • M. Hammond et al. Python for Windows Extensions (2008)
  • T.J. Hastie et al. Generalized Additive Models (1990)
  • P. Hollemans. CoastWatch Software Library and Utilities v3.2.2 (2008)