The goal of this document is to help managers and researchers at Great Smoky Mountains National Park (GRSM) model species distributions using maximum entropy (maxent) methods. It serves as a reference for the Maxent software (Phillips and Dudik 2008), a widely used program for modeling species distributions.
The sections that follow provide help for getting the software, preparing the data, and running the model.
A brief "Motivation and Background" section discusses the rationale for using maxent in GRSM.
Getting the Software
There are many different software packages that can optimize data using maximum entropy methods. In this document, however, we focus on the most common software package for biologists (Maxent).
Software Download
The software is available for download at http://www.cs.princeton.edu/~schapire/maxent/.
To see if Java is installed: (Figure 1)
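As a minimal sketch, and assuming Python is available on the machine, the Java check can also be done from a script. The function name java_version is hypothetical; the only external command used is the standard `java -version`.

```python
import shutil
import subprocess


def java_version():
    """Return the first line of `java -version`, or None if Java is not on the PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` normally writes its report to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    report = (result.stderr or result.stdout).strip()
    return report.splitlines()[0] if report else None


if __name__ == "__main__":
    version = java_version()
    print(version if version else "Java was not found on the PATH.")
```

If the function returns None, Java is either not installed or not on the PATH, and maxent.jar will not start.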
Installing / Running Maxent Java Application (Graphical User Interface)
The main files to consider once the maxent files are downloaded from the website are maxent.jar and maxent.bat.
Preparing the Data
Maxent requires precise formatting of the species occurrence data and the environmental data. Further, the spatial attributes of all data must be identical. This section is meant to guide users through the preliminary decisions about species and environments that must be made, and then to help users convert their data into formats appropriate for analysis in maxent.
Some decisions made up front will alter how every other part of the analysis proceeds. The species and environmental layers selected must conform to certain geographic requirements, and the spatial attributes of all these layers must be defined.
Maxent can build models for multiple species at one time. The species to be modelled must have geolocated occurrences. It is advantageous if the precision of these geolocations is also known, because environmental maps can be adjusted to match that precision. If any temporally sensitive environmental data are included (e.g. temperature for a particular year, or fire history), then the species observation dates must coincide with dates for which the environmental data are valid.
Choose Environmental Variables
The predictions of any model will be improved if the selected environmental layers reflect the ecology of the organism. These associations may not be known beforehand for many species, however. Including every remotely sensed variable available is another option, and maxent provides estimates of the importance of each environmental variable included in the model (5.2). Maxent also provides a tuning parameter that adjusts the degree of over-fitting (4). So, the kitchen-sink approach to variable inclusion works better in maxent than in other approaches. At a bare minimum, species respond broadly to gradients of temperature and moisture. Three variables that approximate these gradients in GRSM are elevation, topographic convergence index, and hillshade (Jobe 2006).
Choose a Projection
You must choose a projection that matches precisely among all data types, including the datum. Data layers for GRSM are typically projected as Universal Transverse Mercator (UTM) zone 17 with either the NAD27 or the WGS84 datum. WGS84 is preferred, but the choice of datum and projection does not matter as long as the occurrence data and all the environmental layers are exactly the same. Projecting digital elevation models (DEMs) is not recommended if any other environmental layer is derived from them (e.g. slope, hillshade, hydrological models), because the resampling required for projection introduces striations in the derived layers. It is best practice to project all other layers to match the projection of the DEM. Alternatively, derive layers from the DEM in its original projection, then reproject all the grids. In ArcGIS you can use ArcToolbox to project both rasters and features. To project all layers to a common projection, use the batch project option.
At the end you should have a new set of environmental layers, all sharing the same projection.
Prepare a Workspace
It is simpler to create one folder for a given analysis. Here, we term this the workspace. The files maxent.bat and maxent.jar should be copied into this workspace. Also, two sub-folders should be created in the workspace: grid, which will hold the prepared ArcGrid binary environmental layers, and ascii, which will hold the prepared ESRI ASCII environmental layers.
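As a minimal sketch, assuming Python is available, the workspace layout described above could be created as follows. The function name prepare_workspace and the example path are hypothetical; only the sub-folder names grid and ascii and the files maxent.jar and maxent.bat come from this document.

```python
from pathlib import Path


def prepare_workspace(root):
    """Create the workspace with its grid and ascii sub-folders.

    Returns the names of the Maxent files that still need to be copied in.
    """
    root = Path(root)
    for name in ("grid", "ascii"):
        (root / name).mkdir(parents=True, exist_ok=True)
    # The folders are created here; the two Maxent files must be copied by hand.
    return [f for f in ("maxent.jar", "maxent.bat") if not (root / f).is_file()]


if __name__ == "__main__":
    # Hypothetical workspace location; substitute your own.
    print(prepare_workspace("maxent_workspace"))
```

An empty list from prepare_workspace means both Maxent files are already in place.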
Prepare the Environmental Layers
The environmental layers set the geographic extent of the analysis window in the maxent software. So, it is best to prepare these layers before the species occurrence data, because some of the occurrences may lie outside this window and will have to be pared accordingly (3.4).
Maxent expects environmental data to be in ESRI ASCII grid format (AAIGrid). These grids can contain either continuous, or categorical data. If the grid is categorical, each category must be coded as an integer value. Environmental layers must share the same extent, the same grain, and the same mask (i.e. NODATA cells). In short, each layer must be identical except for the values contained in the data cells.
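The requirement that layers share extent, grain, and mask can be checked before running maxent. The sketch below is one possible check, not part of the Maxent software: it reads each ESRI ASCII grid as plain text, compares the header entries (ncols, nrows, the lower-left corner, cellsize), and compares the positions of the NODATA cells. The function names are hypothetical, and reading whole grids in plain Python will be slow for large layers.

```python
def read_grid(path):
    """Return (header, values) for an ESRI ASCII grid; header keys are lower-cased."""
    with open(path) as fh:
        tokens = fh.read().split()
    header, i = {}, 0
    # Header entries are keyword/value pairs; data begin at the first numeric token.
    while i < len(tokens) and tokens[i][0].isalpha():
        header[tokens[i].lower()] = float(tokens[i + 1])
        i += 2
    return header, [float(t) for t in tokens[i:]]


def layer_signature(path):
    """Summarise a layer as (geometry, mask) so that layers can be compared."""
    header, values = read_grid(path)
    # -9999 is the conventional default when the header omits NODATA_value.
    nodata = header.get("nodata_value", -9999.0)
    geometry = tuple(sorted((k, v) for k, v in header.items() if k != "nodata_value"))
    mask = tuple(v == nodata for v in values)
    return geometry, mask


def mismatched_layers(paths):
    """Return the paths whose geometry or mask differs from the first layer."""
    reference = layer_signature(paths[0])
    return [p for p in paths[1:] if layer_signature(p) != reference]
```

An empty list from mismatched_layers suggests the layers are identical except for the values in the data cells.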
The name of each environmental layer should be fewer than 13 characters long. Optionally, the names of categorical layers should begin with a prefix (e.g. c_). If maxent is ever run from the command line, these layers can be switched from continuous (the default) to categorical based on their prefix using the command option togglelayertype.
There are many ways to ensure that the environmental layers have matching spatial attributes, but here I present a method that uses the Spatial Analyst toolbar in ArcMap. I assume that the environmental layers are already in the standard ArcInfo binary grid format and that they have the same projection (3.1.3). If some environmental layers are stored as polygon shapefiles, then they must be converted to ArcInfo binary grids from: Spatial Analyst > Convert > Features to Raster... (details for starting Spatial Analyst are given below). The cell size for the output grid may be determined beforehand, or should be taken to be the largest cell size of the environmental layers already stored as grids.
     Input raster    Output ASCII raster file
1    grid1           Path to workspace\ascii\grid1
2    grid2           Path to workspace\ascii\grid2
...
After following these steps, the ascii folder in the analysis workspace will have all of the grids necessary for analysis in maxent. ArcMap should not be closed at this point, however, because the binary grids will still need to be used.
Prepare the Species Occurrence Data
Here, I assume that all species occurrence data have been projected to match the environmental layers (3.1.3), that the data exist as a point shapefile, and that one field of the shapefile contains the species name.
Spatial Analyst > Convert > Raster to Features
ArcToolbox > Data Management Tools > Add XY
The end result of creating the species occurrence data should be a comma-separated values (csv) file, pntOccurrences.csv, with three fields (no header row): species, x, & y. This is the file that will be input to maxent.
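For readers who prefer to script this step, the sketch below writes a three-field file in the layout described above (species, x, y, no header row) and pares occurrences that fall outside the analysis window. It is an illustration under stated assumptions: the function name write_occurrences is hypothetical, the records are assumed to be already projected, and the extent is assumed to be taken from the environmental layers.

```python
import csv


def write_occurrences(records, extent, out_path):
    """Write species,x,y rows for the records that fall inside the extent.

    records: iterable of (species, x, y) tuples in the layers' projection.
    extent:  (xmin, ymin, xmax, ymax) of the environmental layers.
    Returns the number of occurrences kept.
    """
    xmin, ymin, xmax, ymax = extent
    kept = 0
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for species, x, y in records:
            # Drop occurrences that lie outside the analysis window.
            if xmin <= x < xmax and ymin <= y < ymax:
                writer.writerow([species, x, y])
                kept += 1
    return kept
```

Comparing the returned count with the number of input records shows how many occurrences were pared.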
An output folder must be created in the workspace to hold the results from the Maxent model (an easy folder name is output).
Optionally, you may also generate a samples-with-data (SWD) file for the species and the environment. Details of this format are given in the maxent tutorial, but basically it saves model run time if the environmental data at the sample points are added to the species occurrence file. Maxent optimizes the relationship between occurrences and environment using a sample of 10,000 random points. You can skip this step in the Maxent model run by doing it yourself in ArcGIS. The procedure for generating SWD files for the observations and the environmental data is this:
Add XY coordinates as in 3.4
The end result of these steps will be two files, species.csv and environ.csv. These can be loaded as the species and environmental files, respectively, in the Maxent GUI or specified at the command line (4). Maxent will still need to use the contents of the ASCII folder for generating prediction layers if that option is selected.
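The idea behind an SWD file, attaching the environmental value at each point to the point record, can also be sketched outside ArcGIS. The code below is an illustration only: the function names are hypothetical, the grids are assumed to use xllcorner/yllcorner headers, and the column layout (species, x, y, then one column per layer, with a header row) is an assumption that should be checked against the maxent tutorial before use.

```python
import csv
import math


def read_ascii_grid(path):
    """Read an ESRI ASCII grid; return (header dict, list of rows of floats)."""
    with open(path) as fh:
        tokens = fh.read().split()
    header, i = {}, 0
    # Header entries are keyword/value pairs; data begin at the first numeric token.
    while i < len(tokens) and tokens[i][0].isalpha():
        header[tokens[i].lower()] = float(tokens[i + 1])
        i += 2
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    values = [float(t) for t in tokens[i:i + ncols * nrows]]
    return header, [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]


def sample(header, rows, x, y):
    """Return the cell value at (x, y), or None if outside the grid or NODATA."""
    size = header["cellsize"]
    col = math.floor((x - header["xllcorner"]) / size)
    # The first row in the file is the northern edge, so count down from the top.
    row = int(header["nrows"]) - 1 - math.floor((y - header["yllcorner"]) / size)
    if not (0 <= row < len(rows) and 0 <= col < len(rows[0])):
        return None
    value = rows[row][col]
    return None if value == header.get("nodata_value", -9999.0) else value


def write_swd(records, layers, out_path):
    """records: (species, x, y) tuples; layers: dict of layer name -> grid path."""
    grids = {name: read_ascii_grid(path) for name, path in layers.items()}
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["species", "x", "y"] + list(grids))  # assumed header row
        for species, x, y in records:
            values = [sample(h, r, x, y) for h, r in grids.values()]
            if None not in values:  # keep only points with data in every layer
                writer.writerow([species, x, y] + values)
```

The same write_swd function could be applied to a set of random background points to produce the environmental file.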
Running the Model