# Generalized Additive Models: An Introduction Wi... Fixed

Spatially explicit ecosystem models of all types require an initial allocation of biomass, often in areas where fisheries-independent abundance estimates do not exist. A generalized additive modelling (GAM) approach is used to describe the abundance of 40 species groups (i.e., functional groups) across the Gulf of Mexico (GoM) using a large fisheries-independent data set (SEAMAP) and climate-scale oceanographic conditions. Predictor variables included in the model are chlorophyll a, sediment type, dissolved oxygen, temperature, and depth. Despite the presence of a large number of zeros in the data, a single GAM using a negative binomial distribution was suitable for predicting abundance for multiple functional groups. We present an example case study using pink shrimp (Farfantepenaeus duorarum) and compare the results to known distributions. The model successfully predicts the known areas of high abundance in the GoM, including areas for which no data were used in model fitting. Overall, the model reliably captures areas of high and low abundance for the large majority of functional groups observed in SEAMAP. This method allows the objective setting of spatial distributions for numerous functional groups across a modeling domain, even where abundance data may not exist.


Fisheries-independent sampling efforts typically result in many zero observations for any given species, particularly in surveys that cover a broad range of habitats or depths. A number of approaches have been developed to fit such data, including lognormal delta distributions [12], [13], the delta method approximation of variance [14], and zero-inflated distributions [15], [16]. All of these methods can be applied to either generalized linear models (GLMs) or generalized additive models (GAMs); the latter allow greater flexibility in model fitting. Despite these advanced methods for dealing with zero inflation, Warton (2005) [17] found that in most cases a negative binomial distribution was sufficient to model data with many zeros. In this paper we use a negative binomial GAM, based on environmental and habitat predictors, to predict the relative abundance of functional groups across shelf areas of the entire Gulf of Mexico (GoM), including Mexican and Cuban waters and areas where fisheries-independent surveys do not exist.
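To see why a negative binomial can absorb many zeros without a separate zero-inflation component, it helps to compare the probability of a zero count under a Poisson and under a negative binomial with the same mean. The sketch below is only an illustration: the mean count and the size (dispersion) values are hypothetical and are not taken from SEAMAP or from the model described here.

```python
import math


def poisson_zero_prob(mean):
    """P(Y = 0) for a Poisson count with the given mean."""
    return math.exp(-mean)


def negbin_zero_prob(mean, size):
    """P(Y = 0) for a negative binomial with the given mean and size parameter.

    In this parameterization the variance is mean + mean**2 / size,
    so a small size means strong overdispersion.
    """
    return (size / (size + mean)) ** size


if __name__ == "__main__":
    mean_catch = 2.0  # hypothetical mean count per tow (assumption)
    print(f"Poisson             P(0) = {poisson_zero_prob(mean_catch):.3f}")
    for size in (0.2, 1.0, 5.0):  # hypothetical dispersion values (assumption)
        p0 = negbin_zero_prob(mean_catch, size)
        print(f"NegBin (size={size:>3}) P(0) = {p0:.3f}")
```

With the mean held fixed, lowering the size parameter raises the expected share of zeros well above the Poisson value, which is consistent with the observation that a negative binomial is often sufficient for zero-heavy survey counts.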

The exponential family probability function upon which GLMs are based can be expressed as

$$\label{eq1}f(y_i;\theta_i,\phi)=\exp\left\{\frac{y_i\theta_i-b(\theta_i)}{\alpha_i(\phi)}-c(y_i;\phi)\right\}\tag{1}$$

where the distribution is a function of the unknown data, $y$, for given parameters $\theta$ and $\phi$. For generalized linear models, the probability distribution is re-parameterized such that the distribution is a function of unknown parameters based on known data. In this form the distribution is termed a likelihood function, the goal of which is to determine the parameters making the data most likely. Statisticians log-transform the likelihood function in order to convert it from a multiplicative to an additive scale. Doing so greatly facilitates estimation based on the function. The log-likelihood function is central to all maximum likelihood estimation algorithms. It is also the basis of the deviance function, which was traditionally employed in GLM algorithms both as the basis of convergence and as a goodness-of-fit statistic. The log-likelihood is defined as

$$L(\theta_i; y_i,\phi)=\sum_{i=1}^{n}\left\{\frac{y_i\theta_i-b(\theta_i)}{\alpha_i(\phi)}-c(y_i;\phi)\right\}\tag{2}$$

where $\theta$ is the canonical parameter, or link function, $b(\theta)$ the cumulant, $\alpha_i(\phi)$ the scale, and $c(y;\phi)$ the normalization term, guaranteeing that the distribution sums to one. The first derivative of the cumulant with respect to $\theta$, $b'(\theta)$, is the mean of the function, $\mu$; the second derivative, $b''(\theta)$, is the variance function, $V(\mu)$. The deviance function is given as

$$2\sum_{i=1}^{n}\big\{L(y_i;y_i)-L(y_i;\mu_i)\big\}\tag{3}$$
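As a concrete check of the exponential family form, the following sketch works through the Poisson case, where $\theta=\log\mu$, $b(\theta)=e^{\theta}$, $\alpha(\phi)=1$, and the normalization term is $\log(y!)$, subtracted as in the expression above. The Poisson is chosen here only for illustration because its terms are simple; the function names are hypothetical, and the numerical values are made up.

```python
import math


def poisson_pmf(y, mu):
    """Textbook Poisson probability mass function."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def expfam_poisson(y, mu):
    """Poisson probability written in exponential family form."""
    theta = math.log(mu)      # canonical parameter (log link)
    b = math.exp(theta)       # cumulant; b'(theta) = b''(theta) = mu
    alpha = 1.0               # scale is 1 for the Poisson
    c = math.lgamma(y + 1)    # normalization term, log(y!)
    return math.exp((y * theta - b) / alpha - c)


def poisson_loglik(y, mu):
    """Log-likelihood of one observation; y * log(mu) is taken as 0 when y = 0."""
    term = y * math.log(mu) if y > 0 else 0.0
    return term - mu - math.lgamma(y + 1)


def poisson_deviance(ys, mus):
    """Deviance: 2 * sum of L(y; y) - L(y; mu), with the saturated model mu = y."""
    return 2.0 * sum(
        poisson_loglik(y, y) - poisson_loglik(y, mu) for y, mu in zip(ys, mus)
    )


if __name__ == "__main__":
    counts = [0, 1, 3]          # made-up observations
    fitted = [0.5, 1.5, 2.0]    # made-up fitted means
    print(expfam_poisson(3, 2.0), poisson_pmf(3, 2.0))
    print(poisson_deviance(counts, fitted))
```

The two probability functions should agree up to floating-point error, and the deviance should match the familiar Poisson closed form $2\sum_i\{y_i\log(y_i/\mu_i)-(y_i-\mu_i)\}$, with the logarithmic term taken as zero when $y_i=0$.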

Adam C. Smith, Brandon P. M. Edwards. North American Breeding Bird Survey status and trend estimates to inform a wide range of conservation needs, using a flexible Bayesian hierarchical generalized additive model. Ornithological Applications, Volume 123, Issue 1, 1 February 2021, duaa065.