In the Bayesian paradigm, all unknown quantities in the model are treated as random variables, and the aim is to compute (or estimate) the joint posterior distribution, that is, the distribution of the parameters \(\theta\) conditional on the observed data \(\textbf{y}\). The posterior distribution is obtained through Bayes’ theorem:
\[\begin{equation} \pi(\theta|\textbf{y}) = \frac{ \pi(\textbf{y}|\theta)\pi(\theta) }{\pi(\textbf{y}) } \end{equation}\]
where \(\pi(\textbf{y}|\theta)\) is the likelihood of the data \(\textbf{y}\) given the parameters \(\theta\), \(\pi(\theta)\) is the prior distribution of the parameters, and \(\pi(\textbf{y})\) is the marginal likelihood, which acts as a normalizing constant (Gómez-Rubio, 2021).
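To make the roles of the three terms concrete, here is a minimal sketch that evaluates Bayes’ theorem on a grid for a single Poisson rate. The toy counts `y` and the Gamma(2, 0.1) prior are assumptions chosen only for illustration; they are not taken from the workshop data. Because the Gamma prior is conjugate to the Poisson likelihood, the grid result can be compared with the known closed-form posterior.

```r
# Toy data (assumed for illustration): death counts for one district
y <- c(14, 9, 12, 17, 11)

# Grid of candidate values for the Poisson rate theta
theta <- seq(0.01, 40, by = 0.01)
step <- theta[2] - theta[1]

# Prior pi(theta): Gamma(shape = 2, rate = 0.1), an arbitrary choice
prior <- dgamma(theta, shape = 2, rate = 0.1)

# Likelihood pi(y | theta): product of Poisson densities over the observations
likelihood <- sapply(theta, function(t) prod(dpois(y, lambda = t)))

# Marginal likelihood pi(y): numerical integral of likelihood * prior over the grid
marginal <- sum(likelihood * prior) * step

# Posterior density from Bayes' theorem
posterior <- likelihood * prior / marginal

# Conjugacy gives the exact posterior: Gamma(2 + sum(y), 0.1 + length(y))
exact <- dgamma(theta, shape = 2 + sum(y), rate = 0.1 + length(y))

# Largest pointwise difference between grid and exact posterior (should be tiny)
max(abs(posterior - exact))
```

A grid like this only works for one or two parameters; with many parameters the integral in the denominator becomes intractable, which is the difficulty that MCMC and INLA address.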
The Integrated Nested Laplace Approximation (INLA) is a relatively recent method for fitting Bayesian models. INLA aims to avoid the computational cost of Markov chain Monte Carlo (MCMC) in data-intensive problems or complex models by approximating the posterior marginals directly instead of sampling from them. In many applications, sampling from the posterior distribution with MCMC takes too long and is often not even feasible with the available computational resources.
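The building block behind the name is the Laplace approximation, which replaces a posterior density with a Gaussian centred at its mode. The sketch below applies the plain, single-parameter version to a toy Poisson model in base R. It is not the INLA algorithm itself, which nests such approximations inside a latent Gaussian model; the counts `y`, the Gamma(2, 0.1) prior, and the names `log_post`, `mode_hat`, and `sd_hat` are all assumptions made for illustration.

```r
# Toy counts (assumed for illustration)
y <- c(14, 9, 12, 17, 11)

# Unnormalised log-posterior of the Poisson rate under a Gamma(2, 0.1) prior
log_post <- function(theta) {
  sum(dpois(y, lambda = theta, log = TRUE)) +
    dgamma(theta, shape = 2, rate = 0.1, log = TRUE)
}

# Step 1: locate the posterior mode by one-dimensional optimisation
fit <- optimize(log_post, interval = c(0.01, 100), maximum = TRUE)
mode_hat <- fit$maximum

# Step 2: curvature of the log-posterior at the mode (central finite difference)
h <- 1e-3
curv <- (log_post(mode_hat + h) - 2 * log_post(mode_hat) + log_post(mode_hat - h)) / h^2

# Step 3: Gaussian approximation with mean mode_hat and variance -1 / curv
sd_hat <- sqrt(-1 / curv)

c(mode = mode_hat, sd = sd_hat)
```

No sampling is involved: one optimisation and one curvature evaluation give an approximate posterior, which is why approaches built on this idea can be much faster than MCMC.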
The slides of the “SPATIAL PREDICTION MODELS IN R” lecture at UCSD-GPS Fall 2021 can be found here
First, we install and load all the packages needed for this workshop. Here is a reference for installing the INLA package.
# install.packages("kableExtra")
# install.packages("tidyverse")
# install.packages("yardstick")
# install.packages("gt")
# install.packages("spdep")
# install.packages("viridis")
# install.packages("INLA",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE)
library(kableExtra)
library(tidyverse)
library(yardstick)
library(gt)
library(spdep)
library(viridis)
library(INLA)
For this workshop we will use monthly mortality data from Lima, Peru (2018-2019). We download the data directly from the GitHub repository. The data dictionary is shown below the table.
db <- readRDS(url("https://github.com/healthinnovation/Inla_intro/raw/main/db_excess_proc_dis_1819_m.rds"))
reg | prov | distr | year | month | n | week | date | temperature | precipitation | pp.insured | pp.edu.under25 | pp.pover | pp.no.elec | pp.no.water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
LIMA | LIMA | SANTA ROSA | 2019 | 11 | 14 | 44 | 2019-11-04 | 29.05179 | 0.0628037 | 0.7250182 | 0.1420677 | 0.1330481 | 0.0308246 | 0.103765 |
LIMA | LIMA | SURQUILLO | 2019 | 04 | 82 | 13 | 2019-04-01 | 29.43726 | 0.4175346 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | PUNTA NEGRA | 2018 | 01 | 0 | 1 | 2018-01-08 | 29.27019 | 0.5562810 | 0.7250182 | 0.1420677 | 0.1330481 | 0.0308246 | 0.103765 |
LIMA | LIMA | CHACLACAYO | 2019 | 11 | 30 | 44 | 2019-11-04 | 29.05179 | 0.0628037 | 0.7250182 | 0.1420677 | 0.4330481 | 0.0308246 | 0.103765 |
LIMA | LIMA | CHACLACAYO | 2019 | 01 | 42 | 1 | 2019-01-07 | 29.26792 | 0.2194335 | 0.7250182 | 0.1420677 | 0.4330481 | 0.0308246 | 0.103765 |
LIMA | LIMA | CARABAYLLO | 2019 | 07 | 194 | 26 | 2019-07-01 | 29.06252 | 0.0398653 | 0.3250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | SAN BORJA | 2018 | 12 | 104 | 48 | 2018-12-03 | 29.15187 | 0.1387612 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | LIMA | 2018 | 09 | 316 | 35 | 2018-09-03 | 29.00803 | 0.0264610 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | LIMA | 2019 | 12 | 454 | 48 | 2019-12-02 | 29.15607 | 0.1205020 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | SURQUILLO | 2018 | 10 | 116 | 39 | 2018-10-01 | 29.03968 | 0.0989436 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | BARRANCO | 2018 | 05 | 22 | 18 | 2018-05-07 | 29.31195 | 0.2102735 | 0.7250182 | 0.1420677 | 0.4330481 | 0.0308246 | 0.103765 |
LIMA | LIMA | RIMAC | 2018 | 12 | 180 | 48 | 2018-12-03 | 29.15187 | 0.1387612 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | PUNTA HERMOSA | 2018 | 07 | 0 | 26 | 2018-07-02 | 29.01577 | 0.0666329 | 0.7250182 | 0.1420677 | 0.1330481 | 0.0308246 | 0.103765 |
LIMA | LIMA | SAN BORJA | 2019 | 08 | 80 | 31 | 2019-08-05 | 28.98726 | 0.0000000 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
LIMA | LIMA | LA MOLINA | 2019 | 07 | 116 | 26 | 2019-07-01 | 29.06252 | 0.0398653 | 0.7250182 | 0.1420677 | 0.4330481 | 0.7308246 | 0.803765 |
Variable name | Description |
---|---|
reg | region |
prov | province |
distr | district |
year | year of register |
month | month of register |
week | week of register |
n | number of deaths |
temperature | monthly temperature |
precipitation | monthly precipitation |
pp.pover | poverty indicator |
pp.edu.under25 | proportion of people under 25 with a low level of education |
pp.insured | proportion of insured population |
pp.no.elec | proportion of people without access to basic electricity service |
pp.no.water | proportion of people without access to basic water service |
Next, we carry out a temporal descriptive analysis using the ggplot2 package.
# Total deaths per date across all districts, plotted as a time series
db %>%
  group_by(date) %>%
  summarise(n = sum(n)) %>%
  ggplot(aes(x = date, y = n)) +
  geom_line(color = "red") +
  geom_point(color = "red", shape = 21) +
  labs(y = "Deaths count") +
  theme_bw(base_size = 15)