In this vignette, the prioriactions
package is introduced in a real context, demonstrating part of its capabilities in order to familiarize the reader with it. The vignette is divided into three parts: the first shows a base case; which consists of prioritizing management actions while minimizing costs and, in turn, achieves certain recovery targets; the second part incorporates other curves in the calculation of benefits, while the third adds spatial requirements using the blm and blm_actions parameters.
The Mitchell River is a river located in Northern Queensland, Australia. The headwaters of the Mitchell River are in the Atherton Tableland about 50 kilometres (31 mi) northwest of Cairns, and flows about 750 kilometres (470 mi) northwest across Cape York Peninsula from Mareeba to the Gulf of Carpentaria. We will use this case study to present some functionalities of the prioriactions
package.
We started loading libraries:
# load packages
library(prioriactions)
library(raster) #To plot of shapefiles
library(tmap) #To create cool maps
library(scales) #To standardize the value of amount
library(reshape2) #To use the melt function
library(sp) #To use the spplot function
library(viridis) #To use viridis pallete
We divided the whole catchment (71,630 \(km^2\)) into 2316 sites (i.e., sub-catchments), each one included the portion of river length between two consecutive river connections and the surrounding land draining into this river stretch. We sourced the distribution of 45 fish species in the Mitchell river catchment as our conservation features [@cattarino2015]. Also, we considered four major threats to freshwater fish species in the catchment: water buffalo (Bubalis bubalis), cane toad (Bufo marinus), river flow alteration (caused by infrastructure for water extractions and levee banks) and grazing land use . All input files will be loaded directly into the prioriactions
package as follows:
path <- system.file("extdata/mitchell_vignette_data/",
package = "prioriactions")
pu_data <- data.table::fread(file = paste0(path,
"/pu_mitchell.csv"),
data.table = FALSE)
features_data <- data.table::fread(file = paste0(path,
"/features_mitchell.csv"),
data.table = FALSE)
dist_features_data <- data.table::fread(file = paste0(path,
"/dist_features_mitchell.csv"),
data.table = FALSE)
threats_data <- data.table::fread(file = paste0(path,
"/threats_mitchell.csv"),
data.table = FALSE)
dist_threats_data <- data.table::fread(file = paste0(path,
"/dist_threats_mitchell.csv"),
data.table = FALSE)
bound_data <- data.table::fread(file = paste0(path,
"/boundary_mitchell.csv"),
data.table = FALSE)
sensitivity_data <- data.table::fread(file = paste0(path,
"/sensibility_mitchell.csv"),
data.table = FALSE)
We load the shapefile of the case study also included as part of the package installation:
# read shapefile
mit_pu = raster::shapefile("data/Mitchell.shp")
# plot sapefile
plot(mit_pu)
After loading the instance’s data, we can plot different distributions of features or threats on the shapefile loaded. To do it, we can assign the values from tabular input to any of the fields in the shapefile containing the distribution of native species and threats:
# load amount of dist_features data
dist_features <- reshape2::dcast(dist_features_data,
pu~feature,
value.var = "amount",
fill = 0)
# assign the distribution of first feature to feature_distribution field
# in the shapefile
mit_pu$feature_distribution <- dist_features[, 2]
# plot distribution with tmap library
tmap::tm_shape(mit_pu) +
tmap::tm_fill("feature_distribution",
pal = c("white", "seagreen"),
labels = c("0", "1"),
breaks = c(0,1,2)) +
tmap::tm_borders(col="black",
lwd = 0.5)
Likewise, we can plot the distributions of any threats by applying the following instructions:
# load amount of dist_threats data
dist_threats <- reshape2::dcast(dist_threats_data,
pu~threat,
value.var = "amount",
fill = 0)
# assign the distribution of third threat to feature_distribution field
# in the shapefile
mit_pu$threat_distribution <- dist_threats[, 4]
# plot distribution with tmap library
tmap::tm_shape(mit_pu) +
tmap::tm_fill("threat_distribution",
pal = c("white", "red4"),
labels = c("0", "1"),
breaks = c(0,1,2)) +
tmap::tm_borders(col="black",
lwd = 0.5)
To solve all the models, the solver will be configured with a stop criterion of 5% of gap (gap_limit = 0.05
), which means that the solver will end its execution when it finds a solution whose gap is at least 5%, indicating 0% that this solution is optimal. Similarly, there are other stopping criteria such as the time (time_limit
), all available in the solve()
reference. There is also the option of not using any stopping criteria that will make the solver run until it finds an optimal solution; however, this could take a long time period.
In Section 2, we present the results obtained when solving the base model for the case study. Likewise, in Section 3 we report the results obtained when incorporating the different curves in the benefit calculation. And finally, in Section 4 we present the results obtained when incorporating the connectivity requirements to the model.
First, our base model considers the prioritization of management actions to abate a particular threat using previously loaded data. Note that for both features and threats we have presence/absence values (binary values). Some of the characteristics of the base model are the following:
minimiCosts
type of model to minimize costs for reaching 15% of the maximum recovery benefit per feature as target.To proceed we follow the three-step scheme described in the prioriactions
package: 1) data validation, 2) model creation and 3) model optimization.
# step 1: data validation
input_data <- inputData(pu = pu_data,
features = features_data,
dist_features = dist_features_data,
threats = threats_data,
dist_threats = dist_threats_data,
sensitivity = sensitivity_data,
bound = bound_data)
input_data
## Data
## planning units: data.frame (2316 units)
## monitoring costs: min: 1, max: 1
## features: scl_ja, nem_er, thr_sc, ... (45 features)
## threats: threat1, threat2, threat3, threat4 (4 threats)
## action costs: min: 1, max: 1
Now, in order to set the recovery targets at 20%, we must know what is the maximum recovery benefit possible through the getPotentialBenefit()
function.
# view the maximum benefit to achieve
maximum_benefits <- getPotentialBenefit(input_data)
head(maximum_benefits)
## feature dist dist_threatened maximum.conservation.benefit maximum.recovery.benefit maximum.benefit
## 1 1 692 692 0 692 692
## 2 2 1052 1052 0 1052 1052
## 3 3 416 416 0 416 416
## 4 4 653 653 0 653 653
## 5 5 238 238 0 238 238
## 6 6 276 276 0 276 276
We assign the new target (20% of maximum) to the corresponding column of the feature data input and create the object Data-class
again.
features_data$target_recovery <- maximum_benefits$maximum.recovery.benefit * 0.2
# step 1: validate modified data
input_data <- inputData(pu = pu_data,
features = features_data,
dist_features = dist_features_data,
threats = threats_data,
dist_threats = dist_threats_data,
sensitivity = sensitivity_data,
bound = bound_data)
input_data
## Data
## planning units: data.frame (2316 units)
## monitoring costs: min: 1, max: 1
## features: scl_ja, nem_er, thr_sc, ... (45 features)
## threats: threat1, threat2, threat3, threat4 (4 threats)
## action costs: min: 1, max: 1
Once the input data is validated, we proceed to create the mathematical model using the problem()
function with curve = 1
.
# step 2:
model.base <- problem(input_data,
model_type = "minimizeCosts",
blm = 0)
## Warning: The blm argument was set to 0, so the boundary data has no effect
## Warning: Some blm_actions argument were set to 0, so the boundary data has no effect for these cases
model.base
## Optimization Problem
## model sense: minimization
## dimensions: 37371, 43180, 1813.016 kB (nrow, ncol, size)
## variables: 43180
Note that the dimension of the model are 37371 mathematical constraints and 43180 variables. By using the getModelInfo()
function, we can display additional information of the model:
getModelInfo(model.base)
## model_sense n_constraints n_variables size
## 1 minimization 37371 43180 1813.016 kB
To solve the corresponding model, the solver is set considering a 5% optimality gap as stopping criterion (note that a 0% gap means that the problem is solved to optimality) and using gurobi solver. In turn, the verbose = TRUE
is used to display the execution of the solver and the output_file = FALSE
to avoid generating output files with the results.
# step 3:
solution.base <- solve(model.base,
gap_limit = 0.05,
verbose = TRUE,
output_file = FALSE,
cores = 2)
## Gurobi Optimizer version 9.1.2 build v9.1.2rc0 (linux64)
## Thread count: 2 physical cores, 4 logical processors, using up to 2 threads
## Optimize a model with 37371 rows, 43180 columns and 136566 nonzeros
## Model fingerprint: 0xc61c114a
## Variable types: 34965 continuous, 8215 integer (8215 binary)
## Coefficient statistics:
## Matrix range [3e-01, 4e+00]
## Objective range [1e+00, 1e+00]
## Bounds range [1e+00, 1e+00]
## RHS range [1e+00, 4e+02]
## Found heuristic solution: objective 7947.0000000
## Found heuristic solution: objective 6638.0000000
## Presolve removed 35017 rows and 35259 columns
## Presolve time: 0.25s
## Presolved: 2354 rows, 7921 columns, 66284 nonzeros
## Variable types: 0 continuous, 7921 integer (7904 binary)
##
## Root relaxation: objective 1.145583e+03, 4158 iterations, 0.09 seconds
##
## Nodes | Current Node | Objective Bounds | Work
## Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
##
## 0 0 1145.58333 0 547 6638.00000 1145.58333 82.7% - 0s
## H 0 0 1396.0000000 1145.58333 17.9% - 0s
## H 0 0 1393.0000000 1145.58333 17.8% - 0s
## 0 0 1173.77778 0 712 1393.00000 1173.77778 15.7% - 0s
## H 0 0 1369.0000000 1173.77778 14.3% - 0s
## 0 0 1175.01235 0 696 1369.00000 1175.01235 14.2% - 1s
## 0 0 1175.12963 0 691 1369.00000 1175.12963 14.2% - 1s
## 0 0 1222.36111 0 479 1369.00000 1222.36111 10.7% - 1s
## H 0 0 1333.0000000 1222.36111 8.30% - 1s
## 0 0 1225.44444 0 469 1333.00000 1225.44444 8.07% - 1s
## 0 0 1225.44444 0 472 1333.00000 1225.44444 8.07% - 1s
## 0 0 1241.19444 0 305 1333.00000 1241.19444 6.89% - 1s
## 0 0 1242.75000 0 292 1333.00000 1242.75000 6.77% - 1s
## 0 0 1242.75000 0 291 1333.00000 1242.75000 6.77% - 1s
## 0 0 1271.33333 0 68 1333.00000 1271.33333 4.63% - 1s
##
## Cutting planes:
## Gomory: 1
## Cover: 1214
## Clique: 11
## MIR: 482
## StrongCG: 1
## RLT: 2
##
## Explored 1 nodes (14581 simplex iterations) in 1.58 seconds
## Thread count was 2 (of 4 available processors)
##
## Solution count 6: 1333 1369 1393 ... 7947
##
## Optimal solution found (tolerance 5.00e-02)
## Best objective 1.333000000000e+03, best bound 1.272000000000e+03, gap 4.5761%
We have achieved a gap of 4.57% and a objective value of 1333 in a a time of 2.88 seconds. This and other relevant information can be obtained from the getPerformance()
function:
getPerformance(solution.base)
## solution_name objective_value gap solving_time status
## 1 sol 1333 4.576 1.58 Optimal solution (according to gap tolerance: 0.05)
Since our objective function does not contain the connectivity component (because both blm
and blm_actions
were set to zero), the objective value corresponds to the sum of all actions and monitoring costs. We can check these using the getCost()
function:
getCost(solution.base)
## solution_name monitoring threat_1 threat_2 threat_3 threat_4
## 1 sol 486 0 395 0 452
This shows that the actions with the highest total cost correspond to those that go against the threat 4, and then those corresponding to monitoring.
We use the getActions()
function to get the distribution of conservation actions. Note that because we only set recovery targets we use a recovery target planning propose, there are only planning units selected for prescribing management actions against threats, while there are no planning units selected for conservation. There are no planning units selected for connectivity as both blm and blm_actions were set to zero.
# get actions distribution
solution_actions.base <- getActions(solution.base)
head(solution_actions.base)
## solution_name pu 1 2 3 4 conservation connectivity
## 1 sol 1 0 0 0 0 0 0
## 2 sol 2 0 0 0 0 0 0
## 3 sol 3 0 1 0 1 0 0
## 4 sol 4 0 0 0 0 0 0
## 5 sol 5 0 0 0 0 0 0
## 6 sol 6 0 1 0 1 0 0
In the same way that we plot the distributions of species and threats, we can also explore the spatial distribution of management actions included in the optimal solution:
# assign solution to shapefile field to plot it
mit_pu$action_1.base <- solution_actions.base$`1`
mit_pu$action_2.base <- solution_actions.base$`2`
mit_pu$action_3.base <- solution_actions.base$`3`
mit_pu$action_4.base <- solution_actions.base$`4`
# actions plots
plot_action1.base <- tmap::tm_shape(mit_pu) +
tmap::tm_fill("action_1.base",
pal = c("white", "dodgerblue4"),
labels = c("0", "1"),
breaks = c(0,1,2)) +
tmap::tm_borders(col="black",
lwd = 0.5)
plot_action2.base <- tmap::tm_shape(mit_pu) +
tmap::tm_fill("action_2.base",
pal = c("white", "dodgerblue4"),
labels = c("0", "1"),
breaks = c(0,1,2)) +
tmap::tm_borders(col="black",
lwd = 0.5)
plot_action3.base <- tmap::tm_shape(mit_pu) +
tmap::tm_fill("action_3.base",
pal = c("white", "dodgerblue4"),
labels = c("0", "1"),
breaks = c(0,1,2)) +
tmap::tm_borders(col="black",
lwd = 0.5)
plot_action4.base <- tmap::tm_shape(mit_pu) +
tmap::tm_fill("action_4.base",
pal = c("white", "dodgerblue4"),
labels = c("0", "1"),
breaks = c(0,1,2)) +
tmap::tm_borders(col="black",
lwd = 0.5)
tmap::tmap_arrange(plot_action1.base,
plot_action2.base,
plot_action3.base,
plot_action4.base)
Also, we can show the distribution of the sum of the actions (higher density of actions):
mit_pu$sum_actions.base <- solution_actions.base$`1` +
solution_actions.base$`2` +
solution_actions.base$`3` +
solution_actions.base$`4` +
solution_actions.base$connectivity*5
# plot sum of actions with tmap library
plot.base <- tmap::tm_shape(mit_pu) +
tmap::tm_fill("sum_actions.base",
palette="viridis",
labels = c("do nothing",
"one action",
"two actions",
"three actions",
"four actions",
"connectivity"),
breaks = c(0,1,2,3,4,5,6)) +
tmap::tm_borders(col="black",
lwd = 0.5)
plot.base
We will use the results from this base model for comparisons with the following planning exercises.
This model differs from the previous one in that it tries to group conservation actions within the selected sites as part of the management plan (through a non-linear relationship in the calculation of benefits). This would be desirable when all or most of threats impacting a given species needed to be abated to ensure it long-term persistence (e.g., a species is highly sensitive to all the threats in a planning unit). The latter is done by adding a value other than 1 to the curve parameter. Where: (1) indicates that there is a linear relationship between the ratio between the actions carried out with respect to the possible actions to be carried out with respect to the benefit obtained by a characteristic so all actions are considered equally important, not being the species specially sensitive to any of them. (2) indicates a quadratic relationship between these values, and (3) a cubic relationship. The number of segments refers to the number of smaller portions in which the curve is divided to linealize it. Larger values of segments result in more complex models.
model.curve <- prioriactions::problem(input_data,
model_type = "minimizeCosts",
blm = 0,
curve = 3,
segments = 5)
## Warning: The blm argument was set to 0, so the boundary data has no effect
## Warning: Some blm_actions argument were set to 0, so the boundary data has no effect for these cases
model.curve
## Optimization Problem
## model sense: minimization
## dimensions: 37371, 78145, 1952.872 kB (nrow, ncol, size)
## variables: 78145
Note that the new model associates a larger instance (from 43180 to 78145 variables) when compared to the previously presented model. As a consequence, the resulting problem is computationally more difficult. The solution of the resulting model has a objective function value equal to 1336 (greater than the previous one), although the corresponding optimality gap is 4.79% (similar than the one attained for the simple model). The complete output of the solver is shown below:
solution.curve <- prioriactions::solve(model.curve,
gap_limit = 0.05,
verbose = TRUE,
output_file = FALSE,
cores = 2)
## Gurobi Optimizer version 9.1.2 build v9.1.2rc0 (linux64)
## Thread count: 2 physical cores, 4 logical processors, using up to 2 threads
## Optimize a model with 37371 rows, 78145 columns and 136566 nonzeros
## Model fingerprint: 0x727af3a5
## Model has 34965 general constraints
## Variable types: 69930 continuous, 8215 integer (8215 binary)
## Coefficient statistics:
## Matrix range [3e-01, 4e+00]
## Objective range [1e+00, 1e+00]
## Bounds range [1e+00, 1e+00]
## RHS range [1e+00, 4e+02]
## Found heuristic solution: objective 8215.0000000
## Presolve added 69869 rows and 174524 columns
## Presolve time: 0.78s
## Presolved: 107240 rows, 252669 columns, 695592 nonzeros
## Presolved model has 34965 SOS constraint(s)
## Found heuristic solution: objective 7947.0000000
## Variable types: 244742 continuous, 7927 integer (7911 binary)
##
## Deterministic concurrent LP optimizer: primal and dual simplex
## Showing first log only...
##
##
## Root simplex log...
##
## Iteration Objective Primal Inf. Dual Inf. Time
## 52139 2.7269611e+03 0.000000e+00 1.876868e+04 5s
## 77100 2.2476995e+03 0.000000e+00 1.868763e+04 10s
## 93792 2.0070600e+03 0.000000e+00 1.542965e+04 15s
## Concurrent spin time: 0.01s
##
## Solved with dual simplex
##
## Root relaxation: objective 1.144050e+03, 54270 iterations, 14.95 seconds
## Total elapsed time = 21.11s
##
## Nodes | Current Node | Objective Bounds | Work
## Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
##
## 0 0 1144.05000 0 6203 7947.00000 1144.05000 85.6% - 22s
## H 0 0 7924.0000000 1144.05000 85.6% - 23s
## 0 0 1148.84167 0 8970 7924.00000 1148.84167 85.5% - 28s
## H 0 0 2094.0000000 1148.84167 45.1% - 32s
## 0 0 1162.52272 0 11530 2094.00000 1162.52272 44.5% - 37s
## 0 0 1187.87778 0 6309 2094.00000 1187.87778 43.3% - 43s
## 0 0 1201.60000 0 5121 2094.00000 1201.60000 42.6% - 46s
## 0 0 1201.60000 0 5072 2094.00000 1201.60000 42.6% - 46s
## 0 0 1205.21667 0 5214 2094.00000 1205.21667 42.4% - 51s
## 0 0 1221.05556 0 5864 2094.00000 1221.05556 41.7% - 54s
## 0 0 1245.76667 0 3135 2094.00000 1245.76667 40.5% - 59s
## 0 0 1247.26667 0 2809 2094.00000 1247.26667 40.4% - 61s
## 0 0 1247.26667 0 3125 2094.00000 1247.26667 40.4% - 63s
## 0 0 1269.26667 0 3240 2094.00000 1269.26667 39.4% - 69s
## H 0 0 1472.0000000 1269.26667 13.8% - 79s
## 0 0 1269.26667 0 2501 1472.00000 1269.26667 13.8% - 81s
## 0 0 1269.92292 0 2699 1472.00000 1269.92292 13.7% - 86s
## 0 0 1270.83333 0 1876 1472.00000 1270.83333 13.7% - 88s
## 0 0 1270.83333 0 1741 1472.00000 1270.83333 13.7% - 88s
## 0 0 1272.00000 0 2068 1472.00000 1272.00000 13.6% - 92s
## H 0 0 1371.0000000 1272.00000 7.22% - 103s
## 0 0 1272.00000 0 1726 1371.00000 1272.00000 7.22% - 105s
## 0 0 1272.00000 0 2194 1371.00000 1272.00000 7.22% - 109s
## 0 0 1272.00000 0 1548 1371.00000 1272.00000 7.22% - 123s
## H 0 0 1353.0000000 1272.00000 5.99% - 141s
## H 0 0 1348.0000000 1272.00000 5.64% - 151s
## 0 2 1272.00000 0 1537 1348.00000 1272.00000 5.64% - 154s
## 1 4 1272.00000 1 1672 1348.00000 1272.00000 5.64% 4286 156s
## 5 8 1272.00000 2 1772 1348.00000 1272.00000 5.64% 2175 163s
## 7 10 1272.00000 3 1883 1348.00000 1272.00000 5.64% 2229 165s
## 17 26 1272.00000 5 1793 1348.00000 1272.00000 5.64% 1052 171s
## 25 42 1272.00000 9 1689 1348.00000 1272.00000 5.64% 769 183s
## H 27 42 1344.0000000 1272.00000 5.36% 725 183s
## 41 86 1272.00000 16 1691 1344.00000 1272.00000 5.36% 494 207s
## H 62 86 1341.0000000 1272.00000 5.15% 371 207s
## H 84 86 1340.0000000 1272.00000 5.07% 298 207s
## 85 178 1272.00000 34 1689 1340.00000 1272.00000 5.07% 295 233s
## H 108 178 1339.0000000 1272.00000 5.00% 245 233s
## H 111 178 1338.0000000 1272.00000 4.93% 239 233s
##
## Cutting planes:
## Cover: 1439
## Implied bound: 29488
## Clique: 144
## MIR: 426
## Flow cover: 3219
## Network: 4
## RLT: 2
## Relax-and-lift: 12413
##
## Explored 177 nodes (219985 simplex iterations) in 233.32 seconds
## Thread count was 2 (of 4 available processors)
##
## Solution count 10: 1338 1339 1340 ... 2094
##
## Optimal solution found (tolerance 5.00e-02)
## Warning: max constraint violation (1.3502e-02) exceeds tolerance
## Warning: max general constraint violation (1.3502e-02) exceeds tolerance
## (model may be infeasible or unbounded - try turning presolve off)
## Best objective 1.338000000000e+03, best bound 1.272000000000e+03, gap 4.9327%
Note that the last solution was reached in 177326 seconds. This specifically demonstrates how complexity increases when using the different parameters of curves.
# get action distribution
solution_actions.curve <- prioriactions::getActions(solution.curve)
# assign solution to shapefile field to plot it
mit_pu$action_1.curve <- solution_actions.curve$`1`
mit_pu$action_2.curve <- solution_actions.curve$`2`
mit_pu$action_3.curve <- solution_actions.curve$`3`
mit_pu$action_4.curve <- solution_actions.curve$`4`
mit_pu$sum_actions.curve <- solution_actions.curve$`1` +
solution_actions.curve$`2` +
solution_actions.curve$`3` +
solution_actions.curve$`4` +
solution_actions.curve$connectivity*5
# plot sum of actions with tmap library
plot.curve <- tmap::tm_shape(mit_pu) +
tmap::tm_fill("sum_actions.curve",
palette="viridis",
labels = c("do nothing",
"one action",
"two actions",
"three actions",
"four actions",
"connectivity"),
breaks = c(0,1,2,3,4,5,6)) +
tmap::tm_borders(col="black",
lwd = 0.5)
# comparative with base model solution
tmap::tmap_arrange(plot.base, plot.curve)