library(r2ogs6)
library(ggplot2)
library(dplyr)
Hi there! This is a practical guide on the OGS6_Ensemble
class from the r2ogs6
package. I will show you how you can use this class to set up ensemble runs for OpenGeoSys 6, extract the results and plot them.
If you want to follow this tutorial, you’ll need to have r2ogs6
installed and loaded. The prerequisites for r2ogs6
are described in detail in the r2ogs6 User Guide
vignette.
Additionally, you’ll need the following benchmark files:
Theis’ Problem as described here
Theis solution for well pumping as described here
General instructions on how to download OpenGeoSys 6 benchmark files can be found here.
We will consider the following parameters for our sensitivity analysis:
permeability
porosity
storage
First, we create a simulation object to base our ensemble on and read in the .prj
file.
# Change this to fit your system
<- tempdir()
testdir_path <- paste0(testdir_path, "/axisym_theis_sim")
sim_path
<- OGS6$new(sim_name = "axisym_theis",
ogs6_obj sim_path = sim_path)
# Change this to fit your system
<- system.file("extdata/benchmarks/AxiSymTheis/",
prj_path "axisym_theis.prj", package = "r2ogs6")
read_in_prj(ogs6_obj, prj_path, read_in_gml = T)
Let’s create a small ensemble where we only alter the value of storage
. Say we don’t want to hardcode the values, but instead examine the effects of changing storage
by 1%, 10% and 50%. We can use the percentages_mode
parameter of OGS6_Ensemble
for this. It already defaults to TRUE
, below I’m merely being explicit for demonstration purposes.
# Assign percentages
<- c(-50, -10, -1, 0, 1, 10, 50)
percentages
# Define an ensemble object
<-
ogs6_ens $new(
OGS6_Ensembleogs6_obj = ogs6_obj,
parameters = list(list(ogs6_obj$media[[1]]$properties[[4]]$value,
percentages)),percentages_mode = TRUE)
Now you can start the simulation.
$run_simulation() ogs6_ens
lapply(ogs6_ens$ensemble, ogs6_read_output_files)
Our simulations (hopefully) ran successfully - great! Now it’d be nice to see some results. Say we’re interested in the pressure
data.
# This will get a combined dataframe
<-
storage_tbl $get_point_data(point_ids = c(0, 1, 2),
ogs6_enskeys = c("pressure"))
You may leave out the point_ids
parameter. It defaults to all points in the dataset - I specify it here because there’s about 500 which would slow down building this vignette.
# Let's look at the first 10 rows of the dataset
head(storage_tbl, 10)
#> # A tibble: 10 x 8
#> id x y z pressure timestep sim_id perc
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
#> 1 0 0.305 0 0 0 0 1 -50
#> 2 0 0.305 0 0 6.99 8.64 1 -50
#> 3 0 0.305 0 0 9.77 86.4 1 -50
#> 4 0 0.305 0 0 13.3 1728 1 -50
#> 5 0 0.305 0 0 16.2 24192 1 -50
#> 6 0 0.305 0 0 16.6 172800 1 -50
#> 7 0 0.305 0 0 16.6 604800 1 -50
#> 8 0 0.305 0 0 16.6 864000 1 -50
#> 9 1 305. 0 0 0 0 1 -50
#> 10 1 305. 0 0 0 8.64 1 -50
You can see there’s one row per timestep for each point ID. Say we want to plot the average pressure for each timestep
, that is, the mean of the pressure
values where the rows have the same timestep
. We’ll consider only the first simulation for now.
# Get average pressure from first simulation
<- storage_tbl[storage_tbl$sim_id == 1,] %>%
avg_pr_first_sim group_by(timestep) %>%
summarize(avg_pressure = mean(pressure))
# Plot pressure over time for first simulation
ggplot(data = avg_pr_first_sim) +
geom_point(mapping = aes(x = as.numeric(row.names(avg_pr_first_sim)),
y = avg_pressure)) + xlab("Timestep")
# Get average pressure for all simulations
<- storage_tbl %>%
avg_pr_df group_by(sim_id, timestep) %>%
summarise(avg_pressure = mean(pressure))
# Plot pressure over time for all simulations
ggplot(avg_pr_df,
aes(x = as.numeric(as.factor(timestep)),
y = avg_pressure)) +
geom_point(aes(color = as.factor(sim_id))) +
geom_line(aes(color = as.factor(sim_id))) +
xlab("Timestep") +
labs(color = "sim id") +
facet_grid(rows = vars(sim_id))
Now we have the average pressure
over time for each of the simulations (rows). Since they’re pretty similar, let’s put them in the same plot for a better comparison.
ggplot(avg_pr_df, aes(x = as.numeric(as.factor(timestep)),
y = avg_pressure,
group = sim_id)) +
geom_point(aes(color = as.factor(sim_id))) +
geom_line(aes(color = as.factor(sim_id))) +
labs(color = "sim id") +
xlab("Timestep")
So far we’ve only considered the storage
parameter. Now we want to have a look at how the other two parameters influence the simulation, so let’s put them into an ensemble together. For permeability
we can reuse the percentages
we already have. For porosity
, this doesn’t work - the original value is 1 and its values can only range between 0 and 1, so we’ll supply a shorter vector.
This time it’s important we set the OGS6_Ensemble
parameter sequential_mode
to TRUE
as this will change the supplied parameters sequentially which means in the end we have 18 (7 + 7 + 4) simulations which is equal to the sum of elements in the value vectors you supply (I’ve named them values
below for clarity, naming them is optional though).
The default FALSE
would give an error message because our value vectors do not have the same length and even if they had, it wouldn’t do what we want - the number of simulations would equal the length of one value vector (thus requiring them to be of the same length). Generally, set sequential_mode
to TRUE
if you want to examine the influence of parameters on a simulation independently. If you want to examine how the parameters influence each other as in wanting to test parameter combinations, the default mode is the way to go.
# Change this to fit your system
<- paste0(testdir_path, "/axisym_theis_sim_big")
sim_path
$sim_path <- sim_path
ogs6_obj
<-
ogs6_ens_big $new(
OGS6_Ensembleogs6_obj = ogs6_obj,
parameters = list(per = list(ogs6_obj$media[[1]]$properties[[1]]$value,
values = percentages),
por = list(ogs6_obj$media[[1]]$properties[[3]]$value,
c(-50, -10, -1, 0)),
sto = list(ogs6_obj$media[[1]]$properties[[4]]$value,
values = percentages)),
sequential_mode = TRUE)
Now you can start the simulation.
$run_simulation() ogs6_ens_big
lapply(ogs6_ens_big$ensemble, ogs6_read_output_files)
This will take a short time. As soon as the simulations are done, we can extract the point data much like we did before. This time we want to plot the point x coodinates on the x axis so we’re leaving out point_ids
to get all points. Also we just want the data from the last timestep.
# Get combined dataframe
<-
per_por_sto_df $get_point_data(
ogs6_ens_bigkeys = c("pressure"),
start_at_timestep = ogs6_ens_big$ensemble[[1]]$pvds[[1]]$last_timestep)
Plotting time! Since we set sequential_mode
to TRUE
, the dataframe we just created contains a name
column which allows us to group by parameters. Because we’ve also set percentages_mode
to TRUE
, it also has a column perc
which allows us to group by percentages. Now we can simply use a facet grid to plot.
# Make plot
ggplot(per_por_sto_df,
aes(x = x,
y = -pressure, # Flip pressure because source term was positive
group = perc)) +
geom_point(aes(color = as.factor(perc))) +
xlab("Radius (m)") +
ylab("Head (m)") +
labs(color = "%") +
facet_grid(cols = vars(name),
labeller = as_labeller(c(per = "permeability",
por = "porosity",
sto = "storage")))
Ta-Daa! We can see permeability
has the most influence on the pressure. Though they may seem suspicious, porosity
and storage
are being plotted correctly - the points are just being placed right on top of each other. Since porosity
can’t go over the value 1
which was the original value, our value vector only went from -50% to 0% which is why the line colors of porosity
and storage
differ. Maybe we want to try and use a logarithmic approach for storage
. This won’t work with the built-in functionality of OGS6_Ensemble
so we’ll set up our Ensemble a little differently.
# Calculate log value
<- log(as.numeric(
log_val $media[[1]]$properties[[4]]$value),
ogs6_objbase = 10)
# Apply changes to log value
<- vapply(percentages, function(x){
log_vals + (log_val * (x / 100))
log_val FUN.VALUE = numeric(1))
},
# Transfer back to non-logarithmic scale
<- 10^log_vals
back_transf_vals
# Change sim_path to fit your system
$sim_path <- paste0(testdir_path, "/axisym_theis_sim_log_storage")
ogs6_obj
# Set up new ensemble
<-
ogs6_ens_sto $new(
OGS6_Ensembleogs6_obj = ogs6_obj,
parameters =
list(
sto = list(
$media[[1]]$properties[[4]]$value,
ogs6_objvalues = back_transf_vals)
),percentages_mode = FALSE,
sequential_mode = TRUE
)
As before, we can run the simulation right away.
$run_simulation() ogs6_ens_sto
lapply(ogs6_ens_sto$ensemble, ogs6_read_output_files)
Let’s check if we can observe any influence of storage
on pressure
now.
# Get combined dataframe
<-
sto_df $get_point_data(
ogs6_ens_stokeys = c("pressure"),
start_at_timestep = ogs6_ens_sto$ensemble[[1]]$pvds[[1]]$last_timestep)
# Supply percentages manually since we couldn't use `percentages_mode`
<- vapply(sto_df$sim_id,
percs function(x){percentages[[x]]},
FUN.VALUE = numeric(1))
ggplot(sto_df,
aes(x = x,
y = -pressure)) +
geom_point(aes(color = as.factor(percs))) +
xlab("Radius (m)") +
ylab("Head (m)") +
labs(color = "%")
We will consider the following parameters for our sensitivity analysis:
permeability
porosity
slope
First, we create a simulation object to base our ensemble on and read in the .prj
file. This time we want to specify that an output file only gets written at the last timestep.
# Change this to fit your system
<- paste0(testdir_path, "/theis_sim")
sim_path
<- OGS6$new(sim_name = "theis",
ogs6_obj sim_path = sim_path)
# Change this to fit your system
<- system.file("extdata/benchmarks/theis_well_pumping/",
prj_path "theis.prj", package = "r2ogs6")
read_in_prj(ogs6_obj, prj_path, read_in_gml = T)
# Increase each_steps
$time_loop$output$timesteps$pair$each_steps <- 200 ogs6_obj
# Assign percentages
<- c(-50, -10, -1, 0, 1, 10, 50)
percentages
<-
ogs6_ens_theis_2 $new(
OGS6_Ensembleogs6_obj = ogs6_obj,
parameters =
list(
per = list(ogs6_obj$parameters[[3]]$values,
values = percentages),
por = list(ogs6_obj$parameters[[2]]$value,
values = percentages),
slo = list(
$media[[1]]$phases[[1]]$properties[[1]]$independent_variable[[2]]$slope,
ogs6_objvalues = percentages)
),sequential_mode = TRUE
)
Now you can start the simulation.
$run_simulation() ogs6_ens_theis_2
lapply(ogs6_ens_theis_2$ensemble, ogs6_read_output_files)
When the simulations have run, we can extract and plot the results like before. To avoid cluttering the plot, we only extract the pressure
values for a single line. For this, we get the IDs of all points on the x axis.
# Extract point ids
<- function(points){
get_point_ids_x <- numeric()
x_axis_ids
for(i in seq_len(dim(points)[[1]])) {
if (points[i, ][[2]] == 0 && points[i, ][[3]] == 0) {
<- c(x_axis_ids, (i - 1))
x_axis_ids
}
}
return(x_axis_ids)
}
<- get_point_ids_x(
point_ids_x $ensemble[[1]]$pvds[[1]]$OGS6_vtus[[1]]$points)
ogs6_ens_theis_2
# Get combined dataframe
<-
per_por_slo_df $get_point_data(
ogs6_ens_theis_2point_ids = point_ids_x,
keys = c("pressure"),
start_at_timestep = ogs6_ens_theis_2$ensemble[[1]]$pvds[[1]]$last_timestep)
# Make plot
ggplot(per_por_slo_df,
aes(x = x,
y = pressure / 9806.65, # 1mH2O = 9806.65 kPa
group = perc)) +
geom_point(aes(color = as.factor(perc))) +
xlab("Radius (m)") +
ylab("Absenkung (m)") +
labs(color = "%") +
facet_grid(cols = vars(name),
labeller = as_labeller(c(per = "permeability",
por = "porosity",
slo = "slope"
)))
Let’s take a closer look at permeability
.
<- subset(per_por_slo_df, name == "per")
per_df
# Make plot
ggplot(per_df,
aes(x = x,
y = pressure)) +
geom_point(aes(color = as.factor(perc))) +
xlab("Radius (m)") +
ylab("Head (m)") +
labs(color = "%")
Maybe we want to try and use a logarithmic approach for slope
. This won’t work with the built-in functionality of OGS6_Ensemble
so we’ll set up our Ensemble a little differently.
# Calculate log value
<- log(as.numeric(
log_val $media[[1]]$phases[[1]]$properties[[1]]$independent_variable[[2]]$slope),
ogs6_objbase = 10)
# Apply changes to log value
<- vapply(percentages, function(x){
log_vals + (log_val * (x / 100))
log_val FUN.VALUE = numeric(1))
},
# Transfer back to non-logarithmic scale
<- 10^log_vals
back_transf_vals
# Change sim_path to fit your system
$sim_path <- paste0(testdir_path, "/theis_sim_log_slope")
ogs6_obj
# Set up new ensemble
<-
ogs6_ens_slo $new(
OGS6_Ensembleogs6_obj = ogs6_obj,
parameters =
list(
slo = list(
$media[[1]]$phases[[1]]$properties[[1]]$independent_variable[[2]]$slope,
ogs6_objvalues = back_transf_vals)
),percentages_mode = FALSE,
sequential_mode = TRUE
)
As before, we can run the simulation right away.
$run_simulation() ogs6_ens_slo
lapply(ogs6_ens_slo$ensemble, ogs6_read_output_files)
Let’s check if we can observe any influence of slope
on pressure
now.
# Filter point ids
<- get_point_ids_x(
point_ids_x $ensemble[[1]]$pvds[[1]]$OGS6_vtus[[1]]$points)
ogs6_ens_slo
# Get combined dataframe
<-
slo_df $get_point_data(
ogs6_ens_slopoint_ids = point_ids_x,
keys = c("pressure"),
start_at_timestep = ogs6_ens_slo$ensemble[[1]]$pvds[[1]]$last_timestep)
# Supply percentages manually since we couldn't use `percentages_mode`
<- vapply(slo_df$sim_id,
percs function(x){percentages[[x]]},
FUN.VALUE = numeric(1))
ggplot(slo_df,
aes(x = x,
y = pressure / 9806.65)) + # 1mH2O = 9806.65 kPa
geom_point(aes(color = as.factor(percs))) +
xlab("Radius (m)") +
ylab("Head (m)") +
labs(color = "%")
The OGS6_Ensemble
class is a useful tool to set up ensemble runs for sensitivity analyses. In this vignette, we learned how to create OGS6_Ensemble
objects. We looked at how the parameters sequential_mode
and percentages_mode
influence how our ensemble object is initialised. We started simulations via OGS6_Ensemble$run_simulation()
and extracted information from the output files to plot them.