-
Ruben Heinrich authoredRuben Heinrich authored
title: "r2ogs6 Developer Guide"
author: "Ruben Heinrich"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{r2ogs6 Developer Guide}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
devtools::load_all(".")
library(r2ogs6)
Hi there!
Welcome to my dev guide on r2ogs6
. This is a collection of tips, useful info (and admittedly a few warnings) which will hopefully make your life a bit easier when developing this package.
The basics
Before we dive into any implementation details, we will take a look at how exactly this package is structured first. r2ogs6
was developed using the workflow described here. I strongly recommend keeping it that way as it will save you time and headaches.
...
In the main folder R/
you will find a lot of scripts, most of which can be grouped into the following categories:
-
export_*.R
export functions -
generate_*.R
code generation -
read_in_*.R
import functions -
ogs6_*.R
simulation class definitions -
prj_*.R
class definitions for XML tags found in a.prj
file -
*_utils.R
utility functions used in multiple scripts
The classes
r2ogs6
is largely built on top of S3 classes at the moment. For reasons I will elaborate on later, it is very viable to switch to R6 classes. But let's look at what we have first.
....
Generating new classes
If you've familiarized yourself with OpenGeoSys 6, you know that there are a lot, and by a lot I mean a LOT of parameters and special cases regarding the .prj
XML tags. For a nice new class based on such a tag, you will have to consider all of them.
To save me (and you) a bit of typing, I've written a few useful functions for this.
analyse_xml()
The first and arguably most important one is analyse_xml()
. It matches files in a folder, reads them in as XML and searches for XML elements of a given name. It then analyses those elements and returns useful information about them, namely the names of their attributes and child elements. It prints a summary of its findings and also returns a list which we will look at in a moment.
I used this function for two things: Analysing ... . Secondly, as soon as I had decided which tags should be represented by a class, I used the function output for class generation.
generate_*()
So say we have some .prj
files stored in a folder. I will show the workflow on a small dataset (that is, on a folder with only two .prj
files) here, the path I usually passed to analyse_xml()
was the directory containing all of the benchmark files for OpenGeoSys 6 which can be downloaded from here.
test_folder <- system.file("extdata/vignettes_data/analyse_xml_demo",
package = "r2ogs6")
Now say we have decided we are going to make a class based on the element with tag name nonlinear_solver
. For readability reasons, I will store the results of analyse_xml()
in a variable and pass it to our generator function. If you want, you can skip this step and call analyse_xml()
in the generator function directly.
analysis_results <- analyse_xml(path = test_folder,
pattern = "\\.prj$",
xpath = "//nonlinear_solver",
print_findings = TRUE)
First, I define my path and specify that only files with the ending .prj
will be parsed. I'm looking for elements named nonlinear_solver
, and I'm looking for them in the whole document. This often isn't the best option since sometimes nodes may have the same name but contain different things depending on their exact position in the document, which is also the case here. To narrow it down further, change xpath
accordingly.
analysis_results <- analyse_xml(path = test_folder,
pattern = "\\.prj$",
xpath = "/OpenGeoSysProject/nonlinear_solvers/nonlinear_solver",
print_findings = TRUE)
Now we can be sure our future class will be generated from the correct parameters.
analyse_xml()
returns a named list invisibly, let's have a short look at it.
analysis_results
You can see the list contains the xpath
parameter passed to analyse_xml()
, along with three named logical vectors called children
, attributes
and both_sorted
respectively. They can be read like this: If an attribute or a child of the element specified by xpath
always occurred, it is a required parameter for the new class. Else, it is an optional parameter. The logical vectors are sorted by occurrency, so the rarest children and attributes will go to the very end of their logical vector. Now, let's generate some code!
For S3 classes, we generate a constructor like this:
generate_constructor(params = analysis_results,
print_result = TRUE)
For S3 classes, we generate a helper like this:
generate_helper(params = analysis_results,
print_result = TRUE)
For R6 classes, we generate a constructor like this:
generate_R6(params = analysis_results,
print_result = TRUE)
Ta-daa, you now have some nice stubs. Copy them into a script in the R
folder of this package, add some documentation and validation to it and you're almost done.
Integrating new classes
Now that we have a class, we need to tell the package it exists. This is so when we're reading in or exporting a .prj
file, it knows to automatically turn the content of our nonlinear_solver
tag into an object of our new class and the other way around. To achieve this, execute the code in data_raw/xpaths_for_classes.R
. What this will do is update the xpaths_for_classes
parameter, adding an entry for your class. Afterwards, run xpaths_for_classes[["your_class_name"]]
. It should return the xpath
parameter of your class like so:
xpaths_for_classes[["prj_process"]]
# A class can have multiple xpaths if the represented node occurs at different positions.
xpaths_for_classes[["prj_convergence_criterion"]]
If the class you've created is a .prj
top level class or a child of a top level wrapper node like processes
, add a corresponding OGS6
private parameter and an active field. For example, the processes
node is represented as a list, so I added the private parameter .processes = list()
and the active field processes
.
A lot of things in the r2ogs6
package work in a way that is a bit "meta". Often times, functions are called via eval(parse(text = call_string))
where call_string
has for example been concatenated out of info about the parameter names of a certain class. This saves a lot of code regarding import, export and script generation but requires that you've made the respective info available as shown here.
So we've analysed some files, generated some code, created a new class and registered it with the package... what now? That's it actually, that's the workflow. Well, at least it's supposed to be.
Recursive function guide
If that wasn't it, I'm afraid you might have to take a look at the functions handling import, export and benchmark script generation. These are a bit tricky because they use recursion which so far has proven to be efficient structure-wise but not exactly fun to think about.
read_in
to_node
generate_benchmark_script
Conclusion
I hope you've taken away some helpful information from this short guide. If you make changes to improve the workflow, please update this vignette for the next dev!