dev-check-model-reproducibility-highres-same-pkg-version.Rmd
This notebook gather the analysis performed in order to compare the
output of the original FishMap main.R
script between Client
and ThinkR. We will compare the results obtained with identical seeds
following the execution of the script
dev/run_main_and_save_output.R
.
Here, we use high resolution parameters :
k = 0.75
month_start <- 10
month_end <- 12
We use similar package versions for Client and ThinkR outputs :
R version 4.2.3
; TMB_1.9.2
;
INLA_22.12.16
R version 4.2.0
;
TMB_1.9.0
; INLA_22.12.16
main.R
We generate the outputs on ThinkR machine. Make sure your .Renviron
variables FISHMAP_UPDATE_OUTPUTS
and
FISHMAP_OUTPUT_DIR
are correctly set.
# Run main.R a second time (model files are already compiled from first run)
source(here::here("dev", "run_main_and_save_output.R"))
## Error in file(filename, "r", encoding = encoding): cannot open the connection
We list the resulting output files.
thinkr1_output_dir <- file.path("~","shared","outputs_fishmap_highres_rerun")
thinkr1_output <- paste0(
list.files(
path = thinkr1_output_dir,
full.names = TRUE
),
collapse = "\n"
)
glue::glue("The paths to ThinkR's second run output files are :\n {thinkr1_output}")
## The paths to ThinkR's second run output files are :
We now load the outputs generated from Clients (BA) in a temporary folder.
# Create tmp folder to store Client output
tmp_folder <- tempfile(pattern = "fishmap_highres")
dir.create(tmp_folder)
# Download and unzip BA highres outputs from Git repo
ba_zip_file_url <- "https://github.com/balglave/FishMap/files/11011699/outputs_fishmap_highres_updatepkg.zip"
download.file(
url = ba_zip_file_url,
destfile = file.path(tmp_folder, "ba_output.zip")
)
unzip(
zipfile = file.path(tmp_folder, "ba_output.zip"),
exdir = file.path(tmp_folder, "ba_output")
)
ba_output_dir <- file.path(tmp_folder, "ba_output", "outputs_fishmap_highres_updatepkg")
ba_output <- paste0(
list.files(
path = ba_output_dir,
full.names = TRUE
),
collapse = "\n"
)
glue::glue("The paths to Baptiste's output files are :\n {ba_output}")
## The paths to Baptiste's output files are :
## /tmp/RtmpMWf0bp/fishmap_highres5ded4c03ec4d/ba_output/outputs_fishmap_highres_updatepkg/converge_output.rds
## /tmp/RtmpMWf0bp/fishmap_highres5ded4c03ec4d/ba_output/outputs_fishmap_highres_updatepkg/obj_input.rds
## /tmp/RtmpMWf0bp/fishmap_highres5ded4c03ec4d/ba_output/outputs_fishmap_highres_updatepkg/opt_output.rds
## /tmp/RtmpMWf0bp/fishmap_highres5ded4c03ec4d/ba_output/outputs_fishmap_highres_updatepkg/report_output.rds
Results between ThinkR and Client’s are not perfectly identical.
Important note : To compare numerical results at two precision levels, we will consecutively set a tolerance in numerical differences to 1e-4 (as used in the other comparison vignettes) and 1e-6.
Conclusion :
fn
element of
the obj_input.rds
file for both tolerance thresholdsWe will use again waldo within a function to display
the differences for each file. We will adapt the tolerance
parameter to adjust numerical comparison threshold.
# Create a function to explore waldo's output file by file between ThinkR and Baptiste outputs
compare_output_file <- function(file_name, tolerance) {
# define client output dir
client_output_dir <- ba_output_dir
# running waldo on one file (thinkR ~ client)
message(glue::glue("contrasting output of {file_name} between thinkr and baptiste with a numerical tolerance of {tolerance}"))
compare_author <- waldo::compare(
x = readRDS(
file.path(client_output_dir, file_name)
),
y = readRDS(
file.path(thinkr1_output_dir, file_name)
),
x_arg = "baptiste",
y_arg = "thinkr",
max_diffs = 100,
tolerance = tolerance
)
return(compare_author)
}
1e-4
numerical
tolerance
We first make the comparison for each file with a tolerance of
1e-4
.
compare_output_file(file_name = "converge_output.rds", tolerance = 1e-4)
## contrasting output of converge_output.rds between thinkr and baptiste with a numerical tolerance of 1e-04
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/converge_output.rds', probable reason 'No
## such file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
compare_output_file(file_name = "opt_output.rds", tolerance = 1e-4)
## contrasting output of opt_output.rds between thinkr and baptiste with a numerical tolerance of 1e-04
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/opt_output.rds', probable reason 'No such
## file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
compare_output_file(file_name = "report_output.rds", tolerance = 1e-4)
## contrasting output of report_output.rds between thinkr and baptiste with a numerical tolerance of 1e-04
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/report_output.rds', probable reason 'No
## such file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
compare_output_file(file_name = "obj_input.rds", tolerance = 1e-4)
## contrasting output of obj_input.rds between thinkr and baptiste with a numerical tolerance of 1e-04
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/obj_input.rds', probable reason 'No such
## file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
With a numerical tolerance of 1-e4
, the outputs between
ThinkR and Baptiste are identical, with the exception of the
fn
object in obj_inputs.rds
, which was also
reported when contrasting Juliette’s outputs. All numerical outputs are
identical at 1e-4 numerical tolerance.
converge_output.rds
output at
1e-6
numerical tolerance
We will now run the same comparison for each file, with a more
stringent numerical tolerance of 1e-6
.
compare_output_file(file_name = "converge_output.rds", tolerance = 1e-6)
## contrasting output of converge_output.rds between thinkr and baptiste with a numerical tolerance of 1e-06
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/converge_output.rds', probable reason 'No
## such file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
opt_output.rds
output at 1e-6
numerical tolerance
compare_output_file(file_name = "opt_output.rds", tolerance = 1e-6)
## contrasting output of opt_output.rds between thinkr and baptiste with a numerical tolerance of 1e-06
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/opt_output.rds', probable reason 'No such
## file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
report_output.rds
output at
1e-6
numerical tolerance
compare_output_file(file_name = "report_output.rds", tolerance = 1e-6)
## contrasting output of report_output.rds between thinkr and baptiste with a numerical tolerance of 1e-06
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/report_output.rds', probable reason 'No
## such file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
obj_input.rds
output at 1e-6
numerical tolerance
compare_output_file(file_name = "obj_input.rds", tolerance = 1e-6)
## contrasting output of obj_input.rds between thinkr and baptiste with a numerical tolerance of 1e-06
## Warning in gzfile(file, "rb"): cannot open compressed file '/home/rstudio/shared/outputs_fishmap_highres_rerun/obj_input.rds', probable reason 'No such
## file or directory'
## Error in gzfile(file, "rb"): cannot open the connection
# delete temporary folder
unlink(tmp_folder)