Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate 'guessing' functions for horizon-level column names #309

Merged
merged 6 commits into from
Jun 25, 2024

Conversation

brownag
Copy link
Member

@brownag brownag commented Feb 25, 2024

RE: #306 (comment)

This PR deprecates guessHzDesgnName() guessHzTexClName() and removes guessHzAttrName() from usage in aqp functions. I think the latter function is useful for certain applications if users "opt in" by using it in their own code but we can certainly avoid "internal" use of it.

The mechanism to require metadata be set, and generic methods for getting/setting site or horizon level metadata have existed for some time (

# this method throws an error if required metadata aren't present
.require.metadata.aqp <- function(object, attr, attrlabel, message, required = FALSE) {
xx <- metadata(object)[[attr]]
if (length(xx) == 0 || nchar(xx) == 0) {
if (required) {
stop(attrlabel, " (Metadata element: '", attr, "') is not specified for this SoilProfileCollection. It is required.", message,
call. = FALSE)
} else {
xx <- ""
}
}
xx
}
# this method handles setting of user metadata
# key features:
# - NA, "", and NULL `value` are treated the same
# - If `value` does not exist in horizonNames()
# - allowed names is either "site" "horizon" or NULL to allow options:
# siteNames(), horizonNames() and no limit
#
.set.metadata.aqp <- function(object, value, required = FALSE,
attr, attrlabel, message = "",
allowednames = NULL) {
# test: does it exist?
if (length(value) == 0) {
value <- ""
}
if(length(value) > 0) {
# several ways to "reset" the metadata to the safe but invalid value
if((value == "") | is.na(value) | is.null(value)) {
value <- ""
} else if (!is.null(allowednames)) {
namegrp <- switch(tolower(allowednames),
"site" = siteNames(object),
"horizon" = horizonNames(object), NULL)
if (!is.null(namegrp) && !(value %in% namegrp)) {
stop(paste0(attrlabel, " (", value, ") not in ", allowednames, " names."), call. = FALSE)
}
}
}
# replace
metadata(object)[[attr]] <- value
# check if required
.require.metadata.aqp(object,
attr = attr,
attrlabel = attrlabel,
message = message,
required = required)
# done
return(object)
}
). These are now used in all cases that used to use the guessXX functions. There are still several cases that could make use of required metadata routines.

To ease the handling of generic metadata column names, hzmetaname() and hzmetaname<- have been added. It allows the specification/storage of standard, but custom, metadata that relates a name for horizon data of a particular type. This replaces the use of guessHzAttrName(p, attr = 'clay', c("total", "_r")) in argillic/PSCS function definitions.

For example:

library(aqp)
#> This is aqp 2.0.3

data(sp1)

# promote to SPC
depths(sp1) <- id ~ top + bottom

# set important metadata columns
hzdesgnname(sp1) <- "name"
hztexclname(sp1) <- "texture"

# set custom horizon property (clay content) column
hzmetaname(sp1, "clay") <- "prop"

# inspect metadata list
metadata(sp1)
#> $aqp_df_class
#> [1] "data.frame"
#> 
#> $aqp_group_by
#> [1] ""
#> 
#> $aqp_hzdesgn
#> [1] "name"
#> 
#> $aqp_hztexcl
#> [1] "texture"
#> 
#> $depth_units
#> [1] "cm"
#> 
#> $stringsAsFactors
#> [1] FALSE
#> 
#> $original.order
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
#> [51] 51 52 53 54 55 56 57 58 59 60
#> 
#> $aqp_hzclay
#> [1] "prop"

# get horizon clay content column
hzmetaname(sp1, "clay")
#> [1] "prop"

# uses hzdesgname(), hztexclname(), hzmetaname(attr="clay") in function definition
estimatePSCS(sp1)
#>     id pscs_top pscs_bottom
#> 1 P001       49          89
#> 2 P002       30          59
#> 3 P003        2          52
#> 4 P004       32          62
#> 5 P005        5          55
#> 6 P006       31         106
#> 7 P007       25         100
#> 8 P008       27         102
#> 9 P009       28         103

One area where guessing is still used is guessGenHzLevels(). I will not address that in this PR, but out of all of these functions this is actually the only one I am aware of having caused confusion. I have had folks ask me about why the levels returned by generalize.hz() were ordered the way they were, and perhaps we should move this logic outside of the generalization function. Ordered factors make sense when the SPC is carefully curated, but can wind up being a bit illogical when there is, for example, a wide range of depth classes with different types of vertical subdivision present.

 - `guessHzAttrName()` is still available for users, but never called in aqp functions
 - Preferred method is to specify in function call or have users set metadata.
   - Require metadata via `required` argument to metadata get methods
 - update examples and tests accordingly
@brownag brownag mentioned this pull request Feb 25, 2024
3 tasks
@dylanbeaudette
Copy link
Member

Thanks. I did not mean to trigger an all at once change, but I do think that magical function action is best avoided. I'll think about how to best address guessGenHzLevels()--thanks for the reminder.

@brownag
Copy link
Member Author

brownag commented Feb 25, 2024

magical function action is best avoided.

IMO it is not that magical if we have exported, documented functions as default arguments to do the operation vs. having it buried inside a function, or having custom handling within the function like plotSPC() did. I agree there were a few variants of approaches, which was confusing and inconsistent. We still have some variation for functions that can take data.frame or SPC as input, but I think this PR is a step in the right direction for the future.

It is better overall to guide the users towards using the metadata functions / setting the metadata, and ensuring that soilDB functions also set relevant metadata so most users don't have to worry about it.

@brownag brownag marked this pull request as ready for review March 22, 2024 15:53
@brownag
Copy link
Member Author

brownag commented Apr 1, 2024

I would suggest merging this after CRAN release of 2.0.3

@brownag
Copy link
Member Author

brownag commented Jun 24, 2024

I think this is ready to merge

@brownag
Copy link
Member Author

brownag commented Jun 25, 2024

Thanks for review, just synced with master branch, assuming everything passes (aside pending fixes from #317) will proceed with merge

@brownag brownag merged commit a98a409 into master Jun 25, 2024
4 of 5 checks passed
@brownag brownag deleted the deprecateGuess branch June 25, 2024 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants