The Kew Reconciliation Service (KRS) allows a user to submit a taxon name for matching against IPNI.

The reconciliation service is an OpenRefine-style API that matches a single name against IPNI. The matching is done by a series of string transformations configured for botanical names in IPNI. These transformations are detailed here, I think.

It appears that KRS is the service that sits behind the Kew Names Matching Service (KNMS). KNMS allows matching of batches of names in one request but does not allow matching on individual parts of a name. If you have a set of names to match and just want to do simple matching, I’d use KNMS. But if you want to specify which parts of the names to match on, I’d use KRS.

library(kewr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

Matching names

To use KRS, you can just submit a single name for matching.

match <- match_krs("Solanum sanchez-vegae S.Knapp")
match
#> <KRS match: 1 names matched to 'Solanum sanchez-vegae S.Knapp'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "77103635-1"
#>   ..$ name : chr "Solanaceae Solanum sanchez-vegae S.Knapp"
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 100
#>   ..$ match: logi TRUE

This also works without the author string included:

match <- match_krs("Solanum sanchez-vegae")
match
#> <KRS match: 1 names matched to 'Solanum sanchez-vegae'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "77103635-1"
#>   ..$ name : chr "Solanaceae Solanum sanchez-vegae S.Knapp"
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 100
#>   ..$ match: logi TRUE

The match results can be converted to a table for easier inspection.

tidy(match)
#> # A tibble: 1 × 5
#>   id         name                                     type           score match
#>   <chr>      <chr>                                    <list>         <dbl> <lgl>
#> 1 77103635-1 Solanaceae Solanum sanchez-vegae S.Knapp <tibble [1 × …   100 TRUE
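The type column in the tidied table is a list column, so it can be unnested if you want a completely flat table. Below is a minimal sketch of that step. It runs on a hand-built stand-in for the table rather than a live result, and it assumes the nested tibble has the two fields (id and name) shown in the printed list output above, since the table print truncates its dimensions.

```r
library(dplyr)
library(tidyr)

# Hand-built stand-in for the tidy(match) output shown above (no network call).
# Assumption: the nested `type` tibble has exactly the columns id and name.
tidied <- tibble(
  id = "77103635-1",
  name = "Solanaceae Solanum sanchez-vegae S.Knapp",
  type = list(tibble(
    id = "/biology/organism_classification/scientific_name",
    name = "Scientific name"
  )),
  score = 100,
  match = TRUE
)

# names_sep prefixes the nested columns (type_id, type_name) so they do not
# clash with the top-level id and name columns.
flat <- tidied %>%
  unnest(cols = type, names_sep = "_")

# Keep only the columns most likely to be needed downstream.
flat %>%
  select(id, name, type_name, score, match)
```

Without names_sep, the nested id and name would collide with the top-level columns of the same name.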

Matching parts of a name

The reconciliation service provides a specification for matching to different parts of a botanical name. This is described in detail here.

For example, if we want to match to the genus name Myrcia, we could submit a simple request like before.

match <- match_krs("Myrcia")
match
#> <KRS match: 2 names matched to 'Myrcia'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "30001220-2"
#>   ..$ name : chr "Myrtaceae Myrcia DC."
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 50
#>   ..$ match: logi FALSE

But this has matched more than one name, as the header of the output shows. We can be more specific by matching on the genus and the author.

match <- match_krs(list(genus="Myrcia", author="DC"))
match
#> <KRS match: 1 names matched to genus='Myrcia', author='DC'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "30001220-2"
#>   ..$ name : chr "Myrtaceae Myrcia DC."
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 100
#>   ..$ match: logi TRUE

This has narrowed the result down to a single match.

We can specify a match for every part of a name like this.

match <- match_krs(list(genus="Myrcia", species="magnolifolia", infra="latifolia",
                        author="Berg"))
match
#> <KRS match: 1 names matched to genus='Myrcia', species='magnolifolia', infra='latifolia', author='Berg'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "165832-2"
#>   ..$ name : chr "Myrtaceae Myrcia magnoliifolia DC. var. latifolia O.Berg"
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 100
#>   ..$ match: logi TRUE

This match has worked even though there are minor misspellings in the request: the specific epithet was submitted as "magnolifolia" rather than "magnoliifolia", and the author as "Berg" rather than "O.Berg". Matching on the taxon name works through a set of pre-configured string transformations that catch some common mistakes in botanical names. Matching on author strings is also slightly fuzzy.
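To give a feel for how a string transformation can absorb a misspelling like this, here is a small sketch in base R. It is purely illustrative: normalise_epithet() is a hypothetical function written for this example, and the rule it applies is not taken from the KRS configuration, which may work quite differently.

```r
# Hypothetical normalisation, for illustration only; the real KRS
# transformations are configured in the service and may differ.
normalise_epithet <- function(x) {
  x <- tolower(trimws(x))
  # Collapse any run of a repeated character to one character ("ii" -> "i").
  gsub("(.)\\1+", "\\1", x, perl = TRUE)
}

# Both spellings reduce to the same key, so they would compare as equal.
normalise_epithet("magnolifolia") == normalise_epithet("magnoliifolia")

# For author strings, an edit distance gives a simple measure of fuzziness.
# adist() is base R and returns a matrix, so take the single cell.
author_distance <- adist("Berg", "O.Berg")[1, 1]
author_distance
```

A rule this blunt would also merge names that really are different, so treat it as a way to picture the idea rather than something to apply to real data.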

This matching also handles different taxonomic ranks using ordered epithets, where the highest rank is specified as epithet_1 down to epithet_3.

match <- match_krs(list(epithet_1="Solanaceae"))
match
#> <KRS match: 2 names matched to epithet_1='Solanaceae'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "60437408-2"
#>   ..$ name : chr "Solanaceae Adans."
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 50
#>   ..$ match: logi FALSE

This also works for infrageneric names.

match <- match_krs(list(epithet_1="Acacia", epithet_2="Aculeiferum", author="Vassal"))
match
#> <KRS match: 3 names matched to epithet_1='Acacia', epithet_2='Aculeiferum', author='Vassal'>
#> List of 1
#>  $ :List of 5
#>   ..$ id   : chr "53905-3"
#>   ..$ name : chr "Mimosaceae Acacia sect. Aculeiferum (Vassal) Vassal"
#>   ..$ type :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ id  : chr "/biology/organism_classification/scientific_name"
#>   .. .. ..$ name: chr "Scientific name"
#>   ..$ score: num 33
#>   ..$ match: logi FALSE

Note that these last two examples give a score lower than 100 (50 and 33) because they return more than one match (two and three names respectively).
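The scores in the outputs above are consistent with 100 divided by the number of names matched, rounded down. This is only an observation from these few examples, not documented behaviour of the service, so I wouldn't rely on it. The sketch below just checks the arithmetic.

```r
# Number of names matched and the score reported, read off the outputs above.
n_matched <- c(1, 2, 3)
observed  <- c(100, 50, 33)

# Assumed relationship (an inference from the examples, not from the KRS docs).
implied <- floor(100 / n_matched)

all(implied == observed)
```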

Matching more than one name

If you want to do simple matching to more than one name, it might be easier to use KNMS.

If you want to match the individual parts of multiple names, you can apply the matching function to the rows of a data frame, using dplyr::rowwise.

names <- tibble(
  genus=c("Poa", "Myrcia", "Solanum"),
  species=c("annua", "almasensis", "sanchez-vegae"),
  author=c("L.", "NicLugh.", "S.Knapp")
)


# Query KRS once per row, tidy each result, then expand it into columns.
matches <-
  names %>%
  rowwise() %>%
  mutate(match=list(match_krs(list(genus=genus, species=species, author=author)))) %>%
  mutate(match=list(tidy(match))) %>%
  unnest(cols=c(match))

matches
#> # A tibble: 3 × 8
#>   genus   species       author   id         name            type     score match
#>   <chr>   <chr>         <chr>    <chr>      <chr>           <list>   <dbl> <lgl>
#> 1 Poa     annua         L.       320035-2   Poaceae Poa an… <tibble…   100 TRUE 
#> 2 Myrcia  almasensis    NicLugh. 304073-2   Myrtaceae Myrc… <tibble…   100 TRUE 
#> 3 Solanum sanchez-vegae S.Knapp  77103635-1 Solanaceae Sol… <tibble…   100 TRUE
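The pipeline above assumes every row has a value for every part of the name. If some rows are missing a part (an infraspecific epithet, say), one option is to drop the missing parts before building the query. The sketch below uses build_query(), a hypothetical helper written for this example and not part of kewr. Whether match_krs() copes with NA values on its own isn't covered here, so this just sidesteps the question.

```r
# Hypothetical helper (not part of kewr): build a query list from named
# name parts, dropping any that are NA or empty strings.
# Each part is assumed to be a single value, as it is inside rowwise().
build_query <- function(...) {
  parts <- list(...)
  missing_part <- vapply(
    parts,
    function(p) is.na(p) || !nzchar(p),
    logical(1)
  )
  parts[!missing_part]
}

# The missing infraspecific epithet is left out of the query.
query <- build_query(
  genus = "Myrcia",
  species = "magnoliifolia",
  infra = NA,
  author = "O.Berg"
)
names(query)
```

Inside the rowwise pipeline, the call would then become match_krs(build_query(genus=genus, species=species, author=author)).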