Calculate genome 'novelty' from a list of selection sets returned by pick_derep_sets

Usage

calculate_novelty(selection_set_results)

Arguments

selection_set_results

a tibble of selection sets, as returned by pick_derep_sets

Value

a data frame with the novelty scores of all genomes in the list. The 'novelty' score for each genome is (number of selections) / (median rank of selection).
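The arithmetic behind the score can be checked by hand. The sketch below is an illustration only, not the internals of calculate_novelty: it copies median_rank and number_selections for the first four genomes from the example output on this page and divides one by the other. The column name novelty_check is hypothetical and introduced only for this check.

```r
# Values copied from the first four rows of the example output on this page.
scores <- data.frame(
  asm_acc = c("genome_55", "genome_19", "genome_31", "genome_72"),
  median_rank = c(2, 3, 2.5, 3.5),
  number_selections = c(15L, 10L, 8L, 6L)
)

# Novelty as described above: number of selections divided by median rank.
# 'novelty_check' is a hypothetical column name used only for this sketch.
scores$novelty_check <- scores$number_selections / scores$median_rank

# Round to three significant digits, matching the precision of the printed tibble.
scores$novelty_check <- signif(scores$novelty_check, 3)

print(scores)
```

The rounded values should agree with the novelty_score column for the same genomes. A genome selected often and at an early (low) median rank scores highest.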

Examples

sets <- pick_derep_sets(example_pangenome_matrix)
genome_novelty <- calculate_novelty(sets)
genome_novelty
#> # A tibble: 26 × 8
#>    asm_acc   median_rank number_selections best_rank worst_rank novelty_score
#>    <chr>           <dbl>             <int>     <int>      <int>         <dbl>
#>  1 genome_55         2                  15         2          4         7.5  
#>  2 genome_19         3                  10         3          4         3.33 
#>  3 genome_31         2.5                 8         2          4         3.2  
#>  4 genome_72         3.5                 6         2          5         1.71 
#>  5 genome_11         3                   5         2          5         1.67 
#>  6 genome_8          2                   3         2          4         1.5  
#>  7 genome_85         4                   5         2          5         1.25 
#>  8 genome_20         3                   3         2          5         1    
#>  9 genome_28         3                   3         3          3         1    
#> 10 genome_60         4.5                 4         3          5         0.889
#> # ℹ 16 more rows
#> # ℹ 2 more variables: log_novelty <dbl>, RANK <int>