
Get a subset of a pangenome that tries to represent the gene presence/absence diversity

Selects a random genome, then selects the genome most distant from it, and continues selecting genomes until the distances fall below CUTOFF or max_genomes genomes have been chosen.

Usage

get_pangenome_representatives_jaccard(
  pan_mat = NULL,
  pan_dist = NULL,
  SEED = 3,
  verbose = FALSE,
  CUTOFF = 0.5,
  max_genomes = 1000
)

Arguments

pan_mat

gene presence/absence (PA) matrix

pan_dist

distance matrix, used when the PA matrix is not provided

SEED

random seed

verbose

print progress messages?

CUTOFF

stop choosing new genomes when distances are below this level

max_genomes

stop choosing new genomes when this many genomes are chosen
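
To illustrate how pan_mat and pan_dist relate, the sketch below derives a Jaccard distance matrix from a toy presence/absence matrix using base R. It assumes genomes are in rows and genes in columns; the orientation this function expects is not stated on this page, and the genome and gene names are made up.

```r
# Toy presence/absence matrix: genomes in rows, genes in columns
# (assumed orientation, not confirmed by this page).
pa <- matrix(
  c(1, 1, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), paste0("gene", 1:4))
)

# stats::dist with method = "binary" gives the Jaccard distance between rows:
# genes present in only one genome / genes present in at least one genome.
jacc <- as.matrix(dist(pa, method = "binary"))
jacc
```

Here g1 and g2 share two of the three genes they carry between them, so they are close, while g1 and g3 share no genes and are maximally distant.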

Value

A tibble with three columns: 1) asm_acc, 2) min_jacc, 3) order

Examples

# get_pangenome_representatives_jaccard(pan_dist)
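
The greedy selection described at the top of this page can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: `pick_representatives_sketch` is a hypothetical helper, the reading of min_jacc as "distance to the nearest genome already chosen" is a guess, and it returns a plain data frame rather than a tibble to avoid extra dependencies.

```r
# Hypothetical helper, NOT the package function: greedy farthest-first selection.
pick_representatives_sketch <- function(d, seed = 3, cutoff = 0.5,
                                        max_genomes = 1000) {
  d <- as.matrix(d)
  set.seed(seed)
  chosen <- sample(rownames(d), 1)   # random starting genome
  min_d <- NA_real_                  # the first pick has no distance
  while (length(chosen) < min(max_genomes, nrow(d))) {
    rest <- setdiff(rownames(d), chosen)
    # distance from each remaining genome to its nearest chosen genome
    nearest <- apply(d[rest, chosen, drop = FALSE], 1, min)
    if (max(nearest) < cutoff) break # everything left is close to a pick
    chosen <- c(chosen, rest[which.max(nearest)])
    min_d <- c(min_d, max(nearest))
  }
  # column names mirror the documented return value; their meaning is assumed
  data.frame(asm_acc = chosen, min_jacc = min_d, order = seq_along(chosen))
}

# Toy data: four made-up genomes (rows) by four genes (columns).
pa <- matrix(
  c(1, 1, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    0, 0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("g", 1:4), paste0("gene", 1:4))
)
reps <- pick_representatives_sketch(dist(pa, method = "binary"), cutoff = 0.6)
reps
```

With this toy matrix the sketch stops after two genomes whatever the starting genome, because every remaining genome is then within 0.6 of one already chosen.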