Generate the KNN data with distances
This function generates the k-nearest neighbours (kNN) from a given embedding.
Three approximate nearest neighbour algorithms are implemented, with different
trade-offs between speed and accuracy. "annoy" is faster and based on the
Approximate Nearest Neighbors Oh Yeah algorithm; "hnsw" implements a
Hierarchical Navigable Small World vector search that is slower, but more
precise. Lastly, there is the option of "nndescent", a Rust-based
implementation of the PyNNDescent algorithm. This version skips the index
generation and can be faster on smaller data sets. This version of the
function returns an sc_knn object that can be used in other functions.
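For orientation, all three algorithms approximate the result of an exact nearest neighbour search. The base R sketch below shows that exact search on a toy embedding. It is not how generate_knn_sc() is implemented, and the names exact_knn, embd, indices and dist are illustrative assumptions, not the layout of the sc_knn object.

```r
# Toy embedding: 6 cells x 2 dimensions (hypothetical data).
embd <- rbind(
  c(0, 0), c(0, 1), c(1, 0),
  c(10, 10), c(10, 11), c(11, 10)
)
rownames(embd) <- paste0("cell_", seq_len(nrow(embd)))

# Exact (brute-force) kNN with Euclidean distances.
# Illustrative only: quadratic in the number of cells.
exact_knn <- function(embd, k) {
  n <- nrow(embd)
  # Full pairwise distance matrix.
  d <- unname(as.matrix(dist(embd)))
  # A cell is not its own neighbour.
  diag(d) <- Inf
  # Indices of the k closest cells for every cell (n x k matrix).
  idx <- matrix(
    vapply(seq_len(n), function(i) order(d[i, ])[seq_len(k)], integer(k)),
    nrow = n, ncol = k, byrow = TRUE
  )
  # Distances matching those indices (n x k matrix).
  dst <- matrix(
    vapply(seq_len(n), function(i) d[i, idx[i, ]], numeric(k)),
    nrow = n, ncol = k, byrow = TRUE
  )
  list(indices = idx, dist = dst)
}

knn <- exact_knn(embd, k = 2L)
```

Each row of knn$indices holds the neighbours of one cell and the matching row of knn$dist holds their distances. Approximate methods such as those listed above trade a small loss in recall for much better scaling on large data sets.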
Usage
generate_knn_sc(
  object,
  embd_to_use = "pca",
  cells_to_use = NULL,
  no_embd_to_use = NULL,
  neighbours_params = params_sc_neighbours(),
  seed = 42L,
  .verbose = TRUE
)
Arguments
- object
SingleCells class.
- embd_to_use
String. The embedding to use. Whichever you choose, it must be present in the object.
- cells_to_use
String. Optional cell names to include in the generation of the kNN graph. If NULL, all (filtered) cells in the object will be used.
- no_embd_to_use
Optional integer. Number of embedding dimensions to use. If NULL, all will be used.
- neighbours_params
List. Output of params_sc_neighbours(). A list with the following items:
  - full_snn - Boolean. Whether to generate the full shared nearest neighbour (sNN) graph, with edges between all cells instead of only between neighbours.
  - pruning - Numeric. Weights below this threshold are set to 0 when generating the sNN graph.
  - snn_similarity - String. One of c("rank", "jaccard"). Defines how the weights of the sNN graph are calculated. For details, please see params_sc_neighbours().
  - knn - List of kNN parameters. See params_knn_defaults() for available parameters and their defaults.
- seed
Integer. For reproducibility.
- .verbose
Boolean. Controls verbosity and returns run times.
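To make the pruning and "jaccard" options more concrete, here is a minimal base R sketch of a Jaccard-weighted sNN graph built from a kNN index matrix. It assumes the common definition (shared neighbours divided by the union of both neighbour sets) and computes weights between all pairs of cells. The exact definition used by this package is documented in params_sc_neighbours() and may differ, for example in whether a cell counts as its own neighbour. The names knn_idx and jaccard_snn are hypothetical.

```r
# Hypothetical kNN index matrix: one row per cell, k = 3 neighbours each.
knn_idx <- rbind(
  c(2L, 3L, 4L),
  c(1L, 3L, 4L),
  c(1L, 2L, 4L),
  c(1L, 2L, 5L),
  c(4L, 6L, 1L),
  c(5L, 4L, 1L)
)

# Jaccard-weighted sNN graph as a dense, symmetric weight matrix.
# Assumption: weight = |shared neighbours| / |union of neighbours|.
jaccard_snn <- function(knn_idx, pruning = 0) {
  n <- nrow(knn_idx)
  w <- matrix(0, n, n)
  for (i in seq_len(n - 1L)) {
    for (j in (i + 1L):n) {
      shared <- length(intersect(knn_idx[i, ], knn_idx[j, ]))
      total <- length(union(knn_idx[i, ], knn_idx[j, ]))
      w[i, j] <- w[j, i] <- shared / total
    }
  }
  # Weights below the pruning threshold are set to 0.
  w[w < pruning] <- 0
  w
}

w_full <- jaccard_snn(knn_idx)
w_pruned <- jaccard_snn(knn_idx, pruning = 0.25)
```

In this toy input, cells 1 and 2 share two of four distinct neighbours, while cells 1 and 5 share only one of five, so the weaker edge is the one removed once a pruning threshold of 0.25 is applied.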