Assign score and labels from raw data

# S4 method for Moanin
splines_kmeans_predict(
  object,
  kmeans_clusters,
  data = NULL,
  method = c("distance", "goodnessOfFit"),
  ...
)

# S4 method for Moanin
splines_kmeans_score_and_label(
  object,
  kmeans_clusters,
  data = NULL,
  proportion_genes_to_label = 0.5,
  max_score = NULL,
  previous_scores = NULL,
  rescale_separately = FALSE
)

Arguments

object

the Moanin object that contains the basis functions used in creating the clusters

kmeans_clusters

List returned by splines_kmeans

data

The data to predict. If not given, assay(object) is used. If given, data must have the same number of columns as object.

method

If "distance", predicts labels based on the distance of the data to the kmeans centroids. If "goodnessOfFit", acts as a wrapper around splines_kmeans_score_and_label and assigns labels based on goodness of fit, including any filtering.
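The "distance" option can be pictured as a nearest-centroid assignment. The following is a minimal base-R sketch of that idea only, not the package's implementation: the helper `nearest_centroid`, the use of squared Euclidean distance, and the toy data are all assumptions made for illustration, and any spline fitting or rescaling the package performs beforehand is left out.

```r
# Hypothetical helper (not part of moanin): assign each row of `x` to the
# row of `centroids` with the smallest squared Euclidean distance.
nearest_centroid <- function(x, centroids) {
  # Mirrors the documented requirement that the columns of `data` match
  stopifnot(ncol(x) == ncol(centroids))
  # d[i, k] = squared distance between row i of x and centroid k
  d <- vapply(
    seq_len(nrow(centroids)),
    function(k) rowSums(sweep(x, 2, centroids[k, ], "-")^2),
    numeric(nrow(x))
  )
  d <- matrix(d, nrow = nrow(x))  # stay a matrix even if x has a single row
  labels <- apply(d, 1, which.min)
  names(labels) <- rownames(x)
  labels
}

# Toy data (assumed): two centroids over three samples, three "genes"
centroids <- rbind(c(0, 0, 0), c(1, 2, 3))
x <- rbind(
  geneA = c(0.1, -0.2, 0.0),
  geneB = c(0.9, 2.1, 2.8),
  geneC = c(1.2, 1.7, 3.3)
)
toy_labels <- nearest_centroid(x, centroids)
toy_labels
```

Unlike the goodness-of-fit route, a plain nearest-centroid rule of this kind gives every row a label; it has no filtering step.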

...

arguments passed to splines_kmeans_score_and_label

proportion_genes_to_label

float, optional, default: 0.5. Proportion of genes to label. If max_score is provided, will label genes that are either in the top `proportion_genes_to_label` or with a score below `max_score`.

max_score

optional, default: NULL. When provided, only genes with a score below max_score are labelled. If NULL, this option is ignored.

previous_scores

matrix of scores, optional. Allows the user to supply the score matrix from a previous run of splines_kmeans_score_and_label and redo only the filtering (e.g. to change proportion_genes_to_label without recomputing the scores).
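To see why redoing only the filtering is cheap, the sketch below relabels from a stored score matrix in base R. It is a rough illustration under stated assumptions, not the package's code: the helper `relabel_from_scores` is hypothetical, the cutoff is assumed to be a quantile of each gene's best score, the interaction with max_score is not modelled, and the scores are random toy values.

```r
# Hypothetical helper (not part of moanin): redo the filtering step from a
# stored score matrix with genes in rows and clusters in columns.
relabel_from_scores <- function(scores, proportion_genes_to_label = 0.5) {
  # Best (lowest) score of each gene across clusters
  best_score <- apply(scores, 1, min)
  # Assumption: the cutoff is the requested quantile of the best scores
  cutoff <- stats::quantile(best_score,
                            probs = proportion_genes_to_label,
                            names = FALSE)
  labels <- apply(scores, 1, which.min)
  labels[best_score > cutoff] <- NA  # genes above the cutoff stay unlabelled
  list(labels = labels, score_cutoff = cutoff)
}

# Toy scores (assumed): 10 genes, 4 clusters
set.seed(1)
toy_scores <- matrix(runif(40), nrow = 10,
                     dimnames = list(paste0("gene", 1:10), NULL))

# Same scores, two different proportions, no rescoring needed
half  <- relabel_from_scores(toy_scores, proportion_genes_to_label = 0.5)
fifth <- relabel_from_scores(toy_scores, proportion_genes_to_label = 0.2)
sum(!is.na(half$labels))
sum(!is.na(fifth$labels))
```

With the package itself, the same effect is obtained by passing the earlier scores matrix as previous_scores together with the new proportion_genes_to_label.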

rescale_separately

logical, whether to score separately within grouping variable

Value

splines_kmeans_predict returns a vector giving the labels for the given data.

splines_kmeans_score_and_label returns a list consisting of:

  • labels: the label or cluster assigned to each gene based on the cluster with the best (i.e. lowest) score, with no label given to genes that do not have a score lower than a specified quantity

  • scores: the matrix of size n_genes x n_cluster (genes in rows and clusters in columns, as in the example output), containing for each gene and each cluster, the goodness of fit score

  • score_cutoff: the required cutoff for a gene receiving an assignment
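The relationship between the three elements can be sketched in a few lines of base R. The scores below are copied from the head() output in the Examples section; the cutoff of 0.8 is an assumption chosen only for illustration (the value actually used is the one returned as score_cutoff), and the code is a reading of the description above rather than the package's implementation.

```r
# Scores copied from the Examples output: genes in rows, clusters in columns,
# lower is better.
example_scores <- rbind(
  NM_009912          = c(1.0000000, 0.4310798, 1.0000000),
  NM_008725          = c(0.9005695, 1.0000000, 0.9954105),
  NM_007473          = c(1.0000000, 0.9951235, 0.9660120),
  ENSMUST00000094955 = c(0.9792864, 1.0000000, 0.6498697),
  NM_001042489       = c(0.9260287, 1.0000000, 0.9287973),
  NM_008159          = c(1.0000000, 0.6188498, 0.9630068)
)

# Illustrative cutoff (an assumption, not the value computed by the package)
assumed_cutoff <- 0.8

# Best score per gene and the cluster that achieves it
best_score <- apply(example_scores, 1, min)
example_labels <- apply(example_scores, 1, which.min)

# Genes whose best score is not below the cutoff get no label
example_labels[best_score >= assumed_cutoff] <- NA
example_labels
```

Under this assumed cutoff the six genes receive the labels 2, NA, NA, 3, NA, 2, which agrees with the head(scores_and_labels$labels) output shown in the Examples.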

Examples

data(exampleData)
moanin <- create_moanin_model(data=testData, meta=testMeta)
# Cluster on a subset of genes
kmClusters <- splines_kmeans(moanin[1:50,], n_clusters=3)
# Get scores on all genes
scores_and_labels <- splines_kmeans_score_and_label(object=moanin, kmClusters)
head(scores_and_labels$scores)
#>                         [,1]      [,2]      [,3]
#> NM_009912          1.0000000 0.4310798 1.0000000
#> NM_008725          0.9005695 1.0000000 0.9954105
#> NM_007473          1.0000000 0.9951235 0.9660120
#> ENSMUST00000094955 0.9792864 1.0000000 0.6498697
#> NM_001042489       0.9260287 1.0000000 0.9287973
#> NM_008159          1.0000000 0.6188498 0.9630068
head(scores_and_labels$labels)
#>          NM_009912          NM_008725          NM_007473 ENSMUST00000094955
#>                  2                 NA                 NA                  3
#>       NM_001042489          NM_008159
#>                 NA                  2
# Should be the same as above, but only the assignments
predictLabels1 <- splines_kmeans_predict(object=moanin, kmClusters,
    method="goodnessOfFit")
# Instead use distance to centroid:
predictLabels2 <- splines_kmeans_predict(object=moanin, kmClusters,
    method="distance")