Assign score and labels from raw data

# S4 method for Moanin
splines_kmeans_predict(
  object,
  kmeans_clusters,
  data = NULL,
  method = c("distance", "goodnessOfFit"),
  ...
)

# S4 method for Moanin
splines_kmeans_score_and_label(
  object,
  kmeans_clusters,
  data = NULL,
  proportion_genes_to_label = 0.5,
  max_score = NULL,
  previous_scores = NULL,
  rescale_separately = FALSE
)

Arguments

object	the Moanin object that contains the basis functions used in creating the clusters
kmeans_clusters	List returned by `splines_kmeans`
data	the data to predict. If not given, will use `assay(object)`. If given, the number of columns of `data` must match that of `object`
method	If "distance", predicts labels based on the distance of the data to the kmeans centroids. If "goodnessOfFit", acts as a wrapper for `splines_kmeans_score_and_label`, assigning labels based on goodness of fit, including any filtering.
...	arguments passed to `splines_kmeans_score_and_label`
proportion_genes_to_label	float, optional, default: 0.5. Proportion of genes to label. If `max_score` is provided, labels genes that are either in the top `proportion_genes_to_label` or have a score below `max_score`.
max_score	optional, default: NULL. When provided, only genes with a score below this value are labeled. If NULL, this option is ignored.
previous_scores	matrix of scores, optional. Allows the user to supply the matrix of scores from a previous run of `splines_kmeans_score_and_label` and redo only the filtering (e.g. to change `proportion_genes_to_label` without recalculating the scores).
rescale_separately	logical, whether to score separately within grouping variable

Value

splines_kmeans_predict returns a vector giving the labels for the given data.

splines_kmeans_score_and_label returns a list consisting of

labels	the label or cluster assigned to each gene based on the cluster with the best (i.e. lowest) score, with no label given to genes that do not have a score lower than a specified quantity
scores	a matrix of size n_genes x n_clusters containing, for each gene and each cluster, the goodness-of-fit score
score_cutoff	the required cutoff for a gene receiving an assignment
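
How `labels` and `score_cutoff` relate to `scores` can be sketched in base R. This is only an illustration and not the package's implementation: `label_by_score` is a hypothetical helper, the cutoff is assumed to be a quantile of each gene's best score, `max_score` is assumed to widen that cutoff (following the description of `proportion_genes_to_label`), and the toy matrix holds just the six rows printed by `head(scores_and_labels$scores)` in the Examples.

```r
# Toy score matrix: the six rows printed in the Examples (genes x clusters)
scores <- matrix(
  c(1.0000000, 0.4310798, 1.0000000,
    0.9005695, 1.0000000, 0.9954105,
    1.0000000, 0.9951235, 0.9660120,
    0.9792864, 1.0000000, 0.6498697,
    0.9260287, 1.0000000, 0.9287973,
    1.0000000, 0.6188498, 0.9630068),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("NM_009912", "NM_008725", "NM_007473",
                    "ENSMUST00000094955", "NM_001042489", "NM_008159"), NULL)
)

# Hypothetical helper (NOT part of moanin): label each gene with its
# best-scoring cluster, unless its best score is above the cutoff.
label_by_score <- function(scores, proportion_genes_to_label = 0.5,
                           max_score = NULL) {
  best_score <- apply(scores, 1, min)          # lowest score per gene
  best_cluster <- apply(scores, 1, which.min)  # cluster achieving it
  # Assumption: the cutoff is the requested quantile of the best scores
  cutoff <- stats::quantile(best_score, probs = proportion_genes_to_label,
                            names = FALSE)
  # Assumption: max_score can only enlarge the set of labeled genes
  if (!is.null(max_score)) cutoff <- max(cutoff, max_score)
  labels <- ifelse(best_score <= cutoff, best_cluster, NA_integer_)
  list(labels = labels, score_cutoff = cutoff)
}

toy <- label_by_score(scores)
toy$labels        # best cluster per gene, NA where the gene is filtered out
toy$score_cutoff  # cutoff computed from these six genes only
```

On these six genes the sketch labels the same three genes (clusters 2, 3 and 2) as the `labels` output in the Examples, although the cutoff there is computed over all genes rather than six.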

Examples

data(exampleData)
moanin <- create_moanin_model(data=testData, meta=testMeta)
# Cluster on a subset of genes
kmClusters <- splines_kmeans(moanin[1:50,], n_clusters=3)
# get scores on all genes
scores_and_labels <- splines_kmeans_score_and_label(object=moanin, kmClusters)
head(scores_and_labels$scores)
#>                         [,1]      [,2]      [,3]
#> NM_009912          1.0000000 0.4310798 1.0000000
#> NM_008725          0.9005695 1.0000000 0.9954105
#> NM_007473          1.0000000 0.9951235 0.9660120
#> ENSMUST00000094955 0.9792864 1.0000000 0.6498697
#> NM_001042489       0.9260287 1.0000000 0.9287973
#> NM_008159          1.0000000 0.6188498 0.9630068
head(scores_and_labels$labels)
#>          NM_009912          NM_008725          NM_007473 ENSMUST00000094955 
#>                  2                 NA                 NA                  3 
#>       NM_001042489          NM_008159 
#>                 NA                  2 
# Should give the same labels as above, but returns only the assignments
predictLabels1 <- splines_kmeans_predict(object=moanin, kmClusters,
     method="goodnessOfFit")
# Instead use distance to centroid:
predictLabels2 <- splines_kmeans_predict(object=moanin, kmClusters,
     method="distance")
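
The `previous_scores` argument described above can be used to change the filtering without recomputing the scores. The following is a minimal sketch that assumes the moanin package and its `exampleData` are installed; the value 0.8 and the object name `relabelled` are arbitrary choices for illustration.

```r
library(moanin)

data(exampleData)
moanin <- create_moanin_model(data=testData, meta=testMeta)
kmClusters <- splines_kmeans(moanin[1:50,], n_clusters=3)

# First run: computes the scores and labels with the default proportion (0.5)
scores_and_labels <- splines_kmeans_score_and_label(object=moanin, kmClusters)

# Second run: reuse the score matrix and only redo the filtering,
# this time asking for a larger proportion of genes to be labeled
relabelled <- splines_kmeans_score_and_label(object=moanin, kmClusters,
     previous_scores=scores_and_labels$scores,
     proportion_genes_to_label=0.8)

# Compare how many genes are left unlabeled (NA) in each run
sum(is.na(scores_and_labels$labels))
sum(is.na(relabelled$labels))
```

Because only the filtering step is repeated, this should be noticeably cheaper than a full rerun when the number of genes is large.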