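Farthest point sampling returns a reordering of the points of the metric space, P = p_1, ..., p_k, such that each p_i is the point farthest from the first i-1 points, that is, the point whose distance to its nearest neighbour among p_1, ..., p_{i-1} is largest.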
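
The greedy rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `fps_sketch` and its arguments are hypothetical names, the input is assumed to be a full symmetric distance matrix, and ties are broken by taking the lowest index.

```r
# Hypothetical helper, not part of the package: greedy farthest-first ordering
# computed from a precomputed, symmetric distance matrix `d`.
fps_sketch <- function(d, k = nrow(d), start = 1L) {
  d <- as.matrix(d)
  n <- nrow(d)
  stopifnot(ncol(d) == n, k >= 1L, k <= n, start >= 1L, start <= n)
  ord <- integer(k)
  ord[1] <- start
  # min_dist[j]: distance from point j to its nearest already-chosen point
  min_dist <- d[start, ]
  min_dist[start] <- -Inf              # a chosen point is never picked again
  for (i in seq_len(k - 1L) + 1L) {
    nxt <- which.max(min_dist)         # ties go to the lowest index
    ord[i] <- nxt
    min_dist <- pmin(min_dist, d[nxt, ])
    min_dist[ord[seq_len(i)]] <- -Inf
  }
  ord
}

# Four points on a line at 0, 1, 4 and 10, starting from the first one
x <- c(0, 1, 4, 10)
d <- as.matrix(dist(x))
fps_sketch(d)
#> [1] 1 4 3 2
```

Starting at 0, the farthest point is 10; of the remaining points, 4 is farther from its nearest chosen neighbour (distance 4) than 1 is (distance 1), so it comes next.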

farthest_point_sampling(mat, metric = "precomputed", k = nrow(mat),
  initial_point_index = 1L, return_clusters = FALSE)

Arguments

mat

Original distance matrix when metric = "precomputed"; otherwise a data matrix with one point per row, as in the examples

metric

Distance metric to use (either "precomputed" or a metric from rdist)

k

Number of points to sample

initial_point_index

Index of p_1

return_clusters

For each point, should the index of its closest sampled (farthest) point be returned?
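
One plausible reading of return_clusters is that every point is assigned to the nearest of the sampled points. The base R sketch below illustrates that idea only; `nearest_sampled` is a hypothetical helper, and the shape of the value the package actually returns is not documented here, so it should not be taken as the package's output format.

```r
# Hypothetical helper, not the package's code: for every point, the position
# (within `sampled`) of the nearest sampled point.
nearest_sampled <- function(d, sampled) {
  d <- as.matrix(d)
  # which.min picks the first minimum, so ties go to the earlier sampled point
  unname(apply(d[, sampled, drop = FALSE], 1, which.min))
}

# Four points on a line at 0, 1, 4 and 10; points 1 and 4 are the samples
x <- c(0, 1, 4, 10)
d <- as.matrix(dist(x))
nearest_sampled(d, sampled = c(1L, 4L))
#> [1] 1 1 1 2
```

Indexing `sampled` with the result would give the original row index of each point's nearest sample instead of its position in the sample.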

Examples

# generate data
df <- matrix(runif(200), ncol = 2)
dist_mat <- pdist(df)

# farthest point sampling
fps <- farthest_point_sampling(dist_mat)
fps2 <- farthest_point_sampling(df, metric = "euclidean")
all.equal(fps, fps2)
#> [1] TRUE

# have a look at the fps distance matrix
rdist(df[fps[1:5], ])
#>           1         2         3         4
#> 2 1.0924360
#> 3 0.9280582 0.8499785
#> 4 0.8029144 0.8977409 1.3558406
#> 5 0.5409166 0.5515509 0.7071422 0.6487099

dist_mat[fps, fps][1:5, 1:5]
#>           [,1]      [,2]      [,3]      [,4]      [,5]
#> [1,] 0.0000000 1.0924360 0.9280582 0.8029144 0.5409166
#> [2,] 1.0924360 0.0000000 0.8499785 0.8977409 0.5515509
#> [3,] 0.9280582 0.8499785 0.0000000 1.3558406 0.7071422
#> [4,] 0.8029144 0.8977409 1.3558406 0.0000000 0.6487099
#> [5,] 0.5409166 0.5515509 0.7071422 0.6487099 0.0000000