11a: non-Metric Multidimensional Scaling (nMDS)

Published

2026/06/15

TipMaterial Required for This Chapter
Type Name Link
Theory Numerical Ecology with R See pages 145-151
Slides nMDS lecture slides 💾 BCB743_11_nMDS.pdf
Data The Doubs River data 💾 Doubs.RData
ImportantTasks to Complete in This Chapter

Non-Metric Multidimensional Scaling (nMDS) is a rank-based indirect gradient analysis that uses a distance or dissimilarity matrix as its input (either pre-calculated using vegdist() or constructed internal to the metaMDS() function via the dist argument). Should one supply a ‘raw’ species × site table, the default dissimilarity matrix is Bray-Curtis dissimilarity, but any other dissimilarity index in vegdist() can be specified. Unlike other ordination methods such as Principal Component Analysis (PCA) and Correspondence Analysis (CA), which aim to maximise variance or correspondence between sites, nMDS focuses on representing the pairwise dissimilarities between sites in an ordination space. It does not use the raw distances or dissimilarities directly; instead, these values are replaced with their ranks, which is why the method is termed “non-metric.”

nMDS is the non-metric equivalent of Principal Coordinates Analysis (PCoA); in fact, PCoA is sometimes referred to as metric multidimensional scaling. PCoA and nMDS can both produce ordinations of objects from any distance or dissimilarity matrix. However, nMDS does not preserve the exact distances among objects in an ordination plot. Instead, it tries to represent the ordering (rank) relationships among objects as accurately as possible on a specified number of axes. This non-linear mapping of dissimilarities onto a low-dimensional ordination space means that the Euclidean distances of points in the ordination space are rank-order similar to the original community dissimilarities.

The two methods sit one step apart, and the table fixes the distinction:

PCoA nMDS
preserves distances preserves ranks
metric non-metric
eigenvalue decomposition iterative optimisation
keeps the distance magnitudes keeps only the ordering
diagnostic: eigenvalues diagnostic: stress

“Preserving the ranks” is worth seeing directly. Suppose four pairs of sites have Bray-Curtis dissimilarities of 0.10, 0.30, 0.55, and 0.80. nMDS does not try to reproduce those values. It tries only to keep the pairs in the same order of similarity, so that the most similar pair stays closest together in the plot and the least similar pair stays furthest apart (Figure 1). The spacing is free to change but the order is not.

Code
rank_demo <- tibble(
  pair = c("A-B", "A-C", "A-D", "A-E"),
  `Bray-Curtis\ndissimilarity` = c(0.10, 0.30, 0.55, 0.80),
  `distance in the\nnMDS plot` = c(0.30, 0.95, 1.25, 2.35)
) |>
  pivot_longer(-pair, names_to = "stage", values_to = "val") |>
  mutate(
    stage = factor(
      stage,
      levels = c("Bray-Curtis\ndissimilarity", "distance in the\nnMDS plot")
    )
  )

ggplot(rank_demo, aes(stage, val, group = pair, colour = pair)) +
  geom_line(linewidth = 0.7) +
  geom_point(size = 2.6) +
  geom_text(
    data = filter(rank_demo, stage == "Bray-Curtis\ndissimilarity"),
    aes(label = pair),
    hjust = 1.3,
    size = 3
  ) +
  scale_x_discrete(expand = expansion(mult = c(0.3, 0.15))) +
  labs(x = NULL, y = "value") +
  theme(legend.position = "none")
Figure 1: Rank preservation. Four site pairs are shown by their Bray-Curtis dissimilarity (left) and by the distance between the same pairs in the nMDS plot (right). The lines do not cross, so the order of the pairs is preserved, from most similar (A-B) to least similar (A-E). Their spacing changes, because nMDS keeps only the ordering and is free to stretch or shrink the gaps.

The ordination space in nMDS is metric, but the regression used to fit the dissimilarities to this space is non-metric. Because it uses only the rank order of the dissimilarities, nMDS makes fewer assumptions about the structure of the distances than the eigen-methods do, which helps when ecological data depart from the distributional or geometric forms those methods expect. This is a difference in assumptions, not a verdict that nMDS is the better method: PCA remains the right tool for linear, Euclidean structure, and CA for \(\chi^2\) structure. Like PCoA, nMDS can in principle take a dissimilarity built from any kind of variable, but in community ecology it is most often applied to species composition through measures such as Bray-Curtis, Jaccard, or Sørensen, and that is the use I develop here.

A new concept, not seen in the eigen-approaches, is the idea of ‘stress.’ Stress quantifies the discrepancy between the observed dissimilarities and the distances in the ordination space. Stress is visually presented as the scatter of observed dissimilarities against the expected monotone regression. Lower stress values indicate a better fit of the data to the ordination space. Because rank orders of dissimilarities cannot be exactly preserved by rank-orders of ordination distances in low-dimensional space, some stress cannot be avoided.

Stress is the nMDS counterpart of the goodness measures met in the earlier chapters, and it is read in the opposite direction, namely lower is better rather than higher. Each method reports how well its low-dimensional picture represents the data, under a different name:

Method Goodness measure
PCA variance explained
CA inertia explained
PCoA eigenvalues
nMDS stress (lower is better)

nMDS does have some limitations. The rank-based approach means that information about the magnitude of differences between site pairs is lost and this can be a disadvantage when the actual distances are important for interpretation. Also, nMDS can be computationally intensive with large datasets or when trying to minimise stress through numerous iterations.

After performing nMDS, I can interpret environmental structure with vegan’s envfit() and ordisurf() functions, as I have already seen in PCA, CA, and PCoA. As before, these functions fit environmental variables onto the ordination. They help show how environmental variables are associated with the species ordination, but they do not turn nMDS into a formal constrained ordination. If I require a statistical framework to assess categorical effects on observed dissimilarities, I can use PERMANOVA (Permutational Multivariate Analysis of Variance). These ideas are shown below.

Set-Up the Analysis Environment

library(tidyverse)
library(vegan)
library(viridis)
library(ggrepel) # for tidy biplot labels

# Files will be referenced using here::here() for absolute paths

The Doubs River Data

I continue to use the species data:

load(here::here(
  "data",
  "BCB743",
  "NEwR-2ed_code_data",
  "NEwR2-Data",
  "Doubs.RData"
))
spe <- dplyr::slice(spe, -8)

Do the nMDS

1spe_nmds <- metaMDS(spe, distance = "bray", trace = 0)
spe_nmds
1
I use trace = 0 to suppress the output of the iterations.

Call:
metaMDS(comm = spe, distance = "bray", trace = 0)

global Multidimensional Scaling using monoMDS

Data:     spe
Distance: bray

Dimensions: 2
Stress:     0.07376229
Stress type 1, weak ties
Best solution was repeated 1 time in 20 tries
The best solution was from try 8 (random start)
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'spe' 

By default, metaMDS() also has autotransform = TRUE, so it may transform raw community data before calculating dissimilarities. That is convenient for first-pass ordinations, but when you have already transformed the data or need exact agreement with another analysis, supply a precomputed distance matrix or set autotransform = FALSE.

NoteWhy “Best Solution from Try …”?

Unlike the eigen-methods, metaMDS() does not solve for its axes in a single step. It searches, starting from many random configurations and keeping the one with the lowest stress, because the optimisation can settle into a local minimum that is not the best available. I suppressed the running log with trace = 0; without it, metaMDS() prints lines such as Best solution was from try 7. The same reason explains why running it again can return a plot that is rotated or mirrored: the orientation of the axes is arbitrary, only the configuration of points relative to one another carries meaning.

As always, reading the help file (accessible as ?vegan::metaMDS) is invaluable, as it is for all other ordination techniques.

There is a summary method available, but it is not particularly useful and I do not display the output here:

summary(spe_nmds)

Although summary(spe_nmds) does not return anything interesting, the species and site scores are nevertheless available directly through the scores() command, and they can be plotted as a layer in ggplot2 if needed:

scores(spe_nmds)
$sites
         NMDS1       NMDS2
1  -1.78637175  0.81869913
2  -1.14365966 -0.15631784
3  -1.00357338 -0.07494361
4  -0.62138650 -0.07870613
5   0.07173345  0.45922084
6  -0.42772875 -0.15287513
7  -0.87059326 -0.23029890
8  -0.01407051 -0.86265587
9  -0.52650217 -0.40399735
10 -1.00849345 -0.37668417
11 -0.97239211 -0.16390821
12 -1.16277723  0.09662109
13 -0.80299016  0.12329202
14 -0.49321206  0.17907383
15 -0.18906758  0.27544156
16  0.08269164  0.12231269
17  0.29908969  0.11249489
18  0.44578111  0.14960037
19  0.76705354  0.28465318
20  0.86196372  0.37782328
21  0.95488428  0.44496834
22  0.75415306 -1.44430805
23  1.13005094 -0.63109804
24  0.85950436 -0.87214750
25  0.93308696  0.12369992
26  0.97486057  0.36375426
27  1.02339896  0.37391321
28  0.79599004  0.55881834
29  1.06857630  0.58355387

$species
          NMDS1       NMDS2
Cogo -0.9122165 -0.10543745
Satr -1.1122862 -0.22568856
Phph -0.7890604 -0.32127999
Babl -0.5341412 -0.28980694
Thth -0.9380770 -0.09798833
Teso -0.5249368  0.16290435
Chna  0.9013670  0.36642015
Pato  0.5170017  0.38504288
Lele  0.3300299  0.28546774
Sqce  0.3584408 -0.15432493
Baba  0.7035228  0.47487862
Albi  0.7278511  0.47023586
Gogo  0.6849497  0.30439877
Eslu  0.6092081  0.42158340
Pefl  0.6168840  0.50349584
Rham  0.9662932  0.58291461
Legi  0.9558530  0.50941947
Scer  0.9654409  0.55208266
Cyca  0.9581991  0.62828043
Titi  0.7238399  0.41810692
Abbr  1.0820133  0.67990080
Icme  1.1276516  0.78239723
Gyce  1.0739651  0.40773244
Ruru  0.7591053  0.15189845
Blbj  1.0946850  0.58067169
Alal  0.9932691  0.02815640
Anan  1.0087027  0.61236511

See Numerical Ecology with R (pp. 145 to 149) for information about the interpretation of a nMDS and the ordination diagrams shown below.

Ordination Diagrams

I create the ordination diagrams as before, but new concepts introduced here are stress, Shepard plots, and goodness of fit (Figure 2). The stress indicates the scatter of observed dissimilarities against an expected monotone regression, while a Shepard diagram plots ordination distances against original dissimilarities and adds a monotone or linear fit line to highlight this relationship. The stressplot() function also produces two fit statistics. The goodness-of-fit of the ordination is measured as the \(R^{2}\) of either a linear or a non-linear regression of the nMDS distances on the original ones.

opar <- par(no.readonly = TRUE)
par(mfrow = c(2, 2))
stressplot(spe_nmds, main = "Shepard plot")
ordiplot(
  spe_nmds,
  type = "t",
  cex = 1.2,
  main = paste0("nMDS stress = ", round(spe_nmds$stress, 2))
)
gof <- goodness(spe_nmds)
plot(spe_nmds, type = "t", cex = 1.2, main = "Goodness of fit")
points(spe_nmds, display = "sites", cex = gof * 200)
# ...bigger bubbles indicate a worse fit
par(opar)
Figure 2: nMDS ordination plots of the Doubs River species data showing a Shepard plot (top, left), the ordination diagram (top, right), and goodness of fit (bottom, right).

In the goodness-of-fit panel each site is drawn as a bubble whose size reflects how poorly its dissimilarities to the other sites are reproduced in the ordination. Large bubbles flag sites whose placement is a compromise, so their position should be read with more caution than that of the small-bubble sites.

A good rule of thumb (after Clarke 1993): stress <0.05 is an excellent representation in reduced dimensions, <0.1 is good, <0.2 is usable but only the broad pattern should be trusted, 0.2–0.3 is poor, and stress >0.3 is close to arbitrary.

NoteWhat Stress Means

Interpret stress as a mismatch between the fitted ordination distances and the monotone relationship to the original dissimilarities, not as a percentage of variance explained. Lower is better, and the thresholds above are rough guides rather than pass marks. A stress of 0.12 is not a failure; it says the community is a little too complex to flatten perfectly into two dimensions. Treat the number as a measure of how far to trust the fine detail of the plot, not as a score to be minimised for its own sake.

How to Read This nMDS

The diagnostics tell me the ordination is trustworthy; the next question is what it means ecologically. Figure 3 redraws the ordination with the sites coloured from source to mouth and the distinctive species labelled. The axis is oriented so that the source lies on the left.

Code
nmds_sites <- as.data.frame(scores(spe_nmds, display = "sites"))
nmds_sites$site <- as.integer(rownames(nmds_sites))
nmds_spp <- as.data.frame(scores(spe_nmds, display = "species"))
nmds_spp$lab <- rownames(nmds_spp)

# the orientation of nMDS axes is arbitrary; flip so the source (site 1) is on the left
flip <- if (nmds_sites$NMDS1[nmds_sites$site == 1] > 0) -1 else 1
nmds_sites$NMDS1 <- nmds_sites$NMDS1 * flip
nmds_spp$NMDS1 <- nmds_spp$NMDS1 * flip

nmds_spp$d <- sqrt(nmds_spp$NMDS1^2 + nmds_spp$NMDS2^2)
nmds_lab <- nmds_spp[nmds_spp$d > quantile(nmds_spp$d, 0.45), ]

ggplot(nmds_sites, aes(NMDS1, NMDS2)) +
  geom_hline(yintercept = 0, colour = "grey85") +
  geom_vline(xintercept = 0, colour = "grey85") +
  geom_point(aes(colour = site), size = 2) +
  scale_colour_viridis_c(name = "Site\n(1 = source)") +
  geom_point(
    data = nmds_spp,
    aes(NMDS1, NMDS2),
    colour = "seagreen4",
    shape = 3,
    size = 0.8
  ) +
  geom_text_repel(
    data = nmds_lab,
    aes(NMDS1, NMDS2, label = lab),
    colour = "seagreen4",
    size = 2.5,
    max.overlaps = Inf,
    segment.colour = "grey80"
  ) +
  annotate(
    "segment",
    x = min(nmds_sites$NMDS1) * 0.95,
    y = min(nmds_sites$NMDS2) * 1.08,
    xend = max(nmds_sites$NMDS1) * 0.95,
    yend = min(nmds_sites$NMDS2) * 1.08,
    arrow = arrow(length = unit(2.5, "mm")),
    colour = "firebrick"
  ) +
  annotate(
    "text",
    x = 0,
    y = min(nmds_sites$NMDS2) * 1.2,
    label = "upstream to downstream (NMDS1)",
    colour = "firebrick",
    size = 2.7
  ) +
  labs(x = "NMDS1", y = "NMDS2") +
  coord_equal()
Figure 3: An annotated nMDS (Bray-Curtis) of the Doubs fish data. Points are sites, coloured from the source (dark) to the mouth (yellow); green crosses are species, labelled where they sit away from the crowded centre. The first axis separates the upstream and downstream faunas.

The reading is the gradient I have met in every ordination of these data:

  • The sites fall along the first axis in river order. metaMDS() rotates its solution so that the widest spread lies on NMDS1, and here that spread is the source-to-mouth sequence, recovered from species composition alone.
  • The two faunas separate along NMDS1. The cool upper river holds trout-zone species, brown trout (Satr), grayling (Thth), bullhead (Cogo), and minnow (Phph); the lower river holds lowland species such as bream (Abbr), silver bream (Blbj), ruffe (Gyce), and eel (Anan).
  • NMDS1 is therefore the upstream-to-downstream river gradient, the same gradient recovered by the PCA, CA, and PCoA. The environmental vectors fitted below confirm it: dissolved oxygen and elevation point towards the trout end, and distance from source, nitrate, and organic load towards the lowland end.
  • The low stress (about 0.07) means the two-dimensional picture is faithful, so the gradient can be read from it with confidence.

This ordination is descriptive. When nMDS is used beside PERMANOVA, also check group dispersions with betadisper()/PERMDISP, as shown in the diatom case study, because apparent group separation can reflect differences in spread rather than differences in centroid location.

I can also build ordination plots from scratch to suit specific needs:

pl <- ordiplot(spe_nmds, type = "none", main = "nMDS fish abundances ")
points(pl, "sites", pch = 21, cex = 1.75, col = "grey80", bg = "grey80")
points(pl, "species", pch = 21, col = "turquoise", arrows = TRUE)
text(pl, "species", col = "blue4", cex = 0.9)
text(pl, "sites", col = "red4", cex = 0.9)

nMDS ordination plot of the Doubs River species data assembled from scratch.

nMDS ordination plot of the Doubs River species data assembled from scratch.

Or I can fit response surfaces using ordisurf() and project environmental drivers (Figure 4):

palette(viridis(8))
opar <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0.9, 0.5) + .1, mfrow = c(2, 2))

invisible(ordisurf(
  spe_nmds ~ Satr,
  data = spe,
  bubble = 3,
  family = quasipoisson,
  knots = 2,
  col = 6,
  display = "sites",
  main = "Salmo trutta fario"
))
abline(h = 0, v = 0, lty = 3)

invisible(ordisurf(
  spe_nmds ~ Scer,
  data = spe,
  bubble = 3,
  family = quasipoisson,
  knots = 2,
  col = 6,
  display = "sites",
  main = "Scardinius erythrophthalmus"
))
abline(h = 0, v = 0, lty = 3)

invisible(ordisurf(
  spe_nmds ~ Teso,
  data = spe,
  bubble = 3,
  family = quasipoisson,
  knots = 2,
  col = 6,
  display = "sites",
  main = "Telestes souffia"
))
abline(h = 0, v = 0, lty = 3)

invisible(ordisurf(
  spe_nmds ~ Cogo,
  data = spe,
  bubble = 3,
  family = quasipoisson,
  knots = 2,
  col = 6,
  display = "sites",
  main = "Cottus gobio"
))
abline(h = 0, v = 0, lty = 3)

env <- env[-8, ] # because we removed the eighth site in the spp data

# A posteriori projection of environmental variables onto the nMDS
# The last nMDS plot produced must be active
spe_nmds_env <- envfit(spe_nmds, env) # fit environmental vectors onto the ordination
plot(spe_nmds_env)

# Plot significant variables with a different colour
plot(spe_nmds_env, p.max = 0.05, col = "red")
par(opar)
Figure 4: nMDS ordination plots with species response surfaces of the Doubs River species data emphasising four species of fish: A) Satr, B) Scer, C) Teso, and D) Cogo. D) additionally has the environmental vectors projected on the plot, with the significant vectors shown in red.

Four Methods, One Goal

PCA, CA, PCoA, and nMDS all seek a low-dimensional picture of multivariate ecological data, and they differ in what they hold fixed. PCA and CA begin with the data matrix and preserve a particular geometry, Euclidean and \(\chi^2\) respectively. PCoA begins with a distance matrix and preserves the distances themselves. nMDS also begins with a distance matrix, but preserves only the rank ordering of the distances, trading exact magnitudes for freedom from distributional assumptions. That trade is why nMDS has become one of the most widely used unconstrained ordination methods in community ecology, particularly with Bray-Curtis dissimilarities of species composition.

All four are unconstrained: they recover whatever gradients the species or environmental data happen to contain, without testing them against an external hypothesis. The next step asks a sharper question, namely not what gradients exist, but how much of the community pattern a measured set of environmental variables can account for. That is the task of constrained ordination, Redundancy Analysis (RDA) and Canonical Correspondence Analysis (CCA), which follow next.

References

Reuse

Citation

BibTeX citation:
@online{smit2026,
  author = {Smit, A. J.},
  title = {11a: {non-Metric} {Multidimensional} {Scaling} {(nMDS)}},
  date = {2026-06-15},
  url = {https://tangledbank.netlify.app/BCB743/nMDS.html},
  langid = {en}
}
For attribution, please cite this work as:
Smit AJ (2026) 11a: non-Metric Multidimensional Scaling (nMDS). https://tangledbank.netlify.app/BCB743/nMDS.html.