library(vegan)
library(geodist)
library(tidyverse)
env <- read.csv(here::here("data", "BarentsFish_env.csv"))
# a. List the three pairs of sites that are furthest apart in terms of the
# geographical distance between them. For each pair, also provide the
# temperature and depth associated with each member of the pair
# extract the lon and lat
geo_dat <- env[, c("Longitude", "Latitude")]
BDC334 Practical Assessment
You have been provided with three files:
- BarentsFish.xls (a Microsoft Excel file with the full dataset and a description of the content on the first sheet)
- BarentsFish_env.csv (a CSV file with the environmental data)
- BarentsFish_spp.csv (a CSV file with the species data)
Import the CSV files into R and answer the questions below.
The assessment is out of a total of 50 marks and you have 2 hours to complete it.
You are welcome to use any online resources to help you complete the test, but you may not communicate with anyone else during the assessment.
Question 1
List the three pairs of sites that are furthest apart in terms of the geographical distance between them (i.e., the distance calculated from latitude and longitude must account for Earth’s curvature). For each pair, also provide the temperature and depth associated with each member of the pair.
List three pairs of sites that are closest together in terms of the geographical distance between them. For each pair, also provide the temperature and depth associated with each member of the pair.
Communicate the above output in a clear and concise manner, for example, using a table. The same applies to the rest of the questions.
Answer
Calculate the geographical distance using the Haversine function (assign half the marks if they used Euclidean distance). This step, or the next one, is possibly as far as you’ll get with the code I gave you. You can proceed manually from here by examining the matrices and cross-referencing with the data files for the environmental data.
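As a rough illustration of the Haversine step, the sketch below computes a great-circle distance matrix in base R. The site coordinates, the function name `haversine_matrix`, and the mean Earth radius of 6 371 000 m are assumptions made for this example and do not come from the assessment data. The geodist package loaded in the script offers the same calculation through `geodist()` with `measure = "haversine"`; the base R version only makes the formula explicit.

```r
# Hypothetical site coordinates in decimal degrees (not the Barents Sea data)
sites <- data.frame(
  Longitude = c(20.0, 25.0, 30.0),
  Latitude  = c(70.0, 72.0, 75.0)
)

# Pairwise Haversine distances in metres.
# `radius` is an assumed mean Earth radius; packages may use a different value.
haversine_matrix <- function(lon, lat, radius = 6371000) {
  # convert degrees to radians
  lon <- lon * pi / 180
  lat <- lat * pi / 180
  # pairwise differences in latitude and longitude
  dlat <- outer(lat, lat, "-")
  dlon <- outer(lon, lon, "-")
  # Haversine formula
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

geo_dist_demo <- haversine_matrix(sites$Longitude, sites$Latitude)
round(geo_dist_demo / 1000, 1) # distances in km
```

The result is a symmetric matrix with zeros on the diagonal, which is why only one triangle needs to be inspected when searching for the most distant pairs.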
You’ll have to scan the matrix for the three largest distances (or devise some other method) and find the three pairs of sites that are furthest apart. It is a pain to do by eye, but it is possible.
For my own convenience, I’ll calculate it more efficiently: set the diagonal and upper triangle to NA since we only need the lower triangle:
Find the indices of the three largest distances: get the order of the matrix values in decreasing order, and select the first three:
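The masking and ordering steps can be sketched on a small example. The 4 × 4 matrix `d` below is made up for illustration; the calls (`upper.tri()`, `order()`, `row()`, `col()`) are the same ones used in the answer code elsewhere in this memo.

```r
# Hypothetical symmetric distance matrix for four sites (values are made up)
d <- matrix(c(0, 5, 9, 2,
              5, 0, 7, 4,
              9, 7, 0, 8,
              2, 4, 8, 0), nrow = 4, byrow = TRUE)

# keep only the lower triangle so that each pair of sites is counted once
d[upper.tri(d, diag = TRUE)] <- NA

# positions (in column-major order) of the three largest remaining values;
# na.last = NA drops the masked cells
top3 <- order(d, decreasing = TRUE, na.last = NA)[1:3]

# translate the positions into row (Site1) and column (Site2) numbers
pairs_demo <- data.frame(
  Site1 = row(d)[top3],
  Site2 = col(d)[top3],
  Distance = d[top3]
)
pairs_demo
```

Switching to `decreasing = FALSE` returns the three closest pairs instead.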
Now, I combine the row and column indices into pairs. I then add the site temperature and depth values automagically, but you can manually accomplish the same:
largest_dist_pairs <- data.frame(
  Site1 = row_indices,
  Site2 = col_indices,
  Distance = geo_dist_mat[largest_dist_indices],
  Site1_temp = env$Temperature[row_indices],
  Site2_temp = env$Temperature[col_indices],
  Site1_depth = env$Depth[row_indices],
  Site2_depth = env$Depth[col_indices]
)
largest_dist_pairs # this is what you get marked on
  Site1 Site2 Distance Site1_temp Site2_temp Site1_depth Site2_depth
1    88    54 685108.5       4.45       0.35         167         272
2    88    55 652755.0       4.45       0.55         167         306
3    88    53 644203.2       4.45       1.15         167         311
Now we repeat for the sites that are closest together:
# b. List three pairs of sites that are closest together in terms of the
# **geographical distance** between them. For each pair, also provide the
# temperature and depth associated with each member of the pair.
# to do this, I'll adapt the code above to find the three smallest distances
shortest_dist_indices <- order(geo_dist_mat,
                               decreasing = FALSE, na.last = NA)[1:3]
# retrieve the row and column indices of these smallest distances
row_indices <- row(geo_dist_mat)[shortest_dist_indices]
col_indices <- col(geo_dist_mat)[shortest_dist_indices]
# combine the row and column indices into pairs
# I add the site temperature and depth values automagically
# but you can manually accomplish the same
shortest_dist_pairs <- data.frame(
  Site1 = row_indices,
  Site2 = col_indices,
  Distance = geo_dist_mat[shortest_dist_indices],
  Site1_temp = env$Temperature[row_indices],
  Site2_temp = env$Temperature[col_indices],
  Site1_depth = env$Depth[row_indices],
  Site2_depth = env$Depth[col_indices]
)
shortest_dist_pairs # this is what you get marked on
  Site1 Site2 Distance Site1_temp Site2_temp Site1_depth Site2_depth
1    55    53 12441.09       0.55       1.15         306         311
2    14    13 17410.30       1.95       2.85         358         332
3    34    32 28042.83       0.55       0.95         305         294
(/10)
Question 2
List the three pairs of sites that are furthest apart in terms of the environmental distance between them. For each pair, also state the environmental distance between them.
List three pairs of sites that are closest together in terms of the environmental distance between them. For each pair, also state the environmental distance between them.
Answer
The same approach applies here, but now we use Euclidean distances:
# a. List the three pairs of sites that are furthest apart in terms of the
# **environmental distance** between them. For each pair, also state the
# environmental distance between them.
# Again, I adapt pre-existing code but you'll do this manually as far
# as possible
# extract the temperature and depth and standardise
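env_dat <- decostand(env[, c("Depth", "Temperature")], MARGIN = 2,
                     method = "standardize")
# calculate the Euclidean distance
env_dist <- round(vegdist(env_dat, method = "euclidean", upper = FALSE), 2)
Your code will bring you to the above step, and from there you can accomplish the rest manually to assemble the table by hand. I’ll continue with more efficient code…
# convert to a matrix and set the diagonal and upper triangle to NA,
# as was done for the geographical distances in Question 1
env_dist_mat <- as.matrix(env_dist)
env_dist_mat[upper.tri(env_dist_mat, diag = TRUE)] <- NA
env_dist_mat[1:7, 1:7] # show the first seven sites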
     1    2    3    4    5    6  7
1   NA   NA   NA   NA   NA   NA NA
2 0.53   NA   NA   NA   NA   NA NA
3 0.96 1.36   NA   NA   NA   NA NA
4 0.74 1.18 0.24   NA   NA   NA NA
5 0.78 0.38 1.37 1.24   NA   NA NA
6 0.30 0.58 0.78 0.61 0.67   NA NA
7 0.38 0.56 0.81 0.66 0.59 0.11 NA
largest_dist_indices <- order(env_dist_mat,
                              decreasing = TRUE, na.last = NA)[1:3]
row_indices <- row(env_dist_mat)[largest_dist_indices]
col_indices <- col(env_dist_mat)[largest_dist_indices]
largest_dist_pairs <- data.frame(
  Site1 = row_indices,
  Site2 = col_indices,
  Distance = env_dist_mat[largest_dist_indices],
  Site1_temp = env$Temperature[row_indices],
  Site2_temp = env$Temperature[col_indices],
  Site1_depth = env$Depth[row_indices],
  Site2_depth = env$Depth[col_indices]
)
largest_dist_pairs # this is what you get marked on
  Site1 Site2 Distance Site1_temp Site2_temp Site1_depth Site2_depth
1    88    81     5.52       4.45       1.65         167         486
2    88    80     5.41       4.45       1.55         167         474
3    88    78     5.12       4.45       1.65         167         455
# b. List three pairs of sites that are closest together in terms of the
# **environmental distance** between them. For each pair, also state the
# environmental distance between them.
shortest_dist_indices <- order(env_dist_mat,
                               decreasing = FALSE, na.last = NA)[1:3]
row_indices <- row(env_dist_mat)[shortest_dist_indices]
col_indices <- col(env_dist_mat)[shortest_dist_indices]
shortest_dist_pairs <- data.frame(
  Site1 = row_indices,
  Site2 = col_indices,
  Distance = env_dist_mat[shortest_dist_indices],
  Site1_temp = env$Temperature[row_indices],
  Site2_temp = env$Temperature[col_indices],
  Site1_depth = env$Depth[row_indices],
  Site2_depth = env$Depth[col_indices]
)
shortest_dist_pairs # this is what you get marked on
  Site1 Site2 Distance Site1_temp Site2_temp Site1_depth Site2_depth
1    55    34     0.02       0.55       0.55         306         305
2    74    43     0.02       1.85       1.85         439         438
3    49    20     0.03       0.35       0.35         225         227
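To see why the environmental variables are standardised before the Euclidean distance is calculated, the following base R sketch may help. The depth and temperature values are hypothetical, and `scale()` and `dist()` are used here as base R stand-ins for the `decostand()` and `vegdist()` calls in the answer; both routes centre each variable on zero and divide by its standard deviation.

```r
# Hypothetical depth and temperature values for four sites (made up)
env_demo <- data.frame(
  Depth = c(150, 300, 450, 300),
  Temperature = c(4.0, 1.0, 1.5, 1.0)
)

# z-score each column (mean 0, standard deviation 1) so that depth does not
# dominate the distance simply because it is measured in larger numbers
env_std <- scale(env_demo)

# Euclidean distance between the standardised sites
env_dist_demo <- dist(env_std, method = "euclidean")
round(as.matrix(env_dist_demo), 2)

# for comparison: without standardising, the distances are driven almost
# entirely by the differences in depth
round(as.matrix(dist(env_demo)), 2)
```

Sites 2 and 4 have identical values in this toy example, so their environmental distance is zero regardless of standardisation.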
(/10)
Question 3
Is there a relationship between the environmental variables? Produce the code for this analysis and the evidence (both graphical and statistical) for the nature of this relationship. If a relationship is present, describe it.
Answer
                  Depth Temperature
Depth        1.00000000 -0.01820205
Temperature -0.01820205  1.00000000
No, there is no linear relationship between depth and temperature. The correlation coefficient is -0.02, which is very close to zero. This is confirmed by the flat line in the correlation plot.
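One possible way to produce both kinds of evidence is sketched below in base R. The data frame `env_demo` holds made-up values rather than the assessment data, and the choice of Pearson's correlation is an assumption; another coefficient could be justified equally well.

```r
# Hypothetical depth and temperature values (made up for illustration)
env_demo <- data.frame(
  Depth = c(150, 220, 300, 310, 380, 450, 470),
  Temperature = c(2.1, 0.6, 3.4, 1.2, 0.4, 2.9, 1.6)
)

# statistical evidence: correlation coefficient with a significance test
r_test <- cor.test(env_demo$Depth, env_demo$Temperature, method = "pearson")
r_test$estimate # correlation coefficient
r_test$p.value  # p-value for the null hypothesis of no linear association

# graphical evidence: scatterplot with a fitted straight line
# (written to a temporary PDF so the sketch also runs non-interactively)
pdf(file = tempfile(fileext = ".pdf"))
plot(Temperature ~ Depth, data = env_demo,
     xlab = "Depth", ylab = "Temperature")
abline(lm(Temperature ~ Depth, data = env_demo))
invisible(dev.off())
```

A coefficient near zero together with a near-horizontal fitted line would support the conclusion that the two variables are not linearly related.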
(/10)
Question 4
List the three pairs of sites that are furthest apart in terms of species composition between them. For each pair, also state the species dissimilarity between them.
List three pairs of sites that are closest together in terms of species composition between them. For each pair, also state the species dissimilarity between them.
Answer
# a. List the three pairs of sites that are furthest apart in terms of
# **species composition** between them. For each pair, also state the
# species dissimilarity between them.
spp_dat <- read.csv(here::here("data", "BarentsFish_spp.csv"))
# using Bray-Curtis for abundance data (could use something else)
spp_diss <- round(vegdist(spp_dat, method = "bray", upper = FALSE), 2)
spp_diss_matrix <- as.matrix(spp_diss)
spp_diss_matrix[upper.tri(spp_diss_matrix, diag = TRUE)] <- NA
largest_diss_indices <- order(spp_diss_matrix,
                              decreasing = TRUE, na.last = NA)[1:3]
row_indices <- row(spp_diss_matrix)[largest_diss_indices]
col_indices <- col(spp_diss_matrix)[largest_diss_indices]
largest_diss_pairs <- data.frame(
  Site1 = row_indices,
  Site2 = col_indices,
  Dissimilarity = spp_diss_matrix[largest_diss_indices]
)
largest_diss_pairs # this is what you get marked on
  Site1 Site2 Dissimilarity
1    57     2          0.90
2    57     3          0.89
3    57     5          0.88
# b. List three pairs of sites that are closest together in terms of
# **species composition** between them. For each pair, also state the
# species dissimilarity between them.
smallest_diss_indices <- order(spp_diss_matrix,
                               decreasing = FALSE, na.last = NA)[1:3]
row_indices <- row(spp_diss_matrix)[smallest_diss_indices]
col_indices <- col(spp_diss_matrix)[smallest_diss_indices]
smallest_diss_pairs <- data.frame(
  Site1 = row_indices,
  Site2 = col_indices,
  Dissimilarity = spp_diss_matrix[smallest_diss_indices]
)
smallest_diss_pairs # this is what you get marked on
  Site1 Site2 Dissimilarity
1    76    74          0.03
2    87    86          0.04
3    77    43          0.05
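For readers who want to check what a Bray-Curtis value means, here is a small hand calculation in base R. The abundance matrix `spp_demo` and the helper `bray_curtis()` are hypothetical and exist only for this illustration; the formula (sum of absolute differences divided by the sum of all abundances in the two sites) is the index that `vegdist(method = "bray")` calculates.

```r
# Hypothetical abundances of four species at three sites (made-up numbers)
spp_demo <- matrix(c(10, 0, 5, 1,
                      8, 2, 5, 0,
                      0, 9, 0, 6),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))

# Bray-Curtis dissimilarity between two sites:
# sum of absolute abundance differences / sum of all abundances in both sites
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

bc_12 <- bray_curtis(spp_demo[1, ], spp_demo[2, ]) # similar communities
bc_13 <- bray_curtis(spp_demo[1, ], spp_demo[3, ]) # very different communities
round(c(site1_vs_site2 = bc_12, site1_vs_site3 = bc_13), 2)
```

Values near 0 indicate sites that share species in similar abundances, and values near 1 indicate sites with little in common, which is how the dissimilarities in the tables for this question should be read.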
(/10)
Question 5
Using all the answers given above to support your reasoning, discuss the implications of these findings in the light of the theory covered in the BDC334 module.
Answer
Award marks for any answer that is not wrong, provides explanations for the patterns observed, relates the environmental similarities and differences to the species similarities and differences, and discusses the implications of these findings in the light of the theory covered in the BDC334 module.
(/10)
TOTAL /50
Instructions
Submit an R script to iKamva at the end of the test period. Label the script as follows:
BDC334_<Surname>_<Student_no.>_Practical_Assessment.docx.
Within the Quarto document, ensure that all code:
- necessary to accomplish an answer is neatly and clearly associated with the question heading,
- works as intended, and that each line of code is properly accompanied by a comment explaining the purpose of the code,
- is well-structured and easy to follow, and
- is free of errors and warnings.
You are also welcome (encouraged, in fact) to add comments to your script to explain your reasoning or thought process.
Citation
@online{smit2025,
author = {Smit, A. J.},
title = {BDC334 {Practical} {Assessment}},
date = {2025-08-25},
url = {http://tangledbank.netlify.app/BDC334/assessments/Prac_assessment_2025.html},
langid = {en}
}