Quantitative Ecology

Published

August 8, 2022

“We have become, by the power of a glorious evolutionary accident called intelligence, the stewards of life’s continuity on earth. We did not ask for this role, but we cannot abjure it. We may not be suited to it, but here we are.”

— Stephen J. Gould

“Reports that say that something hasn’t happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns- the ones we don’t know we don’t know.”

— Donald Rumsfeld

Welcome to the pages for BCB743 Quantitative Ecology. This page provides the syllabus and teaching policies for the module, and it serves as a starting point for accessing all the theory, instruction, and data.

Honours Coordinator

Prof. Bryan Maritz—Room 4.105, Department of Biodiversity & Conservation Biology

Module Coordinator

Prof. AJ Smit—Room 4.103, Department of Biodiversity & Conservation Biology, ajsmit@uwc.ac.za

Instructors

Isma-eel Jattiem—Department of Biodiversity & Conservation Biology, 4035085@myuwc.ac.za

Zoë-Angelique Petersen—Department of Biodiversity & Conservation Biology, 4042512@myuwc.ac.za

Module Description

Quantitative ecology employs statistical and computational techniques to comprehend ecosystems. It aims to describe and quantify ecological processes, analyse and model complex multivariate ecological data, and make predictions about the structure and dynamics of ecosystems across various spatial and temporal scales. The multivariate statistical approaches taught in this module will equip you with the ability to interpret multidimensional data in a comprehensible two- or three-dimensional space.

In this course, you will cover the following topics:

  • Ecological Structure: You will explore the fundamental principles underlying the environmental structuring of ecosystems (ecosystem structure).

  • Ecological Data Analysis: In this section, you will examine approaches to analyse ecological data, including hypothesis testing, regression analysis, and multivariate analysis.

  • Multivariate Analyses: Here, you will learn how to utilize multivariate statistics to make sense of complex systems, predict ecological outcomes, and understand the underlying mechanisms that drive ecological processes.

  • Spatial Ecology: You will acquire knowledge on how to analyse and model spatial patterns in ecological data, including the distribution of species and habitats across landscapes.

  • Community Ecology: The theory covered in this section will prepare you to analyse and model the interactions between species within ecological communities, such as competition, predation, and mutualism.

  • Ecosystem Ecology: You will learn how to model and analyse the flow of energy and nutrients through ecosystems, including the roles of producers, consumers, and decomposers in ecological processes.

This module will provide you with the skills and tools necessary to analyse and model ecological data, community structure, and the processes operating within communities, and to make predictions about the structure and dynamics of ecosystems. You will also learn how to communicate your findings effectively to a range of audiences, including scientists, policymakers, and the general public.

In BCB743, I will primarily focus on multivariate statistics. Multivariate methods play a crucial role in ecology as they enable us to analyse and interpret complex datasets involving multiple variables and ecosystems teeming with species. Ecosystems are characterised by their complexity and interconnectedness, with numerous factors typically influencing the distribution and abundance of species within an ecosystem. Multivariate statistical methods allow you to identify the underlying patterns and relationships among these variables and to explore how they interact to shape ecosystems.

Some of the most commonly employed multivariate statistical techniques in quantitative ecology include ordination methods, such as principal component analysis (PCA) and correspondence analysis (CA), which are utilised to reduce complexity and permit the visualisation of patterns of species distribution in ecosystems. We will also learn how to incorporate multiple regression into these multivariate analyses through the use of redundancy analysis (RDA) and canonical correspondence analysis (CCA), and examine non-metric multidimensional scaling (NMDS). All these methods extend our ability to explore species-environment relationships, assess community composition (structure), and identify influential environmental variables driving species distributions. We will also explore the application of multivariate statistics in addressing critical ecological issues, such as biodiversity monitoring, ecosystem functioning, and the impacts of anthropogenic disturbances on ecosystems.
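As a minimal sketch of what an ordination such as PCA does, the base-R function prcomp() can reduce a small site-by-variable table to two principal components. The data and variable names below are invented for illustration, and the module itself relies mainly on vegan, so treat this as an illustration of the idea rather than the course workflow.

```r
# Hypothetical environmental data: 6 sites x 4 variables (values invented)
env <- data.frame(
  temp     = c(12.1, 13.4, 15.2, 16.8, 18.3, 19.9),
  salinity = c(34.8, 34.9, 35.1, 35.2, 35.3, 35.4),
  nitrate  = c(8.2, 7.1, 5.9, 4.4, 3.0, 2.1),
  depth    = c(5, 12, 7, 20, 9, 15)
)

# Centre and scale the variables so that differing units do not dominate
pca <- prcomp(env, center = TRUE, scale. = TRUE)

# Proportion of the total variance captured by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Site scores on the first two axes: the reduced, two-dimensional view
site_scores <- pca$x[, 1:2]

print(round(var_explained, 3))
print(round(site_scores, 2))
```

Plotting site_scores (for example with plot(site_scores)) gives the familiar two-dimensional ordination diagram in which nearby sites have similar environmental conditions.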

Module Content and Framework

These links point to online resources such as datasets and R scripts in support of the video and PDF lecture material. It is essential that you work through these examples and workflows. Here is the Syllabus:

Week Class date Lecture Topic Class Slides Reading Tasks Tasks due
W1 11 June L0 Overview & Introduction ♦︎
Lecture Set 1 Ecological and Earth Data Task A1 1 July
RECAP OF BIODIVERSITY
L1: Revision Ecological Data Lab 1 Review
DATA, MATRICES
L2: Revision Environmental Distance ♦︎ Lab 2 Review
L3: Revision Quantifying Biodiversity ♦︎ Lab 3 Review
L4: Revision Describing Biodiversity Patterns ♦︎ Lab 4 Review
Lecture Set 2 Ecological Theories Task A2 11 July
13 June L5 Correlations & Associations ♦︎ Task B 17 June
L6: Self Distance Metrics
UNCONSTRAINED ORDINATION
W2 17 June L7 Intro to Ordination ♦︎
L8 PCA ♦︎ Task C 24 June
L8: Self PCA: Additional Examples
L8: Self PCA: WHO SDG Example Task D 11 July
20 June L9a CA ♦︎ Task E 24 June
L9b DCA
W3 24 June L10 PCoA ♦︎ Integrative Assignment 16 July
L11 nMDS ♦︎ Task F 1 July
L11: Self nMDS: PERMANOVA (Diatoms) Example
L12: Self Unconstrained Ordi. Summary
W4 1 July Lecture Set 1 Ecological & Earth Data Present Lecture Set 1 1 July
REGRESSION ANALYSIS
4 July L13 Model Building
L14 Multiple Regression Task G (Final Assessment) 16 July
L14: Self Gradients Example
Lx Generalised Linear Models TBA (2025) TBA (2025) TBA (2025) TBA (2025)
W5 8 July Lx Generalised Additive Models TBA (2025) TBA (2025) TBA (2025) TBA (2025)
CONSTRAINED ORDINATION
L15 Distance-Based Redundancy Analysis ♦︎
L15: Self db-RDA: Seaweeds Example
CLUSTER ANALYSIS
L16 Cluster Analysis Task D (continue) 11 July
11 July Lecture Set 2 Ecological Theories Present Lecture Set 2 11 July
L17 Review

Core theoretical framework: Ecological hypotheses underlying the processes of species assembly in space and time, including neutral and niche-based mechanisms, and historical events; overview of the currently known and understood distributional patterns of major groups of organisms at global, regional and local scales; consideration of sampling designs aimed at capturing these patterns and drivers so as to arrive at a process-based understanding of species assembly.

Competence: Data collection aimed at quantitatively testing the hypotheses above; the management and analysis of ecological data; reproducible and collaborative research; the use of R as a tool for the analysis of multivariate ecological data; multivariate techniques such as nMDS, PCA, RDA and cluster analysis; graphical data summaries and visualisations.
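To give a flavour of one of the techniques named above, here is a minimal cluster analysis sketch in base R. The site data are invented, and the choice of Euclidean distance with average linkage is an assumption made for illustration, not necessarily the configuration used in the lectures.

```r
# Hypothetical standardised environmental data for 5 sites (values invented)
env <- matrix(
  c(0.1, 0.2,
    0.2, 0.1,
    2.0, 2.1,
    2.2, 1.9,
    5.0, 5.1),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("site", 1:5), c("var1", "var2"))
)

# Pairwise Euclidean distances between sites
d <- dist(env, method = "euclidean")

# Agglomerative hierarchical clustering with average linkage (UPGMA)
hc <- hclust(d, method = "average")

# Cut the dendrogram into three groups of sites
groups <- cutree(hc, k = 3)
print(groups)
```

Calling plot(hc) draws the dendrogram, which is usually the first thing to inspect before deciding how many groups to cut.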

Outcomes of BCB743

By the end of this module, students will be able to:

  1. Understand the concepts of α-, β- and γ-diversity
  2. Know and understand the current hypotheses that explain species assembly processes in space and time (e.g. neutral and niche mechanisms)
  3. Collect ecological data at the appropriate scale, which would lend themselves to a quantitative analysis of points 1 and 2, above
  4. Use the R software and associated packages to undertake the analyses required in point 3, above
  5. Interpret the outcomes of the above analyses and use them to quantitatively characterise points 1 and 2, above
  6. Communicate the findings by written and oral means
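The three diversity concepts in the first outcome can be sketched with a few lines of base R. The abundance matrix below is invented, species richness stands in for alpha-diversity, and beta-diversity is computed as Whittaker's ratio of gamma to mean alpha; other definitions exist, so this is one illustrative choice.

```r
# Hypothetical abundance matrix: 3 sites (rows) x 5 species (columns)
spp <- matrix(
  c(10, 0, 3, 0, 1,
     4, 2, 0, 0, 0,
     0, 5, 6, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("site", 1:3), paste0("sp", 1:5))
)

# Alpha-diversity: species richness within each site
alpha <- rowSums(spp > 0)

# Gamma-diversity: total richness across all sites (the regional pool)
gamma <- sum(colSums(spp) > 0)

# Beta-diversity as Whittaker's multiplicative beta: gamma / mean(alpha)
beta_w <- gamma / mean(alpha)

print(alpha)
print(gamma)
print(beta_w)
```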

Graduate Attributes

The graduate attributes resulting from completion of this module align with the expectations of the workplace across the diverse organisations and institutions where graduates typically find employment.

Data and Reading in Support of the Syllabus

In the table above there are links to several key papers to read in preparation for each week’s theory. It is essential that you read these papers.

Many other references are cited in each Chapter. These serve several functions in that they:

  • Add additional theory relevant to some ecological concepts
  • Provide background to some of the datasets used in my examples
  • Discuss derivations of some equations used to calculate diversity concepts
  • Provide example walkthroughs of some of the computational aspects of the methods covered in the Labs
  • Collectively supplement the discussion about these concepts covered in the lectures

Actively engaging with these reading materials will make the difference between a 60% average mark for the module and a mark in excess of 80%.

Reading

You are expected to read additional material in support of the content covered in class and on this website.

A compulsory reference is ‘Numerical Ecology with R’ by Daniel Borcard, François Gillet and Pierre Legendre (Borcard et al. 2011). Much of the class’ content and many of the examples (and code) that I use have been adapted from this source. There is also the excellent book by Legendre and Legendre (2012) called ‘Numerical Ecology’ which provides everything the former book has, but in greater detail and with less focus on R. Both should be considered a ‘gold standard’ reference for Quantitative Ecology.

A third highly recommended text is the book Tree Diversity Analysis by Roeland Kindt and Richard Coe.

I can also recommend these amazing websites with excellent content:

Note that the URLs with links to additional reading that appear with the worked-through example code should not be seen as optional. They are there for a reason and should be consulted even though I might not necessarily refer to each of them in class. Use these materials liberally.

Should you want to download the source code for the BCB743 (and BCB744) website, you may find it on GitHub.

Datasets Used in This Module

Note that the links provided might not necessarily lead to the vegan help page.

No. | Dataset | Source
1 | Vegetation and Environment in Dutch Dune Meadows | vegan
2 | Oribatid Mite Data with Explanatory Variables | vegan
3 | The Doubs River Data | Numerical Ecology with R
4 | The Barro Colorado Island Tree Counts | vegan
5 | John Bolton, Rob Anderson, and Herre Stegenga’s Seaweed Data | Smit et al., 2017
6 | Serge Mayombo’s Diatoms Data | Mayombo et al., 2019
7 | World Health Organization Sustainable Development Goals Data | WHO

Prerequisites

You should have moderate numerical literacy and prior programming experience. Such experience will have been obtained in the BCB744 module, which is a module about doing statistics in R. If you have reasonable experience in coding and statistical analysis, you should find yourself well prepared. You should also thoroughly revise BDC334 by the end of the first week of this module.

Method of Instruction

You are provided with reading material (lecture slides, code, website content) that you are expected to consume prior to the class. Classes will involve brief introductions to new concepts, and will be followed by working on exercises in class that cover those concepts. The workshop is designed to be as interactive as possible, so while you are working on exercises the tutor and I will circulate among you and engage with you to help you understand any material and the associated code you are uncomfortable with. Often this will result in discussions of novel applications and alternative approaches to the data analysis challenges you are required to solve. More challenging concepts might emerge during the assignments (typically these will be submitted the following day), and any such challenges will be dealt with in class prior to learning new concepts.

Although the module is theory-heavy, a large part of it is also about coding. It is up to you to take your coding skills to the next level and move beyond what I teach in class. Coding is a bit like learning a language, and as such programming is a skill that is best learned by doing.

Learning Collaboratively

Also read: How to learn

Please refer to my advice about how to learn.

Discuss the BCB743 workshop activities with your peers as you work on them. Also use the WhatsApp group set up for the module for discussion purposes (I might assist via this medium if necessary, provided your questions or comments are relevant to the whole class). A better option is to use GitHub Issues. You will learn more in this module if you work with your friends than if you do not. Ask questions, answer questions, and share ideas liberally. Please identify your work partners by name on all assignments (if you decide to work in pairs).

Cooperative learning is not a licence for plagiarism. Plagiarism is a serious offence and will be dealt with decisively. The consequences of cheating are severe: they range from 0% for the assignment or exam to dismissal from the course for a second offence.

Reusing Code Found Elsewhere

A huge volume of code is available on the web and it can be adapted to solve your own problems. You may make use of any online resources (e.g. from Stack Overflow, a widely used source of discussion about R code), but you MUST clearly indicate (cite) that your solution relies on found code, regardless of the extent to which you have modified it for your own needs. Reused code that is discovered via a web search and which is not explicitly cited is plagiarism and it will be treated as such. On assignments you may not directly share code with your peers in this workshop.

Software

In this course we will rely entirely on R running within the RStudio IDE. The use of R was covered extensively in the BCB744 module, where the installation process was discussed. We will primarily use the vegan package, but some useful functions are also provided by the package BiodiversityR. Various other R packages offer overlapping and additional methods, but vegan should accommodate >90% of your Quantitative Ecology needs.

Computers

You are encouraged to bring your own laptop and to install the necessary software before the module starts. Limited support can be provided if required. There are also computers with R and RStudio (and the necessary add-on libraries) available in the 5th floor lab in the BCB Department.

Attendance

This workshop-based, hands-on course can only deliver acceptable outcomes if you attend all classes. The schedule is set and cannot be changed. Sometimes an occasional absence cannot be avoided. Please be courteous and notify me or the tutor in advance of any absence. If you work with a partner in class, notify them too. Keep up with the reading assignments while you are away and we will all work with you to get you back up to speed on what you miss. If you do miss a class, however, the assignments must still be submitted on time (also see Late submission of CA).

Since you may decide to work in collaboration with a peer on tasks and assignments, please keep this person informed at all times in case some emergency makes you unavailable for a period of time. Someone might depend on your input and contributions—do not leave someone in the lurch so that they cannot complete a task in your absence.

Assessment Policy

Continuous Assessments (CA) and a Final Assessment will provide a Final Mark for the module. They contribute equally to the final mark. These modes of assessment meet our needs as far as formative and summative assessments are concerned. All assessments are open book, so consult your code and reading material if and when you need to.

Continuous Assessment

The Continuous Assessment comprises:

  • Task A1, weighted 0.2 towards the CA. Task A2 is assessed via the lecture presentation only.
  • Lecture presentation, weighted 0.5 towards the CA.
  • Tasks B to F, the average of which contributes 0.3 towards the CA.
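The weightings above, together with the equal split between the CA and the Final Assessment, amount to a simple weighted average. A minimal sketch with invented marks for a hypothetical student follows; the variable names are illustrative only, and any rounding rules applied in practice are not stated here.

```r
# Hypothetical marks (%) for one student; all values are invented
task_a1      <- 70
presentation <- 65
tasks_b_to_f <- c(B = 60, C = 72, D = 68, E = 75, F = 80)
final_assess <- 66  # Task G

# Continuous Assessment: weights 0.2, 0.5 and 0.3 as listed above
ca_mark <- 0.2 * task_a1 + 0.5 * presentation + 0.3 * mean(tasks_b_to_f)

# CA and Final Assessment contribute equally to the final mark
final_mark <- 0.5 * ca_mark + 0.5 * final_assess

print(ca_mark)
print(final_mark)
```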

For Tasks A1 and A2, please refer to the marking schedule as agreed on by the class.

When assessing Tasks B to F, we will pay attention to the following criteria:

  • Presentation and formatting (10%), including:
    • Questions answered in order
    • Sectioning
    • General appearance
    • References (if required)
  • Code formatting (10%), e.g.:
    • Application of R code conventions, e.g. spaces around <-, after #, after ,, etc.
    • New line for each dplyr function (lines end in %>%)
    • New line for each ggplot layer (lines end in +)
    • Tidiness of code presentation in the HTML
  • Code correctness in the context of the specified analysis (25%):
    • The code must faithfully execute the intended analysis as required by the research questions and/or hypothesis tests
  • Figures (10%):
    • Sensible use of themes and colors
    • Publication quality (complete axis titles and labels, etc.)
    • Informative and complete titles, axis labels, legends, etc.
  • Discussion (45%):
    • Here you will be assessed for integrating the results of your analyses within the correct theoretical framework

Task G: Final Assessment (Exam)

The Final Assessment starts after the Multiple Regression lecture on 4 July and you can do it in the comfort of your home. It will involve the analysis of real-world data and assess some more theoretical and philosophical aspects of Quantitative Ecology. A full mark breakdown is provided with Task G.

Submission of Assignments and Exams

A statement such as the one below accompanies every assignment—pay attention, as failing to observe this instruction may result in a loss of marks (i.e. if an assignment remains ungraded because the owner of the material cannot be identified):

Submit an R script wherein you provide answers to Questions 1–9 by no later than 8:00 tomorrow. Label the script as follows (e.g.): BCB743_AJ_Smit_Assignment_2.R.

Late Submission of CA

Late assignments will be penalised 10% per day and will not be accepted more than 48 hours late, unless evidence such as a doctor’s note, a death certificate, or another documented emergency can be provided. If you know in advance that a submission will be late, please discuss this and seek prior approval. This policy is based on the idea that, in order to learn how to translate your thoughts into computer language (coding), you should be writing code several times each week, ideally daily. Time has been allocated in class for working on assignments, and students are expected to continue working on them outside of class. Successfully completing (and passing) this module requires that you finish assignments based on what we have covered in class by the following class period. Work diligently from the outset so that, even if something unexpected happens at the last minute, you should already be close to done. This approach also allows rapid feedback to be provided to you, which can only be accomplished by returning assignments quickly and punctually.

Support

It’s expected that some tricky aspects of the module will take time to master, and the best way to master problematic material is to practice, practice some more, and then to ask questions. Trying for 10 minutes and then giving up is not good enough. I’ll be more sympathetic to your cause if you can demonstrate having tried for a full day before giving up and asking me. When you ask questions about some challenge, this is the way to do it—explain to me your numerous attempts at trying to solve the problem, and explain how these various attempts have failed. I will not help you if you have not tried to help yourself first (maybe with advice from friends). There will be time in class to do this, typically before we embark on a new topic. You are also encouraged to bring up related questions that arise in your own B.Sc. (Hons.) research project.

Should you require more time with me, find out when I am ‘free’ and set an appointment by sending me a calendar invitation. I am happy to have a personal meeting with you via Zoom, but I prefer face-to-face in my office.

Help Via BCB744 and BCB743 Issues on GitHub

All discussion for the BCB744 and BCB743 workshops will be held in the Issues of this repository. Please post all content-related questions there, and use email only for personal matters. Note that this is a public repository, so be professional in your writing here (grammar, etc.).

To start a new thread, create a New issue. Tag your peers using their handle—@ajsmit, for example—to get their attention.

Once a question has been answered, the issue will be closed, so lots of good answers might end up in closed issues. Don’t forget to look there when looking for answers: you can use the Search feature on this repository to find answers that might have been offered for the same or a similar problem experienced by someone else in the past.

Guidelines for Posting Questions:

  • First search existing issues (open or closed) for answers. If the question has already been answered, you’re done! If there is an open issue, feel free to contribute to it. Or feel free to reopen a closed issue if you believe the answer is not satisfactory.
  • Give your issue an informative title.
    • Good: “Error: could not find function "ggplot"”
    • Bad: “My code does not work!”
    • Note that you can edit an issue’s title after it has been posted.
  • Format your questions nicely using markdown and code formatting. Preview your issue prior to posting.
  • As I explained above, your peers and I will be more sympathetic to your cause if you can show all the things you tried when attempting to fix the issue yourself first.
  • Include code and example data so that the person trying to help you has something to work with (ideally code and data that reproduce the error).
  • Where appropriate, provide links to specific files, or even lines within them, in the body of your issue. This will help your peers understand your question. Note that only the teaching team will have access to private repos.
  • (Optional) Tag someone or some group of people. Start by typing their GitHub username prefixed with the @ symbol. Of course this supposes that each of you has a GitHub account and username.
  • Hit Submit new issue when you’re ready to post.
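One way to include code and example data in an issue is to paste a small, self-contained snippet. The sketch below uses an invented data frame (the object names are hypothetical) and base R's dput(), which prints an object as R code that others can copy and run to recreate it.

```r
# A small, invented data frame standing in for the data that triggers your problem
my_data <- data.frame(
  site     = c("A", "B", "C"),
  richness = c(12, 7, 15),
  temp     = c(14.2, 16.8, 18.1)
)

# dput() prints R code that recreates the object; paste its output into the issue
dput(my_data)

# Check that the printed code really rebuilds the same object
dput_text <- paste(capture.output(dput(my_data)), collapse = "\n")
recreated <- eval(parse(text = dput_text))
print(all.equal(recreated, my_data))

# Also report your R version and loaded packages
sessionInfo()
```

Keeping the example data as small as possible, while still reproducing the error, makes it far more likely that someone can help quickly.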

References

Borcard D, Gillet F, Legendre P (2011) Numerical ecology with R. Springer
Legendre P, Legendre L (2012) Numerical ecology. Elsevier

Citation

BibTeX citation:
@online{j._smit2022,
  author = {Smit, Albertus J.},
  title = {Quantitative {Ecology}},
  date = {2022-08-08},
  url = {http://tangledbank.netlify.app/BCB743/BCB743_index.html},
  langid = {en}
}
For attribution, please cite this work as:
Smit AJ (2022) Quantitative Ecology. http://tangledbank.netlify.app/BCB743/BCB743_index.html.