BCB743: Quantitative Ecology

Author

Affiliation

Published

August 8, 2022

“We have become, by the power of a glorious evolutionary accident called intelligence, the stewards of life’s continuity on earth. We did not ask for this role, but we cannot abjure it. We may not be suited to it, but here we are.”

— Stephen J. Gould

“Reports that say that something hasn’t happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns- the ones we don’t know we don’t know.”

— Donald Rumsfeld

Welcome to the pages for BCB743 Quantitative Ecology. This page provides the syllabus and teaching policies for the module, and it serves as a starting point for accessing all the theory, instruction, and data.

Honours Coordinator

Prof. Bryan Maritz—Room 4.105, Department of Biodiversity & Conservation Biology

Module Coordinator

Prof. AJ Smit—Room 4.103, Department of Biodiversity & Conservation Biology, ajsmit@uwc.ac.za

Instructors

None.

Module Description

Quantitative ecology employs statistical and computational techniques to comprehend ecosystems. It aims to describe and quantify ecological processes, analyse and model complex multivariate ecological data, and make predictions about the structure and dynamics of ecosystems across various spatial and temporal scales. The multivariate statistical approaches taught in this module will equip you with the ability to interpret multidimensional data in a comprehensible two- or three-dimensional space.

In this course, you will cover the following topics:

Ecological Structure: You will explore the fundamental principles underlying the environmental structuring of ecosystems (ecosystem structure).
Ecological Data Analysis: In this section, you will examine approaches to analyse ecological data, including hypothesis testing, regression analysis, and multivariate analysis.
Multivariate Analyses: Here, you will learn how to utilize multivariate statistics to make sense of complex systems, predict ecological outcomes, and understand the underlying mechanisms that drive ecological processes.
Spatial Ecology: You will acquire knowledge on how to analyse and model spatial patterns in ecological data, including the distribution of species and habitats across landscapes.
Community Ecology: The theory covered in this section will prepare you to analyse and model the interactions between species within ecological communities, such as competition, predation, and mutualism.
Ecosystem Ecology: You will learn how to model and analyse the flow of energy and nutrients through ecosystems, including the roles of producers, consumers, and decomposers in ecological processes.

This module will provide you with the skills and tools necessary to analyse and model ecological data, community structure and the processes operating within them, and make predictions about the structure and dynamics of ecosystems. You will also learn how to communicate your findings effectively to a range of audiences, including scientists, policymakers, and the general public.

In BCB743, I will primarily focus on multivariate statistics. Multivariate methods play a crucial role in ecology as they enable us to analyse and interpret complex datasets involving multiple variables and ecosystems teeming with species. Ecosystems are characterised by their complexity and interconnectedness, with numerous factors typically influencing the distribution and abundance of species within an ecosystem. Multivariate statistical methods allow you to identify the underlying patterns and relationships among these variables and to explore how they interact to shape ecosystems.

Some of the most commonly employed multivariate statistical techniques in quantitative ecology include ordination methods, such as principal component analysis (PCA) and correspondence analysis (CA), which are utilised to reduce complexity and permit the visualisation of patterns of species distribution in ecosystems. We will also learn how to incorporate multiple regression into these multivariate analyses through the use of redundancy analysis (RDA) and canonical correspondence analysis (CCA), and examine non-metric multidimensional scaling (NMDS). All these methods extend our ability to explore species-environment relationships, assess community composition (structure), and identify influential environmental variables driving species distributions. We will also explore the application of multivariate statistics in addressing critical ecological issues, such as biodiversity monitoring, ecosystem functioning, and the impacts of anthropogenic disturbances on ecosystems.
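As a rough illustration of how the ordination methods named above map onto functions in the vegan package, here is a minimal sketch using the Dutch dune meadow data that ship with vegan. The choice of dataset, the Hellinger transformation, the two constraining variables (`A1` and `Management`), and the random seed are illustrative assumptions, not the module's prescribed workflow.

```r
library(vegan)

data(dune)      # site-by-species abundance table
data(dune.env)  # matching environmental variables

# PCA: rda() called without constraints.
# Abundances are Hellinger-transformed first (an assumption for this sketch).
dune_hel <- decostand(dune, method = "hellinger")
pca <- rda(dune_hel)

# CA: cca() called without constraints
ca <- cca(dune)

# RDA and CCA: the same functions, now with a formula of constraints
rda_fit <- rda(dune_hel ~ A1 + Management, data = dune.env)
cca_fit <- cca(dune ~ A1 + Management, data = dune.env)

# NMDS on Bray-Curtis dissimilarities in two dimensions
set.seed(743)  # NMDS uses random starts; the seed is arbitrary
nmds <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)

# Site scores on the first two PCA axes, and the NMDS stress
head(scores(pca, display = "sites", choices = 1:2))
nmds$stress
```

Calling `summary()` or `plot()` on any of these objects is a reasonable next step for inspecting the ordination.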

Module Content and Framework

These links point to online resources such as datasets and R scripts in support of the video and PDF lecture material. It is essential that you work through these examples and workflows. Here is the Syllabus:

Week	Class date	Lecture	Topic	Class	Slides	Reading	Tasks	Tasks due
W1	11 June	L0	Overview & Introduction		♦︎	▤ ▤
		Lecture Set 1	Ecological and Earth Data				◒ Task A1	17 June
			RECAP OF BIODIVERSITY
		L1: Revision	Ecological Data	★			Lab 1 Review
			DATA, MATRICES
		L2: Revision	Environmental Distance	★	♦︎		Lab 2 Review
		L3: Revision	Quantifying Biodiversity	★	♦︎	▤ ▤ ▤	Lab 3 Review
		L4: Revision	Describing Biodiversity Patterns	★	♦︎	▤	Lab 4 Review
		Lecture Set 2	Ecological Theories				◒ Task A2	20 June
	17 June	L5	Correlations & Associations	★	♦︎		◒ Task B	19 June
		L6: Self	Distance Metrics	★
			UNCONSTRAINED ORDINATION
W2	18 June	L7	Intro to Ordination		♦︎
		L8	PCA	★	♦︎		◒ Task C	23 June
		L8: Self	PCA: Additional Examples	★
		L8: Self	PCA: WHO SDG Example	★			◒ Task D	23 June
W3	23 June	L9a	CA	★	♦︎		◒ Task E	25 June
		L9b	DCA	★
		L10	PCoA	★	♦︎		◒ Integrative Assignment	25 June
	25 June	L11	nMDS	★	♦︎		◒ Task F	27 June
		L11: Self	nMDS: PERMANOVA (Diatoms) Example	★		▤
		L12: Self	Unconstrained Ordi. Summary	★
	27 June	Lecture Set 1	Ecological & Earth Data				Present Lecture Set 1	TBA
			REGRESSION ANALYSIS
W4	1 July	L13	Model Building	★
		L14	Multiple Regression	★		▤	◒ Task G (Final Assessment)	TBA
		L14: Self	Gradients Example	★		▤ ▤ ▤
		Lx	Generalised Linear Models	TBA (2025)	TBA (2025)	TBA (2025)	TBA (2025)
	4 July	Lx	Generalised Additive Models	TBA (2025)	TBA (2025)	TBA (2025)	TBA (2025)
W5			CONSTRAINED ORDINATION
		L15	Distance-Based Redundancy Analysis	★	♦︎	▤ ▤
		L15: Self	db-RDA: Seaweeds Example	★		▤ ▤
			CLUSTER ANALYSIS
		L16	Cluster Analysis	★			◒ Task D (continue)	TBA
	8 July	Lecture Set 2	Ecological Theories				Present Lecture Set 2	TBA
		L17	Review

Core theoretical framework: Ecological hypotheses underlying the processes of species assembly in space and time, including neutral and niche-based mechanisms, and historical events; overview of the currently known and understood distributional patterns of major groups of organisms at global, regional and local scales; consideration of sampling designs aimed at capturing these patterns and drivers so as to arrive at a process-based understanding of species assembly.

Competence: Data collection aimed at a quantitative test of the relevant hypotheses, above. The management and analysis of ecological data; reproducible and collaborative research; the use of R as a tool for the analysis of multivariate ecological data; multivariate techniques such as nMDS, PCA, RDA and cluster analysis; graphical data summaries and visualisations.

Outcomes of BCB743

By the end of this module, students will be able to:

Understand the concepts of $\alpha$-, $\beta$- and $\gamma$-diversity
Know and understand the current hypotheses that explain species assembly processes in space and time (e.g. neutral and niche mechanisms)
Collect ecological data at the appropriate scale, which would lend themselves to a quantitative analysis of points 1 and 2, above
Use the R software and associated packages to undertake the analyses required in point 3, above
Interpret the outcomes of the above analyses and use them to quantitatively characterise points 1 and 2, above
Communicate the findings by written and oral means
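To make the first outcome concrete, the sketch below computes the three diversity components for a tiny, made-up site-by-species matrix in base R. The data are hypothetical, species richness is used as the measure of $\alpha$-diversity, and Whittaker's multiplicative form ($\beta = \gamma / \bar{\alpha}$) is only one of several definitions of $\beta$-diversity; the module may use others.

```r
# Hypothetical abundance matrix: 3 sites (rows) by 4 species (columns)
comm <- matrix(
  c(5, 0, 2, 0,
    3, 1, 0, 0,
    0, 4, 0, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("site", 1:3), paste0("sp", 1:4))
)

# Alpha diversity: number of species present at each site
alpha <- rowSums(comm > 0)

# Gamma diversity: number of species present across all sites
gamma <- sum(colSums(comm) > 0)

# Whittaker's beta diversity: regional richness relative to mean local richness
beta_w <- gamma / mean(alpha)

alpha   # 2 species at each site
gamma   # 4 species in total
beta_w  # 2
```

A value of 2 here says that the region holds twice as many species as the average site, which reflects turnover in species composition among sites.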

Graduate Attributes

The graduate attributes resulting from completion of this module align with the expectations of the workplace across the diverse organisations and institutions where graduates typically find employment.

Data and Reading in Support of the Syllabus

In the table above there are links to several key papers to read in preparation for each week’s theory. It is essential that you read these papers.

Many other references are cited in each Chapter. These serve several functions in that they:

Add additional theory relevant to some ecological concepts
Provide background to some of the datasets used in my examples
Discuss derivations of some equations used to calculate diversity concepts
Provide example walkthroughs of some of the computational aspects of the methods covered in the Labs
Collectively supplement the discussion about these concepts covered in the lectures

Actively engaging with these reading materials will make the difference between a 60% average mark for the module and a mark in excess of 80%.

Reading

You are expected to read additional material in support of the content covered in class and on this website.

A compulsory reference is ‘Numerical Ecology with R’ by Daniel Borcard, François Gillet and Pierre Legendre (Borcard et al. 2011). Much of the class’ content and many of the examples (and code) that I use have been adapted from this source. There is also the excellent book by Legendre and Legendre (2012) called ‘Numerical Ecology’ which provides everything the former book has, but in greater detail and with less focus on R. Both should be considered a ‘gold standard’ reference for Quantitative Ecology.

A third highly recommended text is the book Tree Diversity Analysis by Roeland Kindt and Richard Coe.

I can also recommend these websites, which have excellent content:

David Zelený’s Analysis of Community Ecology Data in R
Mike Palmer’s Ordination Methods for Ecologists
GUide to STatistical Analysis in Microbial Ecology (GUSTA ME)

Note that the URLs with links to additional reading that appear with the worked-through example code should not be seen as optional. They are there for a reason and should be consulted even though I might not necessarily refer to each of them in class. Use these materials liberally.

Should you want to download the source code for the BCB743 (and BCB744) website, you may find it on GitHub.

Datasets Used in This Module

Note that the links provided might not necessarily lead to the vegan help page.

	Dataset	Source
1	Vegetation and Environment in Dutch Dune Meadows	vegan
2	Oribatid Mite Data with Explanatory Variables	vegan
3	The Doubs River Data	Numerical Ecology with R
4	The Barro Colorado Island Tree Counts	vegan
5	John Bolton, Rob Anderson, and Herre Stegenga’s Seaweed Data	Smit et al., 2017
6	Serge Mayombo’s Diatoms Data	Mayombo et al., 2019
7	World Health Organization Sustainable Development Goals Data	WHO

Prerequisites

You should have moderate numerical literacy and prior programming experience. Such experience will have been obtained in the BCB744 module, which is a module about doing statistics in R. If you have reasonable experience in coding and statistical analysis you should find yourself well prepared. You should also thoroughly revise BDC334 by the end of the first week of this module.

Method of Instruction

You are provided with reading material (lecture slides, code, website content) that you are expected to consume prior to the class. Classes will involve brief introductions to new concepts, and will be followed by working on exercises in class that cover those concepts. The workshop is designed to be as interactive as possible, so while you are working on exercises the tutor and I will circulate among you and engage with you to help you understand any material and the associated code you are uncomfortable with. Often this will result in discussions of novel applications and alternative approaches to the data analysis challenges you are required to solve. More challenging concepts might emerge during the assignments (typically these will be submitted the following day), and any such challenges will be dealt with in class prior to learning new concepts.

Although the module is theory-heavy, a large part of it is also about coding. It is up to you to take your coding skills to the next level and move beyond what I teach in class. Coding is a bit like learning a language, and as such programming is a skill that is best learned by doing.

Learning Collaboratively

Reusing Code Found Elsewhere

A huge volume of code is available on the web and it can be adapted to solve your own problems. You may make use of any online resources (e.g. from StackOverflow, a heavily used source of discussion about R code), but you MUST clearly indicate (cite) that your solution relies on found code, regardless of the extent to which you have modified it to your own needs. Reused code that is discovered via a web search and which is not explicitly cited is plagiarism and it will be treated as such. On assignments you may not directly share code with your peers in this workshop.

Software

In this course we will rely entirely on R running within the RStudio IDE. The use of R was covered extensively in the BCB744 module where the installation process was discussed. We will primarily use the vegan package, but some useful functions are also provided by the package BiodiversityR. Various other R packages offer overlapping and additional methods, but vegan should accommodate >90% of your Quantitative Ecology needs.

Computers

You are encouraged to bring your own laptop and to install the necessary software before the module starts. Limited support can be provided if required. There are also computers with R and RStudio (and the necessary add-on libraries) available in the 5th floor lab in the BCB Department.

Attendance

This workshop-based, hands-on course can only deliver acceptable outcomes if you attend all classes. The schedule is set and cannot be changed. Sometimes an occasional absence cannot be avoided. Please be courteous and notify me or the tutor in advance of any absence. If you work with a partner in class, notify them too. Keep up with the reading assignments while you are away and we will all work with you to get you back up to speed on what you missed. If you do miss a class, however, the assignments must still be submitted on time (also see Late Submission of CA).

Since you may decide to work in collaboration with a peer on tasks and assignments, please keep this person informed at all times in case some emergency makes you unavailable for a period of time. Someone might depend on your input and contributions—do not leave someone in the lurch so that they cannot complete a task in your absence.


Assessment Policy

Continuous Assessments (CA) and a Final Assessment will provide a Final Mark for the module. They contribute equally to the final mark. These modes of assessment meet our needs as far as formative and summative assessments are concerned. All assessments are open book, so consult your code and reading material if and when you need to.

Continuous Assessment

The Continuous Assessment comprises:

Task A1, weighted 0.2 towards the CA. Task A2 is assessed via the lecture presentation only.
Lecture presentation, weighted 0.5 towards the CA.
Tasks B to F, the average of which contributes 0.3 towards the CA.
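The weights above, together with the equal contribution of the CA and the Final Assessment to the Final Mark, can be sketched as a small calculation. The function names and the marks used below are hypothetical and serve only to show the arithmetic.

```r
# Continuous Assessment mark from its three components (weights as stated above)
ca_mark <- function(task_a1, presentation, tasks_b_to_f) {
  0.2 * task_a1 + 0.5 * presentation + 0.3 * mean(tasks_b_to_f)
}

# Final Mark: CA and Final Assessment contribute equally
final_mark <- function(ca, final_assessment) {
  0.5 * ca + 0.5 * final_assessment
}

# Hypothetical marks (in %) for one student
ca <- ca_mark(
  task_a1 = 70,
  presentation = 80,
  tasks_b_to_f = c(60, 65, 70, 75, 80)  # Tasks B, C, D, E, F
)

ca                                     # 75
final_mark(ca, final_assessment = 65)  # 70
```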

For Tasks A1 and A2, please refer to the marking schedule as agreed on by the class.

When assessing Tasks B to F, we will pay attention to the following criteria:

Presentation and formatting (10%), including:
- Questions answered in order
- Sectioning
- General appearance
- References (if required)
Code formatting (10%), e.g.:
- Application of R code conventions, e.g. spaces around <-, after #, after ,, etc.
- New line for each dplyr function (lines end in %>%)
- New line for each ggplot layer (lines end in +)
- Tidiness of code presentation in the HTML
Code correctness in the context of the specified analysis (25%):
- The code must faithfully execute the intended analysis as required by the research questions and/or hypotheses tests
Figures (10%):
- Sensible use of themes and colors
- Publication quality (complete axis titles and labels, etc.)
- Informative and complete titles, axis labels, legends, etc.
Discussion (45%):
- Here you will be assessed for integrating the results of your analyses within the correct theoretical framework
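As an illustration of the code formatting criteria listed above, the following sketch shows spaces around `<-` and after commas and `#`, one dplyr function per line, and one ggplot layer per line. It uses R's built-in `mtcars` data purely for convenience; the variable names, colours, and theme are illustrative choices and not required by the marking scheme.

```r
library(dplyr)
library(ggplot2)

# One dplyr function per line; each line ends in %>%
cyl_summary <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n()) %>%
  ungroup()

# One ggplot layer per line; each line ends in +
p <- ggplot(data = cyl_summary, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col(fill = "grey40") +
  labs(
    x = "Number of cylinders",
    y = "Mean fuel economy (miles per gallon)",
    title = "Mean fuel economy by cylinder count"
  ) +
  theme_minimal()

p
```

Complete axis titles and an informative plot title, as in the `labs()` call, also address the criteria listed under Figures.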

Task G: Final Assessment (Exam)

The Final Assessment starts after the Multiple Regression lecture on 4 July and you can do it in the comfort of your home. It will involve the analysis of real-world data and assess some more theoretical and philosophical aspects of Quantitative Ecology. A full mark breakdown is provided with Task G.

Submission of Assignments and Exams

A statement such as the one below accompanies every assignment—pay attention, as failing to observe this instruction may result in a loss of marks (i.e. if an assignment remains ungraded because the owner of the material cannot be identified):

Submit an R script wherein you provide answers to Questions 1–9 by no later than 8:00 tomorrow. Label the script as follows (e.g.): BCB743_AJ_Smit_Assignment_2.R.

Late Submission of CA

Late assignments will be penalised 10% per day and will not be accepted more than 48 hours late, unless evidence such as a doctor’s note, a death certificate, or another documented emergency can be provided. If you know in advance that a submission will be late, please discuss this and seek prior approval. This policy is based on the idea that in order to learn how to translate your human thoughts into computer language (coding) you should be working with code multiple times each week, ideally daily. Time has been allocated in class for working on assignments and students are expected to continue to work on the assignments outside of class. Successfully completing (and passing) this module requires that you finish assignments based on what we have covered in class by the following class period. Work diligently from the outset so that even if something unexpected happens at the last minute you should already be close to done. This approach also allows rapid feedback to be provided to you, which can only be accomplished by returning assignments quickly and punctually.

Support

It’s expected that some tricky aspects of the module will take time to master, and the best way to master problematic material is to practice, practice some more, and then to ask questions. Trying for 10 minutes and then giving up is not good enough. I’ll be more sympathetic to your cause if you can demonstrate having tried for a full day before giving up and asking me. When you ask questions about some challenge, this is the way to do it—explain to me your numerous attempts at trying to solve the problem, and explain how these various attempts have failed. I will not help you if you have not tried to help yourself first (maybe with advice from friends). There will be time in class to do this, typically before we embark on a new topic. You are also encouraged to bring up related questions that arise in your own B.Sc. (Hons.) research project.

Should you require more time with me, find out when I am ‘free’ and set an appointment by sending me a calendar invitation. I am happy to have a personal meeting with you via Zoom, but I prefer face-to-face in my office.

Help Via BCB744 and BCB743 Issues on GitHub

All discussion for the BCB744 and BCB743 workshops will be held in the Issues of this repository. Please post all content-related questions there, and use email only for personal matters. Note that this is a public repository, so be professional in your writing here (grammar, etc.).

To start a new thread, create a New issue. Tag your peers using their handle—@ajsmit, for example—to get their attention.

Once a question has been answered, the issue will be closed, so lots of good answers might end up in closed issues. Don’t forget to look there when looking for answers. You can use the Search feature on this repository to find answers that might have been offered for the same or a similar problem experienced by someone else in the past.

Guidelines for Posting Questions:

First search existing issues (open or closed) for answers. If the question has already been answered, you’re done! If there is an open issue, feel free to contribute to it. You may also reopen a closed issue if you believe the answer is not satisfactory.
Give your issue an informative title. Note that you can edit an issue’s title after it has been posted.
- Good: “Error: could not find function "ggplot"”
- Bad: “My code does not work!”
Format your questions nicely using markdown and code formatting. Preview your issue prior to posting.
As I explained above, your peers and I will be more sympathetic to your cause if you can show all the things you tried when attempting to fix the issue yourself first.
Include code and example data so that the person trying to help you has something to work with (ideally something that reproduces the error).
Where appropriate, provide links to specific files, or even lines within them, in the body of your issue. This will help your peers understand your question. Note that only the teaching team will have access to private repos.
(Optional) Tag someone or some group of people. Start by typing their GitHub username prefixed with the @ symbol. Of course this supposes that each of you has a GitHub account and username.
Hit Submit new issue when you’re ready to post.
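The advice to include code and example data can be followed with a small, self-contained script like the sketch below. The subset of the dune data and the `vegdist()` call are stand-ins for whatever data and function are actually causing trouble; they are assumptions made only for illustration.

```r
library(vegan)

# A small slice of a built-in dataset that anyone can recreate
data(dune)
small <- dune[1:5, 1:4]

# dput() prints the data as R code that can be pasted into the issue
dput(small)

# The shortest piece of code that shows the problem (here, a working stand-in)
d <- vegdist(small, method = "bray")
d

# R version and loaded package versions help others reproduce the problem
sessionInfo()
```

Pasting the output of `dput()` and `sessionInfo()` into the issue, inside a code block, lets others rerun the example exactly as it was run.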

References

Borcard D, Gillet F, Legendre P (2011) Numerical ecology with R. Springer

Legendre P, Legendre L (2012) Numerical ecology. Elsevier

Reuse

CC BY-NC-SA 4.0

Citation

BibTeX citation:

@online{smit,_a._j.2022,
  author = {Smit, A. J.},
  title = {BCB743: {Quantitative} {Ecology}},
  date = {2022-08-08},
  url = {http://tangledbank.netlify.app/BCB743/BCB743_index.html},
  langid = {en}
}

For attribution, please cite this work as:

Smit, A. J. (2022) BCB743: Quantitative Ecology. http://tangledbank.netlify.app/BCB743/BCB743_index.html.

--- date: "2022-08-08" title: "BCB743: Quantitative Ecology" --- > *"We have become, by the power of a glorious evolutionary accident called intelligence, the stewards of life's continuity on earth. We did not ask for this role, but we cannot abjure it. We may not be suited to it, but here we are."* > > --- Stephen J. Gould > *"Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns- the ones we don't know we don't know."* > > --- Donald Rumsfeld ![](../images/PhD_comics3.JPG){fig-align="center"} **Welcome to the pages for BCB743 Quantitative Ecology. This page provides the syllabus and teaching policies for the module, and it serves is a starting point accessing all the theory, instruction, and data.** ## Honours Coordinator Prof. Bryan Maritz---Room 4.105, Department of Biodiversity & Conservation Biology ## Module Coordinator Prof. AJ Smit---Room 4.103, Department of Biodiversity & Conservation Biology, ajsmit@uwc.ac.za ## Instructors None. ## Module Description Quantitative ecology employs statistical and computational techniques to comprehend ecosystems. It aims to describe and quantify ecological processes, analyse and model complex multivariate ecological data, and make predictions about the structure and dynamics of ecosystems across various spatial and temporal scales. The multivariate statistical approaches taught in this module will equip you with the ability to interpret multidimensional data in a comprehensible two- or three-dimensional space. In this course, you will cover the following topics: - **Ecological Structure:** You will explore the fundamental principles underlying the environmental structuring of ecosystems (ecosystem structure). 
- **Ecological Data Analysis:** In this section, you will examine approaches to analyse ecological data, including hypothesis testing, regression analysis, and multivariate analysis. - **Multivariate Analyses:** Here, you will learn how to utilize multivariate statistics to make sense of complex systems, predict ecological outcomes, and understand the underlying mechanisms that drive ecological processes. - **Spatial Ecology:** You will acquire knowledge on how to analyse and model spatial patterns in ecological data, including the distribution of species and habitats across landscapes. - **Community Ecology:** The theory covered in this section will prepare you to analyse and model the interactions between species within ecological communities, such as competition, predation, and mutualism. - **Ecosystem Ecology:** You will learn how to model and analyse the flow of energy and nutrients through ecosystems, including the roles of producers, consumers, and decomposers in ecological processes. This module will provide you with the skills and tools necessary to analyse and model ecological data, community structure and the processes operating within them, and make predictions about the structure and dynamics of ecosystems. You will also learn how to communicate your findings effectively to a range of audiences, including scientists, policymakers, and the general public. In BCB743, I will primarily focus on multivariate statistics. Multivariate methods play a crucial role in ecology as they enable us to analyse and interpret complex datasets involving multiple variables and ecosystems teeming with species. Ecosystems are characterised by their complexity and interconnectedness, with numerous factors typically influencing the distribution and abundance of species within an ecosystem. Multivariate statistical methods allow you to identify the underlying patterns and relationships among these variables and to explore how they interact to shape ecosystems. 
Some of the most commonly employed multivariate statistical techniques in quantitative ecology include ordination methods, such as principal component analysis (PCA) and correspondence analysis (CA), which are utilised to reduce complexity and permit the visualisation of patterns of species distribution in ecosystems. We will also learn how to incorporate multiple regression into these multivariate analyses through the use of redundancy analysis (RDA) and canonical correspondence analysis (CCA), and examine non-metric multidimensional scaling (NMDS). All these methods extend our ability to explore species-environment relationships, assess community composition (structure), and identify influential environmental variables driving species distributions. We will also explore the application of multivariate statistics in addressing critical ecological issues, such as biodiversity monitoring, ecosystem functioning, and the impacts of anthropogenic disturbances on ecosystems. ## Module Content and Framework These links point to online resources such as datasets and R scripts in support of the video and PDF lecture material. It is essential that you work through these examples and workflows. 
Here is the Syllabus: :::{.column-page-inset-right} | Week | Class date | Lecture | Topic | Class | Slides | Reading | Tasks | Tasks due | |------|------------|---------------|------------------------------------|-----------------------------------------------------------|-------------------------------------------------------|------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------|------------| | W1 | 11 June | L0 | Overview & Introduction | | [♦︎](../slides/BCB743_01_intro.pdf) | [▤](../docs/BES-Reproducible-Code-2017.pdf) [▤](../docs/BES-Data-Guide-2017_web.pdf) | | | | | | Lecture Set 1 | Ecological and Earth Data | | | | [◒](assessments/Task_A1.qmd) Task A1 | 17 June | | | | | **RECAP OF BIODIVERSITY** | | | | | | | | | L1: Revision | Ecological Data | [★](../BDC334/01-introduction.qmd) | | | Lab 1 Review | | | | | | **DATA, MATRICES** | | | | | | | | | L2: Revision | Environmental Distance | [★](../BDC334/02b-env_dist.qmd) | [♦︎](../slides/BCB743_04_environmental_distance.pdf) | | Lab 2 Review | | | | | L3: Revision | Quantifying Biodiversity | [★](../BDC334/03-biodiversity1.qmd) | [♦︎](../slides/BCB743_02_biodiversity_1.pdf) | [▤](../docs/Smit_et_al_2013.pdf) [▤](../docs/Smit_et_al_2017.pdf) [▤](../docs/Smit_the_seaweed_data.pdf) | Lab 3 Review | | | | | L4: Revision | Describing Biodiversity Patterns | [★](../BDC334/04-biodiversity2.qmd) | [♦︎](../slides/BCB743_02_biodiversity_2.pdf) | [▤](../docs/Shade_et_al_2018.pdf) | Lab 4 Review | | | | | Lecture Set 2 | Ecological Theories | | | | [◒](assessments/Task_A2.qmd) Task A2 | 20 June | | | 17 June | L5 | Correlations & Associations | [★](correlations.qmd) | [♦︎](../slides/BCB743_06_correlations.pdf) | | [◒](assessments/Task_B.qmd) Task B | 19 June | | | | L6: Self | Distance Metrics | [★](dis-metrics.qmd) | | | | | | | | | **UNCONSTRAINED ORDINATION** | | | | | | | W2 | 18 June | 
L7 | Intro to Ordination | | [♦︎](../slides/BCB743_07_ordination.pdf) | | | | | | | L8 | PCA | [★](PCA.qmd) | [♦︎](../slides/BCB743_08_PCA.pdf) | | [◒](assessments/Task_C.qmd) Task C | 23 June | | | | L8: Self | PCA: Additional Examples | [★](PCA_examples.qmd) | | | | | | | | L8: Self | PCA: WHO SDG Example | [★](PCA_SDG_example.qmd) | | | [◒](assessments/Task_D.qmd) Task D | 23 June | | W3 | 23 June | L9a | CA | [★](CA.qmd) | [♦︎](../slides/BCB743_09_CA.pdf) | | [◒](assessments/Task_E.qmd) Task E | 25 June | | | | L9b | DCA | [★](DCA.qmd) | | | | | | | | L10 | PCoA | [★](PCoA.qmd) | [♦︎](../slides/BCB743_10_PCoA.pdf) | | [◒](assessments/BCB743_intgrative_assignment.qmd) Integrative Assignment | 25 June | | | 25 June | L11 | nMDS | [★](nMDS.qmd) | [♦︎](../slides/BCB743_11_nMDS.pdf) | | [◒](assessments/Task_F.qmd) Task F | 27 June | | | | L11: Self | nMDS: PERMANOVA (Diatoms) Example | [★](nMDS_diatoms.qmd) | | [▤](../docs/Mayombo_et_al_2019.pdf) | | | | | | L12: Self | Unconstrained Ordi. 
Summary | [★](unconstrained-summary.qmd) | | | | | | | 27 June | Lecture Set 1 | Ecological & Earth Data | | | | Present Lecture Set 1 | TBA | | | | | **REGRESSION ANALYSIS** | | | | | | | W4 | 1 July | L13 | Model Building | [★](model_building.qmd) | | | | | | | | L14 | Multiple Regression | [★](multiple_regression.qmd) | | [▤](../docs/The-Biostatistics-Book-Ch-3-5.pdf) | [◒](assessments/Task_G.qmd) Task G (Final Assessment) | **TBA** | | | | L14: Self | Gradients Example | [★](deep_dive.qmd) | | [▤](../docs/Smit_et_al_2013.pdf) [▤](../docs/Smit_et_al_2017.pdf) [▤](../docs/Smit_the_seaweed_data.pdf) | | | | | | Lx | Generalised Linear Models | TBA (2025) | ΤΒΑ (2025) | TBA (2025) | ΤΒΑ (2025) | | | | 4 July | Lx | Generalised Additive Models | TBA (2025) | ΤΒΑ (2025) | TBA (2025) | ΤΒΑ (2025) | | | W5 | | | **CONSTRAINED ORDINATION** | | | | | | | | | L15 | Distance-Based Redundancy Analysis | [★](constrained_ordination.qmd) | [♦︎](../slides/BCB743_12_constrained_ordination.pdf) | [▤](../docs/Smit_et_al_2017.pdf) [▤](../docs/Smit_the_seaweed_data.pdf) | | | | | | L15: Self | db-RDA: Seaweeds Example | [★](Seaweed_in_Two_Oceans_v2/two_oceans_appendices.qmd) | | [▤](../docs/Smit_et_al_2017.pdf) [▤](../docs/Smit_the_seaweed_data.pdf) | | | | | | | **CLUSTER ANALYSIS** | | | | | | | | | L16 | Cluster Analysis | [★](cluster_analysis.qmd) | | | [◒](assessments/Task_D.qmd) Task D (continue) | TBA | | | 8 July | Lecture Set 2 | Ecological Theories | | | | Present Lecture Set 2 | TBA | | | | L17 | Review | | | | | | ::: **Core theoretical framework** Ecological hypotheses underlying the processes of species assembly in space and time, including neutral and niche-based mechanisms, and historical events; overview of the currently known and understood distributional patterns of major groups of organisms at global, regional and local scales; consideration of sampling designs aimed at capturing these patterns and drivers so as to arrive at a processed based understanding of 
species assembly.

**Competence**

Data collection aimed at a quantitative test of the relevant hypotheses, above. The management and analysis of ecological data; reproducible and collaborative research; the use of R as a tool for the analysis of multivariate ecological data; multivariate techniques such as nMDS, PCA, RDA and cluster analysis; graphical data summaries and visualisations.

## Outcomes of BCB743

By the end of this module, students will be able to:

1. Understand the concepts of $\alpha$-, $\beta$- and $\gamma$-diversity
2. Know and understand the current hypotheses that explain species assembly processes in space and time (e.g. neutral and niche mechanisms)
3. Collect ecological data at the appropriate scale, which would lend themselves to a quantitative analysis of points 1 and 2, above
4. Use the R software and associated packages to undertake the analyses required in point 3, above
5. Interpret the outcomes of the above analyses and use them to quantitatively characterise points 1 and 2, above
6. Communicate the findings by written and oral means

## Graduate Attributes

The [**graduate attributes**](../pages/graduate_attributes.qmd) resulting from completion of this module align with the expectations of the workplace across the diverse organisations and institutions where graduates typically find employment.

## Data and Reading in Support of the Syllabus

In the table above there are links to several key papers to read in preparation for each week's theory. It is essential that you read these papers. Many other references are cited in each Chapter.
These serve several functions in that they:

- Add additional theory relevant to some ecological concepts
- Provide background to some of the datasets used in my examples
- Discuss derivations of some equations used to calculate diversity concepts
- Provide example walkthroughs of some of the computational aspects of the methods covered in the Labs
- Collectively supplement the discussion about these concepts covered in the lectures

Actively engaging with these reading materials will make the difference between a 60% average mark for the module, and a mark in excess of 80%.

### Reading

You are **expected to read additional material** in support of the content covered in class and on this website. A **compulsory** reference is ['Numerical Ecology with R'](http://adn.biol.umontreal.ca/~numericalecology/numecolR/) by Daniel Borcard, François Gillet and Pierre Legendre [@borcard2011numerical]. Much of the class's content and many of the examples (and code) that I use have been adapted from this source. There is also the excellent book by @legendre2012numerical called 'Numerical Ecology', which provides everything the former book has, but in greater detail and with less focus on R. Both should be considered a 'gold standard' reference for Quantitative Ecology.

A third **highly recommended text** is the book [Tree Diversity Analysis](http://apps.worldagroforestry.org/downloads/Publications/PDFS/b13695.pdf) by Roeland Kindt and Richard Coe.

I can also recommend these websites with excellent content:

- David Zelený's [Analysis of Community Ecology Data in R](https://www.davidzeleny.net/anadat-r/doku.php)
- Mike Palmer's [Ordination Methods for Ecologists](http://ordination.okstate.edu/)
- [GUide to STatistical Analysis in Microbial Ecology (GUSTA ME)](https://sites.google.com/site/mb3gustame/)

Note that the URLs with links to additional reading that appear with the worked-through example code should **not be seen as optional**.
They are there for a reason and should be consulted even though I might not necessarily refer to each of them in class. Use these materials liberally.

Should you want to download the source code for the BCB743 (and BCB744) website, you may find it on [<i class="fab fa-github"></i> GitHub](https://github.com/ajsmit/R_courses).

### Datasets Used in This Module

Note that the links provided might not necessarily lead to the **vegan** help page.

|   | Dataset | Source |
|:--|---------|--------|
| 1 | Vegetation and Environment in Dutch Dune Meadows | [**vegan**](https://www.davidzeleny.net/anadat-r/doku.php/en:data:dune) |
| 2 | Oribatid Mite Data with Explanatory Variables | [**vegan**](http://adn.biol.umontreal.ca/~numericalecology/data/oribates.html) |
| 3 | The Doubs River Data | [**Numerical Ecology with R**](https://www.davidzeleny.net/anadat-r/doku.php/en:data:doubs) |
| 4 | The Barro Colorado Island Tree Counts | [**vegan**](https://www.davidzeleny.net/anadat-r/doku.php/en:data:bci) |
| 5 | John Bolton, Rob Anderson, and Herre Stegenga's Seaweed Data | [**Smit et al., 2017**](https://www.frontiersin.org/articles/10.3389/fmars.2017.00404/) |
| 6 | Serge Mayombo's Diatoms Data | [**Mayombo et al., 2019**](https://www.tandfonline.com/doi/abs/10.2989/1814232X.2019.1592778) |
| 7 | World Health Organization Sustainable Development Goals Data | [**WHO**](https://www.who.int/data) |

## Prerequisites

You should have moderate numerical literacy and prior programming experience. Such experience will have been obtained in the [BCB744](/workshops/) module, which is a module about doing statistics in R. If you have reasonable experience in coding and statistical analysis you should find yourself well prepared. You should also thoroughly revise [BDC334](../BDC334/BDC334_syllabus.qmd) by the end of the first week of this module.
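
As a quick readiness check before the module starts, the sketch below loads those datasets from the table above that ship with **vegan** and reports their dimensions. It assumes **vegan** is already installed; the object names (`dune`, `dune.env`, `mite`, `BCI`) are the ones used by the package. The Doubs, seaweed, diatom, and WHO data come from the other sources listed and are not loaded here.

```r
# Assumes the vegan package is installed: install.packages("vegan")
library(vegan)

# Datasets bundled with vegan (sites in rows, species in columns)
data(dune)      # Dutch dune meadow vegetation
data(dune.env)  # matching environmental variables
data(mite)      # oribatid mite counts
data(BCI)       # Barro Colorado Island tree counts

# Collect the dimensions of each community table in one small matrix
dims <- t(sapply(list(dune = dune, mite = mite, BCI = BCI), dim))
colnames(dims) <- c("sites", "species")
dims
```

If this runs without errors, R and **vegan** are set up well enough to follow the first ordination examples.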

## Method of Instruction

You are provided with reading material (lecture slides, code, website content) that you are expected to consume **prior to the class**. Classes will involve brief introductions to new concepts, and will be followed by working on exercises in class that cover those concepts. The workshop is designed to be as interactive as possible, so while you are working on exercises the tutor and I will circulate among you and engage with you to help you understand any material and the associated code you are uncomfortable with. Often this will result in discussions of novel applications and alternative approaches to the data analysis challenges you are required to solve. More challenging concepts might emerge during the assignments (typically these will be submitted the following day), and any such challenges will be dealt with in class prior to learning new concepts.

Although the module is theory-heavy, a large part of it is also about coding. It is up to you to take your coding skills to the next level and move beyond what I teach in class. Coding is a bit like learning a language, and as such programming is a skill that is best learned by doing.

## Learning Collaboratively

::: {.callout-note appearance="simple"}
## Also read: How to learn

Please refer to my [advice about how to learn](../pages/How_to_learn.qmd).
:::

Discuss the BCB743 workshop activities with your peers as you work on them. Also use the WhatsApp group set up for the module for discussion purposes (I might assist via this medium if necessary, provided your questions or comments are relevant to the whole class). A better option is to use [GitHub Issues](/quantecol/#help-via-bcb744-and-bcb743-issues-on-github). You will learn more in this module if you work with your friends than if you do not. Ask questions, answer questions, and share ideas liberally. Please identify your work partners by name on all assignments (if you decide to work in pairs).

**Cooperative learning is not a licence for plagiarism. Plagiarism is a serious offence and will be dealt with accordingly. Consequences of cheating are severe---they range from a 0% for the assignment or exam up to dismissal from the course for a second offence.**

## Reusing Code Found Elsewhere

A huge volume of code is available on the web and it can be adapted to solve your own problems. You may make use of any online resources (e.g. from [StackOverflow](https://stackoverflow.com/), a widely used source of discussion about [R code](https://stackoverflow.com/questions/tagged/r))---but you **MUST** clearly indicate (cite) that your solution relies on found code, regardless of the extent to which you have modified it to your own needs. Reused code that is discovered via a web search and which is not explicitly cited is plagiarism and it will be treated as such. On assignments you may not directly share code with your peers in this workshop.

## Software

In this course we will rely entirely on [R](https://cran.r-project.org/) running within the [RStudio](https://www.rstudio.com/) IDE. The use of R was covered extensively in the [BCB744](/workshops/) module where the [installation process](/workshops/intro_r/chapters/02-rstudio/) was discussed.

We will primarily use the [**vegan**](https://cran.r-project.org/web/packages/vegan/index.html) package, but some useful functions are also provided by the package [**BiodiversityR**](https://github.com/cran/BiodiversityR) (and [here](http://apps.worldagroforestry.org/downloads/Publications/PDFS/b13695.pdf) and [here](https://rpubs.com/Roeland-KINDT)). Various other R packages offer overlapping and additional methods, but **vegan** should accommodate \>90% of your Quantitative Ecology needs.

## Computers

You are encouraged to bring your own laptop and to install the necessary software before the module starts. Limited support can be provided if required.
There are also computers with R and RStudio (and the necessary add-on libraries) available in the 5th floor lab in the BCB Department.

## Attendance

This workshop-based, hands-on course can only deliver acceptable outcomes if you attend all classes. The schedule is set and cannot be changed.

Sometimes an occasional absence cannot be avoided. Please be courteous and notify me or the tutor in advance of any absence. If you work with a partner in class, notify them too. Keep up with the reading assignments while you are away and we will all work with you to get you back up to speed on what you miss. If you do miss a class, however, the assignments must still be submitted on time (also see [**Late submission of CA**](/quantecol/#late-submission-of-ca)).

Since you may decide to work in collaboration with a peer on tasks and assignments, please keep this person informed at all times in case some emergency makes you unavailable for a period of time. Someone might depend on your input and contributions---do not leave someone in the lurch so that they cannot complete a task in your absence.

## Assessment Policy {#sec-policy}

**Continuous Assessments** (CA) and a **Final Assessment** will provide a **Final Mark** for the module. They contribute equally to the final mark. These modes of assessment meet our needs as far as [formative and summative assessments](../pages/assessment_theory.qmd) are concerned. All assessments are open book, so consult your code and reading material if and when you need to.

### Continuous Assessment

The Continuous Assessment comprises:

- Task A1, weighted 0.2 towards the CA. Task A2 is assessed via the lecture presentation only.
- Lecture presentation, weighted 0.5 towards the CA.
- Tasks B to F, the average of which contributes 0.3 towards the CA.

For Tasks A1 and A2, please refer to the marking schedule as agreed on by the class.
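
The weighting described above can be written out as a small calculation. This is a sketch only: the function name `final_mark()` and the example marks are hypothetical, and it assumes all marks are expressed as percentages.

```r
# Hypothetical helper illustrating the weighting described in the text
final_mark <- function(task_a1, lecture, tasks_b_to_f, final_assessment) {
  # Continuous Assessment: Task A1 (0.2), lecture presentation (0.5),
  # and the average of Tasks B to F (0.3)
  ca <- 0.2 * task_a1 + 0.5 * lecture + 0.3 * mean(tasks_b_to_f)
  # CA and Final Assessment contribute equally to the Final Mark
  0.5 * ca + 0.5 * final_assessment
}

# Made-up marks (percentages), for illustration only
example_mark <- final_mark(
  task_a1 = 70,
  lecture = 80,
  tasks_b_to_f = c(60, 65, 70, 75, 80),
  final_assessment = 68
)
example_mark
```

With these made-up marks the CA works out to 75% (14 + 40 + 21), and the Final Mark to 71.5%.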

When assessing Tasks B to F, we will pay attention to the following criteria:

- Presentation and formatting (10%), including:
    - Questions answered in order
    - Sectioning
    - General appearance
    - References (if required)
- Code formatting (10%), e.g.:
    - Application of [R code conventions](http://adv-r.had.co.nz/Style.html), e.g. spaces around `<-`, after `#`, after `,`, etc.
    - New line for each `dplyr` function (lines end in `%>%`)
    - New line for each `ggplot` layer (lines end in `+`)
    - Tidiness of code presentation in the HTML
- Code correctness in the context of the specified analysis (25%):
    - The code must faithfully execute the intended analysis as required by the research questions and/or hypothesis tests
- Figures (10%):
    - Sensible use of themes and colours
    - Publication quality (complete axis titles and labels, etc.)
    - Informative and complete titles, axis labels, legends, etc.
- Discussion (45%):
    - Here you will be assessed for integrating the results of your analyses within the correct theoretical framework

### Task G: Final Assessment (Exam)

The Final Assessment starts after the Multiple Regression lecture on 4 July and you can do it in the comfort of your home. It will involve the analysis of real-world data and assess some more theoretical and philosophical aspects of Quantitative Ecology. A full mark breakdown is provided with [Task G](assessments/Task_G.qmd).

### Submission of Assignments and Exams

A statement such as the one below accompanies every assignment---pay attention, as failing to observe this instruction may result in a loss of marks (i.e. if an assignment remains ungraded because the owner of the material cannot be identified):

Submit an R script wherein you provide answers to Questions 1--9 by no later than 8:00 tomorrow. Label the script as follows (e.g.): **BCB743_AJ_Smit_Assignment_2.R**.
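
A submitted script that follows the naming instruction above and the code formatting criteria listed earlier might look something like the sketch below. The file name, author, question, and choice of data are placeholders, and the sketch assumes the **vegan**, **dplyr**, and **ggplot2** packages are installed.

```r
# BCB743_AJ_Smit_Assignment_2.R (placeholder file name, as in the example above)
# Author: AJ Smit (placeholder)

library(vegan)
library(dplyr)
library(ggplot2)

# Question 1 (hypothetical): mean species richness per management type
data(dune)
data(dune.env)

# One dplyr function per line, each line ending in %>%
richness <- dune.env %>%
  mutate(richness = specnumber(dune)) %>%
  group_by(Management) %>%
  summarise(mean_richness = mean(richness), .groups = "drop")

# One ggplot layer per line, each line ending in +
ggplot(data = richness, aes(x = Management, y = mean_richness)) +
  geom_col(fill = "grey40") +
  labs(
    x = "Management type",
    y = "Mean species richness",
    title = "Dune meadow species richness by management type"
  ) +
  theme_minimal()
```

Note the spaces around `<-`, after `#`, and after each `,`, and the complete axis labels and title on the figure.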

### Late Submission of CA

Late assignments will be penalised 10% per day and will not be accepted more than 48 hours late, unless evidence of a documented emergency, such as a doctor's note or a death certificate, can be provided. If you know in advance that a submission will be late, please discuss this and seek prior approval.

This policy is based on the idea that in order to learn how to translate your human thoughts into computer language (coding) you should be practising this multiple times each week---ideally daily. Time has been allocated in class for working on assignments and students are expected to continue to work on the assignments outside of class. Successfully completing (and passing) this module requires that you finish assignments based on what we have covered in class by the following class period. Work diligently from the outset so that even if something unexpected happens at the last minute you should already be close to done.

This approach also allows rapid feedback to be provided to you, which can only be accomplished by returning assignments quickly and punctually.

## Support

It's expected that some tricky aspects of the module will take time to master, and the best way to master problematic material is to practice, practice some more, and then to ask questions. Trying for 10 minutes and then giving up is not good enough. I'll be more sympathetic to your cause if you can demonstrate having tried for a full day before giving up and asking me. When you ask questions about some challenge, this is the way to do it: explain to me your numerous attempts at trying to solve the problem, and explain how these various attempts have failed. *I will not help you if you have not tried to help yourself first* (maybe with advice from friends). There will be time in class to do this, typically before we embark on a new topic. You are also encouraged to bring up related questions that arise in your own B.Sc. (Hons.) research project.

Should you require more time with me, find out when I am 'free' and set up an appointment by sending me a calendar invitation. I am happy to have a personal meeting with you via Zoom, but I prefer face-to-face in my office.

### Help Via BCB744 and BCB743 Issues on GitHub

All discussion for the BCB744 and BCB743 workshops will be held in the [Issues](https://github.com/ajsmit/R_courses/issues) of [this repository](https://github.com/ajsmit/R_courses). Please post all content-related questions there, and use email only for personal matters. Note that this is a public repository, so be professional in your writing here (grammar, etc.).

To start a new thread, create a **New issue**. Tag your peers using their handle---`@ajsmit`, for example---to get their attention.

Once a question has been answered, the issue will be closed, so lots of good answers might end up in closed issues. Don't forget to look there when looking for answers---you can use the **Search** feature on this repository to find answers that might have been offered for the same or a similar problem experienced by someone else in the past.

**Guidelines for Posting Questions:**

- First search existing issues (open or closed) for answers. If the question has already been answered, you're done! If there is an open issue, feel free to contribute to it. Or feel free to reopen a closed issue if you believe the answer is not satisfactory.
- Give your issue an informative title.
    - Good: `Error: could not find function "ggplot"`
    - Bad: "My code does not work!"

    Note that you can edit an issue's title after it's been posted.
- Format your questions nicely using markdown and code formatting. Preview your issue prior to posting.
- As I explained above, your peers and I will be more sympathetic to your cause if you can show *all the things you have tried as you, yourself, tried to fix the issue first*.
- Include code and example data so the person trying to help you has something to work with (and which results in the error, perhaps).
- Where appropriate, provide links to specific files, or even lines within them, in the body of your issue. This will help your peers understand your question. Note that only the teaching team will have access to private repos.
- (Optional) Tag someone or some group of people. Start by typing their GitHub username prefixed with the \@ symbol. Of course this supposes that each of you has a GitHub account and username.
- Hit **Submit new issue** when you're ready to post.

```{=html}  ```
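
One way to follow the guideline about including code and example data is to post a small, self-contained snippet that others can paste and run. The sketch below is only an illustration: the data frame `my_data` and the failing step are made up, and `dput()` is used to turn a few rows of data into text that can be copied into the issue.

```r
# A made-up subset of data that reproduces the problem
my_data <- data.frame(
  site = c("A", "B", "C"),
  sp1 = c(3, 0, 5),
  sp2 = c(1, 2, 0)
)

# dput() prints R code that recreates the object; paste its output in the issue
dput(my_data)

# The step that fails, wrapped in tryCatch() so the exact message can be copied
result <- tryCatch(
  colMeans(my_data), # fails because the 'site' column is not numeric
  error = function(e) conditionMessage(e)
)
result

# Also report the R version and the packages that are loaded
sessionInfo()
```

Keeping the example this small makes it much easier for a peer to see where the problem lies.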