*Workshop material has been added in the abstract section below

For the first time, in 2017 the QCBS student network will hold a two-day symposium on the use of the R computing language in biodiversity science. The objective of this symposium is to present several advanced methods that are not covered during the regular series of QCBS R workshops. The symposium will take place at McGill's Gault Nature Reserve in Mont-Saint-Hilaire, Montérégie. This event will allow the community of R users from all QCBS geopoles to interact in a casual setting while also learning several current quantitative and computational methods in ecology. Housing (dorms), food, and transport from Montreal will be provided for all participants. The number of participants is limited to 32, based on the housing capacity of the cottages at the reserve. Participants will be chosen randomly once the registration period is over. Although the symposium is free, we ask for a $40 deposit, refundable at the event or upon cancellation at least two weeks before the event. The program for the current edition of the symposium is detailed below. Please note that the workshops will be held primarily in English.

Questions? Contact Vincent Fugère, Dalal Hanna, or Krista Oke.

April 24th

• 10h00: Departure from McGill
• 11h00: Arrival at Gault
• 12h00: Lunch
• 13h00: The Bayesian Biologist: You are probably more Bayesian than you think (Marc-Olivier Beausoleil & Max Farrell)
• 15h00: Coffee break
• 15h30: Intro to gene expression analysis in R (Sébastien Renaut)
• 18h00: Dinner

April 25th

• 8h00: Breakfast
• 9h00: Predicting species geographical distributions using R (Julia Nordlund & Pedro Henrique Pereira Braga)
• 10h30: Break
• 10h45: Joint modelling (Guillaume Blanchet)
• 12h45: Lunch
• 13h30: Open Science and Reproducibility in R (Monica Granados)
• 15h45: Departure from Gault

The Bayesian Biologist: You are probably more Bayesian than you think by Max Farrell & Marc-Olivier Beausoleil. Jump with us into the world of probabilities with a workshop on Bayesian inference. We are going to explore this statistical framework with simple and meaningful examples for biologists. We plan to guide you through some theory, history, and applications, and to get your hands dirty with some code. By the end of the workshop, you'll be convinced that Bayesian statistics is a super powerful framework for interpreting the world, and you'll get a taste of the ways you might implement it in your own research. Have an idea of something you want us to discuss? Fill out this survey: https://goo.gl/forms/By3aMFtNaxLJ2ICB2 or email Max or Marco. We would like to hear your ideas!

Intro to gene expression analysis in R by Sébastien Renaut. Next-generation sequencing has promised cheap DNA sequences to the masses. While this may be true, the bottleneck has now shifted from generating data to analyzing it. Here, I will use transcriptome sequencing data (RNAseq) to quantify gene expression. I will introduce data formats commonly used in genomics (e.g., .fastq, .bam, .sam), and I will use the R programming language to identify differentially expressed genes (e.g., with the DESeq2 and edgeR packages), cluster samples based on gene expression, detect gene ontology categories that are over- or under-represented (goseq), and present various graphics to illustrate results.

Predicting species geographical distributions using R by Pedro Henrique P. Braga and Julia Nordlund. Species distribution models (SDMs) have been widely applied to address many questions in biology, in domains such as ecology, evolution, biogeography and conservation. Applications are numerous and include projecting potential impacts of climate change, predicting species invasions, conservation planning, addressing questions of ecological niche evolution, and estimating potential disease spread. As species distribution models have grown in popularity, many methods and tools have been developed over the last decades. Most of these tools are now available within R packages. This course introduces fundamental concepts underpinning species distribution models, describes some of the most prominent methods currently in use, and discusses the strengths and limitations of these models for different applications.

Joint modelling by Guillaume Blanchet. Natural systems are complex and understanding them is a challenging task. In recent years, there has been an explosion in the amount of data gathered and made available that can potentially increase our knowledge of why and how species are distributed as they are. It is now possible to obtain highly precise environmental and habitat characteristics for large areas of the world, traits are available for a wealth of species, and high-quality phylogenies can be obtained for large groups of species. But how can we link these data together to better understand and predict the distribution of multiple species in a single model? In recent years, joint species distribution models (JSDMs) have emerged as an attractive way to approach such questions. In this workshop, I will show you how to construct JSDMs using Bayesian hierarchical models. I will also briefly discuss the concept behind hierarchical models and how they can be used in a community ecology context.

Open Science and Reproducibility in R by Monica Granados. Imagine if every paper you ever publish from now on could be reproduced by anyone around the world. Or imagine a platform that gives you the power to integrate new data seamlessly into a manuscript complete with text and figures. In this workshop, we will cover how to work in the open using R, R Markdown and GitHub. These three open platforms allow us to host data, analyze, visualize and produce a manuscript in one reproducible workflow. You will learn how to set up a repository in GitHub and manage branches, draw data from GitHub into R, write an R Markdown script for your manuscript, and upload the R Markdown script to GitHub for reproducibility. The advantages of open, reproducible science are many. When working collaboratively, reproducible workflows allow collaborators to contribute simultaneously to the project, with version control preserving the different iterations of the project. Working in the open also allows you to share your research more widely, facilitating collaborative opportunities. At the end of the workshop we will also discuss the wider movement of open science, how it is helping break down economic barriers in science and education, and how you can contribute.