Bootcamp

INFORMATION

Welcome to the PSU Bioinformatics Data Reproducibility Bootcamp!

Important Resources:

Location and General Instructions

The boot camp will be held at the Chemical and Biomedical Engineering Building, Room 001.

If you would like to join us a little earlier, breakfast will be served from 9:00 to 9:30 AM on the third floor of the Huck Institutes for the Life Sciences, at the Life Sciences Bridge. Lunch will be provided at the same venue.

Please don't forget to register on the sign-up sheet when you arrive.

If you have any trouble finding your way around campus, please e-mail Venitha at vab5299@psu.edu.

Instructors

Organizers

Funding and Support

Computational resources are provided by the Institute for Computational and Data Sciences (ICDS).

The boot camp was first conceived in June 2016 with support from the Administrative Supplement to NIGMS Predoctoral Training Grants (PA-15-136).

SCHEDULE

Friday, August 11, 2023

09:00 AM - 09:30 AM - Breakfast
09:30 AM - 09:45 AM - Welcome and Overview of the Day
09:45 AM - 10:45 AM - Session 1: Introduction to Data Reproducibility
10:45 AM - 11:00 AM - Break
11:00 AM - 12:00 PM - Session 2: Navigating the PSU Compute Cluster and Resources (Part 1)
12:00 PM - 01:00 PM - Lunch
01:00 PM - 01:30 PM - Session 2: Navigating the PSU Compute Cluster and Resources (Part 2)
01:30 PM - 02:30 PM - Session 3: Bioinformatics Data Overview and Metagenomics Snakemake-based Workflow (Part 1)
02:30 PM - 02:45 PM - Break
02:45 PM - 04:00 PM - Session 3: Bioinformatics Data Overview and Metagenomics Snakemake-based Workflow (Part 2)

SESSION 1

Session 1: Introduction to Data Reproducibility

Instructor: Juliana Simas

In this session, we will discuss the reproducibility crisis in science and go over materials that help make research reproducible.

We will look at StackEdit (an online Markdown-based tool for keeping a lab notebook), learn how to use Git and GitHub for version control, and use Galaxy for bioinformatics analysis.

Slides

SESSION 2

Session 2: Navigating the PSU Compute Cluster and Resources

Instructors: Maxwell Konnaris & Venitha Bernard

This session will introduce some basic Linux commands and show how to use the computing cluster at PSU, how to set up a conda environment, and how to write code in Jupyter notebooks.

Contents

SESSION 3

Session 3: Bioinformatics Resources, Snakemake, Pytest

Instructors: Chunyu Ma & Shaopeng Liu

In this session, we will go through common bioinformatics resources and show examples of some useful tools (e.g. GNU Parallel). We will also use an ATAC-seq analysis pipeline as a basis for introducing Snakemake, a popular workflow management tool. Finally, we will briefly introduce how to use Pytest in Python.


2023 PSU Bioinformatics Data Reproducibility Bootcamp