During the boot camp, we will use computing resources provided by the Institute for Computational and Data Sciences (ICDS). To access the ICDS cluster, follow these steps:
For incoming graduate students, activate your PSU student account
Enable two-factor authentication for your PSU account. You need to enroll in both the Microsoft Multifactor Authenticator and the Duo 2-Factor Authenticator. For a complete list of help articles on Duo 2FA, please see this webpage. Please note that you will not be able to log in to the cluster without setting up Duo.
Request an ICDS-ACI account. Use the research description "2023 Data Reproducibility Bootcamp" and list Dr. David Koslicki (dmk333) as your sponsoring account. Under “ICDS Linux Clusters,” check “Roar Collab.”
Please do this by the end of Friday (2023/08/07) since it can take a few days for the account request to process.
If you already have access to the cluster, this is not required.
You will need Git installed and an account on GitHub for the section on version control.
Some systems come with Git pre-installed. You can check whether you already have Git by typing git --version at the command line. If you need to install Git, refer to install git for instructions.
In Session 3, you will use password authentication to communicate with GitHub. This is required to ensure that the instructions are reproducible by all participants.
If you run into any issues, feel free to contact Maxwell Konnaris: mak6930@psu.edu
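The Git check described above can be wrapped in a small shell function. This is only a sketch: check_tool is a hypothetical helper name introduced here for illustration, not something provided by Git or by the bootcamp materials.

```shell
# check_tool is a hypothetical helper (not part of Git or the bootcamp setup).
# It reports whether the command named in $1 can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
        return 0
    else
        echo "$1: not found"
        return 1
    fi
}

# Print the Git version only if Git is actually installed.
if check_tool git; then
    git --version
fi
```

If the function reports that Git is not found, install it before the version control session.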
You will need to download and install several “packages” to work interactively during the bootcamp. Conda is a commonly used open-source package and environment management system that allows you to install, run, and update packages while managing their dependencies on Windows, macOS, and Linux operating systems.
You can follow the steps below to set up your environment on the ACI server.
ssh <your user account>@submit.hpc.psu.edu ## Enter your PSU password when prompted. To log in without a password, see http://www.linuxproblem.org/art_9.html.
cd /storage/work/<your user account>
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh ## Follow the prompts when you run this line:
# When you see "Please, press ENTER to continue", press ENTER, then keep pressing the space bar to page through the license.
# At the question "Do you accept the license terms? [yes|no]", type "yes" and press ENTER.
# Press ENTER again and wait for the installation to finish.
# At the question "Do you wish the installer to initialize Miniconda3 by running conda init? [yes|no]", type "yes" and press ENTER.
mv ~/miniconda3 . # This step may take a while.
cd
ln -s /storage/work/<your user account>/miniconda3
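The mv and ln -s commands above move the Miniconda installation into the work directory and leave a symbolic link at the old location in the home directory, so paths that refer to ~/miniconda3 keep working. The sketch below reproduces that pattern in a throwaway directory so it can be tried safely. The function name relocate_with_symlink and the demo paths are hypothetical and stand in for $HOME and /storage/work/<your user account>.

```shell
# relocate_with_symlink is a hypothetical helper (not part of conda or the cluster tooling).
# It moves directory $1 into directory $2 and leaves a symlink at the old path.
relocate_with_symlink() {
    src=$1
    dest_dir=$2
    name=$(basename "$src")
    mv "$src" "$dest_dir/$name" || return 1
    ln -s "$dest_dir/$name" "$src"
}

# Demonstration in a temporary directory; nothing in your real home is touched.
demo=$(mktemp -d)
mkdir -p "$demo/home/miniconda3/bin" "$demo/work"
relocate_with_symlink "$demo/home/miniconda3" "$demo/work"

# The old path is now a symlink that points into the "work" directory.
ls -ld "$demo/home/miniconda3"
```

After trying it out, the temporary directory can be deleted with rm -rf "$demo".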
Install mamba (mamba provides parallel functionality for conda, which significantly speeds up downloading and installing large bundles of packages):
conda install -c conda-forge mamba ## When you run this line, you will see the question
# "Proceed ([y]/n)?"; enter "y" to proceed.
If the above command finishes successfully, mamba --version should return the installed version.
Create the bootcamp environment (replace the -n parameter below with a name of your choice if needed):
wget https://raw.githubusercontent.com/biostars/bootcamp-central/master/web/archives/2023/setup/bootcamp.yaml
mamba env create -f bootcamp.yaml -n bootcamp
Activate the environment and check that snakemake is installed:
conda activate bootcamp
snakemake --version
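If snakemake --version fails, a common cause is that the environment is not active in the current shell. The sketch below checks this through the CONDA_DEFAULT_ENV variable, which conda activate sets to the name of the active environment. The function name require_env is hypothetical, and the environment name bootcamp assumes you kept the default -n value.

```shell
# require_env is a hypothetical helper (not part of conda).
# It succeeds only if the conda environment named in $1 is currently active.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        echo "environment '$1' is active"
    else
        echo "environment '$1' is not active (current: ${CONDA_DEFAULT_ENV:-none})"
        return 1
    fi
}

# Only query snakemake once the expected environment is active.
if require_env bootcamp; then
    snakemake --version
fi
```

If the environment is reported as not active, run conda activate bootcamp and try again.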