During the bootcamp, we will use resources provided by the Institute for Computational and Data Sciences (ICDS). To access the resources provided by the ICDS-ACI, you need to follow these steps:
For incoming graduate students, activate your PSU student account
Enable two-factor authentication for your PSU account
Request an ICDS-ACI account. Use the research description: "2022 Data Reproducibility Bootcamp" and list PJ (ghp3) as your sponsoring account.
Please do this by the end of Wednesday since it can take a few days for the account request to process.
If you already have access to the cluster, none of this is required.
You will need Git installed and an account on GitHub for the section on version control.
Some systems come with Git already installed. Check from the command line by running git --version. If it is not installed, refer to install git for instructions.
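The check above can be wrapped in a small shell function so the same test works for any tool you need later. This is only a sketch: check_tool is a hypothetical helper name, not something shipped with Git or the bootcamp materials, and it assumes a POSIX-compatible shell.

```shell
# check_tool NAME
# Prints whether NAME is on PATH; returns non-zero if it is missing.
# (check_tool is a hypothetical helper, not part of Git or the bootcamp.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Report on Git and, if it is present, print its version.
if check_tool git; then
    git --version
fi
```

If the function reports that git is missing, follow the install instructions before the version control section.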
During the workflow section, you will use password authentication to communicate with GitHub. This is required to make the instructions the same for everyone.
You need several packages installed to work through the practice exercises during the bootcamp. To ensure consistent package versions, we recommend using conda. Please refer to conda installer for downloading and installing conda.
After installation, open a new terminal and run
$ conda install -c conda-forge mamba
to install mamba. mamba adds parallel functionality to conda, which significantly speeds up downloading and installing large bundles of packages.
If the above command finishes successfully, mamba --version should return the installed version.
Download the environment yaml file, then run the following command to create an environment from the description in the yaml file (replace the argument of -n with a name of your choice if needed):
$ mamba env create -f bootcamp.yaml -n bootcamp
If this succeeds, the following commands should print the installed version of snakemake:
$ conda activate bootcamp
$ snakemake --version
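If activation fails, it can help to confirm that the environment was created at all. The sketch below parses the output of conda env list, assuming the environment name appears in the first column of each non-comment line and that you kept the name bootcamp. env_exists is a hypothetical helper name.

```shell
# env_exists NAME
# Reads `conda env list` output on stdin and succeeds if an environment
# called NAME is listed. (env_exists is a hypothetical helper.)
env_exists() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Only query conda if it is actually installed.
if command -v conda >/dev/null 2>&1; then
    if conda env list | env_exists bootcamp; then
        echo "bootcamp environment found"
    else
        echo "bootcamp environment not found; re-run the mamba env create step"
    fi
fi
```

If you chose a different name with -n, pass that name to env_exists instead.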
If you run into any issues with the steps above, feel free to contact Jianyu Yang: jmy5455@psu.edu