2025-03-01
All slides are available at:
https://umr-1202-biogeco.pages.mia.inra.fr/initiation_git
This presentation was prepared using Quarto, Git and GitLab.
The repository is available here:
https://forgemia.inra.fr/umr-1202-biogeco/initiation_git
The course was prepared by:
Today, we pretend that you finally received some data, and we are going to start the analyses for this part of your project.
Unix note
On Windows
On Windows, create a folder in “Documents”, for example, or on the “data” partition of your hard drive.
In this folder, you will organize your work in projects or subprojects.
Your first project is this training session.
Rmd, md and qmd are text formats that combine analyses and text, and therefore make it possible to:
Markdown makes it easy to generate reports
This presentation was written in Quarto Markdown, and the derived HTML was produced very simply with a Visual Studio Code add-on.
If your analysis requires a succession of scripts…
How do you know which version of each script was run last?
And if you have a doubt… you start again.
Important
Git lets you keep a single copy of each script while recording every change you make to your code!
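To make this concrete, here is a minimal sketch in a throwaway repository. The file name analysis.R, its content and the "Demo User" identity are placeholders chosen for illustration, not part of the course material.

```shell
# Throwaway repository in a temporary directory (nothing here touches your real projects).
demo=$(mktemp -d)
cd "$demo"
git init --quiet
# Placeholder identity, stored in this demo repository only.
git config user.name "Demo User"
git config user.email "demo@example.com"

# First version of a hypothetical analysis script.
echo 'x <- c(1, 2, 3)' > analysis.R
git add analysis.R
git commit --quiet -m "Add first version of analysis.R"

# Second version: same file name, no analysis_v2.R needed.
echo 'print(mean(x))' >> analysis.R
git commit --quiet -am "Print the mean of x"

# One file on disk, two recorded versions in the history.
ls
git log --oneline -- analysis.R
```

The folder still contains a single analysis.R, while git log lists one line per recorded change.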
Warning
Should already be done!
Where do we find it:
This will be the password you use to connect when you work from the terminal (i.e. when you perform actions such as “push” and “pull”).
Upload files from your computer
(creates a hidden “.git” folder)
Push your local changes to the remote location
Notes
Tip
Note that passwords do not appear when you type them in the terminal. You can copy and paste (in the terminal: ‘Ctrl + Shift + V’ or left click).
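The whole sequence, from an empty folder to a first push, can be sketched as follows. To keep the example runnable without a network connection or a password, a local bare repository stands in for the GitLab project; that stand-in, the folder names and the "Demo User" identity are assumptions made for this demo.

```shell
# Stand-in for the GitLab server: a local bare repository (an assumption for this demo).
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

# Your project folder, with one file in it.
mkdir "$work/initiation_git"
cd "$work/initiation_git"
echo "# Git initiation course" > README.md

git init --quiet                          # creates the hidden .git folder
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git add README.md                         # stage the file
git commit --quiet -m "First commit"      # record it locally
git branch -M main                        # name the branch "main"

# With GitLab, the URL of your own project goes here instead of the local path.
git remote add origin "$work/remote.git"
git push --quiet --set-upstream origin main

# Shows the current branch and the remote branch it follows.
git status --short --branch
```

After the push, the local branch and the remote branch point to the same commit, which is what "up to date" means in the output of git status.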
An important file in a repository is the “README” file.
We will create it now in our folder.
In the terminal type:
In the README.md file let’s write something of the form:
# Git initiation course in BIOGECO
## Contributors
Where you write who's working on this project.
## Description
This is a test repository to practice using version control.
## Content
Maybe a table of contents of this file if it is long.
## Requirements
What is needed to run these analyses in addition to what's in the repository?
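One way to write such a file without leaving the terminal is a here-document. This is only a sketch: it runs in a throwaway folder and writes a shortened version of the template above; any text editor works just as well.

```shell
# Throwaway folder for the demo; in practice, run this in your project folder.
cd "$(mktemp -d)"

# Everything between <<'EOF' and EOF is written to README.md as-is.
cat > README.md <<'EOF'
# Git initiation course in BIOGECO
## Contributors
Where you write who's working on this project.
## Description
This is a test repository to practice using version control.
EOF

# List the headings to check the result.
grep '^#' README.md
```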
Check it out online!
Look for .ssh/ in your home folder (it is a hidden directory; use ls -a to see it).
It’s not there? -> You don’t have keys.
/Users/bbrachi/.ssh/id_ed25519_gitinit
enter twice)
.ssh folder:
DO NOT SHARE YOUR PRIVATE SSH KEY!!!
Anyone who has it can authenticate to GitLab as you!
Copy the public key (the .pub file) to the clipboard:
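The key steps can be rehearsed without touching your real keys. The sketch below writes a key pair to a throwaway directory rather than to ~/.ssh, and the comment passed with -C is a placeholder; clipboard commands differ between systems, so it simply prints the public key for you to select and copy.

```shell
# Throwaway directory so the demo cannot overwrite keys in your real ~/.ssh folder.
keydir=$(mktemp -d)
keyfile="$keydir/id_ed25519_gitinit"

# -t: key type, -f: output file, -N "": empty passphrase (same as pressing enter twice),
# -C: free-text comment stored in the public key (placeholder here), -q: quiet.
ssh-keygen -q -t ed25519 -f "$keyfile" -N "" -C "demo@example.com"

# Two files: id_ed25519_gitinit (private, never shared) and id_ed25519_gitinit.pub (public).
ls "$keydir"

# The public key is a single line of text: this is what gets pasted into GitLab.
cat "$keyfile.pub"
```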
If it is the first time you connect, you will see:
The authenticity of host 'gitlab.example.com (35.231.145.151)' can't be established.
ECDSA key fingerprint is SHA256:HbW3g8zUjNSksFbqTiUWPWg2Bq1x8xdGUrliXFzSnUw.
Are you sure you want to continue connecting (yes/no)?
Type “yes” and press Enter.
It should say “Welcome to GitLab, @username!”
## Inserting an image:
![An image from the internet](https://vickysteeves.gitlab.io/repro-papers/img/final-doc.jpg)
Often, your project will mix data and analysis scripts. So we need to keep things tidy.
Mainly script files (and other small files - e.g. small images for this presentation) should be synchronised.
Large data files must be ignored and managed another way.
The “.gitignore” file helps you to do that!
# Create a .gitignore file so that Git ignores the .Rproj.user folder and all files ending with the following extensions:
printf ".Rproj.user\n*.Rproj\n*.mp4\n*.fastq\n*.xlsx\n*.docx\n" > .gitignore
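To check that the patterns do what you expect, Git can report which files it ignores. A minimal sketch in a throwaway repository, with two hypothetical files (reads.fastq and analysis.R) created for the occasion:

```shell
# Throwaway repository with the same .gitignore as above.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
printf ".Rproj.user\n*.Rproj\n*.mp4\n*.fastq\n*.xlsx\n*.docx\n" > .gitignore

# Hypothetical files: one data file and one script.
touch reads.fastq analysis.R

# check-ignore -v reports which line of .gitignore matches a given path.
git check-ignore -v reads.fastq

# Only .gitignore and analysis.R are listed as untracked; reads.fastq is left out.
git status --short
```

Note that .gitignore only affects files that are not tracked yet: a file that was already committed stays tracked until it is removed from the index.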
Note
What file size are we talking about?
On the importance of a tidy project structure!
A tidy project structure makes collaborative work easier! See the first slides of this presentation.
These are basically the things you need when you work alone on a project.
Now let’s see how to collaborate in a (very) small group (adviser/student)
Go to the left sidebar
Code > Repository graph
This is a representation of your repository’s history.
Notice how important the messages from your commits are!
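The same history can be displayed in the terminal. A minimal sketch, using a throwaway repository with two placeholder commits:

```shell
# Throwaway repository with two commits.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "# Demo" > README.md
git add README.md
git commit --quiet -m "Add README"
echo "More details" >> README.md
git commit --quiet -am "Describe the project in the README"

# Text version of the repository graph: one line per commit, all branches, with branch names.
git log --oneline --graph --all --decorate
```

Each line shows only the first line of a commit message, which is why short, informative messages pay off.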
We are all on the main branch of your repositories.
cd ../ ## to leave your own working repository
git clone git@forgemia.inra.fr:bbrachi/initiation_git_lduva.git
This will copy your supervisor's repository to your computer (this works only if you have set up your SSH key, which should be done at this stage!).
Files and branches in a collaborative project are subject to changes by your collaborators, or by yourself on another computer, for example.
Importance of always pulling before making any modification
Before you start working on a project, always pull the latest version to your laptop.
Supervisors: make a change to the Rmd file, then add, commit and push to the main branch (like we did before)
Students: in your local copy of your supervisor's repository, pull the latest changes:
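The exchange between supervisor and student can be simulated on a single computer. In this sketch a local bare repository stands in for the GitLab project, and the folder names supervisor and student, the file content and the "Demo User" identity are all placeholders.

```shell
# Placeholder identity used for every commit in this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
# Local bare repository standing in for the GitLab project.
git init --quiet --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# Supervisor: first commit, pushed to main.
git init --quiet supervisor
cd supervisor
echo "Version 1" > initiation_git.Rmd
git add initiation_git.Rmd
git commit --quiet -m "Add Rmd file"
git branch -M main
git remote add origin "$work/remote.git"
git push --quiet --set-upstream origin main
cd ..

# Student: clone the repository.
git clone --quiet remote.git student

# Supervisor: change the file, then add, commit and push.
( cd supervisor && echo "Version 2" >> initiation_git.Rmd \
    && git commit --quiet -am "Update Rmd file" && git push --quiet )

# Student: pull before touching anything, then look at the file.
cd student
git pull --quiet
cat initiation_git.Rmd
```

After the pull, the student's copy contains both lines, including the one the supervisor added after the clone.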
This command creates a new branch, called “student” and switches to it at the same time:
It should say:
This branch is local!
Connect it to a remote branch with:
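Both steps can be tried safely in a throwaway setup. The commands below are one common way to do it and may differ from the ones used on the slides; the local bare repository standing in for GitLab and the "Demo User" identity are assumptions made for this sketch.

```shell
# Placeholder identity used for every commit in this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
# Local bare repository standing in for the GitLab project.
git init --quiet --bare remote.git

# A working copy with one commit on main, already pushed.
git init --quiet student_copy
cd student_copy
echo "Version 1" > initiation_git.Rmd
git add initiation_git.Rmd
git commit --quiet -m "Add Rmd file"
git branch -M main
git remote add origin "$work/remote.git"
git push --quiet --set-upstream origin main

# Create the branch "student" and switch to it in one step.
git checkout -b student

# The branch exists only locally until it is pushed;
# --set-upstream links it to the remote branch origin/student.
git push --quiet --set-upstream origin student

# Lists local branches together with the remote branch each one follows.
git branch -vv
```

In recent versions of Git, git switch -c student is an equivalent way to create the branch and switch to it.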
initiation_git.Rmd file
Because the students are smarter, they also indicate in the README.md that the R packages ggplot2 and GGally are required for this project to work.
(close the Rmd and README files when you’re done)
Both students and supervisors add, commit and push their changes, and check that everything is up to date with git status.
Now go back to the browser to visualize the repository graph!
You should see the two branches with their respective commits.
Before the end of the internship, the supervisor needs to merge the student's work back into their own.
Supervisors, you will merge the student branch into the supervisor branch.
Run git status again to make sure everything is up to date on your branch.
And it spits out:
Auto-merging initiation_git.Rmd
CONFLICT (content): Merge conflict in initiation_git.Rmd
Automatic merge failed; fix conflicts and then commit the result.
This means there are conflicts.
The Rmd file should look something like this:
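A conflict of this kind can be reproduced, inspected and resolved in a throwaway repository. In this sketch the file content, the commit messages and the "Demo User" identity are placeholders; only the file name and the branch names follow the exercise.

```shell
# Placeholder identity used for every commit in this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init --quiet
echo "Title: analysis" > initiation_git.Rmd
git add initiation_git.Rmd
git commit --quiet -m "Add Rmd file"
git branch -M main

# The student edits the line on their own branch...
git checkout --quiet -b student
echo "Title: analysis by the student" > initiation_git.Rmd
git commit --quiet -am "Student edits the title"

# ...while the supervisor edits the same line on main.
git checkout --quiet main
echo "Title: analysis by the supervisor" > initiation_git.Rmd
git commit --quiet -am "Supervisor edits the title"

# The merge stops with a conflict (non-zero exit status, hence "|| true").
git merge student || true

# Git wrote both versions into the file, between conflict markers.
conflicted=$(cat initiation_git.Rmd)
echo "$conflicted"

# To resolve: keep the text you want, remove the markers, then add and commit.
echo "Title: analysis by the supervisor and the student" > initiation_git.Rmd
git add initiation_git.Rmd
git commit --quiet -m "Merge branch student and resolve the title conflict"
```

In the conflicted file, the lines between `<<<<<<< HEAD` and `=======` come from the branch you are on (main here), and the lines between `=======` and `>>>>>>> student` come from the branch being merged.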
Now look at the repository graph in the browser!