User:Timothee Flutre/Notebook/Postdoc/2012/08/14: Difference between revisions

From OpenWetWare




* '''Start using git''': first [http://git-scm.com/downloads download] it, then read some of the [http://git-scm.com/doc documentation], and finally start using it. Typical use cases are developing software or writing an article.
* '''Documentation''':
** try it [http://try.github.io/levels/1/challenges/1 online]
** if you liked it, [http://git-scm.com/downloads download] it
** official [http://git-scm.com/doc book]
** [http://gitref.org/ quick ref], [http://www.ndpsoftware.com/git-cheatsheet.html cheatsheet]
** tutorial for [http://nyuccl.org/pages/GitTutorial/ scientists] (another by a [http://kbroman.github.io/github_tutorial/ geneticist])
** [http://gitready.com/ resources] depending on your level
** make your repositories freely available online via [https://github.com/ github] (see its [https://help.github.com/ help pages] too) or [https://bitbucket.org/ bitbucket]
** ask questions on [http://stackoverflow.com/ stackoverflow]
** manage your code, papers, talks, courses and even [http://www.wired.com/wiredenterprise/2013/06/cades-witty-headline-here/ books] with it!




* '''Need some help''': the learning curve for git is quite steep at the beginning, so it's always worth browsing [https://help.github.com/ help pages], reading [http://gitref.org/ Git Reference], and searching for questions and answers on [http://stackoverflow.com/ stackoverflow].
* '''Conflicts''': when updating one branch with the content of another one (<code>git checkout branch1; git merge branch2</code>), conflicts can happen, and it is often hard to know how to resolve them properly (but see a concrete example [http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging here]). In the following, branch1 can be master and branch2 can be origin/master, or branch1 can be master and branch2 can be dev.
** The first solution is to edit each conflicted file by hand, then run <code>git add fileX.txt</code> (staging tells git that the conflict is resolved) and finally run <code>git commit -m "merge branch2 and solve conflicts"</code>. No file name is given to this last command because git refuses a partial commit while a merge is in progress.
** The second solution is to ignore the conflicts and overwrite the files of branch1 with the content of branch2, one file at a time: <code>git checkout --patch branch2 fileX.txt</code> (the <code>--patch</code> option asks for confirmation hunk by hunk).
** The third solution, even more radical, is to "overwrite" all of branch1 with the content of branch2, all files at once: <code>git reset --hard branch2</code>. Any commit or uncommitted change that exists only on branch1 is discarded.
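The first solution can be tried safely in a throwaway repository. The following is only a sketch: the temporary directory, the identity given to <code>git config</code> and the file contents are made up for the illustration, and the manual edit of the conflicted file is simulated with <code>echo</code>.

```shell
#!/bin/sh
set -e
# Throwaway repository: every name and content below is illustrative only.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named "master"
git config user.name "Example Author"
git config user.email "author@example.org"

echo "original line" > fileX.txt
git add fileX.txt
git commit -q -m "first commit"

# Make both branches change the same line
git checkout -q -b branch2
echo "line from branch2" > fileX.txt
git commit -q -am "edit on branch2"
git checkout -q master
echo "line from master" > fileX.txt
git commit -q -am "edit on master"

# The merge stops with a non-zero status because of the conflict
git merge branch2 > /dev/null 2>&1 || true
git status --short   # the conflicted file is flagged "UU"

# Resolve by hand (simulated here), stage the file, then commit without a file name
echo "line agreed by both" > fileX.txt
git add fileX.txt
git commit -q -m "merge branch2 and solve conflicts"
git log --oneline --graph
```

The last command should show a merge commit with both branches as parents.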
 
 
* '''Tips''':
** undo uncommitted changes: <code>git checkout myfile.txt</code>
** split a big set of changes into several smaller commits by staging it hunk by hunk: <code>git add -p myfile.txt</code>
** usual config: <code>git config --global user.name 'Timothée Flutre'; git config --global user.email 'timflutre@gmail.com'; git config --global core.editor emacs; git config --global i18n.commitEncoding 'utf8'; git config --global i18n.logOutputEncoding 'utf8'</code>
** remote via ssh tunnel: first open the tunnel <code>ssh gateway.foo.bar -l tflutre -Nf -L 20400:maincluster:22</code>, then add the remote <code>git remote add mcl ssh://tflutre@localhost:20400/home/tflutre/myproject/.git</code>
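As a sketch of the first and third tips, the script below works in a throwaway repository. It assumes a made-up identity and sets it without <code>--global</code>, so that only this test repository is configured; the file content is also invented for the illustration.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# Same keys as in the tip above, but local to this repository (no --global)
git config user.name "Example Author"
git config user.email "author@example.org"
git config i18n.commitEncoding utf8

echo "version 1" > myfile.txt
git add myfile.txt
git commit -q -m "first commit"

# An edit we do not want to keep
echo "unwanted edit" >> myfile.txt
git diff --stat

# Undo the uncommitted change ("--" separates options from file names)
git checkout -- myfile.txt
cat myfile.txt
git config --get user.name
```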




* '''Writing a paper''': in this example, I am writing a paper with two colleagues. We decide to use a [http://git-scm.com/book/en/Distributed-Git-Distributed-Workflows centralized workflow], with the shared repository hosted on [https://github.com/ github].
** Setting up the infrastructure:
*** Each of us needs to [https://github.com/signup/free create a free account].
*** I need to upgrade my account in order to have the right to manage private repositories ([https://github.com/plans/ $7/month]).
*** I create a private repository named "paper" and add my colleagues to it as collaborators.
*** I retrieve the repository on my local machine: <code>git clone https://github.com/timflutre/paper.git</code> (the unauthenticated <code>git://</code> protocol cannot access a private repository).
*** I create my first file, for instance "paper_main.tex", and add it to git in my local repository: <code>git add paper_main.tex</code> followed by <code>git commit -m "first commit" paper_main.tex</code>.
*** I create one branch per collaborator (the default branch being "master"): <code>git branch tim</code>, then <code>git branch colleague1</code> and finally <code>git branch colleague2</code>. I can list the local branches with <code>git branch</code> and I can switch to my branch with <code>git checkout tim</code>, for instance.
*** I push the changes I made from my local repo onto github: <code>git push origin master</code>, and likewise for each branch I created.
*** I send an email to my colleagues telling them that they can retrieve the content of the repository from github onto their local machine(s): <code>git clone https://github.com/timflutre/paper.git</code>.
** Typical working cycle:
*** Each of us makes modifications on their own branch and pushes them to github so that the others can access the changes: <code>git push origin colleague1</code>, for instance.
*** From time to time, one of us has the responsibility to merge the changes and update the "master" branch with the latest version.
*** Once this is done, the others need to retrieve the new content of "master" in their local repo: <code>git checkout master</code>, <code>git fetch origin</code>, <code>git diff master origin/master</code>, <code>git merge origin/master</code>.
*** Then, they need to update their local branch with the new content of "master": <code>git checkout colleague1</code>, <code>git diff --name-status colleague1..master</code>. This lists the files that differ between their local branch and the new content of "master".
*** One can look at the differences file by file: <code>git diff --color-words colleague1:paper_main.tex master:paper_main.tex</code>. The option "--color-words" is especially useful with LaTeX, because it highlights the changed words rather than whole lines.
*** To merge the content of the recently updated local "master" into one's own local branch, run: <code>git merge master</code>.
** Tip: don't version the output PDF in the repository because, as it is a binary file, git can't merge it properly. Instead, add a Makefile (see below); entering <code>make main -i</code> on the command line then compiles the PDF document whenever you need it.
 
<nowiki>
all: main supp
 
main:
	latex paper_main.tex
	bibtex paper_main
	latex paper_main.tex
	latex paper_main.tex
	pdflatex paper_main
 
supp:
	latex paper_supplements.tex
	bibtex paper_supplements
	latex paper_supplements.tex
	latex paper_supplements.tex
	pdflatex paper_supplements
 
clean:
	rm -f *~ *.aux *.dvi *.log *.pdf *.bbl *.blg *.toc
</nowiki>
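The setup and the working cycle described above can be rehearsed locally before involving anyone else. This is only a sketch under several assumptions: a local bare repository named <code>shared.git</code> stands in for the repository hosted on github, the two clones stand in for two collaborators' machines, and the identities and file contents are made up.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A local bare repository stands in for the shared repository on github
git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/master

# Setup by the first author
git clone -q shared.git tim_clone 2> /dev/null   # hides the "empty repository" warning
cd tim_clone
git symbolic-ref HEAD refs/heads/master
git config user.name "Tim"
git config user.email "tim@example.org"
echo "Introduction, first draft" > paper_main.tex
git add paper_main.tex
git commit -q -m "first commit"
git branch tim
git branch colleague1
git push -q origin master tim colleague1

# A colleague clones the repository and works on their own branch
cd ..
git clone -q shared.git colleague1_clone
cd colleague1_clone
git config user.name "Colleague One"
git config user.email "colleague1@example.org"
git checkout -q colleague1
echo "Methods, by colleague1" >> paper_main.tex
git commit -q -am "add methods"
git push -q origin colleague1

# The first author integrates the colleague's work into "master"
cd ../tim_clone
git fetch -q origin
git merge -q origin/colleague1
echo "Supplementary notes" > paper_supplements.tex
git add paper_supplements.tex
git commit -q -m "add supplements"
git push -q origin master

# The colleague updates the local "master", then their own branch
cd ../colleague1_clone
git checkout -q master
git fetch -q origin
git diff --stat master origin/master
git merge -q origin/master
git checkout -q colleague1
git diff --name-status colleague1..master   # lists paper_supplements.tex as added
git merge -q master
```

All merges here are fast-forwards, so no conflict arises; conflicts only appear once two people edit the same lines.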
 
 
* '''Two remotes''': let's imagine that on cluster1 I have 2 branches, "master" and "dev", on github I only have "master", and I want to work with "dev" on cluster2.
** first I log on cluster2 and I clone the repo from github: <code>git clone https://github.com/timflutre/myproject.git</code>
** then I add my repo from cluster1 as a remote: <code>cd myproject/; git remote add cluster1 ssh://tflutre@cluster1:/home/tflutre/myproject/.git</code>
** finally I fetch the remotes and create a "dev" branch which tracks the one on cluster1: <code>git remote update; git checkout -b dev cluster1/dev</code>
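The same three steps can be checked without any cluster. In this sketch, local directories play the roles of cluster1 and github (an assumption made only so that the script runs anywhere); on real machines the remote URLs would be the https and ssh ones given above.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"

# "cluster1": a repository with two branches, "master" and "dev"
git init -q cluster1_repo
cd cluster1_repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.org"
echo "stable" > code.txt
git add code.txt
git commit -q -m "first commit"
git checkout -q -b dev
echo "experimental" >> code.txt
git commit -q -am "work in progress on dev"
git checkout -q master
cd ..

# "github": a bare repository that only receives "master"
git init -q --bare github.git
git --git-dir=github.git symbolic-ref HEAD refs/heads/master
(cd cluster1_repo && git push -q ../github.git master)

# "cluster2": clone from "github", then add "cluster1" as a second remote
git clone -q github.git myproject
cd myproject
git remote add cluster1 "$tmp/cluster1_repo"
git remote update > /dev/null 2>&1

# Create a local "dev" that tracks the branch from "cluster1"
git checkout -q -b dev cluster1/dev
git branch -vv
```

The output of <code>git branch -vv</code> should show "dev" tracking <code>cluster1/dev</code> and "master" tracking <code>origin/master</code>.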


<!-- ##### DO NOT edit below this line unless you know what you are doing. ##### -->

Revision as of 22:53, 8 November 2013


About Git

  • Motivation: nowadays, it is common to use a computer for a project in which one wants to keep a history of the changes, access them from different machines running different operating systems, share the work with someone else, etc. In such cases, it is very useful to use a distributed version control system such as Git.


