BazaarWithSvn
Using Bazaar with the svn repository
Intro
This document is written for BibleTime developers to help with bzr and bzr-svn. Outsiders may also find this document useful because it describes the basics of bzr and bzr-svn from a slightly different angle than what's in the bzr documentation. The first sections deal with bzr-svn and local, private workflows. The last section is about the public collaborating workflow which uses the sf.net bzr repository.
Read the following documents first if you have some time:
- Tutorials (not the launchpad one, but the others)
- Bazaar on Subversion Projects tutorial
- Bazaar User Guide - a bit lengthy, and not everything in it is necessary for starters
Some day we may move from svn to Bazaar, because a decentralized version control system is more flexible and powerful than the older centralized systems. This won't happen for a while, but meanwhile you can learn basic Bazaar usage by using it with our svn repository. We now also have a bzr repository for more open and experimental development, though the canonical code is still in svn. Learning isn't easy, but dvcs is the future of code management. Things will surely feel more complicated when you move from svn to bzr/hg/git, but everything has a price.
Bazaar is very versatile and supports different kinds of workflows. It's recommended to read http://doc.bazaar-vcs.org/bzr.2.0/en/user-guide/bazaar_workflows.html to get a first grasp. Our workflow can be "Centralized", "Centralized with local commits" or "Decentralized with shared mainline". I recommend the last one because it's the most powerful.
For more info about bzr svn integration, see the Bazaar on Subversion Projects tutorial. If you understand all of it, you already know enough about bzr and don't need this story :) For another workflow, slightly different from the one described here, see http://doc.bazaar-vcs.org/bzr.2.0/en/user-guide/svn_plugin.html#a-simple-example.
Install bazaar
Linux distributions may have the bazaar program distributed in several packages. In Ubuntu you have to install at least bzr and bzr-svn. bzr-tools and bzr-rebase may be useful.
There's an installer for Windows on the Bazaar download page.
This document requires bzr version >= 2.0. It hasn't been tested with older versions. Some older versions don't use the rich-root repository format by default, so it has to be requested explicitly with the init-repo command. Rich root is needed for bzr-svn.
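A quick way to check the version requirement is sketched below. The function name bzr_version_ok is made up for this page, and the usage line assumes that "bzr version --short" prints only the version number; check "bzr help version" on your installation if it doesn't.

```shell
# Succeeds if the given version string (e.g. "2.1.4") is at least 2.0.
# Only the major number is compared, because the requirement is ">= 2.0".
bzr_version_ok() {
    major="${1%%.*}"                # keep only the part before the first dot
    case "$major" in
        ''|*[!0-9]*) return 2 ;;    # not a plain number: cannot decide
    esac
    [ "$major" -ge 2 ]
}

# Usage (assumption: "bzr version --short" prints just the number):
#   bzr_version_ok "$(bzr version --short)" && echo "bzr is new enough"
```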
GUIs
Bzr has Qt frontends for several commands. You may have to install the qbzr package first. For example instead of "bzr log" call "bzr qlog". It's simple but good and powerful for basic needs.
Bzr has a new all-in-one GUI called Bazaar Explorer. It's packaged with the Windows installer, but it's so new that it's not in Ubuntu yet and must be installed from a PPA repo. See https://launchpad.net/bzr-explorer. I haven't yet tested all the use cases here with the Explorer. It doesn't support rebasing by default but other commands should be fine.
There exists TortoiseBzr for Windows. You're not a real Windows Open Source developer if you don't guess what it is :)
Give your personal information
bzr whoami "John Doe <john.doe@users.sourceforge.net>"
Give your name and the sf.net email or whatever you want to use. These are not used by the svn repo, but bzr clients see them. You should definitely use your sf.net email address if you later want to use remote bzr branches. This is a global setting for the bzr system on your machine. It can also be set for branches individually if needed.
With local commits and local feature branches
One possible workflow (and local repository layout) is to use local branches to which you commit locally and from which you push to the remote server through a local trunk. This is the same as "Decentralized with shared mainline".
- Create repository and download a branch - this local branch will be your trunk
- Create a feature branch (a branch where you develop one feature/fix/experimentation) by branching from your trunk
- Edit the project in a feature branch
- Update your trunk with the remote server
- Merge and commit changes from the feature branch to the trunk
- Push changes to the remote server
Checkout from the svn repository for the first time
bzr init-repo /path/to/your/bibletimebzrrepo/bt-bzr
cd /path/to/your/bibletimebzrrepo/bt-bzr
bzr branch https://bibletime.svn.sourceforge.net/svnroot/bibletime/trunk/bibletime trunk
where "bt-bzr" (or whatever you want to name it) is the directory where you want the bibletime source code branches to be installed. You don't put anything in bt-bzr manually but you create branches there. After branching the new branch is in its own directory, e.g. bt-bzr/trunk/.
The repository is now fetched from the sf.net server for the first time. This takes a long time because the whole history with all actions is downloaded to the local repository revision by revision. In the future the actions will be much faster.
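If you set up more than one machine, the first-time steps can be wrapped in a small function. This is only a sketch: setup_bt_repo and the BZR variable are names invented here, not part of bzr. Setting BZR="echo bzr" prints the commands instead of running them, which is handy for a dry run.

```shell
BZR="${BZR:-bzr}"   # override with BZR="echo bzr" to only print the commands

# Create a shared repository at $1 and branch the svn trunk into $1/trunk.
setup_bt_repo() {
    repo="$1"
    svn_trunk="https://bibletime.svn.sourceforge.net/svnroot/bibletime/trunk/bibletime"
    $BZR init-repo "$repo" || return 1
    # Run in a subshell so the caller's working directory is not changed.
    ( cd "$repo" && $BZR branch "$svn_trunk" trunk )
}

# Usage:
#   setup_bt_repo /path/to/your/bibletimebzrrepo/bt-bzr
```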
Bzr terms: branch and repository
Bzr works with branches. A "repository" isn't the same as a branch, and a repository isn't the same as a file system directory. In bzr a file system directory is a branch container. If a branch is "standalone" it doesn't have a repository other than within the branch itself. However, if you work with multiple branches, as recommended here, it's useful to create a "shared repository", which is actually a directory where branch directories are put. The branches in a shared repository share revision history, so it saves space. Otherwise each branch would have the whole repository within it, which could be inefficient.
Because the default way of bzr is to have branches in directories, you select a branch where you want to work by cd'ing to it:
bzr init-repo . # init-repo needs a location; "." is the current directory
bzr branch bzr://remote/branch mybranch # creates directory/branch "mybranch"
cd mybranch
bzr nick # shows "mybranch"
bzr branch . ../newbranch # uses the current branch "mybranch" as source
cd ../newbranch
bzr nick # shows "newbranch"
This is a bit different than in e.g. git, where a directory is a repository which holds branches. There's one working tree shared by branches. In bzr the usual way is to have one working tree per branch because branches are directories.
In svn there's only one central repository for a project and each developer has a local working tree with some metadata. In a dvcs each repository copy is a repository. A central repository is central only by convention or agreement.
When you branch (a verb) in bzr, you copy a branch. The copy may be a standalone branch or it may be in a shared repository. The sf.net bzr repository has one shared repository per sf.net project. It contains multiple branches.
Use local branches
This may be one of the most important enhancements over svn. You can create private branches very easily, for example for experimentation or for developing many independent features in parallel. It's a bit like having multiple local svn copies, but bzr branches are better. Bzr creates branches as directories, so you will have e.g. bt-bzr/trunk, bt-bzr/feature1/ etc.
cd bt-bzr/
bzr branch trunk featureX
cd featureX
[edit files...]
bzr branch ../trunk ../featureY #branch featureY from trunk
cd ../featureY
rm -r ../featureX #delete a branch directory (it's still in the shared repo, though)
Commit changes locally
Edit something in a feature branch, then commit:
bzr commit
Your default command line editor is opened with a list of modified files. Write the commit message on top of the file. Save and quit. The changes are now committed, but only locally. You can do as many commits as you want. If you rebase and push to the remote server, each commit will be one svn commit when you finally push them to the server. If you merge to trunk and then push it to the remote server, the commits will be one bunch inside one merge, and svn sees them as one single commit.
It's recommended to keep the trunk branch clean and work only in other branches called feature branches. Then the changes are merged to trunk, committed and pushed to the server.
Make sure you have all the latest svn revisions: rebase
cd ../trunk
bzr rebase
This fetches the latest revisions from the server, much like an svn update, and reapplies your local commits on top of them. The result is just as if you had updated first and then committed your own changes: your changes will appear on top of the log even if the svn repo was changed after you committed them locally. Rebase every time before you push (remote commit) or whenever you want to update from the server.
If you don't rebase before pushing bzr may complain that branches have diverged. Rebasing isn't a normal way to work with bzr (it's supported in a plugin), but it may be easier to work with svn this way. A native bzr repo doesn't work like a stack but svn does. (The native way is to "pull" or "merge" from the remote repo.)
If you use rebase and push, all your local commits will become separate commits in the svn repo. There's an alternative to rebasing, namely merging, which might be a better way to do it if you commit locally often.
(This needs verification: if you redirect a feature branch to the server, you can rebase the branch directly and push directly to the server, without the trunk branch - just like with multiple svn local copies. Or you can rebase directly from the server but merge changes to trunk, thereby getting rid of your private commit history before pushing to the server.)
Make sure you have all the latest svn revisions: pull
If you keep trunk clean (no local changes before checking out from the server), you can "pull" the changes from the server. It will fetch all new revisions from the server and add them to the revisions in trunk. After that you can make changes, commit them and push them to the server, provided no revisions were added to the server meanwhile. Both push and pull will fail if there are incompatible changes, so they are safe. You probably can't pull directly to a feature branch because it has changed.
cd ../trunk
bzr pull # give the https://... address once if bzr needs it
Merge changes to trunk
Now you have a clean, up-to-date trunk but changes in a feature branch. The next step is merging the changes to trunk. This is abhorred by svn users because merging is problematic there. Modern dvcs's are designed to make merges simple; merging is a basic action.
cd ../trunk #(if you're in another branch)
bzr merge ../featureX
bzr commit
This merges all the new commits from the feature branch in one large bunch to the trunk, so your private commit history from your feature branches won't be directly visible in the result. Actually bzr saves the merge history and it can be seen in the bzr log later, but svn won't know about it, it sees only one commit. If you want to get rid of all traces of your private working history, you have to use patching.
Remember that commits are local all the time. The next step is to commit (actually "push") them to the remote repository.
Dos and Don'ts
Here are some simple guidelines if you want to use the more complicated but also more powerful workflow of local feature branches (as opposed to the simpler but less powerful workflows with only one local branch/checkout). They are mostly about merging vs. pushing/pulling.
Dos:
- Keep a local trunk branch which is a clean copy of the remote head.
- Pull from remote to the local trunk to keep it up to date.
- Merge from local trunk to the local feature branches if you want to keep them up to date.
- Merge from local feature branches to the local trunk.
- Push to the remote head only from the local trunk, which has been updated with the remote repo just before the local commit.
Don'ts:
- Don't pull to trunk from feature branches or push to trunk from feature branches - it changes the history of the trunk and it won't be compatible with the remote any more.
- Don't pull or push to feature branches unless you want to override their whole contents (start from scratch).
- Don't try to merge directly to the remote svn repo.
- Don't try to push to the remote svn repo unless you have first pulled from there just before the local commit.
- If you use rebase, don't rebase a publicly released branch or a remote branch. Rebase your local private branches from them, not vice versa.
I understand this may sound complicated. Svn is simpler: you only have to know how to update your working copy, commit a bunch of changes to the repository and resolve conflicts. With bzr you have to understand push, pull, merge, update, rebase, branch... At least if you want to use the more powerful features of a dvcs. I guess most problems arise from a deficient understanding of these basic concepts, especially push/pull vs. merge.
For an explanation why you shouldn't push/pull between trunk and feature branches, see http://doc.bazaar-vcs.org/migration/en/foreign/bzr-on-svn-projects.html.
Bzr terms: revisions, push/pull, merge
It's crucial to understand what push/pull and merge do. Read the documentation with "bzr help push", "bzr help pull" and "bzr help merge". See also the FAQ question about the difference between pull and merge.
A versioned project has revisions and a history. They should be familiar from the svn/cvs world.
Pull copies revisions from one branch to another so that the destination ends up with a revision history identical to that of the source. Push does the same in the other direction. Usually you push from the local to the remote repo and pull from the remote to the local. You can't push/pull if the branches have diverged, i.e. if both have changed (leading to different histories) after the branching happened. Or to put it in more technical language (bzr help pull): "Branches are considered diverged if the destination branch's most recent commit is one that has not been merged (directly or indirectly) into the parent." This is a quite common pitfall for newcomers, probably because they think of the simple update/commit cycle of svn/cvs, where only the remote history has changed and in the local copy only the working tree, not the history, has changed. You have to merge (or possibly rebase) instead of pushing/pulling in that case.
My recommendation is to keep the local trunk identical with the remote by pulling from the remote to the local. Now you can commit your own changes to the local trunk. After that you can push them to the remote branch, because the branches have a common history and only the local one has changed after the latest common revision. After pushing, the remote and the local are identical again.
It's possible to push/pull even when the branches have diverged, but then you have to overwrite the history of the destination branch with --overwrite. It's useful if you want to reset your local branch, but you definitely don't want to do it while pushing to the remote svn repo.
Merge adds the changes made in one branch to another branch. However, it keeps the history of the destination. The histories of the two branches may be quite different, but still the final outcome is equal. Merge from branch A to branch B takes revisions from A which are not in B, creates one changeset out of them and applies it to the working tree (or to the latest state) of B. There may be conflicts which must be resolved. The merge can be reverted (cancelled). When there are no conflicts the merge can be committed. After committing there's one new revision, the last one in the history of B. It's marked as a merge in the log and the log item can be expanded to show the revisions of branch A which comprise the final revision. Even if branch A is annihilated from the face of the earth this history and those revisions will stay in branch B. One brilliant feature of merges in dvcs's is that you can merge back and forth between branches, even more than two branches.
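To find out which case you are in before choosing between push/pull and merge, "bzr missing" can list the revisions each side lacks. The wrapper below is a sketch; compare_with and the BZR variable are names invented for this page. If only one of the two lists is non-empty, pull or push is enough. If both are non-empty, the branches have diverged and you have to merge (or rebase).

```shell
BZR="${BZR:-bzr}"   # override with BZR="echo bzr" to only print the commands

# Report how the current branch differs from the branch at $1.
compare_with() {
    other="$1"
    echo "== revisions only in this branch =="
    $BZR missing --mine-only "$other"
    echo "== revisions only in $other =="
    $BZR missing --theirs-only "$other"
    return 0   # the listing itself is the result, not the exit status
}

# Usage, from inside a feature branch:
#   compare_with ../trunk
```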
Push changes to the server
bzr push https://bibletime.svn.sourceforge.net/svnroot/bibletime/trunk/bibletime
When you push for the first time you have to add the destination. Later bzr remembers it so you can use just:
bzr push
bzr now asks for your remote login. Give your sf.net user name and password.
That's it, the changes are now in the svn repository.
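The whole round trip (pull, merge, commit, push) can be collected into one function. This is a sketch under the directory layout used in this document (a shared repo with trunk and feature branches side by side). land_feature and the BZR variable are invented names, and the plain "bzr push" assumes you have already pushed once with the full address so that bzr remembers it.

```shell
BZR="${BZR:-bzr}"   # override with BZR="echo bzr" to only print the commands

# Merge the feature branch $2 into the trunk of the shared repository $1
# and push the result. $3 is the commit message for the merge.
land_feature() {
    repo="$1"; feature="$2"; message="$3"
    (
        cd "$repo/trunk" || exit 1
        $BZR pull || exit 1                    # make trunk identical to the remote head
        $BZR merge "../$feature" || exit 1     # stops here if there are conflicts
        $BZR commit -m "$message" || exit 1    # one local revision recording the merge
        $BZR push                              # uses the remembered push location
    )
}

# Usage:
#   land_feature /path/to/bt-bzr featureX "Merge featureX"
```

Each step stops the sequence if it fails, so nothing is pushed when the merge leaves conflicts.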
In the case of conflicts
Diverged branches
This is not a conflict in technical sense, but anyways, a developer may feel a great conflict when he sees that "branches have diverged" and he can't push a commit from the local trunk to the remote server. The reason is clear: somebody else has committed to the remote head after this developer has pulled from there. This is the reason why you should pull from the remote to the local trunk just before you merge a feature branch into the trunk. But sometimes diverging can't be avoided.
If the merge was easy, you can roll back the merging commit:
bzr uncommit # takes off the latest commit
bzr revert # clears the working tree
... and then pull from the remote, merge and commit again, and try to push again. There are also other options.
Here the "rebase" command may come in handy. Rebase your local trunk from the remote repo. The effect is the same as if you took away your latest commit, put it aside, pulled from the remote and then put your commit back.
(TODO: how to leave the merge pending while pulling again?)
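The uncommit-and-retry route can be scripted roughly as follows. This is a sketch: redo_merge and the BZR variable are invented names, and it assumes the merge was the latest commit in trunk and that the feature branch still exists. Nothing is lost by the uncommit because the feature branch still holds the revisions.

```shell
BZR="${BZR:-bzr}"   # override with BZR="echo bzr" to only print the commands

# Redo the merge of feature branch $2 on top of a freshly pulled trunk in
# the shared repository $1. $3 is the commit message for the new merge.
redo_merge() {
    repo="$1"; feature="$2"; message="$3"
    (
        cd "$repo/trunk" || exit 1
        $BZR uncommit || exit 1                # drop the merge commit (bzr asks for confirmation)
        $BZR revert || exit 1                  # clear the merged changes from the working tree
        $BZR pull || exit 1                    # trunk can follow the remote again
        $BZR merge "../$feature" || exit 1     # stops here if there are conflicts
        $BZR commit -m "$message" || exit 1
        $BZR push
    )
}
```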
The merge conflicts
A conflict means that development has gone in two directions which can't easily be reconciled. For example, the same code line in one file has been edited in two branches, and when those two versions are put together, the machine can't know which of the changes is "correct". In svn a conflict may happen when you try to update your local copy after editing your working tree. There's nothing new in bzr conflicts. They just usually happen when you try to merge changes from one branch to another.
- http://doc.bazaar-vcs.org/latest/en/user-guide/resolving_conflicts.html
- bzr help merge
- bzr help conflicts
- bzr help revert
Knowing what you're doing and what has happened
bzr log|less
bzr help|less
bzr help log|less
bzr info #info about the branch
bzr status #status of working tree
bzr diff #diff between the last commit and working tree
The log is much faster than in svn because everything is local. This alone, with the ability of diffing any revisions instantly, makes using bzr-svn worth it, at least if you want to compare revisions often.
Several branches, one working tree and working directory
Bzr can imitate a git-style layout where there are several branches but only one working tree, although it's clumsier than in git. The working tree will exist only under bt-bzr/work/. The branches have their own bt-bzr/*/ directories, but there are only .bzr/ directories under them. This way you can save some disk space. It also has the advantage of keeping one working tree for an IDE, so you can just reload the open files in the IDE instead of opening a new project for each branch.
The workflow is the same as with several branch directories and several working trees, but you switch between the branches instead of cd'ing.
Create the repo with no working trees:
bzr init-repo --no-trees bt-bzr
cd bt-bzr
bzr branch https://.../trunk/bibletime trunk
Create the checkout of trunk in which you will work:
bzr checkout --lightweight trunk work
cd work/
Now you don't have to leave the work/ directory; you can switch the working tree to correspond to any branch.
bzr branch ../trunk ../featureX
bzr switch ../featureX # The checkout points to featureX instead of trunk
bzr nick # find out which branch the working tree is bound to
bzr info # more information about the current branch
bzr switch ../trunk # copies the working tree changes automatically
bzr revert # discard the uncommitted changes
bzr switch ../featureX
[edit files...]
bzr commit
bzr switch ../trunk #the working tree is set to latest commit of trunk
bzr switch ../featureX #the working tree is set to latest commit of featureX
pwd # you'll notice you're still in bt-bzr/work/ even though you switch between branches
bzr switch ../trunk
bzr pull # trunk will be updated to the latest remote repository changes (revisions)
bzr merge ../featureX #merge changes made in featureX to trunk working tree
bzr commit #commit the merge locally in trunk creating a new revision
bzr push # pushes the committed revision into the remote server
rm -r ../featureX # get rid of the needless branch container
Committing directly to the remote server
Some people don't like to start committing locally and pushing to the remote after they have learned to commit directly to the repository with a traditional vcs. Bzr also supports svn-style "lightweight checkouts", but they lose much of the flexibility because everything is remote, including history. A lightweight checkout isn't recommended for our purposes unless you have some very special need.
Then there are normal checkouts, where history is local but commits go directly to the remote server. This is a pretty good compromise if you want to continue with svn habits but want to have local history with fast log and diffs between revisions. However, once you learn to use branches and local commits, you probably won't want to go back to this.
bzr init-repo bt-bzr
cd bt-bzr
bzr checkout [--lightweight] https://.../trunk/bibletime
cd bibletime # the checkout directory is named after the last part of the URL
bzr update #update the local checkout from the remote server
[edit files...]
bzr update
bzr log #ready within some seconds - no remote fetching!
bzr commit # commit directly to the remote repository
Using the SF.net bzr repository as a public repo
Your private workflow
SourceForge support for bzr is lousy. It doesn't support advanced repository layouts. The documentation isn't up to date. I had to tweak the repo manually to get it to work with repos which use bzr-svn. The documentation which said to push to the repository address seemed to be wrong. The web interface doesn't seem to work, at least not in my experiments. The developers can't collaborate in one branch because a branch is owned by its creator and the others won't have write permissions for it; each developer has to use his own branches, from which others can pull or merge to their branches. However, the repo can now be used for our code.
You first should have your own feature branch originating from our svn repository - see above. Now you can push it to the sf.net server:
bzr push bzr+ssh://USER@bibletime.bzr.sourceforge.net/bzrroot/bibletime/BRANCHNAME
This requires you to have commit access to the repo. Admins can give it to you. This command pushes your branch to the remote repo.
Now (or whenever you want to) you can see if it worked:
bzr branches bzr://bibletime.bzr.sourceforge.net/bzrroot/bibletime #VERY SLOW!!!
This shows all the branch names in the repo and you should see your new branch there. Remember that you can't delete a remote branch once it's created, so be careful and conservative when creating and naming remote branches. Maybe you can use only a couple of remote branches, e.g. USER-experimental and USER-smallfixes, and overwrite the old data in them when you want to start from scratch. For the sake of clarity we should probably name branches using our user names, e.g. "eelik-feature".
So, you have a remote branch. You can follow any workflow you want with it. You can use a local lightweight checkout, a heavyweight checkout or an unbound local branch. If you commit locally and push to the remote, you can push only seldom or after each commit. If you push continually or use a checkout, the other developers can follow in real time what you're doing.
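Because remote branches can't be deleted, it may help to build the branch address from the user name in one place so that the USER-TOPIC naming stays consistent. The helper names below (remote_branch_url, publish_branch) and the BZR variable are invented for this sketch. The address pattern is the one from the push command above.

```shell
BZR="${BZR:-bzr}"   # override with BZR="echo bzr" to only print the commands

# Print the sf.net address of the branch named USER-TOPIC.
remote_branch_url() {
    user="$1"; topic="$2"
    echo "bzr+ssh://${user}@bibletime.bzr.sourceforge.net/bzrroot/bibletime/${user}-${topic}"
}

# Push the current branch to its USER-TOPIC location on sf.net.
publish_branch() {
    $BZR push "$(remote_branch_url "$1" "$2")"
}

# Usage, from inside the local feature branch:
#   publish_branch eelik feature
```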
Keeping your development branch up to date with trunk
It's good to merge trunk into your branch every now and then, or at least before you publish your branch or introduce large changes in it. There are several reasons for that. The first is familiar from the svn world: it's easier to handle conflicts if they are small. Eventually you have to resolve conflicts between your development code and the main repository head anyway. Here is the recommended way:
- Keep your trunk up to date with the remote head, as described above.
- Develop in a local branch.
- Every now and then merge trunk into the development branch.
- Never push or pull into trunk from the development branch. See http://doc.bazaar.canonical.com/migration/en/foreign/bzr-on-svn-projects.html#merging-trunk-to-your-feature-branch to understand why. (It erroneously states that merging trunk into a feature branch should be avoided, but it's pushing/pulling which should be avoided.)
- Don't rebase a published branch.
This is actually a good way to do it even if you use bzr only privately and make large changes in the development branch over time. Then you don't have to resolve a large amount of conflicts when you merge your branch back into trunk. But there is also another reason if you collaborate with other developers via bzr branches.
In the next section I describe the public collaboration phase. There may be one difficulty if you publish your branch before merging an up-to-date trunk into it. Namely, if the other developers want to keep up with both the development head and your development, they have to merge the head into their copy of your branch themselves. If there are conflicts, they may find them difficult to resolve because they don't necessarily know anything about your changes or the changes in trunk. Even the merging alone means some extra work. Multiply it by the number of developers and you can see why you should do the merging. This is of course much more important if there are conflicts.
If you like rebasing, be careful when you publish your branch. The history of a published branch should never be altered, and rebasing alters history. Rebasing your development branch on the trunk is OK before publishing the branch but not after publishing. If you publish a branch, others merge from it, and you then change its history while keeping it public, problems will occur later. It's of course possible that the others don't merge from your branch at all - then rebasing doesn't matter. But then you all have to agree not to merge from your branch; it's public only so that the others can pull and test it but not integrate your changes into their own branches. This prevents a fully distributed workflow, but for our purposes a more centralized one should be enough.
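The recommended routine from the list above (pull into trunk, then merge trunk into the development branch) is sketched below. refresh_feature and the BZR variable are invented names, and the layout is the shared repo with trunk and the development branch side by side. Note the direction: trunk is merged into the branch, never pushed or pulled.

```shell
BZR="${BZR:-bzr}"   # override with BZR="echo bzr" to only print the commands

# Bring trunk up to date, then merge it into the development branch $2.
# $1 is the shared repository directory.
refresh_feature() {
    repo="$1"; feature="$2"
    ( cd "$repo/trunk" && $BZR pull ) || return 1
    (
        cd "$repo/$feature" || exit 1
        $BZR merge ../trunk || exit 1    # stops here if there are conflicts
        # commit complains if the merge brought in nothing new; that is harmless
        $BZR commit -m "Merge trunk"
    )
}

# Usage:
#   refresh_feature /path/to/bt-bzr featureX
```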
Public collaboration
Whatever you do privately, you should push the final result into the remote branch in the end and announce it. Now other developers can check out your branch:
bzr branch bzr://bibletime.bzr.sourceforge.net/bzrroot/bibletime/BRANCH # or...
bzr pull --overwrite bzr://bibletime.bzr.sourceforge.net/bzrroot/bibletime/BRANCH
# (if they have a local branch already which they can override with yours), or even...
bzr merge bzr://bibletime.bzr.sourceforge.net/bzrroot/bibletime/BRANCH
Now it's time to communicate and edit the code and then update your remote branch, and communicate again etc. When you're happy with your branch, it's time to get the changes into the svn repository. First you have to have your local feature branch ready. Then you have to merge it to the trunk, which in turn should be up to date with the svn repo. After you have merged and committed to trunk, push it to the svn repo.
This public phase is actually the most important. The quality of the code won't be any better if people announce their branches, nobody checks them and people commit the branches to the svn repo after a moment of silence. Other developers must review the code. "Peer review", "many eyes" etc. are empty words if nobody does it. For some reason many Open Source developers think that time spent on anything else than coding is lost time. This is simply not true. Anyone who has studied software quality in a university/college/institute knows that quality doesn't come by coding. Peer review may be one of the most effective ways to ensure code quality, but it means that we have to review.