[OT] Re: revision control [was: Re: autoconf and erlang]

Thu May 13 10:36:48 CEST 2004

On May 13, 2004, at 0:10, Vlad Dumitrescu wrote:

> From a technical point of view, I guess you are right. I did look at arch
> and darcs, but I have a few problems with them: the most important is that
> arch isn't available for Windows yet and when it will be, I still have a
> problem with command line only tools ;-) Darcs is still beta and last time
> I checked it wasn't fully stable (but still usable).

	GUI tools will come.  Some of the developers have what sounds like pretty 
decent integration into emacs.  I've got enough integration into vim 
for myself.  I use perforce at work and can't begin to comprehend the 
GUIs that thing ships with.  The command line works OK for me, though.  
:)  I do use a web GUI tool, though.

> I wouldn't underestimate the power of habit. Arch and darcs have very
> different approaches to the "mainstream" CVS setup, so the learning curve
> is steep. I'd probably have to be in a particularly good mood to decide to
> embark on learning them - especially as (as you say) my needs are simple
> and don't need the full power of arch.

	tla update
	tla commit

	More stuff is there as you need it.

> As for the distributed repository, that is a big plus for arch/darcs. I'm
> not sure how it works in practice, i.e. how do you find the most recent
> version among many repositories which aren't all online at the same time,
> but since the system is in use, it probably works despite my lack of
> understanding :-)

	It sounds like you're picturing what I like to call a ``distributed 
single point of failure'': multiple single points, all of which must be 
available in order for anything to work.  It works quite the other way 
around.

	There is one ``official'' head of line for tla 1.3.  It is developed 
on one of Tom Lord's computers behind his dialup connection to the 
internet.  His changes are replicated to at least one mirror very 
regularly (I'm not sure how he does it, but I've got a ``hook'' that 
mirrors every checkin I do to one machine, and cron jobs that send them 
to various other places).  Lots of other people mirror his mirrors and 
branch the code on their own trees for their own work.  Those people 
make their trees available to others, who perform integration testing, 
etc... and then the integration trees get merged back into the head of 
line tree and eventually make it into a release.

	The changesets that get merged in contain all of the history 
information that made up the change (multiple checkins from various 
branches, etc...) and arch is smart enough not to apply the same 
changeset twice.
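
	To make the ``not applying the same changeset twice'' part concrete, 
here is a minimal Python sketch.  It is not how tla is implemented; the 
Tree class, the changeset ids and the idea of a changeset as a list of 
added lines are all made up for illustration.  The only point is that a 
tree which records the ids it has already merged can safely be offered 
the same changesets again.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Hypothetical working tree that remembers which changesets it contains."""
    lines: list = field(default_factory=list)
    applied: set = field(default_factory=set)  # ids of changesets already merged

    def merge(self, changesets):
        """Apply each (changeset_id, added_lines) pair at most once."""
        newly_applied = []
        for changeset_id, added_lines in changesets:
            if changeset_id in self.applied:
                continue  # already part of our history, so skip it
            self.lines.extend(added_lines)
            self.applied.add(changeset_id)
            newly_applied.append(changeset_id)
        return newly_applied


# Illustrative ids only; real arch revision names look different.
head = Tree()
branch = [("alice--patch-1", ["fix typo"]), ("bob--patch-1", ["add test"])]
first = head.merge(branch)
second = head.merge(branch + [("bob--patch-2", ["add docs"])])
```

	The second merge is offered all three changesets but only picks up 
the one it has not seen, which is what lets people merge back and forth 
between branches without bookkeeping by hand.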

	In this model, everyone who wants one gets a full copy of anyone 
else's work, and that copy stays available even when the original source 
is not.

> Having the repository as a plain directory tree is of no use for me - 
> why should
> I ever need to hack it? It could as well be opaque.

	It's not so much the ability to hack it; it's more about confidence.  
I am more confident in my ability to back up and restore little files 
that each represent a checkin, and that I can read with everyday tools, 
than one giant file containing everything that can only be read by 
the application that created it.

	It also means you don't need a server of any kind.  tla out of the box 
supports any regular filesystem, webdav, sftp, ftp, and plain http 
(read-only).
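
	As a rough illustration of why a tree of small, never-modified files 
is easy to mirror without a server, here is a Python sketch.  It assumes 
nothing about tla's real archive layout: the directory names and the 
mirror() helper are hypothetical.  Because existing files never change, 
a mirror only has to copy what it does not have yet.

```python
import shutil
import tempfile
from pathlib import Path


def mirror(source: Path, dest: Path) -> list:
    """Copy files present in source but missing from dest; never overwrite."""
    copied = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        target = dest / rel
        if target.exists():
            continue  # checkins are immutable, so an existing copy is final
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, target)
        copied.append(rel.as_posix())
    return copied


# Build a toy "archive" with one small file per checkin (made-up layout).
with tempfile.TemporaryDirectory() as tmp:
    archive, backup = Path(tmp, "archive"), Path(tmp, "backup")
    for name in ("patch-1", "patch-2"):
        (archive / "proj" / name).mkdir(parents=True)
        (archive / "proj" / name / "log").write_text(f"summary of {name}\n")
    first_run = mirror(archive, backup)   # copies both checkins
    second_run = mirror(archive, backup)  # nothing new, so nothing is copied
```

	Running something like this from a hook or a cron job is all a 
mirror really needs; the same idea works over any transport that can 
list and fetch files.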

	However, I have hacked some of my repositories in the past.  They're 
immutable, so the tools will never update the data, which is a very 
good thing, but I wanted to change some info in a tree that I imported 
from CVS (a few thousand checkins).  A little python and shell and 
all's well.

>>  It's possible to branch a project, track head of line, make
>> changes and send them back without even having an internet connection.
>
> Erm, how do you send an email without a connection? ;-)

	For a long time, I moved email around via UUCP.  In fact, when I first 
moved to Silicon Valley (not very long ago), the only ``connectivity'' 
I had in my hotel room was my nightly UUCP call.  So, sure, there was a 
gateway somewhere that had IP, but I didn't have IP to the hotel (now, 
between the SGI and the Sun, yeah, but that doesn't count).

> All in all, in my opinion the big problem is that SourceForge has these
> scalability problems and isn't reliable anymore.

	I agree.  It's my opinion that SF is having scalability problems 
because it is a popular centralized service.  Unpopular centralized 
services and popular decentralized services both hold up fairly well 
(although in the unpopular case, people will sometimes turn them off if 
they get too unpopular).

	For example, arch is ``hosted'' on Savannah.  When Savannah broke, 
most of the services went away.  There were weeks where the mailing 
lists weren't even working.  Development continued however.  There was 
no central server a developer needed to contact in order to perform a 
checkin, or see the latest code.  The only requirement for sharing code 
was the ability to get the code to other people (which wasn't a problem 
with the number of people mirroring the archives).

	I had projects hosted on sourceforge during a couple of CVS outages.  
Work simply stopped.  It's rare that I only use one computer during the 
day, and I'm generally uncomfortable doing work and not being able to 
check it in.  The last time I had any sort of trouble with sourceforge 
I took all of my code out of their CVS and stuck it in arch, made a few 
mirrors, and never looked back.

--
SPY                      My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@REDACTED>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________