[erlang-questions] how do you build your releases with/without rebar?

Sat Apr 12 15:39:46 CEST 2014

On 04/12, Loïc Hoguin wrote:
> When I read this and Fred's posts there's something that strikes me though.
> It sounds to me that the only way this can work in a team is if you misuse
> git. Git is made for non-linear development: that is, someone forks your
> repository, does changes, then sends a patch and you merge it, with one
> repository = one user. If you do non-linear development with the apps/*
> layout where you have to always make sure the whole code is synced, this
> makes it pretty hard to merge things, especially bigger changes. Basically
> the guy merging things would be doing nothing else than make sure it works.
> 

Well, first, not everyone will use git (we do), and the workflow of a
language shouldn't be so intricately tied to its version control tool
unless we desire the language practice to end the next time a new
version control tool supersedes whatever we're using right now.

> So I'm guessing you guys must be having more than one committer per
> repository, which comes with its own set of problems, but sounds more
> workable than my initial assumptions. But it also makes it more difficult to
> push things, because the number of things you must take care of increases,
> and every time something new gets added before you could push, you have to
> test again or risk breaking things.
> 

We do have multiple committers per repository on Heroku's routing team
-- along with a policy of code review and peer approval of changes so
that more than one person knows what is going on in a code base, both in
order to catch errors, and spread knowledge to reduce our bus factor.

Anyway, that's where communication is important. Smaller change sets can
be done by just merging branches the way you would regularly. Git or hg
or whatever should be clever enough to do 5 merges in 5 repos, or do one
merge of 5 files in different directories that have the same
independence in terms of what they contain.

The challenge here is coordinating changes and development efforts
through bugfixes, feature development, improvements, and so on. One way
to do it is to just merge back into master what is stable and ready to
be deployed. Then you test it again once it's merged to make sure
nothing exploded, and then cut a branch or a tag that is deployable from
there. The rest can keep being done concurrently through branches and
forks as much as you want.

In my opinion, the challenge is people and coordinating their efforts,
not the toolset used, although it can be a hindering or enhancing
factor.

> To take a well known example, look at what OTP does. Basically they pull
> patches, put them in a nightly test build and then depending on the results
> can decide whether to merge things. Now imagine how much faster it could be if
> when you send a patch to the ssl app you don't have to worry about anything
> else? It could be tested faster, you would get feedback faster and fix
> things faster and get it merged faster. Smaller iterations = faster
> development speed. Now OTP would still need to do nightly test of
> everything, but they would only need to do so when they bump the version of
> an included application.
> 

Right, for patches and fixes, that model is simpler and usually faster.
Even for big ones within one app, that's usually super flexible.
But when it comes to bigger coordination -- say the task at hand is
"improve httpd to have it support SNI", then you *need* to coordinate
changes across httpd, inets, public_key, ssl, etc. and this is where
the cognitive load can increase with multiple repositories.

So to me, the choice of how much split you need is a function of that
kind of workflow. If you have a tendency to provide libraries that can
be used independently, individual apps work very well. If you have a
tendency to do deeply integrated projects across your stack, with
possibly breaking changes, then moving more apps to a single repo
becomes a lot more interesting, and makes the entire change-tracking
simpler.

To some extent this is still doable when your tool (rebar or erlang.mk)
fetches git repos because you have the possibility to do everything from
the same set of subdirectories, but it's not nearly as nice for
stability.
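
To make the two layouts concrete, here is a rough sketch of what each
might look like in a rebar 2 config. The app names, repository URL, and
tag are hypothetical placeholders, not anything from our actual stack.

```erlang
%% rebar.config -- a sketch only; every name, URL, and version below
%% is a hypothetical placeholder.

%% Integrated layout: the apps live in this repository under apps/,
%% so one commit can change several of them at once.
{sub_dirs, ["apps/router", "apps/router_stats", "rel"]}.

%% Split layout: each library lives in its own repository and is
%% fetched into deps/. Pinning to a tag rather than a moving branch
%% keeps builds repeatable.
{deps, [
    {some_lib, ".*",
     {git, "git://example.com/some_lib.git", {tag, "1.0.0"}}}
]}.
```

With the first form a cross-app change is one commit and one review.
With the second, every dependency bump is its own step, and whatever
you edit under deps/ still has to be pushed and re-pinned separately.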

> 
> Of course in the case of your smaller "big" repositories where you only need
> to support one target OS and Erlang version you don't run into these issues.
> I suppose you are in a sweet spot where doing things the messy way doesn't
> stab you in the back too much and you can live with it. But add proper
> support for more combinations of OS and Erlang versions and you'll start
> suffering. (Or I suppose you could do it wrong again, use travis-ci and not
> care if your repository gets broken and just fix it afterwards, which also
> means you won't be able to use git bisect etc.)
> 
> tl;dr A rant that probably makes little sense to you.
> 
> But you know, whatever works.
> 

You're right that we tend to support far fewer versions and OSes than a
project set like Cowboy would. For us it's not just a question of the
messy way, but that we can break what we want whenever we want as long
as the behavior stays the same for users of the stack, for example.
Whatever is generic as a lib we tend to extract out and open source, but
business rules material -- stuff that is soft, squishy, and frequently
changing -- stays private code.

The rant makes sense to me, and I totally get the use case for split
repos. I wanted to sell the idea of integrated ones to show that it may
work better in some cases. Even other developers on Heroku's routing
team may disagree with me on that point, and that's entirely fine.

Regards,
Fred.