[erlang-questions] upgrade path from R14B02 to OTP 17.5

Tanguay, Luc (A584369) luc.tanguay@REDACTED
Tue Jun 16 21:27:48 CEST 2015

Thanks Garrett.

With the number of client nodes involved, I will have to test with two Erlang installations on one machine (server or client).  I think this is easy to do with Erlang/OTP: the only system-wide setting that I know of is the PATH environment variable. Am I right?
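
When two installations share a machine, it can help to have each node report which one it is actually running from. A minimal sketch, assuming nothing about the application itself: the module name `which_erlang` and the function `report/0` are hypothetical, while the calls inside it are standard OTP functions.

```erlang
-module(which_erlang).
-export([report/0]).

%% Returns a property list describing the running Erlang installation.
%% The module and function names are made up for illustration.
report() ->
    [%% OTP release string of the running emulator
     {otp_release, erlang:system_info(otp_release)},
     %% ERTS version string of the running emulator
     {erts_version, erlang:system_info(version)},
     %% Root directory of the installation this node was started from
     {root_dir, code:root_dir()},
     %% Full path of the "erl" found through PATH, or false if none is found
     {erl_on_path, os:find_executable("erl")}].
```

Calling `which_erlang:report()` in a shell started from each installation shows whether `root_dir` and the `erl` found on PATH point at the same place.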

The current system generates 2174 files every night, for a total size of 21.8 GB, and distributes them to 19 client systems (at one point we had 36 clients).  A complete cycle runs for 5 hours.  It will be difficult to produce this load in parallel while testing the new system.  But there are already some hooks in the Erlang application that I put in place when I first developed it: for example, distributing the files and deleting each one as soon as the client has received and validated it, to preserve precious disk space on the client, or sending a file that triggers a TCP timeout or TCP transmission error.

Bottom line: plan, test, test and re-test.  And test the rollback procedure...


Luc Tanguay, ing./P. Eng.
Bell Canada
cell: 514-229-7585

-----Original Message-----
From: Garrett Smith [mailto:g@REDACTED] 
Sent: June-16-15 04:56
To: Tanguay, Luc (A584369)
Cc: erlang-questions@REDACTED
Subject: Re: [erlang-questions] upgrade path from R14B02 to OTP 17.5

Hi Luc,

I'd like to point out that today I am top posting, because some topics should never die.

I would strongly suggest that you follow Fred's advice and set up a second production server - not necessarily a separate machine/VM, but a separate version of your app, in production. Then route a controlled percentage of traffic to the new server and gradually increase that over time as your confidence soars!

I honestly can't see this going well for you if you don't, short of great luck or moments of sheer terror as things mysteriously break and you scramble to revert to the old platform. Don't assume that reverting will go smoothly - many famous outages occurred when a backup procedure failed. I personally would just refuse to make such a large change in production without a real mitigation strategy.

This is a fancier version of blue/green where you run two independent services in parallel and gradually shift traffic/load to another server, or back, as you carefully watch error rates.

This could require some application code change, or simply the addition of a simple-as-possible router that lives in the app that directs requests to one of the available back end apps. You should also invest in some metrics - in particular error counts and user-facing latencies (e.g. response times).
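
A minimal sketch of what such a router decision could look like, assuming each request carries some key (for example a client node name) and that the share of traffic for the new release is a whole percentage. The module `canary_router`, the function `pick_backend/2` and the atoms `old_backend` and `new_backend` are hypothetical names. The sketch hashes the key with `erlang:phash2/2` instead of picking at random, so a given client always lands on the same side until the percentage changes.

```erlang
-module(canary_router).
-export([pick_backend/2]).

%% Key:        any term identifying the request (hypothetical, e.g. a client node name).
%% NewPercent: integer 0..100, the share of keys sent to the new release.
%% Returns the atom new_backend or old_backend (illustrative names only).
pick_backend(Key, NewPercent) when is_integer(NewPercent),
                                   NewPercent >= 0,
                                   NewPercent =< 100 ->
    %% erlang:phash2(Term, Range) returns an integer in 0..Range-1,
    %% so every key falls into one of 100 buckets.
    Bucket = erlang:phash2(Key, 100),
    case Bucket < NewPercent of
        true  -> new_backend;
        false -> old_backend
    end.
```

With `NewPercent` at 0 everything stays on the old release, and at 100 everything moves to the new one. Raising the value in small steps while watching the error counts is the gradual shift described above, and lowering it again is the way back.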

Again, you can run these on a single machine. The point here isn't availability, but risk mitigation of large changes. Of course you can distribute these apps on multiple servers if you want, but I wouldn't jump into that before you see the rolling release process working smoothly on your own machine with two Erlang VMs running locally.

You might get some pushback from authority figures that this approach is overkill - just "do it and get it over with - sure it's painful, but this will buy us another six years!"

That's an argument, for sure, but consider the ongoing benefits of having a 100% safe outlet for future changes - refactoring, features, version revs, A/B testing etc. This app running on Windows would become a model for others to follow!

This sounds really great to me - I would take the time you've budgeted for reading READMEs and work on this :)


On Mon, Jun 15, 2015 at 6:25 PM, Tanguay, Luc (A584369) <luc.tanguay@REDACTED> wrote:
> Hi.
> We have an Erlang/OTP distributed system running on R14B02 Windows 2008
> (32-bit).  We plan to upgrade to OTP 17.5 (64-bit).   The system involves
> one big server that distributes files to up to 40 smaller clients, 
> three OTP applications, many TCP links between server and clients, a 
> small mnesia database, many calls to erlang:open_port/2 to execute 
> native Windows programs, etc.
> What is the safest upgrade path?  Go straight to OTP 17.5 or something else?
> Do I have to read all READMEs from R14B03 to OTP 17.5?
> Thanks,
> Luc
> ---
> Luc Tanguay, ing./P. Eng.
> Bell Canada
> 514-786-6440
> cell: 514-229-7585
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
