[erlang-questions] mnesia -- a naive question

Sun Jul 30 21:33:14 CEST 2017

Hi Jesper,

Your points are reassuring. Thank you.

Wasabi promotes their site as 6x faster and 1/5th the cost of Amazon S3. In the spirit of due diligence my next steps are:

1. Do upload/recovery tests with large files to see the minimum likely recovery time
2. Visit Wasabi to check them out. They're in Boston so easy to do
3. For dev/testing/very early production I'm thinking of hosting two or maybe three Erlang Nitrogen + mnesia servers in house
4. See if I can come up with a script to detect outage and initiate recovery
5. This doesn't address replication across Zones, but one step at a time
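
Step 4 could start from something like the following Python sketch. Everything in it is an assumption for illustration, not a tested design: the names `tcp_probe` and `OutageDetector`, the threshold of three consecutive failures, and the recovery hook are all hypothetical. Requiring several failed probes in a row keeps a single dropped connection from triggering a recovery.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class OutageDetector:
    """Declare an outage after `threshold` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 on_outage: Callable[[], None], threshold: int = 3) -> None:
        self.probe = probe          # hypothetical health check, returns bool
        self.on_outage = on_outage  # hypothetical recovery hook
        self.threshold = threshold
        self.failures = 0
        self.down = False

    def check(self) -> bool:
        """Run one probe; return True while the node is considered up."""
        if self.probe():
            self.failures = 0
            self.down = False
        else:
            self.failures += 1
            if self.failures >= self.threshold and not self.down:
                self.down = True
                self.on_outage()  # fire once per outage, not once per probe
        return not self.down
```

A caller would run `check()` on a timer, for example with `OutageDetector(lambda: tcp_probe("db1.example", 4369), start_recovery)`, where the host name and `start_recovery` are placeholders and 4369 is the default epmd port. A reachable port only shows the machine is up; it says nothing about whether mnesia itself is healthy.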

I had been considering Riak KV, but this seems easier to implement with less overhead.

I still have many questions. But I'm months from actual beta launch, so this plan at least provides a starting point for critique and refinement. 

Wish me luck. 

All the best,

Lloyd

-----Original Message-----
From: "Jesper Louis Andersen" <jesper.louis.andersen@REDACTED>
Sent: Sunday, July 30, 2017 9:13am
To: lloyd@REDACTED, "Erlang" <erlang-questions@REDACTED>
Subject: Re: [erlang-questions] mnesia -- a naive question

A couple of points:

* Mnesia protects you against the scenario where one of your nodes fails. It
doesn't automatically protect you against the network splitting, and
requires some manual recovery on the flip side of such an event. For rather
small clusters, this is manageable by manual operation. Larger systems will
be far harder to maintain because the risk of netsplits and node loss goes
up whenever you add a new node.

* I don't know about Wasabi, but Amazon's EC2 nodes are ephemeral in the
sense they can go away at a moment's notice. And when this happens, the data
on the node is gone. Thus, to achieve persistent storage, you must either
store data off the EC2 node, presumably in S3, RDS, DynamoDB and so on, or
use an EBS volume attached to the EC2 node to provide persistent disk
space (on which your mnesia database can reside).

* The game is all about risk mitigation. If you regularly take a mnesia
backup and store it into S3, or something like it, you can get speedy
recovery to that point in time should the accident happen. If you want
better point-in-time-recovery, you can try running two mnesia nodes, but
you need to heed two important caveats:
    - You probably want your nodes to run in different zones so a failure
in one zone doesn't take down everything.
    - Amazon's network is brittle and likely to drop connections which are
seen as netsplits.

* Mnesia mitigates risk by assuming the nodes are fairly robust and stable,
as well as the network between them. If you buy good expensive hardware,
this is a reasonable assumption and the noise of error will be low. So manual
intervention in the case of an error is probably what is needed anyway (to
fix the faulty hardware as well).

* Amazon and other leased environments tend to have brittle network
connections and flaky machines. To mitigate this, your system must make no
assumptions about stability and handle this up front. Mnesia wasn't really
built to work in such an environment.
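
To put rough numbers on the first point: distributed Erlang connects nodes in a full mesh by default, so n nodes maintain n(n-1)/2 connections. The Python sketch below assumes, purely for illustration, that each connection drops independently with the same probability over some period. The function names and the 1% figure are made up, and real failures are rarely independent.

```python
def mesh_links(nodes: int) -> int:
    """Number of pairwise connections in a fully meshed cluster."""
    return nodes * (nodes - 1) // 2


def p_any_link_drop(nodes: int, p_link: float) -> float:
    """Probability that at least one connection drops, assuming each
    drops independently with probability p_link (an illustrative model)."""
    return 1.0 - (1.0 - p_link) ** mesh_links(nodes)


if __name__ == "__main__":
    # Assumed per-link drop probability of 1% over the period of interest.
    for n in (2, 3, 5, 10, 20):
        print(n, mesh_links(n), round(p_any_link_drop(n, 0.01), 3))
```

Going from 3 nodes to 10 takes the connection count from 3 to 45, which is why the chance of seeing some netsplit grows much faster than the node count.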

On Sat, Jul 29, 2017 at 10:23 PM <lloyd@REDACTED> wrote:

> Hello,
>
> Wasabi is a new cloud storage service that promotes lower storage costs
> and greater speed than Amazon S3:
>
> https://wasabi.com/
>
> During the dev phase I'm running mnesia on the back-end of my current web
> project. I much like the seamless way that mnesia integrates into Erlang as
> well as its replication feature. But folks have warned about the hassles of
> mnesia net splits.
>
> Problem is that I have no operations experience to objectively weigh
> options. But I do want to bridge over all points of failure as
> cost-and-time-effectively as possible.
>
> So, my question is if and how I can integrate Wasabi (or Amazon S3 for
> that matter) into my operation to significantly reduce the probability of
> data loss?
>
>
> Many thanks,
>
> LRP