[erlang-questions] Does erlang:now() guarantee that subsequent calls to this BIF return continuously increasing values even in an Erlang cluster?

Daniel liudanking@REDACTED
Wed Apr 22 03:34:12 CEST 2015


The background of my problem is as follows:

I use AWS DynamoDB (http://aws.amazon.com/cn/dynamodb/) to store ejabberd chat messages. DynamoDB is a NoSQL store and needs a *key* to look up a record. In my application, I construct the *key* as <jid, erlang:now()>, where jid is the hash key and erlang:now() is the range key (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithTables.html). I prefer erlang:now() as the range key because it makes it convenient to query message history by timestamp. Unfortunately, <jid, erlang:now()> may not be unique in an Erlang cluster, so I am looking for a global *erlang:now()* function that generates globally unique timestamps across the cluster.

BTW, a small probability of <jid, erlang:now()> collisions is not a big problem in my application, so I am still using <jid, erlang:now()> as the *key*.
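
One way to make such a range key unique across nodes, sketched here in Python purely for illustration (this is not ejabberd or DynamoDB client code): append a per-node sequence number and a node label to the timestamp. The class name `RangeKeyGenerator`, the key layout, and the node names are hypothetical. The sketch assumes the range key is stored as a string and compared lexicographically, which is why every numeric field is zero-padded.

```python
import itertools
import time


class RangeKeyGenerator:
    """Builds sortable range keys that stay unique across nodes (illustrative)."""

    def __init__(self, node_name, clock=time.time_ns):
        self.node_name = node_name      # hypothetical node label, e.g. "ejabberd@host1"
        self._clock = clock             # returns nanoseconds since the epoch
        self._last_us = 0               # highest timestamp handed out so far
        self._seq = itertools.count()   # per-node sequence number

    def next_key(self):
        now_us = self._clock() // 1000
        # Never let the timestamp part go backwards on this node.
        self._last_us = max(self._last_us, now_us)
        seq = next(self._seq)
        # Timestamp first so keys sort roughly by time; the sequence number
        # orders keys within one node, and the node name separates nodes.
        return f"{self._last_us:020d}-{seq:010d}-{self.node_name}"
```

Keys from one generator are strictly increasing as long as the sequence number fits in its ten digits, and keys from different nodes differ in the node suffix even when the timestamps collide. The class is not thread-safe as written, and ordering across nodes is only as good as their clock synchronization.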


> On Apr 22, 2015, at 8:59 AM, Michael Turner <michael.eugene.turner@REDACTED> wrote:
> 
> "slapping "lamport clock" on it is reductive."
> 
> -- it was just a suggestion, in case something like that might work. My suggestion was not intended to be all-inclusive or a panacea. Slapping "reductive" on my suggestion is ... well, reductive?
> 
> "Not come with a pet solution to push through."
> 
> It's not my "pet solution". I don't have a "pet solution." I don't even know what this guy's problem is, and you don't either. So how can I have a pet solution if there's no way to know what the solution is in the first place?
> 
> I just happen to know that Erlang/OTP has this feature that's been in "beta" for what seems to be a decade or more, one that exposes a kind of Lamport clock functionality (even though the documentation fails to call it that, which might be why it's still waiting for enough user input to refine the interface -- people who go looking for something like that in Erlang/OTP are not finding it.) seq_trace might be part of /a/ solution to his problem. But since we don't know what his problem really is, I'm just making suggestions.
> 
> Understand?
> 
> 
> 
> Regards,
> Michael Turner
> 
> On Tue, Apr 21, 2015 at 9:20 PM, Fred Hebert <mononcqc@REDACTED> wrote:
> On 04/21, Michael Turner wrote:
> "Lamport/vector clocks and other similar ones operate on *causality*, but
> this partial ordering is not the only one available or workable."
> 
> Whether it's "workable" depends on what's desired. Sorting by {Node,
> Timestamp} is not accurate if causality matters and clocks have drifted out
> of synch. As they will. Hence Lamport's work, and the work of others. And
> if causality doesn't matter, well, I wonder: why bother? Unless you just
> want a rough idea of when certain things happened, in which case {Node,
> Timestamp} can give you a /total/ order that's, if anything, more accurate
> than what you need.
> 
> 
> That's not necessarily true. Let's look at the different options and when each can be useful.
> 
> - Lamport/vector clocks: causality. I want to track the logical dependencies of changes.
> - `{Node, Timestamp}`: I have lots of local events (say HTTP requests and responses in logs) and want to see *when* they happen and how far apart. The timestamp might need to be monotonic, but the per-node value lets me impose a logical order, track some density over time (assuming I at least have NTP working), and so on.
> - `{Shard, Timestamp}`: I require a total order, but for events within a sharded data set.
> - `{Cluster, Timestamp}`: Each cluster I run might belong to specific customers or whatever, or run a specific set of hardware, or be a logical division. In any case, it's possible they have their own time or id service and I may want a partial or total order based on the events within that cluster, without worrying I might want to compare cross-cluster activity.
> - `{Region, Timestamp}`: Similar to the above, but by geographical area. I might decide that I need a total order on some form of transactions and will run a service, but for latency (and if real world allows it), I won't try to synchronize my time across data-centers or large geographical areas.
> 
> All of these 'labelled timestamps' *are* a partial order. They only define it within some label, i.e. you can sort all timestamps within a node/shard/cluster/region, but can't do it across boundaries.
> 
> There are other avenues that even combine some of them. One interesting case is inspired by Google's Chubby and CRDTs: you use a timestamp synchronized by NTP, which guarantees a maximal drift interval. You then add in a Lamport clock whenever two events happen so close together that the system clocks cannot guarantee which one truly happened first.
> 
> The Lamport clock is mergeable in a deterministic way that is also commutative and idempotent (that's a CRDT!), and acts as a tie-breaker between events that happen too close together.
> 
> This way you get reliable timestamps when you can, and when you suddenly can't, you get a form of causality (or global monotonicity) to break things up.
> 
> slapping "lamport clock" on it is reductive. It's a good way to track some levels of causality, but has its limitations. If you only *need* node-local accuracy and you have access to a monotonic clock, it might be far less work to just slap the monotonic clock into things than weave the logical clock through everything, and obtain the same logical result in the end (plus more information). Maybe it's not the best solution either.
> 
> But really, if we want to make good recommendations, we have to ask what the user needs. Not come with a pet solution to push through.
> 

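The 'labelled timestamps' point in the quoted message (order is defined within a node, shard, cluster, or region, but not across them) can be sketched as a comparison that refuses to answer across labels. This is only an illustration in Python; the names `Labelled` and `compare` are hypothetical and do not come from the thread.

```python
from typing import NamedTuple, Optional


class Labelled(NamedTuple):
    label: str       # node, shard, cluster, or region name
    timestamp: int   # any clock that is meaningful within that label


def compare(a: Labelled, b: Labelled) -> Optional[int]:
    """Return -1, 0, or 1 within one label; None when the labels differ."""
    if a.label != b.label:
        return None  # incomparable: no order is defined across boundaries
    return (a.timestamp > b.timestamp) - (a.timestamp < b.timestamp)
```

For example, `compare(Labelled("shard1", 5), Labelled("shard1", 9))` gives an ordering, while comparing a `"shard1"` event with a `"shard2"` event returns `None` instead of pretending the two clocks agree.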

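The combined scheme from the quoted message (an NTP-synchronized timestamp, with a Lamport counter as tie-breaker when two events fall inside the drift bound) might look roughly like the Python sketch below. The 50 ms bound and the names `Stamp`, `HybridClock`, and `happened_before` are assumptions made for illustration; the thread gives no concrete values or interface.

```python
from dataclasses import dataclass

MAX_DRIFT_MS = 50  # assumed bound on clock skew between nodes; not from the thread


@dataclass(frozen=True)
class Stamp:
    wall_ms: int   # NTP-disciplined wall-clock time in milliseconds
    lamport: int   # logical counter, used only as a tie-breaker


class HybridClock:
    def __init__(self):
        self.lamport = 0

    def tick(self, wall_ms):
        """Stamp a local event with the given wall-clock reading."""
        self.lamport += 1
        return Stamp(wall_ms, self.lamport)

    def merge(self, remote):
        """Fold in the counter of a received stamp.

        Taking the maximum is commutative and idempotent, so merges can be
        applied in any order and repeated without changing the result.
        """
        self.lamport = max(self.lamport, remote.lamport)


def happened_before(a, b, max_drift_ms=MAX_DRIFT_MS):
    """Trust wall clocks when they are far apart, else the Lamport counter."""
    if abs(a.wall_ms - b.wall_ms) > max_drift_ms:
        return a.wall_ms < b.wall_ms
    return a.lamport < b.lamport
```

If node 2 merges a stamp from node 1 and then ticks, its new stamp sorts after the received one even when node 2's wall clock lags by a few milliseconds. Note that this pairwise comparison is not guaranteed to be transitive across chains of events that are each just inside the drift bound, so it is a starting point and not a complete ordering scheme.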

