Brain dump #2 - zope and grids and GUID's

Mon Feb 10 12:09:33 CET 2003

I too spent significant time on this GUID problem back in my
Disaster Response days. I eventually settled on this method
of creating GUIDs in Erlang as Good Enough(tm):

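%% A GUID is the creating node's name plus a timestamp.
%% erlang:now() never returns the same value twice on a node.
new_guid() ->
  {node(), erlang:now()}.

%% Flatten a GUID into a string: Node.MegaSecs.Secs.MicroSecs
export_guid({N,{Ms,S,Us}}) ->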
  lists:concat(
     [atom_to_list(N),
      ".", integer_to_list(Ms),
      ".", integer_to_list(S),
      ".", integer_to_list(Us)]).

Rationale:
(1) The new_guid() function is really cheap
(2) It is also defined such that it is unique (unless
    we start renaming hosts or messing with the system
    clock.)
(3) It can be generated either manually or automatically
    (when generating one manually, you might want to alter
    it somewhat, which is a Good Thing, since we don't want
    conflicts between manually and automatically generated
    GUIDs).
(4) The GUID is reasonably readable, and can be re-formatted
    for better readability (converting to date-time format).
(5) The GUID reflects where the entity was created, not
    necessarily where it resides. In Command & Control,
    the basic questions to answer are the 4 'W's:
    Who?, What?, When?, Where? (*)
    (+ sometimes, but not always, Why?)
(6) You may receive multiple indications of the same
    thing, and must be able to tell the indications apart
    (and do source credibility analysis). This basically
    means "tag everything".
(7) I often also create a GUID for 'modified'.
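
Point (7) could be sketched roughly as below. This is only an
illustration: the module name guid_tag, the {entity, ...} tuple
layout and the functions new_entity/1 and touch/2 are all made up
for the example, and it assumes an Erlang/OTP release where
erlang:now/0 is available.

```erlang
-module(guid_tag).
-export([new_guid/0, new_entity/1, touch/2]).

%% Same generator as in the post: creating node plus a timestamp
%% that is unique per node.
new_guid() ->
    {node(), erlang:now()}.

%% Hypothetical entity layout: {entity, Created, Modified, Data}.
%% A fresh entity starts with identical 'created' and 'modified' tags.
new_entity(Data) ->
    G = new_guid(),
    {entity, G, G, Data}.

%% Replace the payload and stamp a new 'modified' GUID.
%% The 'created' GUID is left untouched, so it keeps saying
%% where and when the entity first appeared.
touch({entity, Created, _OldModified, _OldData}, NewData) ->
    {entity, Created, new_guid(), NewData}.
```

Keeping the two tags separate means the 'created' GUID still
answers "where did this come from?", while the 'modified' GUID
says which node changed it last and when.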

(*) As in
    - "Who is reporting?"
    - "What happened?"
    - "Where did it happen?"
    - "When did it happen?"
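
The re-formatting mentioned in (4) might look something like the
sketch below. The module name guid_fmt, the function pretty_guid/1
and the exact output layout are assumptions made for illustration;
calendar:now_to_universal_time/1 and io_lib:format/2 are the
standard library calls.

```erlang
-module(guid_fmt).
-export([pretty_guid/1]).

%% Render {Node, {MegaSecs, Secs, MicroSecs}} as
%% "Node.YYYY-MM-DDTHH:MM:SS.Micros", with the time in UTC.
pretty_guid({Node, {_Ms, _S, Us} = Now}) ->
    {{Y, Mo, D}, {H, Mi, S}} = calendar:now_to_universal_time(Now),
    lists:flatten(
      io_lib:format("~s.~4..0w-~2..0w-~2..0wT~2..0w:~2..0w:~2..0w.~6..0w",
                    [atom_to_list(Node), Y, Mo, D, H, Mi, S, Us])).
```

For example, with a made-up node name and microsecond value,
guid_fmt:pretty_guid({'a@host', {1044, 875373, 123}}) should give
"a@host.2003-02-10T11:09:33.000123".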

/Uffe

On Mon, 10 Feb 2003, Vlad Dumitrescu (EAW) wrote:

>I have some comments/question marks, not sure all make sense.
>
>> Central to a lot of ideas about zopes and workflow grids
>> and things is the idea of a GUID (globally unique
>> identifier).
>
>>   Most GUIDs are long random strings of junk - like this
>>   "sadjgfsagfjashfkgasvasjgjsgfjgfjgsafhebutaskhfefjhbdjb"
>>   made of some weirdo MAC address + a time stamp + random
>>   number etc.
>
>>   The main point is *there is no indication HOW to find
>>   the resource referred to*
>
> I am sure everybody already knows this. The "weird"
>looking GUIDs are so because they are meant to be generated
>automatically, by a machine, and we have to be VERY sure
>that they are really unique. This is because they are used
>inside the system, more than outside it: compare this with
>the reference type in Erlang, it's conceptually just the
>same!
>
>>   I think this is *fundamentally wrong* - I think the
>> GUID should contain a *hint* as to HOW to find the
>> resource - I also think the GUID should be human readable
>> and writable - so I can write it down on a sheet of
>> paper and remember it.
>
>
>This is true, if the GUID is designed to refer to
>*documents*, not memory objects, ActiveX interfaces or any
>other "weird" stuff.
>
>>   I work on several different machines so my GUIDs are
>> composed of three fields (a hostname, a date, a sequence
>> number). -
>
>This really points to the location where the document was
>_created_. It might not be found there anymore, and there
>might exist duplicates... If the "owning" machine is not up
>or online, then one doesn't have any clue about where to
>look anyway...
>
>> guid://{my.home.machine,my.work.machine,a.permanent.machine}/Date/Seq
>
>>   Could be used - my home machine is turned off when
>> I'm at work, my work machine is (in principle) always up
>> but might crash, a.permanent.machine is also (in
>> principle) always up.
>
>Maybe this will work, but again: what happens if you buy a
>new machine and want to move some or all of the stuff
>there? What if you change jobs and even if the machine may
>be the same, it gets a different name? What if a machine is
>part of two different networks, with two different names?
>You mention peer-2-peer backups, how to find the backed-up
>documents?
>
> What I think I am trying to underline is that even if this
>scheme will help, I am not sure if it is complete. To quote
>you, I think this violates the principle of least surprise
>for me: a document's ID should refer to that document, not
>to the place where it happens to be stored at a particular
>time. These are two different issues and I think they
>should be handled separately. (No, I don't have any
>solution except that already in use: a distributed database
>with efficient search)
>
>I don't say this is bad, I say I need more arguments to be
>convinced! :-)
>
>regards,
>Vlad
>

-- 
Ulf Wiger, Senior Specialist,
   / / /   Architecture & Design of Carrier-Class Software
  / / /    Strategic Product & System Management
 / / /     Ericsson AB, Connectivity and Control Nodes