<!DOCTYPE html><html><head><title></title><style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style></head><body><div>Thanks all,<br></div><div><br></div><div>Hugo, I like your third idea. I've been thinking about programming a stop word filtering function anyway. Plus, in my use case all of the books are owned by the author so uniqueness is unlikely to be a problem. <br></div><div><br></div><div>I can't use ISBNs, since the IDs are for books under development. But I will definitely use them in other parts of my application.<br></div><div><br></div><div>I did program one idea:<br></div><div><br></div><div>make_id(String, First, Second) -><br></div><div> List = string:tokens(String, " "),<br></div><div> F = lists:nth(First, List),<br></div><div> S = lists:nth(Second, List),<br></div><div> F ++ "_" ++ S.<br></div><div><br></div><div> make_id(String, First) -><br></div><div> List = string:tokens(String, " "),<br></div><div> F = lists:nth(First, List),<br></div><div> F.<br></div><div><br></div><div>It nicely fulfills the short and readable criteria and enables focus on the two most significant words in the title, but I can't see a way to automate assignment of values to First and Second. So I played with just selecting the first or first two words in the title. But it makes me uncomfortable.<br></div><div><br></div><div>make_id(String) -><br></div><div> List = string:tokens(String, " "),<br></div><div> case length(List) > 1 of<br></div><div> true -> F = lists:nth(1, List),<br></div><div> S = lists:nth(2, List),<br></div><div> F ++ "_" ++ S;<br></div><div> false -> lists:nth(1, List)<br></div><div> end.<br></div><div></div><div><br></div><div>Best wishes. 
Much appreciate the help.<br></div><div><br></div><div>LRP<br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div>On Fri, Aug 13, 2021, at 4:19 PM, Hugo Mills wrote:<br></div><blockquote type="cite" id="qt" style=""><div>On Fri, Aug 13, 2021 at 03:44:29PM -0400, Lloyd R. Prentice wrote:<br></div><div>> Hello,<br></div><div>> <br></div><div>> What might be a nifty way to turn a long book title with spaces into a short human-readable ID?<br></div><div><br></div><div> Depends rather on what purpose you want to put this ID to.<br></div><div><br></div><div> One solution would be to hash it (with, say, sha256). If the hash is<br></div><div>too long for "short", truncate it. Note that this is not a<br></div><div>globally-unique value, as there are lots of books with identical<br></div><div>titles.<br></div><div><br></div><div> If you want a globally unique identifier for printed books, then<br></div><div>ISBN is a reasonable one to use -- it's not precisely unique (there<br></div><div>have been errors assigning the same ISBN to two different books, for<br></div><div>example), but it's pretty good for most purposes.<br></div><div><br></div><div> If you want an actual globally unique identifier, then some form of<br></div><div>UUID would do the job (UUIDv4 is the easiest). Alternatively, you<br></div><div>could register a DOI prefix and assign numbers inside your own<br></div><div>numberspace within the DOI system.<br></div><div><br></div><div> If you want something vaguely human-readable, try dropping all the<br></div><div>stop-words (the, a, an, in, on, ...), all the vowels and all the<br></div><div>spaces. Truncate at whatever your idea of "short" is. 
Like the hashing<br></div><div>approach, it's not unique in the slightest.<br></div><div><br></div><div> It all depends on your use-case.<br></div><div><br></div><div> Hugo.<br></div><div><br></div><div>-- <br></div><div>Hugo Mills | Great films about cricket: Interview with the Umpire<br></div><div>hugo@... carfax.org.uk |<br></div><div><a href="http://carfax.org.uk/">http://carfax.org.uk/</a> |<br></div><div>PGP: E2AB1DE4 |<br></div><div><br></div></blockquote><div><br></div></body></html>