[erlang-questions] disk merging

YC yinso.chen@REDACTED
Thu Oct 25 23:39:47 CEST 2007


Yeah, you've described the distributed version control problem ;)

If your repositories are basically different versions of the same thing
(i.e. you copied a part of the original tree out and added/deleted things but
didn't try to rename files), then Unison might be able to help you.  It's
designed to merge two repositories together based on file paths.  For
files with the same name it will attempt to detect which one is later, and
if it can't it will prompt you for reconciliation.

http://www.cis.upenn.edu/~bcpierce/unison/

If your problem is duplicate files with different names, then MD5/SHA1 will
help you find dupes across file names, but it can't solve the file
versioning problem.
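
As a rough sketch of the checksum idea (not a finished tool): the Python
below walks one or more directory trees and groups files by SHA-1 digest.
The names file_digest and find_duplicates are made up for illustration, and
it assumes every tree is mounted and readable from the one machine.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Map digest -> sorted paths, keeping only digests seen more than once."""
    by_digest = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks and special files; only hash regular files.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling find_duplicates(["/mnt/disk1", "/mnt/disk2"]) (paths hypothetical)
would give you one entry per set of identical files.  Two files that are
different versions of the same document get different digests, so they
won't show up here at all.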

W.r.t. the folder structure issue, you can actually preserve the folder
structure if you convert the dupe files to symlinks (at least on non-Windows
platforms).
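
A minimal sketch of the symlink conversion, assuming you already have a list
of paths known to be duplicates (for example, grouped by checksum).  The
function name replace_with_symlinks and the ".symlink-tmp" suffix are made up
for illustration.  It keeps the first path as the master copy and re-checks
the contents byte for byte before touching anything.

```python
import filecmp
import os


def replace_with_symlinks(paths):
    """Keep paths[0] as the master; turn every identical copy into a symlink to it."""
    master = os.path.abspath(paths[0])
    replaced = []
    for path in paths[1:]:
        # Leave existing symlinks and the master itself alone.
        if os.path.islink(path) or os.path.abspath(path) == master:
            continue
        # Compare full contents rather than trusting the caller's grouping.
        if not filecmp.cmp(master, path, shallow=False):
            continue
        # Create the link under a temporary name, then rename it over the
        # duplicate, so the path never disappears if something fails midway.
        tmp = path + ".symlink-tmp"
        os.symlink(master, tmp)
        os.replace(tmp, path)
        replaced.append(path)
    return replaced
```

The links are absolute, so they break if the master copy is later moved or
lives on a disk that isn't mounted; relative links would be one way around
the first of those.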

But if you have changing files with different names, then there will probably
be some manual effort involved if you want to version them as the same
file (either manually check them in to source control, or ensure the files
follow a naming convention and have a script check them in for you).

On 10/25/07, Joe Armstrong <erlang@REDACTED> wrote:
>
> I have an interesting? problem.
>
> Over the last ? years (> 10) I have been upgrading my home system
> this usually involved buying a bigger disk and copying most (or all)
> of the files
> from the old disk to the new disk or disks.
>
> I've also been backing up the family photos etc on USB disks.
>
> Now I have > 1 Tera bytes of files spread over c. 10 computers and
> 3 pluggable USB disks. Having made a "backup" both the original and the
> copy live lives of their own.
>
> Does anybody know of a good algorithm to consolidate/merge all this
> data or do I have
> to write my own? One immediate thought is to compute the MD5 sums of
> all files on all
> disk and thus find all duplicates - then create a master copy of all
> unique files
> but the file names will be wrong and this might result in a big mess.
>
> This cannot be an uncommon problem - any ideas how to solve it?
>
> /Joe