Erlang binary longest common substring

Mon Mar 30 15:45:23 CEST 2020

Hi All,

Thank your very much for the answers,

@Valentin Micic <v@REDACTED> , Your lcs function skipping the "World"
word. I need "o" and "World" also count.

@Marc Worrel, Your function also similar with Valentine Micic's function.

@Richard O'Keefe , I suppose substring or subsequence is same thing, but i
think i am wrong, the point is, i need the function to also count
"o" from <<"Hellaao World">>, and "World" from <<"Hello World">>
from example: lcs(<<"Hello World">>, <<"Hellaao World">>)

I think i need to make a NIF .
I will try to make NIF based on this https://github.com/wollmers/c-lcs-bv
wonderful example ANSI C code.

I've try Pyrlang https://pyrlang.github.io/Pyrlang/ in the hope i that i
can use / communicate with https://pypi.org/project/pylcs/ through:
rpc:call('py@REDACTED', 'pylcs', 'lcs', [<<"Hello World">>,<<"Hellaao
World">>])  in a python-process node but Pyrlang not help much in my case.
still very slow
because it run over the network.

Pada tanggal Sen, 30 Mar 2020 pukul 20.08 Ben Murphy <benmmurphy@REDACTED>
menulis:

> Hi,
>
> I think the implementation on rosetta code is slightly broken from a
> performance point of view. Usually, if you are memoising in this
> problem you would use the offsets for where you are in the string. So
> if you were writing in an imperative language you would just have a N
> x M array. Obviously, you can't do this in Erlang so they are storing
> stuff in a dictionary. However, for the keys it looks like they are
> using the string/list for the key instead of the offset. This means
> the dict code will have to iterate over the remaining bit of the
> string for every lookup. So instead of being O(NxM) you end up with an
> algorithm that is O(NxMx(N + M)). So instead of being quadratic it
> gives you cubic performance. This might explain why it is so slow.
> Note: I haven't included the log() term that is introduced by the
> dictionary lookup that can't be avoided in Erlang.
>
> On Mon, Mar 30, 2020 at 11:18 AM I Gusti Ngurah Oka Prinarjaya
> <okaprinarjaya@REDACTED> wrote:
> >
> > Hi,
> >
> > I need longest common substring erlang implementation. Please help me
> >
> > I've tried
> https://rosettacode.org/wiki/Longest_common_subsequence#Erlang but very
> very slow T_T
> >
> > and i've tried binary:longest_common_prefix([<<"i love you
> 2020-03-28">>,<<"i love you 2020-03-01">>]) not effective / not valid
> >
> > Thank you
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20200330/105f5ea9/attachment.htm>