Substring look-up
zxq9
zxq9@REDACTED
Wed Apr 7 12:36:06 CEST 2021
On 2021/04/07 6:29, Olivier Boudeville wrote:
> Hi,
>
> It must be a silly question, but, since the Latin1 -> Unicode switch in
> OTP 20.0, is there a (non-obsolete) way in the string module to look-up
> the index of a string into another one, i.e. to find the location of a
> given substring?
>
> rstr/2 is supposed to be replaced with find/3, yet the former returns an
> index whereas the latter returns a part of the original string. I could
> not find a way to obtain a relevant index with any of the newer string
> functions - whereas I would guess it is a fairly common need?
The re module's run/2,3 does what you are asking for: by default it
returns each match as an {Offset, Length} pair.
1> {ok, MP} = re:compile("foo", [unicode]).
{ok,{re_pattern,0,1,0,<<69,82,...>>}}
2> re:run("barfoobar", MP).
{match,[{3,3}]}
3> re:run("barfoobarfoo", MP).
{match,[{3,3}]}
4> re:run("barfoobarfoo", MP, [global]).
{match,[[{3,3}],[{9,3}]]}
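Note that the [global] option makes it continue beyond the first match.
Also note that the offsets are zero-based, whereas the old string
functions such as str/2 and rstr/2 return one-based indices.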
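If you want something closer to what str/2 used to return, a thin wrapper
around re:run/2,3 is enough. What follows is only a sketch: the module name
re_index and the functions first/2 and all/2 are made up for illustration,
and it assumes a plain ASCII/Latin-1 subject and a pattern with no regex
metacharacters in it. With the unicode option the offsets would be byte
offsets into the UTF-8 encoding rather than character positions, so that
case would need more care.

```erlang
-module(re_index).
-export([first/2, all/2]).

%% Hypothetical helper: one-based position of the first match,
%% or 0 when there is none (the convention string:str/2 used).
first(Subject, Pattern) ->
    case re:run(Subject, Pattern) of
        {match, [{Offset, _Len} | _]} -> Offset + 1;
        nomatch -> 0
    end.

%% Hypothetical helper: one-based positions of all non-overlapping matches.
all(Subject, Pattern) ->
    case re:run(Subject, Pattern, [global]) of
        %% With [global], each match is itself a list of captures;
        %% the head of that list is the whole match.
        {match, Matches} -> [Offset + 1 || [{Offset, _Len} | _] <- Matches];
        nomatch -> []
    end.
```

With this, re_index:first("barfoobar", "foo") gives 4 and
re_index:all("barfoobarfoo", "foo") gives [4,10].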
We are in a sort of flux at the moment with strings. We have finally got
good Unicode support, and on a broader set of representations than just
strings-as-lists, but the string library module itself is still being
converted and revamped, so a few rough edges and obsolescence warnings
still linger.
When all else fails, writing a custom function works great to cover the
gap -- luckily none of these sorts of functions is particularly
difficult to implement!
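As a rough idea of what such a custom function might look like, here is a
minimal sketch. The module name substr_index and the function index/2 are
invented for this example. It assumes both arguments are flat lists of code
points, so positions count code points (not grapheme clusters), and it
returns nomatch rather than 0 when the substring is absent.

```erlang
-module(substr_index).
-export([index/2]).

%% Hypothetical helper: one-based position of the first occurrence of
%% Sub in Str, or nomatch. Both must be flat lists of code points.
index(Str, Sub) when is_list(Str), is_list(Sub) ->
    index(Str, Sub, 1).

%% Ran out of input without finding Sub.
index([], _Sub, _Pos) ->
    nomatch;
%% Check whether Sub is a prefix of what is left; if not, drop one
%% character and try again at the next position.
index([_ | Rest] = Str, Sub, Pos) ->
    case lists:prefix(Sub, Str) of
        true -> Pos;
        false -> index(Rest, Sub, Pos + 1)
    end.
```

For example, substr_index:index("barfoobar", "foo") returns 4. This is a
naive scan, which is fine for short strings; for large inputs something
like binary:match/2 on binaries would likely be a better starting point.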
-Craig