Just to be obnoxious...<div>Why not split on ',' and check the last character of each term? If it is an escape char, join that term with the next one to build a single term.</div><div><br></div><div>Hello Foo\, it's me\, Mike,Hi Mike\, good to meet you!</div>
<div><br></div><div>The split returns the list</div><div>["Hello Foo\", " it's me\", " Mike", "Hi Mike\", " good to meet you!"]</div><div><br></div><div>You then check each term for a trailing '\' (and optionally '\\'), concatenating it with the next term as needed and replacing the '\' with ',' when the '\' is not itself escaped.</div>
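<div><br></div><div>A minimal sketch of the split-then-merge idea, written in Python for brevity rather than Erlang. The function name split_escaped and its parameters are made up for illustration, and it assumes a single-character separator and escape char. It treats an odd number of trailing backslashes as "the next comma was escaped" and leaves doubled backslashes in place instead of unescaping them.</div>

```python
def split_escaped(line, sep=",", esc="\\"):
    """Split on sep, then re-join pieces whose separator was escaped."""
    pieces = line.split(sep)
    last = len(pieces) - 1
    terms = []
    pending = ""  # text carried over from pieces that ended in an escape

    for i, piece in enumerate(pieces):
        # Count the escape chars at the end of this piece. An odd count
        # means the separator after it was escaped; an even count means
        # the escape chars escape each other and the separator was real.
        trailing = len(piece) - len(piece.rstrip(esc))
        if trailing % 2 == 1 and i != last:
            # Drop the escaping char and put the separator back as text.
            pending += piece[:-1] + sep
        else:
            # Real term boundary (or end of input): emit the joined term.
            terms.append(pending + piece)
            pending = ""
    return terms


if __name__ == "__main__":
    sample = r"Hello Foo\, it's me\, Mike,Hi Mike\, good to meet you!"
    for term in split_escaped(sample):
        print(term)
```

<div>The merge pass here is the second walk over the N pieces mentioned in this message; the trailing-backslash count is what decides whether a '\' is itself escaped.</div>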
<div><br></div><div>You'll have to iterate over all N terms a second time, which may be a killer for performance, but with smaller data sets</div><div>it would work, and it would lend itself to parallelization via distributed MapReduce on pairs of terms for really big sets.</div>
<div><br></div><div>-mox</div><div><br></div><div><br></div><div><div class="gmail_quote">On Fri, Apr 15, 2011 at 1:44 PM, Jachym Holecek <span dir="ltr">&lt;<a href="mailto:freza@circlewave.net">freza@circlewave.net</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Hi,<br>
<br>
[I didn't read the rest of the thread, so hopefully I'm not terribly offtopic.<br>
Also, I got rid of top-posting and unlimited context quoting as both are seen<br>
as bad taste where I come from.]<br>
<br>
# Robert Raschke 2011-04-15:<br>
>> On Thu, Apr 14, 2011 at 11:48 AM, Dave Challis &lt;<a href="mailto:dsc@ecs.soton.ac.uk">dsc@ecs.soton.ac.uk</a>&gt; wrote:<br>
>> Yup, I agree completely :)<br>
>><br>
>> My question was mostly prompted by a blog post (<a href="http://ppolv.wordpress.com/2008/02/25/parsing-csv-in-erlang/" target="_blank">http://ppolv.wordpress.com/2008/02/25/parsing-csv-in-erlang/</a>)<br>
>> on parsing CSV in erlang.<br>
>><br>
>> The do_parse function there has a dozen items which search binaries and check state, which made<br>
>> me wonder whether swapping the argument order round would make any difference at all.<br>
>><br>
>> That code makes me wonder if it could be rewritten using gen_fsm. Never used gen_fsm myself, so I'm<br>
>> not really sure.<br>
<br>
I don't think so; gen_fsm is brilliant for purely event-driven automata but<br>
would be very ugly for a task like this. What the CSV guy needs to do is:<br>
<br>
1) Get rid of #ecsv{} record, that's overkill for such a trivial purpose.<br>
<br>
2) Break the parser into a bunch of tiny functions, pass all necessary<br>
data in arguments.<br>
<br>
IIRC LFE's scanner/parser code is a neat example of how to get it right.<br>
<br>
Regards,<br>
-- Jachym<br>
</blockquote></div><br></div>