<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Tahoma
}
--></style></head>
<body class='hmmessage'><div dir='ltr'>
Hi,<div><br></div><div>I am about to process flat files, which may be CSV or plain line-based files.</div><div><br></div><div>I learned about CSV parsing from the link below.</div><div><br></div><div><a href="http://blog.vmoroz.com/2011/01/csv-in-erlang.html">http://blog.vmoroz.com/2011/01/csv-in-erlang.html</a>
</div><div><br></div><div>But I do not want to keep a big file in Erlang VM memory, and I also want to handle line-based files.</div><div><br></div><div>So I changed the parsing code as shown below so that it separates the lines and returns them, along with any unfinished trailing line, while I read the file in chunks of 1024 (or X) bytes.</div><div><br></div><div><div>parse1([], Lines, CurrentLine) -></div><div> {lists:reverse([lists:reverse(L) || L &lt;- Lines]), lists:reverse(CurrentLine)};</div><div>parse1([$\r], Lines, CurrentLine) -></div><div> {lists:reverse([lists:reverse(L) || L &lt;- [CurrentLine|Lines]]), []};</div><div>parse1([$\n], Lines, CurrentLine) -></div><div> {lists:reverse([lists:reverse(L) || L &lt;- [CurrentLine|Lines]]), []};</div><div>parse1([C|T], Lines, CurrentLine) when C == $\r; C == $\n -></div><div> parse1(T, [CurrentLine|Lines], []);</div><div>parse1([C|T], Lines, CurrentLine) -></div><div> parse1(T, Lines, [C|CurrentLine]).</div></div><div><br></div><div>Now, to convert each line into a CSV record, I am thinking of the following.</div><div>2> {ok, Data} = file:read(IO, 1024). %% file:read/2 returns {ok, Data} or eof</div><div>3> {List, Remaining} = <span style="font-size: 10pt; ">parse1(Data, [], []). %% Sample data: {["1,2,3","3,4,5"],"7,8"}</span></div><div>4> [string:tokens(N, ",") || N &lt;- List]. %% Sample data: [["1","2","3"],["3","4","5"]]</div><div><br></div><div>But does string:tokens perform well when handling huge data? Is there another list comprehension or some other approach that would perform better? Please suggest.</div><div><br></div><div>Thanks,</div><div>Marutha</div><div><br></div><div><br></div> </div></body>
</html>