[erlang-questions] Beginner Screen Scraping, and Auto-login/input on 3rd party web app?

Jesse Gumm sigmastar@REDACTED
Wed Dec 29 17:24:57 CET 2010

Hi there,

I'd recommend looking into the httpc module, which is the HTTP client that ships with Erlang/OTP (as part of inets).
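For instance, fetching a page could look roughly like the sketch below. The module name fetch_page, the function fetch/1, and the 10 second timeout are placeholders chosen for illustration, not anything from the marketplace site in question.

```erlang
-module(fetch_page).
-export([fetch/1]).

%% Fetch Url (a string such as "http://example.com/") over plain HTTP.
%% Returns {ok, Body} with Body as a binary, or {error, Reason}.
fetch(Url) ->
    %% httpc is part of the inets application, which has to be running.
    %% The result is ignored because inets may already be started.
    _ = inets:start(),
    HttpOpts = [{timeout, 10000}],        % give up after 10 seconds
    Opts = [{body_format, binary}],       % return the body as a binary
    case httpc:request(get, {Url, []}, HttpOpts, Opts) of
        {ok, {{_Vsn, 200, _Phrase}, _Headers, Body}} ->
            {ok, Body};
        {ok, {{_Vsn, Status, Phrase}, _Headers, _Body}} ->
            {error, {http_status, Status, Phrase}};
        {error, Reason} ->
            {error, Reason}
    end.
```

For https URLs the ssl application would need to be started as well. Logging in would use a post request in place of get, with the form fields and cookie handling depending entirely on how the site's login form works.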

As for parsing the data, you could check out the 're' module for regular expressions. It would probably work most of the time, but HTML is not, strictly speaking, parseable with regular expressions, so expect it to break on markup you didn't anticipate. I don't know of an Erlang module for parsing an HTML DOM tree; perhaps mochiweb provides something.
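As a rough sketch of the regex approach: the markup below (a span with class "price" holding a dollar amount) is purely an assumption, since I haven't seen the real page, and the module and function names are made up.

```erlang
-module(scrape_prices).
-export([prices/1]).

%% Extract every price from Html (a binary or string) and return a list
%% of floats. ASSUMPTION: prices appear as <span class="price">$12.50</span>;
%% the real site's markup will differ and the pattern must be adapted.
prices(Html) ->
    Pattern = "<span class=\"price\">\\$([0-9]+\\.[0-9]{2})</span>",
    %% global: find all matches, not just the first one.
    %% all_but_first: return only the captured group, as a binary.
    case re:run(Html, Pattern, [global, {capture, all_but_first, binary}]) of
        {match, Matches} ->
            [binary_to_float(P) || [P] <- Matches];
        nomatch ->
            []
    end.
```

Calling scrape_prices:prices/1 on a body containing two such spans, for $12.50 and $9.99, should give [12.5, 9.99]. A pattern like this is tied to the exact markup, which is the fragility mentioned above.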


Jesse Gumm
Sigma Star Systems
On Dec 29, 2010 10:19 AM, JETkoten <jetkoten@REDACTED> wrote: 

On 12/24/10 12:05 PM, Alain O'Dea wrote:

> On 2010-12-24, at 11:08, JETkoten<jetkoten@REDACTED>  wrote:


>> [...] Hi Everyone,


>> I'm (very) new to Erlang, and hoping to get some basic experience with it.


>> I really learn best by doing something I'm interested in. I have a "pet project" that I would like to implement now in Erlang.


>> Here it is:


>> I have a large personal library of books and find that I don't need many of them anymore. I'd like to create a program that will help me manage my online sales on a marketplace site, by automatically checking competing sellers' prices at a set time interval and then logging into their website and adjusting my prices according to a formula I'd set based on the other prices.


>> I did a Google search on Erlang "screen scraping" and saw some options:


>> www_tools, Yaws parser, xmerl, mochiweb


>> However, none of the posts that suggest those are less than 2 years old... which is the best/easiest way, and/or are there newer, better options now?


>> Any ideas?


>> Thanks in advance,

>> Jack

> Hi Jack:


> [...] Gradually it probably makes sense to switch to native Erlang utilities if you find them to perform or integrate better.


> [...]


> Eventually it makes sense to use OTP and supervisors to consistently handle agent crashes.  If you find yourself writing a lot of try/catch logic, then stop and refactor to OTP.  Erlang and OTP in Action http://manning.com/logan is the best book for this.


> Cheers and Merry Christmas,

> Alain

Hi Alain,

Thanks very much for your reply.

So, I do want to try and implement this with the native Erlang utilities, and am not sure where to begin. I started writing a module and then tried to think of what kind of functions I could use to perform these tasks, but I don't know how to access a website to screen scrape with Erlang or how/where to efficiently store and retrieve the price/title data that I would scrape.

Would Mnesia or something like Riak be good for the storage part?

I also don't know how to get my program to log in to the site after calculating the new price from the scraped data and then changing it on the marketplace site...

In the OTP version, would I be looking at gen_server or maybe gen_fsm to complete the tasks? I looked through the Erlang and OTP in Action book in a bookstore, but it seems too advanced for me at this point to get much benefit from.

I'm truly a beginner here, so any concrete steps/tools anyone can offer would be a huge help! I've been looking through the online tutorials and books, but can't seem to find much about Erlang and WWW related tasks like these.

Thanks again,


