[erlang-questions] Beginner Screen Scraping, and Auto-login/input on 3rd party web app?
Wed Dec 29 17:24:57 CET 2010
I'd recommend looking into the httpc module, the HTTP client that ships with Erlang/OTP as part of the inets application.
As for parsing the data, you could check out the 're' module for regular expressions. It will work for simple, predictable pages, but HTML is not a regular language, so a regex will break on nested or irregular markup. I don't know of an Erlang module for parsing an HTML DOM tree; perhaps mochiweb provides something.
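To make that concrete, here is a rough sketch of the two steps (fetch with httpc, extract with re). The module name, the URL you pass in, and the `<span class="price">` markup are all made up for illustration; the real site will need its own pattern, and https URLs would also need the ssl application started.

```erlang
-module(scrape_sketch).
-export([fetch/1, extract_prices/1]).

%% Fetch a page body with httpc (part of the inets application).
%% Url is a plain string, e.g. "http://example.com/listing" (placeholder).
fetch(Url) ->
    {ok, _} = application:ensure_all_started(inets),
    case httpc:request(get, {Url, []}, [], [{body_format, binary}]) of
        {ok, {{_Version, 200, _Phrase}, _Headers, Body}} ->
            {ok, Body};
        {ok, {{_Version, Status, _Phrase}, _Headers, _Body}} ->
            {error, {http_status, Status}};
        {error, Reason} ->
            {error, Reason}
    end.

%% Pull prices out of the page with a regular expression.
%% ASSUMPTION: prices appear as <span class="price">$12.50</span>.
%% This markup is hypothetical; adjust the pattern to the real page.
extract_prices(Html) ->
    Pattern = "<span class=\"price\">\\$([0-9]+\\.[0-9]{2})</span>",
    case re:run(Html, Pattern, [global, {capture, all_but_first, binary}]) of
        {match, Matches} ->
            %% With 'global', each match is a list of captured groups.
            [binary_to_float(Price) || [Price] <- Matches];
        nomatch ->
            []
    end.
```

From the shell, after compiling with c(scrape_sketch)., you would call scrape_sketch:fetch/1 with the listing URL and hand the returned body to scrape_sketch:extract_prices/1. Keeping the extraction in its own function means you can try it on a saved copy of the page without hitting the site each time.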
Sigma Star Systems
On Dec 29, 2010 10:19 AM, JETkoten <jetkoten@REDACTED> wrote:
On 12/24/10 12:05 PM, Alain O'Dea wrote:
> On 2010-12-24, at 11:08, JETkoten<jetkoten@REDACTED> wrote:
>> [...] Hi Everyone,
>> I'm (very) new to Erlang, and hoping to get some basic experience with it.
>> I really learn best by doing something I'm interested in. I have a "pet project" that I would like to implement now in Erlang.
>> Here it is:
>> I have a large personal library of books and find that I don't need many of them anymore. I'd like to create a program that will help me manage my online sales on a marketplace site, by automatically checking competing sellers' prices at a set time interval and then logging into their website and adjusting my prices according to a formula I'd set based on the other prices.
>> I did a Google search on Erlang "screen scraping" and saw some options:
>> www_tools, Yaws parser, xmerl, mochiweb
>> However, none of the posts that suggest those are less than 2 years old... which is the best/easiest way, and/or are there newer, better options now?
>> Any ideas?
>> Thanks in advance,
> Hi Jack:
> [...] Gradually it probably makes sense to switch to native Erlang utilities if you find them to perform or integrate better.
> Eventually it makes sense to use OTP and supervisors to consistently handle agent crashes. If you find yourself writing a lot of try/catch logic, then stop and refactor to OTP. Erlang and OTP in Action http://manning.com/logan is the best book for this.
> Cheers and Merry Christmas,
Thanks very much for your reply.
So, I do want to try and implement this with the native Erlang
utilities, and am not sure where to begin. I started writing a module
and then tried to think of what kind of functions I could use to perform
these tasks, but I don't know how to access a website to screen scrape
with Erlang or how/where to efficiently store and retrieve the
price/title data that I would scrape.
Would Mnesia or something like Riak be good for the storage part?
I also don't know how to get my program to log in to the site after
calculating the new price from the scraped data and then changing it on
the marketplace site...
In the OTP version, would I be looking at gen_server or maybe gen_fsm to
complete the tasks? I looked through the Erlang and OTP in Action book
in a bookstore, but it seems too advanced for me at this point to get
much benefit from.
I'm truly a beginner here, so any concrete steps/tools anyone can offer
would be a huge help! I've been looking through the online tutorials and
books, but can't seem to find much about Erlang and WWW-related tasks.