[erlang-questions] Beginner Screen Scraping, and Auto-login/input on 3rd party web app?

JETkoten <>
Tue Jan 4 02:45:47 CET 2011


Thanks Jared, Jeroen and Jesse... I've been looking into the options 
you've sent and trying to wrap my head around the concepts.
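
For anyone following along, here is a rough sketch of how the suggestions quoted below might fit together: httpc to fetch a page and the 're' module for a first pass at pulling prices out of it. The module name scrape_sketch, the function names, and the <span class="price"> markup are assumptions made up for illustration, not the real marketplace's HTML. A regex like this is brittle, so mochiweb_html is probably the sturdier choice for real pages.

```erlang
-module(scrape_sketch).
-export([fetch/1, prices/1, demo/0]).

%% Fetch a page body with httpc (part of the inets application).
%% Url is a plain string; https URLs would also need the ssl
%% application to be started first.
fetch(Url) ->
    _ = inets:start(),  % ok, or {error,{already_started,inets}}
    case httpc:request(Url) of
        {ok, {{_Vsn, 200, _Reason}, _Headers, Body}} ->
            {ok, Body};
        {ok, {{_Vsn, Status, _Reason}, _Headers, _Body}} ->
            {error, {http_status, Status}};
        {error, Reason} ->
            {error, Reason}
    end.

%% Extract prices from HTML, assuming (hypothetically) that each
%% price is marked up as <span class="price">$12.50</span>.
prices(Html) ->
    Pattern = "<span class=\"price\">\\$([0-9]+\\.[0-9][0-9])</span>",
    case re:run(Html, Pattern, [global, {capture, all_but_first, list}]) of
        {match, Matches} -> [list_to_float(P) || [P] <- Matches];
        nomatch -> []
    end.

%% Runs the extraction on a made-up fragment, with no network access.
demo() ->
    Html = "<li><span class=\"price\">$12.50</span></li>"
           "<li><span class=\"price\">$9.99</span></li>",
    prices(Html).
```

In the Erlang shell, after c(scrape_sketch)., calling scrape_sketch:demo() should return [12.5,9.99]. The body returned by fetch/1 can be passed straight to prices/1.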

Thanks again,
Jack

On 12/29/10 11:29 AM, Jeroen Koops wrote:
> Yes, mochiweb has the mochiweb_html module, which parses HTML.
>
> On Wed, Dec 29, 2010 at 5:24 PM, Jesse Gumm<>  wrote:
>> Hi there,
>>
>> I'd recommend looking into the httpc module, which is an HTTP client.
>>
>> As for parsing the data, you could check out the 're' module for regular expressions. It could work for most cases, but HTML is not a regular language, so a regex can't parse it reliably in general. I don't know of an Erlang module for parsing an HTML DOM tree; perhaps mochiweb provides something.
>>
>> -Jesse
>>
>>
>> --
>> Jesse Gumm
>> Sigma Star Systems
>> 414.940.4866
>> On Dec 29, 2010 10:19 AM, JETkoten<> wrote:
>>
>> On 12/24/10 12:05 PM, Alain O'Dea wrote:
>> > On 2010-12-24, at 11:08, JETkoten<>  wrote:
>> >
>> >> [...] Hi Everyone,
>> >>
>> >> I'm (very) new to Erlang, and hoping to get some basic experience with it.
>> >>
>> >> I really learn best by doing something I'm interested in. I have a "pet project" that I would like to implement now in Erlang.
>> >>
>> >> Here it is:
>> >>
>> >> I have a large personal library of books and find that I don't need many of them anymore. I'd like to create a program that will help me manage my online sales on a marketplace site, by automatically checking competing sellers' prices at a set time interval and then logging in to the marketplace site and adjusting my prices according to a formula I'd set based on the other prices.
>> >>
>> >> I did a Google search on Erlang "screen scraping" and saw some options:
>> >>
>> >> www_tools, Yaws parser, xmerl, mochiweb
>> >>
>> >> However, none of the posts that suggest those are less than 2 years old... Which is the best/easiest way, and/or are there newer, better options now?
>> >>
>> >> Any ideas?
>> >>
>> >> Thanks in advance,
>> >> Jack
>> >
>> > Hi Jack:
>> >
>> > [...] Gradually it probably makes sense to switch to native Erlang utilities if you find them to perform or integrate better.
>> >
>> > [...]
>> >
>> > Eventually it makes sense to use OTP and supervisors to consistently handle agent crashes. If you find yourself writing a lot of try/catch logic, then stop and refactor to OTP. Erlang and OTP in Action http://manning.com/logan is the best book for this.
>> >
>> > Cheers and Merry Christmas,
>> > Alain
>>
>> Hi Alain,
>>
>> Thanks very much for your reply.
>>
>> So, I do want to try to implement this with the native Erlang
>> utilities, but I'm not sure where to begin. I started writing a module
>> and then tried to think of what kind of functions I could use to perform
>> these tasks, but I don't know how to access a website to screen scrape
>> with Erlang, or how and where to efficiently store and retrieve the
>> price/title data that I would scrape.
>>
>> Would Mnesia or something like Riak be good for the storage part?
>>
>> I also don't know how to get my program to log in to the marketplace
>> site after calculating the new price from the scraped data and then
>> change the price there...
>>
>> In the OTP version, would I be looking at gen_server or maybe gen_fsm to
>> complete the tasks? I looked through the Erlang and OTP in Action book
>> in a bookstore, but it seems too advanced for me at this point to get
>> much benefit from.
>>
>> I'm truly a beginner here, so any concrete steps/tools anyone can offer
>> would be a huge help! I've been looking through the online tutorials and
>> books, but can't seem to find much about Erlang and WWW related tasks
>> like these.
>>
>> Thanks again,
>> Jack

