anybody written a web cache?
Wed Feb 10 21:29:57 CET 2010
Has anybody written a simple web cache?
What I'd like is:
- my own policy for time-to-live (i.e. many (most) web pages come back
with cache-control directives that force page refreshes)
- rate limiting (per site) - i.e. I don't want to annoy sites by
requesting too much data in a short time
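
To make the two wishes above concrete, here is a rough sketch of what I have in mind. It is in Python only for brevity and is not a finished design. The names (PoliteCache, ttl, min_gap, http_get) are made up for illustration, and the one-hour TTL and five-second gap are arbitrary assumptions. The cache ignores whatever cache-control headers the site sends and applies its own time-to-live, and it spaces out requests to the same host.

```python
import time
from urllib.parse import urlsplit
from urllib.request import urlopen


def http_get(url):
    """Default fetcher (hypothetical helper): plain GET, returns the body as bytes."""
    with urlopen(url) as resp:
        return resp.read()


class PoliteCache:
    """In-memory cache with a caller-chosen TTL and a per-host minimum request gap."""

    def __init__(self, ttl=3600.0, min_gap=5.0, fetch=http_get,
                 clock=time.monotonic, sleep=time.sleep):
        self.ttl = ttl          # seconds a page counts as fresh (our policy, not the site's)
        self.min_gap = min_gap  # minimum seconds between two requests to one host
        self.fetch = fetch      # injectable, so the cache can be tried without a network
        self.clock = clock
        self.sleep = sleep
        self.pages = {}         # url -> (fetched_at, body)
        self.last_hit = {}      # host -> time of the last request sent to it

    def get(self, url):
        now = self.clock()
        hit = self.pages.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]       # still fresh under our own TTL: no request at all

        host = urlsplit(url).netloc
        last = self.last_hit.get(host)
        if last is not None:
            wait = self.min_gap - (now - last)
            if wait > 0:
                self.sleep(wait)  # rate limit: space out requests to this site

        body = self.fetch(url)
        fetched_at = self.clock()
        self.last_hit[host] = fetched_at
        self.pages[url] = (fetched_at, body)
        return body
```

Usage would be something like cache = PoliteCache(ttl=86400) followed by cache.get(url) wherever a page is needed. A real version would want persistent storage and one worker per site rather than a blocking sleep, but the policy is the same.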
I'm interested in writing some web-aggregation/mashup software - why?
- because I'm fed up with all the crap that most sites deliver.
<example>I've been looking at ads for apartments in Stockholm - one
of the largest sites which has ads for apartments is rather slow - if
you click on the data for one apartment, 134 HTTP GET requests are
issued - most are marked uncacheable. The generated HTML is crap.
The same is true for hotels - try booking a hotel somewhere ...</example>
Seems pretty easy to fetch and cache data on apartments for sale, and hotels,
extract the *information* content and re-mesh the information.
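
As a first cut at the "extract the information" step, something like the following might do. It is a sketch using only Python's standard html.parser; TextExtractor and extract_text are names invented here, and real apartment pages would certainly need site-specific rules on top. All it does is keep the visible text and throw away scripts and stylesheets.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a script or style element
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of visible text fragments found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

The fragments could then be matched against whatever fields matter (price, rooms, address) before re-meshing.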
If the individual sites rate-limit queries from my machine we could
set up a distributed net to request the data from random hosts :-)
Could be a fun project