[erlang-questions] configure http methods in yaws

Bogdan Andu bog495@REDACTED
Wed Jul 1 09:09:19 CEST 2015


Thanks for the detailed explanations, and sorry for 'abusing' this list
with Yaws stuff.

Bogdan


On Wed, Jul 1, 2015 at 4:48 AM, Steve Vinoski <vinoski@REDACTED> wrote:

>
>
> On Tue, Jun 30, 2015 at 11:09 AM, Bogdan Andu <bog495@REDACTED> wrote:
>
>>
>> So this behavior is intended: only the headers are read, and only
>> a part of them, because not even the query params of a GET
>> are read from the socket.
>>
>
> For a GET request, query params are part of the requested URL. This means
> that for a dispatchmod, they're available in the req field of the arg,
> specifically as part of the request path in the http_request record
> occupying the arg.req field. Assuming the http_request.path field is an
> {abs_path, RawPath} tuple, your dispatchmod can get the query params by
> calling yaws_api:url_decode_q_split(RawPath), which will give you a {Path,
> QueryData} tuple. If you then set the arg.querydata field to QueryData, you
> should be able to call yaws_api:parse_query(Arg) to get a list of key/value
> query param pairs.
>
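The steps described above can be sketched roughly as follows. This is a minimal sketch, not code from the thread: the module name dispatch_query_sketch and the logging call are illustrative assumptions, it assumes Yaws is installed so that yaws_api.hrl can be included, and it assumes dispatch/1 may return the atom continue to hand the request back to normal Yaws processing.

```erlang
-module(dispatch_query_sketch).   % hypothetical module name
-export([dispatch/1]).

%% The #arg{} and #http_request{} record definitions ship with Yaws.
-include_lib("yaws/include/yaws_api.hrl").

%% Called by Yaws when this module is configured as the dispatchmod.
dispatch(Arg) ->
    Req = Arg#arg.req,
    case Req#http_request.path of
        {abs_path, RawPath} ->
            %% Split the raw request path into the path and the query string.
            {_Path, QueryData} = yaws_api:url_decode_q_split(RawPath),
            %% parse_query/1 reads the querydata field, so set it first.
            Arg1 = Arg#arg{querydata = QueryData},
            Params = yaws_api:parse_query(Arg1),
            %% Illustration only: log the key/value pairs.
            error_logger:info_msg("query params: ~p~n", [Params]);
        _Other ->
            %% Not an abs_path request; nothing to parse in this sketch.
            ok
    end,
    %% Assumption: let Yaws carry on with its regular dispatching.
    continue.
```

A real dispatchmod would normally handle the request itself rather than return continue; the sketch returns it only to keep the example focused on query parsing.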
> Keep in mind that a dispatchmod is suitable only for systems that already
> are mostly capable of handling web requests, such as the Webmachine-Yaws
> integration I mentioned in my last email. It sounds like dispatchmod is not
> suitable for your application, given that you keep asking for parsing of
> query params, handling POST bodies, and other capabilities that Yaws
> already provides for you in its normal request handling path. The
> dispatchmod is specifically designed to bypass all that because it's
> assumed you have other code to handle it for you in that case.
>
>
>> So, basically the body is not read in a dispatchmod, right?
>> Only a part of the headers. Or all?
>>
>
> For a dispatchmod, the body is not read. All headers are read and stored
> in Arg.
>
>
>> And then, in a dispatchmod, how do I know whether there are unparsed
>> headers, so that I know exactly when to parse the body of the request?
>>
>
> All headers are parsed. If the request has a body, it's ready to be read
> off the socket by the dispatchmod.
>
> I've copied the erlyaws mailing list on this reply, because I really
> REALLY urge you once again to take this conversation to that list. I don't
> think erlang-questions is the right place for detailed Yaws questions.
>
> --steve
>
>
>
>
>
>>
>> Thanks,
>> Bogdan
>>
>>
>>
>>
>> On Tue, Jun 30, 2015 at 3:36 PM, Steve Vinoski <vinoski@REDACTED> wrote:
>>
>>>
>>>
>>> On Tue, Jun 30, 2015 at 6:38 AM, Bogdan Andu <bog495@REDACTED> wrote:
>>>
>>>> I know an intercept module does not have clidata populated.
>>>>
>>>> I was saying that in a dispatch module I want POST data.
>>>>
>>>> In a configuration like this:
>>>>
>>>> ....
>>>> <server localhost>
>>>>         port = 8088
>>>>         listen = 127.0.0.1
>>>>         listen_backlog = 100
>>>>         dispatchmod = dispatch_rewrite
>>>>         docroot = /tmp
>>>>         revproxy = / http://127.0.0.1:8080/ intercept_mod intercept_cgi
>>>> </server>
>>>>
>>>> I expected dispatch_rewrite to give me clidata, but clidata for a POST
>>>> to a CGI script remains undefined.
>>>>
>>>
>>> As its name implies, a dispatch module is useful when you're taking over
>>> dispatching from Yaws. For example, I've used it for a video delivery
>>> application and for integrating webmachine and Yaws, the latter because
>>> webmachine already performs all its own request processing and reply
>>> delivery. For the dispatchmod case Yaws reads just enough of the request to
>>> build a minimal #arg{} and assumes the dispatch module will handle the
>>> rest, including reading additional data from the socket when warranted, so
>>> if your dispatchmod is expecting POSTs, it will have to handle them itself.
>>>
>>> --steve
>>>
>>>
>>>
>>>>
>>>> Thanks,
>>>> Bogdan
>>>>
>>>>
>>>> On Mon, Jun 29, 2015 at 8:48 PM, Steve Vinoski <vinoski@REDACTED>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 29, 2015 at 11:10 AM, Bogdan Andu <bog495@REDACTED>
>>>>> wrote:
>>>>>
>>>>>> Thanks for the reply.
>>>>>>
>>>>>> I tested Yaws in revproxy mode myself and
>>>>>> I like it.
>>>>>>
>>>>>> However, I don't know how to capture POST data,
>>>>>> which should be present in the Arg#arg.clidata field,
>>>>>> which is also undefined.
>>>>>>
>>>>>> I searched the web and the docs and found nothing.
>>>>>>
>>>>>> I wrongly assumed that the Arg#arg.querydata field holds POST data.
>>>>>>
>>>>>> I want the POST data so I can apply some regular expression checks
>>>>>> to it (like mod_rewrite in Apache).
>>>>>>
>>>>>
>>>>> Pretty sure an intercept_mod has access only to information about the
>>>>> request, and to the headers. The revproxy code uses an internal state
>>>>> record to track details about POST data and such, but an intercept mod
>>>>> doesn't have access to that state. You might consider posting an issue to
>>>>> the yaws github project (https://github.com/klacke/yaws) to see if
>>>>> this functionality can be added.
>>>>>
>>>>> --steve
>>>>>
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Bogdan
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 29, 2015 at 5:49 PM, Steve Vinoski <vinoski@REDACTED>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Jun 28, 2015 at 11:20 AM, Bogdan Andu <bog495@REDACTED>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I know the docs,
>>>>>>>> and I run Yaws internally in reverse proxy mode,
>>>>>>>> and I want this in an Internet-facing setup as well.
>>>>>>>>
>>>>>>>> I searched the net for 'yaws reverse proxy' and found:
>>>>>>>>
>>>>>>>> 1)
>>>>>>>> http://stackoverflow.com/questions/917546/has-anybody-used-yaws-server-as-an-http-proxy
>>>>>>>>    Although this is 6 years old: is Yaws in reverse proxy mode
>>>>>>>>    comparable to, or even better than, Varnish, HAProxy, nginx?
>>>>>>>>
>>>>>>>
>>>>>>> That question/answer is out of date, since revproxy code was largely
>>>>>>> refactored in the 2012-2014 timeframe.
>>>>>>>
>>>>>>>
>>>>>>>> 2)
>>>>>>>> http://osdir.com/ml/web.server.yaws.general/2007-12/msg00000.html
>>>>>>>>     This one is from klacke; he talks about the difficulties he
>>>>>>>>     encountered writing the revproxy engine, as well as an OTP
>>>>>>>>     limitation being the main obstacle to overcoming them.
>>>>>>>>     Do these still apply today?
>>>>>>>>
>>>>>>>
>>>>>>> Probably not, given the rewrite.
>>>>>>>
>>>>>>>
>>>>>>>> 3) also found this:
>>>>>>>> http://www.erlang-factory.com/upload/presentations/752/reed-efsf2013-whatsapp.pdf
>>>>>>>>     It seems WhatsApp uses or used Yaws in revproxy mode with
>>>>>>>>     some tweaks.
>>>>>>>>
>>>>>>>> The setup I want is simple:
>>>>>>>>
>>>>>>>> Because my applications are written in another language
>>>>>>>> and I don't have the time to rewrite them in Erlang, I must
>>>>>>>> use this setup:
>>>>>>>>
>>>>>>>>  - Yaws as the front end in revproxy mode, with an interception
>>>>>>>>    module performing plenty of checks on headers, cookies, etc.
>>>>>>>>  - Twiggy/Starlet/Starman as the back-end (PSGI) server;
>>>>>>>>    more info here:
>>>>>>>> http://www.slideshare.net/kazeburo/yapc2013psgi-plack
>>>>>>>>
>>>>>>>>  I want to keep connections alive between Yaws and the PSGI servers
>>>>>>>>  to avoid the 3-way handshake overhead. Towards the outside world,
>>>>>>>>  however, I want to disable keepalive and set 'Connection: close';
>>>>>>>>  all of this can be done by altering the headers through the
>>>>>>>>  interception module.
>>>>>>>>
>>>>>>>>
>>>>>>>> Is it safe to keep connections alive between Yaws and, for example,
>>>>>>>> Twiggy, which is a Libevent-based web server?
>>>>>>>> Or a Starlet/Starman-like web server, which has a parallel pre-fork
>>>>>>>> model?
>>>>>>>>
>>>>>>>
>>>>>>> Seems like it should be OK but I've never used the backend servers
>>>>>>> you're talking about, so I can't say for sure whether it's safe or not. You
>>>>>>> might try asking on the erlyaws mailing list (see
>>>>>>> https://lists.sourceforge.net/lists/listinfo/erlyaws-list).
>>>>>>>
>>>>>>> --steve
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks, and sorry for the long post,
>>>>>>>>
>>>>>>>> Bogdan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Jun 28, 2015 at 4:51 AM, Steve Vinoski <vinoski@REDACTED>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jun 27, 2015 at 4:58 AM, Bogdan Andu <bog495@REDACTED>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> About yaws as reverse proxy..
>>>>>>>>>>
>>>>>>>>>> I want to use yaws as a reverse proxy in a
>>>>>>>>>> http -> http setup. no ssl involved whatsoever.
>>>>>>>>>>
>>>>>>>>>> I am interested in the interception module, where I want
>>>>>>>>>> to apply various checks on headers, query string, etc.,
>>>>>>>>>> making this some kind of WWW firewall.
>>>>>>>>>>
>>>>>>>>>> Is this feature of Yaws production ready?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes. You can find details about it in chapter 13 of
>>>>>>>>> http://yaws.hyber.org/yaws.pdf, or under the revproxy section of
>>>>>>>>> http://yaws.hyber.org/yman.yaws?page=yaws.conf .
>>>>>>>>>
>>>>>>>>> --steve
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Bogdan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 13, 2015 at 6:14 PM, Steve Vinoski <vinoski@REDACTED>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I don't recall seeing Yaws users asking for this config feature
>>>>>>>>>>> in the past, so it's unlikely we'll add it. But what you're asking for -- a
>>>>>>>>>>> configuration point for methods -- would be implemented much as I've shown
>>>>>>>>>>> in my previous emails, much like a dispatchmod. The dispatchmod is as early
>>>>>>>>>>> in the request handling process as you can get after the formation of the
>>>>>>>>>>> #arg{}. The dispatchmod code I provided requires less configuration than
>>>>>>>>>>> what you're showing, even for the default case, plus if having to have a
>>>>>>>>>>> new module concerns you, the dispatch/1 function can be added to some other
>>>>>>>>>>> existing module you already have instead.
>>>>>>>>>>>
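A method-limiting dispatch/1 of the kind discussed above might look roughly like the sketch below. It is not the code from the earlier emails, which are not part of this message. The module name limit_methods_sketch, the helper allowed/1, and the choice of GET and POST are illustrative assumptions. The sketch also assumes a plain (non-SSL) client socket in Arg#arg.clisock and that dispatch/1 may return continue or closed.

```erlang
-module(limit_methods_sketch).   % hypothetical module name
-export([dispatch/1, allowed/1]).

%% The #arg{} and #http_request{} record definitions ship with Yaws.
-include_lib("yaws/include/yaws_api.hrl").

%% Methods passed through to normal Yaws processing (illustrative choice).
allowed('GET')  -> true;
allowed('POST') -> true;
allowed(_)      -> false.

dispatch(Arg) ->
    Req = Arg#arg.req,
    case allowed(Req#http_request.method) of
        true ->
            %% Assumption: hand the request back to regular Yaws dispatching.
            continue;
        false ->
            %% Minimal hand-written 405 reply, sent as iodata.
            Reply = ["HTTP/1.1 405 Method Not Allowed\r\n",
                     "Allow: GET, POST\r\n",
                     "Content-Length: 0\r\n",
                     "Connection: close\r\n\r\n"],
            %% Assumption: clisock is a plain gen_tcp socket (no SSL).
            _ = gen_tcp:send(Arg#arg.clisock, Reply),
            _ = gen_tcp:close(Arg#arg.clisock),
            %% Assumption: tells Yaws the request was handled and the
            %% connection is already closed.
            closed
    end.
```

Such a module would be enabled with a dispatchmod = limit_methods_sketch line in the server block of yaws.conf, as in the dispatchmod configuration shown elsewhere in this thread.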
>>>>>>>>>>> --steve
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 13, 2015 at 4:30 AM, Bogdan Andu <bog495@REDACTED>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, but the point is to have a default configuration that can
>>>>>>>>>>>> be overridden by such a mechanism
>>>>>>>>>>>> if one is configured.
>>>>>>>>>>>> 99 percent of cases only need the default behaviour.
>>>>>>>>>>>>
>>>>>>>>>>>> The way I see this is to have something like this (all in one):
>>>>>>>>>>>>
>>>>>>>>>>>> <LIMIT POST GET>
>>>>>>>>>>>>         mod_405=my_405_handle_module
>>>>>>>>>>>> ....
>>>>>>>>>>>> </LIMIT>
>>>>>>>>>>>>
>>>>>>>>>>>> in this way we can also customize the response if a method
>>>>>>>>>>>> other than GET or POST is sent to the server
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 13, 2015 at 10:17 AM, Imants Cekusins <
>>>>>>>>>>>> imantc@REDACTED> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> > Traffic with methods not allowed should be discarded with 405
>>>>>>>>>>>>>
>>>>>>>>>>>>> you see, someone else might prefer another action depending on
>>>>>>>>>>>>> method
>>>>>>>>>>>>> not allowed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> a dedicated attribute may be convenient but then someone would
>>>>>>>>>>>>> ask:
>>>>>>>>>>>>> "how do I change the response code? how do I redirect?".
>>>>>>>>>>>>> Current
>>>>>>>>>>>>> approach gives you choice.
>>>>>>>>>>>>>
>>>>>>>>>>>>> one of those cases when there is more than one approach, a
>>>>>>>>>>>>> prefers  A,
>>>>>>>>>>>>> b prefers B. Both have a valid point.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>