[erlang-questions] configure http methods in yaws

Steve Vinoski vinoski@REDACTED
Wed Jul 1 03:48:32 CEST 2015


On Tue, Jun 30, 2015 at 11:09 AM, Bogdan Andu <bog495@REDACTED> wrote:

>
> So this behavior is intended: only the headers are read,
> and only a part of them, because not even the query params of a GET
> are read from the socket.
>

For a GET request, query params are part of the requested URL. This means
that for a dispatchmod, they're available in the req field of the arg,
specifically as part of the request path in the http_request record
occupying the arg.req field. Assuming the http_request.path field is an
{abs_path, RawPath} tuple, your dispatchmod can get the query params by
calling yaws_api:url_decode_q_split(RawPath), which will give you a {Path,
QueryData} tuple. If you then set the arg.querydata field to QueryData, you
should be able to call yaws_api:parse_query(Arg) to get a list of key/value
query param pairs.
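
The steps above can be sketched roughly as follows. This is only a minimal
sketch: the module name query_dispatch and the helper query_params/1 are
made up for illustration, and it assumes Yaws is installed so that
yaws_api.hrl can be found with include_lib.

```erlang
%% Hypothetical dispatchmod sketch; module and helper names are made up.
%% Assumes Yaws is installed so yaws_api.hrl and yaws_api are available.
-module(query_dispatch).
-export([dispatch/1, query_params/1]).

-include_lib("yaws/include/yaws_api.hrl").

%% Returns {Path, Params, Arg1} for an {abs_path, RawPath} request, where
%% Params is a list of key/value pairs and Arg1 is the #arg{} with its
%% querydata field filled in.
query_params(#arg{req = #http_request{path = {abs_path, RawPath}}} = Arg) ->
    {Path, QueryData} = yaws_api:url_decode_q_split(RawPath),
    Arg1 = Arg#arg{querydata = QueryData},
    {Path, yaws_api:parse_query(Arg1), Arg1};
query_params(Arg) ->
    %% Any other path form is not handled in this sketch.
    {undefined, [], Arg}.

dispatch(Arg) ->
    {_Path, Params, _Arg1} = query_params(Arg),
    error_logger:info_msg("query params: ~p~n", [Params]),
    %% 'continue' asks Yaws to carry on with its regular dispatching.
    continue.
```

Note that dispatch/1 returns only an atom, so the updated Arg1 is visible
only inside the dispatchmod itself; it is not handed back to Yaws.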

Keep in mind that a dispatchmod is suitable only for systems that already
are mostly capable of handling web requests, such as the Webmachine-Yaws
integration I mentioned in my last email. It sounds like dispatchmod is not
suitable for your application, given that you keep asking for parsing of
query params, handling POST bodies, and other capabilities that Yaws
already provides for you in its normal request handling path. The
dispatchmod is specifically designed to bypass all that because it's
assumed you have other code to handle it for you in that case.


> So, basically the body is not read in a dispatchmod, right?
> Only a part of the headers. Or all?
>

For a dispatchmod, the body is not read. All headers are read and stored in
Arg.


> And then, in a dispatchmod, how do I know whether there are unparsed
> headers, so that I know exactly when to parse the body of the request?
>

All headers are parsed. If the request has a body, it's ready to be read
off the socket by the dispatchmod.
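
As a rough illustration of reading the body, here is a minimal sketch using
only gen_tcp. The module name body_reader, the function read_body/2 and the
30-second timeout are made up for this example. It assumes a plain TCP (not
SSL) client socket that is in passive, raw packet mode once the headers have
been read, and a request that carries a Content-Length header rather than
chunked transfer encoding; none of that is guaranteed by anything above.

```erlang
%% Hypothetical helper; module name, function name and timeout are
%% illustrative only. Plain TCP sockets in passive mode are assumed.
-module(body_reader).
-export([read_body/2]).

%% ContentLength is the Content-Length header value as a string, or
%% 'undefined' when the header is absent.
read_body(_Sock, undefined) ->
    {ok, <<>>};
read_body(Sock, ContentLength) when is_list(ContentLength) ->
    case parse_length(ContentLength) of
        {ok, 0} ->
            %% Nothing to read; recv with length 0 would wait for any data.
            {ok, <<>>};
        {ok, Len} ->
            %% Read exactly Len bytes, giving up after 30 seconds.
            case gen_tcp:recv(Sock, Len, 30000) of
                {ok, Data} -> {ok, iolist_to_binary(Data)};
                {error, _} = Err -> Err
            end;
        error ->
            {error, bad_content_length}
    end.

%% Strictly parse a non-negative integer from the header string.
parse_length(Str) ->
    try list_to_integer(string:trim(Str)) of
        N when N >= 0 -> {ok, N};
        _ -> error
    catch
        error:badarg -> error
    end.
```

In a dispatchmod this might be called as
body_reader:read_body(Arg#arg.clisock, (Arg#arg.headers)#headers.content_length),
assuming the yaws_api.hrl records are included. Very large bodies would be
better read in smaller pieces than in a single recv call.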

I've copied the erlyaws mailing list on this reply, because I really REALLY
urge you once again to take this conversation to that list. I don't think
erlang-questions is the right place for detailed Yaws questions.

--steve





>
> Thanks,
> Bogdan
>
>
>
>
> On Tue, Jun 30, 2015 at 3:36 PM, Steve Vinoski <vinoski@REDACTED> wrote:
>
>>
>>
>> On Tue, Jun 30, 2015 at 6:38 AM, Bogdan Andu <bog495@REDACTED> wrote:
>>
>>> I know an intercept module does not have clidata populated.
>>>
>>> I was saying that in the dispatch module I want POST data.
>>>
>>> In a configuration like this:
>>>
>>> ....
>>> <server localhost>
>>>         port = 8088
>>>         listen = 127.0.0.1
>>>         listen_backlog = 100
>>>         dispatchmod = dispatch_rewrite
>>>         docroot = /tmp
>>>         revproxy = / http://127.0.0.1:8080/ intercept_mod intercept_cgi
>>> </server>
>>>
>>> I thought dispatch_rewrite would give me clidata, but clidata for a POST
>>> to a CGI script remains undefined.
>>>
>>
>> As its name implies, a dispatch module is useful when you're taking over
>> dispatching from Yaws. For example, I've used it for a video delivery
>> application and for integrating webmachine and Yaws, the latter because
>> webmachine already performs all its own request processing and reply
>> delivery. For the dispatchmod case Yaws reads just enough of the request to
>> build a minimal #arg{} and assumes the dispatch module will handle the
>> rest, including reading additional data from the socket when warranted, so
>> if your dispatchmod is expecting POSTs, it will have to handle them itself.
>>
>> --steve
>>
>>
>>
>>>
>>> Thanks,
>>> Bogdan
>>>
>>>
>>> On Mon, Jun 29, 2015 at 8:48 PM, Steve Vinoski <vinoski@REDACTED> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jun 29, 2015 at 11:10 AM, Bogdan Andu <bog495@REDACTED> wrote:
>>>>
>>>>> Thanks for reply.
>>>>>
>>>>> I tested Yaws in revproxy mode myself and
>>>>> I like it.
>>>>>
>>>>> Although I don't know how to capture POST data
>>>>> which should be present in Arg#arg.clidata field
>>>>> which is also undefined.
>>>>>
>>>>> I searched the web and docs and found nothing.
>>>>>
>>>>> I wrongly assumed that the Arg#arg.querydata field holds POST data.
>>>>>
>>>>> I want POST data to apply some regular expression checks
>>>>> (like mod_rewrite in Apache) on them
>>>>>
>>>>
>>>> Pretty sure an intercept_mod has access only to information about the
>>>> request, and to the headers. The revproxy code uses an internal state
>>>> record to track details about POST data and such, but an intercept mod
>>>> doesn't have access to that state. You might consider posting an issue to
>>>> the yaws github project (https://github.com/klacke/yaws) to see if
>>>> this functionality can be added.
>>>>
>>>> --steve
>>>>
>>>>
>>>>
>>>>> Thanks,
>>>>> Bogdan
>>>>>
>>>>>
>>>>> On Mon, Jun 29, 2015 at 5:49 PM, Steve Vinoski <vinoski@REDACTED>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Jun 28, 2015 at 11:20 AM, Bogdan Andu <bog495@REDACTED>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I know the docs,
>>>>>>> and I run Yaws internally in reverse proxy mode,
>>>>>>> and I want this in an Internet-facing setup as well.
>>>>>>>
>>>>>>> I searched the net 'yaws reverse proxy' and I found:
>>>>>>>
>>>>>>> 1)
>>>>>>> http://stackoverflow.com/questions/917546/has-anybody-used-yaws-server-as-an-http-proxy
>>>>>>>    Although it is 6 years old: is Yaws in reverse proxy mode comparable to,
>>>>>>>    or even better than, Varnish, HAProxy, or nginx?
>>>>>>>
>>>>>>
>>>>>> That question/answer is out of date, since revproxy code was largely
>>>>>> refactored in the 2012-2014 timeframe.
>>>>>>
>>>>>>
>>>>>>> 2) http://osdir.com/ml/web.server.yaws.general/2007-12/msg00000.html
>>>>>>>     This one is from klacke, and he speaks about the difficulties he
>>>>>>>     encountered writing the revproxy engine, as well as an OTP limitation
>>>>>>>     being the main obstacle to overcoming them.
>>>>>>>     Do these still apply today?
>>>>>>>
>>>>>>
>>>>>> Probably not, given the rewrite.
>>>>>>
>>>>>>
>>>>>>> 3) also found this:
>>>>>>> http://www.erlang-factory.com/upload/presentations/752/reed-efsf2013-whatsapp.pdf
>>>>>>>     It seems WhatsApp uses or used Yaws in revproxy mode with some
>>>>>>>     tweaks.
>>>>>>>
>>>>>>> The setup I want is simple:
>>>>>>>
>>>>>>> Because the applications are written in another language
>>>>>>> and I don't have the time to rewrite them in Erlang, I must
>>>>>>> use this setup:
>>>>>>>
>>>>>>>  - Yaws as the front end in revproxy mode, with an interception module
>>>>>>>    performing plenty of checks on headers, cookies, etc.
>>>>>>>  - Twiggy/Starlet/Starman as back-end or psgi server
>>>>>>>    more info here:
>>>>>>> http://www.slideshare.net/kazeburo/yapc2013psgi-plack
>>>>>>>
>>>>>>>  I want to keep connections alive between Yaws and the PSGI servers to
>>>>>>>  avoid the 3-way handshake overhead. With the outside world, however,
>>>>>>>  I want to disable keepalive and set 'connection: close'; all of this
>>>>>>>  can be done by altering the headers through the interception module.
>>>>>>>
>>>>>>>
>>>>>>> Is it safe to keep connections alive between Yaws and Twiggy, for
>>>>>>> example, which is a Libevent-based web server?
>>>>>>> Or a Starlet/Starman-like web server, which has a parallel pre-fork
>>>>>>> model?
>>>>>>>
>>>>>>
>>>>>> Seems like it should be OK but I've never used the backend servers
>>>>>> you're talking about, so I can't say for sure whether it's safe or not. You
>>>>>> might try asking on the erlyaws mailing list (see
>>>>>> https://lists.sourceforge.net/lists/listinfo/erlyaws-list).
>>>>>>
>>>>>> --steve
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks and sorry for long post,
>>>>>>>
>>>>>>> Bogdan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Jun 28, 2015 at 4:51 AM, Steve Vinoski <vinoski@REDACTED>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jun 27, 2015 at 4:58 AM, Bogdan Andu <bog495@REDACTED>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> About yaws as reverse proxy..
>>>>>>>>>
>>>>>>>>> I want to use yaws as a reverse proxy in a
>>>>>>>>> http -> http setup. no ssl involved whatsoever.
>>>>>>>>>
>>>>>>>>> I am interested in interception module where I want
>>>>>>>>> to apply various checks on headers, query string, etc
>>>>>>>>> making this some kind of www firewall.
>>>>>>>>>
>>>>>>>>> Is this Yaws feature production ready?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. You can find details about it in chapter 13 of
>>>>>>>> http://yaws.hyber.org/yaws.pdf, or under the revproxy section of
>>>>>>>> http://yaws.hyber.org/yman.yaws?page=yaws.conf .
>>>>>>>>
>>>>>>>> --steve
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Bogdan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Feb 13, 2015 at 6:14 PM, Steve Vinoski <vinoski@REDACTED>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I don't recall seeing Yaws users asking for this config feature
>>>>>>>>>> in the past, so it's unlikely we'll add it. But what you're asking for -- a
>>>>>>>>>> configuration point for methods -- would be implemented much as I've shown
>>>>>>>>>> in my previous emails, much like a dispatchmod. The dispatchmod is as early
>>>>>>>>>> in the request handling process as you can get after the formation of the
>>>>>>>>>> #arg{}. The dispatchmod code I provided requires less configuration than
>>>>>>>>>> what you're showing, even for the default case, plus if having to have a
>>>>>>>>>> new module concerns you, the dispatch/1 function can be added to some other
>>>>>>>>>> existing module you already have instead.
>>>>>>>>>>
>>>>>>>>>> --steve
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 13, 2015 at 4:30 AM, Bogdan Andu <bog495@REDACTED>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes but the point is to have a default configuration that can be
>>>>>>>>>>> overridden by such a mechanism
>>>>>>>>>>> if one is configured.
>>>>>>>>>>> 99 percent of cases need only the default behaviour.
>>>>>>>>>>>
>>>>>>>>>>> The way I see it, we would have something like this (all in one):
>>>>>>>>>>>
>>>>>>>>>>> <LIMIT POST GET>
>>>>>>>>>>>         mod_405=my_405_handle_module
>>>>>>>>>>> ....
>>>>>>>>>>> </LIMIT>
>>>>>>>>>>>
>>>>>>>>>>> in this way we can also customize the response if a method other
>>>>>>>>>>> than GET or POST is sent to the server
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 13, 2015 at 10:17 AM, Imants Cekusins <
>>>>>>>>>>> imantc@REDACTED> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> > Traffic with methods not allowed should be discarded with 405
>>>>>>>>>>>>
>>>>>>>>>>>> you see, someone else might prefer another action depending on
>>>>>>>>>>>> method
>>>>>>>>>>>> not allowed.
>>>>>>>>>>>>
>>>>>>>>>>>> a dedicated attribute may be convenient but then someone would
>>>>>>>>>>>> ask:
>>>>>>>>>>>> "how do I change the response code? how do I redirect?". Current
>>>>>>>>>>>> approach gives you choice.
>>>>>>>>>>>>
>>>>>>>>>>>> one of those cases when there is more than one approach, a
>>>>>>>>>>>> prefers  A,
>>>>>>>>>>>> b prefers B. Both have a valid point.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>