HTTP (Hypertext Transfer Protocol) is an application-level protocol with the lightness and speed necessary for distributed, collaborative and hyper-media information systems. The httpd module handles HTTP requests as described in RFC 2616 with a few exceptions such as gateway and proxy functionality. The same is true for servers written by NCSA and others.
The server implements numerous features such as SSL (Secure Sockets Layer), ESI (Erlang Scripting Interface), CGI (Common Gateway Interface), User Authentication (using Mnesia, dets or plain text database), Common Logfile Format (with or without disk_log(3) support), URL Aliasing, Action Mappings, Directory Listings and SSI (Server-Side Includes).
The configuration of the server is done using Apache-style configuration directives. The goal is to be plug-in compatible with Apache.
All server functionality has been implemented using a specially crafted server API, EWSAPI (Erlang Web Server API). This API can be used by anyone who wants to enhance the server core functionality, for example with custom logging and authentication.
All functionality in the server can be configured using Apache-style configuration directives stored in a configuration file. Take a look at the example config files in the conf directory (UNIX: $INETS_ROOT/examples/server_root/conf/, Windows: %INETS_ROOT%\examples\server_root\conf\) of the server root for a complete understanding.
An alphabetical list of all config directives:
All server functionality has been implemented using EWSAPI (Erlang Web Server API) modules. The following modules are available:
disk_log(3)
.
Each module has a man page that further describes its functionality.
The Modules config directive can be used to alter the server behavior by altering the EWSAPI Module Sequence. An example module sequence can be found in the example config directory. If this needs to be altered, read the EWSAPI Module Interaction section below.
start()
start(ConfigFile) -> ServerRet
start_link()
start_link(ConfigFile) -> ServerRet
ConfigFile = string()
ServerRet = {ok,Pid} | ignore | {error,EReason} | {stop,SReason}
Pid = pid()
EReason = {already_started, Pid} | term()
SReason = string()
start/1 and start_link/1 start a server as specified in the given ConfigFile. The ConfigFile supports a number of config directives specified below.
start/0 and start_link/0 start a server as specified in a hard-wired config file, that is start("/var/tmp/server_root/conf/8888.conf"). Before using start/0 or start_link/0, copy the example server root (UNIX: $INETS_ROOT/examples/server_root/, Windows: %INETS_ROOT%\examples\server_root\) to a specific installation directory (UNIX: /var/tmp/, Windows: X:\var\tmp\) and you have a server running in no time.
If you copy the example server root to the specific installation directory it is furthermore easy to start an SSL enabled server, that is start("/var/tmp/server_root/conf/ssl.conf").
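The return values listed for start/1 can be handled with a plain pattern match. The following is a minimal sketch, assuming the httpd:start/1 call documented in this section; the module name start_sketch and the helper describe/1 are hypothetical and exist only for illustration.

```erlang
-module(start_sketch).
-export([start_and_report/1, describe/1]).

%% Start a server from ConfigFile and turn the result into a short,
%% human-readable string. httpd:start/1 is the call described above.
start_and_report(ConfigFile) ->
    describe(httpd:start(ConfigFile)).

%% describe/1 is a hypothetical helper with one clause per documented
%% ServerRet alternative.
describe({ok, Pid}) when is_pid(Pid) ->
    "started";
describe(ignore) ->
    "ignored";
describe({error, {already_started, Pid}}) when is_pid(Pid) ->
    "already running";
describe({error, _EReason}) ->
    "failed to start";
describe({stop, SReason}) ->
    "stopped: " ++ SReason.
```

A call such as start_sketch:start_and_report("/var/tmp/server_root/conf/8888.conf") would then report which of the documented outcomes occurred.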
restart()
restart(Port) -> ok | {error,Reason}
restart(ConfigFile) -> ok | {error,Reason}
restart(Address,Port) -> ok | {error,Reason}
Port = integer()
Address = {A,B,C,D} | string() | undefined
ConfigFile = string()
Reason = term()
restart restarts the server and reloads its config file.
The following directives cannot be changed: BindAddress, Port and SocketType. If these need to be changed, start a new server instead.
Before the |
stop()
stop(Port) -> ServerRet
stop(ConfigFile) -> ServerRet
stop(Address,Port) -> ServerRet
Port = integer()
Address = {A,B,C,D} | string() | undefined
ConfigFile = string()
ServerRet = ok | not_started
stop/2 stops the server which listens to the specified Port on Address.
stop(integer()) stops a server which listens to a specific Port.
stop(string()) extracts BindAddress and Port from the config file and stops the server which listens to the specified Port on Address.
stop/0 stops a server which listens to port 8888, that is stop(8888).
block() -> ok | {error,Reason}
block(Port) -> ok | {error,Reason}
block(ConfigFile) -> ok | {error,Reason}
block(Address,Port) -> ok | {error,Reason}
block(Port,Mode) -> ok | {error,Reason}
block(ConfigFile,Mode) -> ok | {error,Reason}
block(Address,Port,Mode) -> ok | {error,Reason}
block(ConfigFile,Mode,Timeout) -> ok | {error,Reason}
block(Address,Port,Mode,Timeout) -> ok | {error,Reason}
Port = integer()
Address = {A,B,C,D} | string() | undefined
ConfigFile = string()
Mode = disturbing | non_disturbing
Timeout = integer()
Reason = term()
This function is used to block a server. The blocking can be done in two ways, disturbing or non-disturbing.
A disturbing block blocks the server forcefully: all ongoing requests are terminated and no new connections are accepted. If a timeout is given, ongoing requests are given that much time to complete before the server is forcefully blocked. No new connections are accepted during this time either.
A non-disturbing block is more graceful. No new connections are accepted, but ongoing requests are allowed to complete. If a timeout is given, the operation waits that long before giving up (the block operation is aborted and the server state is once more not-blocked).
The default mode is disturbing.
The default port is 8888.
unblock() -> ok | {error,Reason}
unblock(Port) -> ok | {error,Reason}
unblock(ConfigFile) -> ok | {error,Reason}
unblock(Address,Port) -> ok | {error,Reason}
Port = integer()
Address = {A,B,C,D} | string() | undefined
ConfigFile = string()
Reason = term()
Unblocks a server. If the server is already unblocked this is a no-op. If a block is ongoing, then it is aborted (this will have no effect on ongoing requests).
parse_query(QueryString) -> ServerRet
QueryString = string()
ServerRet = [{Key,Value}]
Key = Value = string()
parse_query/1 parses incoming data to erl and eval scripts (see mod_esi(3)) as defined in the standard URL format, that is, '+' becomes a space and hexadecimal escapes (%xx) are decoded.
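To make the decoding rules concrete, here is a minimal sketch of the transformation described above. It is not the implementation used by httpd; the module query_sketch and its functions are hypothetical, and the sketch assumes well-formed input (an invalid %xx escape makes list_to_integer/2 fail).

```erlang
-module(query_sketch).
-export([parse/1, decode/1]).

%% Split "k1=v1&k2=v2" into [{Key, Value}] and decode each part.
parse(QueryString) ->
    [pair(Field) || Field <- string:tokens(QueryString, "&")].

%% Split one "key=value" field at the first '='.
%% A field without '=' yields an empty value.
pair(Field) ->
    {Key, Rest} = lists:splitwith(fun(C) -> C =/= $= end, Field),
    Value = case Rest of
                [$= | V] -> V;
                [] -> []
            end,
    {decode(Key), decode(Value)}.

%% '+' becomes a space and %xx becomes the character with that hex code.
decode([$+ | T]) -> [$\s | decode(T)];
decode([$%, Hi, Lo | T]) -> [list_to_integer([Hi, Lo], 16) | decode(T)];
decode([C | T]) -> [C | decode(T)];
decode([]) -> [].
```

With this sketch, the query string "person=jocke&name=a+b" becomes a list of two key-value tuples, the second with the value "a b".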
Module:do(Info)-> {proceed, OldData} | {proceed, NewData} | {break, NewData} | done
Info = mod()
OldData = list()
NewData = [{response,{StatusCode,Body}}] | [{response,{response,Head,Body2}}] |
[{response,{already_sent,StatusCode,Size}}]
StatusCode = integer()
Body = string()
Head = [HeaderOption]
HeaderOption = {Key, Value} | {code, StatusCode}
Key = allow | cache_control | content_MD5 | content_encoding |
content_language | content_length | content_location | content_range |
content_type | date | etag | expires | last_modified | location | pragma | retry_after |
server | trailer | transfer_encoding
Value = string()
Body2 = {Fun,Arg} | Body | nobody
Fun = fun(Arg) -> sent | close | Body
Arg = [term()]
Info is a record of type mod. This record is defined in httpd.hrl; see EWSAPI Module programming for more information.
When a valid request reaches httpd, it calls do/1 in each module defined by the Modules configuration directive. The function may generate data for other modules or a response that can be sent back to the client.
The field data in Info is a list. This list is the list returned from the last call to do/1.
Body is the body of the HTTP response that will be sent back to the client; an appropriate header will be added to the message. StatusCode is the status code of the response; see RFC 2616 for the appropriate values.
Head is a key-value list of HTTP header fields. The server constructs an HTTP header from this data. See RFC 2616 for the appropriate value for each header field. If the client is an HTTP/1.0 client, the server filters the list so that only HTTP/1.0 header fields are sent back to the client.
If Body2 is returned and equal to {Fun,Arg}, the Web server will try apply/2 on Fun with Arg as argument and expects the fun to return either a list (Body) that is an HTTP response, or the atom sent if the HTTP response has already been sent back to the client. If close is returned from the fun, something has gone wrong and the server signals this to the client by closing the connection.
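As an illustration of the do/1 contract, the sketch below answers one fixed URI and passes every other request on unchanged. The module name mod_hello_sketch, the URI "/hello" and the helper request/1 are hypothetical. The record is a local copy of the mod record shown in this document so that the example is self-contained; a real module would include httpd.hrl instead.

```erlang
-module(mod_hello_sketch).
-export([do/1, request/1]).

%% Local copy of the mod record from this document, for illustration only.
-record(mod, {data = [], socket_type = ip_comm, socket, config_db, method,
              absolute_uri, request_uri, http_version, request_line,
              parsed_header = [], entity_body, connection}).

%% Answer "/hello" with a plain-text body; otherwise proceed with the
%% data from earlier modules untouched.
do(Info) ->
    Data = Info#mod.data,
    case Info#mod.request_uri of
        "/hello" ->
            Head = [{code, 200}, {content_type, "text/plain"}],
            {proceed, [{response, {response, Head, "Hello\n"}} | Data]};
        _Other ->
            {proceed, Data}
    end.

%% Hypothetical helper: build a minimal record for trying out do/1.
request(Uri) ->
    #mod{request_uri = Uri}.
```

Returning {proceed, Data} for unmatched requests keeps the module independent of its position in the module sequence.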
Line = string()
Context = NewContext = DirectiveList = [Directive]
Directive = {DirectiveKey , DirectiveValue}
DirectiveKey = DirectiveValue = term()
Reason = term()
load/2 takes a row Line from the configuration file and tries to convert it to a key-value tuple. If a directive depends on other directives, the directive may create a context. If the directive does not depend on other directives, return {ok, [], Directive}; otherwise return a new context, that is {ok, NewContext} or {ok, Context, Directive}. If {error, Reason} is returned, the configuration directive is assumed to be invalid.
DirectiveList = [{DirectiveKey, DirectiveValue}]
DirectiveKey = DirectiveValue = term()
Context = NewContext = DirectiveList = [Directive]
Directive = {Key , Value}
Reason = term()
When all rows in the configuration file have been read, the function store/2 is called for each configuration directive. This makes it possible for a directive to alter other configuration directives. DirectiveList is a list of all configuration directives read in from load. If a directive may update other configuration directives, use this function.
Module:remove(ConfigDB)-> ok | {error, Reason}
ConfigDB = ets_table()
Reason = term()
When httpd shuts down, it tries to execute remove/1 in each EWSAPI module. The EWSAPI programmer may use this to close ETS tables, save data, or close down background processes.
Writing an EWSAPI module requires considerable Erlang/OTP programming knowledge and is not recommended for the average server user. It is best used only to add core functionality, e.g. custom authentication or an RFC 2109 implementation.
EWSAPI should only be used to add core functionality to the server. In order to generate dynamic content, for example on-the-fly generated HTML, use the standard CGI or ESI facilities instead.
As seen above, the major part of the server functionality has been realized as EWSAPI modules (from now on only called modules). If you intend to write your own server extension, start by examining the standard modules mod_*.erl (UNIX: $INETS_ROOT/src/, Windows: %INETS_ROOT%\src\) and note how they are configured in the example config directory (UNIX: $INETS_ROOT/examples/server_root/conf/, Windows: %INETS_ROOT%\examples\server_root\conf\).
Each module implements do/1 (mandatory), load/2, store/2 and remove/1. The latter three functions are needed only when new config directives are to be introduced; see EWSAPI Module Configuration.
A module can choose to export functions to be used by other modules in the EWSAPI Module Sequence (see the Modules config directive). This should only be done as an exception! The goal is to keep each module self-sustained, thus making it easy to alter the EWSAPI Module Sequence without any unnecessary module dependencies.
A module can furthermore use data generated by previous modules in the EWSAPI Module Sequence or generate data to be used by consecutive EWSAPI modules. This is made possible due to an internal list of key-value tuples, see EWSAPI Module Interaction.
The server executes |
-record(mod,{data=[],
             socket_type=ip_comm,
             socket,
             config_db,
             method,
             absolute_uri,
             request_uri,
             http_version,
             request_line,
             parsed_header=[],
             entity_body,
             connection}).
The fields of the mod record have the following meaning:
data
Type [{InteractionKey,InteractionValue}], used to propagate data between modules (see EWSAPI Module Interaction below). Depicted interaction_data() in function type declarations.
socket_type
socket_type(), indicates whether it is an IP socket or an SSL socket.
socket
In ip_comm or ssl format, depending on the socket_type.
config_db
config_db() in function type declarations.
method
Type "GET" | "POST" | "HEAD" | "TRACE", that is, the HTTP method.
absolute_uri
For example "http://ServerName:Port/cgi-bin/find.pl?person=jocke".
request_uri
The Request-URI as defined in RFC 1945, for example "/cgi-bin/find.pl?person=jocke".
http_version
The HTTP version of the request, that is "HTTP/0.9", "HTTP/1.0", or "HTTP/1.1".
request_line
The Request-Line as defined in RFC 1945, for example "GET /cgi-bin/find.pl?person=jocke HTTP/1.0".
parsed_header
Type [{HeaderKey,HeaderValue}]. parsed_header contains all HTTP header fields from the HTTP request, stored in a list as key-value tuples. See RFC 2616 for a listing of all header fields. For example, the date field would be stored as {"date","Wed, 15 Oct 1997 14:35:17 GMT"}. RFC 2616 defines HTTP header field names as case insensitive, so they may arrive in lowercase or uppercase. Httpd ensures that all header field names are in lowercase.
entity_body
The Entity-Body as defined in RFC 2616, for example data sent from a CGI script using the POST method.
connection
true | false. If set to true, the connection to the client is a persistent connection and will not be closed when the request is served.
A do/1 function typically uses a restricted set of the mod record's fields to do its work and then returns a term depending on the outcome. The outcome is one of {proceed,NewData} | {break,NewData} | done, with the following meaning:
{proceed,OldData}
OldData refers to the data field in the incoming mod record.
{proceed,[{response,{StatusCode,Response}}|OldData]}
Response should be sent back to the client, including a status code (StatusCode) as defined in RFC 2616.
{proceed,[{response,{response,Head,Body}}|OldData]}
The keys that may appear in Head are code, allow, cache_control, content_MD5, content_encoding, content_language, content_length, content_location, content_range, content_type, date, etag, expires, last_modified, location, pragma, retry_after, server, trailer and transfer_encoding. The key code is a special case, since its value is an integer and not a string; the value is used as the status code for the response.
Body is either the tuple {Fun,Arg}, a list, or the atom nobody. If Body is {Fun,Arg}, Fun is assumed to be a fun that returns either close, sent or {ok,Body}. If close is returned, the connection to the client is closed. If sent is returned, the connection to the client is maintained if the connection is persistent. If {ok,Body} is returned, the Body is sent back to the client as the response body.
{proceed,[{response,{already_sent,StatusCode,Size}}|OldData]}
The response has already been sent to the client using the socket provided by the mod record (see above), including a valid status code (StatusCode) as defined in RFC 1945 and the size (Size) of the response in bytes.
{proceed,[{status,{StatusCode,PhraseArgs,Reason}}|OldData]}
Carries a status code (StatusCode) as defined in RFC 1945, a term describing how the client will be informed (PhraseArgs), and a reason (Reason) for why it happened. Read more about PhraseArgs in httpd_util:message/3.
{break,NewData}
Like proceed above, with one important exception: no more modules in the EWSAPI Module Sequence are executed. Use with care!
done
If no response is sent on the socket provided by the mod record, the client will typically get a "Document contains no data...".
Each consecutive module in the EWSAPI Module Sequence can choose to ignore data returned from the previous module either by trashing it or by "enhancing" it.
Keep in mind that there exist numerous utility functions to help you as an EWSAPI module programmer, e.g. nifty lookup of data in ETS-tables/key-value lists and socket utilities. You are well advised to read httpd_util(3) and httpd_socket(3).
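The way the three outcomes of do/1 steer the module sequence can be modelled in a few lines. The sketch below is an illustrative model only, not the server's own traversal code: the module sequence_sketch is hypothetical, and each step is a fun taking the interaction data and standing in for a Module:do/1 call.

```erlang
-module(sequence_sketch).
-export([run/2]).

%% Steps is a list of funs, each standing in for one module's do/1.
%% Data is the interaction data handed from step to step.
run([], Data) ->
    {finished, Data};
run([Step | Rest], Data) ->
    case Step(Data) of
        {proceed, NewData} -> run(Rest, NewData);   % go on to next module
        {break, NewData} -> {finished, NewData};    % skip remaining modules
        done -> done                                % stop, nothing to send
    end.
```

In this model a step that returns {break,NewData} prevents later steps from seeing or altering the data, which is why the text above advises using break with care.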
An EWSAPI module can define new config directives, thus making it configurable for a server end-user. This is done by implementing load/2 (mandatory), store/2 and remove/1.
The config file is scanned twice (load/2 and store/2) and a cleanup is done (remove/1) during server shutdown. The reason for this is: "A directive A can be dependent upon another directive B which occurs either before or after directive A in the config file". If a directive does not depend upon other directives, store/2 can be left out. Even remove/1 can be left out if neither load/2 nor store/2 opens files, creates ETS tables, etc.
load/2 takes two arguments. The first is a row from the config file, that is, a config directive in string format such as "Port 80". The second is a list of key-value tuples (which can be empty!) defining a context. A context is needed because there are directives which define inner contexts, that is, directives within directives, such as <Directory>.
load/2 is expected to return one of:
eof
ok
{ok,ContextList} (see mod_auth:load/2)
{ok,ContextList,{DirectiveKey,DirectiveValue}}, for example with the directive {port,80}
{ok,ContextList,[{DirectiveKey,DirectiveValue}]}, for example with the directives [{port,80},{foo,on}]
{error,Reason}
An example of a load function from mod_log.erl:
load([$T,$r,$a,$n,$s,$f,$e,$r,$L,$o,$g,$ |TransferLog],[]) ->
    {ok,[],{transfer_log,httpd_conf:clean(TransferLog)}};
load([$E,$r,$r,$o,$r,$L,$o,$g,$ |ErrorLog],[]) ->
    {ok,[],{error_log,httpd_conf:clean(ErrorLog)}}.
store/2 takes two arguments. The first is a tuple describing a directive ({DirectiveKey,DirectiveValue}) and the second is a list of tuples describing all directives ([{DirectiveKey,DirectiveValue}]). This makes it possible for directive A to depend on the value of directive B.
store/2 is expected to return one of:
{ok,{DirectiveKey,NewDirectiveValue}}, a new value for the directive, replacing the value returned by load/2
{ok,[{DirectiveKey,NewDirectiveValue}]}, new values for a list of directives, replacing the values returned by load/2
{error,Reason}
An example of a store function from mod_log.erl:
store({error_log,ErrorLog},ConfigList) ->
    case create_log(ErrorLog,ConfigList) of
        {ok,ErrorLogStream} ->
            {ok,{error_log,ErrorLogStream}};
        {error,Reason} ->
            {error,Reason}
    end.
remove/1 takes the ETS-table representation of the config file as input. It is up to you to clean up anything you opened or created in load/2 or store/2. remove/1 is expected to return one of:
ok
{error,Reason}
A naive example from mod_log.erl:
remove(ConfigDB) ->
    lists:foreach(fun([Stream]) -> file:close(Stream) end,
                  ets:match(ConfigDB,{transfer_log,'$1'})),
    lists:foreach(fun([Stream]) -> file:close(Stream) end,
                  ets:match(ConfigDB,{error_log,'$1'})),
    ok.
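The three configuration callbacks can be seen together in one small module. The following is a sketch under stated assumptions: the module mod_greeting_sketch and the Greeting directive are hypothetical, the server_name key used in store/2 is assumed for illustration, and no whitespace cleanup (such as httpd_conf:clean in the mod_log.erl example) is performed.

```erlang
-module(mod_greeting_sketch).
-export([load/2, store/2, remove/1]).

%% First pass: turn the row "Greeting <text>" into a key-value tuple.
%% No inner context is involved, so the context list stays empty.
load("Greeting " ++ Text, []) ->
    {ok, [], {greeting, Text}}.

%% Second pass: the stored value may depend on another directive.
%% Here the (assumed) server_name directive is appended when present.
store({greeting, Text}, ConfigList) ->
    case lists:keyfind(server_name, 1, ConfigList) of
        {server_name, Name} ->
            {ok, {greeting, Text ++ " from " ++ Name}};
        false ->
            {ok, {greeting, Text}}
    end.

%% Nothing was opened or created by load/2 or store/2, so there is
%% nothing to clean up at shutdown.
remove(_ConfigDB) ->
    ok.
```

Because store/2 sees the complete directive list, the result does not depend on whether the other directive appears before or after Greeting in the config file.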
Modules in the EWSAPI Module Sequence use the mod record's data field to propagate responses and status messages, as seen above. This data type can be used in a more versatile fashion. A module can prepare data to be used by subsequent EWSAPI modules; for example, the mod_alias module appends the tuple {real_name,string()} to inform subsequent modules about the actual file system location for the current URL.
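A later module can pick such a tuple out of the data list with an ordinary key lookup. The sketch below uses lists:keyfind/3 from the standard library; the module interaction_sketch and the function real_name/1 are hypothetical, and the tuple shape {real_name,string()} is taken from the text above.

```erlang
-module(interaction_sketch).
-export([real_name/1]).

%% Look up the file system location left behind by an earlier module.
%% Data is the data field of the mod record.
real_name(Data) ->
    case lists:keyfind(real_name, 1, Data) of
        {real_name, Path} -> {ok, Path};
        false -> undefined
    end.
```

Handling the undefined case explicitly keeps the module usable even when the module that produces the tuple is absent from the sequence.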
Before altering the EWSAPI Modules Sequence you are well advised to observe what types of data each module uses and propagates. Read the "EWSAPI Interaction" section for each module.
An EWSAPI module can furthermore export functions to be used by other EWSAPI modules but also for other purposes, for example mod_alias:path/3 and mod_auth:add_user/5. These functions should be described in the module documentation.
When designing an EWSAPI module, try to make it self-contained, that is, avoid being dependent on other modules both concerning exchange of interaction data and the use of exported functions. If you are dependent on other modules, do state this clearly in the module documentation!
You are well advised to read httpd_util(3) and httpd_conf(3).
If a Web browser connects to an SSL enabled server using a URL not starting with https://, the server will hang due to an ugly bug in the SSLeay package!