httpd

HTTP server API

An implementation of an HTTP/1.1-compliant web server, as defined in RFC 2616. Provides web server start options, administrative functions, and an Erlang callback API.

Type definitions that are used more than once in this module:

boolean() = true | false

string() = list of ASCII characters

path() = string() representing a file or a directory path

ip_address() = {N1,N2,N3,N4} % IPv4
             | {K1,K2,K3,K4,K5,K6,K7,K8} % IPv6

hostname() = string() representing a host, for example, "foo.bar.com"

property() = atom()

A web server can be configured to start when the Inets application starts, or dynamically at runtime by calling the Inets application API inets:start(httpd, ServiceConfig) or inets:start(httpd, ServiceConfig, How); see inets(3). The configuration options, also called properties, are as follows:

File Properties

When the web server is started at application start time, the properties are fetched from a configuration file that consists of a regular Erlang property list, that is, [{Option, Value}], where Option = property() and Value = term(), followed by a full stop. If the web server is started dynamically at runtime, a file can still be specified, but the complete property list can also be given directly.

{proplist_file, path()}

If this property is defined, Inets expects to find all other properties defined in this file. The file must include all properties listed under mandatory properties.

Note

Support for legacy configuration files with Apache syntax was dropped in OTP-23.

Mandatory Properties

{port, integer()}

The port that the HTTP server listens to. If zero is specified as port, an arbitrary available port is picked, and function httpd:info/2 can be used to determine which port was picked.

{server_name, string()}

The name of your server, normally a fully qualified domain name.

{server_root, path()}

Defines the home directory of the server, where log files and so on can be stored. Relative paths specified in other properties refer to this directory.

{document_root, path()}

Defines the top directory for the documents that are available on the HTTP server.
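As an illustration of the mandatory properties, the following sketch starts a server dynamically from the Erlang shell and asks which port was picked. The property names port, server_name, server_root, and document_root are those listed in the Types section of this page; the directory /tmp/httpd_demo and the server name "localhost" are placeholder values chosen for this example only.

```erlang
%% Sketch: start httpd with only the mandatory properties.
%% "/tmp/httpd_demo" and "localhost" are hypothetical placeholder values.
ok = filelib:ensure_dir("/tmp/httpd_demo/htdocs/"),
ok = case inets:start() of
         ok -> ok;
         {error, {already_started, inets}} -> ok
     end,
{ok, Pid} = inets:start(httpd, [{port, 0},   % 0 = pick any free port
                                {server_name, "localhost"},
                                {server_root, "/tmp/httpd_demo"},
                                {document_root, "/tmp/httpd_demo/htdocs"}]),
%% Ask the running server which port it actually bound to.
[{port, Port}] = httpd:info(Pid, [port]),
io:format("httpd is listening on port ~p~n", [Port]).
```

The server can be stopped again with inets:stop(httpd, Pid).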

Communication Properties

{bind_address, ip_address() | hostname() | any}

Default is any.

{profile, atom()}

Used together with bind_address and port to uniquely identify an HTTP server. This can be useful in a virtualized environment, where there can be more than one server that has the same bind_address and port. If this property is not explicitly set, it is assumed that the bind_address and port uniquely identify the HTTP server.

{socket_type, ip_comm | {ip_comm, Config::proplist()} | {ssl, Config::proplist()}}

For ip_comm configuration options, see gen_tcp:listen/2. Some options that are used internally by httpd cannot be set.

For SSL configuration options, see ssl:listen/2.

Default is ip_comm.

Note

OTP-25 deprecates the communication properties {socket_type, ip_comm | {ip_comm, Config::proplist()} | {essl, Config::proplist()}} replacing it by {socket_type, ip_comm | {ip_comm, Config::proplist()} | {ssl, Config::proplist()}}.

{ipfamily, inet | inet6}

Default is inet. The legacy option inet6fb4 no longer makes sense and is translated to inet.

{minimum_bytes_per_second, integer()}

If given, sets a minimum number of bytes per second for connections.

If this rate is not reached, the socket for that connection is closed.

This option helps reduce the risk of "slow DoS" attacks.
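A possible set of communication properties for a TLS-enabled server bound to the loopback interface is sketched below. The property names are those listed under CommunicationOption in the Types section; the certificate and key paths are hypothetical, and certfile and keyfile are ordinary ssl options passed through to ssl:listen/2.

```erlang
%% Config fragment (sketch): communication properties for a TLS listener.
%% File paths are placeholders; adjust to your own certificate and key.
[{bind_address, {127,0,0,1}},
 {socket_type, {ssl, [{certfile, "/etc/httpd_demo/server.pem"},
                      {keyfile,  "/etc/httpd_demo/server.key"}]}},
 {ipfamily, inet},
 %% Close connections that deliver fewer than 500 bytes per second.
 {minimum_bytes_per_second, 500}]
```

These entries are combined with the mandatory properties in the same property list. The ssl application must be started before a server with socket_type ssl is started.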

Erlang Web Server API Modules

{modules, [atom()]}

Defines which modules the HTTP server uses when handling requests. Default is [mod_alias, mod_auth, mod_esi, mod_actions, mod_cgi, mod_dir, mod_get, mod_head, mod_log, mod_disk_log]. Notice that some mod-modules are dependent on others, so the order cannot be entirely arbitrary. See the Inets Web Server Modules in the User's Guide for details.

Limit Properties

{customize, atom()}

A callback module to customize the behavior of the Inets HTTP server; see httpd_custom_api.

{disable_chunked_transfer_encoding_send, boolean()}

Allows you to disable chunked transfer-encoding when sending a response to an HTTP/1.1 client. Default is false.

{keep_alive, boolean()}

Instructs the server whether to use persistent connections when the client claims to be HTTP/1.1 compliant. Default is true.

{keep_alive_timeout, integer()}

The number of seconds the server waits for a subsequent request from the client before closing the connection. Default is 150.

{max_body_size, integer()}

Limits the size of the message body of an HTTP request. Default is no limit.

{max_clients, integer()}

Limits the number of simultaneous requests that can be supported. Default is 150.

{max_header_size, integer()}

Limits the size of the message header of an HTTP request. Default is 10240.

{max_content_length, integer()}

Maximum content-length in an incoming request, in bytes. Requests with content larger than this are answered with status 413. Default is 100000000 (100 MB).

{max_uri_size, integer()}

Limits the size of the HTTP request URI. Default is no limit.

{max_keep_alive_request, integer()}

The number of requests that a client can make on one connection. When the server has responded to the number of requests defined by max_keep_alive_request, the server closes the connection. The server closes it even if there are queued requests. Default is no limit.

{max_client_body_chunk, integer()}

Enforces chunking of the body data of an HTTP PUT or POST request to be delivered to the mod_esi callback. Note that this is not supported for mod_cgi. Default is no limit, that is, the whole body is delivered as one entity, which can be very memory consuming. See mod_esi(3).
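The limit properties can be tightened together to bound the resources a single client can consume. The fragment below is only a sketch; the property names are those listed under LimitOption in the Types section, and the numeric values are arbitrary examples, not recommendations.

```erlang
%% Config fragment (sketch): limit properties with example values.
[{keep_alive, true},
 {keep_alive_timeout, 30},          % seconds to wait for the next request
 {max_clients, 300},                % simultaneous requests
 {max_header_size, 8192},           % bytes
 {max_body_size, 1048576},          % bytes (1 MiB)
 {max_uri_size, 2048},              % bytes
 {max_keep_alive_request, 100},     % requests per connection
 {max_client_body_chunk, 65536}]    % bytes per chunk handed to mod_esi
```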

Administrative Properties

{mime_types, [{MimeType, Extension}] | path()}

MimeType = string() and Extension = string(). Files delivered to the client are MIME typed according to RFC 1590. File suffixes are mapped to MIME types before file delivery. The mapping between file suffixes and MIME types can be specified in the property list.

Default is [{"html","text/html"},{"htm","text/html"}].

{mime_type, string()}

When the server is asked to provide a document type that cannot be determined by the MIME Type Settings, the server uses this default type.

{server_admin, string()}

Defines the email address of the server administrator to be included in any error messages returned by the server.

{server_tokens, none | prod | major | minor | minimal | os | full | {private, string()}}

Defines the look of the value of the server header.

Example: Assuming the version of Inets is 5.8.1, the server header string can look as follows for the different values of server_tokens:

none: "" % A Server: header will not be generated
prod: "inets"
major: "inets/5"
minor: "inets/5.8"
minimal: "inets/5.8.1"
os: "inets/5.8.1 (unix)"
full: "inets/5.8.1 (unix/linux) OTP/R15B"
{private, "foo/bar"}: "foo/bar"

By default, the value is as before, that is, minimal.

{logger, Options::list()}

Currently only one option is supported:

{error, ServerID::atom()}

Produces logger events on logger level error under the hierarchical logger domain [otp, inets, httpd, ServerID, error]. The built-in logger formatting function produces log entries from the error reports:

#{server_name => string(),
  protocol => internal | 'TCP' | 'TLS' | 'HTTP',
  transport => "TCP" | "TLS", %% Present when protocol = 'HTTP'
  uri => string(), %% Present when protocol = 'HTTP' and URI is valid
  peer => inet:peername(),
  host => inet:hostname(),
  reason => term()
}

An example of a log entry with only the default logger settings:

=ERROR REPORT==== 9-Oct-2019::09:33:27.350235 ===
   Server: My Server
 Protocol: HTTP
Transport: TLS
      URI: /not_there
     Host: 127.0.1.1:80
     Peer: 127.0.0.1:45253
   Reason: [{statuscode,404},{description,"Object Not Found"}]

Using this option makes mod_log and mod_disk_log error logs redundant.

Add the filter

{fun logger_filters:domain/2,
 {log, equal, [otp, inets, httpd, ServerID, error]}}

to the appropriate logger handler to handle the events. For example, to write the error log from an httpd server with a ServerID of my_server to a file, you can use the following sys.config:

[{kernel,
 [{logger,
  [{handler, http_error_test, logger_std_h,
    #{config => #{ file => "log/http_error.log" },
      filters => [{inets_httpd, {fun logger_filters:domain/2,
                                 {log, equal,
                                  [otp, inets, httpd, my_server, error]
                                 }}}],
      filter_default => stop }}]}]}].

or if you want to add it to the default logger via an API:

logger:add_handler_filter(default,
                          inets_httpd,
                          {fun logger_filters:domain/2,
                           {log, equal,
                            [otp, inets, httpd, my_server, error]}}).

{log_format, common | combined}

Defines if access logs are to be written according to the common log format or the extended common log format. The common format is one line looking like this: remotehost rfc931 authuser [date] "request" status bytes

Here:

remotehost: The remote host.
rfc931: The remote username of the client (RFC 931).
authuser: The username used for authentication.
[date]: Date and time of the request (RFC 1123).
"request": The request line as it came from the client (RFC 1945).
status: The HTTP status code returned to the client (RFC 1945).
bytes: The content-length of the document transferred.

The combined format is one line looking like this: remotehost rfc931 authuser [date] "request" status bytes "referer" "user_agent"

In addition to the earlier:

"referer": The URL the client was on before requesting the URL (if it could not be determined, a minus sign is placed in this field).
"user_agent": The software the client claims to be using (if it could not be determined, a minus sign is placed in this field).

This affects the access logs written by mod_log and mod_disk_log.

{error_log_format, pretty | compact}

Default is pretty. If the error log is meant to be read directly by a human, pretty is the best option.

pretty has a format corresponding to:

io:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason]).

compact has a format corresponding to:

io:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason]).

This affects the error logs written by mod_log and mod_disk_log.

URL Aliasing Properties - Requires mod_alias

Alias = string() and RealName = string(). alias allows documents to be stored in the local file system instead of the document_root location. URLs with a path beginning with Alias are mapped to local files beginning with RealName, for example:

{alias, {"/image", "/ftp/pub/image"}}

Access to http://your.server.org/image/foo.gif would refer to the file /ftp/pub/image/foo.gif.

Re = string() and Replacement = string(). re_write allows documents to be stored in the local file system instead of the document_root location. URLs are rewritten by re:replace/3 to produce a path in the local file-system, for example:

{re_write, {"^/[~]([^/]+)(.*)$", "/home/\\1/public\\2"}}

Access to http://your.server.org/~bob/foo.gif would refer to the file /home/bob/public/foo.gif.

directory_index specifies a list of resources to look for if a client requests a directory using a / at the end of the directory name. file depicts the name of a file in the directory. Several files can be given, in which case the server returns the first it finds, for example:

{directory_index, ["index.html", "welcome.html"]}

Access to http://your.server.org/docs/ would return http://your.server.org/docs/index.html or http://your.server.org/docs/welcome.html if index.html does not exist.

CGI Properties - Requires mod_cgi

Alias = string() and RealName = string(). script_alias has the same behavior as property alias, except that it also marks the target directory as containing CGI scripts. URLs with a path beginning with Alias are mapped to scripts beginning with RealName, for example:

{script_alias, {"/cgi-bin/", "/web/cgi-bin/"}}

Access to http://your.server.org/cgi-bin/foo would cause the server to run the script /web/cgi-bin/foo.

Re = string() and Replacement = string(). script_re_write has the same behavior as property re_write, except that it also marks the target directory as containing CGI scripts. URLs matching Re are rewritten to script paths using Replacement, for example:

{script_re_write, {"^/cgi-bin/(\\d+)/", "/web/\\1/cgi-bin/"}}

Access to http://your.server.org/cgi-bin/17/foo would cause the server to run the script /web/17/cgi-bin/foo.

If script_nocache is set to true, the HTTP server by default adds the header fields necessary to prevent proxies from caching the page. Generally this is preferred. Default is false.

{script_timeout, integer()}

The time in seconds the web server waits between each chunk of data from the script. If the CGI script does not deliver any data before the timeout, the connection to the client is closed. Default is 15.

MimeType = string() and CgiScript = string(). action adds an action activating a CGI script whenever a file of a certain MIME type is requested. It propagates the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables.

Example:

{action, {"text/plain", "/cgi-bin/log_and_deliver_text"}}

Method = string() and CgiScript = string(). script adds an action activating a CGI script whenever a file is requested using a certain HTTP method. The method is either GET or POST, as defined in RFC 1945. It propagates the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables.

Example:

{script, {"PUT", "/cgi-bin/put"}}

ESI Properties - Requires mod_esi

URLPath = string() and AllowedModule = atom(). erl_script_alias marks all URLs matching url-path as erl scheme scripts. A matching URL is mapped into a specific module and function, for example:

{erl_script_alias, {"/cgi-bin/example", [httpd_example]}}

A request to http://your.server.org/cgi-bin/example/httpd_example:yahoo would refer to httpd_example:yahoo/3 or, if that does not exist, httpd_example:yahoo/2. A request to http://your.server.org/cgi-bin/example/other:yahoo would not be allowed to execute.

If erl_script_nocache is set to true, the server adds HTTP header fields preventing proxies from caching the page. This is generally a good idea for dynamic content, as the content often varies between each request. Default is false.

erl_script_timeout sets the time in seconds the server waits between each chunk of data to be delivered through mod_esi:deliver/2. Default is 15. This is only relevant for scripts that use the erl scheme.
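To make the ESI properties concrete, here is a minimal sketch of a callback module that could be exposed through erl_script_alias. The module name hello_esi, the function greet/3, and the URL prefix "/esi" are hypothetical names chosen for this example; mod_esi:deliver/2 is the delivery function mentioned above.

```erlang
%% Sketch of an ESI callback module (hypothetical names).
%% Assumed configuration entry:
%%   {erl_script_alias, {"/esi", [hello_esi]}}
%% A request to /esi/hello_esi:greet would then call greet/3.
-module(hello_esi).
-export([greet/3]).

%% SessionID identifies the response stream, Env holds request
%% environment data, and Input holds the query string or request body.
greet(SessionID, _Env, _Input) ->
    %% The first chunk carries the HTTP header fields, terminated by an
    %% empty line.
    mod_esi:deliver(SessionID, "Content-Type: text/plain\r\n\r\n"),
    %% Later chunks carry body data; each must arrive within
    %% erl_script_timeout seconds of the previous one.
    mod_esi:deliver(SessionID, "Hello from ESI\n").
```

Because only hello_esi is listed as an allowed module in the assumed alias, requests naming any other module under "/esi" are refused.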

Log Properties - Requires mod_log

{error_log, path()}

Defines the filename of the error log file to be used to log server errors. If the filename does not begin with a slash (/), it is assumed to be relative to the server_root.

{security_log, path()}

Defines the filename of the access log file to be used to log security events. If the filename does not begin with a slash (/), it is assumed to be relative to the server_root.

{transfer_log, path()}

Defines the filename of the access log file to be used to log incoming requests. If the filename does not begin with a slash (/), it is assumed to be relative to the server_root.

Disk Log Properties - Requires mod_disk_log

{disk_log_format, internal | external}

Defines the file format of the log files. See disk_log for details. If the internal file format is used, the log file is repaired after a crash. When a log file is repaired, data can disappear. When the external file format is used, httpd does not start if the log file is broken. Default is external.

{error_disk_log, path()}

Defines the filename of the (disk_log(3)) error log file to be used to log server errors. If the filename does not begin with a slash (/), it is assumed to be relative to the server_root.

{error_disk_log_size, {MaxBytes, MaxFiles}}

MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the (disk_log(3)) error log file. This file is a wrap log: at most MaxBytes bytes are written to each file, and at most MaxFiles files are used before the first file is truncated and reused.

{security_disk_log, path()}

Defines the filename of the (disk_log(3)) access log file logging incoming security events, that is, authenticated requests. If the filename does not begin with a slash (/), it is assumed to be relative to the server_root.

{security_disk_log_size, {MaxBytes, MaxFiles}}

MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3) access log file. This file is a wrap log: at most MaxBytes bytes are written to each file, and at most MaxFiles files are used before the first file is truncated and reused.

{transfer_disk_log, path()}

Defines the filename of the (disk_log(3)) access log file logging incoming requests. If the filename does not begin with a slash (/), it is assumed to be relative to the server_root.

{transfer_disk_log_size, {MaxBytes, MaxFiles}}

MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3) access log file. This file is a wrap log: at most MaxBytes bytes are written to each file, and at most MaxFiles files are used before the first file is truncated and reused.
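A disk log setup could look like the fragment below. This is a sketch: the property names are the mod_disk_log option names (disk_log_format, error_disk_log, and so on), the relative paths are hypothetical and would be resolved against server_root, and the sizes are arbitrary example values.

```erlang
%% Config fragment (sketch): wrap logs of 5 files x 1 MiB each.
%% Requires mod_disk_log in the modules list.
[{disk_log_format, external},
 {error_disk_log, "logs/error_disk_log"},
 {error_disk_log_size, {1048576, 5}},      % {MaxBytes, MaxFiles}
 {security_disk_log, "logs/security_disk_log"},
 {security_disk_log_size, {1048576, 5}},
 {transfer_disk_log, "logs/access_disk_log"},
 {transfer_disk_log_size, {1048576, 5}}]
```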

Authentication Properties - Requires mod_auth

{directory, {path(), [{property(), term()}]}}

The properties for directories are as follows:

Defines a set of hosts to be granted access to a given directory, for example:

{allow_from, ["123.34.56.11", "150.100.23"]}

The host 123.34.56.11 and all machines on the 150.100.23 subnet are allowed access.

Defines a set of hosts to be denied access to a given directory, for example:

{deny_from, ["123.34.56.11", "150.100.23"]}

The host 123.34.56.11 and all machines on the 150.100.23 subnet are not allowed access.

{auth_type, plain | dets | mnesia}

Sets the type of authentication database that is used for the directory. The key difference between the different methods is that dynamic data can be saved when Mnesia and Dets are used.

{auth_user_file, path()}

Sets the name of a file containing the list of users and passwords for user authentication. The filename can be either absolute or relative to the server_root. If using the plain storage method, this file is a plain text file where each line contains a username followed by a colon, followed by the non-encrypted password. If usernames are duplicated, the behavior is undefined.

Example:

 ragnar:s7Xxv7
 edward:wwjau8 

If the Dets storage method is used, the user database is maintained by Dets and must not be edited by hand. Use the API functions in module mod_auth to create/edit the user database. This directive is ignored if the Mnesia storage method is used. For security reasons, ensure that auth_user_file is stored outside the document tree of the web server. If it is placed in the directory that it protects, clients can download it.

{auth_group_file, path()}

Sets the name of a file containing the list of user groups for user authentication. The filename can be either absolute or relative to the server_root. If the plain storage method is used, the group file is a plain text file, where each line contains a group name followed by a colon, followed by the members' usernames separated by spaces.

Example:

group1: bob joe ante

If the Dets storage method is used, the group database is maintained by Dets and must not be edited by hand. Use the API for module mod_auth to create/edit the group database. This directive is ignored if the Mnesia storage method is used. For security reasons, ensure that the auth_group_file is stored outside the document tree of the web server. If it is placed in the directory that it protects, clients can download it.

{auth_name, string()}

Sets the name of the authorization realm (auth-domain) for a directory. This string informs the client about which username and password to use.

{auth_access_password, string()}

If set to other than "NoPassword", the password is required for all API calls. If the password is set to "DummyPassword", the password must be changed before any other API calls. To secure the authenticating data, the password must be changed after the web server is started. Otherwise it is written in clear text in the configuration file.

{require_user, [string()]}

Defines users to grant access to a given directory using a secret password.

{require_group, [string()]}

Defines user groups whose members are granted access to a given directory using a secret password.
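Putting the directory properties together, a protected directory could be declared roughly as follows. This is a sketch: the property names are the mod_auth option names, all paths and the realm name are hypothetical, and the group name and subnet reuse the examples given above. Note that the user and group files are placed outside the document tree, as recommended.

```erlang
%% Config fragment (sketch): password-protect one directory with the
%% plain storage method. Requires mod_auth in the modules list.
{directory, {"/var/www/htdocs/private",
             [{auth_type, plain},
              {auth_user_file, "/var/www/auth/passwd"},
              {auth_group_file, "/var/www/auth/group"},
              {auth_name, "Private Area"},
              %% Only members of group1 may log in ...
              {require_group, ["group1"]},
              %% ... and only from the 150.100.23 subnet.
              {allow_from, ["150.100.23"]}]}}
```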

Security Properties - Requires mod_security

{security_directory, {path(), [{property(), term()}]}}

The properties for the security directories are as follows:

{data_file, path()}

Name of the security data file. The filename can either be absolute or relative to the server_root. This file is used to store persistent data for module mod_security.

{max_retries, integer()}

Specifies the maximum number of attempts to authenticate a user before the user is blocked out. If a user successfully authenticates while blocked, the user receives a 403 (Forbidden) response from the server. If the user makes a failed attempt while blocked, the server returns 401 (Unauthorized), for security reasons. Default is 3. Can be set to infinity.

{block_time, integer()}

Specifies the number of minutes a user is blocked. After this time has passed, the user automatically regains access. Default is 60.

{fail_expire_time, integer()}

Specifies the number of minutes a failed user authentication is remembered. If a user authenticates after this time has passed, the previous failed authentications are forgotten. Default is 30.

{auth_timeout, integer()}

Specifies the number of seconds a successful user authentication is remembered. After this time has passed, the authentication is no longer reported. Default is 30.
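A security directory entry that spells out the documented defaults could look like this. This is a sketch: the property names are the mod_security option names, and the directory path and data file name are hypothetical.

```erlang
%% Config fragment (sketch): brute-force protection for one directory.
%% Requires mod_security in the modules list.
{security_directory, {"/var/www/htdocs/private",
                      [{data_file, "security/private.dat"}, % relative to server_root
                       {max_retries, 3},        % failed attempts before blocking
                       {block_time, 60},        % minutes
                       {fail_expire_time, 30},  % minutes
                       {auth_timeout, 30}]}}    % seconds
```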

info(Pid) -> HttpInformation
info(Pid, Properties) -> HttpInformation

Types

Pid = pid()
Properties = [atom()]
HttpInformation =
    [CommonOption] |
    [CommunicationOption] |
    [ModOption] |
    [LimitOption] |
    [AdminOption]
CommonOption =
    {port, integer() >= 0} |
    {server_name, string()} |
    {server_root, Path} |
    {document_root, Path}
CommunicationOption =
    {bind_address, inet:ip_address() | inet:hostname() | any} |
    {profile, atom()} |
    {socket_type,
     ip_comm |
     {ip_comm, ssl:tls_option() | gen_tcp:option()} |
     {ssl, ssl:tls_option() | gen_tcp:option()}} |
    {ipfamily, inet | inet6} |
    {minimum_bytes_per_second, integer()}
ModOption = {modules, atom()}
LimitOption =
    {customize, atom()} |
    {disable_chunked_transfer_encoding_send, boolean()} |
    {keep_alive, boolean()} |
    {keep_alive_timeout, integer()} |
    {max_body_size, integer()} |
    {max_clients, integer()} |
    {max_header_size, integer()} |
    {max_content_length, integer()} |
    {max_uri_size, integer()} |
    {max_keep_alive_request, integer()} |
    {max_client_body_chunk, integer()}
AdminOption =
    {mime_types,
     [{MimeType :: string(), Extension :: string()}] | Path} |
    {mime_type, string()} |
    {server_admin, string()} |
    {server_tokens,
     none | prod | major | minor | minimal | os | full |
     {private, string()}} |
    {logger, Options :: list()} |
    {log_format, common | combined} |
    {error_log_format, pretty | compact}

Fetches information about the HTTP server. When called with only the pid, all properties are fetched. When called with a list of specific properties, only those properties are fetched. The available properties are the same as the start options of the server.

Note

Pid is the pid returned from inets:start/[2,3]. It can also be retrieved from inets:services/0 and inets:services_info/0; see inets(3).

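info(Address, Port) -> HttpInformation
info(Address, Port, Profile) -> HttpInformation
info(Address, Port, Profile, Properties) -> HttpInformation

Types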

Address = inet:ip_address() | any
Port = integer()
Profile = atom()
Properties = [atom()]
HttpInformation =
    [CommonOption] |
    [CommunicationOption] |
    [ModOption] |
    [LimitOption] |
    [AdminOption]
CommonOption =
    {port, integer() >= 0} |
    {server_name, string()} |
    {server_root, Path} |
    {document_root, Path}
CommunicationOption =
    {bind_address, inet:ip_address() | inet:hostname() | any} |
    {profile, atom()} |
    {socket_type,
     ip_comm |
     {ip_comm, ssl:tls_option() | gen_tcp:option()} |
     {ssl, ssl:tls_option() | gen_tcp:option()}} |
    {ipfamily, inet | inet6} |
    {minimum_bytes_per_second, integer()}
ModOption = {modules, atom()}
LimitOption =
    {customize, atom()} |
    {disable_chunked_transfer_encoding_send, boolean()} |
    {keep_alive, boolean()} |
    {keep_alive_timeout, integer()} |
    {max_body_size, integer()} |
    {max_clients, integer()} |
    {max_header_size, integer()} |
    {max_content_length, integer()} |
    {max_uri_size, integer()} |
    {max_keep_alive_request, integer()} |
    {max_client_body_chunk, integer()}
AdminOption =
    {mime_types,
     [{MimeType :: string(), Extension :: string()}] | Path} |
    {mime_type, string()} |
    {server_admin, string()} |
    {server_tokens,
     none | prod | major | minor | minimal | os | full |
     {private, string()}} |
    {logger, Options :: list()} |
    {log_format, common | combined} |
    {error_log_format, pretty | compact}

Fetches information about the HTTP server. When called with only Address and Port, all properties are fetched. When called with a list of specific properties, only those properties are fetched. The available properties are the same as the start options of the server.

Note

The Address must be the IP address and cannot be the hostname.
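For instance, a running server can be queried by address and port from the Erlang shell. The sketch below assumes a server is already listening on 127.0.0.1 port 8080 (both values are placeholders) and uses the variant of info that takes Address, Port, and a list of properties.

```erlang
%% Sketch: query selected properties of a server identified by
%% IP address and port. 127.0.0.1:8080 is an assumed, already
%% running httpd instance.
Info = httpd:info({127,0,0,1}, 8080, [server_name, document_root]),
%% The result is a property list of {Option, Value} tuples.
ServerName = proplists:get_value(server_name, Info),
DocRoot = proplists:get_value(document_root, Info),
io:format("~s serves files from ~s~n", [ServerName, DocRoot]).
```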

reload_config(Config, Mode) -> ok | {error, Reason}

Types

Config = file:name_all() | [{Option, Value}]
Mode = non_disturbing | disturbing | blocked
Option = atom()
Value = Reason = term()

Reloads the HTTP server configuration without restarting the server. Incoming requests are answered with a temporary down message during the reload time.

Note

Available properties are the same as the start options of the server, but the properties bind_address and port cannot be changed.

If mode is disturbing, the server is blocked forcefully, all ongoing requests are terminated, and the reload starts immediately. If mode is non_disturbing, no new connections are accepted, but ongoing requests are allowed to complete before the reload is done.
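A reload from a property list could be sketched as follows. The example assumes a server running on port 8080 with the placeholder paths shown; the new list repeats the unchanged port (which, like bind_address, cannot be changed by a reload) and raises max_clients. Treat the exact set of properties that must be repeated as an assumption to verify against your OTP version.

```erlang
%% Sketch: reload the configuration without dropping ongoing requests.
%% All values are hypothetical; port must match the running server.
NewConfig = [{port, 8080},
             {server_name, "localhost"},
             {server_root, "/tmp/httpd_demo"},
             {document_root, "/tmp/httpd_demo/htdocs"},
             {max_clients, 300}],
case httpd:reload_config(NewConfig, non_disturbing) of
    ok ->
        io:format("configuration reloaded~n");
    {error, Reason} ->
        io:format("reload failed: ~p~n", [Reason])
end.
```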

The Erlang web server API data types are as follows:

ModData = #mod{}

-record(mod, {
    data = [],
    socket_type = ip_comm,
    socket,
    config_db,
    method,
    absolute_uri,
    request_uri,
    http_version,
    request_line,
    parsed_header = [],
    entity_body,
    connection
}).

To access the record in your callback-module use:

 -include_lib("inets/include/httpd.hrl").

The fields of record mod have the following meaning:

data

Type [{InteractionKey,InteractionValue}] is used to propagate data between modules. Depicted interaction_data() in function type declarations.

socket_type

socket_type() indicates whether it is an IP socket or an ssl socket.

socket

The socket, in format ip_comm or ssl, depending on socket_type.

config_db

The config file directives stored as key-value tuples in an ETS table. Depicted config_db() in function type declarations.

method

Type "GET" | "POST" | "HEAD" | "TRACE", that is, the HTTP method.

absolute_uri

If the request is an HTTP/1.1 request, the URI can be in the absolute URI format. In that case, httpd saves the absolute URI in this field. An example of an absolute URI is "http://ServerName:Port/cgi-bin/find.pl?person=jocke".

request_uri

The Request-URI as defined in RFC 1945, for example, "/cgi-bin/find.pl?person=jocke".

http_version

The HTTP version of the request, that is, "HTTP/1.0" or "HTTP/1.1".

request_line

The Request-Line as defined in RFC 1945, for example, "GET /cgi-bin/find.pl?person=jocke HTTP/1.0".

parsed_header

Type [{HeaderKey,HeaderValue}]. parsed_header contains all HTTP header fields from the HTTP request stored in a list as key-value tuples. See RFC 2616 for a listing of all header fields. For example, the date field is stored as {"date","Wed, 15 Oct 1997 14:35:17 GMT"}. RFC 2616 defines that HTTP is a case-insensitive protocol and the header fields can be in lower case or upper case. httpd ensures that all header field names are in lower case.

entity_body

The entity-Body as defined in RFC 2616, for example, data sent from a CGI script using the POST method.

connection

true | false. If set to true, the connection to the client is a persistent connection and is not closed when the request is served.

Module:do(ModData) -> {proceed, OldData} | {proceed, NewData} | {break, NewData} | done

Types

OldData = list()
NewData = [{response, {StatusCode, Body}}]
        | [{response, {response, Head, Body}}]
        | [{response, {already_sent, StatusCode, Size}}]
StatusCode = integer()
Body = io_list() | nobody | {Fun, Arg}
Head = [HeaderOption]
HeaderOption = {Option, Value} | {code, StatusCode}
Option = accept_ranges | allow
       | cache_control | content_MD5
       | content_encoding | content_language
       | content_length | content_location
       | content_range | content_type | date
       | etag | expires | last_modified
       | location | pragma | retry_after
       | server | trailer | transfer_encoding
Value = string()
Fun = fun(Arg) -> sent | close | Body
Arg = [term()]

When a valid request reaches httpd, it calls do/1 in each module listed in the modules configuration option. The function can generate data for other modules or a response that can be sent back to the client.

The field data in ModData is a list. This list is the list returned from the last call to do/1.

Body is the body of the HTTP response that is sent back to the client. An appropriate header is appended to the message. StatusCode is the status code of the response, see RFC 2616 for the appropriate values.

Head is a key value list of HTTP header fields. The server constructs an HTTP header from this data. See RFC 2616 for the appropriate value for each header field. If the client is an HTTP/1.0 client, the server filters the list so that only HTTP/1.0 header fields are sent back to the client.

If Body is returned and equal to {Fun,Arg}, the web server tries apply/2 on Fun with Arg as argument. The web server expects that the fun either returns a list (Body) that is an HTTP response, or the atom sent if the HTTP response is sent back to the client. If close is returned from the fun, something has gone wrong and the server signals this to the client by closing the connection.
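The callback contract described above can be illustrated with a small module. This is a sketch, not part of Inets: the module name mod_hello and the path "/hello" are hypothetical, and the module would have to be added to the modules property to take effect. It answers one URI itself and passes every other request on unchanged.

```erlang
%% Sketch of a web server API module (hypothetical name mod_hello).
-module(mod_hello).
-export([do/1]).

-include_lib("inets/include/httpd.hrl").

%% Answer requests for "/hello" with a fixed plain-text body.
do(#mod{request_uri = "/hello", data = Data}) ->
    Body = "Hello from mod_hello\n",
    Head = [{code, 200},
            {content_type, "text/plain"},
            %% Header values are strings, so convert the length.
            {content_length, integer_to_list(length(Body))}],
    %% Prepend the response to the data passed on to later modules.
    {proceed, [{response, {response, Head, Body}} | Data]};
%% Any other request: hand the accumulated data on unchanged.
do(#mod{data = Data}) ->
    {proceed, Data}.
```

Returning {proceed, ...} rather than {break, ...} lets modules later in the list, such as logging modules, still see the request.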

Module:remove(ConfigDB) -> ok | {error, Reason}

Types

ConfigDB = ets_table()
Reason = term()

When httpd is shut down, it tries to execute remove/1 in each Erlang web server callback module. The programmer can use this function to clean up resources created in the store function.

Module:store({Option, Value}, Config) -> {ok, {Option, NewValue}} | {error, Reason}

Types

Line = string()
Option = property()
Config = [{Option, Value}]
Value = term()
Reason = term()

Checks the validity of the configuration options before saving them in the internal database. This function can also have a side effect, that is, setup of necessary extra resources implied by the configuration option. It can also resolve possible dependencies among configuration options by changing the value of the option. This function only needs clauses for the options implemented by this particular callback module.
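As a sketch of such a validation callback, the module below checks a single option of its own. The module name mod_hello_config and the option hello_greeting are hypothetical; the return values follow the usual {ok, {Option, NewValue}} | {error, Reason} shape for this callback.

```erlang
%% Sketch of a store/2 callback (hypothetical module and option names).
-module(mod_hello_config).
-export([store/2]).

%% Accept a string greeting and normalize it by trimming whitespace
%% before it is saved in the internal database.
store({hello_greeting, Greeting}, _Config) when is_list(Greeting) ->
    {ok, {hello_greeting, string:trim(Greeting)}};
%% Reject any other value for this option.
store({hello_greeting, Other}, _Config) ->
    {error, {bad_hello_greeting, Other}}.
```

Only the option owned by this module has clauses, in line with the note above that other options need no handling here.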