disk_log

MODULE

disk_log

MODULE SUMMARY

A disk based term logging facility

DESCRIPTION

disk_log is a disk based term logger. It is possible to efficiently log items to files. Each item which is logged is basically appended to a file. Two types of logs are supported, halt logs and wrap logs. For reasons of efficiency, items are always written to the file as binaries.

Two formats of the log files are supported, internal and external. The former uses an internally defined format for the log items. This format makes it possible to perform automatic repair of corrupt log files, and also makes it possible to read the logged terms from the file in a very efficient manner. The latter format leaves the format to the user of the log. If this format is used, the automatic repair and efficient reading of logs cannot be used.

A log file can be opened and closed. Whenever we open a non-existent log file, a new log file will be created. Logs using the internal format must be properly closed. If we try to open a log file which has not been properly closed, the disk_log module will automatically try to repair the log file by searching the file for magic bytes and Erlang terms.

When using the internal format for logs, the functions log/2, log_terms/2, alog/2, and alog_terms/2 should be used. These functions take Erlang terms as arguments. When using the external format, the corresponding functions are blog/2, blog_terms/2, balog/2, and balog_terms/2. These functions take a list of bytes, or a binary, as argument. For example, to log the string "hello" in ASCII format, we can use disk_log:blog(Log, "hello") or disk_log:blog(Log, list_to_binary("hello")). These two alternatives are equally efficient. The blog functions can be used for logs with internal format as well, but in this case they must be called with a binary constructed with a call to term_to_binary/1. There is no check to ensure this; it is entirely the responsibility of the caller. If these functions are called with a binary that does not correspond to an Erlang term, chunk/2,3 and the automatic repair procedure will fail. When chunk/2,3 is called, the corresponding term (not the binary) is returned.
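As a minimal sketch of the two families of functions, the module below writes one item to an internal format log with log/2 and two items to an external format log with blog/2. The module name, log names, and file names are made up for this example.

```erlang
-module(disk_log_formats).
-export([run/0]).

%% Log names and file names are hypothetical, chosen for this example only.
run() ->
    %% Internal format: items are Erlang terms.
    {ok, Int} = disk_log:open([{name, int_demo},
                               {file, "int_demo.LOG"},
                               {format, internal}]),
    ok = disk_log:log(Int, {greeting, "hello"}),
    ok = disk_log:close(Int),

    %% External format: items are lists of bytes or binaries.
    {ok, Ext} = disk_log:open([{name, ext_demo},
                               {file, "ext_demo.LOG"},
                               {format, external}]),
    ok = disk_log:blog(Ext, "hello"),
    ok = disk_log:blog(Ext, list_to_binary("hello")),
    ok = disk_log:close(Ext).
```

Since the external format stores the bytes exactly as given, the second file should simply contain the text "hellohello", while the first file is only readable through chunk/2,3.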

Logs can be configured to log to files on several nodes at the same time. In this case, the log is globally accessible. It is not guaranteed that all log files contain the same log items. This functionality ensures that all log items will be logged as long as at least one of the involved nodes is alive at any given time.

EXPORTS

alog(Log, Term) -> ok | {error, Reason}
balog(Log, Bytes) -> ok | {error, Reason}

Log = term()

Term = term()

Bytes = binary() | [Byte]

Byte = [Byte] | 0 =< integer() =< 255

Reason = no_such_log

Asynchronously appends an item to the log. Accordingly, this function does not wait for the log process to actually write the object to the file. If the log is opened in read only mode the log attempt naturally fails, but the owner of the log will be notified of the failure only if the log is opened with notify set to true, see open/1.

The alog function is used for logs with format internal, and balog with format external.
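The following sketch illustrates the read only case described above: the log is created, closed, and reopened with {mode, read_only} and {notify, true}, after which an alog/2 call still returns ok and the owner receives a read_only message. The module name, log name, file name, and the one second timeout are assumptions made for this example.

```erlang
-module(disk_log_async_demo).
-export([run/0]).

%% The log name, file name, and timeout are hypothetical.
run() ->
    File = "async_demo.LOG",
    %% Create the log file and close it properly.
    {ok, L0} = disk_log:open([{name, async_demo}, {file, File}]),
    ok = disk_log:alog(L0, first_item),
    ok = disk_log:close(L0),

    %% Reopen read only with notification enabled. The calling process
    %% is the owner by default.
    {ok, Log} = disk_log:open([{name, async_demo}, {file, File},
                               {mode, read_only}, {notify, true}]),
    %% alog/2 does not wait for the write, so it returns ok even though
    %% the item cannot be written.
    ok = disk_log:alog(Log, second_item),
    Result = receive
                 {disk_log, _Node, Log, {read_only, _Items}} -> notified
             after 1000 ->
                 timeout
             end,
    ok = disk_log:close(Log),
    Result.
```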

alog_terms(Log, TermList) -> ok | {error, Reason}
balog_terms(Log, BytesList) -> ok | {error, Reason}

Log = term()

TermList = [term()]

BytesList = [Bytes]

Bytes = binary() | [Byte]

Byte = [Byte] | 0 =< integer() =< 255

Reason = no_such_log

Asynchronously appends a list of items to the log. If the log is opened in read only mode the log attempt naturally fails, but the owner of the log will be notified of the failure only if the log is opened with notify set to true, see open/1.

The alog_terms function is used for logs with format internal, and balog_terms with format external.

block(Log)
block(Log, QueueLogRecords) -> ok | {error, Reason}

Log = term()

QueueLogRecords = bool()

This function blocks a log. This means that all attempts to use the Log are suspended until the log is unblocked. The process that blocks a log may, however, use the function chunk/2,3. If QueueLogRecords is true, log attempts are suspended until the log is unblocked. If it is false, log records are discarded. Default is true.
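A minimal sketch of blocking a log while reading from it is shown below. It relies on the statement above that the blocking process may use chunk/2,3, and it unblocks the log in an after clause so that the log is not left blocked if the read fails. The module, log, and file names are made up for this example.

```erlang
-module(disk_log_block_demo).
-export([run/0]).

%% The log name and file name are hypothetical.
run() ->
    {ok, Log} = disk_log:open([{name, block_demo},
                               {file, "block_demo.LOG"}]),
    ok = disk_log:log(Log, {event, 1}),
    %% Block the log. With the default (QueueLogRecords = true), log
    %% attempts are suspended until the log is unblocked.
    ok = disk_log:block(Log),
    %% The blocking process may still read with chunk/2,3.
    Read = try
               disk_log:chunk(Log, start)
           after
               ok = disk_log:unblock(Log)
           end,
    ok = disk_log:close(Log),
    case Read of
        {error, Reason} -> {error, Reason};
        {_Cont, Terms} -> {ok, Terms};
        Other -> Other
    end.
```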

chunk(Log, Continuation)
chunk(Log, Continuation, N) -> {Continuation2, Terms} | {Continuation2, Terms, Badbytes} | eof | {error, Reason}

Log = term()

Continuation = start | cont()

N = int() > 0 | infinity

Continuation2 = cont()

Terms = [term()]

Badbytes = integer()

This function makes it possible to efficiently read the terms which have been appended to a log. It minimises disk I/O by reading large 8K chunks from the file.

The first time chunk is called an initial continuation, the atom start, must be provided.

When chunk/3 is called, N controls the maximum number of terms that are read from the log in each chunk. Default is infinity, which means that all the terms contained in the 8K chunk are read. If fewer than N terms are returned, this does not necessarily mean that the end of the file has been reached.

The chunk function returns a tuple {Continuation2, Terms}, where Terms is a list of terms found in the log. Continuation2 is yet another continuation which must be passed on into any subsequent calls to chunk. With a series of calls to chunk it is then possible to extract all terms from a log.

The chunk function returns a tuple {Continuation2, Terms, Badbytes} if the log is opened in read only mode and the read chunk is corrupt. Badbytes indicates the number of bytes in the chunk that were found not to be Erlang terms. Note also that the log is not repaired.

chunk returns eof when the end of the log is reached, or {error, Reason} if an error occurs.

When chunk/2,3 is used with wrap logs, the returned continuation may or may not be valid in the next call to chunk. This is because the log may wrap and delete the file into which the continuation points. To make sure this does not happen, the log can be blocked during the search.
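The series of calls described above can be sketched as a small recursive loop that starts with the atom start and passes each returned continuation to the next call until eof is reached. The helper name read_all/1, the module name, and the log and file names are assumptions made for this example.

```erlang
-module(disk_log_read).
-export([read_all/1, run/0]).

%% read_all/1 is a hypothetical helper, not part of disk_log.
read_all(Log) ->
    read_all(Log, start, []).

read_all(Log, Cont, Acc) ->
    case disk_log:chunk(Log, Cont) of
        eof ->
            %% Chunks were collected newest first.
            {ok, lists:append(lists:reverse(Acc))};
        {error, Reason} ->
            %% Must be matched before the generic 2-tuple below.
            {error, Reason};
        {Cont2, Terms} ->
            read_all(Log, Cont2, [Terms | Acc]);
        {Cont2, Terms, _Badbytes} ->
            %% Read only log with a corrupt chunk: keep what was readable.
            read_all(Log, Cont2, [Terms | Acc])
    end.

%% The log name and file name are hypothetical.
run() ->
    File = "read_demo.LOG",
    _ = file:delete(File),   % start from an empty log
    {ok, Log} = disk_log:open([{name, read_demo}, {file, File}]),
    ok = disk_log:log_terms(Log, [a, b, c]),
    Result = read_all(Log),
    ok = disk_log:close(Log),
    Result.
```

For a wrap log that is still being written to, the call to read_all/1 could be wrapped between block/1 and unblock/1 so that the continuation stays valid.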

chunk_step(Log, Continuation, Step) -> {ok, Continuation2} | {error, Reason}

Log = term()

Continuation = start | cont()

Step = int()

Continuation2 = cont()

Reason = end_of_log | term()

This function can be used in conjunction with chunk/2,3 to search through a wrap log. It takes as argument a continuation as returned by chunk/2,3 or chunk_step/3, and steps forward (or backward) Step files in the wrap log. The continuation returned points to the first log item in the new file.

If the wrap log is not full because all files have not been used yet, {error, end_of_log} is returned if trying to step outside the log.

close(Log) -> ok | {error, Reason}

This function closes a log file properly. This must be done before the system is stopped, or a log file with format internal is regarded as unclosed and the automatic repair procedure will be activated the next time the log is opened.

log(Log, Term) -> ok | {error, Reason}
blog(Log, Bytes) -> ok | {error, Reason}

Log = term()

Term = term()

Bytes = binary() | [Byte]

Byte = [Byte] | 0 =< integer() =< 255

Reason = {file_error, FileError} | term()

This function appends the term Term at the end of the log Log. It returns ok or {error, Reason} once the write has completed. Terms are written by means of the normal write() function of the local operating system. Hence, there is no guarantee that the term has actually reached the disk when the function returns; it might linger in the operating system kernel for a while. To make sure the log is actually written to disk, the sync/1 function must be called.

The log function is used for logs with format internal, and blog with format external.

If there is an error when writing to file, {error, {file_error, FileError}} is returned, with FileError as returned from file:write.
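When an item must not be lost, log/2 can be followed by sync/1, as in the sketch below. The helper name durable_log/2, the module name, and the log and file names are made up for this example. Since sync/1 is expensive, calling it after every item is only reasonable for low volume logs.

```erlang
-module(disk_log_sync_demo).
-export([durable_log/2, run/0]).

%% durable_log/2 is a hypothetical helper, not part of disk_log.
durable_log(Log, Term) ->
    case disk_log:log(Log, Term) of
        ok ->
            %% Ask for the data to be written to the disk.
            disk_log:sync(Log);
        {error, _Reason} = Error ->
            Error
    end.

%% The log name and file name are hypothetical.
run() ->
    {ok, Log} = disk_log:open([{name, sync_demo},
                               {file, "sync_demo.LOG"}]),
    Result = durable_log(Log, {payment, 42}),
    ok = disk_log:close(Log),
    Result.
```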

log_terms(Log, TermList) -> ok | {error, Reason}
blog_terms(Log, BytesList) -> ok | {error, Reason}

Log = term()

TermList = [term()]

BytesList = [Bytes]

Bytes = binary() | [Byte]

Byte = [Byte] | 0 =< integer() =< 255

This function appends a list of items to the log. The difference between this function and the log/2 function is that this function takes every term in TermList and produces a log item from it. This is not the same as logging a list of objects once! This function is more efficient than calling log/2 for each item in the list.

The log_terms function is used for logs with format internal, and blog_terms with format external.
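The difference between logging a list once and logging each element can be seen in the following sketch, which writes the same list with log/2 and with log_terms/2 to a fresh log and then reads the items back. The module, log, and file names are assumptions made for this example.

```erlang
-module(disk_log_terms_demo).
-export([run/0]).

%% The log name and file name are hypothetical.
run() ->
    File = "terms_demo.LOG",
    _ = file:delete(File),   % start from an empty log
    {ok, Log} = disk_log:open([{name, terms_demo}, {file, File}]),
    ok = disk_log:log(Log, [a, b, c]),         % one item: the list itself
    ok = disk_log:log_terms(Log, [a, b, c]),   % three items: a, b and c
    {_Cont, Items} = disk_log:chunk(Log, start),
    ok = disk_log:close(Log),
    Items.
```

Here run/0 returns [[a,b,c],a,b,c]: the first call produced a single log item and the second call produced three.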

open(ArgL) -> OpenRet

ArgL = [Opt]

Opt = {name, term()} | {file, string()} | {linkto, LinkTo} | {repair, Repair} | {type, Type} | {format, Format} | {size, Size} | {distributed, [Node]} | {notify, bool()} | {head, Head} | {head_func, {M,F,A}} | {mode, Mode}

LinkTo = pid() | none

Repair = true | false | truncate

Type = halt | wrap

Format = internal | external

Size = integer() > 0 | infinity | {MaxBytes, MaxFiles}

MaxBytes = integer() > 0

MaxFiles = integer() > 0

Rec = integer()

Bad = integer()

Node = atom()

Head = none | term()

Mode = read_write | read_only

OpenRet = {ok, Name} | {repaired, Name, {recovered, Rec}, {badbytes, Bad}} | {error, Reason} | DistRet

DistRet = {[{Node, OpenRet}], [{BadNode, {error, Reason}}]}

The ArgL parameter is a list of options which have the following meanings:

{name, Name} specifies the name of the log. This is the name which must be passed as a parameter in all subsequent logging operations. A name must always be supplied.
{file, Filename} specifies the name of the file which will be used to log terms. If this value is omitted and the name is either an atom or a string, the file name will default to lists:concat([Name, ".LOG"]) for halt logs. For wrap logs, this will be the base name of the files. Each file in a wrap log will be called <basename>.N, where N is an integer. Each wrap log will also have a file called <basename>.idx.
{linkto, LinkTo}. The log process can be set up to monitor a pid and then close the file properly if the pid should terminate. This pid is called the owner of the log. If the value none is supplied, the log file will remain open until explicitly closed. By default, the process which calls open owns the log.
{repair, Repair}. If false is given, no automatic repair will be attempted. Instead, the tuple {error, need_repair} is returned if an attempt is made to open a log file which was not properly closed. If truncate is given, the log file will be truncated, creating an empty log.
{type, Type} is the type of the log. Default is halt. When a halt log reaches its maximum size, all attempts to log more items are rejected.
{format, Format} specifies the format of the log items in the log. Default is internal. With internal format of the items in the log, the log file must be read with the function chunk/2,3. The format of the log file is internally defined, and it is not possible to view the file as ASCII text. With external format, however, all log items are written to the file exactly as they are, and it is the programmer's responsibility to format the items.
{size, Size} specifies the size of the log. Default for halt logs is infinity. When wrap logs are used, the Size parameter is a 2-tuple {MaxBytes, MaxFiles}. The wrap log writes at most MaxBytes bytes to each file and uses MaxFiles files before it wraps and truncates the first file.
{distributed, Nodes}. This option should be used if a log should be globally visible and replicated on several nodes. If the log does not exist on any node, it is created. If the log does exist on some nodes, Nodes are joined to the existing nodes. The recommended way of using this functionality is to open the log with {distributed, [node()]} on each node. The module pg2 is used to address the logs.
{notify, bool()}. If true, the owner of the log is notified when certain events occur in the log. Default is false. The owner is sent one of the following messages when an event occurs:
- {disk_log, Node, Log, {wrap, NoLostItems}} is sent when a wrap log has filled one of its files and a new file is opened.
- {disk_log, Node, Log, {truncated, NoLostItems}} is sent when a log has been truncated, or dumped to a file.
- {disk_log, Node, Log, {read_only, Items}} is sent when an asynchronous log attempt is made to a read only opened log file. Items is the items from the log attempt.
- {disk_log, Node, Log, full} is sent when a halt log is full.
{head, Head} specifies if a header should be written first on the log file. If the log is a wrap log, the Head is written first in each new file.
{head_func, {M,F,A}} specifies that each time a new file is opened, M:F(A) is called. This function is supposed to return {ok, Head}. The Head is written first in each file. The Head should be a term if the format is internal, and a list of bytes (or a binary) otherwise.
{mode, Mode} specifies if the log is opened in read only or read write mode. It defaults to read_write.

The open/1 function returns {ok, Name} if the log file was successfully opened. If the file was successfully repaired, the tuple {repaired, Name, {recovered, Rec}, {badbytes, Bad}} is returned, where Rec is the number of whole Erlang terms found in the file and Bad is the number of bytes in the file which were non-Erlang terms. If the distributed parameter was given to open, the function returns a list of successful replies and a list of erroneous replies. Each reply is tagged with the node name.

The open/1 function ensures that the log server is started. Accordingly, it is not necessary to explicitly start the server first.

The function returns {error, Reason} for all other errors.
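As an illustration of the options and return values above, the sketch below opens a wrap log and treats a successful repair the same way as a normal open. The helper name open_wrap/2, the module name, the log name, the base file name, and the size of four files of 64 KB each are assumptions made for this example.

```erlang
-module(disk_log_open_demo).
-export([open_wrap/2, run/0]).

%% open_wrap/2 is a hypothetical helper; the size values are arbitrary.
open_wrap(Name, BaseName) ->
    Opts = [{name, Name},
            {file, BaseName},          % base name of the wrap log files
            {type, wrap},
            {size, {64 * 1024, 4}},    % {MaxBytes, MaxFiles}
            {format, internal}],
    case disk_log:open(Opts) of
        {ok, Log} ->
            {ok, Log};
        {repaired, Log, {recovered, Rec}, {badbytes, Bad}} ->
            io:format("~p repaired: ~p terms recovered, ~p bad bytes~n",
                      [Log, Rec, Bad]),
            {ok, Log};
        {error, Reason} ->
            {error, Reason}
    end.

%% The log name and base file name are hypothetical.
run() ->
    {ok, Log} = open_wrap(wrap_demo, "wrap_demo"),
    ok = disk_log:log(Log, started),
    disk_log:close(Log).
```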

reopen(Log, File)
reopen(Log, File, Head)
breopen(Log, File, BHead) -> ok | {error, Reason}

Log = term()

File = string()

Head = term()

BHead = binary() | [Byte]

Byte = [Byte] | 0 =< integer() =< 255

This function first renames the log file to File and then creates a new, empty log file for Log. It is thus very efficient. If the Head or BHead argument is given, this item is written first in the newly opened log file.

The reopen/3 function is used for logs with format internal, and breopen/3 with format external.
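One possible use of reopen/2 is to rotate a halt log by hand: the current contents are moved to another file and logging continues in a fresh file under the same log name. The sketch below assumes made-up module, log, and file names and removes any old copy first so that the rename cannot collide with an existing file.

```erlang
-module(disk_log_reopen_demo).
-export([run/0]).

%% The log name and both file names are hypothetical.
run() ->
    File = "reopen_demo.LOG",
    Old = "reopen_demo.LOG.old",
    _ = file:delete(Old),
    {ok, Log} = disk_log:open([{name, reopen_demo}, {file, File}]),
    ok = disk_log:log(Log, before_rotation),
    %% Rename the current file to Old and continue in a new, empty file.
    ok = disk_log:reopen(Log, Old),
    OldExists = filelib:is_regular(Old),
    Fresh = disk_log:chunk(Log, start),   % nothing has been logged yet
    ok = disk_log:close(Log),
    {OldExists, Fresh}.
```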

sync(Log) -> ok | {error, Reason}

Log = term()

Ensures that the contents of the log are actually written to the disk. This is usually a rather expensive operation.

truncate(Log)
truncate(Log, Head)
btruncate(Log, BHead) -> ok | {error, Reason}

Log = term()

Head = term()

BHead = binary() | [Byte]

Byte = [Byte] | 0 =< integer() =< 255

This function truncates a halt log. It cannot be used for wrap logs. If the Head or BHead argument is given, this item is written first in the newly truncated log.

The truncate/2 function is used for logs with format internal, and btruncate/2 with format external.
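The effect of the Head argument can be sketched as follows: after truncate/2, the only item left in the log should be the head. The module name, log name, file name, and the head term {header, version_1} are assumptions made for this example.

```erlang
-module(disk_log_truncate_demo).
-export([run/0]).

%% The log name, file name, and head term are hypothetical.
run() ->
    {ok, Log} = disk_log:open([{name, trunc_demo},
                               {file, "trunc_demo.LOG"}]),
    ok = disk_log:log_terms(Log, [old1, old2]),
    %% Drop everything logged so far and write the head first.
    ok = disk_log:truncate(Log, {header, version_1}),
    {_Cont, Items} = disk_log:chunk(Log, start),
    ok = disk_log:close(Log),
    Items.
```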

unblock(Log) -> ok | {error, Reason}

Log = term()

Reason = not_blocked | no_such_log

This function unblocks a log.
