[eeps] EEP 48: Documentation storage and format

Raimo Niskanen raimo+eeps@REDACTED
Wed Jan 10 16:47:23 CET 2018


I have added your proposal to the repository as EEP 48:

    http://erlang.org/eep/eeps/eep-0048.html
    https://github.com/erlang/eep/blob/master/eeps/eep-0048.md

I changed the quoting of atoms and pathnames to use `code` quoting,
changed some strange double and single quote characters into the ASCII
ones, and set José as the main author, partly because his sending in the
EEP gave me an e-mail address to use (obfuscated) and partly because he
declared the others as co-authors.

Thank you for your contribution!

/ Raimo Niskanen, EEP editor.



On Thu, Jan 04, 2018 at 10:59:58PM +0100, José Valim wrote:
> This EEP proposes an official API documentation storage to be used by
> BEAM languages.  By standardizing how API documentation is stored,
> it will be possible to write tools that work across languages.
> 
> I want to thank Eric and Radek, who co-authored the proposal, as well as
> Kenneth, Fred, Tristan and Loïc for their feedback.
> 
> See the attached document.
> 
> *José Valim
> www.plataformatec.com.br <http://www.plataformatec.com.br/>
> Founder and Director of R&D*

>     Author: Eric Bailey, Radek Szymczyszyn, José Valim
>     Status: Draft
>     Type: Standards Track
>     Created: 04-Jan-2018
>     Post-History:
> ****
> EEP XX: Documentation storage and format
> ----
> 
> 
> 
> Abstract
> ========
> 
> This EEP proposes an official API documentation storage to be used by
> BEAM languages.  By standardizing how API documentation is stored,
> it will be possible to write tools that work across languages.
> 
> 
> 
> Rationale
> =========
> 
> Currently, different programming languages and libraries running on
> BEAM devise their own schemas for storing and accessing documentation.  
> For example, Elixir and LFE provide a `h` helper in their shell that
> can print the documentation of any module:
> 
>     iex> h String
>     A String in Elixir is a UTF-8 encoded binary.
> 
> However, Elixir is only able to show docs for Elixir modules.  LFE is
> only able to show docs for LFE functions and so on.  If documentation
> is standardized, such features can be easily added to other languages
> in a way that works consistently across all BEAM languages.
> 
> Furthermore, each language ends up building its own tools for
> generating, processing and converting documentation.  We hope a unified
> approach to documentation will improve the compatibility between tools.
> For instance, an Erlang IDE will be able to show inline documentation
> for any module and function, regardless of whether the function is part of OTP,
> a library or even written in Elixir, LFE or Alpaca.
> 
> **Note**: in this document, the word “documentation” refers exclusively
> to the API documentation of modules and functions.  Guides, tutorials
> and other materials are also essential to projects but not the focus
> of this EEP.
> 
> **Note**: This EEP is not about documentation format.  It is about a
> mechanism for storing documentation to make it easier to produce other
> formats.  For example, a tool can read the documentation and produce man
> pages from it.
> 
> 
> 
> Specification
> =============
> 
> This EEP is divided into three parts.  The first defines the two
> places the documentation can be stored, the second defines the shape of
> the documentation and the third discusses integration with OTP.
> 
> 
> ## Part 1: the "Docs" storage ##
> 
> There are two main mechanisms by which BEAM languages store documentation:
> in the filesystem (usually in the /doc directory) and inside ".beam"
> files. 
> 
> This EEP recognizes both options and aims to support both.  To look for
> documentation for a module named "example", a tool should:
> 
>   1. Look for "example.beam" in the code path, parse the BEAM file and
>      retrieve the "Docs" chunk
> 
>   2. If the chunk is not available, it should look for "example.beam"
>      in the code path and find the "doc/chunks/example.chunk" file in
>      the application that defines the "example" module
> 
>   3. If a ".chunk" file is not available, then documentation is not
>      available
> 
> The choice of using a chunk or the filesystem is completely up to the
> language or library.  In both cases, the documentation can be added or
> removed at any moment by stripping the "Docs" chunk or by removing the
> "doc/chunks" directory.
> 
> For example, languages like Elixir and LFE attach the "Docs" chunk at
> compilation time, which can be controlled via a compiler flag.  On the
> other hand, projects like OTP itself will likely generate the "doc/chunks"
> entries with a separate command, completely unrelated to code compilation.
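
The three lookup steps above can be sketched in Erlang roughly as follows. This is only an illustration, not part of the proposal: the module name `docs_lookup`, the function `fetch/1` and the `docs_not_available` reason are hypothetical, and the code assumes the conventional `<app>/ebin/<module>.beam` layout in order to find the application directory.

```erlang
-module(docs_lookup).
-export([fetch/1]).

%% Sketch of the lookup from Part 1 (hypothetical helper, not an OTP API).
%% fetch(Module) -> {ok, DocsTerm} | {error, docs_not_available}
fetch(Module) when is_atom(Module) ->
    case code:which(Module) of
        BeamPath when is_list(BeamPath) ->
            case from_beam_chunk(BeamPath) of
                {ok, _Docs} = Ok -> Ok;
                error -> from_chunk_file(BeamPath, Module)
            end;
        _NotOnDisk ->
            %% non_existing, preloaded or cover_compiled: no file to inspect
            {error, docs_not_available}
    end.

%% Step 1: read the "Docs" chunk out of the BEAM file, if present.
from_beam_chunk(BeamPath) ->
    case beam_lib:chunks(BeamPath, ["Docs"]) of
        {ok, {_Mod, [{"Docs", Bin}]}} -> {ok, binary_to_term(Bin)};
        {error, beam_lib, _Reason} -> error
    end.

%% Step 2: fall back to <app>/doc/chunks/<module>.chunk.
%% ASSUMPTION: the BEAM file lives in <app>/ebin/.
from_chunk_file(BeamPath, Module) ->
    AppDir = filename:dirname(filename:dirname(BeamPath)),
    ChunkPath = filename:join([AppDir, "doc", "chunks",
                               atom_to_list(Module) ++ ".chunk"]),
    case file:read_file(ChunkPath) of
        {ok, Bin} -> {ok, binary_to_term(Bin)};
        %% Step 3: no ".chunk" file means no documentation.
        {error, _Reason} -> {error, docs_not_available}
    end.
```

A tool would call `docs_lookup:fetch(example)` and treat `{error, docs_not_available}` as "no documentation".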
> 
> 
> ## Part 2: the "Docs" format ##
> 
> 
> In both storages, the documentation is written in exactly the same
> format: an Erlang term serialized to binary via term_to_binary/1.  The
> term may be optionally compressed when serialized and must follow the
> type specification below:
> 
>     {docs_v1,
>      Anno :: erl_anno:anno(),
>      Language :: atom(),
>      Format :: mime_type(),
>      ModuleDoc :: binary() | none | hidden,
>      Metadata :: map(),
>      Docs ::
>        [{{Kind, Name, Arity},
>          Anno :: erl_anno:anno(),
>          Signature :: [binary()],
>          Doc :: binary() | none | hidden,
>          Metadata :: map()
>         }]}
> 
> where in the root tuple we have:
> 
>   * `Anno` - annotation (line, column, file) of the module documentation
>     or of the definition itself (see erl_anno)
> 
>   * `Language` - an atom representing the language, for example:
>     "erlang", "elixir", "lfe", "alpaca", etc
> 
>   * `Format` - the mime type of the documentation, such as "text/markdown"
>     (see the FAQ for a discussion on this field)
> 
>   * `ModuleDoc` - a binary with the documentation or the atom "none"
>     in case there is no documentation or the atom "hidden" if
>     documentation has been explicitly disabled for this entry
> 
>   * `Metadata` - a map of atom keys with any term as value.  This can be
>     used to add annotations like the "authors" of a module, "deprecated",
>     or anything else a language or documentation tool may find relevant
> 
>   * `Docs` - a list of documentation for other entities (such as
>     functions and types) in the module
> 
> For each entry in `Docs`, we have:
> 
>   * `{Kind, Name, Arity}` - the kind, name and arity identifying the
>     function, callback, type, etc.  The official entities are: `function`,
>     `type` and `callback`.  Other languages will add their own. For
>     instance, Elixir and LFE may add `macro`
> 
>   * `Anno` - annotation (line, column, file) of the module documentation
>     or of the definition itself (see erl_anno)
> 
>   * `Signature` - the signature of the entity.  It is a list of
>     binaries. Each entry represents a binary in the signature that can
>     be joined with a whitespace or a newline.  For example,
>     `["binary_to_atom(Binary, Encoding)", "when is_binary(Binary)"]`
>     may be rendered as a single line or two lines. It exists
>     exclusively for exhibition purposes
> 
>   * `Doc` - a binary with the documentation or the atom "none"
>     in case there is no documentation or the atom "hidden" if
>     documentation has been explicitly disabled for this entry
> 
>   * `Metadata` - a map of atom keys with any term as value
> 
> This shared format is the heart of the EEP as it is what effectively
> allows cross-language collaboration.
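
As an illustration of the format described above, here is a minimal sketch that builds a `docs_v1` term by hand, serializes it and joins a signature for display. The module `docs_format_example` and all of its contents are made up for this example, and representing `Format` as a binary is an assumption, since the draft does not define `mime_type()`.

```erlang
-module(docs_format_example).
-export([example/0, encode/1, decode/1, render_signature/2]).

%% A hand-written docs_v1 term for a hypothetical module.
%% ASSUMPTION: Format is given as a binary MIME type.
example() ->
    {docs_v1,
     erl_anno:new(1),                       % Anno of the module doc
     erlang,                                % Language
     <<"text/markdown">>,                   % Format
     <<"Hypothetical example module.">>,    % ModuleDoc
     #{},                                   % Metadata
     [{{function, binary_to_atom, 2},
       erl_anno:new(10),
       [<<"binary_to_atom(Binary, Encoding)">>,
        <<"when is_binary(Binary)">>],
       <<"Converts a binary to an atom.">>,
       #{}}]}.

%% Serialize with optional compression; binary_to_term/1 handles
%% compressed and uncompressed input alike.
encode(Docs) -> term_to_binary(Docs, [compressed]).
decode(Bin) -> binary_to_term(Bin).

%% Join the signature parts with a separator chosen by the renderer,
%% for example <<" ">> for one line or <<"\n">> for several.
render_signature(Parts, Separator) ->
    iolist_to_binary(lists:join(Separator, Parts)).
```

The same binary produced by `encode/1` could be written to a ".chunk" file or stored as the "Docs" chunk.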
> 
> The `Metadata` field exists to allow languages, tools and libraries to
> add custom information to each entry.  This EEP documents the
> following metadata keys:
> 
>   * `authors := [binary()]` - a list of authors as binaries
> 
>   * `cross_references := [module() | {module(), {Kind, Name, Arity}}]` -
>     a list of modules or module entries that can be used as cross
>     references when generating documentation
> 
>   * `deprecated := binary()` - when present, it means the current entry
>     is deprecated with a binary that represents the reason for
>     deprecation and a recommendation to replace the deprecated code
> 
>   * `since := binary()` - a binary representing the version in which
>     the entry was added, such as "1.3.0" or "20.0"
> 
> Any key may be added to Metadata at any time.  Keys that are frequently
> used by the community can be standardized in future versions. 
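
A metadata map using the keys listed above might look like the following sketch. The values, the module name `docs_metadata_example` and the helper `deprecation/1` are invented for illustration only.

```erlang
-module(docs_metadata_example).
-export([entry_metadata/0, deprecation/1]).

%% Example metadata for one entry; all values are made up.
entry_metadata() ->
    #{authors => [<<"Jane Doe">>],
      cross_references => [lists, {lists, {function, map, 2}}],
      deprecated => <<"Use new_fun/1 instead">>,
      since => <<"1.3.0">>}.

%% Tools can pattern match on the keys they understand and
%% ignore everything else in the map.
deprecation(#{deprecated := Reason}) when is_binary(Reason) ->
    {deprecated, Reason};
deprecation(Metadata) when is_map(Metadata) ->
    not_deprecated.
```

Because unknown keys are simply ignored by the match, languages can add their own keys without breaking existing tools.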
> 
> 
> ## Part 3: Integration with OTP ##
> 
> The last part focuses on integrating the previous parts with OTP docs,
> tools and workflows.  The items below are suggestions and are not
> necessary for the adoption of this EEP, either by OTP or by any other
> language or library.
> 
> At this point we should consider changes to OTP such as:
> 
>   * Distributing the `doc/chunks/*.chunk` files as part of OTP and
>     changing the tools that ship with OTP to rely on them. For example,
>     "erl -man lists" could be changed to locate the "lists.chunk" file,
>     parse the documentation out and then convert it to a man page
>     on the fly.  This task may require multiple changes, as OTP stores
>     documentation in XML files as well as directly in the source code.
>     `edoc` itself should likely be augmented with functions that spit
>     out `.chunk` files from the source code
> 
>   * Adding `h(Module)`, `h(Module, Function, Arity)`, and similar to
>     Erlang’s shell to print the documentation of a module or of a
>     given function and arity. This should be able to print docs for any
>     other library or language that implements this proposal
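
The core of such an `h` helper could be sketched as below. This is not an existing shell API: `docs_h_sketch`, `module_doc/1` and `function_doc/3` are hypothetical names, the fallback messages are invented, and the functions take an already decoded `docs_v1` term, leaving out how that term is located and read.

```erlang
-module(docs_h_sketch).
-export([module_doc/1, function_doc/3]).

%% Text for the module-level documentation of a docs_v1 term.
module_doc({docs_v1, _Anno, _Lang, _Format, ModuleDoc, _Meta, _Docs}) ->
    doc_text(ModuleDoc).

%% Signature and text for one function, or not_found.
function_doc({docs_v1, _Anno, _Lang, _Format, _ModDoc, _Meta, Docs},
             Name, Arity) ->
    Key = {function, Name, Arity},
    case lists:keyfind(Key, 1, Docs) of
        {Key, _EntryAnno, Signature, Doc, _EntryMeta} ->
            %% Render the signature with one part per line.
            Sig = iolist_to_binary(lists:join(<<"\n">>, Signature)),
            {ok, Sig, doc_text(Doc)};
        false ->
            not_found
    end.

%% Map the three possible Doc values to printable text.
doc_text(none) -> <<"No documentation available.">>;
doc_text(hidden) -> <<"Documentation is hidden.">>;
doc_text(Doc) when is_binary(Doc) -> Doc.
```

A shell helper would print the returned binaries, for instance with `io:format("~ts~n", [Text])`.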
> 
> 
> 
> FAQ
> ===
> 
> *Q: Why do we have a Format entry in the documentation?*
> 
> The main trade-off in the proposal is the documentation format.  We have
> two options:
> 
>   * Allow each language/library/tool to choose their own documentation
>     format
>   * Impose a unified documentation format on all languages
> 
> A unified format for documentation gives no flexibility to languages and
> libraries in choosing how documentation is written.  As the ecosystem
> gets more diverse, it becomes unlikely that a single format will suit all.
> For this reason we introduced a Format field that allows each language
> and library to pick their documentation format.  The downside is that,
> if the Elixir docs are written in Markdown and a language does not know
> how to format Markdown, then the language will have to choose to either
> not show the Elixir docs or show them raw (i.e. in Markdown).
> 
> Erlang is in a privileged position.  All languages will be able to
> support whatever format is chosen for Erlang since all languages run on
> Erlang and will have direct access to Erlang's tooling.
> 
> *Q: If I have an Erlang/Elixir/LFE/Alpaca library that uses a custom
> documentation toolkit, will I also be able to leverage this?*
> 
> As long as the documentation ends up in the "Docs" chunk or inside
> the `doc/chunks` directory, we absolutely do not care how the
> documentation was originally written.  If you use a custom format,
> you may need to teach your language of choice how to render it though.
> See the previous question.
> 
> 
> 
> Copyright
> =========
> 
> This document has been placed in the public domain.
> 
> 
> 
> [EmacsVar]: <> "Local Variables:"
> [EmacsVar]: <> "mode: indented-text"
> [EmacsVar]: <> "indent-tabs-mode: nil"
> [EmacsVar]: <> "sentence-end-double-space: t"
> [EmacsVar]: <> "fill-column: 70"
> [EmacsVar]: <> "coding: utf-8"
> [EmacsVar]: <> "End:"
> [VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "
> _______________________________________________
> eeps mailing list
> eeps@REDACTED
> http://erlang.org/mailman/listinfo/eeps


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


