[erlang-questions] Embedded vs Interactive - Why embedded?

Fri Mar 11 01:02:29 CET 2016

On 03/10/2016 05:53 PM, Michael Truog wrote:
> On 03/10/2016 01:39 PM, Ryan wrote:
>> On 03/10/2016 02:43 PM, Michael Truog wrote:
>>> The embedded/interactive functionality really is focused on module
>>> loading either at startup or lazily, as you have described.  I only
>>> mentioned initialization and configuration source code, due to how this
>>> fail-fast concept can be applied to source code.  While it may seem that
>>> the embedded/interactive choice is not an important one, with execution
>>> generally happening in the same way, it can be important due to some
>>> code paths being infrequent and problems with the dependencies like
>>> modules with the same name (and unfortunately sometimes it then depends
>>> on the search directory order, which can lead to problems during
>>> execution that are counter-intuitive).
>>>
>> I can totally understand the "modules with same name" problem, being
>> from a Java background, where the same problem happens with class
>> names. In this thread, a fellow named Max Lapshin indicated that he's
>> using interactive mode and loading all modules at startup. Do you
>> think that doing that would meet this concern of yours, assuming that
>> during loading, one could detect if there were multiple modules of the
>> same name on the code path? I'm not sure how to do that, but surely
>> there's a way.
>
> I believe interactive mode should only be used when using the Erlang
> shell manually, or when running CT tests or eunit tests.  This is to
> make sure everything is fail-fast.  I understand you are concerned with
> concrete justification of that, which I give below. Dealing with
> modules that have the same name is something that should be
> automatically caught when a release is built, but there are probably
> situations when that doesn't happen.  At the very least release creation
> should fail when seeing the same application in separate paths, during
> an attempt to build the release.  So it should be easy to see that
> release creation helps avoid errors, instead of haphazard loading of
> modules randomly (due to the function call path) with interactive mode.
>
>>
>>> I meant using interactive mode for manual usage of the Erlang shell, not
>>> real testing of a release, i.e., only for development testing of
>>> random segments of Erlang source code.  Even that usage of
>>> interactive mode can be problematic due to the undocumented differences
>>> between the Erlang shell execution and normal Erlang module execution.
>>> So, all releases for testing and production should be real releases
>>> running in embedded mode.  The interactive mode just helps you quickly
>>> use the Erlang shell to check stuff.
>>>
>> I think this gets to the heart of what I'm trying to figure out: are
>> there "undocumented differences" between interactive and embedded mode
>> that are going to affect a long-running, production deployment? I'm
>> assuming that what you refer to as "normal Erlang module execution" is
>> embedded mode, and that's what this thread is all about. Why is
>> embedded mode considered "normal"? Why can't we just run interactive
>> mode everywhere? Assuming your concerns about dependencies can be
>> addressed in a fail-fast way, is there really a good reason not to?
>
> The "undocumented differences" I am referring to are specific to using
> the Erlang shell for executing Erlang source code when compared to
> compiled BEAM in Erlang modules.  Some execution differences exist,
> which should discourage any usage of the Erlang shell for any type of
> production deployment, even if it is only for a testing host.  The
> embedded/interactive mode concern is separate and focused on how modules
> are loaded, but interactive is the default for normal Erlang shell usage.
>
>>
>>> That can be weird.  I know there can be problems with reltool including
>>> dependencies that are not dependencies of the main application, due to
>>> xref being used internally by reltool instead of just looking at the
>>> .app dependencies.  That only affects using applications dynamically
>>> though, and doing that is uncommon. Normally all the Erlang applications
>>> are part of a static hierarchy and you only use a single boot file
>>> during the lifetime of the Erlang VM.
>>>
>> Ahh, xref would explain it. I think you're kind of proving my point
>> about dependencies, though. If I can concretely say that my dependency
>> list is [X,Y,Z], then why introduce the complexity overhead of the
>> erlang release tools? They only seem to obfuscate things and introduce
>> more opportunity for mistakes. They certainly have some surprising
>> behavior built in. If I know my dependencies, then I should put the
>> same set of dependencies in all environments, and load the code the
>> same way everywhere. Isn't that the most foolproof way to ensure no
>> surprises?
>
> This is a surprise if you include Erlang applications that aren't
> explicitly started by the release.  The release is a necessary thing to
> make sure all the files are in a single place, to be used together, so
> they can be managed as a release.  Development outside of Erlang
> should already have shown you that having a release is beneficial.  If you
> don't believe in having releases, then all hope may be lost :-)  This is
> really about managing how software changes to avoid risk.  You may think
> you are fixing source code by changing it, but often you are really
> breaking something else.  You should not need concrete evidence of this,
> since you are human and not perfect like the computer :-) (i.e., doing
> only what we tell it).
>
>>
>> As to the boot file, I haven't tested extensively, but I believe you
>> can use a boot file with either embedded or interactive mode, and it
>> works the same way except for the eager/lazy module loading.
>>
>>> Yes, release building is a build-time concern, but making sure the
>>> release is run in a dependable way is what relates to the
>>> embedded/interactive mode decision.  Always using the embedded mode when
>>> a release is run will help make sure the release is executed
>>> dependably.  You may have the initial startup cost of loading all your
>>> modules due to embedded mode but that delay is very small with the
>>> Erlang VM and I have never seen it as a problem (even with an ARM and
>>> slow SSD memory).
>>>
>> I'm not concerned with the initial loading cost of modules. I'm
>> concerned with bugs going to production. I think what you said here is
>> the primary thrust of my question: "Always using the embedded
>> mode when a release is run will help make sure the release is executed
>> dependably."
>>
>> That's the sort of assertion I've found scattered around the net, and
>> that's what I'm questioning here. How do you know that? How can
>> embedded be more dependable than interactive when the only difference
>> is in eager vs. lazy module loading? Keep in mind that we agree that
>> dependency management is a *build-time* concern, so the proper
>> dependencies are theoretically guaranteed to be in place by the time
>> the system is started. How is the runtime mode of the system going to
>> affect its dependability?
>>
>> For follow-up, can you give me some concrete example(s) in which
>> running in embedded mode would (did) prevent some issue that running
>> in interactive wouldn't catch?
>
> If there are any problems with the filesystem that are intermittent, you
> may see them randomly with interactive mode, but at least you would see
> them on startup with embedded mode.  Some modules, like NIFs, can do
> stuff when they load, and that is best done all at once, to see if
> something breaks, rather than waiting an undefined amount of time later
> to find out about your problem in testing or production.  If execution
> changes based on the modules being loaded, that can make the situation
> more complex.  Having a known starting state is important for having
> dependable results in the system, and interactive mode only makes your
> starting state random based on how you change the build and source code.
>
> I am not sure if that is concrete enough, but having the system be
> fail-fast is a pretty fundamental concept.  I have worked on a system
> that took a few days to die (each time requiring a few days, with
> specific input) due to a single bug, and that can quickly teach you that
> fail-fast is beneficial.  Trying to solve all problems as a practice of
> fire-fighting doesn't lead to a dependable service; it just creates
> drama that consumes extra time and money, even if it was meant as a
> method of feeding egos.
>
I hear what you're saying, and I need to think about it for a little 
while. My one thought right now is that the whole "fail fast on load" 
thing can still be achieved in interactive mode if you manually ensure 
all modules are loaded at startup. You don't need the full release and 
embedded mode to do that.
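
As a rough sketch of that idea (not code from this thread), the two
checks discussed above could look something like the module below.  The
module name eager_load and both function names are made up for
illustration.  It assumes each application's .app file lists its modules
and that maps:update_with/4 is available (OTP 19 or later).  The
existing code:clash/0 also reports modules with identical names on the
code path, but it prints to standard output rather than returning a
value.

```erlang
-module(eager_load).
-export([load_apps/1, duplicate_modules/0]).

%% Eagerly load every module listed in the .app file of each application.
%% Raises an error (fail-fast) as soon as one module cannot be loaded.
load_apps(Apps) ->
    lists:foreach(fun load_app/1, Apps).

load_app(App) ->
    case application:load(App) of
        ok -> ok;
        {error, {already_loaded, App}} -> ok;
        {error, Reason} -> error({cannot_load_app, App, Reason})
    end,
    {ok, Mods} = application:get_key(App, modules),
    lists:foreach(fun ensure_loaded/1, Mods).

ensure_loaded(Mod) ->
    case code:ensure_loaded(Mod) of
        {module, Mod} -> ok;
        {error, Why} -> error({cannot_load_module, Mod, Why})
    end.

%% Return [{ModuleName, [Dir, ...]}] for every .beam file name that
%% appears in more than one directory of the current code path.
duplicate_modules() ->
    Pairs = [{filename:basename(File, ".beam"), Dir}
             || Dir <- lists:usort(code:get_path()),
                File <- filelib:wildcard("*.beam", Dir)],
    Grouped = lists:foldl(
                fun({Mod, Dir}, Acc) ->
                        maps:update_with(Mod,
                                         fun(Dirs) -> [Dir | Dirs] end,
                                         [Dir], Acc)
                end, #{}, Pairs),
    [{Mod, lists:reverse(Dirs)}
     || {Mod, Dirs} <- maps:to_list(Grouped), length(Dirs) > 1].
```

Calling eager_load:load_apps([my_app]) early in startup (my_app being a
placeholder for your own application) would surface load failures
immediately, and a non-empty result from eager_load:duplicate_modules()
could be treated as a startup error.  This only covers the load-time part
of the argument; it does not give you the other things a release
provides.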