[erlang-bugs] erts_sys_ddll_open_noext() breaks on Opensolaris 2008.11 (and Nevada > svn95)

Paul Fisher pfisher@REDACTED
Wed Dec 31 22:09:39 CET 2008


The erl_ddll module ultimately depends on the following little gem:

int erts_sys_ddll_open_noext(char *dlname, void **handle)
{
      int ret = ERL_DE_NO_ERROR;
      char *str;
      dlerror();
      *handle = dlopen(dlname, RTLD_NOW);
      if ( (str = dlerror())) {
         /* Remove prefix filename to avoid exploding number of
            errorcodes on extreme usage */
         if (strstr(str,dlname) == str) {
             char *save_str = str;
             str += strlen(dlname);
             while (*str == ':' || *str == ' ') {
                 ++str;
             }
             if (*str == '\0') { /* Better with filename than nothing... */
                 str = save_str;
             }
         }
         ret = ERL_DE_DYNAMIC_ERROR_OFFSET - find_errcode(str);
      }
      return ret;
}

Unfortunately, this breaks on OpenSolaris 2008.11 and on Nevada builds 
newer than snv_95, apparently because of a change in the behavior of 
dlerror(): it can now return a non-null value even when dlopen() 
succeeds and returns a valid handle.  What follows is a test program and 
output from my system (an up-to-date 2008.11) that reproduces the problem.

Given the following test program:

#include <dlfcn.h>
#include <stdio.h>

int main( int argc, const char * const argv[] )
{
      const char *library = (argc > 1 ? argv[1] : "libuuid.so");
      const char *error;
      void *handle;
      int ret = 0;

      /* clear out any lingering dlerror value just in case. */
      dlerror();

      /* open the specified library and if it fails bail with
         diagnostics. */
      handle = dlopen( library, RTLD_NOW );
      if( handle == 0 )
      {
          printf( "dlopen( \"%s\", RTLD_NOW ) failed\n", library );
          printf( "dlerror() reports the error as:\n  %s\n", dlerror() );

          return 1;
      }

      /* open seems to have worked, check what dlerror thinks of our dlopen
         call. */
      printf( "dlopen( \"%s\", RTLD_NOW ) succeeded\n", library );
      if( (error = dlerror()) != 0 )
      {
          printf( "dlerror() still return returned non-null value:\n  %s\n",
                  error );
          ret = 2;
      }

      /* close the handle, and make sure that succeeds. */
      if( dlclose( handle ) == 0 )
          printf( "dlclose() of handle succeeded\n" );
      else
      {
          printf( "dlclose() of handle failed\n" );
          printf( "dlerror() returned: %s\n", dlerror() );
          ret = 3;
      }

      return ret;
}

compiled like so:

pfisher@REDACTED:/tmp$ gcc -m64 -f -fPIC -c test.c
pfisher@REDACTED:/tmp$ g++ -m64 test.o -o /tmp/test -lsocket -lnsl -pthread -lumem -lrt -ldl

The following results from specifying libuuid.so to the test program:

pfisher@REDACTED:/tmp$ truss -o /tmp/truss-fails.txt ./test libuuid.so
dlopen( "libuuid.so", RTLD_NOW ) succeeded
dlerror() still return returned non-null value:
    ld.so.1: test: fatal: libmapmalloc.so.1: No such file or directory
dlclose() of handle succeeded

This seems to be related to the behavior covered in
http://www.virtualbox.org/ticket/1840 regarding /usr/lib/mps/64/libnspr4.so.

However, another library that does not pull in libnspr4.so works just fine:

pfisher@REDACTED:/tmp$ truss -o /tmp/truss.txt ./test libcrypto.so
dlopen( "libcrypto.so", RTLD_NOW ) succeeded
dlclose() of handle succeeded

The seemingly obvious solution to this problem is to fail only when 
dlopen() returns NULL, and only then check the value of dlerror().  
This would seem to be valid on all Unix-flavored systems, but there are 
always exceptions...
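
For illustration, here is a minimal sketch of that approach.  It is not 
the actual ERTS code: open_library_checked() and its errmsg out-parameter 
are hypothetical names, and the prefix stripping and find_errcode() 
mapping from the original function are left out to keep it short.  The 
only point is the ordering: test the handle first, and look at dlerror() 
only when the handle is NULL.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not part of ERTS): returns 0 on success, -1 on
 * failure.  On failure *errmsg points at the dlerror() text, or at a
 * fallback string if dlerror() had nothing to say.  On success *errmsg
 * is set to NULL. */
int open_library_checked(const char *dlname, void **handle,
                         const char **errmsg)
{
    /* Clear any stale error left behind by an earlier dl*() call. */
    dlerror();

    *handle = dlopen(dlname, RTLD_NOW);
    if (*handle == NULL) {
        /* Only a NULL handle means failure; consult dlerror() now. */
        const char *str = dlerror();
        *errmsg = (str != NULL) ? str : "unknown dlopen failure";
        return -1;
    }

    /* Success: deliberately ignore whatever dlerror() may still hold,
     * such as a message about a library the runtime linker probed for
     * while resolving dependencies. */
    *errmsg = NULL;
    return 0;
}
```

With this ordering, the libuuid.so case above would be reported as a 
success, since dlopen() returned a usable handle even though dlerror() 
was left non-null.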


--
paul


