[erlang-bugs] r15b03-1 SEGV in erts_port_task_execute()

Anthony Ramine n.oxyde@REDACTED
Tue Jul 1 01:32:46 CEST 2014


The pop_task() function was removed in commit 6e01408aba71e26884c5db81b8e4fa89bd803576, which is included in R16A.

-- 
Anthony Ramine

On 18 June 2014 at 12:38, Mikael Pettersson <mikpelinux@REDACTED> wrote:

> One of our nodes running OTP r15b03-1 segfaulted yesterday evening.
> Unfortunately it didn't produce a usable core dump (configuration
> problem, sigh), but the kernel logged the address of the instruction
> that faulted and the address it tried to access.  The segfault
> turned out to be in erts_port_task_execute:
> 
> int
> erts_port_task_execute(ErtsRunQueue *runq, Port **curr_port_pp)
> {
>    int port_was_enqueued = 0;
>    Port *pp;
>    ErtsPortTaskQueue *ptqp;
>    ErtsPortTask *ptp;
>    int res = 0;
>    int reds = ERTS_PORT_REDS_EXECUTE;
>    erts_aint_t io_tasks_executed = 0;
>    int fpe_was_unmasked;
>    Uint64 start_time = 0;
> 
>    ERTS_SMP_LC_ASSERT(erts_smp_lc_runq_is_locked(runq));
> 
>    ERTS_PT_CHK_PORTQ(runq);
> 
>    pp = pop_port(runq);
>    if (!pp) {
> 	res = 0;
> 	goto done;
>    }
> 
>    ERTS_PORT_NOT_IN_RUNQ(pp);
> 
>    *curr_port_pp = pp;
> 
>    ASSERT(pp->sched.taskq);
>    ASSERT(pp->sched.taskq->first);
>    ptqp = pp->sched.taskq;
>    pp->sched.taskq = NULL;
> 
>    ASSERT(!pp->sched.exe_taskq);
>    pp->sched.exe_taskq = ptqp;
> 
>    if (erts_smp_port_trylock(pp) == EBUSY) {
> 	erts_smp_runq_unlock(runq);
> 	erts_smp_port_lock(pp);
> 	erts_smp_runq_lock(runq);
>    }
> 
>    if (erts_sched_stat.enabled) {
> 	ErtsSchedulerData *esdp = erts_get_scheduler_data();
> 	Uint old = ERTS_PORT_SCHED_ID(pp, esdp->no);
> 	int migrated = old && old != esdp->no;
> 
> 	erts_smp_spin_lock(&erts_sched_stat.lock);
> 	erts_sched_stat.prio[ERTS_PORT_PRIO_LEVEL].total_executed++;
> 	erts_sched_stat.prio[ERTS_PORT_PRIO_LEVEL].executed++;
> 	if (migrated) {
> 	    erts_sched_stat.prio[ERTS_PORT_PRIO_LEVEL].total_migrated++;
> 	    erts_sched_stat.prio[ERTS_PORT_PRIO_LEVEL].migrated++;
> 	}
> 	erts_smp_spin_unlock(&erts_sched_stat.lock);
>    }
> 
>    /* trace port scheduling, in */
>    if (IS_TRACED_FL(pp, F_TRACE_SCHED_PORTS)) {
> 	trace_sched_ports(pp, am_in);
>    }
> 
>    ERTS_SMP_LC_ASSERT(erts_lc_is_port_locked(pp));
> 
>    ERTS_PT_CHK_PRES_PORTQ(runq, pp);
>    ptp = pop_task(ptqp);
> 
> At this point ptqp is NULL, so the initial load in the pop_task()
> code faults.  This is not a debug build, so the assertions above
> didn't catch this condition.
> 
> I don't know if this is repeatable; we've never seen it before.
> The machine was doing a lot of port I/O at the time (generating
> pdf report files).
> 
> This is mostly an FYI at this point.  If someone thinks they recognize
> the problem and can point to a fix in a later release that'd be great.
> 
> /Mikael
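
To illustrate the failure mode described in the quoted report, here is a minimal sketch. It is not the ERTS code: Task, TaskQueue, pop_task_unchecked() and pop_task_checked() are hypothetical stand-ins, and the ASSERT macro only imitates the general behaviour of a debug-only assertion. It shows why an assertion that is compiled out in a non-debug build cannot stop a NULL queue pointer from reaching the first load in a pop function, and how an ordinary runtime check would.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a port task and its queue; these are
 * NOT the ERTS definitions of ErtsPortTask / ErtsPortTaskQueue. */
typedef struct task {
    struct task *next;
    int id;
} Task;

typedef struct {
    Task *first;
    Task *last;
} TaskQueue;

/* Debug-only assertion: expands to nothing unless DEBUG is defined,
 * so a release build performs no check at all. */
#ifdef DEBUG
#define ASSERT(e) ((e) ? (void)0 : abort())
#else
#define ASSERT(e) ((void)0)
#endif

/* Unchecked pop: the very first load (q->first) faults when q is NULL,
 * which matches the symptom in the report. */
static Task *pop_task_unchecked(TaskQueue *q)
{
    Task *t = q->first;
    if (t) {
        q->first = t->next;
        if (!q->first)
            q->last = NULL;
        t->next = NULL;
    }
    return t;
}

/* Checked pop: the ASSERT catches the problem in debug builds, and the
 * explicit test keeps release builds from dereferencing NULL. */
static Task *pop_task_checked(TaskQueue *q)
{
    ASSERT(q != NULL);
    if (q == NULL)
        return NULL;
    return pop_task_unchecked(q);
}
```

Built without -DDEBUG, pop_task_checked(NULL) returns NULL, whereas pop_task_unchecked(NULL) would dereference a null pointer. A check like this only hides the symptom; whatever leaves the queue pointer NULL in the first place would still need to be found.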
