Investigate segfault when calling h.<function>_<suffix> #1283

Open
JCGoran opened this issue May 29, 2024 · 1 comment

Comments

@JCGoran
Contributor

JCGoran commented May 29, 2024

Below is a preliminary writeup about a segfault caused by calling h.<function>_<suffix> in a Python script.

  • NEURON version: commit 71bf299335f6f2816c23a8002d9a75ff66528682
  • NMODL version: commit bd47049

Take the following mod file:

: minimal.mod
NEURON {
    SUFFIX minimal
}

FUNCTION f() {
    f = 1
}

Compiling it with NMODL works:

$ nrnivmodl -nmodl $(which nmodl) minimal.mod
[NMODL][warning] Code generation with NMODL is pre-alpha, lacks features and is intended only for development use
/Users/jelic/software/nmodl-clean/test/usecases/empty
cfiles =
Mod files: "minimal.mod"

Creating 'arm64' directory for .o files.

MODOBJS= ./minimal.o
 -> Compiling mod_func.cpp
 -> NMODL ../minimal.mod
 -> Compiling /arm64/minimal.cpp
 => LINKING shared library "/arm64/./libnrnmech.dylib"
 => LINKING executable "/arm64/./special" LDFLAGS are:
ld: warning: ignoring duplicate libraries: '-lnrnmech'
Successfully created arm64/special

Unfortunately, running the following Python script:

# sim.py
from neuron import h
s = h.Section()
s.insert("minimal")
h.f_minimal()

causes a segfault when running via nrniv sim.py (one can equivalently run with python sim.py, but then debugging is cumbersome).
Running under the LLDB debugger reveals:

(lldb) run
Process 39954 launched: '/nrn/build-arm64/install/bin/nrniv' (arm64)
NEURON -- VERSION 9.0a-243-g30b42a1b8+ master (30b42a1b8+) 2024-05-14
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2022
See http://neuron.yale.edu/neuron/credits

loading membrane mechanisms from arm64/.libs/libnrnmech.so
Additional mechanisms from files
 "minimal.mod"
Process 39954 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001001dceb8 libnrnmech.so`double* neuron::cache::MechanismRange<1ul, 0ul>::data_array<0, 1>(this=0x000000016fdfdfd8, instance=0) at mechanism_range.hpp:87:26
   84       [[nodiscard]] double* data_array(std::size_t instance) {
   85           static_assert(variable < NumFloatingPointFields);
   86           // assert(array_size == m_data_array_dims[variable]);
-> 87           return std::next(m_data_ptrs[variable], array_size * (m_offset + instance));
   88       }
   89
   90       template <int variable, int array_size>
Target 0: (nrniv) stopped.

The full backtrace is:

  * frame #0: 0x00000001001dceb8 libnrnmech.so`double* neuron::cache::MechanismRange<1ul, 0ul>::data_array<0, 1>(this=0x000000016fdfdfd8, instance=0) at mechanism_range.hpp:87:26
    frame #1: 0x00000001001dce90 libnrnmech.so`double* neuron::cache::MechanismRange<1ul, 0ul>::fpfield_ptr<0>(this=0x000000016fdfdfd8) at mechanism_range.hpp:109:16
    frame #2: 0x00000001001dc99c libnrnmech.so`neuron::make_instance_minimal(_ml=0x000000016fdfdfd8) at minimal.cpp:102:26
    frame #3: 0x00000001001df8dc libnrnmech.so`neuron::_hoc_f() at minimal.cpp:182:21
    frame #4: 0x0000000101c47d30 libnrniv.dylib`hoc_call() at code.cpp:1418:9
    frame #5: 0x0000000101d2dae0 libnrniv.dylib`fcall(vself=0x0000000100af6b70, vargs=0x000000010011c040) at nrnpy_hoc.cpp:728:9
    frame #6: 0x0000000101b8a700 libnrniv.dylib`OcJump::fpycall(f=(libnrniv.dylib`fcall(void*, void*) at nrnpy_hoc.cpp:671), a=0x0000000100af6b70, b=0x000000010011c040) at ocjump.cpp:138:16
    frame #7: 0x0000000101d2cd98 libnrniv.dylib`hocobj_call(self=0x0000000100af6b70, args=0x000000010011c040, kwrds=0x0000000000000000) at nrnpy_hoc.cpp:796:45
    frame #8: 0x00000001003b180c Python`_PyObject_MakeTpCall + 132
    frame #9: 0x000000010048b57c Python`call_function + 268
    frame #10: 0x0000000100486124 Python`_PyEval_EvalFrameDefault + 22388
    frame #11: 0x000000010047fb6c Python`_PyEval_EvalCode + 416
    frame #12: 0x00000001004cc458 Python`run_eval_code_obj + 136
    frame #13: 0x00000001004cc388 Python`run_mod + 112
    frame #14: 0x00000001004cada8 Python`pyrun_file + 168
    frame #15: 0x00000001004ca7e4 Python`pyrun_simple_file + 252
    frame #16: 0x00000001004ca6a8 Python`PyRun_SimpleFileExFlags + 80
    frame #17: 0x0000000101d274c0 libnrniv.dylib`nrnpy_pyrun(fname="sim.py") at nrnpython.cpp:134:26
    frame #18: 0x0000000101c71f0c libnrniv.dylib`hoc_moreinput() at hoc.cpp:1133:14
    frame #19: 0x0000000101c71a08 libnrniv.dylib`hoc_main1(argc=2, argv=0x000000016fdfeed0, envp=0x000000016fdfeee8) at hoc.cpp:917:16
    frame #20: 0x00000001017d9784 libnrniv.dylib`ivocmain_session(argc=2, argv=0x000000016fdfeed0, env=0x000000016fdfeee8, start_session=1) at ivocmain.cpp:744:23
    frame #21: 0x00000001017d909c libnrniv.dylib`ivocmain(argc=2, argv=0x000000016fdfeed0, env=0x000000016fdfeee8) at ivocmain.cpp:349:12
    frame #22: 0x0000000100003b0c nrniv`main(argc=2, argv=0x000000016fdfeed0, env=0x000000016fdfeee8) at nrnmain.cpp:71:12
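
To check for the crash without attaching a debugger each time, a small harness can run the script in a child process and inspect the exit status. This is only a sketch: the helper name `crashed_with_segfault` is made up for illustration, it relies on the POSIX convention that a negative return code means "killed by that signal", and it assumes `nrniv` lets SIGSEGV terminate the process rather than handling it itself. The demo uses a stand-in child that raises SIGSEGV on itself; swapping in `["nrniv", "sim.py"]` is the intended use.

```python
import signal
import subprocess
import sys


def crashed_with_segfault(cmd):
    """Run cmd and report whether the child was killed by SIGSEGV (POSIX only)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # On POSIX, subprocess reports termination by signal N as return code -N.
    return result.returncode == -signal.SIGSEGV


if __name__ == "__main__":
    # Stand-in child that sends SIGSEGV to itself, so the sketch runs
    # without NEURON installed. Replace with ["nrniv", "sim.py"] to test
    # the script from this report.
    child = [
        sys.executable,
        "-c",
        "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
    ]
    print(crashed_with_segfault(child))
```

When running with `python sim.py` instead, the standard `-X faulthandler` option (`python -X faulthandler sim.py`) at least prints the Python-level traceback at the point of the crash.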

The issue seems to be that we are trying to dereference a nullptr:

(lldb) p m_data_ptrs
(double *const *) 0x0000000000000000

Going up a couple of frames:

(lldb) up
frame #1: 0x00000001001dce90 libnrnmech.so`double* neuron::cache::MechanismRange<1ul, 0ul>::fpfield_ptr<0>(this=0x000000016fdfdfd8) at mechanism_range.hpp:109:16
   106
   107      template <int variable>
   108      [[nodiscard]] double* fpfield_ptr() {
-> 109          return data_array<variable, 1>(0);
   110      }
   111
   112      /**
(lldb) up
frame #2: 0x00000001001dc99c libnrnmech.so`neuron::make_instance_minimal(_ml=0x000000016fdfdfd8) at minimal.cpp:102:26
   99
   100      static minimal_Instance make_instance_minimal(_nrn_mechanism_cache_range& _ml) {
   101          return minimal_Instance {
-> 102              _ml.template fpfield_ptr<0>()
   103          };
   104      }
   105
(lldb) up
frame #3: 0x00000001001df8dc libnrnmech.so`neuron::_hoc_f() at minimal.cpp:182:21
   179          _ppvar = _local_prop ? _nrn_mechanism_access_dparam(_local_prop) : nullptr;
   180          _thread = _extcall_thread.data();
   181          _nt = nrn_threads;
-> 182          auto inst = make_instance_minimal(_ml_real);
   183          _r = f_minimal(_ml, inst, id, _ppvar, _thread, _nt);
   184          hoc_retpushx(_r);
   185      }

Note that _ml_real doesn't have any data in it:

(lldb) p _ml_real
(_nrn_mechanism_cache_instance) {
  neuron::cache::MechanismRange<1, 0> = {
    m_data_ptrs = 0x0000000000000000
    m_data_array_dims = 0x0000000000000000
    m_pdata_ptrs = 0x0000000000000000
    m_offset = 18446744073709551615
  }
  m_dptr_cache = (__elems_ = "")
  m_dptr_datums = (__elems_ = "")
}
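
The `m_offset` value in that dump is worth a second look: 18446744073709551615 is 2^64 - 1, i.e. what an unsigned 64-bit integer holds when it is set to -1. Presumably this is an "invalid row" sentinel for the null `Prop`, though that is my reading and not something confirmed in the NEURON sources here. The arithmetic itself can be checked with a few lines of Python (the helper names are just for illustration):

```python
import ctypes

# Value LLDB printed for m_offset in the dump above.
M_OFFSET_FROM_LLDB = 18446744073709551615


def as_signed_64(value):
    """Reinterpret a 64-bit pattern as a signed integer (ctypes wraps silently)."""
    return ctypes.c_int64(value).value


def as_unsigned_64(value):
    """Reinterpret a 64-bit pattern as an unsigned integer."""
    return ctypes.c_uint64(value).value


if __name__ == "__main__":
    # The same bit pattern read as signed, and -1 read as unsigned.
    print(as_signed_64(M_OFFSET_FROM_LLDB))
    print(as_unsigned_64(-1) == M_OFFSET_FROM_LLDB)
```

With `m_data_ptrs` null and an offset like this, any pointer arithmetic in `data_array` has nothing valid to work with.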

The entire definition of _hoc_f is as follows:

    static void _hoc_f(void) {
        double _r{};
        Datum* _ppvar;
        Datum* _thread;
        NrnThread* _nt;
        Prop* _local_prop = _prop_id ? _extcall_prop : nullptr;
        _nrn_mechanism_cache_instance _ml_real{_local_prop};
        auto* const _ml = &_ml_real;
        size_t const id{};
        _ppvar = _local_prop ? _nrn_mechanism_access_dparam(_local_prop) : nullptr;
        _thread = _extcall_thread.data();
        _nt = nrn_threads;
        auto inst = make_instance_minimal(_ml_real);
        _r = f_minimal(_ml, inst, id, _ppvar, _thread, _nt);
        hoc_retpushx(_r);
    }

Going down the rabbit hole, it seems _local_prop is a nullptr, and the call to:

        _nrn_mechanism_cache_instance _ml_real{_local_prop};

invokes the neuron::cache::MechanismInstance constructor, which starts with this snippet:

    MechanismInstance(Prop* prop)
        : base_type{_nrn_mechanism_get_type(prop), mechanism::_get::_current_row(prop)} {
        if (!prop) {
            // grrr...see cagkftab test where setdata is not called(?) and extcall_prop is null(?)
            return;
        }

This seems to originate from this NEURON commit, and is where I sort of lost track of what's going on.

Going back to the drawing board, we can instead call this Python script:

from neuron import h, gui
s = h.Section()
s.insert("minimal")
s(0.5).minimal.f() # <--- instead of `h.f_minimal()`

which doesn't segfault, so the top-level HOC-style call (h.f_minimal()) crashes while its per-segment equivalent works. Stopping at _npy_f (presumably the Python-side counterpart of _hoc_f), we get:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001001dfb7c libnrnmech.so`neuron::_npy_f(_prop=0x0000600002106080) at minimal.cpp:187:16
   184          hoc_retpushx(_r);
   185      }
   186      static double _npy_f(Prop* _prop) {
-> 187          double _r{};
   188          Datum* _ppvar;
   189          Datum* _thread;
   190          NrnThread* _nt;
(lldb) p _prop
(Prop *) 0x0000600002106080

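which is not a nullptr. So the difference between the two paths may simply be that _npy_f receives a valid Prop directly, while _hoc_f depends on _extcall_prop (and _prop_id) having been set beforehand, e.g. via setdata, which never happens in sim.py.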
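
To make the suspected difference between the two call paths concrete, here is a toy model in plain Python. Nothing in it is NEURON API: `ToyProp`, `ToyMechanism` and their methods are hypothetical stand-ins, `None` plays the role of `nullptr`, and the `_prop_id ? _extcall_prop : nullptr` check is simplified to a single attribute. Unlike the generated code, `make_instance` raises an error on a missing prop rather than dereferencing it. Whether calling setdata first actually avoids the crash with NMODL-generated code is not something this report confirms.

```python
class ToyProp:
    """Hypothetical stand-in for the Prop of one mechanism instance."""

    def __init__(self, data):
        self.data = data


class ToyMechanism:
    """Toy model of the two entry points seen in minimal.cpp."""

    def __init__(self):
        # Plays the role of _extcall_prop: unset until setdata() is called.
        self.extcall_prop = None

    def setdata(self, prop):
        """Select the instance that later top-level calls operate on."""
        self.extcall_prop = prop

    @staticmethod
    def make_instance(prop):
        # The generated code dereferences unconditionally; here a missing
        # prop is reported as an error instead.
        if prop is None:
            raise RuntimeError("no mechanism instance selected (prop is null)")
        return prop.data

    def hoc_f(self):
        """Like _hoc_f: relies on previously stored state."""
        self.make_instance(self.extcall_prop)
        return 1.0

    def npy_f(self, prop):
        """Like _npy_f: the caller hands over the prop directly."""
        self.make_instance(prop)
        return 1.0


if __name__ == "__main__":
    mech = ToyMechanism()
    prop = ToyProp([0.0])
    print(mech.npy_f(prop))  # direct path: prop is always available
    try:
        mech.hoc_f()  # stored-state path before setdata: nothing selected
    except RuntimeError as err:
        print("error:", err)
    mech.setdata(prop)
    print(mech.hoc_f())  # stored-state path after setdata
```

In this model the direct path never sees a missing prop, while the stored-state path fails until an instance has been selected, which mirrors the behaviour observed above.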

@nrnhines
Collaborator

I don't know if this will help, but the same issue was very longstanding with nocmodl and was fixed in neuronsimulator/nrn#2460.
Also see neuronsimulator/nrn#2475.
