Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SML: Segfault when accessing null pointer without error message #451

Open
moschmdt opened this issue May 2, 2024 · 10 comments · May be fixed by #465
Open

SML: Segfault when accessing null pointer without error message #451

moschmdt opened this issue May 2, 2024 · 10 comments · May be fixed by #465

Comments

@moschmdt
Copy link

moschmdt commented May 2, 2024

This is not a bug, rather more a quality of live improvement suggestion

When reading the output link via SML (FindAttribute) and one attribute does not exist results in a returned null pointer. Reading the null pointer obviously results in a segfault.

One could check for null pointers every time but code readability suffers in my opinion while prototyping and debugging with external code could be associated with high efforts.

I would suggest throwing an error naming the parent I’d that is still valid and the invalid child parameter instead returning a null pointer.

While the Adaption in the cpp code is fairly simple, I don’t know the consequences for the SWIG interfaces.
Someone has an idea about it and can estimate if this is feasible?

@garfieldnate
Copy link
Collaborator

Sorry could you provide an example piece of code that triggers the segfault?

@moschmdt
Copy link
Author

moschmdt commented May 3, 2024

Example Soar rule to generate the output link:

sp {any*propose*output
   (state <s> ^io.output-link <ol>)
   (<ol> -^hello world)
-->
   (<s> ^operator <o> +)
   (<o> ^name output_link)
}

sp {any*apply*output
   (state <s> ^operator <o>
      ^io.output-link <ol>)
   (<o> ^name output_link)
-->
   (<ol> ^hello world)
   (write |Hello|)
}

CPP SML code to read the output link.

#include <iostream>
#include "sml_Client.h"

using namespace std;
using namespace sml;

void MyPrintEventHandler(smlPrintEventId id, void *pUserData, Agent *pAgent, char const *pMessage)
{
    // In this case the user data is a string we're building up
    std::string *pTrace = (std::string *)pUserData;

    (*pTrace) += pMessage;
}

int main(int argc, char const *argv[])
{
    Kernel *pKernel = Kernel::CreateKernelInNewThread();
    sml::Agent *pAgent = pKernel->CreateAgent("test");
    pAgent->LoadProductions("main.soar");

    pAgent->RunSelf(1);

    // Works as expected.
    pAgent->GetOutputLink()->GetParameterValue("hello"); 
    
    // This results in a segfault as expected, but no hint which read operation on the 
    // output link actually caused the segfault. 
    pAgent->GetOutputLink()->GetParameterValue("world"); 

    string dummy;
    cin >> dummy;
    return 0;
}

Potential update to improve the situation in sml_ClientIdentifier.h:

#include <stdexcept>

...


 char const* GetParameterValue(char const* pAttribute) const
{
    WMElement* pWME = FindByAttribute(pAttribute, 0) ;
    if (pWME == NULL){
          std::string err_msg = "[Error]: " + this->m_AttributeName + " has no child attribute: " + std::string(pAttribute);
          throw std::invalid_argument(err_msg);
    }
    return pWME->GetValueAsString();
}

A more subtle change would be to replace the throw with a print command, e.g. cout and do not throw an error to remain API compatible, checking for null pointer returns.

@ShadowJonathan
Copy link
Contributor

Doing throws would be more semantic imo, and writing to cout could be dangerous/bad in some scenarios since the SML bindings can be embedded in other programs

@moschmdt
Copy link
Author

If this is actually desired I would be willing to create a PR. One important remark: this could be considered a breaking API change since checking for returned null pointers no longer works and would require replacing with a try-catch structure.

Maybe someone with experience with SWIG could comment on the effects for the other language interfaces.

@moschmdt
Copy link
Author

There are also null pointer returns in WMElement conversions to identifier, string, int and float elements without error messages, which could also be updated in case the conversion is not implemented in inherited classes.

virtual Identifier* ConvertToIdentifier()
{
return NULL;
}

@garfieldnate
Copy link
Collaborator

Okay, thanks for opening this! It does seem like a serious usability issue, so it would be nice to address it.

Returning null and printing to cerr would be very easy; we already print to cerr throughout Soar, so I don't think that would be a new consideration for clients. Clients could still get a null pointer exception, but perhaps the source of it would be clearer.

I agree with Jonathan, though: I think that throwing an exception would be the clearest way to handle this. We don't use throw anywhere in the entire codebase except for the unit tests, though, which means that we've never seen what SWIG turns C++ exceptions into in the languages we have bindings for: Java, Python, Tcl and C#. Therefore, I think it would be best to test this once to confirm that the actual effect is acceptable (I'm sure it'd be better than a segfault, but what would the message be, and what object would the exception be?).

We could add one try-catch test to each of the SML tests to demonstrate the behavior; we currently run Java, Python and Tcl tests in CI already, but haven't been able to add a C# one yet.

this could be considered a breaking API change since checking for returned null pointers no longer works and would require replacing with a try-catch structure.

I don't follow. Existing code that checks for null would look like this, right?

    WMElement* pWME = FindByAttribute(pAttribute, 0) ;
    if (pWME == NULL){
        // handle here
    }
    const char* val = pWME->GetValueAsString();
    ...

Wouldn't this continue to work fine? It doesn't call GetParameterValue, so it would never hit an exception. Or were you thinking of a different current use-case?

@moschmdt
Copy link
Author

Yes, that could continue to work as long as the FindByAttribute is not changed as well in favour of a throw.

The follow up question would be if the FindByAttribute function should be changed instead of the GetParameterValue to catch all SML calls?

Error Messages

As far as error messages are concerned, I would expect to get something like this for the previous example:

ERROR: output-link has no child attribute world.

This includes information about the last valid WME as well as the child element that could not be found. What other information could be valuable?

SWIG

A quick search on SWIG and CPP exceptions:

@garfieldnate
Copy link
Collaborator

FindByAttribute has been documented since forever as returning NULL when the wme is not found, and I think that would be too big of a breaking change at this point, so I would rather address the issue in the functions that call it.

Regarding your note on ConvertToIdentifier: I can't actually find where that thing is implemented 💢 If you see what's happening there, please share a pointer with me.

I've fixed the main issue described in the thread and will open a PR shortly.

garfieldnate added a commit that referenced this issue May 22, 2024
Previous behavior here was to segfault! An exception is much more user-friendly.

Add the SWIG machinery for catching thrown exceptions in the SML binding
libraries, too. We wrap every SML function with a try/catch. I don't know if
this is actually heavy or not, but at least one implementation out there thought
it was and went an alternative route where each method has to be explicitly
marked to catch exceptions:
https://github.com/KiCad/kicad-source-mirror/blob/47e4ebb32a1366d60649879381eac819f7c7131d/common/swig/ki_exception.i#L41

Add tests for TCL, Python and Java that demonstrate the exceptions being
translated for each host language. We don't have C# tests yet T_T.

Notice we did have to add the `atexit` handler back to prevent a segfault when
the exception is not caught correctly; I don't know exactly why. Filed #464.

Fixes #451.
@garfieldnate garfieldnate linked a pull request May 22, 2024 that will close this issue
@garfieldnate
Copy link
Collaborator

@moschmdt Would you mind taking a look? #465. Thanks!

@moschmdt
Copy link
Author

It looks good to me. Thank you for implementing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants