This repository was archived by the owner on Aug 5, 2022. It is now read-only.

Connection error while getting data from ExternalService #97

Open
@chendave

Dear developers,

At one point PODM reported the computer systems as being in the "InTest" state, and I was unable to compose nodes. I looked up that state in "pod-manager-user-guide-v2-1.pdf" and then ran the command below:

$ sudo /usr/bin/pod-manager-clean-database-on-next-startup

and then restarted the pod-manager service:

$ sudo systemctl restart pod-manager

But now I cannot get any computer systems back:
$ curl -k -u admin:admin https://10.3.0.1:8443/redfish/v1/Systems
{
"@odata.context" : "/redfish/v1/$metadata#Systems",
"@odata.id" : "/redfish/v1/Systems",
"@odata.type" : "#ComputerSystemCollection.ComputerSystemCollection",
"Name" : "Computer System Collection",
"Description" : "Computer System Collection",
"Members@odata.count" : 0,
"Members" : [ ]
}

/var/log/pod-manager/pod-manager-application.log gives some hints about this abnormal behavior:
...
WARN c.i.p.d.external.DiscoveryRunner - Connection error while getting data from ExternalService {UUID=4c4c4544-434d-1001-8000-d0946609a764, baseUri=http://10.3.2.248:80/redfish/v1, type=PSME, unreachableSince=2018-10-17T01:27:45.322} service - performing check on this service
2018-10-17 02:22:41,120 [EE-ManagedScheduledExecutorService-TasksExecutor-Thread-5] DEBUG c.i.p.d.e.ExternalServiceAvailabilityCheckerTask - Verifying service with UUID 4c4c4544-434d-1001-8000-d0946609a764
2018-10-17 02:22:41,783 [EE-ManagedScheduledExecutorService-TasksExecutor-Thread-5] DEBUG c.i.p.d.e.ExternalServiceAvailabilityCheckerTask - Service ExternalService {UUID=4c4c4544-434d-1001-8000-d0946609a764, baseUri=http://10.3.2.248:80/redfish/v1, type=PSME, unreachableSince=2018-10-17T01:27:45.322} still exists
...
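
To see which services PODM currently considers unreachable, the `ExternalService {...}` fragments in the log can be pulled out and grouped by UUID. The following is only a rough sketch based on the line format in the excerpt above; the function name `unreachable_services` and the regular expression are my own assumptions, not part of PODM.

```python
import re

# Pattern written against the "ExternalService {...}" fragment shown in the
# log excerpt; other PODM versions may format this differently (assumption).
SERVICE_RE = re.compile(
    r"ExternalService \{UUID=(?P<uuid>[0-9a-fA-F-]+), "
    r"baseUri=(?P<base_uri>[^,]+), "
    r"type=(?P<type>[^,}]+), "
    r"unreachableSince=(?P<unreachable_since>[^}]+)\}"
)


def unreachable_services(log_lines):
    """Return {uuid: fields} for every service mentioned with unreachableSince.

    When a UUID appears on several lines, the last occurrence wins.
    """
    found = {}
    for line in log_lines:
        for match in SERVICE_RE.finditer(line):
            found[match.group("uuid")] = match.groupdict()
    return found
```

Passing the open log file (for example `open("/var/log/pod-manager/pod-manager-application.log")`) as `log_lines` yields one record per service, which makes it easier to check whether only the PSME at 10.3.2.248 is affected.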

But the network and the PSME service are fine; I can connect and get the systems back when I query the PSME directly:
$ curl http://10.3.2.248:80/redfish/v1/Systems
{
"@odata.context": "/redfish/v1/$metadata#Systems",
"@odata.id": "/redfish/v1/Systems",
"@odata.type": "#ComputerSystemCollection.ComputerSystemCollection",
"Name": "Computer System Collection",
"Members@odata.count": 5,
"Members": [
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block2-Sled2-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block2-Sled4-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block3-Sled1-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block3-Sled2-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block3-Sled3-Node1"
}
]
}
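
The mismatch between the two responses can also be checked mechanically. This is a minimal sketch, assuming both responses were saved as JSON text; the helper names `summarize_collection` and `undiscovered_count` are hypothetical. It reads the standard Redfish `Members@odata.count` annotation and the `Members` array, and compares only counts, since I am not sure PODM keeps the same resource IDs as the PSME.

```python
import json


def summarize_collection(text):
    """Parse a Redfish collection response and list its members."""
    doc = json.loads(text)
    members = [entry["@odata.id"] for entry in doc.get("Members", [])]
    return {
        # Count as declared by the service itself (None if the key is absent).
        "declared_count": doc.get("Members@odata.count"),
        "member_ids": members,
    }


def undiscovered_count(podm_text, psme_text):
    """How many more systems the PSME lists than PODM currently exposes."""
    podm = summarize_collection(podm_text)
    psme = summarize_collection(psme_text)
    return len(psme["member_ids"]) - len(podm["member_ids"])
```

With the two outputs shown in this report, `undiscovered_count` would give 5, i.e. none of the PSME's systems have been picked up by PODM.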

I found a similar issue, #58, and it looks like this is related to the service UUID. How can I purge all of that data and poll everything again? Is there a configuration item I need to update to fix this? What is the root cause?

Thanks a lot for any input!

pod-manager-application.log
