This repository has been archived by the owner on Aug 5, 2022. It is now read-only.

Connection error while getting data from ExternalService #97

Open
chendave opened this issue Oct 17, 2018 · 1 comment

Comments

@chendave

Dear developers,

At one point PODM reported that the computer systems were in the "InTest" state, and I was unable to compose nodes. I looked up that state in "pod-manager-user-guide-v2-1.pdf" and then ran the following command:

$ sudo /usr/bin/pod-manager-clean-database-on-next-startup

and then restarted the pod-manager service:

$ sudo systemctl restart pod-manager

But now PODM no longer returns any computer systems:
$ curl -k -u admin:admin https://10.3.0.1:8443/redfish/v1/Systems
{
"@odata.context" : "/redfish/v1/$metadata#Systems",
"@odata.id" : "/redfish/v1/Systems",
"@odata.type" : "#ComputerSystemCollection.ComputerSystemCollection",
"Name" : "Computer System Collection",
"Description" : "Computer System Collection",
"Members@odata.count" : 0,
"Members" : [ ]
}

/var/log/pod-manager/pod-manager-application.log gives some hints about this abnormal behavior:
...
WARN c.i.p.d.external.DiscoveryRunner - Connection error while getting data from ExternalService {UUID=4c4c4544-434d-1001-8000-d0946609a764, baseUri=http://10.3.2.248:80/redfish/v1, type=PSME, unreachableSince=2018-10-17T01:27:45.322} service - performing check on this service
2018-10-17 02:22:41,120 [EE-ManagedScheduledExecutorService-TasksExecutor-Thread-5] DEBUG c.i.p.d.e.ExternalServiceAvailabilityCheckerTask - Verifying service with UUID 4c4c4544-434d-1001-8000-d0946609a764
2018-10-17 02:22:41,783 [EE-ManagedScheduledExecutorService-TasksExecutor-Thread-5] DEBUG c.i.p.d.e.ExternalServiceAvailabilityCheckerTask - Service ExternalService {UUID=4c4c4544-434d-1001-8000-d0946609a764, baseUri=http://10.3.2.248:80/redfish/v1, type=PSME, unreachableSince=2018-10-17T01:27:45.322} still exists
...
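To see at a glance which services PODM considers unreachable, the log can be filtered down to the ExternalService entries. This is only a rough sketch: the function name list_unreachable_services is made up for illustration, and it assumes every entry has the same "{UUID=..., baseUri=..., type=..., unreachableSince=...}" layout as the lines quoted above.

```shell
# Hypothetical helper (not part of PODM): read log text on stdin and print
# "UUID baseUri unreachableSince" once per distinct unreachable service.
# Assumes the ExternalService {...} layout seen in the excerpt above.
list_unreachable_services() {
  grep -o 'ExternalService {UUID=[^}]*}' \
    | sed -E 's/.*UUID=([^,]+), baseUri=([^,]+), type=[^,]+, unreachableSince=([^}]+).*/\1 \2 \3/' \
    | sort -u
}

# Example (path taken from this report):
#   sudo cat /var/log/pod-manager/pod-manager-application.log | list_unreachable_services
```

If the same UUID keeps showing up with an old unreachableSince timestamp, that would suggest PODM is still holding on to the stale service record.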

But the network and the PSME service are fine: I can connect to the PSME and get the systems back when I query it directly:
$ curl http://10.3.2.248:80/redfish/v1/Systems
{
"@odata.context": "/redfish/v1/$metadata#Systems",
"@odata.id": "/redfish/v1/Systems",
"@odata.type": "#ComputerSystemCollection.ComputerSystemCollection",
"Name": "Computer System Collection",
"Members@odata.count": 5,
"Members": [
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block2-Sled2-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block2-Sled4-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block3-Sled1-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block3-Sled2-Node1"
},
{
"@odata.id": "/redfish/v1/Systems/Rack1-Block3-Sled3-Node1"
}
]
}
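To make the mismatch between the two responses easy to check, the member links can be pulled out of each collection and compared. This is a minimal sketch that uses only grep and sed; list_member_ids is a hypothetical helper name, and it assumes the members live under the collection path passed as the first argument.

```shell
# Hypothetical helper: read a Redfish collection (JSON) on stdin and print the
# @odata.id of each member under the given collection path, one per line.
# The collection's own @odata.id is skipped because it has no trailing "/<member>".
list_member_ids() {
  collection_path=$1
  grep -oE "\"@odata\.id\"[[:space:]]*:[[:space:]]*\"${collection_path}/[^\"]+\"" \
    | sed -E 's/.*"([^"]+)"$/\1/'
}

# Example, using the two endpoints from this report:
#   curl -s http://10.3.2.248:80/redfish/v1/Systems | list_member_ids /redfish/v1/Systems
#   curl -s -k -u admin:admin https://10.3.0.1:8443/redfish/v1/Systems | list_member_ids /redfish/v1/Systems
```

Here the first command should list the five PSME systems while the second prints nothing, which matches the outputs shown above.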

I found a similar issue, #58, and it looks like this is related to the service UUID. How can I purge all of that data and have PODM poll everything again? Is there a configuration item I need to update to fix this? What is the root cause?

Thanks a lot for any input!

pod-manager-application.log

@chendave

BTW, the PODM version I am using is 2.1.
