Segfaults of omiagent when running on Kubernetes/CoreOS #78

Open
lagalbra opened this issue Oct 3, 2017 · 12 comments

@lagalbra

lagalbra commented Oct 3, 2017

Originally filed at microsoft/OMS-Agent-for-Linux#585 by @edevil

           PID: 33069 (omiagent)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Mon 2017-10-02 15:20:23 UTC (19h ago)
  Command Line: /opt/omi/bin/omiagent 9 12 --destdir / --providerdir /opt/omi/lib --loglevel WARNING
    Executable: /opt/omi/bin/omiagent
 Control Group: /kubepods/besteffort/podca6c5d89-a503-11e7-8e02-000d3a2709aa/27e2a4b7b2cb2a3a32c6cdb928d1be5ed5d74e2ff727d43fd4125c6ed8bf694a
         Slice: -.slice
       Boot ID: 32449a510934451eb8a0cc253dceabe8
    Machine ID: 9a13f6cf2ea8409dbc456655fa12ca04
      Hostname: node-1-vm
       Storage: /var/lib/systemd/coredump/core.omiagent.0.32449a510934451eb8a0cc253dceabe8.33069.1506957623000000.lz4
       Message: Process 33069 (omiagent) of user 0 dumped core.

           PID: 37755 (omiagent)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Mon 2017-10-02 15:21:36 UTC (19h ago)
  Command Line: /opt/omi/bin/omiagent 9 11 --destdir / --providerdir /opt/omi/lib --loglevel WARNING
    Executable: /opt/omi/bin/omiagent
 Control Group: /kubepods/besteffort/podca6c5d89-a503-11e7-8e02-000d3a2709aa/27e2a4b7b2cb2a3a32c6cdb928d1be5ed5d74e2ff727d43fd4125c6ed8bf694a
         Slice: -.slice
       Boot ID: 32449a510934451eb8a0cc253dceabe8
    Machine ID: 9a13f6cf2ea8409dbc456655fa12ca04
      Hostname: node-1-vm
       Storage: /var/lib/systemd/coredump/core.omiagent.0.32449a510934451eb8a0cc253dceabe8.37755.1506957696000000.lz4
       Message: Process 37755 (omiagent) of user 0 dumped core.
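
For anyone comparing several dumps, the Storage paths above already carry most of the metadata. The sketch below assumes the systemd-coredump naming layout `core.<comm>.<uid>.<boot-id>.<pid>.<timestamp in microseconds>.<ext>` and a command name without dots; `parse_core_name` is a hypothetical helper, not part of any OMI or systemd tooling.

```python
from datetime import datetime, timezone

def parse_core_name(path):
    """Split a systemd-coredump file name into its fields.

    Assumed layout: core.<comm>.<uid>.<boot-id>.<pid>.<usec-timestamp>.<ext>
    (assumes <comm> itself contains no dots).
    """
    name = path.rsplit("/", 1)[-1]
    parts = name.split(".")
    # parts[0] is the literal "core"; the next five fields are the metadata.
    comm, uid, boot_id, pid, usec = parts[1:6]
    when = datetime.fromtimestamp(int(usec) // 1_000_000, tz=timezone.utc)
    return {"comm": comm, "uid": int(uid), "boot_id": boot_id,
            "pid": int(pid), "time": when}

# The two Storage paths quoted in the report above.
cores = [
    "/var/lib/systemd/coredump/core.omiagent.0.32449a510934451eb8a0cc253dceabe8.33069.1506957623000000.lz4",
    "/var/lib/systemd/coredump/core.omiagent.0.32449a510934451eb8a0cc253dceabe8.37755.1506957696000000.lz4",
]
crashes = [parse_core_name(p) for p in cores]
gap = crashes[1]["time"] - crashes[0]["time"]

for c in crashes:
    print(c["comm"], c["pid"], c["time"].isoformat())
print("seconds between crashes:", gap.total_seconds())
```

The decoded timestamps line up with the Timestamp fields in the report, and the two crashes are a little over a minute apart within the same boot.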

Container version: "microsoft/oms:1.4.1-45"
OS: Container Linux by CoreOS stable (1465.8.0)

I can e-mail the core files but I'm not inclined to share them publicly.

@edevil Please email the core files to [email protected]

@edevil

edevil commented Oct 3, 2017

The response from the remote server was:
550 5.4.1 [[email protected]]: Recipient address rejected: Access denied [DM3NAM06FT010.Eop-nam06.prod.protection.outlook.com]

@lagalbra
Author

lagalbra commented Oct 3, 2017

Oops, my bad. The correct email address is [email protected]

@samisms
Contributor

samisms commented Oct 3, 2017

@edevil you can email me ([email protected])

@samisms
Contributor

samisms commented Oct 3, 2017

@edevil Can you provide the following details of your system?
OS version
docker version
docker images
docker ps -a

@edevil

edevil commented Oct 4, 2017

OS version: Container Linux by CoreOS 1465.8.0 (Ladybug)
Docker version:

Server:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.7.6
 Git commit:   a82d35e
 Built:        Wed Sep 20 22:27:13 2017
 OS/Arch:      linux/amd64
master-0-vm brpxuser # docker images
REPOSITORY                             TAG                 IMAGE ID            CREATED             SIZE
quay.io/coreos/hyperkube               v1.8.0_coreos.0     6c7699106132        4 days ago          532.2 MB
microsoft/oms                          1.4.1-45            0e70764dea9e        9 days ago          305.2 MB
edevil/fluentd-kubernetes              20160810-1          5dbb6d76ba11        14 months ago       561.7 MB
gcr.io/google_containers/pause-amd64   3.0                 99e59f495ffa        17 months ago       746.9 kB
master-0-vm brpxuser # docker ps -a
CONTAINER ID        IMAGE                                                                                               COMMAND                  CREATED             STATUS              PORTS               NAMES
d61a41c3cf97        edevil/fluentd-kubernetes@sha256:4b6e60acbd5f090907e3904b8593a778dbfa1b22408de1443aaf2c1ec3bfe24e   "td-agent"               24 hours ago        Up 24 hours                             k8s_fluentd-logz_fluentd-logz-pngv6_kube-system_fdfdc49b-d0f2-11e6-b156-000d3a2709aa_0
5df3b1156666        microsoft/oms@sha256:68103fa5599a133d4900f1b8b5da303e8a9849c786884a25f5e04bc3f3476d3a               "/opt/main.sh"           24 hours ago        Up 24 hours                             k8s_omsagent_omsagent-btkmn_kube-system_ca6c70ed-a503-11e7-8e02-000d3a2709aa_0
4c7b9e0ef461        gcr.io/google_containers/pause-amd64:3.0                                                            "/pause"                 24 hours ago        Up 24 hours                             k8s_POD_fluentd-logz-pngv6_kube-system_fdfdc49b-d0f2-11e6-b156-000d3a2709aa_0
15669fa95b0c        gcr.io/google_containers/pause-amd64:3.0                                                            "/pause"                 24 hours ago        Up 24 hours                             k8s_POD_omsagent-btkmn_kube-system_ca6c70ed-a503-11e7-8e02-000d3a2709aa_0
56f3a66f0994        quay.io/coreos/hyperkube@sha256:8755aefadd070df7b26e49ce2998209547eca7bd4054e5dbb434615407374753    "/hyperkube scheduler"   24 hours ago        Up 24 hours                             k8s_kube-scheduler_kube-scheduler-master-0-vm_kube-system_072608a5c1eeb6414b3a04e5236bca36_0
6041749f03f4        quay.io/coreos/hyperkube@sha256:8755aefadd070df7b26e49ce2998209547eca7bd4054e5dbb434615407374753    "/hyperkube apiserver"   24 hours ago        Up 24 hours                             k8s_kube-apiserver_kube-apiserver-master-0-vm_kube-system_1d78cac468b7188e2cf076db803a3f37_0
e54178fbbf88        quay.io/coreos/hyperkube@sha256:8755aefadd070df7b26e49ce2998209547eca7bd4054e5dbb434615407374753    "/hyperkube proxy --c"   24 hours ago        Up 24 hours                             k8s_kube-proxy_kube-proxy-master-0-vm_kube-system_848b91a5ffa7b345fcb94c26f8e11129_0
1aca6be858eb        quay.io/coreos/hyperkube@sha256:8755aefadd070df7b26e49ce2998209547eca7bd4054e5dbb434615407374753    "/hyperkube controlle"   24 hours ago        Up 24 hours                             k8s_kube-controller-manager_kube-controller-manager-master-0-vm_kube-system_f0f3d04caecdbbd446b6f0844ccbcc69_0
875780451ef8        gcr.io/google_containers/pause-amd64:3.0                                                            "/pause"                 24 hours ago        Up 24 hours                             k8s_POD_kube-apiserver-master-0-vm_kube-system_1d78cac468b7188e2cf076db803a3f37_0
763065ac103c        gcr.io/google_containers/pause-amd64:3.0                                                            "/pause"                 24 hours ago        Up 24 hours                             k8s_POD_kube-scheduler-master-0-vm_kube-system_072608a5c1eeb6414b3a04e5236bca36_0
b000e3f05abb        gcr.io/google_containers/pause-amd64:3.0                                                            "/pause"                 24 hours ago        Up 24 hours                             k8s_POD_kube-proxy-master-0-vm_kube-system_848b91a5ffa7b345fcb94c26f8e11129_0
af5e58a97e7a        gcr.io/google_containers/pause-amd64:3.0                                                            "/pause"                 24 hours ago        Up 24 hours                             k8s_POD_kube-controller-manager-master-0-vm_kube-system_f0f3d04caecdbbd446b6f0844ccbcc69_0

@cyakimov

We're seeing lots of errors like the following on AKS nodes as well:

omiagent[45431]: segfault at 30 ip 00007fd96460f969 sp 00007ffeb9251820 error 4 in libcontainer.so[7fd9645c5000+87000]
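
A kernel segfault line like this one can be decoded a bit further before a core file is available. The sketch below is a hypothetical helper (`decode_segfault` is not part of any agent tooling) that assumes the usual x86 Linux log format: hexadecimal fields, and a page-fault error code whose bit 0 means the page was present, bit 1 means a write, and bit 2 means user mode.

```python
import re

# Assumed format: "<comm>[<pid>]: segfault at <addr> ip <ip> sp <sp> error <code>
#                  in <object>[<base>+<size>]" with all numeric fields in hex.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def decode_segfault(line):
    """Return a dict of decoded fields, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "page_present": bool(err & 1),  # False: the page was not mapped
        "write": bool(err & 2),         # False: the access was a read
        "user_mode": bool(err & 4),     # True: fault raised from user space
    }
    if m["obj"]:
        info["object"] = m["obj"]
        # Offset of the faulting instruction inside the mapped object.
        info["offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info

line = ("omiagent[45431]: segfault at 30 ip 00007fd96460f969 "
        "sp 00007ffeb9251820 error 4 in libcontainer.so[7fd9645c5000+87000]")
info = decode_segfault(line)
print(info["object"], hex(info["offset"]), hex(info["fault_addr"]))
```

For the line above this gives a user-mode read of an unmapped page at address 0x30, with the faulting instruction at offset 0x4a969 inside libcontainer.so. A fault address that low is consistent with reading a field through a NULL pointer, though the log line alone does not prove that. Given a matching build of the library with symbols, the offset could be resolved with a tool such as `addr2line -e libcontainer.so 0x4a969`.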

@samisms
Contributor

samisms commented May 17, 2018

@cyakimov I no longer work on this. Can you email [email protected]? You will get a quicker response there.

@cricketfan5

@here Can someone help me with the issue below?
I am installing the omsagent as a DaemonSet on Kubernetes hosts. The installation succeeded with no errors, but the host is not turning green in the portal (the green tick mark stating that it is being monitored). I see the error below in this path

Error:Unsupported Operation system: COREOS 1632.2.1.

I am installing OMS agent version 1.6.0-42 on CoreOS with the following version:
VERSION=1576.5.0
VERSION_ID=1576.5.0
BUILD_ID=2018-01-05-1121
PRETTY_NAME="Container Linux by CoreOS 1576.5.0 (Ladybug)"
Any help is appreciated. Thanks in advance.
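
One detail worth checking in a report like this is whether the version named in the error matches the host's os-release data. The sketch below uses only the strings quoted in this comment; `parse_os_release` is a hypothetical, simplified parser (it strips double quotes only and ignores escapes), and the regular expression for the error line is an assumption about its shape.

```python
import re

def parse_os_release(text):
    """Parse KEY=value lines in the style of /etc/os-release (simplified)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip('"')
    return fields

# Strings copied from the comment above.
os_release = """VERSION=1576.5.0
VERSION_ID=1576.5.0
BUILD_ID=2018-01-05-1121
PRETTY_NAME="Container Linux by CoreOS 1576.5.0 (Ladybug)"
"""
error_line = "Error:Unsupported Operation system: COREOS 1632.2.1."

host = parse_os_release(os_release)
match = re.search(r"COREOS (\d+(?:\.\d+)*)", error_line)
reported = match.group(1) if match else None

print("host VERSION_ID:", host["VERSION_ID"])
print("version in error:", reported)
print("match:", host["VERSION_ID"] == reported)
```

Here the two versions differ (1576.5.0 on the host, 1632.2.1 in the error). The thread does not say why; the error may have come from a different node, or the agent may detect the OS version some other way.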

keikhara added the bug label Jun 27, 2018
@keikhara
Contributor

@edevil Which version of the agent is this? Have you tried the latest, which is 1.6.0-42?

@edevil

edevil commented Jun 27, 2018

@keikhara Sorry, I'm not using Azure at this time.

@vishiy
Member

vishiy commented Sep 12, 2018

We found and fixed more segfaults in JSON parsing. The fixes will be included in the upcoming releases.

@vishiy
Member

vishiy commented Sep 13, 2018

All: for the non-AKS containerized agent, can you please try the images below, which have the segfault fixes?
1.6.0-163
latest (as of 9/13/2018)
