Open
Conversation
ozbenh
pushed a commit
that referenced
this pull request
Nov 6, 2015
qpci_msix_pending() writes on pba region, causing qemu to SEGV: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffff7fba8c0 (LWP 25882)] 0x0000000000000000 in ?? () (gdb) bt #0 0x0000000000000000 in () #1 0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434 #2 0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506 #3 0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176 #4 0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439 #5 0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534 #6 0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80 #7 0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378 #8 0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569 #9 0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581 #10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306 #11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876 #12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122 (without this patch, this can be reproduced with the ivshmem qtest) Implement an empty mmio write to avoid the crash. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 6, 2015
A new gdb commands are added:
qemu handlers
That dumps an AioContext list (by default qemu_aio_context)
possibly including a backtrace for cases it knows about
(with the verbose option). Intended to help find why something
is hanging waiting for IO.
Use 'qemu handlers --verbose iohandler_ctx' to find out why
your incoming migration is stuck.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1445951385-11924-1-git-send-email-dgilbert@redhat.com
V2:
Merge into one command with optional handlers arg, and only do
backtrace in verbose mode
(gdb) qemu handlers
----
{pfd = {fd = 6, events = 25, revents = 0}, io_read = 0x55869656ffd0
<event_notifier_dummy_cb>, io_write = 0x0, deleted = 0, opaque =
0x558698c4ce08, node = {le_next = 0x0, le_prev = 0x558698c4cdc0}}
(gdb) qemu handlers iohandler_ctx
----
{pfd = {fd = 9, events = 25, revents = 0}, io_read = 0x558696581380
<fd_coroutine_enter>, io_write = 0x0, deleted = 0, opaque =
0x558698dc99d0, node = {le_next = 0x558698c4cca0, le_prev =
0x558698c4c1d0}}
----
{pfd = {fd = 4, events = 25, revents = 0}, io_read = 0x55869657b330
<sigfd_handler>, io_write = 0x0, deleted = 0, opaque = 0x4, node =
{le_next = 0x558698c4c260, le_prev = 0x558699f72508}}
----
{pfd = {fd = 5, events = 25, revents = 0}, io_read = 0x55869656ffd0
<event_notifier_dummy_cb>, io_write = 0x0, deleted = 0, opaque =
0x558698c4c218, node = {le_next = 0x0, le_prev = 0x558698c4ccc8}}
----
(gdb) qemu handlers --verbose iohandler_ctx
----
{pfd = {fd = 9, events = 25, revents = 0}, io_read = 0x558696581380
<fd_coroutine_enter>, io_write = 0x0, deleted = 0, opaque =
0x558698dc99d0, node = {le_next = 0x558698c4cca0, le_prev =
0x558698c4c1d0}}
#0 0x0000558696581820 in qemu_coroutine_switch
(from_=from_@entry=0x558698cb3cf0, to_=to_@entry=0x7f421c37eac8,
action=action@entry=COROUTINE_YIELD) at
/home/dgilbert/git/qemu/coroutine-ucontext.c:177
#1 0x0000558696580c00 in qemu_coroutine_yield () at
/home/dgilbert/git/qemu/qemu-coroutine.c:145
#2 0x00005586965814f5 in yield_until_fd_readable (fd=9) at
/home/dgilbert/git/qemu/qemu-coroutine-io.c:90
#3 0x0000558696523937 in socket_get_buffer (opaque=0x55869a3dc620,
buf=0x558698c505a0 "", pos=<optimized out>, size=32768) at
/home/dgilbert/git/qemu/migration/qemu-file-unix.c:101
#4 0x0000558696521fac in qemu_fill_buffer (f=0x558698c50570) at
/home/dgilbert/git/qemu/migration/qemu-file.c:227
#5 0x0000558696522989 in qemu_peek_byte (f=0x558698c50570, offset=0)
at /home/dgilbert/git/qemu/migration/qemu-file.c:507
#6 0x0000558696522bf4 in qemu_get_be32 (f=0x558698c50570) at
/home/dgilbert/git/qemu/migration/qemu-file.c:520
#7 0x0000558696522bf4 in qemu_get_be32 (f=f@entry=0x558698c50570)
at /home/dgilbert/git/qemu/migration/qemu-file.c:604
#8 0x0000558696347e5c in qemu_loadvm_state (f=f@entry=0x558698c50570)
at /home/dgilbert/git/qemu/migration/savevm.c:1821
#9 0x000055869651de8c in process_incoming_migration_co
(opaque=0x558698c50570)
at /home/dgilbert/git/qemu/migration/migration.c:336
#10 0x000055869658188a in coroutine_trampoline (i0=<optimized out>,
i1=<optimized out>)
at /home/dgilbert/git/qemu/coroutine-ucontext.c:80
#11 0x00007f420f05df10 in __start_context () at /lib64/libc.so.6
#12 0x00007ffc40815f50 in ()
#13 0x0000000000000000 in ()
----
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
554592d to
de26bb4
Compare
7643b1c to
7ecd38c
Compare
We don't use the resulting accessors and this gets in the way of the split I/D TLB work. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We rework the way the MMU indices are calculated, providing separate indices for I and D side based on MSR:IR and MSR:DR respectively, and thus no longer need to flush the TLB on context changes. This also adds correct support for HV as a separate address space. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On ppc64 especially, we flush the tlb on any slbie or tlbie instruction.
However, those instructions often come in bursts of 3 or more (context
switch will favor a series of slbie's for example to an slbia if the
SLB has less than a certain number of entries in it, and tlbie's can
happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time.
Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...
Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.
This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.
This reduces the number of tlb_flush() on a boot to a ubuntu installer
first dialog screen from roughly 360K down to 36K.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: added a 'CPUPPCState *' variable in h_remove() and
h_bulk_remove() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
We don't give them a KVM reg number yet as no current KVM version supports HV mode. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: SPRs AMOR,DAWR,DARWX were already included in commit f401dd3] Signed-off-by: Cédric Le Goater <clg@kaod.org>
This helper is only used by the various instructions that can alter MSR and not interrupts. Add a comment to that effect to the interrupt code as well in case somebody wants to change this Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
This reworks emulation of the various "rfi" variants. I removed some masking bits that I couldn't make sense of, the only bit that I am aware we should mask here is POW, the CPU's MSR mask should take care of the rest. This also fixes some problems when running 32-bit userspace under a 64-bit kernel. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
We use an env. flag which is set to the initial value of MSR_HVB in the msr_mask. We also adjust the POWER8 mask to set SHV. Also use this to adjust ctx.hv so that it is *set* when the processor doesn't have an HV mode (970 with Apple mode for example), thus enabling hypervisor instructions/SPRs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Not that anything remotely recent supports tlbia but ... Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Otherwise it will trip on the forms used in recent architecture. Ideally, we should have different handlers for different architecture levels but our current implementation of TLB flushing is dumb enough that this will do for now. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Otherwise tight loops at smt_low for example, which OPAL does, eat so much CPU that we can't boot a kernel anymore. With that, I can boot 8 CPUs just fine with powernv. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
This will enable decoding of hrfid Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Also use it to clamp the max SMT mode and ensure that the cpu_dt_id are offset by that value in order to preserve consistency with the HW implementations. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
And move the code adjusting the MSR mask and calling kvmppc_set_papr() to it. This allows us to add a few more things such as disabling setting of MSR:HV and appropriate LPCR bits which will be used when fixing the exception model. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Properly implement LPES0/1 handling for HV vs. !HV mode and fix AIL implementation. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We were initializing unused ones and missing some Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Under some circumstances, we need to direct ISI and DSI interrupts at the hypervisor, turning them into HISI/HDSI, and using different SPRs (HDSISR and HDAR) depending on the combination of MSR_DR and the corresponding VPM bits in LPCR. This moves part of the code into helpers that are fixed to select the right exception type and registers. On pre-P7 processors, LPCR is 0 which provides the old behaviour of directing the interrupts at the supervisor. Thanks to Andrei Warkentin for finding a bug when HV=1 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Recent server processors use the Hypervisor Emulation Assistance interrupt for illegal instructions and *some* type of SPR accesses. Also the code was always generating inval instructions even for priv violations due to setting the wrong flags Finally, the checking for PR/HV was open coded everywhere. This reworks it all, using little helper macros for checking, and adding the HV interrupt (which gets converted back to program check in the slow path of excp_helper.c on CPUs that don't want it). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Those instructions are only available in hypervisor real mode and allow cache inhibited garded access to devices in that mode. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
ozbenh
pushed a commit
that referenced
this pull request
Aug 7, 2016
The vnc_server_info_get will allocate the VncServerInfo struct and then call vnc_init_basic_info_from_server_addr to populate the basic fields. If this returns an error though, the qapi_free_VncServerInfo call will then crash because the VncServerInfo struct instance was not properly NULL-initialized and thus contains random stack garbage. #0 0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6 #1 0x00007f1987c902fa in abort () at /lib64/libc.so.6 #2 0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6 #3 0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6 #4 0x00007f1987cdb2ac in free () at /lib64/libc.so.6 #5 0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0 #6 0x0000559193cdcf54 in visit_type_str (v=v@entry= 0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80) at qapi/qapi-visit-core.c:255 #7 0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry= 0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307 #8 0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry= 0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632 #9 0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry= 0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658 #10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970 #11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233 #12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284 #13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=tru e, websocket=websocket@entry=false) at ui/vnc.c:3039 #14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/vnc.c:3877 #15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90) at qmp-marshal.c:105 #16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /home/berrange/src/virt/qemu/monitor.c:3971 #17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=1 2) at qobject/json-streamer.c:105 #18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false) at qobject/json-lexer.c:319 #19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369 #20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124 #21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /home/berrange/src/virt/qemu/monitor.c:3987 #22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900) at qemu-char.c:2895 #23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 #24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213 #25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258 #26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506 #27 0x0000559193964c55 in main () at vl.c:1908 #28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603 This was introduced in commit 98481bf Author: Eric Blake <eblake@redhat.com> Date: Mon Oct 26 16:34:45 2015 -0600 vnc: Hoist allocation of VncBasicInfo to callers which added error reporting for vnc_init_basic_info_from_server_addr but didn't change the g_malloc calls to g_malloc0. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1470134726-15697-3-git-send-email-berrange@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Spotted by ASAN:
=================================================================
==27907==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 4120 byte(s) in 1 object(s) allocated from:
#0 0x7f913458ce50 in calloc (/lib64/libasan.so.5+0xeee50)
#1 0x7f9133fd641d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
#2 0x5561c6643c95 in qdict_crumple_test_recursive /home/elmarco/src/qq/tests/check-block-qdict.c:438
#3 0x7f9133ff7c49 (/lib64/libglib-2.0.so.0+0x73c49)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-2-marcandre.lureau@redhat.com>
[Screwed up in commit 2860b2b]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Spotted by ASAN, during make check...
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48)
#1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
#2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67
#3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24
#4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144
#5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506
#6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569
#7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306
#8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357
#9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561
#10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592
#11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257
#12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105
#13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323
#14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373
#15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124
#16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337
#17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
#18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
#19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
#20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
#21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com>
[Screwed up in commit b273145]
Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
if qio_channel_rdma_readv return QIO_CHANNEL_ERR_BLOCK, the destination qemu
crash.
The backtrace is:
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080,
io_read=0x8db841 <qio_channel_restart_read>, io_write=0x0, opaque=0x38111e0) at io/channel.c:
#2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438
#3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47
#4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327
at migration/qemu-file-channel.c:83
#5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299
#6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562
#7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575
#8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655
#9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126
#10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366
#11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1
#12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6
#13 0x00007f96fe858760 in ?? ()
#14 0x0000000000000000 in ?? ()
RDMA QIOChannel not implement io_set_aio_fd_handler. so
qio_channel_set_aio_fd_handler will access NULL pointer.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
when qio_channel_read return QIO_CHANNEL_ERR_BLOCK, the source qemu crash.
The backtrace is:
(gdb) bt
#0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6
#1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6
#2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460
#5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768)
at migration/qemu-file-channel.c:83
#6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299
#7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562
#8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575
#9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647
#10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794
#11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504
#12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6
This patch fixed by invoke qio_channel_yield only when qemu_in_coroutine().
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Because RDMA QIOChannel not implement shutdown function,
If the to_dst_file was set error, the return path thread
will wait forever. and the migration thread will wait
return path thread exit.
the backtrace of return path thread is:
(gdb) bt
#0 0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6
#1 0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000)
at qemu-timer.c:325
#2 0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000)
at migration/rdma.c:1501
#3 0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000,
byte_len=0x7ef7091d0640) at migration/rdma.c:1580
#4 0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000,
head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
#5 0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720,
expecting=3) at migration/rdma.c:1903
#6 0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8,
size=32768) at migration/rdma.c:2714
#7 0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232
#8 0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0)
at migration/qemu-file.c:502
#9 0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515
#10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591
#11 0x00000000006a46d3 in source_return_path_thread (
opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1331
#12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f372a77635d in clone () from /lib64/libc.so.6
the backtrace of migration thread is:
(gdb) bt
#0 0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
#1 0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 <current_migration.37100+88>)
at util/qemu-thread-posix.c:504
#2 0x00000000006a4bc5 in await_return_path_close_on_source (
ms=0xd826a0 <current_migration.37100>) at migration/migration.c:1460
#3 0x00000000006a53e4 in migration_completion (s=0xd826a0 <current_migration.37100>,
current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980)
at migration/migration.c:1695
#4 0x00000000006a5c54 in migration_thread (opaque=0xd826a0 <current_migration.37100>)
at migration/migration.c:1837
#5 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f372a77635d in clone () from /lib64/libc.so.6
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
tests/cdrom-test -p /x86_64/cdrom/boot/megasas
Produces the following ASAN leak.
==25700==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7f06f8faac48 in malloc (/lib64/libasan.so.5+0xeec48)
#1 0x7f06f87a73c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
#2 0x55a729f17738 in pci_dma_sglist_init /home/elmarco/src/qq/include/hw/pci/pci.h:818
#3 0x55a729f2a706 in megasas_map_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:698
#4 0x55a729f39421 in megasas_handle_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:1574
#5 0x55a729f3f70d in megasas_handle_frame /home/elmarco/src/qq/hw/scsi/megasas.c:1955
#6 0x55a729f40939 in megasas_mmio_write /home/elmarco/src/qq/hw/scsi/megasas.c:2119
#7 0x55a729f41102 in megasas_port_write /home/elmarco/src/qq/hw/scsi/megasas.c:2170
#8 0x55a729220e60 in memory_region_write_accessor /home/elmarco/src/qq/memory.c:527
#9 0x55a7292212b3 in access_with_adjusted_size /home/elmarco/src/qq/memory.c:594
#10 0x55a72922cf70 in memory_region_dispatch_write /home/elmarco/src/qq/memory.c:1473
#11 0x55a7290f5907 in flatview_write_continue /home/elmarco/src/qq/exec.c:3255
#12 0x55a7290f5ceb in flatview_write /home/elmarco/src/qq/exec.c:3294
#13 0x55a7290f6457 in address_space_write /home/elmarco/src/qq/exec.c:3384
#14 0x55a7290f64a8 in address_space_rw /home/elmarco/src/qq/exec.c:3395
#15 0x55a72929ecb0 in kvm_handle_io /home/elmarco/src/qq/accel/kvm/kvm-all.c:1729
#16 0x55a7292a0db5 in kvm_cpu_exec /home/elmarco/src/qq/accel/kvm/kvm-all.c:1969
#17 0x55a7291c4212 in qemu_kvm_cpu_thread_fn /home/elmarco/src/qq/cpus.c:1215
#18 0x55a72a966a6c in qemu_thread_start /home/elmarco/src/qq/util/qemu-thread-posix.c:504
#19 0x7f06ed486593 in start_thread (/lib64/libpthread.so.0+0x7593)
Move the qemu_sglist_destroy() from megasas_complete_command() to
megasas_unmap_frame(), so map/unmap are balanced.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180814141247.32336-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Spotted by ASAN:
=================================================================
==5378==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 65536 byte(s) in 1 object(s) allocated from:
#0 0x7f788f83bc48 in malloc (/lib64/libasan.so.5+0xeec48)
#1 0x7f788c9923c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
#2 0x5622a1fe37bc in coroutine_trampoline /home/elmarco/src/qq/util/coroutine-ucontext.c:116
#3 0x7f788a15d75f in __correctly_grouped_prefixwc (/lib64/libc.so.6+0x4c75f)
(Broken in commit 4c8158e.)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180809114417.28718-3-marcandre.lureau@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
We need to set cs->halted to 1 before calling ppc_set_compat. The reason
is that ppc_set_compat kicks up the new thread created to manage the
hotplugged KVM virtual CPU and the code drives directly to KVM_RUN
ioctl. When cs->halted is 1, the code:
int kvm_cpu_exec(CPUState *cpu)
...
if (kvm_arch_process_async_events(cpu)) {
atomic_set(&cpu->exit_request, 0);
return EXCP_HLT;
}
...
returns before it reaches KVM_RUN, giving time to the main thread to
finish its job. Otherwise we can fall in a deadlock because the KVM
thread will issue the KVM_RUN ioctl while the main thread is setting up
KVM registers. Depending on how these jobs are scheduled we'll end up
freezing QEMU.
The following output shows kvm_vcpu_ioctl sleeping because it cannot get
the mutex and never will.
PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr.
STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL
PID: 61564 TASK: c000003e981e0780 CPU: 48 COMMAND: "qemu-system-ppc"
#0 [c000003e982679a0] __schedule at c000000000b10a44
#1 [c000003e98267a60] schedule at c000000000b113a8
#2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910
#3 [c000003e98267ab0] __mutex_lock at c000000000b132ec
#4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm]
#5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30
#6 [c000003e98267dc0] ksys_ioctl at c000000000408674
#7 [c000003e98267e10] sys_ioctl at c0000000004086f8
#8 [c000003e98267e30] system_call at c00000000000b488
crash> struct -x kvm.vcpus 0xc000003da0000000
vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...}
crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000
mutex.owner = {
counter = 0xc000003a23a5c881 <- flag 1: waiters
},
crash> bt 0xc000003a23a5c880
PID: 61579 TASK: c000003a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM"
(active)
crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000
mutex.wait_list = {
next = 0xc000003e98267b10,
prev = 0xc000003e98267b10
},
crash> struct -x mutex_waiter.task 0xc000003e98267b10
task = 0xc000003e981e0780
The following command-line was used to reproduce the problem (note: gdb
and trace can change the results).
$ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \
-enable-kvm -m 4096 \
-smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \
-display none -nographic \
-drive file=disk1.qcow2,format=qcow2
...
(qemu) device_add host-spapr-cpu-core,core-id=4
[no interaction is possible after it, only SIGKILL to take the terminal
back]
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Spotted by ASAN doing some manual testing:
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f5fcdc75e50 in calloc (/lib64/libasan.so.5+0xeee50)
#1 0x7f5fcd47241d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
#2 0x55f989be92ce in timer_new /home/elmarco/src/qq/include/qemu/timer.h:561
#3 0x55f989be92ff in timer_new_ms /home/elmarco/src/qq/include/qemu/timer.h:630
#4 0x55f989c0219d in hmp_migrate /home/elmarco/src/qq/hmp.c:2038
#5 0x55f98955927b in handle_hmp_command /home/elmarco/src/qq/monitor.c:3498
#6 0x55f98955fb8c in monitor_command_cb /home/elmarco/src/qq/monitor.c:4371
#7 0x55f98ad40f11 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:393
#8 0x55f98955fa4f in monitor_read /home/elmarco/src/qq/monitor.c:4354
#9 0x55f98aae30d7 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
#10 0x55f98aae317a in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
#11 0x55f98aae940c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
#12 0x55f98ab63018 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
#13 0x7f5fcd46c8ac in g_main_dispatch gmain.c:3177
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180901134652.25884-1-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 #2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 #3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 #4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 #5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 #6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 #2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 #3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 #4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 #5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 #6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 #30 0x00005631fbd81412 in main_loop () at vl.c:1866 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Spotted by ASAN while running:
$ tests/migration-test -p /x86_64/migration/postcopy/recovery
=================================================================
==18034==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 33864 byte(s) in 1 object(s) allocated from:
#0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50)
#1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
#2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183
#3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159
#4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78
#5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295
#6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111
#7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91
#8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108
#9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158
#10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89
#11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180925092245.29565-1-marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Recently, the test case has started failing because some job related
functions want to drop the AioContext lock even though it hasn't been
taken:
(gdb) bt
#0 0x00007f51c067c9fb in raise () from /lib64/libc.so.6
#1 0x00007f51c067e77d in abort () from /lib64/libc.so.6
#2 0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3 0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96
#4 0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516
#5 0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738
#6 0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986
#7 0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941
#8 0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148
#9 0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420
#10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629
#11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685
#12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783
#13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402
#14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457
#15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478
#16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0
#20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0
#21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581
It is yet unclear whether this should really be considered a bug in the
test case or whether blk_unref() should work for callers that haven't
taken the AioContext lock, but in order to fix the build tests quickly,
just take the AioContext lock around blk_unref().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
A socket chardev may not have associated address (when adding client
fd manually for example). But on disconnect, updating socket filename
expects an address and may lead to this crash:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555d8c70c in SocketAddress_to_str (prefix=0x555556043062 "disconnected:", addr=0x0, is_listen=false, is_telnet=false) at /home/elmarco/src/qq/chardev/char-socket.c:388
388 switch (addr->type) {
(gdb) bt
#0 0x0000555555d8c70c in SocketAddress_to_str (prefix=0x555556043062 "disconnected:", addr=0x0, is_listen=false, is_telnet=false) at /home/elmarco/src/qq/chardev/char-socket.c:388
#1 0x0000555555d8c8aa in update_disconnected_filename (s=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:419
#2 0x0000555555d8c959 in tcp_chr_disconnect (chr=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:438
#3 0x0000555555d8cba1 in tcp_chr_hup (channel=0x555556b75690, cond=G_IO_HUP, opaque=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:482
#4 0x0000555555da596e in qio_channel_fd_source_dispatch (source=0x555556bb68b0, callback=0x555555d8cb58 <tcp_chr_hup>, user_data=0x555556b1ed00) at /home/elmarco/src/qq/io/channel-watch.c:84
Replace filename with a generic "disconnected:socket" in this case.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
ozbenh
pushed a commit
that referenced
this pull request
Nov 1, 2018
Spotted by ASAN:
=================================================================
==11893==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1120 byte(s) in 28 object(s) allocated from:
#0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48)
#1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
#2 0x559e708b56a4 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:66
#3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23
#4 0x559e708bda7d in parse_string /home/elmarco/src/qq/qobject/json-parser.c:143
#5 0x559e708c1009 in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:484
#6 0x559e708c1627 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:547
#7 0x559e708c1c67 in json_parser_parse /home/elmarco/src/qq/qobject/json-parser.c:573
#8 0x559e708bc0ff in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:92
#9 0x559e708d1655 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:292
#10 0x559e708d1fe1 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:339
#11 0x559e708bc856 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:121
#12 0x559e708b8b4b in qobject_from_jsonv /home/elmarco/src/qq/qobject/qjson.c:69
#13 0x559e708b8d02 in qobject_from_json /home/elmarco/src/qq/qobject/qjson.c:83
#14 0x559e708a74ae in from_json_str /home/elmarco/src/qq/tests/check-qjson.c:30
#15 0x559e708a9f83 in utf8_string /home/elmarco/src/qq/tests/check-qjson.c:781
#16 0x7fd05101bc49 in test_case_run gtestutils.c:2255
#17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180901211917.10372-1-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
shenki
pushed a commit
to open-power/qemu
that referenced
this pull request
Dec 11, 2018
The way we determine if we can start the incoming migration was changed to use migration_has_all_channels() in: commit 428d890 Author: Juan Quintela <quintela@redhat.com> Date: Mon Jul 24 13:06:25 2017 +0200 migration: Create migration_has_all_channels This method in turn calls multifd_recv_all_channels_created() which is hardcoded to always return 'true' when multifd is not in use. This is a latent bug... ...activated in a following commit where that return result ends up acting as the flag to indicate whether it is possible to start processing the migration: commit 36c2f8b Author: Juan Quintela <quintela@redhat.com> Date: Wed Mar 7 08:40:52 2018 +0100 migration: Delay start of migration main routines This means that if channel initialization fails with normal migration, it'll never notice and attempt to start the incoming migration regardless and crash on a NULL pointer. This can be seen, for example, if a client connects to a server requiring TLS, but has an invalid x509 certificate: qemu-system-x86_64: The certificate hasn't got a known issuer qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed. #0 0x00007fffebd24f2b in raise () at /lib64/libc.so.6 ozbenh#1 0x00007fffebd0f561 in abort () at /lib64/libc.so.6 ozbenh#2 0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6 #3 0x00007fffebd1d692 in () at /lib64/libc.so.6 #4 0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386 #5 0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116 #6 0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6 #7 0x0000000000000000 in () To handle the non-multifd case, we check whether mis->from_src_file is non-NULL. With this in place, the migration server drops the rejected client and stays around waiting for another, hopefully valid, client to arrive. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20180619163552.18206-1-berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
shenki
pushed a commit
to open-power/qemu
that referenced
this pull request
Dec 11, 2018
With a Spice port chardev, it is possible to reenter monitor_qapi_event_queue() (when the client disconnects for example). This will dead-lock on monitor_lock. Instead, use some TLS variables to check for recursion and queue the events. Fixes: (gdb) bt #0 0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0 ozbenh#1 0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0 ozbenh#2 0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66 #3 0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645 #4 0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149 #5 0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235 #6 0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316 #7 0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197 #8 0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197 #9 0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388 #11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347 #12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0 #13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341 #14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 #15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 #16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353 #17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199 #18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112 #19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147 #20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42 #21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425 #22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468 #23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517 #24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538 #25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624 #26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649 #27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58 #28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822 #29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862 #30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644 Note that error report is now moved to the first caller, which may receive an error for a recursed event. This is probably fine (95% of callers use &error_abort, the rest have NULL error and ignore it) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180731150144.14022-1-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [*_no_recurse renamed to *_no_reenter, local variables reordered] Signed-off-by: Markus Armbruster <armbru@redhat.com>
daxtens
pushed a commit
to daxtens/qemu
that referenced
this pull request
Jan 7, 2020
Currently offloads disabled by guest via the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command are not preserved on VM migration. Instead all offloads reported by guest features (via VIRTIO_PCI_GUEST_FEATURES) get enabled. What happens is: first the VirtIONet::curr_guest_offloads gets restored and offloads are getting set correctly: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474 ozbenh#1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 ozbenh#2 virtio_net_post_load_device (opaque=0x555557701ca0, version_id=11) at hw/net/virtio-net.c:2334 #3 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577c80 <vmstate_virtio_net_device>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:168 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2197 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 However later on the features are getting restored, and offloads get reset to everything supported by features: #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474 ozbenh#1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720 ozbenh#2 virtio_net_set_features (vdev=0x555557701ca0, features=5104441767) at hw/net/virtio-net.c:773 #3 virtio_set_features_nocheck (vdev=0x555557701ca0, val=5104441767) at hw/virtio/virtio.c:2052 #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2220 #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036 #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143 #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829 #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211 #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395 #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467 #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449 Fix this by preserving the state in saved_guest_offloads field and pushing out offload initialization to the new post load hook. Cc: qemu-stable@nongnu.org Signed-off-by: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
daxtens
pushed a commit
to daxtens/qemu
that referenced
this pull request
Jan 7, 2020
Guests can crash QEMU when writting to PnP registers: $ echo 'writeb 0x800ff042 69' | qemu-system-sparc -M leon3_generic -S -bios /etc/magic -qtest stdio [I 1571938309.932255] OPENED [R +0.063474] writeb 0x800ff042 69 Segmentation fault (core dumped) (gdb) bt #0 0x0000000000000000 in () ozbenh#1 0x0000555f4bcdf0bc in memory_region_write_with_attrs_accessor (mr=0x555f4d7be8c0, addr=66, value=0x7fff07d00f08, size=1, shift=0, mask=255, attrs=...) at memory.c:503 ozbenh#2 0x0000555f4bcdf185 in access_with_adjusted_size (addr=66, value=0x7fff07d00f08, size=1, access_size_min=1, access_size_max=4, access_fn=0x555f4bcdeff4 <memory_region_write_with_attrs_accessor>, mr=0x555f4d7be8c0, attrs=...) at memory.c:539 #3 0x0000555f4bce2243 in memory_region_dispatch_write (mr=0x555f4d7be8c0, addr=66, data=69, op=MO_8, attrs=...) at memory.c:1489 #4 0x0000555f4bc80b20 in flatview_write_continue (fv=0x555f4d92c400, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1, addr1=66, l=1, mr=0x555f4d7be8c0) at exec.c:3161 #5 0x0000555f4bc80c65 in flatview_write (fv=0x555f4d92c400, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1) at exec.c:3201 #6 0x0000555f4bc80fb0 in address_space_write (as=0x555f4d7aa460, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1) at exec.c:3291 #7 0x0000555f4bc8101d in address_space_rw (as=0x555f4d7aa460, addr=2148528194, attrs=..., buf=0x7fff07d01120 "E", len=1, is_write=true) at exec.c:3301 #8 0x0000555f4bcdb388 in qtest_process_command (chr=0x555f4c2ed7e0 <qtest_chr>, words=0x555f4db0c5d0) at qtest.c:432 Instead of crashing, log the access as unimplemented. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: KONRAD Frederic <frederic.konrad@adacore.com> Message-Id: <20191025110114.27091-2-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Apr 13, 2021
We can't know the caller read enough data in the memory pointed
by ext_hdr to cast it as a ip6_ext_hdr_routing.
Declare rt_hdr on the stack and fill it again from the iovec.
Since we already checked there is enough data in the iovec buffer,
simply add an assert() call to consume the bytes_read variable.
This fix a 2 bytes buffer overrun in eth_parse_ipv6_hdr() reported
by QEMU fuzzer:
$ cat << EOF | ./qemu-system-i386 -M pc-q35-5.0 \
-accel qtest -monitor none \
-serial none -nographic -qtest stdio
outl 0xcf8 0x80001010
outl 0xcfc 0xe1020000
outl 0xcf8 0x80001004
outw 0xcfc 0x7
write 0x25 0x1 0x86
write 0x26 0x1 0xdd
write 0x4f 0x1 0x2b
write 0xe1020030 0x4 0x190002e1
write 0xe102003a 0x2 0x0807
write 0xe1020048 0x4 0x12077cdd
write 0xe1020400 0x4 0xba077cdd
write 0xe1020420 0x4 0x190002e1
write 0xe1020428 0x4 0x3509d807
write 0xe1020438 0x1 0xe2
EOF
=================================================================
==2859770==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdef904902 at pc 0x561ceefa78de bp 0x7ffdef904820 sp 0x7ffdef904818
READ of size 1 at 0x7ffdef904902 thread T0
#0 0x561ceefa78dd in _eth_get_rss_ex_dst_addr net/eth.c:410:17
ozbenh#1 0x561ceefa41fb in eth_parse_ipv6_hdr net/eth.c:532:17
ozbenh#2 0x561cef7de639 in net_tx_pkt_parse_headers hw/net/net_tx_pkt.c:228:14
#3 0x561cef7dbef4 in net_tx_pkt_parse hw/net/net_tx_pkt.c:273:9
#4 0x561ceec29f22 in e1000e_process_tx_desc hw/net/e1000e_core.c:730:29
#5 0x561ceec28eac in e1000e_start_xmit hw/net/e1000e_core.c:927:9
#6 0x561ceec1baab in e1000e_set_tdt hw/net/e1000e_core.c:2444:9
#7 0x561ceebf300e in e1000e_core_write hw/net/e1000e_core.c:3256:9
#8 0x561cef3cd4cd in e1000e_mmio_write hw/net/e1000e.c:110:5
Address 0x7ffdef904902 is located in stack of thread T0 at offset 34 in frame
#0 0x561ceefa320f in eth_parse_ipv6_hdr net/eth.c:486
This frame has 1 object(s):
[32, 34) 'ext_hdr' (line 487) <== Memory access at offset 34 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow net/eth.c:410:17 in _eth_get_rss_ex_dst_addr
Shadow bytes around the buggy address:
0x10003df188d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df188e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df188f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18910: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
=>0x10003df18920:[02]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10003df18970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Stack left redzone: f1
Stack right redzone: f3
==2859770==ABORTING
Add the corresponding qtest case with the fuzzer reproducer.
FWIW GCC 11 similarly reported:
net/eth.c: In function 'eth_parse_ipv6_hdr':
net/eth.c:410:15: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
410 | if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
| ~~~~~^~~~~~~
net/eth.c:485:24: note: while referencing 'ext_hdr'
485 | struct ip6_ext_hdr ext_hdr;
| ^~~~~~~
net/eth.c:410:38: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds]
410 | if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) {
| ~~~~~^~~~~~~~~
net/eth.c:485:24: note: while referencing 'ext_hdr'
485 | struct ip6_ext_hdr ext_hdr;
| ^~~~~~~
Cc: qemu-stable@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1879531
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Fixes: eb70002 ("net_pkt: Extend packet abstraction as required by e1000e functionality")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Apr 13, 2021
Incoming enabled bitmaps are busy, because we do bdrv_dirty_bitmap_create_successor() for them. But disabled bitmaps being migrated are not marked busy, and user can remove them during the incoming migration. Then we may crash in cancel_incoming_locked() when try to remove the bitmap that was already removed by user, like this: #0 qemu_mutex_lock_impl (mutex=0x5593d88c50d1, file=0x559680554b20 "../block/dirty-bitmap.c", line=64) at ../util/qemu-thread-posix.c:77 ozbenh#1 bdrv_dirty_bitmaps_lock (bs=0x5593d88c0ee9) at ../block/dirty-bitmap.c:64 ozbenh#2 bdrv_release_dirty_bitmap (bitmap=0x5596810e9570) at ../block/dirty-bitmap.c:362 #3 cancel_incoming_locked (s=0x559680be8208 <dbm_state+40>) at ../migration/block-dirty-bitmap.c:918 #4 dirty_bitmap_load (f=0x559681d02b10, opaque=0x559680be81e0 <dbm_state>, version_id=1) at ../migration/block-dirty-bitmap.c:1194 #5 vmstate_load (f=0x559681d02b10, se=0x559680fb5810) at ../migration/savevm.c:908 #6 qemu_loadvm_section_part_end (f=0x559681d02b10, mis=0x559680fb4a30) at ../migration/savevm.c:2473 #7 qemu_loadvm_state_main (f=0x559681d02b10, mis=0x559680fb4a30) at ../migration/savevm.c:2626 #8 postcopy_ram_listen_thread (opaque=0x0) at ../migration/savevm.c:1871 #9 qemu_thread_start (args=0x5596817ccd10) at ../util/qemu-thread-posix.c:521 #10 start_thread () at /lib64/libpthread.so.0 #11 clone () at /lib64/libc.so.6 Note bs pointer taken from bitmap: it's definitely bad aligned. That's because we are in use after free, bitmap is already freed. So, let's make disabled bitmaps (being migrated) busy during incoming migration. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210322094906.5079-2-vsementsov@virtuozzo.com>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Apr 13, 2021
When building with --enable-sanitizers we get:
Direct leak of 32 byte(s) in 2 object(s) allocated from:
#0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
ozbenh#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
ozbenh#2 0x561847f02ca2 in usb_packet_init hw/usb/core.c:531:5
#3 0x561848df4df4 in usb_ehci_init hw/usb/hcd-ehci.c:2575:5
#4 0x561847c119ac in ehci_sysbus_init hw/usb/hcd-ehci-sysbus.c:73:5
#5 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
#6 0x56184a5bd955 in object_init_with_type qom/object.c:371:9
#7 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
#8 0x56184a5a24d5 in object_initialize qom/object.c:536:5
#9 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
#10 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
#11 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
#12 0x561849542d18 in npcm7xx_init hw/arm/npcm7xx.c:427:5
Similarly to commit d710e1e ("usb: ehci: fix memory leak in
ehci"), fix by calling usb_ehci_finalize() to free the USBPacket.
Fixes: 7341ea0
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210323183701.281152-1-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Apr 13, 2021
When building with --enable-sanitizers we get:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
ozbenh#1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
ozbenh#2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5
#3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
#4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
#5 0x56184a5a24d5 in object_initialize qom/object.c:536:5
#6 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5
#7 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10
#8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
#9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5
The RX/TX FIFOs are created in xlnx_dp_init(), add xlnx_dp_finalize()
to destroy them.
Fixes: 58ac482 ("introduce xlnx-dp")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210323182958.277654-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Apr 13, 2021
g_hash_table_add always retains ownership of the pointer passed in as
the key. Its return status merely indicates whether the added entry was
new, or replaced an existing entry. Thus key must never be freed after
this method returns.
Spotted by ASAN:
==2407186==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ac4f0 at pc 0x7ffff766659c bp 0x7fffffffd1d0 sp 0x7fffffffc980
READ of size 1 at 0x6020003ac4f0 thread T0
#0 0x7ffff766659b (/lib64/libasan.so.6+0x8a59b)
ozbenh#1 0x7ffff6bfa843 in g_str_equal ../glib/ghash.c:2303
ozbenh#2 0x7ffff6bf8167 in g_hash_table_lookup_node ../glib/ghash.c:493
#3 0x7ffff6bf9b78 in g_hash_table_insert_internal ../glib/ghash.c:1598
#4 0x7ffff6bf9c32 in g_hash_table_add ../glib/ghash.c:1689
#5 0x5555596caad4 in module_load_one ../util/module.c:233
#6 0x5555596ca949 in module_load_one ../util/module.c:225
#7 0x5555596ca949 in module_load_one ../util/module.c:225
#8 0x5555596cbdf4 in module_load_qom_all ../util/module.c:349
Typical C bug...
Fixes: 9062912 ("module: use g_hash_table_add()")
Cc: qemu-stable@nongnu.org
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210316134456.3243102-1-marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Jul 28, 2021
modinfo runs the preprocessor and therefore needs all generated input files to be there. The "depends" clause does not work in Meson 0.55.3, so for now use "input". Part ozbenh#2: Update the rule for target-specific modules too. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20210723120156.1183920-1-kraxel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Jul 28, 2021
…m' into staging Bugfixes. # gpg: Signature made Sat 24 Jul 2021 07:11:18 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: qom: use correct field name when getting/setting alias properties qapi: introduce forwarding visitor gitlab: only let pages be published from default branch MAINTAINERS: Add memory_mapping.h and memory_mapping.c to "Memory API" MAINTAINERS: Add Peter Xu and myself as co-maintainer of "Memory API" MAINTAINERS: Replace Eduardo as "Host Memory Backends" maintainer i386: do not call cpudef-only models functions for max, host, base target/i386: Added consistency checks for CR3 meson: fix dependencies for modinfo ozbenh#2 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Dec 17, 2021
When trying to use the pc-dimm device on a non-NUMA machine, we get:
$ qemu-system-arm -M none -cpu max -S \
-object memory-backend-file,id=mem1,size=1M,mem-path=/tmp/1m \
-device pc-dimm,id=dimm1,memdev=mem1
Segmentation fault (core dumped)
(gdb) bt
#0 pc_dimm_realize (dev=0x555556da3e90, errp=0x7fffffffcd10) at hw/mem/pc-dimm.c:184
ozbenh#1 0x0000555555fe1f8f in device_set_realized (obj=0x555556da3e90, value=true, errp=0x7fffffffce18) at hw/core/qdev.c:531
ozbenh#2 0x0000555555feb4a9 in property_set_bool (obj=0x555556da3e90, v=0x555556e54420, name=0x5555563c3c41 "realized", opaque=0x555556a704f0, errp=0x7fffffffce18) at qom/object.c:2257
To avoid that crash, restrict the pc-dimm NUMA check to machines
supporting NUMA, and do not allow the use of 'node' property on
non-NUMA machines.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211106145016.611332-1-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
legoater
pushed a commit
to open-power/qemu
that referenced
this pull request
Dec 17, 2021
Without the previous commit, when running 'make check-qtest-i386'
with QEMU configured with '--enable-sanitizers' we get:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==287878==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000344
==287878==The signal is caused by a WRITE memory access.
==287878==Hint: address points to the zero page.
#0 0x564b2e5bac27 in blk_inc_in_flight block/block-backend.c:1346:5
ozbenh#1 0x564b2e5bb228 in blk_pwritev_part block/block-backend.c:1317:5
ozbenh#2 0x564b2e5bcd57 in blk_pwrite block/block-backend.c:1498:11
#3 0x564b2ca1cdd3 in fdctrl_write_data hw/block/fdc.c:2221:17
#4 0x564b2ca1b2f7 in fdctrl_write hw/block/fdc.c:829:9
#5 0x564b2dc49503 in portio_write softmmu/ioport.c:201:9
Add the reproducer for CVE-2021-20196.
Suggested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-4-philmd@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
These can only be generated if the thread is not in HV
mode, even with VPM.
5.7.2 of PowerISA_v2.07: "If the thread is not in hypervisor state,
and either address translation is enabled and VPM1=1, or address
translation is disabled and VPM0=1, conditions that would have caused
a Data Storage or an Instruction Storage interrupt if the affected
memory were not virtu- alized instead cause a Hypervisor Data Storage
or a Hypervisor Instruction Storage interrupt respectively."
6.5.16: "A Hypervisor Data Storage interrupt occurs when no higher
priority exception exists, the thread is not in hypervisor state, and
..."
Signed-off-by: Andrei Warkentin andrey.warkentin@gmail.com