fix dead lock #9

Open
xinyunliu wants to merge 3259 commits into master from fix-hdmabuf-deadlock-in-virtio-be

@xinyunliu
Owner

Need to release the lock when the wait times out.
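
The diff itself is not shown on this page, so the following is only a minimal sketch of the pattern the description refers to: every exit path of a function that takes a lock, including the timeout path, has to go through the unlock. `toy_lock`, `toy_wait_for_reply` and the `reply_arrived` flag are hypothetical names made up for illustration; they are not taken from the hyper_dmabuf or virtio backend code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real lock; tracks only held / not held. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *lock)
{
    assert(!lock->held);   /* taking a lock that is still held would deadlock */
    lock->held = true;
}

static void toy_lock_release(struct toy_lock *lock)
{
    assert(lock->held);
    lock->held = false;
}

/* Returns 0 when the reply arrived, -1 when the wait timed out. */
int toy_wait_for_reply(struct toy_lock *lock, bool reply_arrived)
{
    int ret = 0;

    toy_lock_acquire(lock);
    if (!reply_arrived) {
        ret = -1;          /* timed out */
        goto out;          /* do not return with the lock still held */
    }
    /* ... consume the reply while holding the lock ... */
out:
    toy_lock_release(lock);
    return ret;
}
```

If the timeout branch returned directly instead of jumping to `out`, the next call would try to take a lock that is never released, which is the kind of deadlock the title describes.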

mengwei-intel and others added 30 commits January 25, 2019 03:55
This patch fixes the ksys_close call for camera streaming
in the SOS case, where the dmabuf fd is closed by user
space. Added a check to close the dmabuf fd only in the
virtualized case.

Change-Id: Id0c7fbdc8b81235c36f0340501a8c695ec3aaacf
Tracked-On: PKT-1691
Signed-off-by: Ong Hock Yu <ong.hock.yu@intel.com>
Signed-off-by: Meng Wei <wei.meng@intel.com>
pdata for OV495 multiport.

Change-Id: I6511af1bbb9b5af45fc140b77ebb436f84de8729
Tracked-On: PKT-1691
Signed-off-by: Chen Meng J <meng.j.chen@intel.com>
Signed-off-by: Meng Wei <wei.meng@intel.com>
For TI960 on the AS_1140 board, the i2c adapters are 2 and 4.
Fix the 2nd TI960 i2c adapter so that it works properly.

Change-Id: I14a9b0f6e67baaa5146898d34ecf34fbaa419762
Tracked-On: PKT-1691
Signed-off-by: Chen Meng J <meng.j.chen@intel.com>
Signed-off-by: Meng Wei <wei.meng@intel.com>
enable OV495 multiport.

Change-Id: Iaaf9e2efe1ba3bd97472cb79858b19b28c4850ea
Tracked-On: PKT-1691
Signed-off-by: Chen Meng J <meng.j.chen@intel.com>
Signed-off-by: Meng Wei <wei.meng@intel.com>
Without the ox03a init sequence, the sensor disappears after power on/off.

Change-Id: Ia409b9ac60841c94ef3863a6072744154852bd6e
Tracked-On: PKT-1691
Signed-off-by: Chen Meng J <meng.j.chen@intel.com>
Signed-off-by: Meng Wei <wei.meng@intel.com>
separated init seq for ox03a10 and ov495.

Change-Id: Ic59d9fd41636acd31158a242e2e616bb37c4dfcf
Tracked-On: PKT-1691
Signed-off-by: Chen Meng J <meng.j.chen@intel.com>
Signed-off-by: Meng Wei <wei.meng@intel.com>
…ddress

[ Upstream commit ec90ad3 ]

Similar to c5ee066 ("ipv6: Consider sk_bound_dev_if when binding a
socket to an address"), binding a socket to v4 mapped addresses needs to
consider if the socket is bound to a device.

This problem also exists from the beginning of git history.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8adbe21 ]

Make sure that lag port TX is disabled before mlxsw_sp_port_lag_leave()
is called, to prevent a possible EMAD error.

Fixes: 0d65fc1 ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 674bed5 ]

When a VLAN is deleted from a bridge port we should not change the PVID
unless the deleted VLAN is the PVID.

Fixes: fe9ccc7 ("mlxsw: spectrum_switchdev: Don't batch VLAN operations")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ea89098 ]

The 6390 copper ports have an erratum which requires poking magic values
into undocumented magic registers and then performing a software
reset.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f8c468e ]

Commit dcda9b0 ("mm, tree wide: replace __GFP_REPEAT by
__GFP_RETRY_MAYFAIL with more useful semantic") replaced __GFP_REPEAT in
alloc_skb_with_frags() with __GFP_RETRY_MAYFAIL when the allocation may
directly reclaim.

The previous behavior would require reclaim up to 1 << order pages for
skb aligned header_len of order > PAGE_ALLOC_COSTLY_ORDER before failing,
otherwise the allocations in alloc_skb() would loop in the page allocator
looking for memory.  __GFP_RETRY_MAYFAIL makes both allocations failable
under memory pressure, including for the HEAD allocation.

This can cause, among many other things, write() to fail with ENOTCONN
during RPC when under memory pressure.

These allocations should succeed as they did previous to dcda9b0
even if it requires calling the oom killer and additional looping in the
page allocator to find memory.  There is no way to specify the previous
behavior of __GFP_REPEAT, but it's unlikely to be necessary since the
previous behavior only guaranteed that 1 << order pages would be reclaimed
before failing for order > PAGE_ALLOC_COSTLY_ORDER.  That reclaim is not
guaranteed to be contiguous memory, so repeating for such large orders is
usually not beneficial.

Removing the setting of __GFP_RETRY_MAYFAIL to restore the previous
behavior, specifically not allowing alloc_skb() to fail for small orders
and oom kill if necessary rather than allowing RPCs to fail.

Fixes: dcda9b0 ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic")
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f87118d ]

This patch adds MTU default value to qmap network interface in
order to avoid "RTNETLINK answers: No buffer space available"
error when setting an ipv6 address.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3635299 ]

There are two new Realtek Ethernet devices which are re-branded r8168h.
Add the IDs to support them.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d4a7e9b ]

I realized the last patch calls dev_get_by_index_rcu in a branch not
holding the rcu lock. Add the calls to rcu_read_lock and rcu_read_unlock.

Fixes: ec90ad3 ("ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 41d1c88 ]

Matteo reported forwarding issues inside the linux bridge,
if the enslaved interfaces use the fq qdisc.

Similar to commit 8203e2d ("net: clear skb->tstamp in
forwarding paths"), we need to clear the tstamp field in
the bridge forwarding path.

Fixes: 80b14de ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5 ("tcp/fq: move back to CLOCK_MONOTONIC")
Reported-and-tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…nd ipmac sets

[ Upstream commit 8cc4ccf ]

There doesn't seem to be any reason to restrict MAC address
matching to source MAC addresses in set types bitmap:ipmac,
hash:ipmac and hash:mac. With this patch, and this setup:

  ip netns add A
  ip link add veth1 type veth peer name veth2 netns A
  ip addr add 192.0.2.1/24 dev veth1
  ip -net A addr add 192.0.2.2/24 dev veth2
  ip link set veth1 up
  ip -net A link set veth2 up

  ip netns exec A ipset create test hash:mac
  dst=$(ip netns exec A cat /sys/class/net/veth2/address)
  ip netns exec A ipset add test ${dst}
  ip netns exec A iptables -P INPUT DROP
  ip netns exec A iptables -I INPUT -m set --match-set test dst -j ACCEPT

ipset will match packets based on destination MAC address:

  # ping -c1 192.0.2.2 >/dev/null
  # echo $?
  0

Reported-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed8dce4 ]

Keeping the irq_chip definition static will make it shared with multiple
gpiochips in the system. This practice is considered to be bad and now we
will get the below warning from gpiolib core:

"detected irqchip that is shared with multiple gpiochips: please fix the
driver."

Hence, move the irq_chip definition from static to `struct pl061` for
using a unique irq_chip for each gpiochip.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f41a895 ]

[Why]

The igt@kms_plane@pixel-format-pipe tests can create a sequence where
stream_state is NULL during amdgpu_dm_crtc_set_crc_source which results
in a null pointer dereference.

[How]

Guard against stream_state being NULL before accessing its fields. This
doesn't fix the root cause of the issue so a DRM_ERROR is generated
to still fail the tests.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: David Francis <David.Francis@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2383a76 ]

Vega10 has multiple interrupt rings, so this can be called from multiple
callers at the same time resulting in:

[   71.779334] ================================
[   71.779406] WARNING: inconsistent lock state
[   71.779478] 4.19.0-rc1+ #44 Tainted: G        W
[   71.779565] --------------------------------
[   71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[   71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes:
[   71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...},
at: kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[   71.780058] {IN-HARDIRQ-W} state was registered at:
[   71.780115]   _raw_spin_lock+0x2c/0x40
[   71.780180]   kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[   71.780248]   amdgpu_irq_callback+0x6c/0x150 [amdgpu]
[   71.780315]   amdgpu_ih_process+0x88/0x100 [amdgpu]
[   71.780380]   amdgpu_irq_handler+0x20/0x40 [amdgpu]
[   71.780409]   __handle_irq_event_percpu+0x49/0x2a0
[   71.780436]   handle_irq_event_percpu+0x30/0x70
[   71.780461]   handle_irq_event+0x37/0x60
[   71.780484]   handle_edge_irq+0x83/0x1b0
[   71.780506]   handle_irq+0x1f/0x30
[   71.780526]   do_IRQ+0x53/0x110
[   71.780544]   ret_from_intr+0x0/0x22
[   71.780566]   cpuidle_enter_state+0xaa/0x330
[   71.780591]   do_idle+0x203/0x280
[   71.780610]   cpu_startup_entry+0x6f/0x80
[   71.780634]   start_secondary+0x1b0/0x200
[   71.780657]   secondary_startup_64+0xa4/0xb0

Fix this by always using irq save spin locks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fa57ca ]

When it's possible that the PF might end up trying to send a
packet to one of its own VFs, we have to forbid IPsec offload
because the device drops the packets into a black hole.
See commit 47b6f50 ("ixgbe: disallow IPsec Tx offload
when in SR-IOV mode") for more info.

This really is only necessary when the device is in the default
VEB mode.  If instead the device is running in VEPA mode,
the packets will go through the encryption engine and out the
MAC/PHY as normal, and get "hairpinned" as needed by the switch.

So let's not block IPsec offload when in VEPA mode.  To get
there with the ixgbe device, use the handy 'bridge' command:
	bridge link set dev eth1 hwmode vepa
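
The resulting policy can be sketched as a small predicate. This is an illustrative model of the rule described above, not the ixgbe driver code; `toy_bridge_mode` and `toy_ipsec_tx_offload_allowed` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum for the embedded switch mode of the device. */
enum toy_bridge_mode { TOY_MODE_VEB, TOY_MODE_VEPA };

/*
 * IPsec Tx offload is only a problem when the PF may send to its own VFs
 * (SR-IOV enabled) and the device switches internally (VEB). In VEPA mode
 * packets leave through the MAC/PHY and are hairpinned by the switch.
 */
bool toy_ipsec_tx_offload_allowed(bool sriov_enabled, enum toy_bridge_mode mode)
{
    assert(mode == TOY_MODE_VEB || mode == TOY_MODE_VEPA);

    if (!sriov_enabled)
        return true;
    return mode == TOY_MODE_VEPA;
}
```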

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
…f hotkey

[ Upstream commit 78f3ac7 ]

In the past, Asus firmwares would change the panel backlight directly
through the EC when the display off hotkey (Fn+F7) was pressed, and
only notify the OS of such change, with 0x33 when the LCD was ON and
0x34 when the LCD was OFF. These are currently mapped to
KEY_DISPLAYTOGGLE and KEY_DISPLAY_OFF, respectively.

More recently, the EC on most Asus machines lost the ability to toggle
the LCD backlight directly, but unless the OS informs the firmware it is
going to handle the display toggle hotkey events, the firmware still
tries to change the brightness through the EC, to no effect. The end
result is a long list (at Endless we counted 11) of Asus laptop models
where the display toggle hotkey does not perform any action. Our firmware
engineering contacts at Asus were surprised that there were still
machines out there with the old behavior.

Calling WMNB(ASUS_WMI_DEVID_BACKLIGHT==0x00050011, 2) on the _WDG device
tells the firmware that it should let the OS handle the display toggle
event, in which case it will simply notify the OS of a key press with
0x35, as shown by the DSDT excerpts below.

 Scope (_SB)
 {
     (...)

     Device (ATKD)
     {
         (...)

         Name (_WDG, Buffer (0x28)
         {
             /* 0000 */  0xD0, 0x5E, 0x84, 0x97, 0x6D, 0x4E, 0xDE, 0x11,
             /* 0008 */  0x8A, 0x39, 0x08, 0x00, 0x20, 0x0C, 0x9A, 0x66,
             /* 0010 */  0x4E, 0x42, 0x01, 0x02, 0x35, 0xBB, 0x3C, 0x0B,
             /* 0018 */  0xC2, 0xE3, 0xED, 0x45, 0x91, 0xC2, 0x4C, 0x5A,
             /* 0020 */  0x6D, 0x19, 0x5D, 0x1C, 0xFF, 0x00, 0x01, 0x08
         })
         Method (WMNB, 3, Serialized)
         {
             CreateDWordField (Arg2, Zero, IIA0)
             CreateDWordField (Arg2, 0x04, IIA1)
             Local0 = (Arg1 & 0xFFFFFFFF)

             (...)

             If ((Local0 == 0x53564544))
             {
                 (...)

                 If ((IIA0 == 0x00050011))
                 {
                     If ((IIA1 == 0x02))
                     {
                         ^^PCI0.SBRG.EC0.SPIN (0x72, One)
                         ^^PCI0.SBRG.EC0.BLCT = One
                     }

                     Return (One)
                 }
             }
             (...)
         }
         (...)
     }
     (...)
 }
 (...)

 Scope (_SB.PCI0.SBRG.EC0)
 {
     (...)

     Name (BLCT, Zero)

     (...)

     Method (_Q10, 0, NotSerialized)  // _Qxx: EC Query
     {
         If ((BLCT == Zero))
         {
             Local0 = One
             Local0 = RPIN (0x72)
             Local0 ^= One
             SPIN (0x72, Local0)
             If (ATKP)
             {
                 Local0 = (0x34 - Local0)
                 ^^^^ATKD.IANE (Local0)
             }
         }
         ElseIf ((BLCT == One))
         {
             If (ATKP)
             {
                 ^^^^ATKD.IANE (0x35)
             }
         }
     }
     (...)
 }
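
For readers who do not read ASL, the control flow of the excerpts above can be modelled in a few lines of C. This is a toy model written for illustration only: `toy_ec`, `toy_wmnb_backlight_os_control` and `toy_q10` are hypothetical names, and the pin is reduced to a single 0/1 value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the state touched by the DSDT excerpts. */
struct toy_ec {
    uint8_t blct;    /* BLCT: Zero = firmware toggles, One = OS handles it */
    uint8_t pin72;   /* value read and written through RPIN/SPIN (0x72) */
    bool    atkp;    /* ATKP: gates the IANE notification in the excerpt */
};

/* Models the WMNB branch for IIA0 == 0x00050011, IIA1 == 0x02. */
void toy_wmnb_backlight_os_control(struct toy_ec *ec)
{
    ec->pin72 = 1;   /* SPIN (0x72, One) */
    ec->blct = 1;    /* BLCT = One */
}

/* Models _Q10. Returns the code passed to IANE, or 0 if nothing is sent. */
uint8_t toy_q10(struct toy_ec *ec)
{
    assert(ec->blct <= 1 && ec->pin72 <= 1);

    if (ec->blct == 0) {
        ec->pin72 ^= 1;   /* firmware flips the pin itself */
        return ec->atkp ? (uint8_t)(0x34 - ec->pin72) : 0;
    }
    /* BLCT == One: leave the pin alone and only report the key press. */
    return ec->atkp ? 0x35 : 0;
}
```

With BLCT left at Zero the model alternates between 0x34 and 0x33 on each hotkey press; after the WMNB call it reports 0x35 every time and no longer touches the pin.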

Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e1f65b0 ]

It seems with some NICs supported by the e1000e driver a SYSTIM reading
may occasionally be few microseconds before the previous reading and if
enabled also pass e1000e_sanitize_systim() without reaching the maximum
number of rereads, even if the function is modified to check three
consecutive readings (i.e. it doesn't look like a double read error).
This causes an underflow in the timecounter and the PHC time jumps hours
ahead.

This was observed on 82574, I217 and I219. The fastest way to reproduce
it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
on the PHC.

Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
timecounter_read() in order to allow non-monotonic SYSTIM readings and
prevent the PHC from jumping.
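
Why the choice of helper matters can be seen in a simplified model of the delta arithmetic. The sketch below is not the kernel implementation: it assumes a 32-bit counter mask and one nanosecond per cycle (`TOY_MASK`), and the `toy_*` names are made up. It only mirrors the idea that a read-and-advance helper masks the unsigned difference, while a convert-only helper treats a difference above half the range as a step backwards.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MASK 0xFFFFFFFFULL   /* assumption: 32-bit counter, 1 cycle == 1 ns */

struct toy_timecounter {
    uint64_t cycle_last;   /* counter value at the last update */
    uint64_t nsec;         /* accumulated time in nanoseconds */
};

/* Read-and-advance style: a small backwards step looks like a near-full wrap. */
uint64_t toy_read(struct toy_timecounter *tc, uint64_t cycle_now)
{
    uint64_t delta = (cycle_now - tc->cycle_last) & TOY_MASK;

    tc->cycle_last = cycle_now;
    tc->nsec += delta;
    return tc->nsec;
}

/* Convert-only style: state is not modified, and a delta larger than half
 * the counter range is interpreted as a timestamp in the past. */
uint64_t toy_cyc2time(const struct toy_timecounter *tc, uint64_t cycle_tstamp)
{
    uint64_t delta = (cycle_tstamp - tc->cycle_last) & TOY_MASK;

    assert(tc->cycle_last <= TOY_MASK);
    if (delta > TOY_MASK / 2) {
        delta = (tc->cycle_last - cycle_tstamp) & TOY_MASK;
        return tc->nsec - delta;
    }
    return tc->nsec + delta;
}
```

In this model a reading that is 5 cycles behind the previous one makes `toy_read()` jump ahead by almost the whole counter range, while `toy_cyc2time()` simply returns a time 5 ns earlier.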

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23b5f73 ]

During HARD_RESET the data link is disconnected.
For self-powered devices, the spec advises against doing that.

From USB_PD_R3_0
7.1.5 Response to Hard Resets
Device operation during and after a Hard Reset is defined as follows:
Self-powered devices Should Not disconnect from USB during a Hard Reset
(see Section 9.1.2).
Bus powered devices will disconnect from USB during a Hard Reset due to the
loss of their power source.

Tackle this by letting TCPM know whether the device is self or bus powered.

This overcomes unnecessary port disconnections from hard reset.
Also, speeds up the enumeration time when connected to Type-A ports.

Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
---------
Version history:
V3:
Rebase on top of usb-next

V2:
Based on feedback from heikki.krogerus@linux.intel.com
- self_powered added to the struct tcpm_port which is populated from
  a. "connector" node of the device tree in tcpm_fw_get_caps()
  b. "self_powered" node of the tcpc_config in tcpm_copy_caps

Based on feedback from linux@roeck-us.net
- Code was refactored
- SRC_HARD_RESET_VBUS_OFF sets the link state to false based
  on self_powered flag

V1 located here:
https://lkml.org/lkml/2018/9/13/94
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f96afa7 ]

libbpf is now able to load successfully test_l4lb_noinline.o and
samples/bpf/tracex3_kern.o.

For the test_l4lb_noinline, uncomment related tests from test_libbpf.c
and remove the associated "TODO".

For tracex3_kern.o, instead of loading a program from samples/bpf/ that
might not have been compiled at this stage, try loading a program from
BPF selftests. Since this test case is about loading a program compiled
without the "-target bpf" flag, change the Makefile to compile one
program accordingly (instead of passing the flag for compiling all
programs).

Regarding test_xdp_noinline.o: in its current shape the program fails to
load because it provides no version section, but the loader needs one.
The test was added to make sure that libbpf could load XDP programs even
if they do not provide a version number in a dedicated section. But
libbpf is already capable of doing that: in our case loading fails
because the loader does not know that this is an XDP program (it does
not need to, since it does not attach the program). So trying to load
test_xdp_noinline.o does not bring much here: just delete this subtest.

For the record, the error message obtained with tracex3_kern.o was
fixed by commit e3d91b0 ("tools/libbpf: handle issues with bpf ELF
objects containing .eh_frames")

I have not been able to reproduce the "libbpf: incorrect bpf_call
opcode" error for test_l4lb_noinline.o, even with the version of libbpf
present at the time when test_libbpf.sh and test_libbpf_open.c were
created.

RFC -> v1:
- Compile test_xdp without the "-target bpf" flag, and try to load it
  instead of ../../samples/bpf/tracex3_kern.o.
- Delete test_xdp_noinline.o subtest.

Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c528e4 ]

The refcount of a newly added overlay node decrements to one
(instead of zero) when the overlay changeset is destroyed.  This
change causes the final decrement to reach zero.

After applying this patch, new validation warnings will be
reported from the devicetree unittest during boot due to
a pre-existing devicetree bug.  The warnings will be similar to:

  OF: ERROR: memory leak before free overlay changeset,  /testcase-data/overlay-node/test-bus/test-unittest4

This pre-existing devicetree bug will also trigger a WARN_ONCE() from
refcount_sub_and_test_checked() when an overlay changeset is
destroyed without having first been applied.  This scenario occurs
when an error in the overlay is detected during the overlay changeset
creation:

  WARNING: CPU: 0 PID: 1 at lib/refcount.c:187 refcount_sub_and_test_checked+0xa8/0xbc
  refcount_t: underflow; use-after-free.

  (unwind_backtrace) from (show_stack+0x10/0x14)
  (show_stack) from (dump_stack+0x6c/0x8c)
  (dump_stack) from (__warn+0xdc/0x104)
  (__warn) from (warn_slowpath_fmt+0x44/0x6c)
  (warn_slowpath_fmt) from (refcount_sub_and_test_checked+0xa8/0xbc)
  (refcount_sub_and_test_checked) from (kobject_put+0x24/0x208)
  (kobject_put) from (of_changeset_destroy+0x2c/0xb4)
  (of_changeset_destroy) from (free_overlay_changeset+0x1c/0x9c)
  (free_overlay_changeset) from (of_overlay_remove+0x284/0x2cc)
  (of_overlay_remove) from (of_unittest_apply_revert_overlay_check.constprop.4+0xf8/0x1e8)
  (of_unittest_apply_revert_overlay_check.constprop.4) from (of_unittest_overlay+0x960/0xed8)
  (of_unittest_overlay) from (of_unittest+0x1cc4/0x2138)
  (of_unittest) from (do_one_initcall+0x4c/0x28c)
  (do_one_initcall) from (kernel_init_freeable+0x29c/0x378)
  (kernel_init_freeable) from (kernel_init+0x8/0x110)
  (kernel_init) from (ret_from_fork+0x14/0x2c)

Tested-by: Alan Tull <atull@kernel.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 347a28b ]

This happened in the AMBA PL011 UART driver while running in
qemu-system-aarch64 with CONFIG_DEBUG_TEST_DRIVER_REMOVE enabled.
arch_initcall(pl011_init) came before subsys_initcall(default_bdi_init),
so devtmpfs' handle_remove() crashes: the reference count is a NULL
pointer because wb->bdi hasn't been initialized yet.

Rework wb_put() so that it has an extra check of wb->bdi before
decrementing wb->refcnt, and also add a WARN_ON_ONCE to get a warning if
it happens again in other drivers.

Fixes: 52ebea7 ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks")
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6460979 ]

When the test 'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled,
arch_initcall(pl011_init) came before subsys_initcall(default_bdi_init).
devtmpfs gets killed because we try to remove a file and decrement the
wb reference count before the noop_backing_device_info gets initialized.

[    0.332075] Serial: AMBA PL011 UART driver
[    0.485276] 9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 39, base_baud = 0) is a PL011 rev1
[    0.502382] console [ttyAMA0] enabled
[    0.515710] Unable to handle kernel paging request at virtual address 0000800074c12000
[    0.516053] Mem abort info:
[    0.516222]   ESR = 0x96000004
[    0.516417]   Exception class = DABT (current EL), IL = 32 bits
[    0.516641]   SET = 0, FnV = 0
[    0.516826]   EA = 0, S1PTW = 0
[    0.516984] Data abort info:
[    0.517149]   ISV = 0, ISS = 0x00000004
[    0.517339]   CM = 0, WnR = 0
[    0.517553] [0000800074c12000] user address but active_mm is swapper
[    0.517928] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[    0.518305] Modules linked in:
[    0.518839] CPU: 0 PID: 13 Comm: kdevtmpfs Not tainted 4.19.0-rc5-next-20180928-00002-g2ba39ab0cd01-dirty #82
[    0.519307] Hardware name: linux,dummy-virt (DT)
[    0.519681] pstate: 80000005 (Nzcv daif -PAN -UAO)
[    0.519959] pc : __destroy_inode+0x94/0x2a8
[    0.520212] lr : __destroy_inode+0x78/0x2a8
[    0.520401] sp : ffff0000098c3b20
[    0.520590] x29: ffff0000098c3b20 x28: 00000000087a3714
[    0.520904] x27: 0000000000002000 x26: 0000000000002000
[    0.521179] x25: ffff000009583000 x24: 0000000000000000
[    0.521467] x23: ffff80007bb52000 x22: ffff80007bbaa7c0
[    0.521737] x21: ffff0000093f9338 x20: 0000000000000000
[    0.522033] x19: ffff80007bbb05d8 x18: 0000000000000400
[    0.522376] x17: 0000000000000000 x16: 0000000000000000
[    0.522727] x15: 0000000000000400 x14: 0000000000000400
[    0.523068] x13: 0000000000000001 x12: 0000000000000001
[    0.523421] x11: 0000000000000000 x10: 0000000000000970
[    0.523749] x9 : ffff0000098c3a60 x8 : ffff80007bbab190
[    0.524017] x7 : ffff80007bbaa880 x6 : 0000000000000c88
[    0.524305] x5 : ffff0000093d96c8 x4 : 61c8864680b583eb
[    0.524567] x3 : ffff0000093d6180 x2 : ffffffffffffffff
[    0.524872] x1 : 0000800074c12000 x0 : 0000800074c12000
[    0.525207] Process kdevtmpfs (pid: 13, stack limit = 0x(____ptrval____))
[    0.525529] Call trace:
[    0.525806]  __destroy_inode+0x94/0x2a8
[    0.526108]  destroy_inode+0x34/0x88
[    0.526370]  evict+0x144/0x1c8
[    0.526636]  iput+0x184/0x230
[    0.526871]  dentry_unlink_inode+0x118/0x130
[    0.527152]  d_delete+0xd8/0xe0
[    0.527420]  vfs_unlink+0x240/0x270
[    0.527665]  handle_remove+0x1d8/0x330
[    0.527875]  devtmpfsd+0x138/0x1c8
[    0.528085]  kthread+0x14c/0x158
[    0.528291]  ret_from_fork+0x10/0x18
[    0.528720] Code: 92800002 aa1403e0 d538d081 8b010000 (c85f7c04)
[    0.529367] ---[ end trace 5a3dee47727f877c ]---

Rework to set suppress_bind_attrs flag to avoid removing the device when
CONFIG_DEBUG_TEST_DRIVER_REMOVE=y. This applies for pic32_uart and
xilinx_uartps as well.

Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46f53a6 ]

Currently BPF verifier allows narrow loads for a context field only with
offset zero. E.g. if there is a __u32 field then only the following
loads are permitted:
  * off=0, size=1 (narrow);
  * off=0, size=2 (narrow);
  * off=0, size=4 (full).

On the other hand, LLVM can generate a load with a non-zero offset
that makes sense from the program logic point of view, but the verifier
doesn't accept it.

E.g. tools/testing/selftests/bpf/sendmsg4_prog.c has code:

  #define DST_IP4			0xC0A801FEU /* 192.168.1.254 */
  ...
  	if ((ctx->user_ip4 >> 24) == (bpf_htonl(DST_IP4) >> 24) &&

where ctx is struct bpf_sock_addr.

Some versions of LLVM can produce the following byte code for it:

       8:       71 12 07 00 00 00 00 00         r2 = *(u8 *)(r1 + 7)
       9:       67 02 00 00 18 00 00 00         r2 <<= 24
      10:       18 03 00 00 00 00 00 fe 00 00 00 00 00 00 00 00         r3 = 4261412864 ll
      12:       5d 32 07 00 00 00 00 00         if r2 != r3 goto +7 <LBB0_6>

where `*(u8 *)(r1 + 7)` means narrow load for ctx->user_ip4 with size=1
and offset=3 (7 - sizeof(ctx->user_family) = 3). This load is currently
rejected by verifier.

The verifier code that rejects such loads is in bpf_ctx_narrow_access_ok(),
which means any is_valid_access implementation that uses the function
works this way, e.g. bpf_skb_is_valid_access() for __sk_buff or
sock_addr_is_valid_access() for bpf_sock_addr.

The patch makes such loads supported. Offset can be in [0; size_default)
but has to be multiple of load size. E.g. for __u32 field the following
loads are supported now:
  * off=0, size=1 (narrow);
  * off=1, size=1 (narrow);
  * off=2, size=1 (narrow);
  * off=3, size=1 (narrow);
  * off=0, size=2 (narrow);
  * off=2, size=2 (narrow);
  * off=0, size=4 (full).
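
The rule stated above (offset in [0; size_default) and a multiple of the load size) can be written as a small standalone predicate. This is a sketch of the rule as described in this message, not the verifier code; `toy_narrow_load_ok` is a hypothetical name and `off` is taken relative to the start of the field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * off:          offset of the load relative to the start of the field
 * size:         size of the load in bytes
 * size_default: full size of the context field in bytes
 */
bool toy_narrow_load_ok(uint32_t off, uint32_t size, uint32_t size_default)
{
    assert(size_default > 0);

    if (size == 0 || size > size_default)
        return false;
    if (off >= size_default)      /* offset must be in [0; size_default) */
        return false;
    return off % size == 0;       /* and a multiple of the load size */
}
```

For the sendmsg4_prog.c example, the load at ctx offset 7 is at offset 3 within user_ip4, so `toy_narrow_load_ok(3, 1, 4)` holds, whereas a two-byte load at offset 1 or 3 does not.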

Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fba43f4 ]

This commit adds support for APOGEE duet FireWire, launched 2007, already
discontinued. This model uses Oxford Semiconductor FW971 as its
communication engine. Below is information on Configuration ROM of this
unit. The unit supports some AV/C commands defined by Audio subunit
specification and vendor dependent commands.

$ ./hinawa-config-rom-printer /dev/fw1
{ 'bus-info': { 'adj': False,
                'bmc': False,
                'chip_ID': 42949742248,
                'cmc': False,
                'cyc_clk_acc': 255,
                'generation': 0,
                'imc': False,
                'isc': True,
                'link_spd': 3,
                'max_ROM': 0,
                'max_rec': 64,
                'name': '1394',
                'node_vendor_ID': 987,
                'pmc': False},
  'root-directory': [ ['VENDOR', 987],
                      ['DESCRIPTOR', 'Apogee Electronics'],
                      ['MODEL', 122333],
                      ['DESCRIPTOR', 'Duet'],
                      [ 'NODE_CAPABILITIES',
                        { 'addressing': {'64': True, 'fix': True, 'prv': False},
                          'misc': {'int': False, 'ms': False, 'spt': True},
                          'state': { 'atn': False,
                                     'ded': False,
                                     'drq': True,
                                     'elo': False,
                                     'init': False,
                                     'lst': True,
                                     'off': False},
                          'testing': {'bas': False, 'ext': False}}],
                      [ 'UNIT',
                        [ ['SPECIFIER_ID', 41005],
                          ['VERSION', 65537],
                          ['MODEL', 122333],
                          ['DESCRIPTOR', 'Duet']]]]}

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68b5e43 ]

Add the proper includes and make smca_get_name() static.

Fix an actual bug too which the warning triggered:

  arch/x86/kernel/cpu/mcheck/therm_throt.c:395:39: error: conflicting \
  types for ‘smp_thermal_interrupt’
   asmlinkage __visible void __irq_entry smp_thermal_interrupt(struct pt_regs *r)
                                         ^~~~~~~~~~~~~~~~~~~~~
  In file included from arch/x86/kernel/cpu/mcheck/therm_throt.c:29:
  ./arch/x86/include/asm/traps.h:107:17: note: previous declaration of \
	  ‘smp_thermal_interrupt’ was here
   asmlinkage void smp_thermal_interrupt(void);

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Yi Wang <wang.yi59@zte.com.cn>
Cc: Michael Matz <matz@suse.de>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811081633160.1549@nanos.tec.linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
mchinth and others added 26 commits February 5, 2019 15:38
In remote or ACRN-based collections, it is necessary to know whether all the generated samples were copied to the host.
So, add an IOCTL to collect important statistical data such as samples logged, samples dropped, etc.

Tracked-On: PKT-1717
Signed-off-by: Manisha Chinthapally <manisha.chinthapally@intel.com>
PROFILING_GET_STATUS is added to the list of profiling hypercalls.
This is a supporting change to get profiling status info.

Tracked-On: PKT-1717
Signed-off-by: Manisha Chinthapally <manisha.chinthapally@intel.com>
…d dead loop

Sometimes the ioreq_client_thread has no opportunity to be executed after
it is successfully created. Then when the ioreq_client is destroyed, the
thread is scheduled and internal state of ioreq_client can't be updated
correctly as acrn_ioreq_get_client returns NULL for destroyed client_id.
In such a case it has to wait for the state in order to release the
resource, and a dead loop is detected.
So the ioreq_client is used as the parameter of ioreq_client_thread to
ensure that the internal state can be updated.

V2-V3: Consider the length limitation of kthread name and refine
the passed argument of kthread_run when creating the kthread.

Tracked-On: projectacrn/acrn-hypervisor#2383
Reviewed-by: He Min <min.he@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
…_thread

Currently a complex state is maintained in ioreq_client for ioreq_client_thread
(for example: IOREQ_CLIENT_EXIT).
In fact Linux provides the kthread API, which can be used to manage the
lifecycle of ioreq_client_thread.
When the thread needs to be destroyed, kthread_stop is used to wake up and
terminate the ioreq_client_thread, and kthread_should_stop is checked
in the ioreq_client_thread.
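
The stop handshake described above can be illustrated with a small user-space analogue. This is only a sketch, not the VHM code: it uses POSIX threads instead of the kernel kthread API, and struct worker, worker_start, worker_submit and worker_stop are hypothetical names. The stop flag plays the role of kthread_should_stop, and worker_stop plays the role of kthread_stop (set the flag, wake the thread, wait for it to exit).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kthread-managed request worker. */
struct worker {
	pthread_t tid;
	pthread_mutex_t lock;
	pthread_cond_t wake;
	bool should_stop;	/* role of kthread_should_stop(), guarded by lock */
	int pending;		/* requests waiting to be handled */
	int handled;		/* requests handled so far */
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	/* Drain pending requests; exit only when idle and asked to stop. */
	while (w->pending > 0 || !w->should_stop) {
		if (w->pending == 0) {
			/* Sleep until a request arrives or a stop is requested. */
			pthread_cond_wait(&w->wake, &w->lock);
			continue;
		}
		w->pending--;
		w->handled++;
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

static int worker_start(struct worker *w)
{
	assert(w != NULL);
	w->should_stop = false;
	w->pending = 0;
	w->handled = 0;
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->wake, NULL);
	return pthread_create(&w->tid, NULL, worker_fn, w);
}

static void worker_submit(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->pending++;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
}

/* Analogue of kthread_stop(): request stop, wake the thread, wait for exit. */
static int worker_stop(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_stop = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
	return pthread_join(w->tid, NULL);
}
```

Because the flag is set and checked under the same lock, the stop request is not lost even if the thread has never run before worker_stop is called. In this sketch the worker also drains outstanding requests before it exits; whether the real ioreq_client_thread does so is not stated in the commit message.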

V2-V3: Remove the acrn_ioreq_put_client in ioreq_client_thread.
Instead the reference is always held while ioreq_client_thread is alive.
Otherwise the refcnt of ioreq_client is incorrect if the thread
is created but never executed.

Tracked-On: projectacrn/acrn-hypervisor#2383
Reviewed-by: He Min <min.he@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
…oeventfd_info

Currently the ioreq_client of the ioeventfd_info list is destroyed only when
the ioevent_info needs to be released, which is too late.
So when acrn_ioeventfd_deinit is called, it first releases the
ioreq_client and then releases the other resources related
to the ioevent_info list.

Tracked-On: projectacrn/acrn-hypervisor#2383
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
… of table lookup from client_id

Now acrn_ioreq_create_client is used to create one ioreq_client, and
the client_id is then used to obtain the driver-specific structure by looking up
a hash table, especially when one backend driver serves multiple frontend
guests.

So the ioreq_client-specific structure can be passed as the argument of
ioreq_handler, which eliminates the heavy hash table lookup.

V2->V3: Add the NULL check of client_priv when the ioreq_client handler
is valid

Tracked-On: projectacrn/acrn-hypervisor#2383
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: He Min <min.he@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
…d of specific handle

The GVT will create the ioreq_client and then create the specific thread
to handle the emulated io_req. As the ioreq_client provides one thread
model to handle the emulated io_req, it is unnecessary for acrn_gvt to
create the specific kthread to handle the emulated io_req.
In such case all the kernel thread related with ioreq_client can be
handled by using the same style.

V2->V3: Change the acrngt_handle_kick to acrngt_emulate_ioreq
as it doesn't use the VBS-K kick function

Tracked-On: projectacrn/acrn-hypervisor#2383
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: He Min <min.he@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
acrn_ioreq_create_client can create the following two kinds of
ioreq_client:
a. an ioreq_client based on the kernel thread mechanism, for which an
   ioreq_client_thread is created.
b. a fallback ioreq_client for the user-space device model.

So it is unnecessary to check the kthread_should_stop when ioreq_client is
created for user-space device model.

Tracked-On: projectacrn/acrn-hypervisor#2383
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: He Min <min.he@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
… ioreq_client

Currently acrn_ioreq_destroy_client releases the added trapped
iorange so that the corresponding resource can still be released even
when the ioreq_client user is terminated by some exception.

In a multi-threaded scenario the ioreq_client may already be released
while acrn_ioreq_del_iorange is still called to detach the emulated
ioreq range. In such a case the check of the returned value fails.
So it no longer returns an error when the ioreq_client is invalid.

Tracked-On: projectacrn/acrn-hypervisor#2440
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
…VMMIO

The emulated io range is added for ioreq_client so that the VHM can
distribute the trapped ioreq to the corresponding ioreq_client.
Now the gvt ioreq_client already adds the PCI bar 0 to the trapped range.
Later it adds the PVMMIO range to the trapped range for the gvt ioreq_client
when PVMMIO is enabled, and adds back the trapped range when PVMMIO is
disabled. This is redundant.

Tracked-On: projectacrn/acrn-hypervisor#2440
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
…vt-g boot time

Currently a snapshot of the MCHBAR registers is taken during gvt-g
initialization so that it can be used for the guest vgpu. It covers
0x140000 to 0x17ffff. In fact, based on the HW spec, most of these registers
are meaningless and time is wasted reading them.
Only the range of 0x144000 to 0x147fff contains valid definitions.
So the range of captured I915 MCHBAR registers is narrowed, which helps
to optimize the gvt-g boot time.
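
To put numbers on the two ranges quoted above, here is a minimal sketch that counts the register reads in each. It assumes the snapshot reads one 32-bit register per 4-byte offset, which the commit message does not state; the macro names and reg_count are hypothetical and are not taken from the i915/gvt sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets quoted in the commit message; both ends are inclusive. */
#define MCHBAR_OLD_START 0x140000u
#define MCHBAR_OLD_END   0x17ffffu
#define MCHBAR_NEW_START 0x144000u
#define MCHBAR_NEW_END   0x147fffu

/*
 * Number of 32-bit registers in the inclusive byte range [start, end].
 * Assumption: one 4-byte read per register, with start 4-byte aligned.
 */
static size_t reg_count(uint32_t start, uint32_t end)
{
	assert(end >= start && (start % 4) == 0);
	return ((size_t)(end - start) + 1) / sizeof(uint32_t);
}
```

Under that assumption, reg_count(MCHBAR_OLD_START, MCHBAR_OLD_END) is 65536 reads (256 KiB) and reg_count(MCHBAR_NEW_START, MCHBAR_NEW_END) is 4096 reads (16 KiB), so the narrowed range needs one sixteenth of the register reads.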

Tracked-On: projectacrn/acrn-hypervisor#2451
Acked-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Currently the oos memory occupies 33M, and its initialization
takes about 30ms on APL. Since PVMMIO is now used instead of
oos, reduce its memory size. Also change kzalloc to kmalloc,
as the oos page will be read first anyway. With this, the setup
time can be reduced to 3ms.

Tracked-On: projectacrn/acrn-hypervisor#2451
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
!!! This is only for the build purpose.

This is aligned with ww06.5

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
At the same time the kernel_config_sos is copied to kernel_config_uefi_sos
!!! Don't add it into PR request. This is for the test purpose

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
!!! This is only for the test purpose

Tracked-on: projectacrn/acrn-hypervisor#1723
The USB XHCI drivers need to be built-in in order to boot to a root
filesystem on a USB Mass Storage Device.

Change-Id: Ida1281db45ba30c4cff7ed2ad1768f65e22c5549
Tracked-on: projectacrn/acrn-hypervisor#1756
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
This patch adds support to use x2APIC in UOS when ACRN exposes x2APIC
to guests and supports x2APIC MSR emulation.

On older versions of ACRN that do not expose x2APIC in cpuid,
this patch does not do anything.

Tracked-On: projectacrn/acrn-hypervisor#1717
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
The SOS config already enables vbs_k_audio, and the boot option from
the launch uos script uses virtio for the audio device.
This patch enables the virtio audio frontend in the uos config,
and also enables its driver.

Tracked-On: projectacrn/acrn-hypervisor#2374
Signed-off-by: Wei Liu <weix.w.liu@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Add back the CONFIG_DRM_FBDEV_EMULATION. Otherwise the boot phase
won't detect/configure the connected outputs.
In commit "drm/i915/gvt: force to active the high-performance mode
during vGPU busy", the max GPU freq is set when there is a GVT workload,
but on the BXT platform the max GPU freq impacts CPU performance.
To strike a balance between GPU and CPU, we hardcode it to 600 MHz on the
BXT platform.
Also, this patch no longer disables the rps interrupt, so that if the
workload is heavier, the GPU freq can be adjusted higher.

Tracked-On: projectacrn/acrn-hypervisor#2537
Signed-off-by: Min He <min.he@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
This patch fixes two used-uninitialized warnings in acrn_hvlog.c and
acrn_trace.c.

Tracked-On: projectacrn/acrn-hypervisor#2588
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
The gpio virtio frontend driver implements a virtual GPIO controller
based on virtio framework to access GPIOs under virtualization.

It needs a backend service in the device-model on service OS side
to make it work. The backend service will emulate most of GPIO's
capabilities including set and get value, set and get direction, set
configuration, and GPIO IRQ function.

The backend service is available in the ACRN device-model on GitHub.
For more information, please refer to https://projectacrn.org

The ACRN virtio sub device id for GPIO is 0x8609.

Tracked-On: projectacrn/acrn-hypervisor#2512
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
Add CONFIG_GPIO_VIRTIO=y in kernel config to enable gpio virtio
frontend driver.

Tracked-On: projectacrn/acrn-hypervisor#2512
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
UP2 and APL-NUC use realtek NIC, enabling corresponding driver for it.

Tracked-On: projectacrn/acrn-hypervisor#2643
Signed-off-by: Tw <wei.tan@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Need to release the lock on timeout.
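
The actual change is not shown on this page, so the following is only a user-space illustration of the bug class named above: a wait that times out must still release the lock it holds. It uses POSIX threads, where pthread_cond_timedwait re-acquires the mutex before returning even on ETIMEDOUT; struct resp_waiter and wait_for_response are hypothetical names and are not taken from the hyper_dmabuf or virtio backend code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical waiter for a response from another thread. */
struct resp_waiter {
	pthread_mutex_t lock;
	pthread_cond_t done;
	bool ready;		/* set by the responder, guarded by lock */
};

/* Returns 0 when a response arrived, or a negative errno such as -ETIMEDOUT. */
static int wait_for_response(struct resp_waiter *w, long timeout_ms)
{
	struct timespec deadline;
	int ret = 0;

	assert(w != NULL && timeout_ms >= 0);

	/* Build an absolute deadline, as pthread_cond_timedwait expects. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w->lock);
	while (!w->ready && ret == 0)
		ret = pthread_cond_timedwait(&w->done, &w->lock, &deadline);

	if (w->ready) {
		/* A response arrived, possibly together with the timeout. */
		w->ready = false;
		ret = 0;
	}

	/*
	 * Single exit path: the mutex is held here on success and on
	 * timeout alike, so it must be released in both cases. Returning
	 * early on timeout without this unlock would block the next caller.
	 */
	pthread_mutex_unlock(&w->lock);
	return -ret;
}
```

Routing success and timeout through one unlock makes it hard to leave the lock held on the error path, which is the failure the description warns about.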
xinyunliu pushed a commit that referenced this pull request Sep 10, 2019
commit d0a255e upstream.

A deadlock with this stacktrace was observed.

The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.

In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

  PID: 14127  TASK: ffff881455749c00  CPU: 11  COMMAND: "loop1"
   #0 [ffff88272f5af228] __schedule at ffffffff8173f405
   #1 [ffff88272f5af280] schedule at ffffffff8173fa27
   #2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
   #3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
   #4 [ffff88272f5af330] mutex_lock at ffffffff81742133
   #5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
   #6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
   #7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
   #8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
   #9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
  #10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
  #11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
  #12 [ffff88272f5af760] new_slab at ffffffff811f4523
  #13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
  #14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
  #15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
  #16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
  #17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
  #18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
  #19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
  #20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
  #21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
  #22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
  #23 [ffff88272f5afec0] kthread at ffffffff810a8428
  #24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xinyunliu pushed a commit that referenced this pull request Oct 15, 2019
commit cf3591e upstream.

Revert the commit bd293d0. The proper
fix has been made available with commit d0a255e ("loop: set
PF_MEMALLOC_NOIO for the worker thread").

Note that the fix offered by commit bd293d0 doesn't really prevent
the deadlock from occurring - if we look at the stacktrace reported by
Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex -
i.e. it has already successfully taken the mutex. Changing the mutex
from mutex_lock to mutex_trylock won't help with deadlocks that happen
afterwards.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd293d0 ("dm bufio: fix deadlock with loop device")
Depends-on: d0a255e ("loop: set PF_MEMALLOC_NOIO for the worker thread")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>