
Commit 3ad11d7

Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:

 - Series of merge handling cleanups (Baolin, Christoph)
 - Series of blk-throttle fixes and cleanups (Baolin)
 - Series cleaning up BDI, separating the block device from the
   backing_dev_info (Christoph)
 - Removal of bdget() as a generic API (Christoph)
 - Removal of blkdev_get() as a generic API (Christoph)
 - Cleanup of is-partition checks (Christoph)
 - Series reworking disk revalidation (Christoph)
 - Series cleaning up bio flags (Christoph)
 - bio crypt fixes (Eric)
 - IO stats inflight tweak (Gabriel)
 - blk-mq tags fixes (Hannes)
 - Buffer invalidation fixes (Jan)
 - Allow soft limits for zone append (Johannes)
 - Shared tag set improvements (John, Kashyap)
 - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)
 - DM no-wait support (Mike, Konstantin)
 - Request allocation improvements (Ming)
 - Allow md/dm/bcache to use IO stat helpers (Song)
 - Series improving blk-iocost (Tejun)
 - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
   Xianting, Yang, Yufen, yangerkun)

* tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
  block: fix uapi blkzoned.h comments
  blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
  blk-mq: get rid of the dead flush handle code path
  block: get rid of unnecessary local variable
  block: fix comment and add lockdep assert
  blk-mq: use helper function to test hw stopped
  block: use helper function to test queue register
  block: remove redundant mq check
  block: invoke blk_mq_exit_sched no matter whether have .exit_sched
  percpu_ref: don't refer to ref->data if it isn't allocated
  block: ratelimit handle_bad_sector() message
  blk-throttle: Re-use the throtl_set_slice_end()
  blk-throttle: Open code __throtl_de/enqueue_tg()
  blk-throttle: Move service tree validation out of the throtl_rb_first()
  blk-throttle: Move the list operation after list validation
  blk-throttle: Fix IO hang for a corner case
  blk-throttle: Avoid tracking latency if low limit is invalid
  blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
  blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
  block: Remove redundant 'return' statement
  ...
2 parents 857d644 + 8858e8d commit 3ad11d7

File tree

144 files changed

+3229
-2446
lines changed


Documentation/filesystems/locking.rst

-3
@@ -488,9 +488,6 @@ getgeo: no
 swap_slot_free_notify: no (see below)
 ======================= ===================

-unlock_native_capacity and revalidate_disk are called only from
-check_disk_change().
-
 swap_slot_free_notify is called with swap_lock and sometimes the page lock
 held.

Documentation/userspace-api/ioctl/hdio.rst

+12 -12
@@ -181,7 +181,7 @@ HDIO_SET_UNMASKINTR


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 1]
 - EBUSY Controller busy
@@ -231,7 +231,7 @@ HDIO_SET_MULTCOUNT


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range supported by disk.
 - EBUSY Controller busy or blockmode already set.
@@ -295,7 +295,7 @@ HDIO_GET_IDENTITY
 the ATA specification.

 error returns:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - ENOMSG IDENTIFY DEVICE information not available

 notes:
@@ -355,7 +355,7 @@ HDIO_SET_KEEPSETTINGS


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 1]
 - EBUSY Controller busy
@@ -1055,7 +1055,7 @@ HDIO_SET_32BIT


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 3]
 - EBUSY Controller busy
@@ -1085,7 +1085,7 @@ HDIO_SET_NOWERR


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 1]
 - EBUSY Controller busy
@@ -1113,7 +1113,7 @@ HDIO_SET_DMA


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 1]
 - EBUSY Controller busy
@@ -1141,7 +1141,7 @@ HDIO_SET_PIO_MODE


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 255]
 - EBUSY Controller busy
@@ -1237,7 +1237,7 @@ HDIO_SET_WCACHE


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 1]
 - EBUSY Controller busy
@@ -1265,7 +1265,7 @@ HDIO_SET_ACOUSTIC


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 254]
 - EBUSY Controller busy
@@ -1305,7 +1305,7 @@ HDIO_SET_ADDRESS


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 2]
 - EBUSY Controller busy
@@ -1331,7 +1331,7 @@ HDIO_SET_IDE_SCSI


 error return:
-- EINVAL (bdev != bdev->bd_contains) (not sure what this means)
+- EINVAL Called on a partition instead of the whole disk device
 - EACCES Access denied: requires CAP_SYS_ADMIN
 - EINVAL value out of range [0 1]
 - EBUSY Controller busy
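
The replaced wording, `bdev != bdev->bd_contains`, is the kernel-side test for "this block device is a partition": a whole-disk device's `bd_contains` points to itself, while a partition's points to the disk that holds it. The following is a minimal userspace sketch of that check, not kernel code. `struct bdev_model`, `model_is_partition()` and `model_hdio_set_check()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct block_device used here. */
struct bdev_model {
	struct bdev_model *bd_contains;	/* whole-disk device this one belongs to */
};

/* A device is a partition when it is not its own containing disk. */
static bool model_is_partition(const struct bdev_model *bdev)
{
	assert(bdev && bdev->bd_contains);
	return bdev != bdev->bd_contains;
}

/* Mirrors the documented behaviour: the HDIO_SET_* calls reject partitions. */
static int model_hdio_set_check(const struct bdev_model *bdev)
{
	return model_is_partition(bdev) ? -EINVAL : 0;
}
```

In this model a disk whose `bd_contains` is its own address passes the check, and any device pointing at a different disk gets `-EINVAL`, which is what the new documentation text describes.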

block/Kconfig

-2
@@ -161,8 +161,6 @@ config BLK_WBT_MQ
 	depends on BLK_WBT
 	help
 	  Enable writeback throttling by default on multiqueue devices.
-	  Multiqueue currently doesn't have support for IO scheduling,
-	  enabling this option is recommended.

 config BLK_DEBUG_FS
 	bool "Block layer debugging information in debugfs"

block/bfq-iosched.c

+7 -2
@@ -4640,6 +4640,9 @@ static bool bfq_has_work(struct blk_mq_hw_ctx *hctx)
 {
 	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;

+	if (!atomic_read(&hctx->elevator_queued))
+		return false;
+
 	/*
 	 * Avoiding lock: a race on bfqd->busy_queues should cause at
 	 * most a call to dispatch for nothing
@@ -5554,6 +5557,7 @@ static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx,
 		rq = list_first_entry(list, struct request, queuelist);
 		list_del_init(&rq->queuelist);
 		bfq_insert_request(hctx, rq, at_head);
+		atomic_inc(&hctx->elevator_queued);
 	}
 }

@@ -5921,6 +5925,7 @@ static void bfq_finish_requeue_request(struct request *rq)

 		bfq_completed_request(bfqq, bfqd);
 		bfq_finish_requeue_request_body(bfqq);
+		atomic_dec(&rq->mq_hctx->elevator_queued);

 		spin_unlock_irqrestore(&bfqd->lock, flags);
 	} else {
@@ -6360,8 +6365,8 @@ static void bfq_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct blk_mq_tags *tags = hctx->sched_tags;
 	unsigned int min_shallow;

-	min_shallow = bfq_update_depths(bfqd, &tags->bitmap_tags);
-	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, min_shallow);
+	min_shallow = bfq_update_depths(bfqd, tags->bitmap_tags);
+	sbitmap_queue_min_shallow_depth(tags->bitmap_tags, min_shallow);
 }

 static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
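
The first three hunks above add a per-hardware-queue counter, `elevator_queued`, that is incremented on insert, decremented on completion, and consulted first in `bfq_has_work()`. The following is a rough userspace sketch of that pattern using C11 atomics. It assumes the intent is to let a hardware queue with nothing queued skip the scheduler-wide check. `hctx_model`, `sched_model` and the `model_*` functions are hypothetical, and the scheduler-wide state is reduced to a single `busy_queues` integer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-hardware-queue state: requests inserted but not finished. */
struct hctx_model {
	atomic_int elevator_queued;
};

/* Hypothetical scheduler-wide state shared by every hardware queue. */
struct sched_model {
	int busy_queues;
};

/* Cheap per-queue test first; only then look at the shared scheduler state. */
static bool model_has_work(struct hctx_model *hctx, const struct sched_model *sd)
{
	if (!atomic_load(&hctx->elevator_queued))
		return false;
	return sd->busy_queues > 0;
}

/* Insert path: account the request against the queue it arrived on. */
static void model_insert(struct hctx_model *hctx, struct sched_model *sd)
{
	sd->busy_queues++;
	atomic_fetch_add(&hctx->elevator_queued, 1);
}

/* Completion path: drop the per-queue count again. */
static void model_finish(struct hctx_model *hctx, struct sched_model *sd)
{
	assert(atomic_load(&hctx->elevator_queued) > 0);
	sd->busy_queues--;
	atomic_fetch_sub(&hctx->elevator_queued, 1);
}
```

With two `hctx_model` instances sharing one `sched_model`, inserting on the first makes `model_has_work()` true for the first only. The second reports no work even though the shared `busy_queues` is non-zero.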

block/bio.c

+9 -11
@@ -713,20 +713,18 @@ struct bio *bio_clone_fast(struct bio *bio, gfp_t gfp_mask, struct bio_set *bs)

 	__bio_clone_fast(b, bio);

-	bio_crypt_clone(b, bio, gfp_mask);
+	if (bio_crypt_clone(b, bio, gfp_mask) < 0)
+		goto err_put;

-	if (bio_integrity(bio)) {
-		int ret;
-
-		ret = bio_integrity_clone(b, bio, gfp_mask);
-
-		if (ret < 0) {
-			bio_put(b);
-			return NULL;
-		}
-	}
+	if (bio_integrity(bio) &&
+	    bio_integrity_clone(b, bio, gfp_mask) < 0)
+		goto err_put;

 	return b;
+
+err_put:
+	bio_put(b);
+	return NULL;
 }
 EXPORT_SYMBOL(bio_clone_fast);
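
The hunk above makes the crypt-context clone step fallible and routes both failure paths through a single `err_put` label. The following is a small userspace sketch of that single-exit cleanup pattern. The `model_*` names, the integer payloads and the failure-injection flags are all hypothetical, and `calloc`/`free` stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_allocs;	/* outstanding allocations, used to check for leaks */

static void *model_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (!p)
		return;
	assert(live_allocs > 0);
	live_allocs--;
	free(p);
}

/* Hypothetical clone target with two optional attachments. */
struct model_bio {
	int *crypt_ctx;
	int *integrity;
};

/* Copy one optional attachment; returns 0 on success, -1 on failure. */
static int model_attach(int **slot, const int *src, bool inject_failure)
{
	if (!src)
		return 0;		/* nothing to clone */
	if (inject_failure)
		return -1;		/* simulated allocation failure */
	*slot = model_alloc(sizeof(**slot));
	if (!*slot)
		return -1;
	**slot = *src;
	return 0;
}

/* Release the object together with whatever was attached so far. */
static void model_bio_put(struct model_bio *b)
{
	if (!b)
		return;
	model_free(b->crypt_ctx);
	model_free(b->integrity);
	model_free(b);
}

static struct model_bio *model_clone(const struct model_bio *src,
				     bool fail_crypt, bool fail_integrity)
{
	struct model_bio *b = model_alloc(sizeof(*b));

	if (!b)
		return NULL;

	if (model_attach(&b->crypt_ctx, src->crypt_ctx, fail_crypt) < 0)
		goto err_put;

	if (src->integrity &&
	    model_attach(&b->integrity, src->integrity, fail_integrity) < 0)
		goto err_put;

	return b;

err_put:
	/* One exit path undoes every earlier step, however far we got. */
	model_bio_put(b);
	return NULL;
}
```

Whichever step fails, `model_clone()` returns `NULL` and `live_allocs` ends up back at zero, which is the property the shared label is meant to guarantee.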

block/blk-cgroup.c

+26 -6
@@ -119,15 +119,24 @@ static void blkg_async_bio_workfn(struct work_struct *work)
 					     async_bio_work);
 	struct bio_list bios = BIO_EMPTY_LIST;
 	struct bio *bio;
+	struct blk_plug plug;
+	bool need_plug = false;

 	/* as long as there are pending bios, @blkg can't go away */
 	spin_lock_bh(&blkg->async_bio_lock);
 	bio_list_merge(&bios, &blkg->async_bios);
 	bio_list_init(&blkg->async_bios);
 	spin_unlock_bh(&blkg->async_bio_lock);

+	/* start plug only when bio_list contains at least 2 bios */
+	if (bios.head && bios.head->bi_next) {
+		need_plug = true;
+		blk_start_plug(&plug);
+	}
 	while ((bio = bio_list_pop(&bios)))
 		submit_bio(bio);
+	if (need_plug)
+		blk_finish_plug(&plug);
 }

 /**
@@ -1613,16 +1622,24 @@ static void blkcg_scale_delay(struct blkcg_gq *blkg, u64 now)
 static void blkcg_maybe_throttle_blkg(struct blkcg_gq *blkg, bool use_memdelay)
 {
 	unsigned long pflags;
+	bool clamp;
 	u64 now = ktime_to_ns(ktime_get());
 	u64 exp;
 	u64 delay_nsec = 0;
 	int tok;

 	while (blkg->parent) {
-		if (atomic_read(&blkg->use_delay)) {
+		int use_delay = atomic_read(&blkg->use_delay);
+
+		if (use_delay) {
+			u64 this_delay;
+
 			blkcg_scale_delay(blkg, now);
-			delay_nsec = max_t(u64, delay_nsec,
-					   atomic64_read(&blkg->delay_nsec));
+			this_delay = atomic64_read(&blkg->delay_nsec);
+			if (this_delay > delay_nsec) {
+				delay_nsec = this_delay;
+				clamp = use_delay > 0;
+			}
 		}
 		blkg = blkg->parent;
 	}
@@ -1634,10 +1651,13 @@ static void blkcg_maybe_throttle_blkg(struct blkcg_gq *blkg, bool use_memdelay)
 	 * Let's not sleep for all eternity if we've amassed a huge delay.
 	 * Swapping or metadata IO can accumulate 10's of seconds worth of
 	 * delay, and we want userspace to be able to do _something_ so cap the
-	 * delays at 1 second. If there's 10's of seconds worth of delay then
-	 * the tasks will be delayed for 1 second for every syscall.
+	 * delays at 0.25s. If there's 10's of seconds worth of delay then the
+	 * tasks will be delayed for 0.25 second for every syscall. If
+	 * blkcg_set_delay() was used as indicated by negative use_delay, the
+	 * caller is responsible for regulating the range.
 	 */
-	delay_nsec = min_t(u64, delay_nsec, 250 * NSEC_PER_MSEC);
+	if (clamp)
+		delay_nsec = min_t(u64, delay_nsec, 250 * NSEC_PER_MSEC);

 	if (use_memdelay)
 		psi_memstall_enter(&pflags);
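
The last two hunks change how the throttle delay is chosen. The walk up the cgroup hierarchy remembers whether the largest delay came from a group with positive `use_delay`, and the 250 ms cap is applied only in that case. The following is a simplified userspace sketch of that selection logic. `struct blkg_model` and `model_effective_delay()` are hypothetical, the atomics are replaced by plain fields, and the `blkcg_scale_delay()` decay step is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_MSEC		1000000ULL
#define MAX_CLAMPED_DELAY_NS	(250 * NSEC_PER_MSEC)

/* Hypothetical, non-atomic stand-in for the fields read by the kernel loop. */
struct blkg_model {
	struct blkg_model *parent;
	int use_delay;		/* 0: no delay; > 0: clamped; < 0: set via blkcg_set_delay() */
	uint64_t delay_nsec;
};

/* Pick the largest delay on the path to the root, clamping only if allowed. */
static uint64_t model_effective_delay(const struct blkg_model *blkg)
{
	uint64_t delay_nsec = 0;
	bool clamp = false;

	assert(blkg != NULL);

	/* As in the kernel loop, the root group (no parent) is not consulted. */
	while (blkg->parent) {
		if (blkg->use_delay && blkg->delay_nsec > delay_nsec) {
			delay_nsec = blkg->delay_nsec;
			clamp = blkg->use_delay > 0;
		}
		blkg = blkg->parent;
	}

	if (clamp && delay_nsec > MAX_CLAMPED_DELAY_NS)
		delay_nsec = MAX_CLAMPED_DELAY_NS;

	return delay_nsec;
}
```

A group carrying a 2 s delay with positive `use_delay` yields 250 ms in this model, while the same delay with negative `use_delay` is returned unchanged, matching the updated comment that the caller of `blkcg_set_delay()` regulates the range itself.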
