Commit 7102174

onebox-li authored and FMX committed
[CELEBORN-1759] Fix reserve slots might lost partition location between 0.4 client and 0.5 server
### What changes were proposed in this pull request?

Fix the worker's `ReserveSlots` parsing logic for compatibility.

### Why are the changes needed?

When upgrading to 0.5, a 0.4 client may reserve slots on a 0.5 worker. If the request contains only replica locations, the worker parses it incorrectly: the reservation actually fails, but the worker returns success to the client. The worker log shows "Reserved 0 primary location and 0 replica location".

### Does this PR introduce _any_ user-facing change?

When upgrading from 0.4 to 0.5, this fixes a potential reserve-slots failure (requests with only replica locations).

### How was this patch tested?

Manual test.

Closes #2968 from onebox-li/fix-reserve-compatibility.

Authored-by: onebox-li <[email protected]>
Signed-off-by: mingji <[email protected]>
1 parent 3dd810c commit 7102174

1 file changed: +1 -1 lines changed

common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala

Lines changed: 1 addition & 1 deletion
@@ -1271,7 +1271,7 @@ object ControlMessages extends Logging {
       val pbReserveSlots = PbReserveSlots.parseFrom(message.getPayload)
       val userIdentifier = PbSerDeUtils.fromPbUserIdentifier(pbReserveSlots.getUserIdentifier)
       val (primaryLocations, replicateLocations) =
-        if (pbReserveSlots.getPrimaryLocationsList.isEmpty) {
+        if (pbReserveSlots.getPrimaryLocationsList.isEmpty && pbReserveSlots.getReplicaLocationsList.isEmpty) {
           PbSerDeUtils.fromPbPackedPartitionLocationsPair(
             pbReserveSlots.getPartitionLocationsPair)
         } else {
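
To illustrate why the extra `isEmpty` check matters, here is a minimal sketch of the branch selection. It is not Celeborn's code: `ReserveSlotsMsg`, `parseOld`, and `parseFixed` are hypothetical stand-ins for `PbReserveSlots` and the parsing logic, and it assumes that the `else` branch (truncated in the hunk above) reads the unpacked primary and replica lists, and that a 0.4 client leaves the packed pair empty.

```scala
// Hypothetical, simplified stand-in for PbReserveSlots (not the real protobuf class).
final case class ReserveSlotsMsg(
    primaryLocations: List[String], // unpacked list, assumed to be what a 0.4 client fills
    replicaLocations: List[String], // unpacked list, assumed to be what a 0.4 client fills
    packedPair: (List[String], List[String])) // packed form read by the first branch

object ReserveSlotsCompat {
  // Check before the fix: only the primary list selects the format,
  // so a replica-only request falls into the packed branch.
  def parseOld(m: ReserveSlotsMsg): (List[String], List[String]) =
    if (m.primaryLocations.isEmpty) m.packedPair
    else (m.primaryLocations, m.replicaLocations)

  // Check after the fix: the packed branch is taken only when
  // both unpacked lists are empty.
  def parseFixed(m: ReserveSlotsMsg): (List[String], List[String]) =
    if (m.primaryLocations.isEmpty && m.replicaLocations.isEmpty) m.packedPair
    else (m.primaryLocations, m.replicaLocations)
}
```

Under these assumptions, a replica-only request such as `ReserveSlotsMsg(Nil, List("replica-0"), (Nil, Nil))` yields two empty lists from `parseOld`, which would match the "Reserved 0 primary location and 0 replica location" log, while `parseFixed` keeps the replica location.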
