Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Make empty ranges compare equal #32348

Merged
merged 9 commits into from
May 4, 2020
Merged

Conversation

sostock
Copy link
Contributor

@sostock sostock commented Jun 17, 2019

Fixes #29421

This PR makes empty ranges compare equal, i.e., 1:0 == 2:1.

I only changed the implementation of == since isequal does not have specialized methods for ranges and falls back to the AbstractArray implementation, which already shows the desired behavior. Of course, specialized isequal methods might still be desirable for better performance. (Should I add them? I’m not sure anyone uses isequal to compare ranges, so it might not be worth it.)

As discussed in #29421, PkgEval should be run to make sure that this does not break any package. Interestingly, it actually broke Base.split (also fixed here), because the implementation compared the result of some findfirst/findnext calls to 0:-1 (which seems to be a relic from the time before these functions returned nothing).

Should I add a NEWS.md entry for this?

Copy link
Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you! I did a cursory search for both 0:-1 and 1:0 literals in base and the stdlibs — some more candidates to examine include:

stdlib/Dates/test/ranges.jl
38:                @test sortperm(dr) == 1:1:0
96:                @test sortperm(dr) == 1:1:0
156:                    @test sortperm(dr) == 1:1:0
214:                    @test sortperm(dr) == 1:1:0

stdlib/REPL/test/replcompletions.jl
924:    @test r == 1:0

The above tests should probably use ===.

base/strings/util.jl Show resolved Hide resolved
@mbauman mbauman added arrays [a, r, r, a, y, s] minor change Marginal behavior change acceptable for a minor release labels Jun 17, 2019
@sostock
Copy link
Contributor Author

sostock commented Jun 18, 2019

stdlib/Dates/test/ranges.jl
38:                @test sortperm(dr) == 1:1:0
96:                @test sortperm(dr) == 1:1:0
156:                    @test sortperm(dr) == 1:1:0
214:                    @test sortperm(dr) == 1:1:0

stdlib/REPL/test/replcompletions.jl
924:    @test r == 1:0

The above tests should probably use ===.

There are some cases where it is definitely necessary to check that the ranges as well as their startpoints are equal (like searchsorted, where the startpoint of the range carries information), but I’m not sure this is required in all cases. I would say that for sortperm, any empty range would be okay to return, so @test sortperm(dr) == 1:1:0 could also be replaced with @test isempty(sortperm(dr)). What are your thoughts on this?

Edit: there is a precedent for using == for non-empty and isempty for empty ranges in the tests:

https://github.com/sostock/julia/blob/6ee12e0b49b19aa636df3389d9dd062e200f7e1f/test/ranges.jl#L315-L320

@sostock
Copy link
Contributor Author

sostock commented Jun 19, 2019

I just tentatively replaced all comparison between empty ranges by === in test/ranges.jl, and it seems that === is indeed too restrictive in some cases, for example:

https://github.com/sostock/julia/blob/6ee12e0b49b19aa636df3389d9dd062e200f7e1f/test/ranges.jl#L1022

Test Failed at /home/sebastian/Development/julia/usr/share/julia/test/ranges.jl:1022
  Expression: -r === mr
   Evaluated: -1.0:26.0:-27.0 === -1.0:26.0:-27.0

So we could either say that == (or isempty, if applicable) is good enough in these cases, or we could write a function that compares first, last and step (i.e., the “old“ way of comparing ranges) for these tests.

In my opinion, it would suffice to change the comparison to === for some functions like searchsorted, and leave it unchanged elsewhere.

@KristofferC
Copy link
Member

Sorry...

image

@sostock
Copy link
Contributor Author

sostock commented Jun 20, 2019

I changed all tests that I found which compare empty ranges to use ===, with the following exceptions:

  • searchsorted does not always return a UnitRange{Int}. For example, searchsorted(1:5, BigInt(3)) returns a UnitRange{BigInt}). Since === does not work for BigInts, I wrote a function ==ᵣ to compare those ranges:

    julia/test/sorting.jl

    Lines 38 to 64 in aa60460

    # Compare ranges by comparing their `first` and `last` elements and their `length`. This
    # returns `false` if empty ranges have different startpoints, which is relevant for
    # `searchsorted`
    ==(r::AbstractRange, s::AbstractRange) =
    (first(r) == first(s)) & (length(r) == length(s)) & (last(r) == last(s))
    @testset "searchsorted" begin
    numTypes = [ Int8, Int16, Int32, Int64, Int128,
    UInt8, UInt16, UInt32, UInt64, UInt128,
    Float16, Float32, Float64, BigInt, BigFloat]
    @test searchsorted([1:10;], 1, by=(x -> x >= 5)) == 1:4
    @test searchsorted([1:10;], 10, by=(x -> x >= 5)) == 5:10
    @test searchsorted([1:5; 1:5; 1:5], 1, 6, 10, Forward) == 6:6
    @test searchsorted(fill(1, 15), 1, 6, 10, Forward) == 6:10
    for R in numTypes, T in numTypes
    @test searchsorted(R[1, 1, 2, 2, 3, 3], T(0)) === 1:0
    @test searchsorted(R[1, 1, 2, 2, 3, 3], T(1)) == 1:2
    @test searchsorted(R[1, 1, 2, 2, 3, 3], T(2)) == 3:4
    @test searchsorted(R[1, 1, 2, 2, 3, 3], T(4)) === 7:6
    @test searchsorted(R[1, 1, 2, 2, 3, 3], 2.5) === 5:4
    @test searchsorted(1:3, T(0)) ==1:0
    @test searchsorted(1:3, T(1)) == 1:1
    @test searchsorted(1:3, T(2)) == 2:2
    @test searchsorted(1:3, T(4)) ==4:3
  • test_range_identity is called with empty ranges and does not work with === (see my comment above), so I did not change it. I can change it to use ==ᵣ as well if requested.

    julia/test/ranges.jl

    Lines 1021 to 1051 in aa60460

    function test_range_identity(r::AbstractRange{T}, mr) where T
    @test -r == mr
    @test -Vector(r) == Vector(mr)
    @test isa(-r, typeof(r))
    @test broadcast(+, broadcast(+, 1, r), -1) == r
    @test 1 .+ Vector(r) == Vector(1 .+ r) == Vector(r .+ 1)
    @test isa(broadcast(+, broadcast(+, 1, r), -1), typeof(r))
    @test broadcast(-, broadcast(-, 1, r), 1) == mr
    @test 1 .- Vector(r) == Vector(1 .- r) == Vector(1 .+ mr)
    @test Vector(r) .- 1 == Vector(r .- 1) == -Vector(mr .+ 1)
    @test isa(broadcast(-, broadcast(-, 1, r), 1), typeof(r))
    @test 1 * r * 1 == r
    @test 2 * r * T(0.5) == r
    @test isa(1 * r * 1, typeof(r))
    @test r / 1 == r
    @test r / 2 * 2 == r
    @test r / T(0.5) * T(0.5) == r
    @test isa(r / 1, typeof(r))
    @test (2 * Vector(r) == Vector(r * 2) == Vector(2 * r) ==
    Vector(r * T(2.0)) == Vector(T(2.0) * r) ==
    Vector(r / T(0.5)) == -Vector(mr * T(2.0)))
    end
    test_range_identity(range(1.0, stop=27.0, length=10), range(-1.0, stop=-27.0, length=10))
    test_range_identity(range(1f0, stop=27f0, length=10), range(-1f0, stop=-27f0, length=10))
    test_range_identity(range(1.0, stop=27.0, length=0), range(-1.0, stop=-27.0, length=0))
    test_range_identity(range(1f0, stop=27f0, length=0), range(-1f0, stop=-27f0, length=0))

Edit: Apparently some of the new === tests do not work on 32-bit systems.

Edit 2: To fix the tests on 32-bit systems, I used the ==ᵣ function also for the Dates/ranges and REPL tests.

@StefanKarpinski
Copy link
Member

Can you tell if the failures are related now or unrelated?

@sostock
Copy link
Contributor Author

sostock commented Jul 1, 2019

The failures should be unrelated. I think they are fixed by #32412, so they should pass after running them again.

@mbauman
Copy link
Member

mbauman commented Jul 1, 2019

The failures are on both Windows buildbots and both:

Error in testset download:
Test Failed at C:\cygwin\home\Administrator\buildbot\worker-tabularasa\tester_win64\build\share\julia\test\download.jl:27
  Expression: m.captures[1] == "https://httpbin.org/get?test='^'"
   Evaluated: "https://httpbin.org/get?test=^" == "https://httpbin.org/get?test='^'"

The AppVeyor build passed, so even if there's range-y things going on in the processing of those strings, I have a hard time rationalizing how this change would lead to that sort of buildbot-only failure. I'll try rebuilding the buildbots. Perhaps httpbin.org was in a strange way for a moment.

@mbauman
Copy link
Member

mbauman commented Jul 1, 2019

Ah, since I only re-ran the tester bots it didn't rebuild with the patch from #32412. That's a good hypothesis and makes sense. The last things that are needed here are a NEWS.md note and (ideally) a PkgEval run.

Thanks so much for tackling this, @sostock! It's a great first PR. :)

@mbauman mbauman added needs news A NEWS entry is required for this change needs pkgeval Tests for all registered packages should be run with this change labels Jul 1, 2019
@sostock
Copy link
Contributor Author

sostock commented Jul 1, 2019

Thanks, @mbauman! I just added a NEWS.md item.

@StefanKarpinski
Copy link
Member

Indeed, this is a very impressive and greatly appreciated first PR. Thanks for sticking with it so long!

@mbauman mbauman removed the needs news A NEWS entry is required for this change label Jul 1, 2019
Copy link
Contributor Author

@sostock sostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

julia/test/sorting.jl

Lines 61 to 64 in 548b667

@test searchsorted(1:3, T(0)) ==1:0
@test searchsorted(1:3, T(1)) == 1:1
@test searchsorted(1:3, T(2)) == 2:2
@test searchsorted(1:3, T(4)) ==4:3

Note that the need for ==ᵣ in this case arises from the fact that the returntype of searchsorted depends on the type of the value being searched. There is an issue for that: #32568. Once that is resolved, the searchsorted tests can be changed to use === instead of ==ᵣ.

Edit: All tests for empty ranges now use ===.

@rfourquet
Copy link
Member

@nanosoldier runtests(ALL; vs =":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here. cc @maleadt

@mbauman
Copy link
Member

mbauman commented Apr 28, 2020

Essentially all the package test failures were due to transient download/connectivity errors. There's one, though that looks like a real failure:

StrRegex.jl:

rsplit/split: Test Failed at /home/pkgeval/.julia/packages/StrRegex/yt6hZ/test/runtests.jl:221
  Expression: split(abc, R"") == ["a", "b", "c"]
   Evaluated: SubString{String}["abc"] == ["a", "b", "c"]
...

That sure looks like the result of comparing an empty range, but I can't find it directly in that package.

@sostock
Copy link
Contributor Author

sostock commented Apr 28, 2020

The problem is __split from StrBase.jl:
https://github.com/JuliaString/StrBase.jl/blob/3c589c480deb1212a8586c25ddef1aa0fa099fec/src/util.jl#L66-L85

The result of find is compared to 0:-1 to see whether there were no matches.

@rfourquet
Copy link
Member

What is the state of this PR? There is one day left before feature freeze for 1.5, I'm willing to review and merge if it's ready. Or does it need to go through triage?

@sostock
Copy link
Contributor Author

sostock commented May 3, 2020

I think the PR is ok, but it breaks StrBase.jl, so I’m not sure how to proceed. I guess someone has to decide whether this is appropriate for a minor release. Technically, it would have to wait until 2.0.

@rfourquet
Copy link
Member

but it breaks StrBase.jl, so I’m not sure how to proceed

If I understood correctly, it seems like it would be fixed relatively easily, no? If so, I think we can move ahead. Triage already decided in favor of the change in (cf. the linked issue), and I will merge later today if no objections.
Would you be willing to make a PR to StrBase.jl to fix the issue?

@rfourquet rfourquet removed the needs pkgeval Tests for all registered packages should be run with this change label May 4, 2020
@sostock
Copy link
Contributor Author

sostock commented May 4, 2020

Would you be willing to make a PR to StrBase.jl to fix the issue?

Yes, I can make a PR to StrBase.jl.

@rfourquet rfourquet merged commit a6e27b3 into JuliaLang:master May 4, 2020
@rfourquet
Copy link
Member

@sostock thanks again for this great PR!

@mbauman
Copy link
Member

mbauman commented May 5, 2020

One interesting (apparently untested) fallout that didn't dawn on me until just now: empty OffsetArrays now compare equal where they would have been distinct before; OffsetArray([],-1) == OffsetArray([], -2) was false previously and is now true. Cf. JuliaArrays/OffsetArrays.jl#111.

@StefanKarpinski
Copy link
Member

Is that good or bad?

@mbauman
Copy link
Member

mbauman commented May 5, 2020

I'm not sure.

@timholy
Copy link
Member

timholy commented May 5, 2020

Me neither.

This is the API-designer's equivalent of "what's the sound of one hand clapping?" at its finest. And here the 20-something me smugly thought he had avoided squishy things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] minor change Marginal behavior change acceptable for a minor release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Equality of empty ranges
7 participants