log1p fails on MtlArray{Float32} #234

Closed
sotlampr opened this issue Aug 11, 2023 · 10 comments

@sotlampr
Contributor

The following broadcast raises InvalidIRError [...] Reason: unsupported unsupported use of double value:

log1p.(MtlArray([1.0f0]))

The Julia Base implementation of log_proc2(::Float32) performs some of its internal computation in Float64 and then downcasts the result to Float32. Is this a known bug, and are there any ideas for mitigating it?

Stack trace
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::Metal.mtlKernelContext, ::MtlDeviceVector{Float32, 1}, ::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(log1p), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] Float64
   @ ./float.jl:261
 [2] log_proc2 (repeats 2 times)
   @ ./special/log.jl:248
 [3] log1p
   @ ./special/log.jl:381
 [4] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [5] _broadcast_getindex
   @ ./broadcast.jl:656
 [6] getindex
   @ ./broadcast.jl:610
 [7] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
  [1] Float64
    @ ./float.jl:261
  [2] convert
    @ ./number.jl:7
  [3] _promote
    @ ./promotion.jl:358
  [4] promote
    @ ./promotion.jl:381
  [5] +
    @ ./promotion.jl:410
  [6] log_proc2 (repeats 2 times)
    @ ./special/log.jl:248
  [7] log1p
    @ ./special/log.jl:381
  [8] _broadcast_getindex_evalf
    @ ./broadcast.jl:683
  [9] _broadcast_getindex
    @ ./broadcast.jl:656
 [10] getindex
    @ ./broadcast.jl:610
 [11] broadcast_kernel
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] +
   @ ./float.jl:408
 [2] +
   @ ./promotion.jl:410
 [3] log_proc2 (repeats 2 times)
   @ ./special/log.jl:248
 [4] log1p
   @ ./special/log.jl:381
 [5] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [6] _broadcast_getindex
   @ ./broadcast.jl:656
 [7] getindex
   @ ./broadcast.jl:610
 [8] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] /
   @ ./float.jl:411
 [2] log_proc2 (repeats 2 times)
   @ ./special/log.jl:248
 [3] log1p
   @ ./special/log.jl:381
 [4] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [5] _broadcast_getindex
   @ ./broadcast.jl:656
 [6] getindex
   @ ./broadcast.jl:610
 [7] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] Float32
   @ ./float.jl:258
 [2] log_proc2
   @ ./special/log.jl:249
 [3] log_proc2
   @ ./special/log.jl:248
 [4] log1p
   @ ./special/log.jl:381
 [5] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [6] _broadcast_getindex
   @ ./broadcast.jl:656
 [7] getindex
   @ ./broadcast.jl:610
 [8] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
  [1] Float64
    @ ./float.jl:261
  [2] convert
    @ ./number.jl:7
  [3] _promote
    @ ./promotion.jl:358
  [4] promote
    @ ./promotion.jl:381
  [5] +
    @ ./promotion.jl:410
  [6] log_proc2
    @ ./special/log.jl:260
  [7] log_proc2
    @ ./special/log.jl:248
  [8] log1p
    @ ./special/log.jl:381
  [9] _broadcast_getindex_evalf
    @ ./broadcast.jl:683
 [10] _broadcast_getindex
    @ ./broadcast.jl:656
 [11] getindex
    @ ./broadcast.jl:610
 [12] broadcast_kernel
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] +
   @ ./float.jl:408
 [2] +
   @ ./promotion.jl:410
 [3] log_proc2
   @ ./special/log.jl:260
 [4] log_proc2
   @ ./special/log.jl:248
 [5] log1p
   @ ./special/log.jl:381
 [6] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [7] _broadcast_getindex
   @ ./broadcast.jl:656
 [8] getindex
   @ ./broadcast.jl:610
 [9] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] Float32
   @ ./float.jl:258
 [2] log_proc2
   @ ./special/log.jl:260
 [3] log_proc2
   @ ./special/log.jl:248
 [4] log1p
   @ ./special/log.jl:381
 [5] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [6] _broadcast_getindex
   @ ./broadcast.jl:656
 [7] getindex
   @ ./broadcast.jl:610
 [8] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
  [1] Float64
    @ ./float.jl:261
  [2] convert
    @ ./number.jl:7
  [3] _promote
    @ ./promotion.jl:358
  [4] promote
    @ ./promotion.jl:381
  [5] *
    @ ./promotion.jl:411
  [6] log_proc1
    @ ./special/log.jl:227
  [7] log_proc1
    @ ./special/log.jl:223
  [8] log1p
    @ ./special/log.jl:396
  [9] _broadcast_getindex_evalf
    @ ./broadcast.jl:683
 [10] _broadcast_getindex
    @ ./broadcast.jl:656
 [11] getindex
    @ ./broadcast.jl:610
 [12] broadcast_kernel
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] *
   @ ./float.jl:410
 [2] *
   @ ./promotion.jl:411
 [3] log_proc1
   @ ./special/log.jl:227
 [4] log_proc1
   @ ./special/log.jl:223
 [5] log1p
   @ ./special/log.jl:396
 [6] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [7] _broadcast_getindex
   @ ./broadcast.jl:656
 [8] getindex
   @ ./broadcast.jl:610
 [9] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] +
   @ ./float.jl:408
 [2] log_proc1
   @ ./special/log.jl:227
 [3] log_proc1
   @ ./special/log.jl:223
 [4] log1p
   @ ./special/log.jl:396
 [5] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [6] _broadcast_getindex
   @ ./broadcast.jl:656
 [7] getindex
   @ ./broadcast.jl:610
 [8] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] +
   @ ./float.jl:408
 [2] log_proc1
   @ ./special/log.jl:227
 [3] log_proc1
   @ ./special/log.jl:223
 [4] log1p
   @ ./special/log.jl:396
 [5] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [6] _broadcast_getindex
   @ ./broadcast.jl:656
 [7] getindex
   @ ./broadcast.jl:610
 [8] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
  [1] Float64
    @ ./float.jl:261
  [2] convert
    @ ./number.jl:7
  [3] _promote
    @ ./promotion.jl:358
  [4] promote
    @ ./promotion.jl:381
  [5] +
    @ ./promotion.jl:410
  [6] log_proc1
    @ ./special/log.jl:241
  [7] log_proc1
    @ ./special/log.jl:223
  [8] log1p
    @ ./special/log.jl:396
  [9] _broadcast_getindex_evalf
    @ ./broadcast.jl:683
 [10] _broadcast_getindex
    @ ./broadcast.jl:656
 [11] getindex
    @ ./broadcast.jl:610
 [12] broadcast_kernel
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] +
   @ ./float.jl:408
 [2] +
   @ ./promotion.jl:410
 [3] log_proc1
   @ ./special/log.jl:241
 [4] log_proc1
   @ ./special/log.jl:223
 [5] log1p
   @ ./special/log.jl:396
 [6] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [7] _broadcast_getindex
   @ ./broadcast.jl:656
 [8] getindex
   @ ./broadcast.jl:610
 [9] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Reason: unsupported unsupported use of double value
Stacktrace:
 [1] Float32
   @ ./float.jl:258
 [2] log_proc1
   @ ./special/log.jl:241
 [3] log_proc1
   @ ./special/log.jl:223
 [4] log1p
   @ ./special/log.jl:396
 [5] _broadcast_getindex_evalf
   @ ./broadcast.jl:683
 [6] _broadcast_getindex
   @ ./broadcast.jl:656
 [7] getindex
   @ ./broadcast.jl:610
 [8] broadcast_kernel
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:59
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/validation.jl:149
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:415 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:414 [inlined]
  [5] emit_llvm(job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/utils.jl:89
  [6] emit_llvm
    @ ~/.julia/packages/GPUCompiler/YO8Uj/src/utils.jl:83 [inlined]
  [7] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:129
  [8] codegen
    @ ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:110 [inlined]
  [9] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:106
 [10] compile
    @ ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:98 [inlined]
 [11] #51
    @ ~/.julia/packages/Metal/qeZqc/src/compiler/compilation.jl:57 [inlined]
 [12] JuliaContext(f::Metal.var"#51#52"{GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/driver.jl:47
 [13] compile(job::GPUCompiler.CompilerJob)
    @ Metal ~/.julia/packages/Metal/qeZqc/src/compiler/compilation.jl:56
 [14] actual_compilation(cache::Dict{Any, Any}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, compiler::typeof(Metal.compile), linker::typeof(Metal.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/execution.jl:125
 [15] cached_compilation(cache::Dict{Any, Any}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/YO8Uj/src/execution.jl:103
 [16] macro expansion
    @ ~/.julia/packages/Metal/qeZqc/src/compiler/execution.jl:162 [inlined]
 [17] macro expansion
    @ ./lock.jl:267 [inlined]
 [18] mtlfunction(f::GPUArrays.var"#broadcast_kernel#26", tt::Type{Tuple{Metal.mtlKernelContext, MtlDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(log1p), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Metal ~/.julia/packages/Metal/qeZqc/src/compiler/execution.jl:157
 [19] mtlfunction
    @ ~/.julia/packages/Metal/qeZqc/src/compiler/execution.jl:155 [inlined]
 [20] macro expansion
    @ ~/.julia/packages/Metal/qeZqc/src/compiler/execution.jl:77 [inlined]
 [21] #launch_heuristic#98
    @ ~/.julia/packages/Metal/qeZqc/src/gpuarrays.jl:14 [inlined]
 [22] launch_heuristic
    @ ~/.julia/packages/Metal/qeZqc/src/gpuarrays.jl:12 [inlined]
 [23] _copyto!
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:65 [inlined]
 [24] copyto!
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:46 [inlined]
 [25] copy
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/broadcast.jl:37 [inlined]
 [26] materialize(bc::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(log1p), Tuple{MtlVector{Float32, Metal.MTL.MTLResourceStorageModePrivate}}})
    @ Base.Broadcast ./broadcast.jl:873
 [27] top-level scope
    @ REPL[38]:1
 [28] top-level scope
    @ ~/.julia/packages/Metal/qeZqc/src/initialization.jl:51

Julia version: 1.9.2
Metal version: 0.5.0
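
For readers wondering how a Float32 input ends up producing double instructions: the `promote` and `+` frames in the traces above are ordinary Julia promotion. The snippet below is only a CPU-side illustration of that mechanism, not the code in special/log.jl, and the variable names are made up.

```julia
# CPU-only illustration of Float32 -> Float64 promotion (names are hypothetical).
x = 1.0f0               # Float32 input, as in the failing broadcast
wide = x + 0.5          # 0.5 is a Float64 literal, so x is promoted to Float64
narrow = Float32(wide)  # downcast back to Float32, as the issue describes

println(typeof(wide))
println(typeof(narrow))
```

Any such Float64 intermediate inside a kernel is what the Metal backend rejects, even though the input and output are both Float32.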

@maleadt
Member

maleadt commented Aug 11, 2023

> any ideas for mitigating this?

No easy ones, sorry. You could try to find a different implementation of this operation that doesn't require wider temporaries.
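
One candidate for such an implementation is the classic correction trick log1p(x) ≈ log(u) * x / (u - 1) with u = 1 + x, which stays in Float32 throughout. This is a minimal sketch, not Tang's algorithm and not what Base does. `log1p_f32` is a hypothetical name, and the sketch assumes the backend can compile `log(::Float32)`, which has not been verified here.

```julia
# Hypothetical Float32-only log1p; no Float64 temporaries are created.
function log1p_f32(x::Float32)
    u = 1.0f0 + x
    # If 1 + x rounds to exactly 1, log1p(x) equals x to working precision.
    u == 1.0f0 && return x
    # Avoid Inf / Inf = NaN in the correction factor below.
    u == Inf32 && return u
    # log(u) is the log of the *rounded* sum; x / (u - 1) corrects for
    # the rounding error committed when forming u.
    return log(u) * (x / (u - 1.0f0))
end

println(log1p_f32(1.0f-3))
println(log1p(1.0f-3))   # Base reference on the CPU
```

On the device this would presumably be broadcast as `log1p_f32.(MtlArray([1.0f0]))`. Its accuracy relative to Base has not been measured here.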

@sotlampr
Contributor Author

Thanks, @maleadt. Do you think that this will fail on CUDA/AMDGPU? I am asking because it is used by some operations in Flux, so this might be a relevant issue for them too.

@maleadt
Member

maleadt commented Aug 11, 2023

@sotlampr
Contributor Author

Do you think a patch similar to this would be appropriate?

@maleadt
Member

maleadt commented Aug 11, 2023

Yes, it would.

cc @oscardssmith (I mentioned this being a problem at JuliaCon)

@oscardssmith

How much accuracy are you willing to lose? I can probably cook up a Float32-only version that performs pretty well at the cost of slightly lower accuracy (probably in the 2-4 ULP range).

@sotlampr
Contributor Author

If the only problem is the implementation of log_proc2, there is a single-precision implementation in the original paper (pp. 385-386). But I think there is another Float64 cast in log_proc1 too.

@oscardssmith

Oh, I think it's possible that the double-precision casts are not necessary at all. The biggest difference between our algorithm and Tang's is that we add an extra multiply at the end to account for the different log bases, and that multiply may pick up a lot of error without the extended precision. It would be good to try the version that doesn't upcast, though.
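
A rough way to check what dropping the upcasts costs is to measure the error in ULPs against a Float64 reference on the CPU. The sketch below is an assumption about how one might do that: `ulp_error` is a hypothetical helper, and Base's `log1p` stands in for the candidate, which would be swapped for the Float32-only version under test.

```julia
# Hypothetical helper: error of f(x) in units of the last place (ULPs),
# measured against a Float64 reference evaluated at the same point.
function ulp_error(f, ref, x::Float32)
    exact = ref(Float64(x))    # reference value in double precision
    approx = Float64(f(x))     # candidate result, widened for comparison
    return abs(approx - exact) / Float64(eps(Float32(exact)))
end

# Sample the candidate over a range of inputs and keep the worst case.
xs = Float32.(range(-0.5, 2.0; length = 10_001))
worst = maximum(ulp_error(log1p, log1p, x) for x in xs)
println("worst observed error: ", worst, " ULP")
```

A sampled maximum like this is only a lower bound on the true worst-case error, so it can suggest but not prove that a candidate stays within a given ULP budget.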

@sotlampr
Contributor Author

I made an attempt to rewrite the functions without using double-precision floats in #236.

I tried them in a REPL and the functions seem to work on the CPU; however, using them in Metal with MtlArray does not work.

@maleadt
Member

maleadt commented Aug 15, 2023

#239

@maleadt closed this as completed Aug 15, 2023