Re-implementation of Tang (1990) log procedures in pure float32 #236

sotlampr · 2023-08-12T17:40:01Z

The Base.Math.log1p calculation relies on two procedures, log_proc1 and log_proc2, from Tang (1990). The current implementation upcasts to Float64 internally and is therefore incompatible with Metal.

This PR attempts to convert the two procedures to pure-float32.

See also the related discussion in #234.

sotlampr · 2023-08-12T17:43:08Z

src/device/intrinsics/math.jl

+    jp = unsafe_trunc(Int,128.0f0*F)-127
+
+    ## Steps 1 and 2
+    hi = t_log_Float32[jp]


Julia Base uses Float64 tables

sotlampr · 2023-08-12T17:43:55Z

src/device/intrinsics/math.jl

+    q = u*v*0.08333351f0
+
+    ## Step 4
+    logb(base)*(l + (u + q))


Similarly, logb in Base returns a Float64
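To illustrate why a Float64 constant matters here: multiplying a Float32 by a Float64 promotes the result to Float64, which is exactly the upcast this PR is trying to avoid. The sketch below is only an assumption about how a single-precision replacement might look; `logb32` is a hypothetical name, not a function in Base or in this PR.

```julia
# Hypothetical helper, not part of Base: 1/log(base) stored as Float32.
# The constants are computed once on the host, in Float64, then rounded.
const LOG2E_F32  = Float32(log2(ℯ))
const LOG10E_F32 = Float32(log10(ℯ))

logb32(::Val{2})  = LOG2E_F32
logb32(::Val{:ℯ}) = 1.0f0
logb32(::Val{10}) = LOG10E_F32

x = 0.25f0
promoted = x * 1.4426950408889634   # Float32 * Float64 promotes to Float64
pure32   = x * logb32(Val(2))       # stays in Float32
```

Here `typeof(promoted)` is `Float64` and `typeof(pure32)` is `Float32`.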

sotlampr · 2023-08-12T17:45:13Z

src/device/intrinsics/math.jl

+                    0.012512346f0)
+
+    ## Step 3
+    @inline function truncate(x)


From Tang (1990):

(1) Define M := 12 for single precision, and define M := 24 for double precision.
(2) u1 := u rounded (or truncated) to M significant bits.
(3) f1 := f rounded (or truncated) to M significant bits.

I'm not sure if this is what is meant
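One possible reading of Tang's steps for single precision, sketched here as an assumption rather than as the code in this PR: a Float32 carries 24 significant bits (1 implicit, 23 stored), so keeping M = 12 significant bits means clearing the low 12 bits of the stored fraction. The name `trunc_sig12` is hypothetical.

```julia
# Hypothetical helper: truncate a Float32 to 12 significant bits by
# zeroing the low 12 bits of the stored fraction (sign and exponent kept).
trunc_sig12(x::Float32) =
    reinterpret(Float32, reinterpret(UInt32, x) & 0xfffff000)

u  = 0.012512346f0
u₁ = trunc_sig12(u)   # leading part, representable in 12 significant bits
u₂ = u - u₁           # trailing part; this subtraction is exact
```

With this split, `u₁ + u₂ == u` holds, and products involving `u₁` have fewer significant bits to round away.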

sotlampr · 2023-08-12T17:45:40Z

src/device/intrinsics/math.jl

+      reinterpret(Float32,
+                  reinterpret(Int32, 0.012512346f0) & 0b11111111111100000000000000000000)
+    end
+    u₁ = truncate(u)


Non-upcasting alternative procedure from Tang (1990)

maleadt · 2023-08-12T18:00:36Z

cc @oscardssmith

oscardssmith · 2023-08-12T18:47:57Z

What's the error like for these implementations?

sotlampr · 2023-08-13T08:15:51Z

EDIT: These numbers are from the CPU; running on Metal gives too high an error.
This log1p (using the modified procedures) compared with Base.Math.log1p (Float32), in ULP:

max 4.0 at x = -0.099609345
mean 0.12487844518189703

This log_proc2 compared with Base.Math.log_proc2 (Float32) (ULP):

log_proc2{Float32}, base=2
max 2.0 at x = 0.043909933
mean 0.1784229523275927

log_proc2{Float32}, base=ℯ
max 1.0 at x = 0.064494334
mean 0.004462545515167737

log_proc2{Float32}, base=10
max 2.0 at x = 0.03663287
mean 0.287801184326243

sotlampr · 2023-08-13T08:21:35Z

NOTE: Running on Metal gives a very high error - something is not right

oscardssmith · 2023-08-13T15:37:57Z

Can you compare against Base.Math.log1p (Float64)? Your current comparison isn't great, since Base.Math.log1p (Float32) also has rounding error.

sotlampr · 2023-08-13T16:12:03Z

I assumed that we're interested in the difference between the upcasting float32 vs. the pure float32 implementation.

This log1p{Float32} v. Base.Math.log1p{Float64}

max 4.441346645355225 at x = -0.11138753
mean 0.17755624519091973

And for reference, Base.Math.log1p{Float32} v. Base.Math.log1p{Float64}

max 0.563554048538208 at x = 0.08200329
mean 0.11855523097385935
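For readers who want to reproduce this kind of comparison, a minimal sketch follows. It assumes the error is measured in units of the Float32 spacing at the Float64 reference value; the thread does not state the exact method used, and `ulp_error` and `ulp_stats` are hypothetical names.

```julia
# Error of a Float32 result, in units of the Float32 spacing (ULP)
# at the Float64 reference value.
ulp_error(approx::Float32, ref::Float64) =
    abs(Float64(approx) - ref) / Float64(eps(Float32(ref)))

# Scan inputs `xs`, comparing Float32 function `f` with Float64 reference `g`.
function ulp_stats(f, g, xs)
    errs = [ulp_error(f(x), g(Float64(x))) for x in xs]
    i = argmax(errs)
    return (max = errs[i], at = xs[i], mean = sum(errs) / length(errs))
end

# Example: Base's Float32 log1p against Base's Float64 log1p.
stats = ulp_stats(log1p, log1p, range(-0.1f0, 0.1f0; length = 10_000))
```

Swapping the first argument for a candidate implementation gives its maximum error, the input where it occurs, and the mean error over the sampled range.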

sotlampr · 2023-08-14T08:47:53Z

@oscardssmith do you think the error is acceptable? And most critically, running on Metal does not work - the error is way too high. Can you see any possible culprit in the code?

oscardssmith · 2023-08-14T12:21:10Z

the error is probably acceptable, but it is higher than I was expecting. I don't know why it's giving wrong answers on gpu. possibly the truncate is misbehaving?

sotlampr · 2023-08-14T12:49:47Z

The problem seems to be accessing t_log_Float32 - it returns 0

oscardssmith · 2023-08-14T12:52:06Z

now that you mention it, a table based method is probably not the best idea for a gpu.

oscardssmith · 2023-08-14T13:07:15Z

https://github.com/JuliaMath/openlibm/blob/master/src/s_log1pf.c might be a better approach.

sotlampr · 2023-08-15T09:53:58Z

Replaced by #239

Rough re-implementations of Tang log procedures in pure float32

fb94104

sotlampr mentioned this pull request Aug 12, 2023

log1p fails on MtlArray{Float32} #234

Closed

Fix error in truncate function

9d84e37

More compact truncate function

c2181a6

Align truncate function with original julia implementation

f8bd6a8

Make funcs inline and skip bounds check on proc2

cbe7f58

sotlampr marked this pull request as draft August 14, 2023 08:42

sotlampr mentioned this pull request Aug 15, 2023

Port openlibm log1pf as log1p #239

Merged

sotlampr closed this Aug 15, 2023
