Commit 9260ce4
pulley: Reimplement wasm loads/stores & memory opcodes (#10154)
* pulley: Reimplement wasm loads/stores & memory opcodes
This commit is a large refactoring to reimplement how WebAssembly
loads/stores are translated to Pulley opcodes when using the
interpreter. Additionally the functionality related to memory support
has changed quite a bit with the interpreter as well. This is all based
off comments on #10102 with the end goal of folding the two Pulley
opcodes today of "do the bounds check" and "do the load" into one
opcode. This is intended to reduce the number of opcodes and overall
improve interpreter throughput by minimizing turns of the interpreter
loop.
The basic idea behind this PR is that a new basic suite of loads/stores
is added to Pulley, each of which traps if the address is zero. This provides a
route to translate trapping loads/stores in CLIF to Pulley bytecode
without actually causing segfaults at runtime. WebAssembly translation
to CLIF is then updated to use the `select` trick for wasm loads/stores
where either 0 is loaded from or the actual address is loaded from.
Basic support for translation and such is added for this everywhere, and
this ensures that all loads/stores for wasm will be translated
successfully with Pulley.
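As a rough illustration of the `select` trick described above, the sketch below models the host address space as a byte slice and resolves a wasm access to either its real address or 0, then performs a load that traps on 0. The names `Trap`, `select_addr`, `load_u32_z`, and `wasm_load_u32` are hypothetical and are not Pulley's or Cranelift's actual API; the bounds comparison is a simplified assumption.

```rust
/// Hypothetical trap type for this sketch (not Pulley's real trap type).
#[derive(Debug, PartialEq)]
pub enum Trap {
    NullAddress,
}

/// The `select` step: yield the real host address when the access is in
/// bounds, and 0 otherwise. `base` is the host address of linear memory and
/// `bound` is its size in bytes.
pub fn select_addr(base: u64, index: u32, size: u32, bound: u64) -> u64 {
    // Widening to u64 means this addition cannot overflow.
    let end = index as u64 + size as u64;
    if end <= bound {
        base + index as u64
    } else {
        0
    }
}

/// A "trap if zero" load: address 0 never reaches memory, so an
/// out-of-bounds access becomes a trap rather than a segfault.
pub fn load_u32_z(host: &[u8], addr: u64) -> Result<u32, Trap> {
    if addr == 0 {
        return Err(Trap::NullAddress);
    }
    let a = addr as usize;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&host[a..a + 4]);
    Ok(u32::from_le_bytes(bytes))
}

/// A wasm 32-bit load expressed as "select, then trapping load".
pub fn wasm_load_u32(host: &[u8], base: u64, bound: u64, index: u32) -> Result<u32, Trap> {
    load_u32_z(host, select_addr(base, index, 4, bound))
}
```

In this model an out-of-bounds index never indexes the slice at all: the selected address is 0 and the load reports `Trap::NullAddress`.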
The next step was to extend the "g32" addressing mode preexisting in
Pulley to support a bounds check as well. New pattern-matches were added
to ISLE to search for a bounds check in the address of a trapping
load/store. If found then the entire chain of operations necessary to
compute the address are folded into a single "g32" opcode which ends up
being a fallible load/store at runtime.
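To make the folding concrete, here is a minimal sketch of what a single fused "g32" load might do at runtime: bounds check, address computation, and load in one step. The `G32` struct, its field names, and `load_u32_g32` are assumptions made for illustration and operate on plain values rather than registers; they are not the actual Pulley definitions.

```rust
/// Hypothetical trap type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Trap {
    OutOfBounds,
}

/// Operands of an illustrative "g32" access, already read out of registers.
pub struct G32 {
    /// Host address of byte 0 of linear memory.
    pub heap_base: u64,
    /// Size of linear memory in bytes.
    pub heap_bound: u64,
    /// 32-bit wasm index supplied by the guest.
    pub wasm_addr: u32,
    /// Static offset taken from the wasm instruction.
    pub offset: u32,
}

impl G32 {
    /// Fold the whole address computation, including the bounds check,
    /// into a single fallible step.
    pub fn host_addr(&self, size: u64) -> Result<u64, Trap> {
        // Both operands are 32-bit, so the u64 sums below cannot overflow.
        let start = self.wasm_addr as u64 + self.offset as u64;
        if start + size > self.heap_bound {
            return Err(Trap::OutOfBounds);
        }
        Ok(self.heap_base + start)
    }
}

/// One "opcode": the bounds check and the load happen together.
pub fn load_u32_g32(host: &[u8], addr: &G32) -> Result<u32, Trap> {
    let a = addr.host_addr(4)? as usize;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&host[a..a + 4]);
    Ok(u32::from_le_bytes(bytes))
}
```

Compared with a separate bounds-check opcode followed by a load opcode, this shape needs only one dispatch through the interpreter loop per access, which is the effect the commit is aiming for.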
To fit all this into Pulley this commit contains a number of
refactorings to shuffle around existing opcodes related to memory and
extend various pieces of functionality here and there:
* Pulley now uses `AddrFoo` types to represent addressing modes as a
single immediate rather than splitting it up into pieces for each
method. For example `AddrO32` represents "base + offset32". `AddrZ`
represents the same thing but traps if the address is zero. The
`AddrG32` mode represents a bounds-checked 32-bit linear memory access
on behalf of wasm.
* Pulley loads/stores were reduced to always using an `AddrFoo`
immediate. This means that the old `offset8` addressing mode was
removed without replacement here (to be added in the future if
necessary). Additionally the suite of sign-extension modes supported
was trimmed down to remove 8-to-64, 16-to-64, and 32-to-64 extensions
folded as part of the opcode. These can of course always be re-added
later but probably want to be added just for the `G32` addressing mode
as opposed to all addressing modes.
* The interpreter itself was refactored to have an `AddressingMode`
trait to ensure that all memory accesses, regardless of addressing
modes, are largely just copy/pastes of each other. In the future it
might make sense to implement these methods with a macro, but for now
it's copy/paste.
* In ISLE the `XLoad` generic instruction removed its `ext` field to
have extensions handled exclusively in ISLE instead of partly in
`emit.rs`.
* Float/vector loads/stores now have "g32" addressing (in addition to
the "z" that's required for wasm) since it was easy to add them.
* Translation of 1-byte accesses on Pulley from WebAssembly to CLIF no
longer has a special case for using `a >= b` instead of `a > b - 1` to
ensure that the same bounds-check instruction can be used for all
sizes of loads/stores.
* The bounds-check which folded a load-of-the-bound into the opcode is
now present as a "g32bne" addressing mode, with its own suite of
instructions to boot.
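The addressing-mode refactoring in the list above can be pictured with a small sketch: one trait resolves any mode to a host address, so a load is written once and reused across modes. The trait signature, struct fields, and `Trap` variants here are hypothetical and work on plain values instead of registers. The sketch also assumes `AddrZ` checks the base before applying the offset, which is one possible reading of "traps if the address is zero".

```rust
/// Hypothetical trap type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Trap {
    NullAddress,
    OutOfBounds,
}

/// Illustrative trait: every mode resolves to a host address or a trap.
pub trait AddressingMode {
    fn addr(&self, size: u64) -> Result<u64, Trap>;
}

/// "base + offset32", infallible.
pub struct AddrO32 { pub base: u64, pub offset: i32 }
/// Same as `AddrO32`, but traps when the base is zero (assumption).
pub struct AddrZ { pub base: u64, pub offset: i32 }
/// Bounds-checked 32-bit linear memory access on behalf of wasm.
pub struct AddrG32 { pub heap_base: u64, pub heap_bound: u64, pub wasm_addr: u32, pub offset: u16 }

impl AddressingMode for AddrO32 {
    fn addr(&self, _size: u64) -> Result<u64, Trap> {
        // Sign-extend the offset, then add with wrapping arithmetic.
        Ok(self.base.wrapping_add(self.offset as i64 as u64))
    }
}

impl AddressingMode for AddrZ {
    fn addr(&self, _size: u64) -> Result<u64, Trap> {
        if self.base == 0 {
            return Err(Trap::NullAddress);
        }
        Ok(self.base.wrapping_add(self.offset as i64 as u64))
    }
}

impl AddressingMode for AddrG32 {
    fn addr(&self, size: u64) -> Result<u64, Trap> {
        // The same comparison is used for every access size.
        let start = self.wasm_addr as u64 + self.offset as u64;
        if start + size > self.heap_bound {
            return Err(Trap::OutOfBounds);
        }
        Ok(self.heap_base + start)
    }
}

/// One load body shared by all addressing modes.
pub fn load_u32<A: AddressingMode>(host: &[u8], mode: &A) -> Result<u32, Trap> {
    let a = mode.addr(4)? as usize;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(&host[a..a + 4]);
    Ok(u32::from_le_bytes(bytes))
}
```

With this shape, adding a new mode means writing one `addr` implementation rather than a new copy of every load and store.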
Overall this PR is not a 1:1 replacement of all previous opcodes with
exactly one opcode. For example loading 8 bits sign-extended to 64-bits
is now two opcodes instead of one. Additionally some previous opcodes
have expanded in size where for example the 8-bit offset mode was removed
in favor of only having 32-bit offsets. The goal of this PR is to reboot
how memory is handled in Pulley. All loads/stores now use a specific
addressing mode and currently all operations supported across addressing
modes are consistently supported. In the future it's expected that some
features will be added to some addressing modes and not others as
necessary, for example extending the "g32" addressing mode only instead
of all addressing modes.
For an evaluation of this PR:
* Code size: `spidermonkey.cwasm` file is reduced from 19M to 16M.
* Sightglass: `pulldown-cmark` is improved by 15%
* Sightglass: `bz2` is improved by 20%
* Sightglass: `spidermonkey` is improved by 22%
* Coremark: score improved by 40%
Overall this PR and new design looks to be a large win. This is all
driven by the reduction in opcodes both for compiled code size and
execution speed by minimizing turns of the interpreter loop. In the end
I'm also pretty happy with how this turned out and I think the
refactorings are well worth it.
* Use new `is_pulley` helper more
* Improve `addrz` helper, tighten up `memory-inbounds.wat` a bit
* Improve codegen in a few `memory-inbounds.wat` cases
* Fix test expectation
1 parent 0159ff5 commit 9260ce4
36 files changed
Lines changed: 2191 additions & 1652 deletions
File tree
- cranelift
- codegen
- meta/src
- src
- filetests/filetests/isa
- pulley32
- pulley64
- crates
- cranelift/src
- translate/code_translator
- wasmtime/src/runtime/vm
- pulley
- src
- tests/all
- tests/disas/pulley