Study Next Gen Graphics API

Yaochuang edited this page May 28, 2019 · 14 revisions

Metal

  • Metal explicitly divides GPU work into three categories: Render, Compute and Copy (Blit). This maps well onto modern GPU architecture.
  • How is rendering and compute state queried? Are there explicit APIs for that?
  • Is there a Geometry Shader in Metal?
  • Metal splits rendering into two stages, Vertex and Pixel (Fragment), much as GPU hardware does: vertex space and screen space.
  • Does Metal make Compute easy to use within rendering? To be discussed case by case.

Resources

MTLBuffer

  • Stores unformatted data accessible to the GPU.
  • Metal creates a buffer with a storage option: Shared, Managed or Private. There is NO buffer usage flag in Metal. How does Metal optimize buffer performance without a usage hint? Does this mean an MTLBuffer supports every kind of access, so that you can read, write and update at will?
  • There is NO so-called "resource view" in Metal; Metal binds buffers directly to GPU slots. This seems simpler than D3D.
    • Why did Microsoft not choose this approach? Does it have a performance overhead?
    • How should the MTLBuffer address be understood? Is there any way to map it directly to a "VGA"? Or is there a "VGA" in Metal?
    • What about constant buffers in Metal? There is no explicit API for them. Does this mean a performance loss? Immutable versus mutable data is a performance concern.
    • Constant buffers and vertex buffers share the same binding API.
  • The buffer creation API can allocate new storage, reuse existing storage, or copy data into new storage. This is more flexible than the equivalents in D3D11 and OGL.
    • A texture can be created that shares a buffer's data.
  • Buffer update is easy in Metal. There is no explicit map and unmap.
    • Suppose there is a big MTLBuffer and one part of it is used by a draw call. Can another CPU thread update the other part at the same time?
  • What about "Structured Buffer" and "Byte Address Buffer" in Metal? These types are explicitly provided in D3D.
  • What about the "Unordered Access View" used by compute shaders in D3D11? Does that mean every MTLBuffer supports unordered access?

MTLHeap

  • MTLHeap and MTLDevice both have a newBufferWithLength interface.
    • Creating an MTLHeap from the device requires a descriptor; there is no such requirement for MTLBuffer.
    • What is the difference between them?
      • An MTLHeap targets one specific GPU and can be cached by the CPU. It seems to be accessible by both GPU and CPU.
      • MTLHeap is used to allocate buffers and textures, so it focuses on block memory management.
    • MTLHeap is much like a D3D11 buffer: accessible by the GPU and the CPU depending on memory usage.
    • MTLHeap can improve memory efficiency.
    • MTLResourceOptions in Metal is like memory usage in D3D11.

MTLTexture

  • Stores formatted image data accessible to the GPU. The following comment from the Metal code describes it well.

    Each image in a texture is a 1D, 2D, 2DMultisample, or 3D image. The texture contains one or more images arranged in a mipmap stack. If there are multiple mipmap stacks, each one is referred to as a slice of the texture. 1D, 2D, 2DMultisample, and 3D textures have a single slice. In 1DArray and 2DArray textures, every slice is an array element. A Cube texture always has 6 slices, one for each face. In a CubeArray texture, each set of six slices is one element in the array.

    • Created from an MTLTextureDescriptor, with new storage.
    • An MTLTexture can have an MTLBuffer behind it, so a texture view can be created over a buffer.
    • A new MTLTexture object can share storage with an existing texture while using a different pixel format.
    • Supports sharing buffers across address spaces. Is that possible in D3D11 or D3D12?
    • Framebuffer-only texture, obtained from CAMetalDrawable. It exists only in Metal; there is NO such concept in D3D. It can only be used as a render target.
    • MTLTextureUsage determines how the resource is used: read in a shader, written in a shader, or used as a render target.
    • Reading and writing an MTLTexture from the CPU: are map and unmap operations needed?
    • In Metal, a texture copy is done by a blit command. In D3D11, it is done by a single API call.
    • In Metal, creating a texture from an image on disk is done by MTKTextureLoader. So does MTKTextureLoader have to be included if another framework wants to integrate Metal?
  • 2DMultisample textures have counterparts in D3D.
    • Is it required to resolve an MSAA texture before copying it into a non-MSAA texture? How does Metal resolve MSAA: through an API, or with custom code?
    • IOSurface compatible.
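
The slice rules in the comment quoted above can be turned into a small calculation. The following is a minimal C++ sketch, not Metal code: `TextureKind`, `sliceCount` and `fullMipLevelCount` are hypothetical names invented for illustration, and the mip-chain rule (halve the largest dimension until it reaches 1) is the usual convention rather than something stated in the notes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical enum mirroring the texture kinds named in the quoted comment.
// It is NOT the real MTLTextureType.
enum class TextureKind {
    Tex1D, Tex1DArray, Tex2D, Tex2DArray, Tex2DMultisample, Cube, CubeArray, Tex3D
};

// Number of slices (mipmap stacks) for a texture of the given kind.
// arrayLength only matters for the array kinds.
inline std::uint32_t sliceCount(TextureKind kind, std::uint32_t arrayLength) {
    assert(arrayLength >= 1);
    switch (kind) {
        case TextureKind::Tex1DArray:
        case TextureKind::Tex2DArray:
            return arrayLength;        // every slice is one array element
        case TextureKind::Cube:
            return 6;                  // one slice per cube face
        case TextureKind::CubeArray:
            return 6 * arrayLength;    // each set of six slices is one element
        default:
            return 1;                  // 1D, 2D, 2DMultisample and 3D
    }
}

// Length of a full mipmap chain (assumed convention): halve the largest
// dimension until it reaches 1, counting the base level.
inline std::uint32_t fullMipLevelCount(std::uint32_t width,
                                       std::uint32_t height,
                                       std::uint32_t depth) {
    std::uint32_t largest = std::max({width, height, depth});
    assert(largest >= 1);
    std::uint32_t levels = 1;
    while (largest > 1) {
        largest /= 2;
        ++levels;
    }
    return levels;
}
```

Under these assumptions, a cube array with 4 elements has 24 slices, and a 256 x 64 2D texture has a 9-level mip chain.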

Metal Shading Language

  • Metal and the Metal Shading Language can share the same definitions.
  • How is MSL compiled and linked?
    • Shaders are normally compiled offline: external tools compile and archive them into a library. (MTLDevice can also compile source at run time with newLibraryWithSource:options:error:.)
    • These tools are integrated with Xcode. After archiving, create the default library from the device with newDefaultLibrary, then pick shader functions from it.
  • Compared with OGL and D3D11, Metal gives less control. Is that a good thing?
  • MTLFunction
    • A handle to intermediate Metal shading code.
    • Stores information about the intermediate code, such as function type and vertex attributes.
    • message newArgumentEncoderWithBufferIndex:
      • Used to create an argument encoder that encodes arguments into an MTLBuffer at a given binding point. The layout of the encoded buffer is identical to the one declared in the Metal shading code.
    • MTLFunction is something like a bridge between Metal shading code and Objective-C.
    • There is NO such mechanism in D3D11.
    • MTLArgumentEncoder: this type may be tightly coupled with MTLFunction.
      • Encodes resource bindings and stores them in an argument buffer (MTLBuffer).
      • Supported bindings: Buffer, Texture, Sampler, Render Pipeline State, Constant Data, ICB.
  • MTLFunctionConstantValues
    • Controls a function in a dynamic way: the CPU sets constant values that specialize the GPU function.

Indirect Command Buffer

  • MTLIndirectCommandBuffer
    • It is a GPU device resource, in the same way that MTLBuffer and MTLTexture are.
    • It stores encoded GPU commands.
      • They are indirect commands, represented by the Metal class MTLIndirectRenderCommand.
      • These commands can be encoded from a compute shader, so the GPU can issue draw calls by itself in this way.
  • Argument Buffer
    • An MTLBuffer used to reduce the CPU overhead of binding resources through separate API calls.
    • It stores resource bindings for buffers, textures and samplers.
      • So there is an encoder that helps encode those bindings into the argument buffer.
    • Is there no counterpart in D3D11?

GPU Work Submission

  • Command Queue
    • It is a little like the Device Context in D3D11.
    • It is thread-safe, which suits multi-threaded environments.
    • A command buffer can be encoded by different encoders serially. So can one command buffer contain render commands, compute commands and blit commands?
  • Command Buffer
    • It encodes and packs "GPU Commands" explicitly through a "Command Encoder".
    • Each thread can have its own "Command Buffer" and operate on it.
    • It has an addCompletedHandler interface to register a callback that runs when the command buffer has completed execution.
    • Can a command buffer be reused after it is submitted? It seems not.
  • Command Encoder
    • It is used to encode GPU commands into the command buffer.
    • A render command encoder is created from a command buffer with a render pass descriptor.
    • Compute and blit command encoders have no such constraint, because they are independent of rendering.
    • A render command encoder maps 1:1 to "A Render Pass".
    • Render command encoders can only be created serially, except under a parallel render command encoder. That is, you cannot create a new one before ending encoding on the current one.
      • create encoder
      • encode GPU commands
      • endEncoding
    • A render command encoder contains an implicit "ClearColor" command.
  • MTLRenderPassDescriptor
    • It is used to create a "Render Command Encoder" from a command buffer with renderCommandEncoderWithDescriptor.
    • It describes the output of a render pass. There is NO counterpart in D3D11.
    • What role does it play in the rendering process?
      • It contains the color attachment array, the depth attachment and the stencil attachment.
      • Sample positions for multisampling. This gives a way to do custom MSAA. Does D3D provide an API for this?
      • Layered Rendering, Tile Shading
      • Visibility result buffer. How is this buffer written and queried?
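
The encoder rules noted above (create encoder, encode, endEncoding, one open encoder at a time, no reuse after submission) can be pictured as a small state machine. This is only a toy C++ model under those assumptions: `ToyCommandBuffer` and its methods are hypothetical and do not call or reproduce the Metal API, and rejected calls return false here, whereas a real API may treat them as programming errors.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical type, not the Metal API.
class ToyCommandBuffer {
public:
    // Opens an encoder of the given kind ("render", "compute", "blit").
    // Fails while another encoder is open or after the buffer was committed.
    bool beginEncoder(const std::string& kind) {
        if (committed_ || encoderOpen_) return false;
        encoderOpen_ = true;
        log_.push_back("begin " + kind);
        return true;
    }

    // Records a command; only valid while an encoder is open.
    bool encode(const std::string& command) {
        if (!encoderOpen_) return false;
        log_.push_back(command);
        return true;
    }

    // Closes the current encoder so that the next one can be created.
    bool endEncoding() {
        if (!encoderOpen_) return false;
        encoderOpen_ = false;
        log_.push_back("end");
        return true;
    }

    // Submits the buffer. It cannot be committed twice or with an open encoder.
    bool commit() {
        if (committed_ || encoderOpen_) return false;
        committed_ = true;
        return true;
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    bool encoderOpen_ = false;
    bool committed_ = false;
    std::vector<std::string> log_;
};
```

In this model a render encoder followed by a blit encoder in the same buffer is accepted, which matches the note that one command buffer can be encoded by different encoders serially.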

Render Pipeline State (PSO) in Metal

  • MTLRenderPipelineDescriptor is used to create MTLRenderPipelineState object from device

    • Call newRenderPipelineStateWithDescriptor to create the MTLRenderPipelineState.
    • Vertex function, fragment function. What about the tessellation and geometry stages?
      • D3D11 compiles source code into byte code, then generates shader objects from byte code.
      • D3D11 binds different Shader objects to different stages at run time.
    • Vertex layout description
    • Rasterizer sample count for MSAA.
    • Color attachment description array, depth format, stencil format.
      • D3D11 uses resource views: RTV, DSV, SRV / CBV / UAV.
    • Tessellation related settings.
    • MTLPipelineBufferDescriptorArray: what is this?
  • MTLDepthStencilDescriptor && MTLDepthStencilState

    • Create an MTLDepthStencilState object from the device with newDepthStencilStateWithDescriptor.
      • depthWriteEnabled
      • depthCompareFunction
    • Why is MTLDepthStencilState not included in MTLRenderPipelineState?

Present Rendering Result in Metal

  • How are rendering results presented in Metal?

D3D12

Work submission

Command Queue

  • ID3D12CommandQueue provides methods for submitting command lists, synchronizing command list execution, instrumenting the command queue, and updating resource tile mappings (from the Microsoft documentation).
  • Command queues of all types (3D, compute and copy) share the same interface and are all command-list based.
    • The command list type determines the command queue type.
  • A command queue is an abstraction of a GPU engine and affects how the engine is scheduled.
    • So synchronization interfaces such as Wait() and Signal() are all handled by the command queue.
  • Command Queue synchronization.

Command List

  • An ID3D12CommandList is created by CreateCommandList().
    • On NVIDIA GPUs it corresponds to a GP-Buffer: an ordered set of commands that the GPU executes.
    • So it is a kind of GPU resource. The ID3D12CommandAllocator interface manages the memory behind it.
    • ID3D12CommandAllocator

Resources Binding Model

  • Resources in D3D12 are created in one of three categories: committed, placed, reserved. Resource creation in D3D12 feels a little complicated, for example the heap description and the resource state.
    • A committed resource has both a VA and a PA. It is the common kind since DX9.
    • A placed resource is a pointer to a certain region of a heap (ID3D12Heap).
    • A reserved resource has its own unique GPU virtual address space. It is hard to understand.
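
To make "a pointer to a certain region of a heap" concrete, here is a minimal C++ sketch of how offsets for placed resources could be chosen inside one heap. `ToyHeap`, `place` and `alignUp` are hypothetical names, and the linear (bump) allocation strategy is an assumption for illustration, not something D3D12 prescribes. The 64 KiB constant is the value of D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT; a real application would take the size and alignment of each resource from the device rather than pass them by hand.

```cpp
#include <cassert>
#include <cstdint>

// 64 KiB, the value of D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT.
constexpr std::uint64_t kDefaultPlacementAlignment = 64 * 1024;

// Rounds value up to the next multiple of alignment (alignment must be non-zero).
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical linear sub-allocator over one heap. It only computes offsets;
// it does not create any D3D12 object.
class ToyHeap {
public:
    explicit ToyHeap(std::uint64_t sizeInBytes) : size_(sizeInBytes) {}

    // On success, writes the heap offset for the new placed resource to
    // outOffset and returns true. Returns false if the resource does not fit.
    bool place(std::uint64_t resourceSize, std::uint64_t alignment,
               std::uint64_t& outOffset) {
        assert(alignment != 0);
        const std::uint64_t offset = alignUp(next_, alignment);
        if (offset > size_ || resourceSize > size_ - offset) return false;
        next_ = offset + resourceSize;
        outOffset = offset;
        return true;
    }

private:
    std::uint64_t size_;
    std::uint64_t next_ = 0;  // first free byte after the last placement
};
```

The offset this model produces plays the role of the heap offset argument that a placed resource is created with; freeing and reusing regions is left out on purpose.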

GPU recognize the resource

  • A descriptor is a small block of data that fully describes an object to the GPU. It is in a GPU-specific opaque format.
    • SRVs, UAVs, CBVs and Samplers are descriptors.
    • Conceptually it resembles the resource view object in the older API.
  • A descriptor handle is the unique address of the descriptor. It is similar to a pointer, but is opaque because its implementation is hardware specific.
    • The handle is unique across descriptor heaps.
  • Null descriptors and default descriptors (see the descriptors overview).
  • The primary purpose of a descriptor heap is to encompass the bulk of memory allocation required for storing the descriptor specifications of object types that shaders reference for as large of a window of rendering as possible.
  • CBV, UAV and SRV entries can be in the same descriptor heap.
  • Samplers entries cannot share a heap with CBV, UAV or SRV entries.
  • These heaps are shared between the graphics and compute pipelines.
  • Descriptor heaps can only be edited immediately by the CPU; there is no option to edit a descriptor heap from the GPU.
  • Direct3D 12 requires the use of descriptor heaps; there is no option to put descriptors anywhere in memory.
  • On some hardware, switching descriptor heaps can be an expensive operation, requiring a GPU stall to flush all work that depends on the currently bound descriptor heap.
  • Descriptors are of varying size.
  • Some descriptors get bound to the pipeline by having their contents recorded directly into the command list. The heaps for these are always non-shader-visible.
    • Render Target Views (RTVs)
    • Depth Stencil Views (DSVs)
    • Stream Output Views (SOVs)
    • Index Buffer Views (IBVs) and Vertex Buffer Views (VBVs) are passed directly to API methods, and do not have specific heap types.
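
Because descriptors vary in size, the handle for the N-th descriptor in a heap is usually computed as the heap start plus N times an increment size. The sketch below models only that arithmetic in plain C++. `ToyCpuDescriptorHandle`, `ToyDescriptorHeap` and `handleAt` are hypothetical stand-ins: in real D3D12 code the start handle would come from ID3D12DescriptorHeap::GetCPUDescriptorHandleForHeapStart and the increment from ID3D12Device::GetDescriptorHandleIncrementSize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for D3D12_CPU_DESCRIPTOR_HANDLE, which wraps one address-sized field.
struct ToyCpuDescriptorHandle {
    std::size_t ptr;
};

// Hypothetical description of a descriptor heap, holding the values a real
// application would query from the heap and the device.
struct ToyDescriptorHeap {
    ToyCpuDescriptorHandle start;  // handle of descriptor 0
    std::uint32_t incrementSize;   // bytes between consecutive descriptors
    std::uint32_t capacity;        // number of descriptors in the heap
};

// Handle of the descriptor at the given index: start + index * incrementSize.
inline ToyCpuDescriptorHandle handleAt(const ToyDescriptorHeap& heap,
                                       std::uint32_t index) {
    assert(index < heap.capacity);
    return ToyCpuDescriptorHandle{
        heap.start.ptr + static_cast<std::size_t>(index) * heap.incrementSize};
}
```

The increment is kept as data rather than a constant because it differs by heap type and hardware, which is the point of the "varying size" note above.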

Blogs for Other Graphics APIs
