05. GPU Frame Capture: pt.2 Code

In the previous episode I talked about GPU frame capturing only through the GUI. For more control (like capturing a specific set of command buffers) you need to add some code. In this section I show how to make captures clearer and define capture boundaries.

Labels

I already explained the importance of labels in earlier episodes, but let's check again how a capture looks when we skip them:

You can see some command buffers, encoders, and resources, but you have no idea what they are about (especially if it's not your code or very old code). To make them more meaningful, simply set the .label values for your Metal entities (command queues, command buffers, command encoders, pipeline states, buffers, and textures).

NOTE: A pipeline state's label is read-only, so set it on the pipeline descriptor before you create the state. Buffers and textures accept a label directly after creation.
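As a rough sketch of the labeling described above, you might write something like the following. The helper names and label strings are made up for illustration, and the pipeline helper assumes you already have an MTLFunction for your kernel:

```swift
import Metal

// Hypothetical helper: creates a few objects and labels them for captures.
func makeLabeledResources(device: MTLDevice) -> (queue: MTLCommandQueue, buffer: MTLBuffer, texture: MTLTexture)? {
    guard let queue = device.makeCommandQueue(),
          let buffer = device.makeBuffer(length: 1024, options: .storageModeShared) else {
        return nil
    }
    queue.label = "Main Queue"
    buffer.label = "Brush Parameters"

    let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba8Unorm, width: 64, height: 64, mipmapped: false)
    guard let texture = device.makeTexture(descriptor: textureDescriptor) else {
        return nil
    }
    texture.label = "Canvas Texture"
    return (queue, buffer, texture)
}

// Hypothetical helper: the pipeline state's label is read-only,
// so the label is set on the descriptor before the state is created.
func makeLabeledPipeline(device: MTLDevice, kernel: MTLFunction) throws -> MTLComputePipelineState {
    let descriptor = MTLComputePipelineDescriptor()
    descriptor.label = "Blur Pipeline"
    descriptor.computeFunction = kernel
    return try device.makeComputePipelineState(descriptor: descriptor, options: [], reflection: nil)
}

// Command buffers and encoders take their labels right after creation.
func encodeLabeledPass(queue: MTLCommandQueue) {
    guard let commandBuffer = queue.makeCommandBuffer() else { return }
    commandBuffer.label = "Blur Command Buffer"
    if let encoder = commandBuffer.makeBlitCommandEncoder() {
        encoder.label = "Blur Blit Encoder"
        // ... encode blit work here ...
        encoder.endEncoding()
    }
    commandBuffer.commit()
}
```

These labels are the names the capture shows in place of anonymous entries.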

As you can see, now every resource and command entity has a meaningful name, so you can determine whether you're using the right resource (and that it contains the proper values) much more easily.

Debug Groups

Often your Metal program may have complex logic. For example (from my experience) you might build an image editor or animation app. In this app Metal lives on different layers and provides different functionality: rendering custom UI components, composing canvases, applying effects, painting, etc. If you keep all those command buffers, encoders, and so on at the same level, the capture looks like a total mess.

As you can see, we have a flat structure under the Command Buffer. That's fine for this particular effect because it is simple enough. But, as I said before, your processing may include different command buffers, a complex pipeline, and so on. So let's group our entities into a clearer hierarchy using debug groups on the command buffer or command encoder (because you may have lots of dispatch, blit, or draw calls per encoder):

  • .pushDebugGroup("Group name") - starts the group with Group name.
  • .popDebugGroup() - finishes the group.
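A minimal sketch of how these two calls can nest, assuming two buffers that each hold at least 2 * halfSize bytes. The function name, parameters, and group names are hypothetical:

```swift
import Metal

// Hypothetical example: two buffer copies grouped under named debug groups.
func encodeGroupedCopies(commandBuffer: MTLCommandBuffer,
                         source: MTLBuffer,
                         destination: MTLBuffer,
                         halfSize: Int) {
    // Group on the command buffer: everything encoded until the pop nests under it.
    commandBuffer.pushDebugGroup("Copy Pass")

    if let blit = commandBuffer.makeBlitCommandEncoder() {
        blit.label = "Copy Encoder"

        // Groups on the encoder separate individual calls inside one encoder.
        blit.pushDebugGroup("First half")
        blit.copy(from: source, sourceOffset: 0,
                  to: destination, destinationOffset: 0, size: halfSize)
        blit.popDebugGroup()

        blit.pushDebugGroup("Second half")
        blit.copy(from: source, sourceOffset: halfSize,
                  to: destination, destinationOffset: halfSize, size: halfSize)
        blit.popDebugGroup()

        blit.endEncoding()
    }

    // Every push needs a matching pop.
    commandBuffer.popDebugGroup()
}
```

In a capture, the encoder should then appear under "Copy Pass", with each copy call under its own subgroup.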

Now we have our encoders and calls grouped in a meaningful hierarchy, so we can navigate through them much more efficiently.

GPU Frame Capture

Often you need to capture your Metal command buffers under special conditions, such as when a specific tool is running or a certain UI component renders, which doesn't happen every frame. To solve that, you can capture command buffers directly from your code, where you know all those conditions.

NOTE: Capturing is usually allowed by default, but sometimes you need to enable it manually: set MetalCaptureEnabled to YES in your target's Info.plist, or export the environment variable MTL_CAPTURE_ENABLED=1.
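If a capture refuses to start, it can help to check whether either opt-in is actually present at run time. The sketch below is a hypothetical diagnostic, not a Metal API: it only looks for the two flags mentioned in the note and says nothing about whether capture is really available.

```swift
import Foundation

// Hypothetical diagnostic: reports whether either opt-in from the note is present.
// The dictionaries are parameters so the check can be exercised without a real bundle.
func isCaptureOptedIn(infoDictionary: [String: Any], environment: [String: String]) -> Bool {
    let plistFlag = (infoDictionary["MetalCaptureEnabled"] as? Bool) ?? false
    let environmentFlag = environment["MTL_CAPTURE_ENABLED"] == "1"
    return plistFlag || environmentFlag
}

// Typical call site, using the running app's own Info.plist and environment.
let optedIn = isCaptureOptedIn(
    infoDictionary: Bundle.main.infoDictionary ?? [:],
    environment: ProcessInfo.processInfo.environment)
print("Metal capture opted in:", optedIn)
```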

Programmatic Capture

For programmatic captures, Metal provides MTLCaptureManager, which handles captures, scopes, and saving results.


import Metal

// (1)
func startMetalCapture(captureObject: Any) {
    // (2)
    let captureManager = MTLCaptureManager.shared()
    // (3)
    let captureDescriptor = MTLCaptureDescriptor()
    captureDescriptor.captureObject = captureObject
    do {
        // (4)
        try captureManager.startCapture(with: captureDescriptor)
    } catch {
        fatalError("error when trying to capture: \(error)")
    }
}

// (5)
func stopMetalCapture() {
    let captureManager = MTLCaptureManager.shared()
    captureManager.stopCapture()
}


func perform(commandQueue: MTLCommandQueue) {
    // (6)
    startMetalCapture(captureObject: commandQueue.device)

    // (7)
    let commandBuffer = commandQueue.makeCommandBuffer()!
    // ...
    commandBuffer.commit()

    // (8)
    stopMetalCapture()
}

I wrapped starting and stopping the capture into helper functions, as the official documentation recommends.

  1. A wrapper for starting capture.
  2. Get the shared MTLCaptureManager instance; there is one capture manager per process.
  3. Setting up a capture descriptor, where you configure the following parameters:

    • captureObject - set this to an instance of MTLDevice, MTLCommandQueue, or MTLCaptureScope, depending on what you are capturing.
    • destination - where to send the capture results: developerTools to show it in Xcode, or gpuTraceDocument to save it to a file.
    • outputURL - the output file URL (required when .destination = .gpuTraceDocument).
  4. Ask the capture manager to start capturing your command buffers. You can check its status with MTLCaptureManager.shared().isCapturing.

  5. A wrapper for finishing capture.
  6. Start capturing.
  7. Your command buffer(s) with their Metal encoders, calls, etc.
  8. Stop capturing.

NOTE: You get all command buffers that were created after startMetalCapture() and committed before stopMetalCapture().
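The destination and outputURL parameters from step 3 can be combined to write the capture to a file instead of opening it in Xcode. This is a minimal sketch under a few assumptions: the helper names and the error type are hypothetical, and as far as I know the output file should use the .gputrace extension and must not already exist.

```swift
import Foundation
import Metal

// Hypothetical error for this sketch.
enum TraceCaptureError: Error {
    case destinationUnsupported
}

// Hypothetical helper: builds a descriptor that saves the capture to a file.
func makeTraceDescriptor(captureObject: Any, outputURL: URL) -> MTLCaptureDescriptor {
    let descriptor = MTLCaptureDescriptor()
    descriptor.captureObject = captureObject
    descriptor.destination = .gpuTraceDocument
    descriptor.outputURL = outputURL
    return descriptor
}

// Hypothetical helper: starts a file capture for all work on the given device.
func startTraceCapture(device: MTLDevice, outputURL: URL) throws {
    let manager = MTLCaptureManager.shared()
    // Check that file captures are possible in the current configuration.
    guard manager.supportsDestination(.gpuTraceDocument) else {
        throw TraceCaptureError.destinationUnsupported
    }
    try manager.startCapture(with: makeTraceDescriptor(captureObject: device, outputURL: outputURL))
}

// Example location for the trace file; the file name is arbitrary.
let traceURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("frame.gputrace")
```

After stopCapture(), the file at traceURL can be opened in Xcode like any other GPU trace.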

Programmatic Scope

A default scope includes all command buffers that were created after it begins and committed before it ends, but sometimes you need a particular command buffer without capturing lots of others. For this purpose you can set up a custom capture scope.

// (1)
let customCaptureScope = MTLCaptureManager.shared().makeCaptureScope(device: device)
// (2)
customCaptureScope.begin()

// (3)
let commandBuffer = commandQueue.makeCommandBuffer()!
// ...
commandBuffer.commit()

// (4)
customCaptureScope.end()

  1. Create a scope from MTLCaptureManager for an MTLDevice or MTLCommandQueue.
  2. Begin your scope when necessary.
  3. Perform your command buffers.
  4. End your scope.

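NOTE: You get all command buffers that were created after the scope begins and committed before it ends. The scope itself only marks the boundaries: nothing is recorded until a capture targets it, either one you start from code with the scope as captureObject or one you trigger from Xcode.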

You can also use this to set a custom scope for captures triggered from Xcode: assign it to MTLCaptureManager.shared().defaultCaptureScope.
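To tie the pieces together, here is a hedged sketch of a labeled scope that is both registered as the default for Xcode-triggered captures and used as the capture object from code. The function names and labels are hypothetical:

```swift
import Metal

// Hypothetical helper: creates a labeled scope and makes it the default,
// so captures triggered from Xcode use its boundaries.
func makeToolScope(device: MTLDevice, label: String) -> MTLCaptureScope {
    let manager = MTLCaptureManager.shared()
    let scope = manager.makeCaptureScope(device: device)
    scope.label = label
    manager.defaultCaptureScope = scope
    return scope
}

// Hypothetical helper: records exactly the work between begin() and end().
func captureOnce(scope: MTLCaptureScope, commandQueue: MTLCommandQueue) throws {
    let descriptor = MTLCaptureDescriptor()
    // With a scope as the capture object, recording follows the scope's boundaries.
    descriptor.captureObject = scope
    try MTLCaptureManager.shared().startCapture(with: descriptor)

    scope.begin()
    if let commandBuffer = commandQueue.makeCommandBuffer() {
        commandBuffer.label = "Scoped Work"
        // ... encode work here ...
        commandBuffer.commit()
    }
    scope.end()
}
```

The label also helps on the Xcode side, since it gives the scope a recognizable name wherever Xcode lists capture scopes.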

Example

In Tiamat I capture a selected pattern from my simulation field. The command buffer I need runs only once while frame updates are continuous, so catching it with a manual capture from Xcode is almost impossible. Instead, I simply start the capture right before the command buffer I want and stop it afterward.

If you have a complex processing pipeline with lots of command buffers (which isn't ideal, but sometimes necessary), you can also use custom scopes.
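One way to wire up such a one-off capture is a small trigger that the UI arms and the rendering code consumes. This is an illustrative sketch only, not the code used in Tiamat; CaptureTrigger and performSelection are hypothetical names:

```swift
import Metal

// Hypothetical one-shot trigger: arm it from UI or tool code, and the next
// call to consume() returns true exactly once.
struct CaptureTrigger {
    private(set) var isArmed = false

    mutating func arm() {
        isArmed = true
    }

    mutating func consume() -> Bool {
        let wasArmed = isArmed
        isArmed = false
        return wasArmed
    }
}

// Hypothetical one-off operation: captured only when the trigger was armed.
func performSelection(commandQueue: MTLCommandQueue, trigger: inout CaptureTrigger) {
    let manager = MTLCaptureManager.shared()
    let shouldCapture = trigger.consume()

    if shouldCapture {
        let descriptor = MTLCaptureDescriptor()
        descriptor.captureObject = commandQueue
        do {
            try manager.startCapture(with: descriptor)
        } catch {
            // Log and carry on: a failed capture should not break the operation.
            print("Could not start capture: \(error)")
        }
    }

    if let commandBuffer = commandQueue.makeCommandBuffer() {
        commandBuffer.label = "Selected Pattern"
        // ... encode the one-off work here ...
        commandBuffer.commit()
    }

    // Stop only the capture this call started.
    if shouldCapture && manager.isCapturing {
        manager.stopCapture()
    }
}
```

Keeping the trigger separate from the Metal code means the continuous frame loop stays untouched and only the armed call pays for a capture.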

Conclusion

  • Use labels and debug groups to make GPU captures more meaningful and understandable.
  • Use code to capture specific command buffers.
  • Capturing from code also lets you save the results to a file, which is convenient when running remotely.
