06. Blit Operations

Metal offers a broad range of blit operations, which are essential for efficient data transfer between memory locations. A blit operation (the name comes from BitBLT, short for "bit block transfer") copies a buffer, a texture, or a region of either from one place in memory to another.

With these operations you can move data without an extra render or compute pass. So if you need to copy data, fill a buffer, or generate mipmaps, reach for blit operations first.

Encoder

Blit is a GPU operation that runs as part of a command buffer, so you'll need a corresponding command encoder to perform it. Here's a simple example:

if let encoder = commandBuffer.makeBlitCommandEncoder() {
    // Perform your blit operations here
    encoder.endEncoding()
}

This encoder handles the actual blit operations, and once you’re done, you end the encoding to complete the process.
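For context, here is a minimal sketch of the whole round trip around that encoder: creating the command buffer, encoding, committing, and waiting. The helper name `runBlitPass` is hypothetical (it is not part of Metal), and creating a queue on every call is only for brevity; a real app would normally create one queue and reuse it.

```swift
import Metal

/// Hypothetical helper: runs `encode` inside a blit pass and blocks
/// until the GPU has finished. Returns false if something went wrong.
@discardableResult
func runBlitPass(on device: MTLDevice,
                 _ encode: (MTLBlitCommandEncoder) -> Void) -> Bool {
    // Assumption: a throwaway queue is fine for a sketch.
    guard let queue = device.makeCommandQueue(),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else {
        return false
    }
    encode(encoder)                     // record the blit commands
    encoder.endEncoding()               // close the blit pass
    commandBuffer.commit()              // submit the work to the GPU
    commandBuffer.waitUntilCompleted()  // block until the GPU is done
    return commandBuffer.status == .completed
}
```

Blocking with `waitUntilCompleted()` keeps the sketch simple; in production code a completion handler is usually the better fit.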

Operations

The blit operations in Metal are pretty simple and well documented. However, I’ll briefly explain some of them:

Copying

There is a whole family of copy(...) operations. Note that you can copy between regions of the same texture or buffer, but be careful: if the source and destination regions overlap, the result is undefined.

From buffer to buffer

Since buffers don’t have many additional attributes and are simply blocks of memory, copying them is straightforward:

// Copy the 20 bytes at offset 20 to offset 40 of the same buffer (the two ranges don't overlap)
encoder.copy(
    from: srcBuffer,
    sourceOffset: 20,
    to: srcBuffer,
    destinationOffset: 40,
    size: 20
)
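Since overlapping regions are the main pitfall when source and destination are the same buffer, a small check before encoding can help. This is only a sketch; `copyRangesOverlap` is a hypothetical helper, not a Metal API.

```swift
/// Hypothetical helper: returns true when copying `size` bytes from
/// `sourceOffset` to `destinationOffset` within one buffer would read
/// and write some of the same bytes.
func copyRangesOverlap(sourceOffset: Int, destinationOffset: Int, size: Int) -> Bool {
    precondition(sourceOffset >= 0 && destinationOffset >= 0 && size >= 0)
    let source = sourceOffset..<(sourceOffset + size)
    let destination = destinationOffset..<(destinationOffset + size)
    return source.overlaps(destination)
}

// The copy from the example above is safe: 20..<40 and 40..<60 are disjoint.
let isSafe = !copyRangesOverlap(sourceOffset: 20, destinationOffset: 40, size: 20)
```

If the ranges do overlap, one option is to copy through a temporary buffer instead.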

From buffer to texture

These operations can be particularly useful when uploading a raw image buffer to a GPU texture — at least, that’s the most common case in my experience.

// Uploading an image with size 640x480 and 4 bytes per pixel to a texture.
encoder.copy(
    from: buffer,
    sourceOffset: 0,
    sourceBytesPerRow: 640 * 4,
    sourceBytesPerImage: 640 * 480 * 4,
    sourceSize: MTLSize(width: 640, height: 480, depth: 1),
    to: texture,
    destinationSlice: 0,
    destinationLevel: 0,
    destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
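The `sourceBytesPerRow` and `sourceBytesPerImage` values above are plain arithmetic on the image dimensions. A small sketch that makes the calculation explicit, assuming a tightly packed image with no row padding (`LinearImageLayout` is a hypothetical type used only for illustration):

```swift
/// Hypothetical helper type: byte layout of a tightly packed 2D image in a buffer.
struct LinearImageLayout {
    let width: Int
    let height: Int
    let bytesPerPixel: Int

    /// Bytes occupied by one row of pixels (no padding assumed).
    var bytesPerRow: Int { width * bytesPerPixel }
    /// Bytes occupied by the whole image.
    var bytesPerImage: Int { bytesPerRow * height }
}

let layout = LinearImageLayout(width: 640, height: 480, bytesPerPixel: 4)
// layout.bytesPerRow   == 2560
// layout.bytesPerImage == 1_228_800
```

The source buffer has to hold at least `sourceOffset + bytesPerImage` bytes for the copy to stay in bounds.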

From texture to buffer

Personally, I’ve mostly used these operations for downloading textures.

// Downloading texture from previous example back to the buffer.
encoder.copy(
    from: texture,
    sourceSlice: 0,
    sourceLevel: 0,
    sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
    sourceSize: MTLSize(width: 640, height: 480, depth: 1),
    to: buffer,
    destinationOffset: 0,
    destinationBytesPerRow: 640 * 4,
    destinationBytesPerImage: 640 * 480 * 4)
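The copy only becomes visible to the CPU once the command buffer has finished. Here is a rough sketch of a complete download under a few assumptions: the texture is 2D, its format uses 4 bytes per pixel (for example `.rgba8Unorm`), and blocking the calling thread is acceptable. The function name `readBack` is hypothetical.

```swift
import Metal

/// Hypothetical helper: copies slice 0, mip level 0 of a 2D texture into a
/// shared buffer and returns the bytes.
/// Assumption: the pixel format uses 4 bytes per pixel.
func readBack(_ texture: MTLTexture, using queue: MTLCommandQueue) -> [UInt8]? {
    let bytesPerRow = texture.width * 4
    let length = bytesPerRow * texture.height
    guard let buffer = queue.device.makeBuffer(length: length, options: .storageModeShared),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else {
        return nil
    }
    encoder.copy(
        from: texture,
        sourceSlice: 0,
        sourceLevel: 0,
        sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
        sourceSize: MTLSize(width: texture.width, height: texture.height, depth: 1),
        to: buffer,
        destinationOffset: 0,
        destinationBytesPerRow: bytesPerRow,
        destinationBytesPerImage: length)
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()  // the bytes are valid only after this point

    // A shared buffer is directly readable by the CPU.
    return Array(UnsafeRawBufferPointer(start: buffer.contents(), count: length))
}
```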

From texture to texture

straight copy of the whole texture

  // Copying the whole content of `srcTexture` into `dstTexture`.
  encoder.copy(from: srcTexture, to: dstTexture)

copying whole slices

  // Copying mip level 3 of slice 1 of `srcTexture` to mip level 3 of slice 2 of `dstTexture`
  encoder.copy(
      from: srcTexture,
      sourceSlice: 1,
      sourceLevel: 3,
      to: dstTexture,
      destinationSlice: 2,
      destinationLevel: 3,
      sliceCount: 1,
      levelCount: 1)

copying some area of one texture to an area of destination texture

  // Copying a rectangular area with size 120x90 and origin (10, 20) in `srcTexture`
  // to origin (20, 30) in `dstTexture`
  encoder.copy(
      from: srcTexture,
      sourceSlice: 0,
      sourceLevel: 0,
      sourceOrigin: MTLOrigin(x: 10, y: 20, z: 0),
      sourceSize: MTLSize(width: 120, height: 90, depth: 1),
      to: dstTexture,
      destinationSlice: 0,
      destinationLevel: 0,
      destinationOrigin: MTLOrigin(x: 20, y: 30, z: 0))

Filling Buffer

If you need to fill a buffer with the same constant bytes, use something like the following:

encoder.fill(buffer: buffer, range: 0..<bufferLength, value: 0x42)

Generating Mipmaps

This is a pretty simple way to generate mipmaps for a given texture:

encoder.generateMipmaps(for: texture)
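For this to do anything useful, the texture needs more than one mip level to begin with. As far as I know, creating the descriptor with `mipmapped: true` picks the level count for you; the sketch below only shows the arithmetic behind a full mip chain. `mipLevelCount` is a hypothetical helper, not a Metal function.

```swift
/// Hypothetical helper: number of levels in a full mip chain, where each
/// level halves the larger dimension (rounding down) until it reaches 1.
func mipLevelCount(width: Int, height: Int) -> Int {
    precondition(width > 0 && height > 0)
    var size = max(width, height)
    var levels = 1          // level 0 is the full-size image
    while size > 1 {
        size /= 2
        levels += 1
    }
    return levels
}

let levels = mipLevelCount(width: 640, height: 480)
// levels == 10 (640, 320, 160, 80, 40, 20, 10, 5, 2, 1)
```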

Optimising Contents

To be honest, I’ve never used these functions myself, but they do exist and seem like a good reason to experiment.

The purpose of these functions is to rearrange a texture's memory layout for more efficient access by the GPU or CPU:

encoder.optimizeContentsForGPUAccess(texture: texture)
encoder.optimizeContentsForCPUAccess(texture: texture)

Synchronisation

Here we have a function for synchronizing the CPU copy of a managed resource with its GPU copy after the GPU has modified it (resources in shared storage have only one copy, so on unified-memory devices such as Apple silicon this call is often unnecessary):

encoder.synchronize(texture: texture, slice: 0, level: 0)
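Because the managed storage mode exists only on macOS, one way to keep cross-platform code tidy is to wrap the call in a conditional. This is a sketch under that assumption; `encodeSynchronizeIfNeeded` is a hypothetical helper name.

```swift
import Metal

/// Hypothetical helper: encodes a synchronize command only when the texture
/// actually uses managed storage. Does nothing on other platforms or for
/// other storage modes.
func encodeSynchronizeIfNeeded(_ texture: MTLTexture,
                               with encoder: MTLBlitCommandEncoder) {
    #if os(macOS)
    if texture.storageMode == .managed {
        // Makes GPU-side changes visible to the CPU copy (all slices and levels).
        encoder.synchronize(resource: texture)
    }
    #endif
}
```

As with any blit command, the CPU copy is up to date only after the command buffer has completed.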

Sometimes you need to order access to a resource between passes using fences (MTLFence). Metal provides the following functions to handle this in a blit pass (and yes, the same fence synchronization is available in render and compute encoders):

// ...
// Wait for the fence to be updated
encoder.waitForFence(fence)
// ...
// Update the fence and continue processing:
encoder.updateFence(fence)

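NOTE: Within a single encoder, don't wait on a fence after you have updated it. The driver may defer the update to the end of the encoder and perform waits at its beginning, so that order can deadlock the GPU. Be careful!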
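To make the usual pattern concrete, here is a sketch with two blit passes in one command buffer: the first (the producer) updates the fence after writing, the second (the consumer) waits on it before reading. The function and parameter names are hypothetical. Fences matter mostly for resources whose hazards Metal does not track for you, such as heap-allocated or untracked resources; the buffers here are just placeholders.

```swift
import Metal

/// Hypothetical helper: copies `a` into `b` in one pass, then `b` into `c`
/// in a second pass that waits for the first through `fence`.
func encodeDependentCopies(in commandBuffer: MTLCommandBuffer,
                           fence: MTLFence,
                           a: MTLBuffer, b: MTLBuffer, c: MTLBuffer,
                           size: Int) {
    // Producer pass: writes `b`, then updates the fence.
    if let producer = commandBuffer.makeBlitCommandEncoder() {
        producer.copy(from: a, sourceOffset: 0, to: b, destinationOffset: 0, size: size)
        producer.updateFence(fence)
        producer.endEncoding()
    }
    // Consumer pass: waits on the fence before reading `b`.
    if let consumer = commandBuffer.makeBlitCommandEncoder() {
        consumer.waitForFence(fence)
        consumer.copy(from: b, sourceOffset: 0, to: c, destinationOffset: 0, size: size)
        consumer.endEncoding()
    }
}
```

The fence itself comes from `device.makeFence()`. Note that the wait and the update live in different encoders here, which keeps the sketch clear of the ordering problem described in the note above.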

Other functions

There are several other operations, but I've never used them. Feel free to check them out in the documentation.

Conclusion

As you can see, there’s a comprehensive set of functions for fast operations with GPU buffers and textures.
These operations are highly optimized, so there’s no need to replicate them using compute or rendering encoders.
You can even perform certain image manipulations just by copying (it may seem a bit unconventional, but in some cases, it’s an effective solution).