Metal offers a broad range of blit operations, which are essential for efficient data transfer between memory locations. A blit operation (short for "bit block transfer") copies a region of a buffer or texture from one place in memory to another.
By leveraging these operations, you can move data without setting up additional render or compute passes. So if you need to copy data, fill a buffer, or generate mipmaps, a blit pass is usually the simplest and most efficient tool for the job.
Blit is a GPU operation that runs as part of a command buffer, so you'll need a corresponding command encoder to perform it. Here's a simple example:
if let encoder = commandBuffer.makeBlitCommandEncoder() {
// Perform your blit operations here
encoder.endEncoding()
}
This encoder handles the actual blit operations, and once you’re done, you end the encoding to complete the process.
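To put the snippet above in context, here is a minimal sketch of the whole round trip: create the command buffer, encode a blit pass, commit, and wait. The helper name `runBlitPass` is my own invention for illustration, not a Metal API, and blocking on `waitUntilCompleted()` is only reasonable for demos and tests.

```swift
import Metal

// Hypothetical helper (not part of Metal): encodes one blit pass, submits it,
// and blocks until the GPU has finished. Returns false if anything failed.
func runBlitPass(on queue: MTLCommandQueue,
                 _ encode: (MTLBlitCommandEncoder) -> Void) -> Bool {
    guard let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else {
        return false
    }
    encode(encoder)                     // record copy / fill / mipmap commands
    encoder.endEncoding()               // close the blit pass
    commandBuffer.commit()              // hand the work over to the GPU
    commandBuffer.waitUntilCompleted()  // demo only; prefer a completion handler
    return commandBuffer.status == .completed
}
```

With a device and a queue at hand, a call looks like `runBlitPass(on: queue) { encoder in /* blit operations */ }`.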
The blit operations in Metal are pretty simple and well documented. However, I’ll briefly explain some of them:
There are quite a few copy(...) operations. Note that you can copy regions within the same texture or buffer, but be careful: if the source and destination regions overlap, the result is undefined.
From buffer to buffer
Since buffers don’t have many additional attributes and are simply blocks of memory, copying them is straightforward:
// Copying the 20 bytes at offset 20 to offset 40 within the same buffer
encoder.copy(
from: srcBuffer,
sourceOffset: 20,
to: srcBuffer,
destinationOffset: 40,
size: 20
)
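Since overlapping ranges are the main trap when copying within one buffer, a small guard can help. The following is only a sketch: `encodeCopy(within:from:to:size:using:)` is a hypothetical helper of mine, and it checks just bounds and overlap, not any platform-specific alignment rules.

```swift
import Metal

// Hypothetical helper: encodes a copy inside a single buffer, but only if the
// source and destination byte ranges stay in bounds and do not overlap.
func encodeCopy(within buffer: MTLBuffer,
                from sourceOffset: Int,
                to destinationOffset: Int,
                size: Int,
                using encoder: MTLBlitCommandEncoder) -> Bool {
    guard size > 0, sourceOffset >= 0, destinationOffset >= 0 else { return false }
    let source = sourceOffset..<(sourceOffset + size)
    let destination = destinationOffset..<(destinationOffset + size)
    guard source.upperBound <= buffer.length,
          destination.upperBound <= buffer.length,
          !source.overlaps(destination) else {
        return false  // refuse rather than risk undefined behavior
    }
    encoder.copy(from: buffer, sourceOffset: sourceOffset,
                 to: buffer, destinationOffset: destinationOffset,
                 size: size)
    return true
}
```

The copy from the example above (offset 20 to offset 40, 20 bytes) passes the check, while moving the destination to offset 32 would be rejected because the two ranges would share bytes 32 to 39.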
From buffer to texture
These operations can be particularly useful when uploading a raw image buffer to a GPU texture — at least, that’s the most common case in my experience.
// Uploading an image with size 640x480 and 4 bytes per pixel to a texture.
encoder.copy(
from: buffer,
sourceOffset: 0,
sourceBytesPerRow: 640 * 4,
sourceBytesPerImage: 640 * 480 * 4,
sourceSize: MTLSize(width: 640, height: 480, depth: 1),
to: texture,
destinationSlice: 0,
destinationLevel: 0,
destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
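For completeness, here is roughly how such an upload might look end to end. This is a sketch under my own assumptions: tightly packed RGBA8 pixels (4 bytes per pixel), a shared staging buffer, and a GPU-private destination texture. `makeTexture(fromRGBA:...)` is a hypothetical helper, not a Metal API.

```swift
import Metal

// Hypothetical helper: uploads tightly packed RGBA8 pixels into a new
// GPU-private texture through a shared staging buffer.
func makeTexture(fromRGBA pixels: [UInt8], width: Int, height: Int,
                 device: MTLDevice, queue: MTLCommandQueue) -> MTLTexture? {
    let bytesPerRow = width * 4  // assumption: no row padding
    guard width > 0, height > 0, pixels.count == bytesPerRow * height else { return nil }

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba8Unorm, width: width, height: height, mipmapped: false)
    descriptor.storageMode = .private  // CPU cannot write it, so we blit into it
    descriptor.usage = .shaderRead

    guard let staging = device.makeBuffer(bytes: pixels, length: pixels.count,
                                          options: .storageModeShared),
          let texture = device.makeTexture(descriptor: descriptor),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else { return nil }

    encoder.copy(from: staging,
                 sourceOffset: 0,
                 sourceBytesPerRow: bytesPerRow,
                 sourceBytesPerImage: bytesPerRow * height,
                 sourceSize: MTLSize(width: width, height: height, depth: 1),
                 to: texture,
                 destinationSlice: 0,
                 destinationLevel: 0,
                 destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()  // demo only
    return texture
}
```

Deriving `sourceBytesPerRow` and `sourceBytesPerImage` from the width and height, instead of hard-coding 640 and 480, keeps the three size-related arguments consistent with each other.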
From texture to buffer
Personally, I’ve mostly used these operations for downloading textures.
// Downloading texture from previous example back to the buffer.
encoder.copy(
from: texture,
sourceSlice: 0,
sourceLevel: 0,
sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
sourceSize: MTLSize(width: 640, height: 480, depth: 1),
to: buffer,
destinationOffset: 0,
destinationBytesPerRow: 640 * 4,
destinationBytesPerImage: 640 * 480 * 4)
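A download usually ends with reading the buffer on the CPU, so here is a sketch of that last step as well. It assumes a 2D `.rgba8Unorm` texture and reads mip level 0 of slice 0 into a shared buffer; `readBackRGBA(from:queue:)` is a hypothetical name I made up for this example.

```swift
import Metal

// Hypothetical helper: copies level 0 of a 2D RGBA8 texture into a shared
// buffer and returns its bytes, or nil if something goes wrong.
func readBackRGBA(from texture: MTLTexture, queue: MTLCommandQueue) -> [UInt8]? {
    let bytesPerRow = texture.width * 4  // assumption: 4 bytes per pixel
    let length = bytesPerRow * texture.height
    guard texture.pixelFormat == .rgba8Unorm,
          let readback = queue.device.makeBuffer(length: length,
                                                 options: .storageModeShared),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else { return nil }

    encoder.copy(from: texture,
                 sourceSlice: 0,
                 sourceLevel: 0,
                 sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                 sourceSize: MTLSize(width: texture.width, height: texture.height, depth: 1),
                 to: readback,
                 destinationOffset: 0,
                 destinationBytesPerRow: bytesPerRow,
                 destinationBytesPerImage: length)
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()  // the bytes are valid only after this

    let pointer = readback.contents().bindMemory(to: UInt8.self, capacity: length)
    return Array(UnsafeBufferPointer(start: pointer, count: length))
}
```

Because the destination buffer is shared, no extra synchronization call is needed before reading it once the command buffer has completed.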
From texture to texture
straight copy of the whole texture
// Copying the whole content of `srcTexture` into `dstTexture`.
encoder.copy(from: srcTexture, to: dstTexture)
copying whole slices
// Copying mip level 3 of slice 1 of `srcTexture` to mip level 3 of slice 2 of `dstTexture`
encoder.copy(
from: srcTexture,
sourceSlice: 1,
sourceLevel: 3,
to: dstTexture,
destinationSlice: 2,
destinationLevel: 3,
sliceCount: 1,
levelCount: 1)
copying an area of one texture to an area of the destination texture
// Copying a rectangular area with size 120x90 and origin (10, 20) in `srcTexture`
// to origin (20, 30) in `dstTexture`
encoder.copy(
from: srcTexture,
sourceSlice: 0,
sourceLevel: 0,
sourceOrigin: MTLOrigin(x: 10, y: 20, z: 0),
sourceSize: MTLSize(width: 120, height: 90, depth: 1),
to: dstTexture,
destinationSlice: 0,
destinationLevel: 0,
destinationOrigin: MTLOrigin(x: 20, y: 30, z: 0))
If you need to fill a buffer with the same constant bytes, use something like the following:
encoder.fill(buffer: buffer, range: 0..<bufferLength, value: 0x42)
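Here is a small self-contained sketch of a fill followed by a CPU-side check. The helper name `filledBytes` is hypothetical, and the buffer is shared only so that the result can be inspected. As far as I recall from the documentation, macOS wants the range's start and length to be multiples of 4 bytes, so the sketch rejects other lengths to stay on the safe side.

```swift
import Metal

// Hypothetical helper: fills a new shared buffer with `value` on the GPU and
// returns the resulting bytes so they can be inspected on the CPU.
func filledBytes(length: Int, value: UInt8,
                 device: MTLDevice, queue: MTLCommandQueue) -> [UInt8]? {
    // Assumption: keep the length a multiple of 4 to satisfy macOS alignment rules.
    guard length > 0, length % 4 == 0,
          let buffer = device.makeBuffer(length: length, options: .storageModeShared),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else { return nil }

    encoder.fill(buffer: buffer, range: 0..<length, value: value)
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()  // demo only

    let pointer = buffer.contents().bindMemory(to: UInt8.self, capacity: length)
    return Array(UnsafeBufferPointer(start: pointer, count: length))
}
```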
This is a pretty simple way to generate mipmaps for a given texture:
encoder.generateMipmaps(for: texture)
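One thing that is easy to miss is that the texture has to be created with a mip chain in the first place. The sketch below assumes tightly packed RGBA8 pixels and relies on the descriptor's default storage mode being CPU-writable so that `replace(region:...)` can upload the base level; `makeMipmappedTexture` is a hypothetical helper name.

```swift
import Metal

// Hypothetical helper: creates a mipmapped RGBA8 texture, uploads level 0
// from the CPU, and lets a blit pass generate the remaining levels.
func makeMipmappedTexture(fromRGBA pixels: [UInt8], width: Int, height: Int,
                          device: MTLDevice, queue: MTLCommandQueue) -> MTLTexture? {
    guard width > 0, height > 0, pixels.count == width * height * 4 else { return nil }

    // `mipmapped: true` allocates the full mip chain down to 1x1.
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba8Unorm, width: width, height: height, mipmapped: true)
    descriptor.usage = .shaderRead

    guard let texture = device.makeTexture(descriptor: descriptor),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeBlitCommandEncoder() else { return nil }

    // Upload the base level; assumes the default storage mode is CPU-accessible.
    texture.replace(region: MTLRegionMake2D(0, 0, width, height),
                    mipmapLevel: 0,
                    withBytes: pixels,
                    bytesPerRow: width * 4)

    // A 1x1 texture has a single level, so there is nothing to generate.
    if texture.mipmapLevelCount > 1 {
        encoder.generateMipmaps(for: texture)
    }
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()  // demo only
    return texture
}
```

For a 64x64 texture this yields seven levels: 64, 32, 16, 8, 4, 2, and 1 pixels wide.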
To be honest, I’ve never used these functions myself, but they do exist and seem like a good reason to experiment.
The purpose of these functions is to realign a texture’s memory for more efficient access by the GPU or CPU:
encoder.optimizeContentsForGPUAccess(texture: texture)
encoder.optimizeContentsForCPUAccess(texture: texture)
Here we have a function for synchronizing the CPU copy of managed resources with their GPU copy (though I’m not sure how relevant this is on ARM-based Apple silicon, where the CPU and GPU share the same memory):
encoder.synchronize(texture: texture, slice: 0, level: 0)
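Because managed storage exists only on macOS, a call like this usually ends up behind a platform check. Here is a sketch of how that might look; `encodeSynchronizeIfManaged` is a hypothetical helper, and it uses the whole-resource variant `synchronize(resource:)` rather than the per-slice one shown above.

```swift
import Metal

// Hypothetical helper: encodes a CPU/GPU synchronization only when the texture
// actually uses managed storage. Returns true if a command was encoded.
func encodeSynchronizeIfManaged(_ texture: MTLTexture,
                                using encoder: MTLBlitCommandEncoder) -> Bool {
#if os(macOS)
    // Shared and private textures have a single copy, so there is nothing to sync.
    guard texture.storageMode == .managed else { return false }
    encoder.synchronize(resource: texture)
    return true
#else
    // Managed storage is not available on this platform.
    return false
#endif
}
```

After the command buffer completes, the CPU-side view of a managed texture reflects whatever the GPU wrote to it.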
Sometimes, you need to synchronize operations within a pass using fences (MTLFence). Metal provides the following functions to handle this in a blit pass (and yes, similar fence synchronization is available in render and compute encoders):
// ...
// Wait for the fence to be updated
encoder.waitForFence(fence)
// ...
// Update the fence and continue processing:
encoder.updateFence(fence)
NOTE: If you call updateFence() before waitForFence() on the same fence within the same encoder, it can cause a GPU deadlock, so be careful!
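To make the intended ordering more concrete, here is a sketch with two blit passes in one command buffer: the first fills a buffer and updates the fence, the second waits for the fence and then copies the data. The buffers are created as untracked on purpose, since that is the case where Metal leaves this synchronization to you. The function name `fillThenCopy` and the sizes are my own choices for the example.

```swift
import Metal

// Sketch: a producer pass updates the fence, a later consumer pass waits on it.
// Returns the bytes of the destination buffer, or nil on failure.
func fillThenCopy(device: MTLDevice, queue: MTLCommandQueue) -> [UInt8]? {
    let length = 16
    // Untracked resources: Metal will not order the two passes for us.
    let options: MTLResourceOptions = [.storageModeShared, .hazardTrackingModeUntracked]
    guard let source = device.makeBuffer(length: length, options: options),
          let destination = device.makeBuffer(length: length, options: options),
          let fence = device.makeFence(),
          let commandBuffer = queue.makeCommandBuffer() else { return nil }

    // Pass 1 (producer): write the data, then signal the fence.
    guard let producer = commandBuffer.makeBlitCommandEncoder() else { return nil }
    producer.fill(buffer: source, range: 0..<length, value: 0x7F)
    producer.updateFence(fence)
    producer.endEncoding()

    // Pass 2 (consumer): wait for the producer before touching its output.
    guard let consumer = commandBuffer.makeBlitCommandEncoder() else { return nil }
    consumer.waitForFence(fence)
    consumer.copy(from: source, sourceOffset: 0,
                  to: destination, destinationOffset: 0, size: length)
    consumer.endEncoding()

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()  // demo only

    let pointer = destination.contents().bindMemory(to: UInt8.self, capacity: length)
    return Array(UnsafeBufferPointer(start: pointer, count: length))
}
```

Each encoder here touches the fence only once, with the update in the earlier pass and the wait in the later one, which sidesteps the ordering problem inside a single encoder.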
There are several other operations, but I’ve never used them. Feel free to check them out in the documentation.