Write profile trace in the current folder. (#115)
maleadt committed Mar 3, 2023
1 parent 5f500ea commit 3a4b1c6
Showing 3 changed files with 59 additions and 54 deletions.
22 changes: 9 additions & 13 deletions README.md
@@ -121,41 +121,37 @@ julia> Array(c)
## Profiling

This package also supports profiling GPU execution for later visualization with Apple's
Xcode tools. The easiest way to generate a GPU report is to use the `Metal.@profile` macro as seen
below. To profile GPU code from a Julia process,
you must set the `METAL_CAPTURE_ENABLED` environment variable. On the first Metal
Xcode tools. The easiest way to generate a GPU report is to use the `Metal.@profile` macro
as seen below. To profile GPU code from a Julia process, you must set the
`METAL_CAPTURE_ENABLED` environment variable before importing Metal.jl. On the first Metal
command detected, you should get a message stating "Metal GPU Frame Capture Enabled" if the
variable was set correctly.
variable was set correctly:

```julia
$ METAL_CAPTURE_ENABLED=1 julia
...

julia> ENV["METAL_CAPTURE_ENABLED"] = 1
julia> using Metal

julia> function vadd(a, b, c)
           i = thread_position_in_grid_1d()
           c[i] = a[i] + b[i]
           return
       end
vadd (generic function with 1 method)

julia> a = MtlArray([1]); b = MtlArray([2]); c = similar(a);
... Metal GPU Frame Capture Enabled

julia> Metal.@profile @metal threads=length(c) vadd(a, b, c);
[ Info: GPU frame capture saved to /var/folders/x3/75r5z4sd2_bdwqs68_nfnxw40000gn/T/jl_WzKxYVMlon/jl_metal.gputrace/
[ Info: GPU frame capture saved to julia_capture_1.gputrace
```
This will generate a `.gputrace` folder in a temporary directory. To view the profile, open
the folder with Xcode. Since the temporary directory is destroyed when the Julia process
ends though, be sure to copy the `.gputrace` directory to a stable location on your system
for later viewing.
This will generate a `.gputrace` folder in the current directory. To view the profile, open
the folder with Xcode.
Note: Xcode is a large install, and there are some peculiarities with viewing Julia-created
GPU traces. It's recommended to only have one trace open at a time, and the shader profiler
may fail to start.
## Metal API wrapper
Finally, all of the above functionality is made possible by interfacing with
71 changes: 37 additions & 34 deletions src/utilities.jl
@@ -46,41 +46,48 @@ function versioninfo(io::IO=stdout)
return
end

function profile_dir()
    root = pwd()
    i = 1
    while true
        path = joinpath(root, "julia_capture_$i.gputrace/")
        isdir(path) || return path
        i += 1
    end
end
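
Review note: the numbering scheme in `profile_dir` can be tried without a GPU. The sketch below is a standalone copy that takes the root directory as an argument; that parameter and the name `next_capture_dir` are assumptions made for illustration, since the committed function always uses `pwd()`.

```julia
# Standalone sketch of the "first free numbered folder" scheme.
# `next_capture_dir` and its `root` argument are hypothetical; the commit's
# `profile_dir()` takes no arguments and always starts from pwd().
function next_capture_dir(root::AbstractString=pwd())
    i = 1
    while true
        path = joinpath(root, "julia_capture_$i.gputrace/")
        # Return the first candidate that does not exist yet as a directory.
        isdir(path) || return path
        i += 1
    end
end

mktempdir() do tmp
    first_path = next_capture_dir(tmp)
    mkpath(first_path)                   # stand-in for a capture being written
    second_path = next_capture_dir(tmp)  # the next call skips the taken name
    println(first_path)
    println(second_path)
end
```

Because the path is computed when the capture starts, two processes profiling in the same directory at the same moment could in principle pick the same name; the sketch does not attempt to guard against that.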

"""
@profile [kwargs...] ex
Metal.@profile [kwargs...] ex
Profile Metal/GPU work using XCode's GPU frame capture capabilities.
Note: Metal frame capture must be enabled before launching Julia (METAL\\_CAPTURE\\_ENABLED=1)
and XCode is required to view and interpret the GPU trace output.
Several keyword arguments are supported that influence the behavior of `@profile`:
- `dir`: the directory to save the GPU trace folder as. Will append required ".gputrace" \
postfix by default if not explicitly put.
- `capture`: the object to capture GPU work on. Can be a MtlDevice, MtlCommandQueue, or \
MtlCaptureScope. This defaults to the global command queue, and selecting a different \
capture object may result in no GPU commands detected when viewed from Xcode.
!!! note
Metal frame capture must be enabled before launching Julia (METAL\\_CAPTURE\\_ENABLED=1)
and XCode is required to view and interpret the GPU trace output.
Several keyword arguments are supported that influence the behavior of `Metal.@profile`:
- `capture`: the object to capture GPU work on. Can be a MtlDevice, MtlCommandQueue, or
MtlCaptureScope. This defaults to the global command queue, and selecting a different
capture object may result in no GPU commands detected when viewed from Xcode.
- `dest`: the type of GPU frame capture output. Potential values:
- `MTL.MtCaptureDestinationGPUTraceDocument` for folder output for later viewing/sharing. (default)
- `MTL.MtCaptureDestinationDeveloperTools` for direct XCode viewing.
- `MTL.MtCaptureDestinationGPUTraceDocument` for folder output for later
viewing/sharing. (default)
- `MTL.MtCaptureDestinationDeveloperTools` for direct XCode viewing.
Note that when profiling the resulting gputrace folder in Xcode, do so one at a time to \
avoid "no profiling data found" errors.
When profiling the resulting gputrace folder in Xcode, do so one at a time to avoid "no
profiling data found" errors.
"""
macro profile(ex...)
work = ex[end]
kwargs = ex[1:end-1]
# Default output directory - generate random path with required folder name ending
dir = tempname()*"/jl_metal.gputrace/"
# Default destination type to GPU trace document
dest = MTL.MtCaptureDestinationGPUTraceDocument
# Default capture object to global command queue
capture = global_queue(current_device())

dest = MTL.MtCaptureDestinationGPUTraceDocument # default: folder output
capture = global_queue(current_device()) # default: capture global command queue
if !isempty(kwargs)
for kwarg in kwargs
key,val = kwarg.args
if key == :dir
dir = val
elseif key == :dest
if key == :dest
dest = val
elseif key == :capture
capture = val
@@ -90,20 +97,16 @@ macro profile(ex...)
end
end

expr = quote
local result = nothing
# Start tracking GPU work
startCapture($capture, $dest; folder=$dir)
quote
result = nothing
dir = profile_dir()
startCapture($capture, $dest; folder=dir)
try
# Execute GPU work and store result
result = $work
@info "GPU frame capture saved to $($dir)\n"
result = $(esc(work))
@info "GPU frame capture saved to $dir"
finally
# Stop tracking
stopCapture()
end
return result
end

return esc(expr)
end
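
Review note: besides changing the output location, this hunk replaces `esc(expr)` around the whole generated block with `$(esc(work))` around the user's code only. A minimal sketch of what that changes, using a hypothetical macro `@with_cleanup` that is not part of Metal.jl and a `println` standing in for `stopCapture()`:

```julia
# Hypothetical macro for illustration only. It mirrors the shape of the new
# `@profile` body: setup, user code inside try, teardown in finally.
macro with_cleanup(ex)
    quote
        result = nothing           # hygienic: renamed, so it cannot clash with the caller
        try
            result = $(esc(ex))    # only the user's code runs in the caller's scope
        finally
            println("cleanup")     # stand-in for stopCapture()
        end
        result                     # the sketch yields the value instead of using `return`
    end
end

function demo()
    result = 10                    # the caller's own `result`
    value = @with_cleanup result + 1
    return (result, value)
end

demo()
```

With the whole block escaped, as in the old code, the macro's `result` would live in the caller's scope and could overwrite a variable of the same name. Escaping only the user expression keeps the macro's temporaries private and makes helper names such as `profile_dir` resolve in the module that defines the macro.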
20 changes: 13 additions & 7 deletions test/profiling.jl
@@ -1,4 +1,5 @@
@testset "profiling" begin
mktempdir() do tmpdir

# Verify Metal capture is enabled via environment variable
@test haskey(ENV, "METAL_CAPTURE_ENABLED")
@@ -34,11 +35,10 @@ desc.destination = MTL.MtCaptureDestinationGPUTraceDocument

# Output URL
@test desc.outputFolder == nothing
path = tempname()*"/jl_metal.gputrace"
path = joinpath(tmpdir, "test.gputrace")
desc.outputFolder = path
@test desc.outputFolder == path


# Capture Scope
queue = MtlCommandQueue(current_device())
default_scope = manager.defaultCaptureScope
@@ -55,21 +55,27 @@ new_scope.label = new_label
manager.defaultCaptureScope = new_scope
@test manager.defaultCaptureScope == new_scope


# Capturing
bufferA = MtlArray{Float32,1}(undef, tuple(4), storage=Shared)

@test !isdir(path)
@test manager.isCapturing == false
startCapture(manager, desc)
@test manager.isCapturing
@test_throws ErrorException startCapture(manager, desc)
@metal threads=4 tester(bufferA)
stopCapture(manager)
@test manager.isCapturing == false
@test isdir(path)

# Profile Macro
Metal.@profile @metal threads=4 tester(bufferA)
Metal.@profile capture=current_device() @metal threads=4 tester(bufferA)
@test_throws ArgumentError Metal.@profile dir=path @metal threads=4 tester(bufferA)
cd(path) do
Metal.@profile @metal threads=4 tester(bufferA)
@test isdir("julia_capture_1.gputrace")
Metal.@profile capture=current_device() @metal threads=4 tester(bufferA)
@test isdir("julia_capture_2.gputrace")
@test_throws ArgumentError Metal.@profile @metal threads=4 tester(bufferA)
end

end
end
end
