perf: add more notes on profiling, plus hacky script for use with dtrace
wincent committed Aug 15, 2024
1 parent 7e2ea3a commit 4ee21e2
Showing 2 changed files with 110 additions and 1 deletion.
52 changes: 51 additions & 1 deletion CONTRIBUTING.md
@@ -122,9 +122,15 @@ vagrant destroy

# Profiling

To get intelligible stack traces, compile with debug symbols:

```
make PROFILE=1
```

## On macOS

I didn't have any success the last time I tried `xctrace`, but I'm including the notes here for reference anyway:

```
xctrace record --launch bin/benchmarks/matcher.lua --template "CPU Profiler" # Instruments.app hangs while opening this.
@@ -145,3 +151,47 @@ I also attempted using the `/usr/bin/sample` tool, which produces results, albei
(sleep 1 && luajit bin/benchmarks/matcher.lua) &
sample -wait luajit -mayDie
```

So far, `dtrace` is the only thing I've been able to get working usefully:

```
sudo -v
luajit bin/benchmarks/matcher.lua & ; DTRACE_PID=$! ; sudo vmmap $DTRACE_PID | grep commandt.so ; sudo dtrace -x ustackframes=100 -p $DTRACE_PID -n \
'profile-100 /pid == '$DTRACE_PID'/ { @[ustack()] = count(); }' -o dtrace.stacks
```

That is: refresh `sudo` credentials, kick off `luajit`, grab the base address of the `commandt.so` library so that we can symbolicate later on, then run `dtrace` for 60s, sampling 100 times per second (i.e. every 10ms) and grabbing the user stack (not kernel frames), then exit.

I tried a few tricks[^tricks] to get `dtrace` to symbolicate for us automatically but eventually had to do it manually with `atos`. Grab the base address of the `__TEXT` segment (printed by `vmmap`); in this example, `0x104ac0000`:

```
__TEXT 104ac0000-104ac8000 [ 32K 32K 0K 0K] r-x/rwx SM=COW /Users/USER/*/commandt.so
__DATA_CONST 104ac8000-104acc000 [ 16K 16K 16K 0K] r--/rwx SM=COW /Users/USER/*/commandt.so
__LINKEDIT 104acc000-104ad0000 [ 16K 16K 0K 0K] r--/rwx SM=COW /Users/USER/*/commandt.so
```
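If copying the address by hand gets tedious, the lookup could be scripted. The following is only a sketch: `extractTextBase` is a hypothetical helper (it is not part of this repository), and it assumes that `vmmap` lines look like the sample above, with the segment name first and the library path last.

```javascript
// Hypothetical helper (not part of the repository): find the start address of
// the `__TEXT` segment for a given library in `vmmap` output.
function extractTextBase(vmmapOutput, libraryName) {
  for (const line of vmmapOutput.split('\n')) {
    // Assumed line shape:
    //   __TEXT 104ac0000-104ac8000 [ 32K ... ] r-x/rwx SM=COW /path/to/commandt.so
    const match = line.match(/^\s*__TEXT\s+([0-9a-f]+)-[0-9a-f]+\s/);
    if (match && line.trimEnd().endsWith(libraryName)) {
      // Add the `0x` prefix that `bin/symbolicate-dtrace` expects.
      return `0x${match[1]}`;
    }
  }
  return null; // No matching `__TEXT` line found.
}

const sample = [
  '__TEXT 104ac0000-104ac8000 [ 32K 32K 0K 0K] r-x/rwx SM=COW /Users/USER/*/commandt.so',
  '__DATA_CONST 104ac8000-104acc000 [ 16K 16K 16K 0K] r--/rwx SM=COW /Users/USER/*/commandt.so',
].join('\n');

console.log(extractTextBase(sample, 'commandt.so')); // prints 0x104ac0000
```

The helper returns `null` rather than throwing when nothing matches, so a caller can decide whether a missing library is an error.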

Then run this hacky script:

```
cat dtrace.stacks | bin/symbolicate-dtrace 0x104ac0000 > dtrace.symbolicated
```
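For reference, the core of what the script does to each symbolicated frame can be tried in isolation. This sketch reuses the regular expression from `bin/symbolicate-dtrace`; the wrapper name `formatAtosOutput` is made up for illustration, and the sample input is the one given in the script's own comments.

```javascript
// Rewrite `atos` output of the form:
//
//     recursive_match (in commandt.so) (score.c:42)
//
// into the `library`function` shape that `dtrace` itself uses for frames:
//
//     commandt.so`recursive_match (score.c:42)
//
// `formatAtosOutput` is an illustrative name; the script does this inline.
function formatAtosOutput(atosOutput) {
  return atosOutput
    .trimEnd()
    .replace(/(\w+)\s+\(in ([^)]+)\)\s+\(([^)]+)\)/, (_match, fn, library, location) => {
      return `${library}\`${fn} (${location})`;
    });
}

console.log(formatAtosOutput('recursive_match (in commandt.so) (score.c:42)\n'));
// prints: commandt.so`recursive_match (score.c:42)
```

Input that doesn't match the pattern passes through unchanged (apart from trailing whitespace), so frames that can't be symbolicated are left as they were.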

The resulting `dtrace.symbolicated` file can then be visualized with:

```
git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph
./stackcollapse.pl ../dtrace.symbolicated > ../dtrace.collapsed
./flamegraph.pl ../dtrace.collapsed > ../dtrace.svg
```
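To make the intermediate "collapsed" format less mysterious, here is a rough sketch of the folding step. It is not a port of `stackcollapse.pl`, just an approximation under some assumptions: each stack in the `dtrace` output is a run of indented frame lines (leaf first) followed by a line containing only the sample count. The function name `collapseStacks` and the frame names in the sample are made up.

```javascript
// Approximate the folding step: turn each multi-line stack into a single
// `root;...;leaf count` line. Illustrative only; `collapseStacks` is a
// hypothetical name, not something provided by FlameGraph.
function collapseStacks(text) {
  const totals = new Map();
  let frames = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '') {
      continue; // Blank lines separate stacks.
    }
    if (/^\d+$/.test(line)) {
      // A bare number ends the current stack and gives its sample count.
      if (frames.length > 0) {
        // Frames arrive leaf-first; the folded format wants root-first.
        const key = frames.slice().reverse().join(';');
        totals.set(key, (totals.get(key) || 0) + Number(line));
      }
      frames = [];
    } else {
      // Drop trailing instruction offsets such as `+0x40`.
      frames.push(line.replace(/\+0x[0-9a-f]+$/, ''));
    }
  }
  return [...totals].map(([stack, count]) => `${stack} ${count}`);
}

// Made-up sample: two frames in commandt.so called from luajit, sampled 3 times.
const stacks = [
  '',
  '              commandt.so`leaf_fn (a.c:1)',
  '              commandt.so`caller_fn (b.c:2)',
  '              luajit`main+0x40',
  '                3',
  '',
].join('\n');

console.log(collapseStacks(stacks).join('\n'));
// prints: luajit`main;commandt.so`caller_fn (b.c:2);commandt.so`leaf_fn (a.c:1) 3
```

Identical stacks are summed into one line, which is what lets `flamegraph.pl` size each box by its total sample count.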

[^tricks]:
Tricks which didn't work included running from inside `lua/wincent/commandt/lib` (where the dSYM bundle is), and moving the dSYM bundle up to the root and running from there.

The probable reason why automatic symbol discovery doesn't work is the UUID mismatch between the library and the process that `dtrace` is tracing:

```
dwarfdump --uuid lua/wincent/commandt/lib/commandt.so # This matches...
dwarfdump --uuid lua/wincent/commandt/lib/commandt.so.dSYM # ... with this;
dwarfdump --uuid /opt/homebrew/bin/luajit # but not with this.
```
59 changes: 59 additions & 0 deletions bin/symbolicate-dtrace
@@ -0,0 +1,59 @@
#!/usr/bin/env node

/**
 * SPDX-FileCopyrightText: Copyright 2024-present Greg Hurrell and contributors.
 * SPDX-License-Identifier: BSD-2-Clause
 */

import child_process from 'node:child_process';
import readline from 'node:readline';

const SYMBOL_TABLE = {};

const BASE_ADDRESS = process.argv[2];

// Guard against a missing argument as well as a malformed one.
if (!BASE_ADDRESS || !BASE_ADDRESS.match(/^0x[0-9a-f]{8,12}$/)) {
  throw new Error(`Invalid base address: ${JSON.stringify(BASE_ADDRESS)}`);
}

const rl = readline.createInterface({
  input: process.stdin,
  crlfDelay: Infinity,
});

for await (const line of rl) {
  process.stdout.write(transformLine(line) + '\n');
}

function transformLine(line) {
  return line.replace(/^(\s+)(0x[0-9a-f]{8,12})$/, (_match, whitespace, address) => {
    if (!SYMBOL_TABLE[address]) {
      SYMBOL_TABLE[address] = lookup(address);
    }
    return `${whitespace}${SYMBOL_TABLE[address]}`;
  });
}

function lookup(address) {
  const result = child_process.spawnSync('atos', [
    '-arch',
    'arm64',
    '-o',
    'lua/wincent/commandt/lib/commandt.so',
    '-l',
    BASE_ADDRESS,
    address,
  ]);

  // Result should be of the form:
  //
  //     recursive_match (in commandt.so) (score.c:42)
  //
  // Transform that into:
  //
  //     commandt.so`recursive_match (score.c:42)
  //
  return result.stdout.toString().trimEnd().replace(/(\w+)\s+\(in ([^)]+)\)\s+\(([^)]+)\)/, (_match, fn, library, location) => {
    return `${library}\`${fn} (${location})`;
  });
}
