Unit Reference
This page explains the key metrics and units used throughout zymtrace.
CPU % Mode: Absolute vs Relative
The CPU % Mode toggle allows you to switch between two different ways of calculating CPU percentages:
- Abs (Absolute): Shows real CPU usage percentages, just like what you'd see in tools such as htop or btop. For example, if your cluster has 2 machines with 8 cores each (16 cores total), and your workloads are consuming 4 cores on each machine (8 cores total), zymtrace will display 50% CPU usage for that service.
- Rel (Relative): Presents CPU utilization in relation to the other workloads you're profiling, helping you see which services dominate your compute resources. Each entity is shown as a percentage of the total compute consumed across all profiled workloads.
Use Absolute mode to understand real infrastructure utilization. Use Relative mode to understand the distribution of CPU consumption among actively profiled workloads.
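The two modes can be illustrated with a minimal sketch. The function names and the `api`/`worker` workloads below are hypothetical and are not part of zymtrace; only the 2-machine, 8-core figures come from the example above.

```python
def absolute_cpu_percent(cores_used: float, total_cores: float) -> float:
    """Share of all available cores that a workload consumes."""
    return 100.0 * cores_used / total_cores


def relative_cpu_percent(cores_by_workload: dict[str, float]) -> dict[str, float]:
    """Each workload's share of the cores consumed by all profiled workloads."""
    total_used = sum(cores_by_workload.values())
    if total_used == 0:
        return {name: 0.0 for name in cores_by_workload}
    return {name: 100.0 * used / total_used
            for name, used in cores_by_workload.items()}


# The example from the text: 2 machines x 8 cores, 4 cores consumed on each.
abs_pct = absolute_cpu_percent(cores_used=2 * 4, total_cores=2 * 8)

# Hypothetical workloads, for illustration only.
rel_pct = relative_cpu_percent({"api": 6.0, "worker": 2.0})
print(abs_pct, rel_pct)
```

In this sketch the absolute figure depends on how many cores exist, while the relative figures always sum to 100% across the profiled workloads.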
Core Size Reference
Core usage values are displayed with the following units:
- mc - millicore (1/1000 core)
- c - core
- kc - kilocore (1000 cores)
- Mc - megacore (1,000,000 cores)
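As a rough illustration of how these units scale, the hypothetical helper below picks the largest unit that fits a core count. The function name, the spacing, and the rounding are assumptions and may not match what the zymtrace UI displays.

```python
def format_cores(cores: float) -> str:
    """Render a core count with the mc / c / kc / Mc units listed above."""
    units = [(1_000_000, "Mc"), (1_000, "kc"), (1, "c")]
    for factor, suffix in units:
        if abs(cores) >= factor:
            return f"{cores / factor:g} {suffix}"
    # Anything below one core is expressed in millicores.
    return f"{cores * 1000:g} mc"


for value in (0.25, 1.5, 2500, 3_000_000):
    print(format_cores(value))
```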
Self vs Total Time
Understanding the difference between self and total time is crucial for identifying optimization opportunities:
Self Time
Time spent executing the function's own code only, excluding any functions it calls.
- Indicates if the function itself is expensive
- High self time = optimize the function's algorithm
- Measures the direct cost of the function's instructions
Total Time
Time including all called functions (the function plus everything it calls).
- Shows the full cost of calling this function
- High total time with low self time = optimize what it calls
- Measures the complete execution path
Example
Function: processData()
Self CPU: 2% (function is fast)
Total CPU: 45% (but calls expensive operations)
→ Optimize what it calls, not the function itself
In this example, processData() spends very little time in its own code (2%), but the functions it calls consume 43% of CPU time. The optimization opportunity is in the called functions, not processData() itself.
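The relationship between the two numbers can be sketched as a small call tree, where total time is self time plus the total time of every callee. The `Frame` type and the callees `parseInput` and `transform` are hypothetical, with values chosen only so the totals match the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    name: str
    self_pct: float  # CPU % spent in this function's own code
    children: list["Frame"] = field(default_factory=list)


def total_pct(frame: Frame) -> float:
    """Total = self plus the total of every callee."""
    return frame.self_pct + sum(total_pct(child) for child in frame.children)


# Hypothetical callees chosen so the numbers match the example above.
process_data = Frame("processData", 2.0, [
    Frame("parseInput", 13.0),
    Frame("transform", 30.0),
])

callee_pct = total_pct(process_data) - process_data.self_pct
print(total_pct(process_data), callee_pct)
```

The gap between total and self (`callee_pct`) is the part that can only be reduced by optimizing the callees.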
Unit Definitions
zymtrace uses specific units and metrics to measure and display resource consumption across your infrastructure. Understanding these units is essential to fully utilize the platform. The following sections detail the units used for GPU and CPU profiling.
- GPU Units
- CPU Units
GPU Units
Unit | Description |
---|---|
Self Time | For CUDA frames, Self Time reflects the time spent executing this particular function on the GPU, excluding any functions it called. For CPython and native frames this is always zero because they are not actually executed on the GPU. |
Total Time | For CPython and native frames, Total Time indicates the GPU time consumed by CUDA kernels launched by the function (including its callees); for CUDA frames, it reflects the time spent executing the function's body and any functions it called on the GPU. |
Self GPU % | For CUDA frames, Self GPU % reflects the percentage of time spent executing this particular function on the GPU, excluding any functions it called. For CPython and native frames this is always zero because they are not actually executed on the GPU. |
Total GPU % | For CPython and native frames, Total GPU % indicates the percentage of GPU time consumed by CUDA kernels launched by the function (including its callees); for CUDA frames, it reflects the percentage of time spent executing the function's body and any functions it called on the GPU. |
The data above varies based on the frame type:
- CUDA frames: Represent actual GPU kernel execution
- CPython/Native frames: Represent CPU code that launches GPU kernels
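One way to picture the frame-type rules is to attribute GPU time along each sampled stack. The sample layout, frame names, and timings below are invented for illustration and do not reflect zymtrace's internal data model.

```python
from collections import defaultdict

# Each sample: a stack from outermost to innermost frame, plus GPU time in ms.
# A frame is (kind, name), where kind is "cpython", "native", or "cuda".
samples = [
    ([("cpython", "train_step"), ("native", "launch_matmul"),
      ("cuda", "matmul_kernel")], 8.0),
    ([("cpython", "train_step"), ("native", "launch_relu"),
      ("cuda", "relu_kernel")], 2.0),
]

self_ms = defaultdict(float)
total_ms = defaultdict(float)

for stack, gpu_ms in samples:
    # Self time goes only to the innermost frame, and only if it ran on the GPU.
    kind, name = stack[-1]
    if kind == "cuda":
        self_ms[name] += gpu_ms
    # Total time goes to every distinct frame on the stack.
    for _, name in set(stack):
        total_ms[name] += gpu_ms

print(dict(self_ms))
print(dict(total_ms))
```

Here `train_step` has zero self GPU time but carries the total of both kernels it launched, which is the pattern the table describes for CPython and native frames.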
When to Use Self vs Total (GPU)
Use Self GPU Time to identify expensive GPU kernels, find kernels that need optimization, or look for inefficient CUDA implementations.
Use Total GPU Time to understand the full GPU cost of a function, find which high-level functions trigger expensive GPU operations, or decide what GPU workloads to optimize or replace.
CPU Units
Unit | Description |
---|---|
Cores Used (self) | Indicates the number of CPU cores used per second (on average) in the function's body, excluding the cores used in any functions it called. |
Cores Used (total) | Indicates the number of CPU cores used per second (on average) in the function's body, including the cores used in any functions it called. |
Self CPU % | Indicates the percentage of CPU time spent in the function's body, excluding the time spent in any functions it called. |
Total CPU % | Indicates the percentage of CPU time spent in the function's body, including the time spent in any functions it called. |
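A common way to read "cores used per second (on average)" is CPU time divided by the wall-clock length of the observation window. The sketch below assumes that interpretation; the function name and the sample figures are hypothetical and are not taken from zymtrace.

```python
def cores_used(cpu_seconds: float, window_seconds: float) -> float:
    """Average number of cores kept busy over the observation window."""
    return cpu_seconds / window_seconds


# Hypothetical function observed over a 60-second window.
window = 60.0
self_cores = cores_used(cpu_seconds=90.0, window_seconds=window)    # own code only
total_cores = cores_used(cpu_seconds=360.0, window_seconds=window)  # including callees
print(self_cores, total_cores)
```

Under this reading, a value of 1.5 cores means the function's own code kept one and a half cores busy on average for the whole window.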
When to Use Self vs Total (CPU)
Use Self CPU Time to identify hot loops or expensive algorithms, find functions that need code optimization, or look for inefficient implementations.
Use Total CPU Time to understand the full cost of a code path, find which high-level functions are most expensive, or decide what functionality to optimize or replace.