
This is the wrong approach.

What users want from these metrics is feedback about their hardware's performance. It should absolutely reflect issues related to memory latency. This is not about going faster; it's about making good use of the resources you have.

My typical use of similar metrics is iostat: a tool that shows various statistics about how the system is doing I/O to block devices. Among other things, it shows CPU utilization (which, in the context of this tool, means the amount of CPU work dedicated to I/O). When looking at this tool's output, I don't use CPU utilization to judge speed directly (there are read/write requests per second for that); it tells me whether I'm utilizing the system's capacity to do I/O to its full extent. And I don't care if I may be writing improperly aligned blocks causing write amplification, or failing to merge smaller blocks -- I'd use different tools for that.
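To make the device-side metric concrete, here's a minimal sketch of how an iostat-style %util can be derived. It assumes Linux /proc/diskstats semantics, where one cumulative counter tracks the milliseconds the device spent with I/O in flight; the function and the sample numbers are illustrative, not iostat's actual code.

```python
# Sketch: deriving a device %util the way iostat-style tools do.
# Assumption: a cumulative "milliseconds spent doing I/O" counter
# (as in Linux /proc/diskstats), sampled at two points in time.

def device_util_percent(busy_ms_before: int, busy_ms_after: int,
                        interval_ms: int) -> float:
    """%util over a sampling interval: the fraction of wall time the
    device had at least one request outstanding."""
    return 100.0 * (busy_ms_after - busy_ms_before) / interval_ms

# Example: the device was busy for 250 ms out of a 1000 ms interval.
print(device_util_percent(10_000, 10_250, 1_000))  # 25.0
```

Note this, too, is purely time-based: a device at 100% util might still be leaving throughput on the table (small random I/Os, no merging), which is exactly why you reach for other tools to diagnose that.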

The problem is that CPU utilization as displayed by e.g. top and our intuitive understanding of what it means to do work on a CPU are different. Tools that display utilization go for metrics that are easy to obtain rather than trying to match our intuition / be better sources of actionable information.

We want utilization to count progress through the program's instructions, because that's where we'd intuitively draw the line between hardware utilization and software issues. Instead, we get a metric that never over-estimates utilization, but is usually wrong.
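For illustration, here's roughly the easy-to-obtain computation tools like top perform, sketched under the assumption of Linux /proc/stat-style jiffy counters (field names simplified, numbers invented). The kernel only accounts the *time* a core spent in non-idle states, so a core stalled on memory the whole interval still shows up as fully utilized:

```python
# Sketch: the time-based CPU utilization that top-like tools report.
# Assumption: /proc/stat-style cumulative jiffy counters, sampled twice.
# A core stalled on a cache miss is accounted as "user" or "system"
# time all the same -- the metric can't see pipeline stalls.

def cpu_util_percent(before: dict, after: dict) -> float:
    """Utilization = share of elapsed jiffies not spent idle."""
    idle_d = (after["idle"] + after["iowait"]) - (before["idle"] + before["iowait"])
    total_d = sum(after.values()) - sum(before.values())
    return 100.0 * (total_d - idle_d) / total_d

before = {"user": 100, "system": 50, "idle": 800, "iowait": 50}
after  = {"user": 400, "system": 100, "idle": 900, "iowait": 100}
print(cpu_util_percent(before, after))  # 70.0
```

Nothing in that arithmetic involves instructions retired, which is the gap between the metric and the intuition described above.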



I disagree. The general user has no control over the code being executed; it's an application written by someone else. When that application is utilizing a core, then it's utilizing a core, and that is what this metric is (correctly) telling us. If you're in the business of writing software and trying to squeeze the most out of a core, then you use different tools.


These tools aren't for the "general user". They are for system programmers or system administrators.

> When that application is utilizing a core,

Core of what? A real CPU? A virtual CPU? Do we count Hyper-Threading™?

You are just repeating a term that you didn't define -- "utilization". I did define it in a way that seems plausible to me, given how people usually understand it intuitively. You just keep throwing the word around, but you don't even care to explain what you mean.


We start with a physical core. Virtual cores have "virtual" utilization, and similarly hyperthreaded cores (a bit of a marketing term that isn't always useful in the real world). Naturally, if you want to understand what a VM is doing, you also need to look at the hypervisor. If you want to dive into exactly what's going on with hyper-threaded cores, it can be harder, given you don't have perfect visibility.

A physical core is either idle or executing instructions. The portion of time that it's executing instructions is when it's utilized. I think this is a pretty clear and meaningful definition that's been used for decades.

A system admin running Outlook on a server is not going to be able to do anything about a pipeline stall in Outlook on some particular CPU/memory/motherboard. From their perspective, when utilization is 100%, Outlook is CPU-bound and can't do more work. That's why we have this metric. A stall, an unused execution unit, an inefficient sequence of instructions, an inefficient algorithm, and many other things all cause the actual work you get out of the core to be less than what you could get if you rewrote the program -- but none of that is what CPU utilization % means. If there are power management or thermal considerations, that's yet another thing you need to look at to get a complete picture.
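The gap between "utilized" and "doing useful work" is usually expressed as IPC (instructions per cycle), which hardware counters expose (e.g. via `perf stat` on Linux). A toy sketch with invented counter values, just to show why two cores at 100% utilization can be doing wildly different amounts of work:

```python
# Sketch: why 100% utilization says nothing about stalls.
# Counter values are invented; in practice you'd read cycle and
# retired-instruction counts from hardware performance counters.

def ipc(instructions: int, cycles: int) -> float:
    """Instructions retired per clock cycle over an interval."""
    return instructions / cycles

# Both cores were "100% utilized" for the same 1e9 cycles:
healthy = ipc(3_000_000_000, 1_000_000_000)  # 3.0 -- pipeline well fed
stalled = ipc(200_000_000, 1_000_000_000)    # 0.2 -- mostly waiting on memory
print(healthy, stalled)
```

Utilization would report both cores identically; IPC is the metric that separates them, and it's a profiling concern, not a capacity-monitoring one.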

Now Outlook might be I/O-bound, which is a different problem, for which we look at different metrics. By the way, the I/O metrics reported by various tools are also imperfect: whether the I/Os are sequential or random, the block size, and the mix of reads and writes all have their own peculiar performance characteristics. Those are of interest to people optimizing I/O, but not generally something that users of applications can do much about.

EDIT: It feels like you are looking for something that tells you, as a programmer, how much more you can squeeze out of your CPU. There's no such metric. It's up to you to use tools like profilers, your understanding of the architecture, and your imagination to figure that out. The utilization metric is super useful; I use it a lot and have for years. Do I need to understand all the other factors that influence it? Sure do. Is it something I'd use instead of profiling? No.



