When CPU Affinity Matters
Case 1: Limit a Process’s CPU Consumption
One common case where CPU affinity matters is CPU resource allocation: keeping a process limited to a certain amount of CPU time, or a certain percentage of the total available. By restricting a process to specific cores, you control how much of the total CPU time pool it can access. On a quad-core processor with four physical cores, for example, you can cap it at 25%, 50%, 75%, or 100%. Logical processors make the picture more complex. While the scheduler may show an exact 50% for 2 of 4 logical cores, that does not mean those two logical cores are executing at 50% of the capacity of the physical cores. The real computational capacity limits are staggered, so they might instead be something like 25%, 32%, 82%, 100%. It is hard to say, and it varies for each CPU. Even so, this is an effective method to limit an application’s CPU use. It is *not* recommended for system components, security software, or other critical services.
You can do this with Process Lasso, using any of three automation features:
Remember, the OS CPU scheduler will work around any busy cores, so you need not worry much about putting too much load on too few cores, *but* you should be careful not to under-utilize your computing capacity.
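The core-limiting idea in Case 1 can be sketched in a few lines. Windows represents an affinity as a bitmask in which bit n stands for core n; the function names below are hypothetical helpers for illustration, not part of Process Lasso. Note that the computed cap is a share of scheduler time only, and with logical processors it may not match real computational capacity.

```python
def affinity_mask(cores):
    """Build the bitmask form of an affinity: bit n set means core n is allowed."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def cpu_time_cap(cores, total_cores):
    """Upper bound on the share of total CPU time the process can receive."""
    return len(set(cores)) / total_cores


# Assumed example: a process limited to cores 0 and 1 of a quad-core CPU.
allowed = [0, 1]
print(bin(affinity_mask(allowed)))  # 0b11
print(cpu_time_cap(allowed, 4))     # 0.5
```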
Case 2: Per-Core Frequency Scaling
Some newer processors, both AMD (TurboCore) and Intel (TurboBoost), have frequency scaling technologies that allow specific cores to be scaled up on demand. With this feature emerging in CPU hardware, the CPU affinity you set may actually make a big difference in real-world performance.
Normally, each time a thread gets a time slice (a period in which to use a core), it is granted whichever core the operating system’s scheduler determines to be most free. Yes, this is in contrast to the popular fallacy that a single thread stays on a single core. It means the thread(s) of an application might get swapped around to non-overclocked cores, and even under-clocked cores in some cases. Changing the affinity to force a single-threaded application to stay on a single core can therefore make a big difference in such scenarios, because the scaling up of a core does not happen instantly, not by a long shot in CPU time.
Therefore, for primarily single (or limited) thread applications, it is sometimes best to set the CPU affinity to a specific core, or subset of cores. This will allow the ‘Turbo’ processor frequency scaling to kick in and be sustained (instead of skipping around to various cores that may not be scaled up, and could even be scaled down).
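As a rough illustration of pinning a process to a chosen core, here is a minimal sketch rather than a definitive implementation. It assumes Python's `os.sched_setaffinity` on Linux and, on Windows, a direct call to the Win32 `SetProcessAffinityMask` function through `ctypes`. The function name `pin_current_process` is hypothetical.

```python
import os
import sys


def pin_current_process(cores):
    """Restrict the current process to the given core indices.

    Returns the set of cores now in effect, or None when the platform
    offers no affinity call this sketch knows about.
    """
    cores = set(cores)
    if hasattr(os, "sched_setaffinity"):
        # Linux: pid 0 means "the calling process".
        os.sched_setaffinity(0, cores)
        return set(os.sched_getaffinity(0))
    if sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentProcess.restype = wintypes.HANDLE
        kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
        kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
        # The Win32 call takes a bitmask: bit n set means core n is allowed.
        mask = sum(1 << core for core in cores)
        if not kernel32.SetProcessAffinityMask(kernel32.GetCurrentProcess(), mask):
            raise ctypes.WinError(ctypes.get_last_error())
        return cores
    return None
```

Calling `pin_current_process([0])` early in a single-threaded program would keep its work on core 0, which is the situation in which a ‘Turbo’ frequency boost has a chance to be sustained.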
A possible criticism of this feature is that certain threads will run faster than others. This is an unusual scenario that many OS CPU schedulers may not be prepared to handle. Windows 7, with its cycle-based counter, should be able to handle it adequately, but others may not. While this usually doesn’t matter, it could conceivably cause timing issues in cases where timing happens to be important, for instance a race condition. Sure, race conditions should never exist in the ‘wild’, but in this imperfect world we know they do.
Regardless, the end lesson is that it takes TIME to change frequency, often more time than the code took to execute. Keeping the CPU affinity restricted to certain overclocked cores is better than having the threads of the process swapped around to all cores, over-clocked or not. As of now, no OS scheduler takes the active clock speed of individual cores into consideration. Perhaps that will change in a future release of Windows, but it seems unlikely for at least several years.
Case 3: Core Thrashing
Just by the name, you know this is a bad thing. You lose performance when a thread is swapped to a different core, because whatever it had built up in the old core’s cache is ‘lost’ each time. In general, the *less* switching between cores, the better. One would hope the OS would try to avoid this, but it doesn’t seem to at all in quick tests under Windows 7. Therefore, it is recommended you manually adjust the CPU affinity of certain applications to achieve better performance.
Case 4: Intel Hyper-Threading and AMD Bulldozer
Another important use case is avoiding the placement of loads on logical (Hyper-Threaded) cores that share the same physical CPU core. Doubling up on one physical core should only be done when there are no more free physical CPU cores to assign a load to. The Windows scheduler is aware of this and will move a thread to a sibling logical core only if appropriate, at least most of the time.
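The idea of spreading load across physical cores before doubling up on logical ones can be sketched as follows. The topology table here is a hypothetical example (a 4-core, 8-thread CPU whose sibling logical processors are numbered in adjacent pairs). Real layouts vary and should be read from the operating system rather than assumed.

```python
def one_logical_per_physical(topology):
    """Pick the lowest-numbered logical CPU on each physical core.

    topology maps logical CPU index -> physical core index.
    """
    chosen = {}
    for logical in sorted(topology):
        # Keep only the first logical CPU seen for each physical core.
        chosen.setdefault(topology[logical], logical)
    return sorted(chosen.values())


# Hypothetical layout: logical CPUs 0-1 share core 0, 2-3 share core 1, etc.
topology = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
cpus = one_logical_per_physical(topology)
mask = sum(1 << cpu for cpu in cpus)
print(cpus)       # [0, 2, 4, 6]
print(hex(mask))  # 0x55
```

An affinity built this way gives each thread a physical core to itself, as long as the process has no more busy threads than there are physical cores.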