<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>LoadRunner TnT &#187; processor</title>
	<atom:link href="http://www.loadrunnertnt.com/tag/processor/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.loadrunnertnt.com</link>
	<description>Performance Testing, LoadRunner Tips &#38; Tricks</description>
	<lastBuildDate>Mon, 08 Mar 2010 07:57:02 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Understanding Processor: Thread State</title>
		<link>http://www.loadrunnertnt.com/concepts/understanding-processor-thread-state/</link>
		<comments>http://www.loadrunnertnt.com/concepts/understanding-processor-thread-state/#comments</comments>
		<pubDate>Thu, 29 Jan 2009 16:54:25 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Concepts]]></category>
		<category><![CDATA[processor]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=178</guid>
		<description><![CDATA[As we know, a multiprogramming OS switches the processor back and forth between all the program threads that are executing. When the current thread blocks, usually due to I/O, the Windows Scheduler finds another thread that is ready to run and schedules it for execution. If no threads are ready to run, Windows schedules a [...]]]></description>
			<content:encoded><![CDATA[<p>As we know, a multiprogramming OS switches the processor back and forth between all the program threads that are executing. When the current thread blocks, usually due to I/O, the Windows Scheduler finds another thread that is ready to run and schedules it for execution. If no threads are ready to run, Windows schedules a thread associated with the System Idle process to run instead. When an I/O operation completes, a blocked thread becomes eligible to run again. This scheme means that threads alternate back and forth between the two states: a ready state, where a thread is eligible to execute instructions, and a blocked state. A blocked thread is waiting for some system event that signals that the transition from waiting to ready can occur.<br />
<span id="more-178"></span></p>
<p>The Thread State counter is an instantaneous counter that you will need to observe at very fine granularity to catch this behavior. The following tables describe the thread states and wait reasons.</p>
<p><strong>Values for Thread State Counter</strong></p>
<p>0 &#8211; Initializing<br />
1 &#8211; Ready<br />
2 &#8211; Running<br />
3 &#8211; Standby<br />
4 &#8211; Terminated<br />
5 &#8211; Waiting<br />
6 &#8211; Transition<br />
7 &#8211; Unknown</p>
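<p>When logging this counter from a script, it can help to translate the raw numbers into names. The following is a minimal sketch, assuming the values listed above; the names <code>ThreadState</code> and <code>describe_state</code> are hypothetical and not part of any Windows API.</p>

```python
from enum import IntEnum


class ThreadState(IntEnum):
    """Values of the Thread State counter, as listed in the table above."""
    INITIALIZING = 0
    READY = 1
    RUNNING = 2
    STANDBY = 3
    TERMINATED = 4
    WAITING = 5
    TRANSITION = 6
    UNKNOWN = 7


def describe_state(raw_value):
    """Translate a raw counter sample into a readable state name.

    Anything outside the table is reported as "Unknown" (an assumption
    made for this sketch, not documented counter behavior).
    """
    try:
        return ThreadState(int(raw_value)).name.title()
    except ValueError:
        return "Unknown"


print(describe_state(5))
```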
<p><strong>Values for the Thread Wait Reason counter</strong></p>
<p>0 &#8211; Waiting for a component of the Windows NT Executive<br />
1 &#8211; Waiting for a page to be freed<br />
2 &#8211; Waiting for a page to be mapped or copied<br />
3 &#8211; Waiting for space to be allocated in the page or nonpaged pool<br />
4 &#8211; Waiting for an Execution Delay to be resolved<br />
5 &#8211; Suspended<br />
6 &#8211; Waiting for a user request<br />
7 &#8211; Waiting for a component of the Windows NT Executive<br />
8 &#8211; Waiting for a page to be freed<br />
9 &#8211; Waiting for a page to be mapped or copied<br />
10 &#8211; Waiting for space to be allocated in the page or nonpaged pool<br />
11 &#8211; Waiting for an Execution Delay to be resolved<br />
12 &#8211; Suspended<br />
13 &#8211; Waiting for a user request<br />
14 &#8211; Waiting for an event pair high<br />
15 &#8211; Waiting for an event pair low<br />
16 &#8211; Waiting for an LPC Receive notice<br />
17 &#8211; Waiting for an LPC Reply notice<br />
18 &#8211; Waiting for virtual memory to be allocated<br />
19 &#8211; Waiting for a page to be written to disk<br />
20+ &#8211; Reserved</p>
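<p>Notice that values 7 to 13 repeat the descriptions of values 0 to 6. A lookup can exploit that repetition. This is only a sketch built from the table above; <code>describe_wait_reason</code> is a hypothetical helper name, and treating out-of-range values as "Reserved" is an assumption of the sketch.</p>

```python
# Descriptions for values 0-6; the table repeats them for values 7-13.
BASE_REASONS = [
    "Waiting for a component of the Windows NT Executive",
    "Waiting for a page to be freed",
    "Waiting for a page to be mapped or copied",
    "Waiting for space to be allocated in the page or nonpaged pool",
    "Waiting for an Execution Delay to be resolved",
    "Suspended",
    "Waiting for a user request",
]

# Values 14-19 have their own descriptions.
OTHER_REASONS = {
    14: "Waiting for an event pair high",
    15: "Waiting for an event pair low",
    16: "Waiting for an LPC Receive notice",
    17: "Waiting for an LPC Reply notice",
    18: "Waiting for virtual memory to be allocated",
    19: "Waiting for a page to be written to disk",
}


def describe_wait_reason(value):
    """Map a Thread Wait Reason sample to its description from the table."""
    if 0 <= value <= 13:
        return BASE_REASONS[value % 7]
    # Anything not in the table (20 and up) is reported as Reserved.
    return OTHER_REASONS.get(value, "Reserved")


print(describe_wait_reason(7))
```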
<p><a title="Windows 2000 Performance Guide" href="resources/66-books" target="_blank">(Source: Windows 2000 Performance Guide by Mark Friedman &amp; Odysseas Pentakalos)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/concepts/understanding-processor-thread-state/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding Processor: Ready Queue</title>
		<link>http://www.loadrunnertnt.com/concepts/understanding-processor-ready-queue/</link>
		<comments>http://www.loadrunnertnt.com/concepts/understanding-processor-ready-queue/#comments</comments>
		<pubDate>Sat, 24 Jan 2009 16:48:37 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Concepts]]></category>
		<category><![CDATA[processor]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=186</guid>
		<description><![CDATA[The Processor Queue Length counter in the System object is an extremely important indicator of processor performance. It is an instantaneous peek at the number of Ready threads that are currently waiting to run. Even though reporting processor utilization is much more popular, the Processor Queue Length is actually a more important indicator of a [...]]]></description>
			<content:encoded><![CDATA[<p>The <strong>Processor Queue Length</strong> counter in the System object is an extremely important indicator of processor performance. It is an instantaneous peek at the number of Ready threads that are currently waiting to run. Even though reporting processor utilization is much more popular, the Processor Queue Length is actually a more important indicator of a processor bottleneck. It shows that work is being delayed, and the delay is directly proportional to the length of the queue.<span id="more-186"></span></p>
<p>Since there is one <strong>Scheduler Dispatch Queue</strong> that services all processors, the Queue Length counter is measured only at the System level. The <strong>Thread State</strong> counter in the Thread object indicates precisely which threads are waiting for service at the processor(s). In other words, the Processor Queue Length counter indicates how many threads are waiting in the Scheduler dispatch Ready Queue, while the Thread State counter tells which threads are in the queue. A good working assumption is that when the processor is very busy, queuing delays impact the performance of executing threads. The longer the queue, the longer the delays that threads encounter.</p>
<p>By monitoring the Thread State counter for all instances of the Thread object at the same time, it is possible to determine which threads are waiting in the Processor Ready Queue at the time measurements are taken. Note that the instantaneous measure of the size of the Ready Queue is influenced by the dispatchability of the performance monitor application itself. In other words, the measurement reflects the Ready Queue at the time the Scheduler was finally able to dispatch the System Monitor measurement thread. The System Monitor runs at high priority, but by no means at the highest priority in the system.</p>
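<p>As an illustration of that idea, the sketch below takes one snapshot of Thread State values and picks out the threads that are in the Ready state (value 1). The snapshot is made-up sample data and <code>ready_threads</code> is a hypothetical helper; collecting real counter values is outside the scope of this sketch.</p>

```python
READY = 1  # Thread State value for a thread that is ready to run


def ready_threads(snapshot):
    """Return the names of threads whose Thread State is Ready.

    snapshot maps a thread instance name to its Thread State value,
    all sampled at (roughly) the same instant.
    """
    return sorted(name for name, state in snapshot.items() if state == READY)


# Hypothetical sample: instance names and values are invented for illustration.
snapshot = {"iexplore/0": 5, "iexplore/1": 1, "sqlservr/3": 1, "perfmon/0": 2}
waiting = ready_threads(snapshot)
print(len(waiting), waiting)
```

<p>The length of the returned list plays the role of the Processor Queue Length for that instant, while the names show which threads make up the queue.</p>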
<p><strong>Priority Scheduling</strong> is the general solution designed to cope with situations where the processor is very busy. Priority scheduling orders the Ready Queue, ensuring that under conditions of scarcity, the highest-priority work gains favored access to the resource.</p>
<p><a title="Windows 2000 Performance Guide" href="resources/66-books" target="_blank">(Source: Windows 2000 Performance Guide by Mark Friedman &amp; Odysseas Pentakalos)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/concepts/understanding-processor-ready-queue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding Processor: Ready Queue Management</title>
		<link>http://www.loadrunnertnt.com/concepts/188/</link>
		<comments>http://www.loadrunnertnt.com/concepts/188/#comments</comments>
		<pubDate>Wed, 21 Jan 2009 16:38:48 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Concepts]]></category>
		<category><![CDATA[processor]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=188</guid>
		<description><![CDATA[Think of priority scheduling as the set of rules for ordering the Ready Queue, which is the internal data structure that points to the threads that are ready to execute. A ready thread (from IE or any other application) transitions directly to the running state, where it executes if no other higher-priority threads are running [...]]]></description>
			<content:encoded><![CDATA[<p>Think of priority scheduling as the set of rules for ordering the <strong>Ready Queue</strong>, which is the internal data structure that points to the threads that are ready to execute. A ready thread (from IE or any other application) transitions directly to the running state, where it executes if no other higher-priority threads are running or waiting. If there is another thread, the Windows Scheduler selects the highest-priority thread in the Ready Queue to run.<span id="more-188"></span></p>
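<p>To make the ordering rule concrete, here is a minimal sketch of a priority-ordered ready queue. It is not how the Windows Scheduler is implemented; the class name <code>ReadyQueue</code>, the thread names and the priority numbers are all hypothetical, and first-come-first-served ordering within one priority level is an assumption of the sketch.</p>

```python
import heapq
import itertools


class ReadyQueue:
    """Toy ready queue: always hands out the highest-priority ready thread."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: FIFO within a priority

    def add(self, name, priority):
        # heapq is a min-heap, so store the negated priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name))

    def dispatch(self):
        """Remove and return the next thread to run, or None if nothing is ready."""
        if not self._heap:
            return None  # in Windows, the idle thread would run instead
        return heapq.heappop(self._heap)[2]


queue = ReadyQueue()
queue.add("editor", 8)
queue.add("io_completion", 15)
queue.add("batch_job", 4)
print(queue.dispatch())
```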
<p>Once a thread is running, it executes continuously on the processor until one of the following events occurs:</p>
<ol>
<li>An external interrupt occurs</li>
<li>The thread voluntarily relinquishes the processor, usually because it needs to perform I/O</li>
<li>The thread involuntarily relinquishes the processor because it incurred a page fault, which requires the system to perform I/O on its behalf</li>
<li>A maximum uninterrupted execution time limit is reached</li>
</ol>
<p><strong>Interrupts</strong></p>
<p>An interrupt is a signal from an external device to the processor. Hardware devices raise interrupts to request immediate servicing. An I/O request to a disk device, for example, once initiated, is processed at the device independently of the CPU. When the device completes the request, it raises an interrupt to signal the processor that the operation has completed. This signal is treated as a high-priority event: the device is relatively slow compared to the processor, the device needs attention, and some other user may be waiting for the physical device to be free. When the processor recognizes the interrupt request, it:</p>
<ol>
<li>Stops whatever it is doing immediately (unless it is already servicing a higher-priority interrupt request)</li>
<li>Saves the status of the current running thread (including the current values of processor registers, e.g. Program Counter showing the next instruction to be executed and the Stack Pointer pointing to the program’s working storage) in an internal data structure called the Thread Context.</li>
<li>Begins processing the interrupt.</li>
</ol>
<p>The thread that was running when the interrupt occurred returns to the <strong>Ready Queue</strong>, and it might not be the thread the <strong>Scheduler</strong> selects to run following interrupt processing.</p>
<p>Interrupt processing likely adds another thread to the Ready Queue, namely the thread that was waiting for the event to occur. In Windows, one probable consequence of an interrupt is a reordering of the Scheduler Ready Queue following interrupt processing. The device driver that completes the interrupt processing supplies a boost to the priority of the application thread that transitions from waiting to ready when the interrupt processing completes. Interrupt processing juggles priorities so that the thread made ready to run following interrupt processing is likely to be the highest-priority thread waiting to run in the Ready Queue. Thus, the application thread waiting for an I/O request to complete is likely to receive service at the processor next.</p>
<p><strong>Voluntary wait</strong></p>
<p>A thread voluntarily relinquishes the processor when it issues an I/O request and then waits for the request to complete. Other voluntary waits include a timer wait or waiting for a serialization signal from another thread. A thread issuing a voluntary wait enters the Wait State, causing the Windows Scheduler to select the highest-priority task in the Ready Queue to execute next. The Thread Wait Reason for a thread in a voluntary wait is 7.</p>
<p><strong>Involuntary wait</strong></p>
<p>Involuntary waits are most frequently associated with virtual memory management. For example, a thread enters an involuntary wait if the processor attempts to execute an instruction referencing data in a buffer that is currently not resident in memory. Since the instruction cannot be executed, the processor generates a page fault interrupt, which Windows must resolve by allocating a free page in memory and reading the page containing the instruction or data into memory from disk. The currently running process is suspended and the Program Counter is reset to re-execute the failed instruction. The suspended task is placed in an involuntary wait state until the page requested is brought into memory and the instruction that originally failed is executed. At that point, the VM manager component of Windows is responsible for transitioning the thread from the wait state back to ready.</p>
<p><strong>Time allotment exceeded</strong></p>
<p>A thread that does not need to perform I/O or wait for an event is not allowed to monopolize the processor completely. Without intervention from the Scheduler, some very CPU-intensive execution threads will attempt to do this. A program bug may also cause the thread to go into an infinite loop, in which it attempts to execute continuously. Either way, the Windows Scheduler eventually interrupts the running thread if no other type of interrupt occurs. If the thread does not relinquish the processor voluntarily, the Scheduler eventually forces it to return to the Ready Queue. This form of processor sharing is called time-slicing, and it is designed to prevent a CPU-bound task from dominating the use of the processor for an extended period of time. Without time-slicing, a high-priority CPU-intensive thread could delay other threads waiting in the Ready Queue indefinitely. The Windows Scheduler implements time-slicing by setting a clock timer interrupt to occur at regular intervals to check on the threads that are running.</p>
<p>When a thread’s allotted time-slice is exhausted, Windows Scheduler interrupts it and looks for another ready thread to dispatch. Of course, if the interrupted thread happens to be the highest-priority thread (or only ready thread), the Scheduler selects it to run again immediately. However, Windows also lowers the likelihood that a CPU-intensive thread will monopolize the processor. This technique of boosting the relative priority of threads waiting on device interrupts and reducing the priority of CPU-intensive threads approximates a mean time to wait algorithm, a technique for maximizing throughput in a multiprogramming environment.</p>
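<p>The effect of time-slicing can be sketched with a small simulation. This is a deliberately simplified model, assuming all threads have equal priority and a fixed time-slice; it ignores the priority adjustments described above. The function name <code>run_time_sliced</code> and the workload figures are hypothetical.</p>

```python
from collections import deque


def run_time_sliced(work, quantum):
    """Simulate equal-priority threads sharing one processor.

    work maps a thread name to the CPU time units it still needs.
    Returns the execution timeline as a list of (name, units_run) pairs.
    """
    ready_queue = deque(work.items())
    timeline = []
    while ready_queue:
        name, remaining = ready_queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Time-slice exhausted: the thread goes back to the end of the queue.
            ready_queue.append((name, remaining - ran))
    return timeline


print(run_time_sliced({"cpu_hog": 5, "short_task": 1}, quantum=2))
```

<p>In this model the short task gets the processor after the CPU-bound thread has used a single time-slice, instead of waiting for it to finish all of its work.</p>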
<p><a title="Windows 2000 Performance Guide" href="resources/66-books" target="_blank">(Source: Windows 2000 Performance Guide by Mark Friedman &amp; Odysseas Pentakalos)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/concepts/188/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding Processor: Interrupt Processing</title>
		<link>http://www.loadrunnertnt.com/concepts/understanding-processor-interrupt-processing/</link>
		<comments>http://www.loadrunnertnt.com/concepts/understanding-processor-interrupt-processing/#comments</comments>
		<pubDate>Fri, 16 Jan 2009 16:34:36 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Concepts]]></category>
		<category><![CDATA[processor]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=191</guid>
		<description><![CDATA[Interrupts are subject to priority. The interrupt priority scheme is hardware-determined, but in the interest of portability it is abstracted by the Windows HAL. During interrupt processing, lower-priority interrupts are masked so that they remain pending until the current interrupt processing completes. Following interrupt processing during which interrupts themselves are disabled, the operating [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interrupts</strong> are subject to priority. The interrupt priority scheme is hardware-determined, but in the interest of portability it is abstracted by the <strong>Windows HAL</strong>. During interrupt processing, lower-priority interrupts are masked so that they remain pending until the current interrupt processing completes. Following interrupt processing during which interrupts themselves are disabled, the operating system returns to its normal operating mode with the processor reset once more to receive interrupt signals. The processor is once again enabled for interrupts.<br />
<span id="more-191"></span>Strictly speaking, on an Intel processor, there is a class of interrupts used for switching between the user level and privileged OS code. Although this involves interrupt processing on the Intel microprocessor, we are not referring to that type of interrupt here. Switching privilege levels does not necessarily cause the executing thread to relinquish the processor. However, Windows does classify these OS supervisor call interrupts as context switches.</p>
<p>In Windows, hardware device interrupts are serviced by an <strong>interrupt service routine</strong>, or <strong>ISR</strong>, which is a standard device driver function. Device drivers are extensions of the OS tailored to respond to the specific characteristics of the devices they understand and control. The ISR code executes at the interrupt level priority, with interrupts at the same or lower level disabled. An ISR is high priority by definition since it interrupts the regularly scheduled thread and executes until it voluntarily relinquishes the processor (or is itself interrupted by a higher-priority interrupt).</p>
<p>The ISR normally signals the device to acknowledge the event, stops the interrupt from occurring, and saves the device status for later processing. It then schedules a <strong>deferred procedure call (DPC)</strong> to a designated routine that performs the bulk of the device-specific work associated with interrupt processing. DPCs are a special feature of Windows designed to allow the machine to operate enabled for interrupts as much as possible. DPC code executes at a higher priority than other OS privileged modes, but one that does not disable further interrupts from occurring and being serviced.</p>
<p>The <strong>DPC mechanism</strong> in Windows keeps the machine running in a state enabled for interrupts as much as possible. This architecture is especially useful with Intel PC hardware where many devices connected to a single PCI bus can share the same <strong>Interrupt Request (IRQ) level</strong>. DPC routines execute from a separate DPC dispatcher queue, which Windows empties before it calls the Scheduler to re-dispatch an ordinary kernel or application thread. A typical function carried out by a DPC routine is to reset the device for the next operation and launch the next request if one is queued. When the DPC completes, a high-priority kernel or device driver thread then marks the I/O function complete. At the end of this chain of events, a waiting thread transitions to the ready state, poised to continue processing now that the I/O it requested has completed.</p>
<p>When an interrupt occurs, the thread executing loses control of the processor immediately. When the DPC performing the bulk of the interrupt processing completes, the Windows Scheduler checks the Ready Queue again and dispatches the highest-priority thread. The interrupted thread does not necessarily regain control following interrupt processing because a higher-priority task may be ready to run. This is variously known as preemptive scheduling, preemptive multithreading or preemptive multitasking.</p>
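<p>The ordering described above (a short ISR, then the queued DPCs, then the Scheduler) can be illustrated with a toy model. This is only a sketch of the sequence of events, not of real driver code; the class <code>ToyProcessor</code>, its methods and the device and thread names are hypothetical.</p>

```python
from collections import deque


class ToyProcessor:
    """Toy model of the ISR -> DPC -> Scheduler sequence."""

    def __init__(self):
        self.dpc_queue = deque()
        self.log = []

    def isr(self, device):
        # The ISR does the minimum: acknowledge the device and defer the rest.
        self.log.append(f"ISR ack {device}")
        self.dpc_queue.append(device)

    def drain_dpcs_then_schedule(self, ready_threads):
        # The DPC queue is emptied before any ordinary thread is dispatched.
        while self.dpc_queue:
            device = self.dpc_queue.popleft()
            self.log.append(f"DPC finish {device}")
        if ready_threads:
            # ready_threads holds (priority, name) pairs; the highest priority wins.
            _, name = max(ready_threads)
            self.log.append(f"dispatch {name}")


cpu = ToyProcessor()
cpu.isr("disk")
cpu.isr("nic")
cpu.drain_dpcs_then_schedule([(8, "interrupted_app"), (13, "io_waiter")])
print(cpu.log)
```

<p>In this model the thread that was interrupted does not run next, because the thread that was waiting on the I/O is ready with a higher priority.</p>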
<p><a title="Windows 2000 Performance Guide" href="resources/66-books" target="_blank">(Source: Windows 2000 Performance Guide by Mark Friedman &amp; Odysseas Pentakalos)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/concepts/understanding-processor-interrupt-processing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Understanding Processor: Processor State</title>
		<link>http://www.loadrunnertnt.com/concepts/understanding-processor-processor-state/</link>
		<comments>http://www.loadrunnertnt.com/concepts/understanding-processor-processor-state/#comments</comments>
		<pubDate>Sat, 10 Jan 2009 16:27:58 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Concepts]]></category>
		<category><![CDATA[processor]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=193</guid>
		<description><![CDATA[Processor utilization can be further broken down into time spent executing in user mode (Intel Ring 3) or in privileged mode (Ring 0), two mutually exclusive states. Applications typically run in the more restricted user mode, while operating system functions run in privileged mode. Whenever an application implicitly or explicitly calls an OS service (e.g. [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-family: arial;"><strong>Processor utilization</strong> can be further broken down into time spent executing in user mode (Intel Ring 3) or in privileged mode (Ring 0), two mutually exclusive states. Applications typically run in the more restricted user mode, while operating system functions run in privileged mode. Whenever an application implicitly or explicitly calls an OS service (e.g. to allocate or free memory, or perform some operation on a file), a context switch occurs as the system transitions from user to privileged mode and back again. The portion of time that a thread is executing in user mode is captured as <strong>% User Time</strong>; privileged mode execution time is captured in the <strong>% Privileged Time</strong> counter.<span id="more-193"></span></span></p>
<p><span style="font-family: arial;">Processor time usage in Windows is broken out into two additional subcategories. <strong>% Interrupt Time</strong> represents processor cycles consumed in device driver <strong>interrupt service routines (ISRs)</strong>, which process interrupts from attached peripherals such as the keyboard, mouse, disks, network interface card, etc. This is work performed at very high priority, typically while other interrupts are disabled. It is captured and reported separately not only because of its high priority, but also because it is not easily associated with any particular user process. Windows also tracks the amount of time device drivers spend in <strong>deferred procedure calls (DPCs)</strong>, which also service peripheral devices but run with interrupts enabled. DPCs represent higher-priority work than other system calls and kernel thread activity. Note that <strong>% DPC Time</strong> is already included in the % Privileged Time measure.</span></p>
<p><span style="font-family: arial;">The Scheduler’s thread timing function is notified whenever any context switch occurs, and dutifully records the processing time for the completed function in the appropriate bucket. The context switch might involve going from one user thread to another, a user thread to a kernel function, or a kernel thread to an ISR, followed by a DPC.</span></p>
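<p>Because % DPC Time is already part of % Privileged Time, adding the two together would count the DPC work twice. The sketch below shows one way to split a set of readings without double-counting; the helper name <code>breakdown</code> and the sample percentages are hypothetical.</p>

```python
def breakdown(user_pct, privileged_pct, dpc_pct):
    """Split processor busy time into buckets without double-counting DPC time."""
    if dpc_pct > privileged_pct:
        raise ValueError("% DPC Time cannot exceed % Privileged Time")
    return {
        # User and privileged mode are mutually exclusive, so they simply add up.
        "busy": user_pct + privileged_pct,
        "user": user_pct,
        "dpc": dpc_pct,
        # Privileged time that is not DPC work.
        "other_privileged": privileged_pct - dpc_pct,
    }


print(breakdown(user_pct=40, privileged_pct=25, dpc_pct=5))
```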
<p><a title="Windows 2000 Performance Guide" href="resources/66-books" target="_blank">(Source: Windows 2000 Performance Guide by Mark Friedman &amp; Odysseas Pentakalos)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/concepts/understanding-processor-processor-state/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Understanding Processor: Processor Basics</title>
		<link>http://www.loadrunnertnt.com/concepts/understanding-processor-processor-basics/</link>
		<comments>http://www.loadrunnertnt.com/concepts/understanding-processor-processor-basics/#comments</comments>
		<pubDate>Mon, 05 Jan 2009 16:03:56 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Concepts]]></category>
		<category><![CDATA[processor]]></category>
		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=195</guid>
		<description><![CDATA[Windows is a multiprogramming OS, which means that it manages and selects among multiple programs that can all be active in various stages of execution at the same time. The displaceable unit in Windows, representing the application or system code to be executed, is the thread. The Scheduler running inside the Windows OS kernel keeps [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Windows</strong> is a multiprogramming OS, which means that it manages and selects among multiple programs that can all be active in various stages of execution at the same time. The displaceable unit in Windows, representing the application or system code to be executed, is the thread. The Scheduler running inside the Windows OS kernel keeps track of each thread in the system and points the processor hardware to threads that are ready to run.<span id="more-195"></span></p>
<p>The basic rationale for multiprogramming is that most computing tasks do not execute instructions continuously. After a program thread executes for some period of time, it usually needs to perform an I/O operation like reading information from the disk, printing characters on a printer, or drawing data on the display. While the program is waiting for this I/O function to complete, it does not need to hang on to the processor. An OS that supports multiprogramming saves the status of a program that is waiting, restores its status when it is ready to resume execution, and finds something else that can run in the interim.</p>
<p>Because I/O devices are much slower than the processor, I/O operations typically take a long time compared to CPU processing. A single I/O operation to a disk may take 10 milliseconds, which means that the disk is only capable of executing perhaps 100 such operations per second. Printers, which are even slower, are usually rated in pages printed per minute. In contrast, processors might execute an instruction every one or two clock cycles.</p>
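<p>The arithmetic behind that comparison is simple to check. The sketch below reproduces the 10 ms figure from the text; the 1 GHz clock rate is an assumed value chosen only to make the gap visible, and the function names are hypothetical.</p>

```python
def ops_per_second(service_time_ms):
    """How many back-to-back operations fit into one second."""
    return 1000 / service_time_ms


def cycles_during_io(service_time_ms, clock_hz):
    """Clock cycles that pass while one I/O operation is in flight."""
    return service_time_ms * clock_hz // 1000


DISK_SERVICE_TIME_MS = 10      # figure used in the text
ASSUMED_CLOCK_HZ = 1_000_000_000  # hypothetical 1 GHz processor

print(ops_per_second(DISK_SERVICE_TIME_MS))
print(cycles_during_io(DISK_SERVICE_TIME_MS, ASSUMED_CLOCK_HZ))
```

<p>Under these assumptions, millions of clock cycles go by during a single disk operation, which is the time a multiprogramming OS tries to give to other threads.</p>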
<p>In a <strong>multiprogramming</strong> OS, programs execute until they block, normally because they are waiting for an external event to occur. When this awaited event finally does occur, interrupt processing makes the program ready to run again.</p>
<p>Multiprogramming introduces the possibility that a program will encounter delays waiting for the processor while some other program is running. In Windows, following an interrupt, the thread that was notified that an event it was waiting on occurred usually receives a priority boost from the Windows Scheduler. The result is that the thread that was executing when the interrupt occurred is often pre-empted by a higher-priority thread following interrupt processing. This can delay thread execution, but does tend to balance processor utilization across CPU- and I/O-bound threads.</p>
<p>Due to the great disparity between the speed of the devices and the speed of the processor, individual processing threads are usually not a very efficient use of the processor if allowed to execute continuously. It is important to understand that multiprogramming actually slows down individual execution threads because they are not allowed to run uninterrupted from start to finish. In other words, when the waiting thread becomes ready to execute again, it may well be delayed because some other thread is in line ahead of it. Multiprogramming represents an explicit trade-off to improve overall CPU throughput, quite possibly at the expense of the execution time of any individual thread.</p>
<p><a title="Windows 2000 Performance Guide" href="resources/66-books" target="_blank">(Source: Windows 2000 Performance Guide by Mark Friedman &amp; Odysseas Pentakalos)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/concepts/understanding-processor-processor-basics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Detecting processor bottlenecks</title>
		<link>http://www.loadrunnertnt.com/analyze/detecting-processor-bottlenecks/</link>
		<comments>http://www.loadrunnertnt.com/analyze/detecting-processor-bottlenecks/#comments</comments>
		<pubDate>Sat, 30 Aug 2008 06:53:26 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Analyze]]></category>
		<category><![CDATA[Bottleneck]]></category>
		<category><![CDATA[processor]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=332</guid>
		<description><![CDATA[This article, &#8220;Detecting processor bottlenecks&#8221;, gives a general idea of how to determine the bottleneck using two broad categories, namely (a) Processor Load and (b) Process Priorities. The material is taken from the book &#8220;Java Performance Tuning&#8221; by Jack Shirazi; it&#8217;s recommended to go through it to get a better understanding of determining resource bottlenecks [...]]]></description>
			<content:encoded><![CDATA[<p>This article, &#8220;Detecting processor bottlenecks&#8221;, gives a general idea of how to determine the <strong>bottleneck</strong> using two broad categories, namely <strong>(a) Processor Load</strong> and <strong>(b) Process Priorities</strong>. The material is taken from the book <a title="Java Performance Tuning" href="index.php?view=article&amp;catid=38%3Arecommended-resources&amp;id=66%3Abooks&amp;option=com_content&amp;Itemid=41" target="_blank">&#8220;Java Performance Tuning&#8221;</a> by Jack Shirazi; it&#8217;s recommended to go through it to get a better understanding of determining resource bottlenecks and of tuning Java technologies. The terms CPU and Processor refer to the same thing and are used interchangeably in this article.<span id="more-332"></span></p>
<p><span style="text-decoration: underline;"><strong>Processor Load</strong></span></p>
<p>Two areas of <strong>Processor Load</strong> are worth watching as primary performance points. They are the <strong>Processor Utilization</strong> (expressed as a percentage) and the <strong>Runnable Queue</strong> of processes and threads (often called the load or the task queue).</p>
<p><strong>Processor Utilization:</strong> The first indicator is simply the percentage of the CPU (or CPUs) being used by all the various threads. If this is up to 100% for significant periods of time, you may have a problem. On the other hand, if it isn’t, the CPU is under-utilized, but that is usually preferred. However, the number of processes and threads in the system can be huge, which makes it overwhelming to look at all of them. Therefore, start with known processes and tasks, such as application ones (user tasks), followed by OS tasks.</p>
<p>Some common symptoms and what they can indicate:</p>
<ul>
<li>Low CPU usage can indicate that your application may be blocked for significant periods on disk or network I/O (High I/O or poor I/O)</li>
<li>Low CPU usage can indicate that the contention is on another server in the architecture, and the application is waiting for that server to complete its task and send data back.</li>
<li>High CPU usage can indicate thrashing (lack of RAM or too many threads)</li>
<li>High CPU contention can indicate inefficient code which (indicating that you need to tune the code and reduce the number of instructions being processed to reduce the impact on the CPU).</li>
</ul>
<p>A reasonable target is <strong>75% CPU utilization</strong> (different authors give figures ranging from 75% to 85%). This means that the system is being worked toward its optimum while still leaving some slack for spikes caused by other system or application requirements. However, note that if more than 50% of the CPU is used by system processes (i.e. administrative and OS processes), your CPU is probably under-powered. You can identify this by looking at the load of the system over a period when you are not running any applications (always let the system run a normal/no-load scenario first to record a baseline).</p>
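<p>The two thresholds above can be written down as a small check. This is only a sketch: the function name <code>assess_utilization</code>, its parameters and the wording of the findings are made up for illustration, and the default target of 75% is simply the lower end of the range quoted above.</p>

```python
def assess_utilization(total_pct, system_pct, target_pct=75.0):
    """Return a list of findings for one averaged CPU measurement.

    total_pct  -- overall CPU utilization, 0-100
    system_pct -- share of the CPU used by system processes, 0-100
    target_pct -- assumed target; the text cites 75% to 85%
    """
    findings = []
    if total_pct >= 100.0:
        findings.append("saturated")
    elif total_pct > target_pct:
        findings.append("above target")
    else:
        findings.append("within target")
    # Rule of thumb from the text: system processes using more than
    # half of the CPU suggest an under-powered machine.
    if system_pct > 50.0:
        findings.append("system processes dominate")
    return findings


print(assess_utilization(total_pct=72.0, system_pct=20.0))
print(assess_utilization(total_pct=90.0, system_pct=55.0))
```

<p>Feed it averages taken over a meaningful period rather than single samples, since short spikes are expected and are the reason for leaving slack in the first place.</p>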
<p><strong>Runnable Queue:</strong> The second performance indicator is the average number of processes or threads waiting to be scheduled on the processor by the OS. They are runnable, but the processor has no time to run them and keeps them waiting for a significant amount of time. As soon as the run queue goes above zero, the system may show contention for resources. However, there are exceptions where the runnable queue is above zero and the system still performs at an acceptable level. A good way to find the acceptable runnable queue for a system is to graph the Avg. Transaction Response Time against the runnable queue statistics (on Windows, the <strong>Processor Queue Length</strong> counter) and observe whether response time degrades as the runnable queue grows. For capacity planning, a guideline proposed by Adrian Cockcroft is that performance starts to degrade if the <span style="color: #ff0000;">run queue grows bigger than four times the number of CPUs</span>.</p>
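<p>As a quick illustration of the capacity-planning guideline, here is a minimal sketch. The function name and parameters are hypothetical; only the factor of four comes from the guideline quoted above.</p>

```python
def run_queue_exceeds_guideline(run_queue, cpu_count, per_cpu_factor=4):
    """True if the run queue is larger than per_cpu_factor * cpu_count.

    run_queue      -- average number of runnable processes/threads waiting
    cpu_count      -- number of CPUs in the machine (must be at least 1)
    per_cpu_factor -- 4 is the guideline quoted in the text
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return run_queue > per_cpu_factor * cpu_count


# A 2-CPU machine: the guideline threshold is 4 * 2 = 8.
print(run_queue_exceeds_guideline(run_queue=9, cpu_count=2))  # True
print(run_queue_exceeds_guideline(run_queue=8, cpu_count=2))  # False
```

<p>Treat the result as a hint only. As noted above, the response time graph is what tells you whether a given queue length is actually hurting your system.</p>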
<p>If you can upgrade the CPU of the target environment, doubling the CPU speed is usually better than doubling the number of CPUs. And remember that parallelism in an application doesn’t necessarily need multiple CPUs. If I/O is significant, the CPU will have plenty of time for many threads.</p>
<p><span style="text-decoration: underline;"><strong>Process Priorities</strong></span></p>
<p>The OS can also prioritize processes when handing out processor time by assigning process priority levels. Process priorities provide a way to throttle high-demand processes, thus giving other processes a greater share of the processor. If other processes need to run on the same machine but it doesn’t matter if they run slowly, you can give your application processes a (much) higher priority than those other processes, thus allowing your application the lion’s share of CPU time on a congested system. This is worth keeping in mind. If your application consists of multiple processes, you should also consider giving those processes different levels of priority. Being tempted to adjust the priority levels of processes, however, is often a sign that the CPU is underpowered for the tasks you have given it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/analyze/detecting-processor-bottlenecks/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>How do we determine processor contention?</title>
		<link>http://www.loadrunnertnt.com/analyze/how-do-we-determine-processor-contention/</link>
		<comments>http://www.loadrunnertnt.com/analyze/how-do-we-determine-processor-contention/#comments</comments>
		<pubDate>Mon, 21 Apr 2008 08:24:54 +0000</pubDate>
		<dc:creator>TnT Admin</dc:creator>
				<category><![CDATA[Analyze]]></category>
		<category><![CDATA[Bottleneck]]></category>
		<category><![CDATA[processor]]></category>

		<guid isPermaLink="false">http://www.loadrunnertnt.com/?p=347</guid>
		<description><![CDATA[The simplest way is to use Processor(_Total)\% Processor Time, which measures the average processor utilization of your machine (i.e. utilization averaged over all processors, not a specified processor). You can further break down the usage by examining which instance/process is hogging the processor with Process(instance)\% Processor Time.
For example, instances/processes such as IIS and MS Exchange can be [...]]]></description>
			<content:encoded><![CDATA[<p>The simplest way is to use <strong>Processor(_Total)\% Processor Time</strong>, which measures the average <strong>processor utilization</strong> of your machine (i.e. utilization averaged over all processors, not a specified processor). You can further break down the usage by examining which instance/process is hogging the processor with <strong>Process(instance)\% Processor Time</strong>.</p>
<p>For example, instances/processes such as <strong>IIS</strong> and <strong>MS Exchange</strong> can be examined with <strong>Process(inetinfo)\% Processor Time</strong> and <strong>Process(store)\% Processor Time</strong> respectively, while <strong>WebLogic Server</strong> can be examined with <strong>Process(java)\% Processor Time</strong>. (Note: WLS may start more than one java.exe process, so make sure you pick the correct instance.) As a rule of thumb, keep Processor(_Total)\% Processor Time under <strong>85%</strong>. However, there may be spikes due to <strong>backup</strong> jobs, which you can identify from a pattern of regular periodic spikes or from known backup schedules.</p>
<p><span id="more-347"></span>If the processor is running at around <strong>70% to 80%</strong>, it is normally a good sign: your machine is handling its load effectively and is not under-utilized. Average processor utilization of around <strong>20% or 30%</strong> may suggest that your machine is under-utilized, and it may be wise to consolidate servers using <strong>VMware</strong> or to house more applications on it. <a href="http://www.windowsnetworking.com/articles_tutorials/Key-Performance-Monitor-Counters.html" target="_blank"><span style="text-decoration: underline;"><span style="color: #0066cc;">(Source: Mitch Tulloch, Windows Networking)</span></span></a></p>
<p>Going a step deeper, we can determine whether the <strong>processor</strong> is busy with <strong>kernel-related</strong> tasks or <strong>user-related</strong> tasks by using <strong>Processor(_Total)\% Privileged Time</strong> and <strong>Processor(_Total)\% User Time</strong> respectively. Kernel-related tasks are the OS tasks that keep the OS functioning, while user-related tasks are application tasks not native to the OS (e.g. WebLogic Servers as <strong>java.exe</strong>). I will discuss this in detail in future posts.</p>
<p>If kernel mode <strong>utilization</strong> is high, your machine is likely underpowered, as it is too busy handling basic OS housekeeping functions to run other applications effectively. It may be wiser to (1) reduce the number of OS tasks and services or (2) move their schedules to off-peak times. If user mode utilization is high, your server may be running too many specific roles, and you may consider (1) scaling the hardware (adding more or better processors), (2) reducing the number of user applications or (3) scaling the server architecture horizontally. <a href="http://www.windowsnetworking.com/articles_tutorials/Key-Performance-Monitor-Counters.html" target="_blank"><span style="text-decoration: underline;"><span style="color: #0066cc;">(Source: Mitch Tulloch, Windows Networking)</span></span></a></p>
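<p>The kernel/user split can be turned into a rough classifier. This is a sketch under assumptions: the function name is hypothetical, and the 50% cut-off for &#8220;high&#8221; is a placeholder because the source gives no figure. The two inputs stand for averaged readings of % Privileged Time and % User Time.</p>

```python
def classify_cpu_mode(privileged_pct, user_pct, high_pct=50.0):
    """Say which mode, if any, dominates processor time.

    privileged_pct -- averaged % Privileged Time (kernel mode), 0-100
    user_pct       -- averaged % User Time (user mode), 0-100
    high_pct       -- assumed cut-off for "high"; not from the source
    """
    if privileged_pct > high_pct and privileged_pct >= user_pct:
        return "kernel-bound"
    if user_pct > high_pct:
        return "user-bound"
    return "no dominant mode"


print(classify_cpu_mode(privileged_pct=60.0, user_pct=30.0))
print(classify_cpu_mode(privileged_pct=15.0, user_pct=70.0))
print(classify_cpu_mode(privileged_pct=20.0, user_pct=25.0))
```

<p>A &#8220;kernel-bound&#8221; result points at the OS task and service options above, while &#8220;user-bound&#8221; points at the scaling options.</p>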
<p>In addition, you can use <strong>System\Processor Queue Length</strong> to determine processor contention. Use this counter when <strong>processor utilization</strong> stays at or near <strong>100%</strong> for a period. If this counter is consistently higher than around 5 while processor utilization approaches 100%, it is a good indication that there is more work <strong>(active threads)</strong> ready for execution than the machine’s processors are able to handle.</p>
<p>Mitch points out that this is not a definitive way of determining the <strong>bottleneck</strong> for services that manage their own worker threads, such as an <strong>IIS 6</strong> pool (and similarly a JVM, which spawns threads from the <strong>java.exe</strong> process). For example, on a busy IIS server you would want to look at other counters such as <strong>ASP\Requests Queued</strong> or <strong>ASP.NET\Requests Queued</strong> as well. Also, the more active services and applications running on your server, the busier the processor queue will normally be. On such a server running near 100% utilization, contention may only be a significant factor when System\Processor Queue Length exceeds the recommended queue length of <strong>5</strong>. <a href="http://www.windowsnetworking.com/articles_tutorials/Key-Performance-Monitor-Counters.html" target="_blank"><span style="text-decoration: underline;"><span style="color: #0066cc;">(Source: Mitch Tulloch, Windows Networking)</span></span></a></p>
<p>Similarly, Microsoft TechNet suggests that the guideline should differ for <strong>multiprocessor</strong> machines. The suggestion is 3 threads in the queue per processor. That is, if there are 4 processors, the threshold for all processors together should be 12. <a href="http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fned_ana_jzel.mspx?mfr=true" target="_blank"><span style="text-decoration: underline;"><span style="color: #0066cc;">(Source: Microsoft TechNet)</span></span></a></p>
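<p>To make &#8220;consistently higher&#8221; concrete, here is a minimal sketch that applies a queue-length threshold only to the samples taken while the processor was busy. The function name, the 95% cut-off for &#8220;near 100%&#8221; and the 80% share of samples are all assumptions for illustration. The thresholds of 5 and of 3 per processor are the ones quoted above.</p>

```python
def sustained_contention(samples, queue_threshold=5,
                         busy_pct=95.0, min_fraction=0.8):
    """Check whether the queue stays above a threshold while the CPU is busy.

    samples         -- iterable of (utilization_pct, queue_length) pairs
    queue_threshold -- 5 per the rule of thumb, or 3 * number of processors
    busy_pct        -- assumed cut-off for "utilization approaches 100%"
    min_fraction    -- assumed share of busy samples that must exceed the
                       threshold to count as "consistently higher"
    """
    busy_queues = [queue for util, queue in samples if util >= busy_pct]
    if not busy_queues:
        return False  # the processor was never busy, so no contention call
    over = sum(1 for queue in busy_queues if queue > queue_threshold)
    return over / len(busy_queues) >= min_fraction


# Hypothetical samples: (% Processor Time, Processor Queue Length)
samples = [(99, 8), (100, 7), (98, 9), (97, 6), (60, 1)]

print(sustained_contention(samples))                         # True
print(sustained_contention(samples, queue_threshold=3 * 4))  # False
```

<p>The same samples count as contention against the fixed threshold of 5 but not against the 4-processor threshold of 12, which is why it matters which guideline you pick.</p>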
]]></content:encoded>
			<wfw:commentRss>http://www.loadrunnertnt.com/analyze/how-do-we-determine-processor-contention/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
