
RESXTOP Counters for Storage June 20, 2012

Posted by vbry21 in RESXTOP.

Here is the final RESXTOP post; this one is about storage.

Disk throughput can be monitored using RESXTOP.

To view the disk adapter screen, press the d key in RESXTOP.

READS/s – Number of disk reads per second

WRITES/s – Number of disk writes per second

The sum of reads/second and writes/second equals I/O operations/second (IOPS). IOPS is a common benchmark for storage subsystems and can be measured with tools like Iometer.
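As a quick illustration of that arithmetic, here is a minimal Python sketch. The function name and the sample figures are made up for this example; they are not real RESXTOP output.

```python
def iops(reads_per_sec: float, writes_per_sec: float) -> float:
    """Return total I/O operations per second (READS/s + WRITES/s)."""
    return reads_per_sec + writes_per_sec


# Hypothetical values read off the disk adapter screen.
print(iops(350.0, 150.0))  # 500.0
```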

Disk throughput can also be monitored using the following metrics.

MBREAD/s – Number of megabytes read per second

MBWRTN/s – Number of megabytes written per second

The disk adapter screen (type d in the window) also lets us monitor disk latency. For me, latency is the more interesting measure, because high disk throughput on its own does not necessarily mean we have an issue.

The following counters are of interest:

ADAPTR – The name of the host bus adapter (vmhba#), which includes SCSI, iSCSI, RAID and Fibre Channel adapters.

DAVG/cmd – The average time, in milliseconds, that a device (which includes the HBA, the storage array, and everything in between) takes to service a single I/O request (read or write). If the value is below 10, the system is healthy. If the value is between 10 and 20 (inclusive), keep an eye on the situation by monitoring the value more frequently. If the value is above 20, this most likely indicates a problem.

KAVG/cmd – The average time, in milliseconds, that the VMkernel takes to service a disk operation. This number represents time spent by the CPU to manage I/O. Because processors are much faster than disks, this value should be close to zero. A value of 1 or 2 is considered high for this metric.

GAVG/cmd – The total latency seen from the virtual machine when performing an I/O request. GAVG is the sum of DAVG and KAVG.
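The latency rules of thumb above can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, values are assumed to be in milliseconds, and the 10 to 20 range (inclusive) is treated as the "watch" band.

```python
def gavg(davg_ms: float, kavg_ms: float) -> float:
    """Guest latency is device latency plus kernel latency."""
    return davg_ms + kavg_ms


def classify_davg(davg_ms: float) -> str:
    """Apply the rough DAVG/cmd bands described in this post."""
    if davg_ms < 10:
        return "healthy"
    if davg_ms <= 20:
        return "watch"
    return "problem"


# Hypothetical readings for one adapter.
print(gavg(12.0, 0.5))       # 12.5
print(classify_davg(12.0))   # watch
```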

 

RESXTOP counters for Networking June 19, 2012

Posted by vbry21 in RESXTOP.

The following is a list of the most interesting statistics that can be gathered within RESXTOP.

To display network statistics in RESXTOP, type n in the window.

Configuration information about the objects is listed first, followed by the performance metrics.

The USED-BY column identifies the network connections by:

Physical adapter – An example is vmnic0.

vSphere network object – An example is a VMkernel port, such as vmk0.

MbRX/s – Megabits received per second

PKTTX/s – Average number of packets transmitted per second in the sampling interval

PKTRX/s – Average number of packets received per second in the sampling interval

%DRPTX – Percentage of outbound packets dropped in the sampling interval

%DRPRX – Percentage of inbound packets dropped in the sampling interval
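One simple way to use the drop counters is to list every port that reports any dropped packets. The sketch below assumes the rows have already been copied into Python dictionaries by hand; the function name, the sample values and the default threshold of zero are all assumptions made for illustration.

```python
def ports_with_drops(rows, threshold_pct=0.0):
    """Return USED-BY names whose %DRPTX or %DRPRX exceeds the threshold."""
    return [
        row["USED-BY"]
        for row in rows
        if row["%DRPTX"] > threshold_pct or row["%DRPRX"] > threshold_pct
    ]


# Hypothetical rows copied from the network screen.
sample = [
    {"USED-BY": "vmnic0", "%DRPTX": 0.0, "%DRPRX": 0.0},
    {"USED-BY": "vmk0", "%DRPTX": 0.0, "%DRPRX": 1.5},
]
print(ports_with_drops(sample))  # ['vmk0']
```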

RESXTOP memory counters for the Ballooning mechanism June 18, 2012

Posted by vbry21 in RESXTOP.

Balloon Driver Counters in RESXTOP

In a previous post I mentioned that the balloon driver isn’t necessarily a bad thing. The reasoning behind this is as follows.

Ballooning is part of normal operations when your host memory becomes overcommitted. Think of your normal physical servers: if you have a server with, let’s say, 8 GB of RAM, would you expect that server to be constantly using all 8 GB? Hopefully the answer is NO!!!

The fact that ballooning is occurring does not necessarily indicate a performance problem. If we also get swapping, then we do have a problem.

What the balloon driver does is allow the guest VM to give up physical memory pages that are not being used. To enable this, all we have to do is install VMware Tools.

On to the counters that are useful to analyse. Even though ballooning is not a bad thing, these counters can give us an indication that we are approaching memory saturation.

To access the counters, press m in RESXTOP to switch to the memory screen.

MEMCTL/MB – This line will display the memory balloon statistics for the entire host. All numbers are in megabytes.

The ‘curr’ is the total amount of physical memory reclaimed using the ballooning mechanism.

The ‘target’ is the total amount of physical memory ESXi wants to reclaim with ballooning.

The ‘max’ is the maximum amount of physical memory that ESXi can reclaim with ballooning.

MCTL? – Reported per VM. The value is Y if the balloon driver is installed and N if it is not.

MCTLSZ – This value is also reported per VM and represents the amount of physical memory the balloon driver is holding for use by other VMs.

MCTLTGT – This value is also reported per VM and represents the amount of physical memory that the host wants to reclaim from the VM.

MCTLMAX – This value is also reported per VM and represents the maximum amount of physical memory that can be reclaimed from the VM by ballooning.
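To show how the per-VM balloon counters relate to each other, here is a small Python sketch. The function name and the status labels are invented for this example, and the idea that a target above the current size means the balloon is still inflating is my reading of the counters rather than something RESXTOP prints.

```python
def balloon_status(mctl_installed: bool, mctlsz_mb: float, mctltgt_mb: float) -> str:
    """Summarise a VM's balloon state from MCTL?, MCTLSZ and MCTLTGT."""
    if not mctl_installed:
        return "no balloon driver"
    if mctlsz_mb == 0 and mctltgt_mb == 0:
        return "idle"
    if mctltgt_mb > mctlsz_mb:
        return "inflating"   # host wants more than the balloon holds
    if mctltgt_mb < mctlsz_mb:
        return "deflating"   # host is handing memory back
    return "holding"


# Hypothetical VM: 256 MB ballooned, host wants 512 MB.
print(balloon_status(True, 256, 512))  # inflating
```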

RESXTOP Memory Counters for Host Memory Swapping June 17, 2012

Posted by vbry21 in RESXTOP.

I mentioned in an earlier article that host swapping is a bad thing in a VMware ESXi environment. The reason is that if an ESXi host runs low on physical RAM, VM memory is swapped to disk, and this host-level swapping severely affects the performance of the VMs being swapped.

Based on the above statement, we should monitor for swapping.

We could check the advanced performance charts in the vSphere Client: highlight the ESXi host, click the Performance tab and select Memory from the drop-down. We are interested in two counters.

Memory Swap In Rate – The rate at which memory is swapped in from disk.

Memory Swap Out Rate – The rate at which memory is swapped out to disk.

We can also use RESXTOP from the vMA virtual appliance; the most interesting counters are listed below.

From RESXTOP, press m to access the memory counters.

SWR/s – The rate, in megabytes per second, at which the ESXi host is swapping memory in from disk.

SWW/s – The rate, in megabytes per second, at which the ESXi host is swapping memory out to disk.

SWCUR – This is the amount of swap space currently used by the virtual machine.

SWTGT – This is the amount of swap space that the host expects the virtual machine to use.

As an additional bit of info, we can also get a useful statistic from the CPU screen (press c within RESXTOP).

%SWPWT – This gives us an indicator of a performance issue due to wait time experienced by the VM. It represents the percentage of time that the VM is waiting for memory to be swapped in.
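Putting the swap counters together, a rough check might look like the Python sketch below. The function name and the wording of the findings are made up for illustration, and treating any non-zero value as worth reporting is an assumption, not a VMware threshold.

```python
def swap_findings(swr_mb_s: float, sww_mb_s: float, swcur_mb: float, swpwt_pct: float):
    """Return a list of plain-English findings for one VM's swap counters."""
    findings = []
    if swr_mb_s > 0 or sww_mb_s > 0:
        findings.append("host is actively swapping")
    elif swcur_mb > 0:
        findings.append("swap space in use, but no swap activity right now")
    if swpwt_pct > 0:
        findings.append("VM is waiting on swap-in")
    return findings


# Hypothetical readings: SWR/s, SWW/s, SWCUR, %SWPWT.
print(swap_findings(1.2, 0.0, 64.0, 3.0))
```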

 

RESXTOP Memory Counters for the Host June 14, 2012

Posted by vbry21 in RESXTOP.

This article explains the common counters on the RESXTOP Memory screen (Part 1) Host Memory

Ah, back to RESXTOP

When we first launch RESXTOP from the command line we are automatically placed into CPU view. If you hit the m (lowercase) key, you’re taken into memory view.

Below is a description of some metrics that may be of interest.

PMEM/MB – This lists the total amount of physical memory available on your host.

VMKMEM/MB – This is the amount of physical memory currently in use by the VMkernel.

PSHARE – This shows the amount of memory saved by transparent page sharing.

Memory State – This value can be high, soft, hard or low. This basically tells us whether the VMkernel has enough memory for performing its critical operations.

If Memory State is:

High, the VMkernel has sufficient memory to perform its critical tasks.

Soft, Hard or Low, then the VMkernel is having trouble getting the memory it needs.

Therefore we really love High; anything else means we have a host that is overcommitted for memory.

Generally:

High = No memory reclamation technique

Soft = Ballooning

Hard = Swapping and Ballooning

Low = Swapping (Bad)
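The list above can be written down as a small lookup table. This Python sketch simply restates the general mapping from this post; the names are invented for the example, and it is not a statement of exactly how every ESXi version behaves.

```python
# Memory state -> reclamation technique, as listed in this post.
RECLAMATION_BY_STATE = {
    "high": "no memory reclamation",
    "soft": "ballooning",
    "hard": "swapping and ballooning",
    "low": "swapping",
}


def describe_state(state: str) -> str:
    """Describe a Memory State value; anything but 'high' is treated as overcommitted."""
    key = state.strip().lower()
    if key not in RECLAMATION_BY_STATE:
        raise ValueError(f"unknown memory state: {state!r}")
    suffix = "" if key == "high" else " (host is overcommitted for memory)"
    return RECLAMATION_BY_STATE[key] + suffix


print(describe_state("Soft"))  # ballooning (host is overcommitted for memory)
```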

 

 

 

 

RESXTOP CPU Counters June 1, 2012

Posted by vbry21 in RESXTOP.

Ah, back to RESXTOP

When we first launch RESXTOP from the command line we are automatically placed into CPU view, and along the top of the screen we get loads of headers. But what do they mean?

Below is a description of the most common counters that we will encounter.

PCPU USED(%) – CPU utilisation per physical CPU (logical CPUs).

%USED – CPU utilisation. This counter displays the percentage of physical CPU core cycles used by a group of worlds (resource pools, running VMs, or other worlds). This value also includes %SYS.

%SYS – This is the percentage of time spent in the ESXi VMkernel on behalf of the world/resource pool to process interrupts and to perform other system activities.

%RDY – Percentage of time the group was ready to run, but was not provided CPU resources on the host on which to execute.

%WAIT – Percentage of time the group spent in the blocked or busy wait state. This includes the percentage of time the group was idle.

%CSTP – Percentage of time the vCPUs of a virtual machine spent in the co-stopped state, waiting to be co-started. This gives an indication of the co-scheduling overhead incurred by the VM. If this value is low, then any performance problems should be attributed to other issues and not to co-scheduling of the VM’s vCPUs.

%MLMTD – Percentage of time the VMkernel did not run the resource pool/world because that would violate the resource pool/world’s limit setting.

NWLD – Number of worlds associated with a given group. Each world can consume 100% of a physical CPU, which is why you might see some unexpanded groups with %USED above 100%.

I got the above from the resource management guide from VMware. Later articles will discuss the memory, networking, storage and VM counters.
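To make the NWLD note concrete: if each world can use up to 100% of a physical CPU, an unexpanded group's %USED has a rough ceiling of NWLD times 100. The helper below is a made-up illustration of that arithmetic, not something RESXTOP reports.

```python
def max_group_used_pct(nwld: int) -> int:
    """Rough ceiling for a group's %USED, assuming 100% per world."""
    return nwld * 100


# Hypothetical group with 5 worlds.
print(max_group_used_pct(5))  # 500
```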

RESXTOP Navigation May 28, 2012

Posted by vbry21 in RESXTOP.

When we’re using RESXTOP in interactive mode, we can switch between views and change the screen behaviour using various keys.

COMMANDS ARE CASE-SENSITIVE

From the vMA, use fastpass authentication to target your host and then run RESXTOP.

The keys that change the screen view are listed below.

c – Switch to the CPU resource utilisation screen (default)

m – Switch to the memory resource utilisation screen

d – Switch to the storage (disk) adapter resource utilisation screen

u – Switch to the storage (disk) device resource utilisation screen

v – Switch to the virtual disk resource utilisation screen

n – Switch to the network resource utilisation screen

V – Display only virtual machines in the screen

h – Display the help screen

f – Change the columns

q – quit
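Because the keys are case-sensitive, it can help to see the screen-switching ones as a lookup table. The Python sketch below is just a memory aid built from the list above; the dictionary and function names are invented for the example.

```python
# Keys that switch RESXTOP screens, taken from the list in this post.
SCREEN_KEYS = {
    "c": "CPU",
    "m": "memory",
    "d": "disk adapter",
    "u": "disk device",
    "v": "virtual disk",
    "n": "network",
}


def screen_for(key: str):
    """Return the screen a key switches to, or None; no case folding is done."""
    return SCREEN_KEYS.get(key)


print(screen_for("v"))  # virtual disk
print(screen_for("V"))  # None, because V filters to VMs rather than switching screen
```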

RESXTOP displays information based on worlds. Think of these as processes. A world can represent a VM or a VMkernel component.

On the RESXTOP screen we will see ID (World ID), GID (Resource Pool ID) and NAME (Name of the running world).

In later articles we’ll look at the four main categories: CPU, memory, networking and storage.

RESXTOP May 27, 2012

Posted by vbry21 in RESXTOP.

Managing real-time performance using RESXTOP

I was sitting on the Train The Trainer (TTT) for the new Optimize and Scale course last week and we were looking at performance monitoring tools. In the Install, Configure and Manage course we look at the performance charts that are built into the vSphere Client.

On the Optimize and Scale course we look at the vMA (vSphere Management Assistant), and for performance we examine results from the RESXTOP utility.

This utility provides real-time, command-line monitoring and collection of data for the main system resources: CPU, memory, networking and storage.

The beauty of this utility is that it lets us, for example, identify a CPU issue on the host and then, via various options, identify the problem virtual machine.

We can use RESXTOP in one of three modes.

Interactive Mode: All statistics are displayed as they are collected, showing how the system is used in real-time.

Batch Mode: The statistics are collected and output to a saved file for later analysis.

Replay Mode: Data previously collected by the vm-support command is interpreted and played back as RESXTOP statistics.
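As an example of what "later analysis" of batch output could look like: batch mode is typically run with something like resxtop -b -d 5 -n 12 > stats.csv (batch mode, 5-second delay, 12 samples), which produces a CSV file. The Python sketch below averages every column whose header contains a given text fragment. The function name and the tiny sample data are made up, and the sample headers are simplified; real batch files use much longer column names.

```python
import csv
import io


def column_averages(csv_text: str, name_fragment: str) -> dict:
    """Average every column whose header contains name_fragment."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name_fragment in name]
    totals = {i: 0.0 for i in wanted}
    count = 0
    for row in reader:
        for i in wanted:
            totals[i] += float(row[i])
        count += 1
    if count == 0:
        return {}
    return {header[i]: totals[i] / count for i in wanted}


# Simplified, hypothetical batch output with two samples.
sample_csv = (
    '"Time","vmhba0 DAVG/cmd","vmhba0 KAVG/cmd"\n'
    '"10:00:00","4.0","0.25"\n'
    '"10:00:05","6.0","0.75"\n'
)
print(column_averages(sample_csv, "DAVG"))  # {'vmhba0 DAVG/cmd': 5.0}
```

For a real file, the same function could be fed the text read from disk, for example open("stats.csv").read(), where stats.csv is whatever name you redirected the output to.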

In later articles we’ll look at the four main resources and the data we can view and collect.