Monday, 22 February 2010

Helpful ESX file paths

Check the following file for details of logins to the Service Console (there may be several rotated secure logs with a .1, .2, etc. suffix).


Check the following file if you would like to see iSCSI session and connection information.


If you are having VMkernel issues, look in the following file at the exact time the error happened.


Friday, 19 February 2010

Helpful AD Commands

Dsquery is a command-line tool that is built into Windows Server. To use dsquery, you must run the dsquery command from an elevated command prompt.

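Use dsquery to list user accounts. The -limit 0 switch removes the default cap of 100 results; on its own this command returns every user, so add a filter if you only want active accounts.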
dsquery user -limit 0

OldCmp is a command-line Active Directory query tool. It is primarily used to find and clean up old computer accounts that haven't been used. It can also be used to clean up user accounts when the proper filter is specified.

oldcmp -report
Generates an HTML report of all computer accounts more than 90 days old.

oldcmp -report -age 0 -onlydisabled
Generates an HTML report of all disabled computer accounts.

oldcmp -disable -unsafe -forreal
Generates an HTML report of all computer accounts more than 90 days old, sorted on password age.
Will REALLY DISABLE all accounts identified.

Csvde examples for importing and exporting user accounts to and from Active Directory.

Get a list of members contained within a group
csvde.exe -f output.csv -r (objectClass=group) -l member

Get list of computer objects
csvde.exe -f output.csv -r (objectClass=Computer)
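Once the group export exists, the member list usually needs splitting before it is useful. The following is a minimal Python sketch, assuming the export has a DN column and a member column, and that csvde joins multiple member values with a semicolon. The function name and the sample distinguished names are hypothetical.

```python
import csv
import io


def group_members(csv_text):
    """Map each group DN to a list of member DNs.

    Assumes a header row containing "DN" and "member", and that
    multi-valued attributes are joined with ";" in a single cell.
    """
    members = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row.get("member") or ""
        # Drop empty fragments so a group without members yields [].
        members[row["DN"]] = [m for m in raw.split(";") if m]
    return members


if __name__ == "__main__":
    # Hypothetical sample in the shape described above.
    sample = (
        'DN,member\n'
        '"CN=Admins,DC=example,DC=com",'
        '"CN=Alice,DC=example,DC=com;CN=Bob,DC=example,DC=com"\n'
    )
    for group, people in group_members(sample).items():
        print(group, len(people))
```

Reading the real file is then a matter of passing the contents of output.csv to the function, using whatever encoding your csvde export was written in.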

Thursday, 11 February 2010

Understanding VMware Fault Tolerance

VMware Fault Tolerance (FT) provides continuous availability for applications by creating a live shadow instance of a virtual machine that is in virtual lock-step with the primary instance. By allowing instantaneous failover between the two instances in the event of hardware failure, VMware Fault Tolerance eliminates even the smallest chance of data loss or disruption.

VMware Fault Tolerance (FT) works by creating an identical copy of a virtual machine. One copy of the virtual machine, called the primary, is in an active state, receiving requests, serving information and running applications. Another copy, called the secondary, receives the same input that is received by the primary.

In an FT environment, one virtual machine runs as a primary and FT runs a secondary virtual machine on a different ESX host. The secondary virtual machine shares the primary's virtual disks. The virtual machines are kept in lock-step via logging information sent over a private network connection. The primary is the sender of this logging information and the secondary only listens. FT is based on VMware Record/Replay technology.

If the primary's ESX host fails, the secondary virtual machine takes over without interrupting applications.
VMware FT provides more continuity than VMware HA because FT does not require a virtual machine restart: the secondary virtual machine comes online immediately with all, or almost all, state information preserved.
Virtual machines protected by FT are not handled by VMware HA restart priority; HA treats them as disabled for restart priority.

Determining Node Failure
VMware FT uses network heartbeats to determine when the primary or backup host is down. The backup goes live and becomes the new primary if it declares the current primary dead.
FT uses an atomic operation on the shared VMFS to distinguish a failed host from a network failure.
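The idea of combining a heartbeat timeout with an atomic tie-breaker on shared storage can be illustrated with a small Python sketch. This is not how FT is implemented; it only shows why the atomic step prevents both sides from going live after a network split. The timeout value, the function names, and the use of an exclusive file create as the atomic operation are all assumptions made for illustration.

```python
import os
import tempfile

HEARTBEAT_TIMEOUT = 2.0  # seconds; hypothetical value


def peer_looks_dead(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """A missed heartbeat alone cannot tell a dead host from a dead link."""
    return now - last_heartbeat > timeout


def try_claim(lock_path):
    """Stand-in for the atomic operation on shared storage.

    O_CREAT | O_EXCL fails if the file already exists, so only one
    caller can ever win the claim.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def decide(last_heartbeat, now, lock_path):
    """Return what this side should do after checking its peer."""
    if not peer_looks_dead(last_heartbeat, now):
        return "keep-running"
    # Peer is silent: whoever wins the atomic claim goes live.
    return "go-live" if try_claim(lock_path) else "stand-down"


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "ft.lock")
    print(decide(last_heartbeat=0.0, now=5.0, lock_path=lock))
```

If both sides lose the heartbeat at the same moment because of a network fault, both call decide, but only the first claim succeeds, so at most one instance runs as primary.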

VMware FT Migration Transition States
VMware FT tracks failover operations with a variety of states. For more information on transition states, see the VMware KB article "VMware Fault Tolerance migration transition states" (1010634).

Wednesday, 10 February 2010

How to Disable Client-Side DNS Caching in Windows XP and Windows Server 2003

Windows contains a client-side Domain Name System (DNS) cache. The client-side DNS caching feature may generate a false impression that DNS "round robin" is not occurring from the DNS server to the Windows client computer. When you use the ping command to search for the same A-record domain name, the client may use the same IP address. This behavior is different from Microsoft operating systems earlier than Windows 2000. These operating systems do not include the client-side DNS caching feature. This article describes how to disable DNS caching.

To stop DNS caching, run either of the following commands:
  • net stop dnscache
  • sc \\servername stop dnscache (where servername is the name of the target computer)

To disable the DNS cache permanently in Windows, use the Service Controller tool or the Services tool to set the DNS Client service startup type to Disabled. Note that the name of the Windows DNS Client service may also appear as "Dnscache."

Note: The overall performance of the client computer decreases and the network traffic for DNS queries increases if the DNS resolver cache is deactivated.

The DNS Client service optimizes the performance of DNS name resolution by storing previously resolved names in memory. If the DNS Client service is turned off, the computer can still resolve DNS names by using the network's DNS servers.

When the Windows resolver receives a positive or negative response to a query, it adds that positive or negative response to its cache, and as a result, creates a DNS resource record. The resolver always checks the cache before querying any DNS server. If a DNS resource record is in the cache, the resolver uses the record from the cache instead of querying a server. This behavior expedites queries and decreases network traffic for DNS queries.

You can use the Ipconfig tool to view and to flush the DNS resolver cache. To view the DNS resolver cache, type ipconfig /displaydns at a command prompt. Ipconfig displays the contents of the DNS resolver cache, including the DNS resource records that are preloaded from the Hosts file and any recently queried names that were resolved by the system. After a certain time period, the resolver discards the record from the cache. The time period is specified in the Time to Live (TTL) associated with the DNS resource record. You can also flush the cache manually. After you flush the cache, the computer must query DNS servers again for any DNS resource records previously resolved by the computer. To delete the entries in the DNS resolver cache, type ipconfig /flushdns at a command prompt.
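The caching behaviour described above (positive and negative answers are stored, each expires after its TTL, and a flush empties everything) can be sketched in a few lines of Python. This is only an illustration of the concept, not the Windows resolver; the class name, method names and sample addresses are hypothetical.

```python
import time


class ResolverCache:
    """Toy resolver cache with per-record TTL and negative caching."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (answer or None, expiry time)

    def put(self, name, answer, ttl):
        """Store a positive answer, or None for a negative response."""
        self._entries[name] = (answer, self._clock() + ttl)

    def get(self, name):
        """Return (hit, answer); a hit with answer None is a cached negative."""
        entry = self._entries.get(name)
        if entry is None:
            return (False, None)
        answer, expires = entry
        if self._clock() >= expires:
            # TTL elapsed: discard the record, the caller must query DNS again.
            del self._entries[name]
            return (False, None)
        return (True, answer)

    def flush(self):
        """Rough equivalent of ipconfig /flushdns for this toy cache."""
        self._entries.clear()


if __name__ == "__main__":
    cache = ResolverCache()
    cache.put("www.example.com", "192.0.2.10", ttl=30)
    print(cache.get("www.example.com"))
```

This also shows why round robin can appear broken on the client: until the TTL expires, every lookup of the same name is answered from the cache with the same address.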

Friday, 5 February 2010

VMotion a VM with an Internal Network

I've always been under the impression that you could never VMotion a VM that has an active connection to an internal switch (with no link to any pNIC), but you actually can. To do it, you have to add some particular lines to the vpxd.cfg file on the VirtualCenter Server itself. Open the following file:


C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\vpxd.cfg

and add the following lines to the bottom of the cfg file.

< migrate >
< test >
< compatiblenetworks >
< vmonvirtualintranet >false< /vmonvirtualintranet >
< /compatiblenetworks >
< /test >
< /migrate >

You have to create this section at the end of the vpxd.cfg file, but still inside the file's enclosing root section, that is, before its final closing tag.

Note: This is not supported by VMware, but it is very handy. You also need to remove the spaces inside the < >!

Thursday, 4 February 2010


Esxtop allows monitoring and collection of data for all system resources: CPU, memory, disk and network.
Understanding all the information in esxtop can seem like quite a lot to take in at first, but once you use esxtop and understand the information you won't stop using it.

The following keys are the ones I use the most.
Open a console session or SSH to the ESX(i) host and type:

By default the screen is refreshed every 5 seconds; change this by typing:
s 2

Changing views is easy; type the following keys for the associated views:
c = cpu 
m = memory 
n = network 
i = interrupts 
d = disk adapter 
u = disk device 
v = disk VM

To add/remove fields:

Changing the order:

Saving all the settings you’ve changed:

To capture the information and export it to a CSV file, use the following command:
esxtop -b -d 2 -n 100 > esxtopcapture.csv

Where “-b” stands for batch mode, “-d 2” is a delay of 2 seconds and “-n 100” is 100 iterations. In this specific case esxtop will log all metrics for 200 seconds (100 samples, 2 seconds apart).
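A batch capture gets wide very quickly, so it helps to pull out just the counters you care about. Below is a minimal Python sketch for doing that. It assumes the capture is a plain CSV with one header row, a timestamp in the first column and numeric values in the counter columns; the header text in the demo (host name, VM name, counter name) is hypothetical, so adjust the search string to whatever your own capture contains.

```python
import csv
import io


def columns_matching(csv_text, needle):
    """Return {column header: [values as float]} for headers containing needle."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if needle in name]
    series = {header[i]: [] for i in wanted}
    for row in reader:
        for i in wanted:
            series[header[i]].append(float(row[i]))
    return series


if __name__ == "__main__":
    # Tiny hypothetical capture: one timestamp column, two counters.
    sample = "\n".join([
        r'"time","\\esx01\Group Cpu(1:vm1)\% Ready","\\esx01\Group Cpu(1:vm1)\% Used"',
        '"t0","1.5","20"',
        '"t1","12.0","35"',
    ])
    for name, values in columns_matching(sample, "% Ready").items():
        print(name, max(values))
```

For a real capture, read esxtopcapture.csv into a string and pass it in; with -d 2 -n 100 each returned list should hold 100 values.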


Here are a few of the metric thresholds that I use:

Display  Metric        Threshold  Explanation
CPU      %RDY          10         Overprovisioning of vCPUs, excessive usage of vSMP or a limit (check %MLMTD) has been set. See Jason’s explanation for vSMP VMs.
CPU      %CSTP         100        Excessive usage of vSMP. Decrease the number of vCPUs for this particular VM. This should lead to increased scheduling opportunities.
CPU      %MLMTD        0          If larger than 0, the world is being throttled. Possible cause: limit on CPU.
CPU      %SWPWT        1          VM waiting on swapped pages to be read from disk. Possible cause: memory overcommitment.
CPU      TIMER/S (H)   1000       High timer-interrupt rate. It may be possible to reduce this rate and thus reduce overhead. The amount of overhead increases with the number of vCPUs assigned to a VM.
MEM      MCTLSZ (I)    1          If larger than 0, the host is forcing VMs to inflate the balloon driver to reclaim memory because the host is overcommitted.
MEM      SWCUR (J)     1          If larger than 0, the host has swapped memory pages in the past. Possible cause: overcommitment.
MEM      SWR/s (J)     1          If larger than 0, the host is actively reading from swap (vswp). Possible cause: excessive memory overcommitment.
MEM      SWW/s (J)     1          If larger than 0, the host is actively writing to swap (vswp). Possible cause: excessive memory overcommitment.
MEM      N%L (F)       80         If less than 80, the VM experiences poor NUMA locality. If a VM has a memory size greater than the amount of memory local to each processor, the ESX scheduler does not attempt to use NUMA optimizations for that VM and “remotely” uses memory via the “interconnect”.
NETWORK  %DRPTX        1          Dropped packets transmitted, hardware overworked. Possible cause: very high network utilization.
NETWORK  %DRPRX        1          Dropped packets received, hardware overworked. Possible cause: very high network utilization.
DISK     GAVG (H)      25         Look at “DAVG” and “KAVG”, as the sum of both is GAVG.
DISK     DAVG (H)      25         Disk latency most likely caused by the array.
DISK     KAVG (H)      5          Disk latency caused by the VMkernel; high KAVG usually means queuing. Check “QUED”.
DISK     QUED (F)      1          Queue maxed out. Possibly the queue depth is set too low. Check with the array vendor for the optimal queue depth value.
DISK     ABRTS/s (K)   1          Aborts issued by the guest (VM) because storage is not responding. For Windows VMs this happens after 60 seconds by default. Can be caused, for instance, when paths have failed or the array is not accepting any I/O for whatever reason.
DISK     RESETS/s (K)  1          The number of commands reset per second.

For a more detailed look at esxtop, read the following VMware article.
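To apply thresholds like these to collected numbers, something along the lines of the Python sketch below can work. It covers only a handful of the metrics, and the comparison rules are my reading of the list (flag a value at or above the threshold, strictly above 0 for %MLMTD, and below 80 for N%L), so treat them as assumptions and tune them to your environment. The function name and the sample values are hypothetical.

```python
import operator

# metric name -> (comparison that means "worth a look", threshold)
THRESHOLDS = {
    "%RDY": (operator.ge, 10),
    "%MLMTD": (operator.gt, 0),
    "N%L": (operator.lt, 80),   # lower is worse for NUMA locality
    "GAVG": (operator.ge, 25),
    "DAVG": (operator.ge, 25),
    "KAVG": (operator.ge, 5),
}


def flag_metrics(sample):
    """Return the sorted names of metrics in sample that cross a threshold.

    Metrics without a known threshold are ignored.
    """
    flagged = []
    for name, value in sample.items():
        rule = THRESHOLDS.get(name)
        if rule is None:
            continue
        compare, limit = rule
        if compare(value, limit):
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical readings for one VM.
    print(flag_metrics({"%RDY": 12.5, "N%L": 95, "KAVG": 1.2, "DAVG": 30}))
```

Keeping the rules in one dictionary makes it easy to add the remaining metrics or change a limit without touching the checking logic.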