Wednesday, December 23, 2009

5 Simple Steps to Resolve Memory Ballooning (Part 3 of 4)

The secret to running a virtual environment is to actively manage the limited resources available so that all the virtual machines get the resources they need. One key to managing this is to understand memory ballooning and swapping.
  • Memory Swapped: The amount of memory reclaimed by swapping (this is host swapping, not swapping inside the guest operating system).
  • Memory Ballooned: The amount of memory that is reclaimed by ballooning.
Ballooning and swapping are both used by the VMkernel. The memory balloon driver (vmmemctl) works with the virtual machine to reclaim memory that the guest operating system considers less important. VMware had a lot of foresight when designing this. The balloon driver relies on the guest OS to decide which memory to give up, as opposed to trying to work that out for all the various operating systems that can be virtualized. The memory the guest gives up is then reclaimed by the VMkernel and made available to other guests that may need it. If ballooning cannot reclaim enough, the host falls back to swapping.

There will be little or no ballooning or swapping in an environment that is well run and has enough memory for all the virtual machines. The balloon driver only goes into action if the host does not have enough memory for its virtual machines, or if a virtual machine has a limit set and is not getting enough memory. In this case you will see a large memory consumed percentage, followed by ballooning and swapping. I have also seen high ballooning and swapping when a VM has a limit set and the guest tries to use more memory than is available to it. Ballooning only happens if VMware Tools is installed, because vmmemctl is installed with VMware Tools and works inside the guest OS to determine which memory to give up.
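As a rough illustration of how the two counters can be read together, here is a minimal Python sketch that classifies each VM by which reclamation technique is active. The `VmMemorySample` class, the VM names, and the sample values are all made up for the example; in practice the numbers would come from whatever tool you use to export the Memory Ballooned and Memory Swapped counters.

```python
from dataclasses import dataclass


@dataclass
class VmMemorySample:
    """One reading of the memory counters for a VM (hypothetical structure)."""
    name: str
    ballooned_mb: float  # memory reclaimed by the balloon driver
    swapped_mb: float    # memory reclaimed by host swapping


def memory_pressure(sample: VmMemorySample) -> str:
    """Classify a VM by which reclamation technique is active.

    Swapping is checked first because it is the more serious symptom.
    """
    if sample.swapped_mb > 0:
        return "swapping"
    if sample.ballooned_mb > 0:
        return "ballooning"
    return "healthy"


# Illustrative sample data, not real measurements.
samples = [
    VmMemorySample("web01", ballooned_mb=0, swapped_mb=0),
    VmMemorySample("db01", ballooned_mb=512, swapped_mb=0),
    VmMemorySample("app01", ballooned_mb=768, swapped_mb=256),
]

report = {s.name: memory_pressure(s) for s in samples}
for name, status in report.items():
    print(f"{name}: {status}")
```

Any VM that shows up as "ballooning" or "swapping" is a candidate for the steps described in this post.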

The solutions to resolving memory ballooning and swapping are usually straightforward.

  1. Remove the limit, if there is one and enough memory is available on the host or resource pool to satisfy the allocated memory.
  2. Add more memory to the host or resource pool if no limit has been set.
  3. Move other virtual machines to free up memory on the host or cluster.
  4. Analyze how much memory has been allocated to each guest and compare this to how much memory is used over a week or 10-day period. This will show you any virtual machines that have been over-allocated memory. You can resize them to the right amount of memory and use the memory you have freed up for other virtual machines.
  5. Power off virtual machines that are not needed.
Watching and monitoring ballooning and swapping will help you solve the first resource bottleneck that most environments run into. This will allow you to safely run more virtual machines on the same hosts and storage.
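Step 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 25% headroom, the 256 MB rounding step, the function name `suggest_memory_mb`, and the VM figures are all hypothetical, and the peak usage would come from your own week or 10 days of monitoring data.

```python
import math


def suggest_memory_mb(allocated_mb, peak_used_mb, headroom=0.25, step_mb=256):
    """Suggest a new allocation for one VM.

    Takes the observed peak usage, adds headroom, rounds up to a multiple
    of step_mb, and never suggests more than the current allocation.
    The headroom and step values are assumptions, not VMware guidance.
    """
    target = peak_used_mb * (1 + headroom)
    rounded = math.ceil(target / step_mb) * step_mb
    return min(allocated_mb, rounded)


# Illustrative figures: name -> (allocated MB, peak used MB over the period).
vms = {
    "web01": (4096, 1000),
    "db01": (8192, 7900),
    "file01": (2048, 600),
}

total_freed_mb = 0
for name, (allocated, peak) in vms.items():
    suggested = suggest_memory_mb(allocated, peak)
    total_freed_mb += allocated - suggested
    print(f"{name}: allocated {allocated} MB, suggested {suggested} MB")

print(f"Memory that could be freed: {total_freed_mb} MB")
```

A VM whose peak usage is already close to its allocation, like db01 here, is left alone; only the clearly over-allocated guests are resized.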

Check back on this blog or follow us on Twitter for the next post in this series!

Part 1 of this series is: Managing Memory in VMware (Part 1 of 4)

Part 2 of this series is: Managing Memory in VMware (Part 2 of 4)

Tuesday, December 22, 2009

Capacity Planning is Dead - Capacity Optimization is in

When most people think about capacity planning, they think of a once-a-quarter activity that tries to predict the hardware resources needed to run new and existing business applications. This practice was conceived back in the mainframe days, when capacity planning was done by a few people with Capacity Planning in their job titles. Capacity Planners practiced this black art. These people constructed elaborate models and spoke in language most people did not understand. They belonged to groups like the Computer Measurement Group (CMG).

Fast forward to 2009. Server virtualization changed the relationship between hardware and software and in the process has turned System Administrators into Capacity Planners for their shop. The planning for hardware capacity has changed as well. Unlike a mainframe data center made up of a few very large machines, a modern virtualized server room is made up of many more machines where VMs move from host to host, often in an automated manner, making capacity planning a more real-time exercise. Furthermore, unlike the mainframe, where application workloads were predictable and static, in a virtualized data center VMs are powered on unpredictably. Capacity needs for applications can change very quickly and become problematic, causing performance bottlenecks. Just think what VDI will do in terms of rapid VM provisioning when hundreds of new VMs are launched at 9am as users come in for the morning.

We have entered a new age of capacity management in which:
  • Systems Admins are Capacity Planners
  • Capacity planning must be ongoing and tied to the rate of change in your data center
  • Capacity availability must be monitored in close to real time
Once-a-quarter Capacity Planning is no longer adequate. Progressive organizations have realized that Capacity Planning has evolved into constant Capacity Optimization!

I welcome your comments.

Managing Memory in VMware (Part 2 of 4)

In the last blog I talked about the different memory metrics available from VMware to measure memory. Today I will go into more detail on how to use these metrics to determine when you have a performance issue.

The two metrics that I will be talking about today are memory consumed and memory granted. Both of these counters measure how much of the host memory is being used by the virtual machine. The difference between them is that memory consumed takes shared memory pages into account, whereas memory granted does not. I will talk about TPS (Transparent Page Sharing) in my blog tomorrow.
  • Memory Consumed: Actual consumption of physical memory that has been allocated to the virtual machine; that is, the amount of machine memory allocated to the virtual machine, accounting for savings from shared memory.
  • Memory Granted: The amount of guest physical memory that is mapped to machine memory.
Memory consumed is a great metric to look at to determine if there are performance issues, because it highlights cases where a guest is not using much memory or where a virtual machine is not getting the memory that it needs. For example, any virtual machine that has a memory consumed value over 100% has a limit set and is not getting enough memory, so the VMkernel starts to balloon and swap its memory. Memory consumed is also an excellent metric to use when computing how many more virtual machines you can fit in your environment, since it measures how much of the host's memory is being used by virtual machines and is not available to new VMs.
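To make the "how many more virtual machines can I fit" calculation concrete, here is a minimal Python sketch. It assumes you already have the memory consumed value for each running VM; the function name `additional_vms`, the reserve for the hypervisor, and all the numbers are hypothetical choices for the example.

```python
def additional_vms(host_memory_mb, consumed_mb_per_vm, new_vm_mb, reserve_mb=4096):
    """Estimate how many more VMs of a given size fit on a host.

    host_memory_mb:      physical memory installed in the host
    consumed_mb_per_vm:  memory consumed by each running VM
    new_vm_mb:           memory you expect each new VM to consume
    reserve_mb:          memory held back for the hypervisor (an assumption)
    """
    free_mb = host_memory_mb - reserve_mb - sum(consumed_mb_per_vm)
    if free_mb <= 0:
        return 0
    return free_mb // new_vm_mb


# Illustrative host with 64 GB of memory and four running VMs.
consumed = [8192, 4096, 12288, 6144]
fits = additional_vms(65536, consumed, new_vm_mb=2048)
print(f"Room for about {fits} more 2 GB VMs")
```

This is a rough estimate only. It ignores per-VM overhead and any HA failover capacity you need to keep free, so treat the result as an upper bound.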

One complaint that some users have with memory consumed is that it is not always the best counter to use in specific cases. Red Hat and some applications like SQL Server and Oracle can grab all the memory available regardless of how much they actually use, and keep that memory tied up from a VMware perspective. In these cases the best metric to use is memory active, since it shows how much of the memory is actually being used, not what has been taken on the host.

Check back on this blog or follow us on Twitter for the next post in this series!

Part 1 of this series is: Managing Memory in VMware (Part 1 of 4)

Monday, December 21, 2009

Managing Memory in VMware (Part 1 of 4)

The secret to running a virtual environment is to actively manage the limited resources available so that all the virtual machines get the resources they need.

The first bottleneck that most virtual environments hit is memory, so managing memory is critical to running a healthy, productive virtual environment. This is especially true in VMware, since you can over-commit memory on a host or cluster, which can lead to serious performance issues if you are not actively monitoring memory. The trick is to understand which memory statistics are available and what they are measuring so that you can get a complete picture of how this critical resource is being used by your hosts and virtual machines.

Let’s start with the basics. Here is a list of the counters in VMware and a definition of what they measure:
  • Consumed: Actual consumption of physical memory that has been allocated to the virtual machine; that is, the amount of machine memory allocated to the virtual machine, accounting for savings from shared memory.
  • Active: Amount of memory recently accessed.
  • Memory Granted: The amount of guest physical memory that is mapped to machine memory.
  • Memory Balloon: The memory balloon driver (vmmemctl) works with the virtual machine to reclaim pages that are considered least valuable by the guest operating system. This technique increases or decreases memory pressure on the guest operating system, causing the guest to use its own native memory management algorithms to decide what pages to give up.
  • Memory Shared: For the host, this is the sum of each virtual machine's Memory Shared.
  • Memory Shared Common: The amount of machine memory that is shared by virtual machines.
Memory active and memory granted are the default statistics that are listed in the vCenter performance graphs. They tell you how much memory has been mapped on the host for each virtual machine (Memory Granted) as well as how much memory is being accessed at the given time (Memory Active). Both of these counters look at the resources from the virtual machine's point of view and are the right metrics to look at to determine what is happening in real time.

Over the next few days I will be going into greater detail about managing memory in VMware. Check back on this blog or follow us on Twitter!

Part 2
Part 3
Part 4

Friday, December 11, 2009

Hardware over-provisioning is the norm in VMware shops

After 4 years of examining capacity utilization in hundreds of organizations running VMware ESX, I can unequivocally state that most over-provision hardware resources for VMs. This has led to overspending on storage and servers, which, especially in these days of tight IT budgets, has slowed down server virtualization growth in many shops. Of course the next logical question is why does this happen? Politics and lack of visibility into hardware resource utilization are the answer. Application owners who are afraid and lack virtualization knowledge demand from VMware admins that they provision extra vCPUs, memory and storage to overcompensate for the "virtualization overhead". Once VMs are deployed, nobody bothers to go back and check how the allocated resources are actually being used, or not. At VKernel we found that, almost always, comparing the allocated resources with what was actually needed showed as much as 30% or more left unused. Can you imagine how much 30% less SAN storage and 30% fewer servers would save your organization? Conversely, you can reuse the hardware resources to grow your environment by 30% without buying more servers and storage.

How is your shop managing hardware capacity? I would love to hear from you.

Tuesday, December 8, 2009

Capacity Analyzer with Hyper-V Beta Now Available

Capacity Analyzer with Hyper-V Beta is released.

VKernel has extended our award-winning Capacity Analyzer virtual appliance so that administrators can now manage both their VMware vSphere and Hyper-V virtual environments from a single software solution. Many companies are using more than one hypervisor to virtualize their servers and desktops and need a single solution that provides the analytics to make the most of the hardware they are using. Hyper-V is included in Windows Server 2008 and is an easy-to-use virtualization product from Microsoft with features comparable to VMware's.

VKernel provides all the performance metrics needed to manage your virtual systems in an easy to use and interpret web interface so that you can manage your Hyper-V and VMware environments from the same interface. We are among the first companies to offer the ability to manage different hypervisors with the same solution.

Here is a link to the beta of Capacity Analyzer so that you can try it in your own environment.
http://www.vkernel.com/download/capacity-analyzer-hyper-v

Top features of Capacity Analyzer with Hyper-V:

  • Predict and avoid costly performance degradations and downtime by properly allocating resources
  • Get more out of your existing VMware infrastructure, so that you can delay unnecessary purchases
  • Safely increase virtual machine (VM) densities to lower the cost per VM
  • Find available capacity to safely add new VMs
  • Identify your top resource consumers
  • Set alerts to take the necessary proactive measures to avoid problems altogether

Monday, December 7, 2009

7 Ways to Grow Your VMware Environment with Less Hardware

Once organizations start virtualizing servers with VMware or Hyper-V, it is hard to stop. The benefits of server virtualization are very compelling. When organizations have virtualized 15% to 20% of servers, it becomes very obvious that the hardware costs to continue to scale are significant. SAN is of course the biggest expense, and additional servers are not cheap either. Smart VMware and Hyper-V shops are realizing that they cannot continue to virtualize and grow without constantly buying more SANs and servers, and that is a very expensive way to grow your datacenter. Before you go to your boss and ask for more money for hardware, here are 7 ways to grow your VMware environment with less hardware:

1. Shrink VM resource allocations. Most organizations have over-allocated storage and memory for 98% of VMs. Yes, shocking! Even worse, admins don't go back to compare the allocated vs. actual resources needed for each VM, so there is a huge waste of resources.

2. Delete abandoned VMs. It is amazing how quickly VMware data centers have become graveyards for VMs that are no longer used and have not been fired up in ages, as well as VMs that are still running and doing nothing (zombies).

3. Delete abandoned snapshots. Snapshots are so popular with admins because they enable disaster recovery, but few admins bother to go back and delete them. When SAN storage costs $25 to $50 per GB, it adds up to real money quickly.

4. Delete abandoned disk images. In many environments we see that VMs are deleted in VMware but the disk images have never been deleted. Given the size of the average VM, a lot of disk space is wasted this way.

5. Make CPU and memory reservations carefully. People run out of hardware resources when they make too many reservations on memory or CPU. By default VMware is not going to let you start new VMs if it does not have enough resources to ensure HA failover.

6. Delete VMs after QA has finished. In QA groups a common mistake is not deleting VMs after testing has been completed, which results in a huge waste of money.

7. Balance your VM workloads. You may have a bottleneck in storage, storage I/O, memory or CPU on a particular host or cluster. The answer is not always to throw more hardware at the problem. Before you do, try to balance out all your hosts with VMs that are complementary to each other in terms of CPU, memory, and storage I/O. Don't place all CPU-bound VMs on one host. The same goes for memory and storage I/O.
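The idea in item 7 of pairing complementary workloads can be sketched with a simple greedy placement in Python. This is only an illustration, not how VMware DRS or any VKernel product works: the function name `place_vms`, the host and VM names, and the demand figures are all hypothetical, and only CPU and memory are considered.

```python
def place_vms(vms, hosts):
    """Greedy placement that spreads CPU-heavy and memory-heavy VMs.

    vms:   {vm name: (cpu_mhz demand, memory_mb demand)}
    hosts: {host name: (cpu_mhz capacity, memory_mb capacity)}
    Returns {host name: [vm names]}.
    """
    total_cpu = sum(cpu for cpu, _ in hosts.values())
    total_mem = sum(mem for _, mem in hosts.values())
    load = {h: [0, 0] for h in hosts}
    placement = {h: [] for h in hosts}

    def size(demand):
        # A VM's "size" is its larger share of total CPU or total memory.
        return max(demand[0] / total_cpu, demand[1] / total_mem)

    def utilization_after(host, demand):
        # Worst of CPU and memory utilization if the VM landed on this host.
        cap = hosts[host]
        return max((load[host][i] + demand[i]) / cap[i] for i in range(2))

    # Place the largest VMs first, each on the host it stresses least.
    for vm, demand in sorted(vms.items(), key=lambda kv: size(kv[1]), reverse=True):
        best = min(hosts, key=lambda h: utilization_after(h, demand))
        if utilization_after(best, demand) > 1.0:
            raise ValueError(f"no host has room for {vm}")
        load[best][0] += demand[0]
        load[best][1] += demand[1]
        placement[best].append(vm)
    return placement


# Illustrative data: two identical hosts, two CPU-bound and two memory-bound VMs.
hosts = {"esx01": (10000, 32768), "esx02": (10000, 32768)}
vms = {
    "cpu1": (4000, 2048),
    "cpu2": (4000, 2048),
    "mem1": (500, 12288),
    "mem2": (500, 12288),
}

placement = place_vms(vms, hosts)
for host, names in placement.items():
    print(f"{host}: {', '.join(names)}")
```

With this sample data each host ends up with one CPU-bound and one memory-bound VM, rather than both CPU-bound VMs competing on the same host.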

I welcome your comments and ideas.