Showing posts with label ESX. Show all posts

Monday, June 16, 2008

Run OS Run...

The above title is a horrible attempt at a Forrest Gump reference.

Either way - Novell announces support for VMWare's Virtual Machine Interface - which allows Suse Linux Enterprise Server (SLES) 10 Service Pack 2's kernel to have "increased performance and better interoperability".

Think of it this way - there will be preferred OS's to virtualize - Novell says you can virtualize Suse since it's just x86 virtualization, but if you run SLES 10 SP2 - you get VMI support.

In the paravirtualization space - this will mean increased density, better running VM's, better running ESX hosts, etc.

In other words - "the guest operating system is modified to work more closely with the underlying hardware and not just with the virtualized environment."

And it's a brand differentiator as well - the OS and the VM platform both have to be tuned or tunable - and with VMI - VMWare is saying they are prepared to virtualize an OS like SUSE better than its competitors.

Monday, June 2, 2008

One Quad or Two?

One of the best resources on the Internet for VMWare implementation is the VMTN community forums - it's top notch.

This week there was a discussion about budgets and performance (where finance always mixes it up with IT (that and Chargeback)).

The post asks about the value of two medium-speed (1.6 - 2 GHz) QuadCore CPUs ($$) vs. one high-speed (3.33 GHz) QuadCore CPU ($$$$).

I liked the response from William Bishop of Huntsville Hospital - "You'll get better density on the dual socket". He prefers the "dual proc, quad core setup" and has been "adminning vmware from some of the first dual cores to the newest quad cores."

I wonder if he has done anything with 4 socket x QuadCores?

Density is important - it's going to help you drive down your per VM costs and generate better ROI on the dollars invested in a virtualization product.

Thursday, May 29, 2008

B-Hive acquisition is a real smart move for VMWARE

Since the dawn of computing, the most critical element in performance management has been the user. It does not matter what the performance metrics say inside the Virtual Center. It is all about how your users perceive the performance: is it fast, OK, or too slow? It is not about the metrics. At the end of the day it is a qualitative experience. As David Marshall correctly points out in his coverage of the news, the cool factor here is that VMWare will now be able to granularly break down where the time in application response is being spent. Is it in the network, the database, or the application? When combined with vmotion, a vm could be moved to another host based on the analysis of where the problems are. If it is a network problem, move the vm to another segment closer to the end users. If it is a host capacity issue, move it to another host where more capacity exists. The angle that really excites me is the ability to monitor more granularly, down to the end user level. We would be able to answer how much resource is being utilized by a given user. This can be used in chargeback, capacity management, etc. I hope to see Vmware publish this API in the near future!!

Tuesday, May 27, 2008

Data Rich, Insight Poor

This phrase has been gaining popularity.

It's a phrase used by CIO's to describe their business systems, the mountains of data generated by business applications.

This mountain has spawned new phrases and software to address business intelligence, dashboards, scorecards, key performance indicators, etc.

At the recent MIT CIO Symposium, I heard a CIO talk about how easy it was to become "data rich, insight poor". She was talking primarily about how their BI product didn't make them that smart because the data was all over the place - they were engaged in new initiatives (ETL, Data Warehousing) to try to be data rich and insight rich.

I think a significant tertiary effect of virtualization is the increase of information.

Before you had individual piles of data, but with a virtualized environment you have the system and performance data from the ESX Server, the Host OS, the SAN, the Network, etc.

VKernel's Capacity Bottleneck Analyzer is an "ESX Intelligence" product - it's a KPI dashboard that does the analytics and automation to lift and load data out of Virtual Center and start analyzing resource constraints.

This problem is going to get worse.

The density of virtual machines is going to increase - we are going to see more CPU's, more RAM and more VMs per physical box.

It's time to look at the mountain of data and start working on the insight.

Saturday, May 24, 2008

Capacity Planning for ESX is a multi-dimensional challenge

Windows Admins, listen up. Your world has changed. No more one application running on one Windows server where none of your capacity resources were shared. Once you virtualize your servers, they have to share memory, cpu, storage, network bandwidth, disk i/o, network i/o, etc., with other VMs running on the same hardware. Capacity planning, which was a non-event in the Windows world, is now a must-do; otherwise you will run out of capacity and experience performance problems or even worse - downtime.

Capacity Planning is a multidimensional problem. To do it correctly you must take into account literally hundreds of variables. Here are some of them:

- how many VMs do you deploy?
- where are you going to deploy them?
- how much resource to allocate to them?
- what happens if you want to change hardware?
- will you violate any configuration constraints?
- do you need another host?
- what resource will you run out of first? Memory? CPU? Storage? And where?
- how many more VMs can you fit into each cluster?
- what happens if VMs get vmotioned?
- will you violate DRS affinity rules?
- what configuration constraints will you violate?
- will DRS work?
- will HA work?

I can go on and on. I hope you see just how complex capacity planning has become. As VM density on hosts continues to increase, capacity planning in VMware will become even more critical, because every physical server becomes more business critical and failure is not an option. Systems Management is fun again!!

Sunday, April 27, 2008

VMs can cost more than physical servers -- really!!

I am amazed at how many virtual environments I have now seen that are severely underutilizing the new hardware and are afraid to increase VM density. They buy expensive server hardware, load it with 16 GB or more of RAM for $30K to $50K, and run just a handful of VMs on it. This is analogous to driving a perfectly good Ferrari without ever getting out of first gear!! Say you are running 8 VMs on $50K of hardware. Add the cost of SANs, etc. and you can quickly see how the cost of each VM can actually be higher than the physical server it replaced. This of course raises the question: why do people underutilize the hardware?

As far as I can tell there are several reasons. Some are simply utilization "ignorant" about their environment, but the majority are afraid to "push the metal" and increase utilization because of concerns about running into ESX performance problems or, worse yet, downtime. Finding capacity bottlenecks using Virtual Center is time consuming and not trivial, and predicting future capacity bottlenecks requires fairly advanced mathematical analysis of all 4 core resource types, disk I/O, etc., so most Vmware Admins lack the time and experience to do this exercise. So they keep the Ferrari in first gear, keep driving blindfolded, and hope that vm sprawl does not catch up with them. With the availability of tools like the Vkernel Capacity Bottleneck Analyzer, VMware admins will gain visibility into current and future capacity problems and steer clear of performance issues. It helps to drive with the lights on!! Tell us what you think: www.vkernel.com
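The per-VM cost arithmetic in this post is easy to sketch. The $50K host and the 8 VMs come from the example above; the $20K share of SAN cost and the 24-VM comparison are assumed figures for illustration only, and the function name is made up for this sketch.

```python
def cost_per_vm(hardware_cost, shared_cost, vm_count):
    """Spread host hardware plus shared infrastructure cost evenly over the VMs."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return (hardware_cost + shared_cost) / vm_count


# $50K host and 8 VMs are from the post; the $20K SAN share is an assumption.
low_density = cost_per_vm(50_000, 20_000, 8)
# 24 VMs on the same host is a hypothetical higher-density scenario.
high_density = cost_per_vm(50_000, 20_000, 24)

print(f"8 VMs:  ${low_density:,.0f} per VM")
print(f"24 VMs: ${high_density:,.0f} per VM")
```

With these assumed numbers the low-density host works out to $8,750 per VM, which is the kind of figure that can exceed the cost of the physical server each VM replaced.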

Wednesday, April 16, 2008

I want my Availability

I wanted to do a knock-off of the Dire Straits tune "Money for Nothing" and change the "I want my MTV" into "I want my Availability" - VKernel will then hire a Dire Straits cover band to perform it somewhere.

Availability vs. Capacity - is one better than the other?

Are there types of Availability?

Virtual John writes about High Availability (HA) vs. Continuous Availability (CA)

"In plain English this means, if one of your hosts in a cluster of VMWare Servers goes away the VMs will reboot elsewhere. Reboot = downtime, so is this high availability? Or just higher availability than no fault tolerance?"

It could be an important misnomer - the VMs with HA will be expected to not have any interruption of business service (enter VMotion) and voila - "I want my Availability".

Thursday, April 10, 2008

Virtualization's Dark Side!?!?!?

Virtualization's Dark Side

Forbes is reporting on security risks in virtualization. So expect CEO's and CFO's to kick emails off to CIO's asking them about it. It's Joanna Rutkowska and Jon Oberheide (Wolverines!!).

Joanna is talking about "virtual machine escape" or "hyperjacking" and "blue pills", which basically amount to taking control or injecting a malware hypervisor. Jon thinks it's going to be an intercept during a VMotion or a "live-machine migration".

"Rutkowska and Oberheide both say that the attacks they discussed are likely too new to have ever been used by real-world cybercriminals. Security researchers say that virtualization-based attacks aren't likely to be common"

So it's theoretical at this point and that's sort of good news, but with Virtualization becoming more commonplace, the potential exists for security issues.

Forbes cites IDC and says

"Virtualization usage grows--at the breakneck speed of around 40% a year, according to a 2007 report".

I am also interested in what IBM (ISS's X-Force) is doing

"an 18-month-old research initiative called PHANTOM, devoted to protecting virtual machine hypervisors from hackers."

I know they did some R&D on sHype.

Friday, March 28, 2008

Swims like a mainframe.

I love the duck test. If a bird looks like a duck, swims like a duck and quacks like a duck, then it's probably a duck.

HP's new 8-way DL785 G5 looks like a mainframe wannabe.

Take out the mainframe OS ($$$$) and replace it with ESX ($$). Then replace mainframe workloads with virtual machines - basically VM workloads - both are consumers of disk, memory, cpu, network.

Mainframes try to run continuously at over 70% busy. A 90% figure is more typical, and modern mainframes could see sustained periods of 100% CPU utilization. You're going to need a capacity tool.

Typically, a mainframe is repaired without being shut down. Also, memory, storage and processor modules of chips could be added or hot swapped without being shut down. It is not unusual for a mainframe to be continuously switched on for 6 months at a stretch.

So maybe if it runs CPU like a mainframe, and has uptime like a mainframe, is it a mainframe??

Check out the numbers:

8 sockets (up to 32 cores)
64 DIMM slots, (Up to 256 GB of RAM - 4 GB max per slot)
11 PCI-e expansion slots (3 x16 slots, 3 x8 slots and 5 x4 slots)
2.3 terabytes of internal storage

When the 8 GB DIMMs ship - this could be 512 GB of RAM.

HP is aiming these behemoths at two roles:

1) Very Large Database Systems (VLDBS)

Very large database servers with massive data buffer caches.

2) Very Large Virtualization System (VLVS)

These are going to push capacity and virtual machine counts to new historic levels.

A huge issue with putting these massive systems into production is finding better/smarter management tools that can help you identify potential capacity bottlenecks and gather capacity and performance data. Oh and don't forget about VM chargeback.

The key to these beasts looks like the Opteron chipset - no shared memory bus - each processor has its own memory and I/O bus. Sun's Sun Fire X4600 also runs eight sockets of Opterons. I can't imagine Intel is going to stand for that - new word of the week - octal core.

The mainframe folks are seeing a return to shared processing on very large systems, so it may not be a mainframe per se, but this system sure quacks and swims like one. Except it costs like a server.

Friday, February 29, 2008

How to predict future capacity bottlenecks


Your virtual data center is growing. You are adding a ton of new VMs every week. Wouldn't it be really cool if you had a "crystal ball" that told you in how many days you will run into capacity bottlenecks and what type of bottlenecks they will be (cpu, memory, storage)?

Now you can. Join Vkernel's beta program for Capacity Bottleneck Analyzer that will kick off in early March.
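To give a feel for what a "crystal ball" like this has to do, here is a minimal sketch that fits a straight line to daily usage samples and projects when the line hits capacity. This is only an illustration under a linear-growth assumption, not how Capacity Bottleneck Analyzer works; the function name and sample numbers are made up.

```python
def days_until_full(samples, capacity):
    """Fit a least-squares line to daily usage samples (one per day) and
    return the number of days from the last sample until the line reaches
    capacity. Returns None when usage is flat or shrinking."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity - intercept) / slope
    # Already past capacity on the trend line means zero days left.
    return max(0.0, day_full - (n - 1))


# Hypothetical memory usage in GB over five days on a 200 GB cluster.
memory_used_gb = [100, 110, 120, 130, 140]
print(days_until_full(memory_used_gb, capacity=200))
```

Run the same projection for cpu and storage and the smallest number of days tells you which bottleneck arrives first. Real workloads are rarely this linear, so treat the result as a rough early warning.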





Tuesday, February 26, 2008

How many more VMs can you fit into a Cluster, Resource Pool or Host?

Given the speed at which most admins are adding new VMs to their environment ("vm sprawl"), every admin has to figure out where to deploy the new VMs. Simply guessing about resource availability will lead to performance problems and downtime. What I am proposing here is a simple 4-step process you can use to determine how many more VMs you can fit into a Cluster, Resource Pool or a host. Here it goes:

  1. Select a cluster, resource pool or a host
  2. Get info on available memory, storage, cpu, disk i/o and net i/o
  3. Calculate an average VM footprint in the cluster, RP or host in terms of memory, cpu, storage, disk i/o
  4. Apply the average VM footprint to every resource type to see which resource you will run out of first. That’s how many more VMs you can fit into the host, cluster or resource pool
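
The four steps above can be sketched in a few lines. The resource names, the headroom figures and the function name are all assumptions made up for illustration; in practice the numbers would come from whatever you use to read available capacity and per-VM usage.

```python
import math


def additional_vms(available, avg_footprint):
    """Return (count, limiting_resource): how many more average-sized VMs fit
    and which resource runs out first."""
    fits = {
        resource: max(0, math.floor(available[resource] / per_vm))
        for resource, per_vm in avg_footprint.items()
        if per_vm > 0
    }
    if not fits:
        raise ValueError("no resource has a positive average footprint")
    limiting = min(fits, key=fits.get)
    return fits[limiting], limiting


# Step 2: available headroom in a hypothetical cluster.
available = {"memory_gb": 48, "cpu_ghz": 20, "storage_gb": 900}
# Step 3: average footprint of the VMs already running there.
avg_footprint = {"memory_gb": 4, "cpu_ghz": 1.0, "storage_gb": 60}

# Step 4: the scarcest resource decides the answer.
count, limiting = additional_vms(available, avg_footprint)
print(f"{count} more VMs, limited by {limiting}")
```

In this made-up example memory allows 12 more VMs, storage 15 and cpu 20, so the answer is 12 and memory is the resource to watch.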


Tuesday, February 12, 2008

Servers are no longer a "Resource Boundary"

One of the hardest concepts for System Administrators new to virtualization to understand is shared resource management. VMware ESX makes it possible to share resources, namely memory, cpu, storage and network, not only inside a physical host but also across multiple physical hosts. The resources are pooled together to create one massive resource pool captured in a concept called a cluster. Even resources inside clusters can be further subdivided into many Resource Pools. For admins who are only used to dealing with physical servers as resource boundaries this can be confusing, especially when it comes to planning and management of capacity. For example, when monitoring or determining resource capacity, Admins must now take into consideration how all resource boundaries are affected. Looking just at physical servers is no longer an option!

Friday, February 1, 2008

How many new VMs are you adding per week?

How many new VMs are you adding per week? This is a very important question, because it has major implications for capacity availability in your ESX data center and ultimately for performance. Every VM you deploy will consume cpu, memory, storage and network resources. It will also add additional disk I/O. It is easy to see how, if uncontrolled, you can quickly run out of resources and develop capacity bottlenecks. Of course the trick is to figure out which resource you are going to run out of first. Will you hit the bottleneck in memory, cpu, storage, disk i/o or network? The answer is that it really depends on your environment, but in most cases the first bottleneck is memory. Why? Remember, you were able to virtualize servers because they were underutilizing CPU. That is what enabled you to combine 8+ servers on one piece of hardware. When you think about memory, it is a different story. Just because your servers are now virtual, it does not mean they are consuming less memory. That's why in most environments the first capacity bottleneck is memory. What do you think is the second capacity bottleneck you are likely to hit? Let me know at abakman@vkernel.com
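Here is a rough sketch of how the weekly VM count turns into "weeks until a bottleneck" for each resource. All the numbers and names are assumptions for illustration, and it assumes every new VM has the same average footprint.

```python
def weeks_of_headroom(free, per_vm, new_vms_per_week):
    """Weeks until each resource is exhausted at a steady weekly VM growth rate."""
    if new_vms_per_week <= 0:
        raise ValueError("new_vms_per_week must be positive")
    return {
        resource: free[resource] / (per_vm[resource] * new_vms_per_week)
        for resource in per_vm
    }


# Hypothetical free capacity and average per-VM consumption.
free = {"memory_gb": 64, "cpu_ghz": 40}
per_vm = {"memory_gb": 4, "cpu_ghz": 0.5}

weeks = weeks_of_headroom(free, per_vm, new_vms_per_week=4)
first = min(weeks, key=weeks.get)
print(weeks, "-> first bottleneck:", first)
```

With these made-up figures memory is gone in 4 weeks while cpu lasts 20, which matches the pattern described above: lightly loaded CPUs, but memory that is consumed whether the server is virtual or not.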

Tuesday, January 29, 2008

9 capacity bottlenecks in ESX that kill performance

I have compiled a list of "things" that can cause you to run out of capacity resources in your ESX data center and run into performance problems or even downtime:

1. Adding new VMs through uncontrolled VM sprawl
2. Removing hosts from clusters
3. HA enabling your cluster without accounting for fail over
4. Changing Fail Over Capacity setting in a Cluster
5. Increasing reservations in VMs
6. Changing Resource Pool Configurations
7. Powering up many VMs that were powered off or in maintenance
8. Natural growth rates in Storage, CPU, Memory and Network utilization
9. Changes in workloads can result in Disk I/O bottlenecks

Did I miss any? Let me know abakman@vkernel.com
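
Items 2 through 5 in the list above all come down to the same question: if a host goes away, does what is left still cover what the VMs have reserved? Here is a deliberately simplified sketch of that check for memory. It is not VMware's HA admission control algorithm, just a worst-case sanity test with made-up names and numbers.

```python
def survives_host_failures(host_capacity_gb, reserved_gb, failures=1):
    """True if the memory left after losing the `failures` largest hosts
    still covers the total memory reserved by the VMs."""
    if failures >= len(host_capacity_gb):
        return False
    # Worst case: the biggest hosts are the ones that fail.
    remaining = sorted(host_capacity_gb)[: len(host_capacity_gb) - failures]
    return sum(remaining) >= reserved_gb


# Hypothetical cluster: three 64 GB hosts, 120 GB reserved by VMs.
print(survives_host_failures([64, 64, 64], reserved_gb=120))  # three hosts
print(survives_host_failures([64, 64], reserved_gb=120))      # after removing one
```

The three-host cluster passes, but the same reservations no longer fit once a host is removed from the cluster, which is exactly how item 2 sneaks up on you.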

Tuesday, November 20, 2007

Are you ready to SHARE your resources?

Sharing what? Resources? Memory? CPU? Storage?

There is an entire generation of Sys Admins now that has grown up with a distributed computing data center where one application is normally run on one server. This mostly happened because of Windows instability. Most administrators did not want to deal with trying to troubleshoot OS problems and multiple application problems at the same time. The threat of the infamous "Blue Screen of Death" de facto created this one application, one server architecture. In this world admins did not have to think about or worry about sharing of resources.

Welcome to server virtualization, where sharing of resources IS the primary idea. Sys Admins now will have to get used to the fact that their VM may suffer performance degradation as a result of its neighbor VM running on the same hardware and consuming a disproportionate amount of CPU and memory. So now Capacity Analysis and Capacity Monitoring become important again, just as they were back in the mainframe days. Sys Admins now have to really pay attention to "Who is consuming what resources". Capacity Analysis is not a one time event. It is an ongoing activity. In fact many System Administrators I have spoken with are already spending a good chunk of their time troubleshooting capacity related bottlenecks in their environment.

The problem will only get worse. As organizations continue to add Virtual Machines at an exponential rate, this problem in fact will get exponentially more challenging. The unpredictability of work load management will make Capacity Analysis a required activity that will have to be performed at least daily. Even if you used VMWARE Capacity Planner for your initial P-to-V conversion, you have to realize that it was nothing more than initial sizing. As you continue to add more virtual systems to the mix, many of the previous assumptions made by the Capacity Planner will no longer be accurate.

I love virtualization! Let me know what you think and email to abakman@vkernel.com

Alex Bakman

Tuesday, November 13, 2007

Virtual server sprawl reality

If you have not heard the term "virtual machine sprawl", welcome to virtualization. While the number of physical hosts in your environment will start shrinking, the number of VMs will grow exponentially once your users figure out just how easy it is to create "another server".

The implications are many:

1. If you thought that you had "too many" servers to manage before, guess what? It will actually get worse. A thousand of anything is too much to manage, ten thousand of anything will send you off the deep end. The management challenge is in the numbers, and there is no relief in sight on this front. To fight it, getting organized is the answer. You will have to get really good at keeping an asset inventory of your VMs. You have to know how many VMs you have, where they are, what's in them, and what state they are in.

2. Capacity planning will quickly become an issue. Think about it. Adding VMs at a quick pace will begin to strain your ESX resources. You will be amazed at just how quickly your "plentiful" amount of memory, storage and CPU starts to disappear. Each VM is consuming resources, and before you know it your overburdened hosts begin to develop performance problems. To fight it, you have to get disciplined and control introductions of new VMs. At minimum you need an approval process to quickly review new VM requests.

3. Audit of the VM environment will become even more challenging. With so many VMs being added, knowing who is acting on them, what changes are being made and where will require a real herculean effort.

Are we having fun yet? What do you think? Drop me a line.

Sunday, November 11, 2007

Virtualized Datacenter Brings New Challenges

Right now an average US corporation has about 7% of its Datacenter virtualized. As organizations continue to virtualize servers they will face 3 new management challenges:

1. Explosion in the number of virtual servers. Users have already figured out just how easy it is for IT to create new virtual servers. The number of requests for new virtual servers will continue to skyrocket.

2. Sharing of resources: memory, cpu, storage and network. In the traditional data center where one application server was dedicated to one application, no sharing of resources took place. That's not the case anymore in the virtualized datacenter.

3. Servers have grown "legs". In the traditional datacenter we did not need to worry about servers moving around the network from one location to another. Now we do.

In the next post I will explore how VMWARE administrators can address the 3 new challenges.

Friday, October 19, 2007

How to Calculate how much to Chargeback

While most agree that charging departments for computing resources consumed is the way to go, many get stuck on the question of "How do we compute what we need to charge for Memory, CPU, Storage and Network usage?" Since you have many departments that share the Virtual Datacenter, how do you go about figuring out how much to charge users for every GB of memory used, or for every GHz of CPU consumed? This is a real brain cramp!

It took us a little while here at VKernel, but we "cracked the code" on this one. We created a spreadsheet that takes into consideration your ESX hosts, storage and network devices and what you paid for them, how many user departments you have, who is using the resources, cost recovery timeframe, etc and automatically calculates rates that you should be charging your users per day for memory, CPU, Storage and Network.
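
The spreadsheet itself is not reproduced here, so the snippet below is only a guess at the simplest possible version of the idea: spread what you paid for a resource over the cost recovery timeframe and over the units of capacity you bought. The formula, the function name and every number are assumptions for illustration, not the VKernel calculator's actual method.

```python
def daily_rate(purchase_cost, capacity_units, recovery_days):
    """Daily charge per unit of capacity needed to recover purchase_cost
    over recovery_days, assuming all capacity is billed."""
    if capacity_units <= 0 or recovery_days <= 0:
        raise ValueError("capacity_units and recovery_days must be positive")
    return purchase_cost / recovery_days / capacity_units


# Hypothetical: $29,200 attributed to 128 GB of memory, recovered over 2 years.
memory_rate = daily_rate(29_200, capacity_units=128, recovery_days=730)
print(f"${memory_rate:.4f} per GB of memory per day")
```

A real calculation also has to deal with capacity that nobody is using yet and with costs that are not tied to a single resource, which is presumably where a fuller spreadsheet earns its keep.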

As you make changes to your infrastructure simply update the spreadsheet and it will recalculate the rates. You can download the Calculator and the White Paper that describes step by step how to use it from

http://www.vkernel.com/resourcecenter/methodology/


Let me know what you think.

Thursday, October 18, 2007

Charging Customers for ESX Resources Should Be Fair

Many organizations are trying to figure out how to do chargeback in a virtualized environment. For technical folks it is not an easy task. They understand the feeds and speeds but don’t know how to translate Gigabytes and GHz into dollars and cents. Conversely, accounting types understand cost recovery, but can’t quite grasp this virtualization “thing”.

It is actually not that hard. As IT embraces “utility” or “on demand” computing, it can borrow from many lessons learned by companies who provide us with electricity, oil and gas. As every consumer knows, your utility company charges you for the amount of utilities you consume. If you use more you have to pay more and conversely if you use little you pay little. This approach is fair and easy to understand.

Many initial attempts at chargeback are based around charging a flat rate per VM. It goes something like this. My server costs me X dollars and I can approximately host 8 VMs on a dual processor machine therefore I should charge every client X/8 per VM.

While simplistic, this approach is flawed in many ways:

  1. We all know that VMs consume vastly different amounts of resources (cpu, memory, storage, and network). A busy MS Exchange server supporting thousands of users is consuming a lot more resources than an old application server used by a couple of people. It is simply unfair for IT to charge the same price to all users.
  2. A flat per VM model does not capture many other costs associated with running a data center. In addition to consumable resources, IT must recover the cost of software licenses, electricity and cooling, administration and many other expenses. While some of them are “fixed” expenses, many are variable and must be adjusted for each billing period.
  3. Remember the original reason why many application servers were virtualized in the first place – they were underutilizing resources or ran on old hardware that was getting impossible to support. These servers consume hardly any resources and users wanted to save money by virtualizing them. To turn around and charge the users a lot of money for these underutilized servers is not right.
  4. The fact is that most VMs are shared applications used by many departments. Some VMs are used by many while others are dedicated to a particular group. Furthermore, to say that all departments use VMs equally is not based in reality. For example, take SAP. I am sure that people in finance spend a lot more time in SAP than people in IT. How do you account for this uneven usage between departments? The flat model breaks down here again.
  5. Here is another problem. In dynamic utility computing, resources are allocated on demand. If you need more capacity for a business application, another VM gets launched and consumable resources get allocated for it on demand. How would you keep track of resources in this scenario? Again the per VM flat charging model breaks down.

The only conclusion one can draw is that chargeback needs to be based on consumption of resources and services. That way departments only pay for resources and services they actually use. Life is not always fair, but maybe at least in the new Datacenter it can be :)
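
To make the difference concrete, here is a small sketch comparing a flat per-VM charge with a charge proportional to consumption. It collapses all resources into one made-up "usage units" number per VM, which is a simplification for illustration; the VM names, the usage figures and the $9,000 monthly cost are all hypothetical.

```python
def flat_charges(total_cost, vm_names):
    """Every VM pays the same share, regardless of what it consumes."""
    share = total_cost / len(vm_names)
    return {name: share for name in vm_names}


def consumption_charges(total_cost, usage):
    """Each VM pays in proportion to the resource units it consumed."""
    total_usage = sum(usage.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {name: total_cost * used / total_usage for name, used in usage.items()}


# Hypothetical usage units for one billing period.
usage = {"exchange": 75, "sap": 20, "legacy_app": 5}

print(flat_charges(9_000, list(usage)))
print(consumption_charges(9_000, usage))
```

Under the flat model the barely used legacy application pays $3,000, the same as the busy Exchange server. Under the consumption model it pays $450 and Exchange pays $6,750, which is the fairness argument from points 1 and 3 in numbers.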

Let me know what you think.

Alex Bakman abakman@vkernel.com