Weblog of Mark Vaughn, an IT professional and vExpert specializing in Enterprise Architecture, virtualization, web architecture and general technology evangelism
As with server virtualization in years past, new data center technology is unsettling for some IT pros. But, if you embrace change, you potentially make yourself more marketable and can vastly improve data center operations.
Several years ago, a hardware vendor ran a commercial that showed someone frantically searching for missing servers in a data center. He eventually learns that the large room full of servers had been consolidated into a single enterprise server platform.
It was an amusing commercial, although this trend of server consolidation did not play out as the advertisement implied. In fact, it was server virtualization that brought about large-scale consolidation efforts. But the premise was still true: Reducing the data center’s physical footprint caused a good deal of anxiety in IT administrators.
Multi-hypervisor management tools are great for common provisioning and monitoring tasks, but they fall flat when it comes to the deeper functionality that provides the most value.
Looking past the feature parity and marketing hype in the hypervisor market, there are compelling reasons to deploy a heterogeneous virtualization environment (e.g., reducing costs, improving compatibility with certain applications, avoiding vendor lock-in). Do you keep fighting it off? Do you concede and draw lines of delineation between the hypervisors? Do you throw your hands up and simply let them all in?
The answer certainly depends on your skillsets, needs and tolerance for pain. But if you’re looking for an easy way out through a multi-hypervisor management tool, you may be disappointed.
While nice on paper, chargeback and showback present complicated challenges for virtualization- and cloud-based data centers. To successfully implement these concepts, organizations must carefully evaluate their company-wide and IT objectives.
First, you need to determine if you are looking to simply report on resource consumption or potentially charge tenants for their resource utilization. If you sell hosting services, the answer is obvious: You need chargeback. But even within the corporate environment, more IT departments are looking to become profit centers, and must evaluate whether they need full-fledged chargeback, or simply showback.
Where showback is used purely to track utilization without billing, chargeback is used to bill tenants for resource consumption. And each model has its own considerations and challenges.
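To make the distinction concrete, here is a minimal Python sketch showing that both models consume the same usage data; chargeback simply applies a rate card on top of it. The tenants, resources and rates here are purely hypothetical placeholders, not a real billing model.

```python
# Hypothetical illustration: the same usage data drives both models.
# Tenant names, resource units and rates are invented for this sketch.

USAGE = {
    "finance":   {"vcpu_hours": 1200, "ram_gb_hours": 4800, "storage_gb": 500},
    "marketing": {"vcpu_hours": 300,  "ram_gb_hours": 900,  "storage_gb": 200},
}

RATES = {"vcpu_hours": 0.05, "ram_gb_hours": 0.01, "storage_gb": 0.10}  # $ per unit

def showback(usage):
    """Showback: report consumption per tenant, with no dollars attached."""
    for tenant, resources in usage.items():
        print(f"{tenant}: " + ", ".join(f"{k}={v}" for k, v in resources.items()))

def chargeback(usage, rates):
    """Chargeback: turn the same consumption data into a bill."""
    for tenant, resources in usage.items():
        cost = sum(qty * rates[resource] for resource, qty in resources.items())
        print(f"{tenant}: ${cost:,.2f}")

showback(USAGE)            # visibility only
chargeback(USAGE, RATES)   # visibility plus an invoice
```

The data collection problem is identical in both cases; the hard part of chargeback is agreeing on the rate card and defending it to the tenants being billed.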
Virtualizing tier-one applications has begun to catch on. If your virtualization strategies do not include business-critical workloads, you may want to reevaluate that decision.
Over the past three to five years, a major shift took place in server infrastructure. As budgets tightened, and virtualization technologies matured, server virtualization exploded into a perfect storm on the data center scene.
By the time most IT shops began exploring virtualization, their server environments were rife with low-end servers, which consumed rack space at an unprecedented pace, stressing data center infrastructures to their breaking points.
In many of these IT shops, the return on investment and other benefits of server virtualization, such as reduced power draw and greater resource efficiency, had an immediate effect. But many IT departments found themselves with a daunting list of servers to virtualize. Between migration lists and the constant stream of new servers, IT first attacked the “low hanging fruit,” or the easy-to-virtualize servers that presented little to no risk.
That was a good strategy, but now you are through the “low hanging fruit” and need to take a serious look at how to virtualize your tier-one applications. For more on this topic, read my TechTarget article “Tier-one applications: Part of effective virtualization strategies”. Feel free to come back here and leave comments.
Too often, people are afraid to ask stupid questions. In my opinion, stupid questions may be the best and most valuable questions.
It is very easy to develop “group think”, where people within a group or organization adopt a common view of a situation. This is good for teamwork, but bad for vetting complicated designs or enacting long-term strategies. About the only thing that can be guaranteed in a long-term strategy is that the decision criteria will change over the course of time. However, if you blindly stick with the strategy, even after the criteria used to develop it have changed, you may no longer be going in the right direction for the organization.
Three years into a five-year project, once everyone is fully educated on the goals and the project plan, you need someone who will stand up and ask the stupid question: “Why are we doing this?” Don’t let the question annoy you, and don’t let your pride keep you from truly considering it. Every project and design needs a devil’s advocate, forcing you to revisit decisions and justify why they are (still) relevant.
IT pros still argue over horizontal vs. vertical scaling, but the evolution of virtualization hardware appears to have subtly shifted the data center design debate to horizontal scaling vs. converged infrastructure.
Virtualization is winning the battle over installing an operating system on bare metal, and I think most people will concede that point. Not everyone has adopted virtualization, but time will soften their defenses and allow for improved virtualization and options — until all are assimilated.
But what will happen to server hardware? Do people still care about the server? Even hypervisors need a physical home, and there are still plenty of discussions on how to best architect that foundation.
One of my co-workers, Chris Reed (www.creedtek.com), uses the analogy that the most expensive pet you can get is a free dog. The initial cost is great. How can you get cheaper than $0? However, that transaction is followed by the vet bills, the inevitable property damage and the chewed-up slippers. Free dogs rarely have their shots, and puppies need several rounds of them. They usually need to be “fixed”, and I like to add in a location chip. Soon, you have shelled out a significant amount of cash on a free pet.
Many people will adopt a similar approach to virtualization, selecting a free product with the assumption that it will save them money. Can that work? Yes. Are there hidden costs to be aware of, and even expect in the near future? Definitely. Can those costs be significant? Yes, and they can be quite significant.
If you are comparing free hypervisors, then it comes down to features. Microsoft includes more features in its free version than VMware does, but those features are less robust than their VMware counterparts. And as you move into the higher licensing levels to build an enterprise solution, VMware’s features are significantly more robust. That could mean starting on one hypervisor at the free level, then having to change hypervisors as your environment matures. That can be a VERY painful process.
I wrote more on this topic in my last article “Free virtualization: It’s free for a reason”. I would also recommend going to www.virtualizationmatrix.com for a great breakdown of features across the various versions of the Citrix, Microsoft and VMware products. Andreas Groth has done a significant amount of research to build that matrix.
Virtualization management tools are popping up everywhere, and some are much better than others. In fact, few really shine at this point. Part of the problem is that these tools are attempting to tame a wild animal. Virtualization technologies are expanding and growing at a blinding pace, and no one can truly keep up with the current pace of change…let alone manage it.
Vendors like VKernel, Veeam and Quest are doing a good job, but don’t hold your breath looking for one tool to rule them all. There will always be advanced features within a hypervisor that management tools have not caught up to. You will either have to limit yourself to the features supported by your management platform (better pick a robust platform, or you will be crippling your hypervisor and destroying your ROI), or you will have to accept that you will still use the native hypervisor management tools to manage advanced features (limiting the ROI on the new management tool).
This trade-off is frustrating, but it is one that will not go away until the pace of change within virtualization technologies slows down considerably. In other words, it will not change any time soon.
Though I generally recommend against it, I admit there may be reasonable cases for mixing hypervisors within an environment. As you evaluate decisions like that, be sure to consider the impact on ROI. OpEx can go through the roof in those scenarios and easily wipe out the CapEx savings used to justify the decision. If you are then looking to a management tool to bring the two hypervisors together in a single pane of glass, do not set your expectations too high; few tools provide real value in that scenario, and the ones that could make any real impact may be cost prohibitive. Before you know it, you have pushed both the CapEx and OpEx through the roof trying to manage a mixed environment.
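To see how quickly that can happen, here is a back-of-the-envelope calculation. The figures are purely hypothetical; plug in your own licensing and labor numbers.

```python
# Purely illustrative numbers, not data from any real environment.
capex_savings_per_year = 25_000   # e.g., licensing avoided by adding a second, cheaper hypervisor
extra_admin_hours_per_week = 10   # added effort to run two platforms and two tool sets
loaded_hourly_rate = 75           # fully loaded cost of an administrator hour

added_opex_per_year = extra_admin_hours_per_week * 52 * loaded_hourly_rate
net_savings = capex_savings_per_year - added_opex_per_year

print(f"Added OpEx per year: ${added_opex_per_year:,}")  # $39,000
print(f"Net 'savings':       ${net_savings:,}")          # -$14,000 -- the CapEx win is gone
```

In this made-up example, ten extra hours a week of operational overhead more than erases the licensing savings, before you have even bought a multi-hypervisor management tool.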
In July of 1969, the US was in a race for the moon. Astronauts Michael Collins, Buzz Aldrin and Neil Armstrong were entrusted with the Apollo 11 mission, taking the first shot at the momentous achievement. Last night, I caught a great documentary on this mission, walking through the many planning details and challenges involved in going where no man had gone before. In fact, knowing the technology of the time, I am still amazed that we were able to pull this off on the first attempt. These men were truly brave, trusting their lives to such new and really untested technologies.
One thing that caught my attention was how meticulous Mission Control was, as they faced a number of “go/no go” decisions from the launchpad to the moon and back. As Buzz Aldrin and Neil Armstrong undocked the lunar module and began their descent to the moon’s surface, they faced almost impossible odds. There were so many calculations that had to be made in a split second, with no prior experience to draw from. Gene Kranz, NASA’s Flight Director at Mission Control, was faced with a number of tough decisions as the lunar module approached the surface of the moon.
They did not know exactly when they would touch down, or how much fuel they would burn in the process, but they did have a good idea of how much they would need to relaunch and reconnect with the command module for the return trip to Earth. With every second, fuel consumption was being calculated and measured to ensure this was not a one-way trip. About 30 seconds after beginning their final approach, Neil Armstrong called out “1201 program alarm”. This was a computer error code that simply meant the computer was unable to complete all of the calculations it was attempting and had to move on. Timing was critical, and the programmers of the flight computers knew that should this condition occur, it was more important to simply note that some data was lost and move on. I can imagine the concern this caused, both in Mission Control and in the cramped lunar module. This is where the many eyes and ears monitoring the situation at Mission Control had to step up and ensure that the important information was not dropped.
As I watched this, I noticed how similar it is to the way PCoIP handles data (you knew this had to have a virtualization tie-in, right?). PCoIP uses UDP. UDP is stateless and will drop data packets. At the most basic level of the solution, the networking layer, packets can be lost but the data does not stop flowing. UDP is like the flight computers on board the Apollo 11 lunar module. At the application layer, PCoIP becomes Gene Kranz and Mission Control. It looks at what may have been lost and how that loss can impact the overall mission. Calculating the 100 feet in front of the lunar module was infinitely more important than continuing calculations for the 100 feet behind it. With PCoIP, the goal is an efficient and pleasant user experience. To achieve that, amid a myriad of unknown factors that can come into play, the solution recognizes that not all data is critical to the end goal: UDP allows packets to be dropped, and PCoIP is the watchful eye that makes the “go/no go” decisions. For USB communications, data input and other critical data, it will request a retransmit to ensure accurate and reliable delivery of information. For audio or video packets, where it would inflict more harm to pause communications and attempt retransmits of the lost packets, it simply makes the decision to move on.
Protocols built on top of TCP, like RDP or ICA/HDX, cannot provide this intelligent decision-making. TCP guarantees delivery of every packet at the transport layer, so the application has no option but to pause while waiting for the data to arrive. Sometimes, you really need to allow the software protocol to apply some intelligence to that decision.
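For those who like to see the idea in code, here is a rough Python sketch of application-level “go/no go” decisions over UDP. The packet classes and retransmit policy are my own assumptions for illustration, not PCoIP’s actual implementation.

```python
# Rough sketch of application-level "go/no go" decisions on top of UDP.
# Packet categories and the retransmit policy are assumptions for
# illustration, not PCoIP's real internals.

CRITICAL = {"usb", "keyboard", "mouse"}   # must arrive intact and accurate
PERISHABLE = {"audio", "video"}           # stale frames are worthless once missed

def handle_gap(packet_type, sequence_number, request_retransmit):
    """Decide what to do when the receiver notices a missing packet."""
    if packet_type in CRITICAL:
        # Like Mission Control double-checking the numbers that matter:
        # ask the sender for the data again; accuracy beats latency here.
        request_retransmit(sequence_number)
        return "retransmit requested"
    # Like the 1201 alarm: note the loss and keep calculating the next
    # 100 feet; replaying an old audio or video frame only adds lag.
    return "skipped"

# Example: a lost video frame is skipped, a lost USB packet is recovered.
print(handle_gap("video", 1042, lambda seq: None))
print(handle_gap("usb", 1043, lambda seq: print(f"resend {seq}")))
```

The point is that the loss-handling logic lives in the application, where it can weigh the value of each packet; a TCP-based protocol never gets to make that choice.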
As the Apollo 11 lunar module was rapidly approaching the surface of the moon, with nearly zero tolerance for errors, NASA knew that some data loss was acceptable. In fact, it was preferable to the penalty that could have been incurred by stopping the pending functions to complete previous ones. Had the Apollo 11 flight computers actually stopped measuring fuel consumption and distance to the moon for a few seconds to finish whatever computation triggered that 1201 program alarm, the events of July 20, 1969, might have ended very differently.
This is a tricky topic, and one that many organizations get hung up on. It can be a major contributor to virtualization stall, which is when an organization finds its efforts to virtualize its environment stalling out and forward progress becoming more and more difficult to achieve. This often occurs once the “low hanging fruit” the organization started with is gone, and the work of maintaining the existing virtual environment combines with the difficulty of addressing the remaining “difficult child” physical servers to grind the effort to a halt.
The easy work is done, and now you begin to realize that moving forward may require a different game plan. Everyone may be in the same boat, but they are each rowing in a slightly different direction. The boat is either going nowhere or going in circles; either way, little to no progress is being made. The solution is not to force people to go in one direction, but to educate them on each other’s roles and help them understand the impact of their actions on everyone else. In doing this, they begin to move in the right direction as a common group with a common goal. In IT, this means breaking down silos and building new groups where every member of the team carries the same goal and will be measured against the same expectations. This change is not nearly as easy as it sounds, and it will surely meet a great deal of resistance.
For more on this topic, please read my article “Overcoming virtualization turf wars” at SearchServerVirtualization, then come back here and leave comments.