VMware View Usage Dashboard

October 4, 2011 1 comment

I’ve had the new version of my app running for a few months. This has given me quite a bit of raw data, but no really nice method for perusing it (aside from just raw SQL on the DB).

So I took some time yesterday to write a sort of dashboard based on the data I’ve collected and based on questions I get from the “higher-ups.”

This is version 1. The dashboard is web-based and built from whatever I felt like writing at the time: some .NET, some classic ASP, JavaScript and AJAX. All of the graphs and charts on the page come from Google Charts.

The top of the page shows current usage in each of my three pools. The gauge is scaled for the number of machines in the pool: pool 1 has 20 machines, pools 2 and 3 each have 100. The CPU utilization is averaged across all users as the percent reported by Windows itself (like you would see in Task Manager).
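To make the gauge math concrete, here is a minimal Python sketch of the two calculations described above. It is not the dashboard's actual code (that is .NET, classic ASP and JavaScript); the pool names and function names are placeholders I made up, and only the pool sizes come from the post.

```python
from statistics import mean

# Pool sizes are from the post; the pool names are hypothetical placeholders.
POOL_SIZES = {"pool1": 20, "pool2": 100, "pool3": 100}


def gauge_percent(pool, active_sessions):
    """Return how full a pool is, as a percentage of its machine count."""
    return 100.0 * active_sessions / POOL_SIZES[pool]


def average_cpu(samples):
    """Average the per-user CPU percentages; 0.0 when nobody is logged in."""
    return mean(samples) if samples else 0.0
```

With this scaling, 5 sessions read as 25% on the 20-machine pool but only 5% on a 100-machine pool, which is why each gauge has to be scaled to its own pool.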

Next to those is a term cloud that shows the top 10 currently running apps in the pools. Being a cloud, the more instances there are of a given app, the larger its font relative to the other apps listed.
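The sizing idea behind a term cloud can be sketched in a few lines of Python. This is only an illustration under my own assumptions: linear scaling between a minimum and maximum pixel size, and a hypothetical `term_cloud` function. The real page does this in the browser and may scale differently.

```python
from collections import Counter


def term_cloud(running_apps, top_n=10, min_px=12, max_px=36):
    """Map the top_n most common app names to a font size in pixels.

    Sizes scale linearly from min_px (least common of the top_n)
    to max_px (most common). The pixel bounds are arbitrary choices.
    """
    counts = Counter(running_apps).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    spread = highest - lowest
    sizes = {}
    for app, count in counts:
        if spread == 0:
            # Every app has the same count, so they all get the same size.
            sizes[app] = max_px
        else:
            sizes[app] = round(min_px + (count - lowest) * (max_px - min_px) / spread)
    return sizes
```

Feeding it one process name per running instance (for example, five `winword.exe`, three `excel.exe` and one `notepad.exe`) gives the largest font to the app with the most instances.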

The Start and Stop buttons control the AJAX that refreshes the gauges and term cloud on a specific interval.

Under that is a graph that shows logins and average CPU for the month of September. This one is just looking at one specific pool right now (Pool 2 from the gauges). And below that are two pie charts that show top apps for the month by frequency and time-in-use.

I think I will continue to tweak this based on what I’d like to see, along with any other requests I get from the “higher-ups.”

New vscsiStats Excel Macro

March 11, 2010 17 comments

I wrote an Excel Macro to process vscsiStats data and turn it into pretty charts & graphs. I shared that macro with my friend Matt Kelliher and after showing him how to use it, he suggested and made a modification to it. The latest version will still process the data and create charts but then it will also export the charts as PNG, create an HTML file and put thumbnails of the charts in. You can then click on any of the charts for a full-screen view. Handy method of presenting the data in a concise format.

Let me know what you think of the file; you may download a copy here. You are welcome to use this macro, but please leave the comments at the top (and feel free to buy us a beer or two).

This macro has been tested in Excel 2007 and the Excel 2010 beta. I must say, though, it runs much more slowly in these than in 2003.

Note: SAVE your spreadsheet before running this version of the macro. It uses the current save location as the starting point for creating the HTML and saving the images.

VMware Performance

February 19, 2010

I posted yesterday that I was seeing a huge amount of storage traffic on my View Manager box. Turns out that it was due to two things: a snapshot and vswp.

This morning I checked a few settings before proceeding. According to vCenter, the View Manager box only had 600 MB of RAM allocated, but Task Manager on that Windows 2003 box showed 2.9 GB in use and committed, mainly pagefile. Strange. Why the disconnect between the two?

It turns out that when the resource pool was created for the View Manager server, the reservations were never changed. The pool had a default memory reservation of 600 MB (all set by someone else, I might add). So, looking at the pool, the ESX host was only giving the View Manager server 600 MB of physical RAM and was using a vswp file to make up the rest of the 2 GB allocation.

I shut the server down, removed it from the resource pool, set the reservation on the pool higher, put the server back in, then set a reservation on the server itself that matched its allocated physical RAM.

At the same time, in vCenter I deleted the snapshot that was in place. The snap was just over 3 GB, so it didn’t take too long to merge it back in.

After all of this finished, I went back into the Sun storage analytics. What an amazing change!

Shaded in yellow, you can see the combined traffic to storage just from the View Manager’s delta and vswp files. The data point highlighted in the graph shows 3122 ops per second to the swap file and 225 to the delta.

Notice how the overall NFS traffic drops pretty drastically after about 9:25. The vswp file is empty now and the machine’s Task Manager shows very little page file usage.

VMware / Sun analytics

February 19, 2010

I was going to do a post about vscsiStats processing in Excel just in and of itself. But today an opportunity presented itself that I hope to be able to exploit.

It seemed like our VMware View Manager was dog slow, and the boss was complaining to me about it. I thought, OK, this is an opportunity to gather some vscsiStats and process them to see what’s going on with the storage on this thing. As a starting point, our View Manager is a Win2k3 box with 2 GB of RAM running on a Dell PowerEdge 2950 with ESX 4. Backend storage is provided via NFS from a Sun 7410c.

The first thing I did was log in to ESX with PuTTY, clear vscsiStats and then start gathering statistics. I collected for 60 minutes, exported the data and then processed it in Excel.

Hmmm. Interesting:

IO Lengths

Looks reasonable enough, let’s look at read and write throughput as reported by vscsiStats:

Hmmm. 215K average total throughput to storage. That’s not enough to cause the performance degradation that we are seeing. Another bit of calculation in Excel showed me that the View Manager was averaging 22 IOPS. Again, not enough to cause any of the problems we are having with it.
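For anyone who wants to redo that arithmetic outside Excel, here is a rough Python sketch. Two things here are my assumptions and not data from the post: I read “215K” as kilobytes per second, and the total of 79,200 I/Os is just 22 IOPS multiplied back out over the 60-minute collection window. The function names are made up for illustration.

```python
def average_iops(total_ios, duration_seconds):
    """Average I/O operations per second over the collection window."""
    return total_ios / duration_seconds


def average_io_size_kb(throughput_kb_per_s, iops):
    """Average size of a single I/O, given throughput and IOPS."""
    return throughput_kb_per_s / iops


# Assumed inputs: 79,200 I/Os over a 60-minute collection, 215 KB/s throughput.
print(average_iops(79_200, 60 * 60))
print(round(average_io_size_kb(215, 22), 1))
```

Under those assumptions the average I/O works out to just under 10 KB, which is a small and unremarkable load however you slice it.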

So I then logged into the Sun 7410 head that runs NFS for this ESX cluster. A few minutes looking at the analytics on the Sun and I had my answer.

The green shaded area is NFSv3 ops per second to/from the View Manager. Yep, take another look: 6750 ops per second in that spike. I left the other boxes listed in there, although their names have been hidden to protect the innocent. The next highest is 530 IOPS for an SAP server, then 294 and 92 for more SAP servers. One of my file servers is next at 61.

Clicking on the list for View Manager to open the hierarchy, I saw that there were two files listed that made up that 6750 ops, one of them named …delta.vmdk and one of them a vswp file.

The VM has a snapshot, and about half of the ops are going to the delta file. Then, for some reason, the View Manager is exceeding its allocated 2 GB of RAM and using the virtual swap space.

I’m not an expert at any of these tools, but it sure is nice when you can use the tools you have to track things down and find the suspected causes of a problem. I will be making changes to this machine first thing in the morning to remove the snapshot, increase the RAM and set a reservation so it doesn’t use a vswp file.

I’ll post again once I gather more stats to see if this has helped with performance.

