On to something more interesting…

We use VMware ESX 3.5 for our virtualization solution, and I was tasked with finding a way to automate monitoring of CPU/MEM usage for ESX guests. We use VirtualCenter to manage and maintain all of our VMs, and we use a network monitoring solution to monitor all of the devices in our infrastructure. Performance data reported by the guest itself has proven unreliable in the past, but the performance data that VirtualCenter provides is reliable, and that is what we wanted to monitor from our centralized solution.

Let me go off on a little aside here for a second to address a common argument about network monitoring that I think a lot of people run into in their organizations… Just about every product has its own proprietary monitor/alert system that will notify you when something is wrong. We all know this and, for the most part, can rely on it pretty comfortably. The problem is that you end up with 500 different things that need to be monitored and, in turn, 500 different things doing the monitoring. This becomes a real hassle to maintain and doesn’t give you a central repository for historical performance data. A network monitoring solution (like Zenoss or Zyrion Traverse) gives you a one-stop shop for tracking performance and creating/managing SLAs for your different services and devices. Additionally, most modern solutions will let you write custom scripts to keep track of your organization’s non-standard processes and services.

So, this is what I’ve done for our ESX guests… I’ve created a script to monitor the guests individually over SNMP, which covers things like running processes, user counts, and disk space usage. We use the standard ESX template to monitor the ESX host itself. The correlation between host and guest is not easy to make when you have to look at two different devices to get all of the data you need… So, the idea here is to create a custom test, inside the monitoring application, that replaces the existing CPU/MEM usage tests and pulls the real CPU/MEM performance data from VirtualCenter. This also avoids a second polling cycle against the guests, because we’re only asking VirtualCenter for data that it has already collected.

Requirements: VI Perl Toolkit (http://www.vmware.com/support/developer/viperltoolkit/)

The bare metal:

#!/usr/bin/perl

use strict;
use warnings;
use VMware::VIM2Runtime;
use VMware::VILib;

my $perfMgrView;

# Option setup and connection
{
        # Set up the required options for this script
        my %opts = (
                "vm"    => { type => "=s", help => "Virtual Machine to Query", required => 1 },
                "param" => { type => "=s", help => "[ cpu | mem ]", required => 1 },
                "pct"   => { type => "=i", help => "use percentages" },
        );

        # These functions are provided by the VMware libraries; they validate
        # that we have everything necessary to establish a secure connection
        # to vCenter, including our required "vm" and "param" options.
        Opts::add_options(%opts);
        Opts::parse();
        Opts::validate();

        eval {
                Util::connect();
        };
        if ($@) {
                # Bail out with a message if we can't connect to VirtualCenter
                die "Unable to connect to VirtualCenter: $@";
        }

        # Grab the vCenter PerformanceManager object reference
        $perfMgrView = Vim::get_view(mo_ref => Vim::get_service_content()->perfManager());
}

eval {
        my $pct   = Opts::get_option("pct");
        my $param = Opts::get_option("param");
        my $vm    = Opts::get_option("vm");

        my $entity = Vim::find_entity_view( view_type => 'VirtualMachine', filter => { name => $vm } );
        die "Virtual machine '$vm' not found\n" unless defined $entity;

        # Read the values straight from the VM view's summary.
        # CPU figures are in MHz, memory figures are in MB.
        my $summary = $entity->{'summary'};
        my $maxCpu  = $summary->{'runtime'}->{'maxCpuUsage'};
        my $maxMem  = $summary->{'runtime'}->{'maxMemoryUsage'};
        my $cpuUse  = $summary->{'quickStats'}->{'overallCpuUsage'};
        my $memUse  = $summary->{'quickStats'}->{'guestMemoryUsage'};

        # Map each supported --param value to its (usage, maximum) pair
        my %usage = ( cpu => [ $cpuUse, $maxCpu ], mem => [ $memUse, $maxMem ] );
        die "Unknown --param '$param' (expected cpu or mem)\n" unless exists $usage{$param};

        my ($used, $max) = @{ $usage{$param} };
        die "No $param usage data available for '$vm'\n" unless defined $used;

        my $result;
        if ($pct) {
                # Avoid dividing by a missing or zero maximum
                die "No maximum $param value available for '$vm'\n" unless $max;
                $result = sprintf("%0.2f", ($used / $max) * 100);
        } else {
                $result = $used;
        }
        print "$result\n";
};
my $err = $@;

# Disconnect exactly once, whether or not the query succeeded
Util::disconnect();

if ($err) {
        print STDERR "Query failed: $err";
        exit 1;
}

And then we execute the script with the parameters:

--username ${vcenter_username} --password ${vcenter_password} --vm ${guest_name} --param mem --url https://${vcenter_hostname}/sdk/webService --pct=(0|1)
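
To make the --pct behaviour a bit more concrete, here is a minimal standalone sketch of the arithmetic with no VirtualCenter connection involved. The helper name usage_pct and the sample figures are made up for illustration; the script above takes its real values from the VM summary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the VI Perl Toolkit): convert a usage
# figure and its maximum into a percentage string with two decimals.
sub usage_pct {
    my ($used, $max) = @_;
    # Guard against a missing or zero maximum instead of dividing by zero.
    return undef unless defined $used && defined $max && $max > 0;
    return sprintf("%0.2f", ($used / $max) * 100);
}

# Sample figures only: 600 MHz used of 2400 MHz, 1536 MB used of 2048 MB.
print "cpu: ", usage_pct(600, 2400), "\n";    # prints "cpu: 25.00"
print "mem: ", usage_pct(1536, 2048), "\n";   # prints "mem: 75.00"
```

Without --pct (or with --pct=0) the script prints the raw figure instead, so the monitoring test has to know which of the two it is reading.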

Blam! Problem solved… Reliable CPU & Memory performance data on a per-guest basis.
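
If your monitoring tool wants a status rather than a number, the single line the script prints is easy to post-process. This is only a rough sketch under my own assumptions: classify_usage is a hypothetical helper, and the 80/95 thresholds are placeholders rather than recommendations. Most monitoring solutions will do this thresholding for you.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn the script's one-line output into a status.
# $warn and $crit are percentage thresholds chosen by the caller.
sub classify_usage {
    my ($line, $warn, $crit) = @_;
    $line = '' unless defined $line;
    $line =~ s/^\s+|\s+$//g;    # trim the trailing newline and any spaces
    # Anything that is not a plain non-negative number is treated as unknown.
    return 'UNKNOWN'  unless $line =~ /^\d+(?:\.\d+)?$/;
    return 'CRITICAL' if $line >= $crit;
    return 'WARNING'  if $line >= $warn;
    return 'OK';
}

# Sample output lines, as the script would print them with --pct=1.
print classify_usage("42.50\n", 80, 95), "\n";   # prints "OK"
print classify_usage("97.10\n", 80, 95), "\n";   # prints "CRITICAL"
print classify_usage("",        80, 95), "\n";   # prints "UNKNOWN"
```

Treating empty or non-numeric output as UNKNOWN keeps a failed VirtualCenter query from being read as zero usage.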

Let me know if you have any questions.

-dan

One Response to “Monitoring ESX Guests with PERL and VirtualCenter API”

  1. Hello Dan, I want to create a Perl script to get IOPS for the datastores. Can you please give me an example? Does the output need cooking like the WMI values, or are they plain values?


© 2013 Dan's Blog Suffusion theme by Sayontan Sinha