
Getting the CPU Usage of a Xen Virtual Machine with the libvirt API

Category: CPU | Posted by: admin

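I recently ran into a problem while monitoring Xen PV virtual machines with Nagios. The monitored server collects data through nrpe, but a PV virtual machine does not show up in the host's process list. Running xm top vpsname shows the CPU usage of the virtual machine named vpsname, but its output is inconvenient to collect. Running xm list reports the cumulative CPU time, and the CPU usage can be computed from the difference between two readings, but that command can only be run by root because xm can also shut down or reboot virtual machines. Granting that privilege to nrpe could create a serious security problem.
Fortunately, libvirt provides an API, so I decided to write a Nagios plugin on top of it to meet my needs. The idea is to read the virtual machine's CPU time twice, record the system time at each reading, and compute the usage from the two differences. Because the CPU time and the system time are not read at exactly the same instant, the result contains a small error in theory.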
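
The calculation described above can be sketched as a small helper. This is only a minimal sketch under stated assumptions, not the plugin's actual code: the function name cpu_usage_percent and its parameters are hypothetical, the CPU time is assumed to be given in nanoseconds, and the wall-clock time is assumed to be given in microseconds.

```c
/* Hypothetical helper: CPU usage in percent computed from two samples.
 * cpu_ns_*  : cumulative CPU time of the VM, in nanoseconds (assumed unit)
 * wall_us_* : wall-clock time at each sample, in microseconds (assumed unit)
 * Returns -1.0 if the samples are not in increasing order. */
double cpu_usage_percent(unsigned long long cpu_ns_start,
                         unsigned long long cpu_ns_end,
                         long long wall_us_start,
                         long long wall_us_end)
{
    long long elapsed_us = wall_us_end - wall_us_start;
    double cpu_us;

    if (elapsed_us <= 0 || cpu_ns_end < cpu_ns_start) {
        return -1.0;
    }
    /* convert the CPU time difference from nanoseconds to microseconds */
    cpu_us = (double)(cpu_ns_end - cpu_ns_start) / 1000.0;
    return 100.0 * cpu_us / (double)elapsed_us;
}
```

For instance, a virtual machine that consumed 0.5 s of CPU time during a 1 s interval would be reported at 50%.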

1. To use the API, first install libvirt-devel

[root@aikaiyuan ~]# yum -y install libvirt-devel

2. My code is below; the file is named vCpu.c

/**
 * Program Name: vCpu.c
 * Author: steptodream
 * Description:A simple plugin to get vps cpu usage
 *             for nagios(nrpe) by libvirt api
 * Compile:gcc -o vCpu vCpu.c -lvirt
 */
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>     /* sleep */
#include <sys/time.h>   /* gettimeofday, struct timeval */
#include <libvirt/libvirt.h>
/* define the exit status for nagios */
#define OK       0
#define WARNING  1
#define CRITICAL 2
#define UNKNOWN  3
/* get the cumulative cpu time (in nanoseconds) of the given name */
unsigned long long getCpuTime(char *vpsName,virConnectPtr conn) {
    virDomainInfo info;
    virDomainPtr domain = NULL;
    int ret;
    /* Find the domain of the given name */
    domain = virDomainLookupByName(conn, vpsName);
    if (domain == NULL) {
        printf("Failed to find the vps called %s\n", vpsName);
        exit(OK);
    }
    /* Get the information of the domain */
    ret = virDomainGetInfo(domain, &info);
    virDomainFree(domain);
    if (ret < 0) {
        printf("Failed to get information for %s\n", vpsName);
        exit(OK);
    }
    return info.cpuTime;
}
int main(int argc,char * argv[])
{
    char *vpsName;             /* vps name */
    int  interval = 1;         /* check interval */
    double warning;            /* warning value */
    double critical;           /* critical value */
    double cpuUsage;           /* cpu usage of the vps */
    struct timeval startTime;  /* time of the first time to get cpu time */
    struct timeval endTime;    /* time of the second time to get cpu time */
    long long realTime;        /* wall-clock time between the two samples (us) */
    long long startCpuTime;    /* cpu time of the first time */
    long long endCpuTime;      /* cpu time of the second time */
    long long cpuTime;         /* cpu time used between the two samples (us) */
    char *output;              /* output data for nagios */
    int  ret;                  /* exit status for nagios */
    virConnectPtr conn;        /* connection pointer */
    switch (argc){
        case 5:
             interval = atoi(argv[4]);
             /* fall through: the other arguments are parsed the same way */
        case 4:
             vpsName  = argv[1];
             warning  = atof(argv[2]);
             critical = atof(argv[3]);
             break;
        default:
             printf("Usage:vCpu <vName> <warning> <critical> [interval]\n\n");
             return OK;
    }
    /* connect to local Xen Host */
    conn = virConnectOpenReadOnly(NULL);
    if (conn == NULL) {
        printf("Failed to connect to local Xen Host\n");
        return OK;
    }
    /* get cpu time the first time */
    startCpuTime = getCpuTime(vpsName, conn);
    /* get start time */
    if (gettimeofday(&startTime, NULL) == -1) {
        printf("Failed to get start time\n");
        return OK;
    }
    /* wait for some seconds  */
    sleep(interval);
    /* get cpu time the second time */
    endCpuTime = getCpuTime(vpsName, conn);
    /* get end time */
    if (gettimeofday(&endTime, NULL) == -1) {
        printf("Failed to get end time\n");
        return OK;
    }
    /* close connection */
    virConnectClose(conn);
    /* calculate the usage of cpu: cpu time is in ns, convert it to us */
    cpuTime = (endCpuTime - startCpuTime) / 1000;
    realTime = 1000000 * (endTime.tv_sec - startTime.tv_sec) +
        (endTime.tv_usec - startTime.tv_usec);
    cpuUsage = cpuTime / (double)(realTime);
    /* display cpuUsage by percentage */
    cpuUsage *= 100;

    /* make output data and exit status for nagios*/
    if (cpuUsage > critical) {
        output = "CRITICAL";
        ret    = CRITICAL;
    } else if (cpuUsage > warning){
        output = "WARNING";
        ret    = WARNING;
    } else {
        output = "OK";
        ret    = OK;
    }
    printf("%s CPU:%.2f%%|CPU=%.2f\n",output,cpuUsage,cpuUsage);
    return ret;
}

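3. Compile and test. To suit my needs, the program takes three required arguments and one optional argument: the virtual machine name vpsName, the warning threshold warning (a percentage), the critical threshold critical (a percentage), and the check interval interval (in seconds, default 1).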

[root@aikaiyuan ~]# gcc -o vCpu vCpu.c -lvirt
[root@aikaiyuan ~]# ./vCpu vmtest 1 2
OK CPU:0.20%|CPU=0.20

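You can also open another terminal and run xm top vmtest to get the CPU usage of vmtest, then check that the two values are close. Next, check the exit status, since that is what Nagios uses to determine the service state.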

[root@aikaiyuan ~]# echo $?
0

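Note that my requirement is specifically to check the CPU usage of a named virtual machine and to email the user when it exceeds the warning or critical value. For that reason, the program exits with the OK status whenever no data could be obtained or an error occurred.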
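
If you would rather have failures show up in Nagios instead of being reported as OK, the conventional plugin exit status 3 (UNKNOWN) is available for that purpose. The sketch below is an alternative to the behaviour chosen in this article, not what vCpu.c does. The function name status_for_usage and the convention of passing a negative usage for "no valid measurement" are assumptions made for illustration.

```c
/* Nagios plugin exit statuses */
enum { ST_OK = 0, ST_WARNING = 1, ST_CRITICAL = 2, ST_UNKNOWN = 3 };

/* Hypothetical helper: map a usage percentage to an exit status.
 * A negative usage is assumed to mean "no valid measurement". */
int status_for_usage(double usage, double warning, double critical)
{
    if (usage < 0.0) {
        return ST_UNKNOWN;
    }
    if (usage > critical) {
        return ST_CRITICAL;
    }
    if (usage > warning) {
        return ST_WARNING;
    }
    return ST_OK;
}
```

With this mapping, a failed lookup or a failed libvirt call would surface as UNKNOWN in Nagios, while the threshold logic stays the same as in vCpu.c.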

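Also, I have little development experience, so the code inevitably contains mistakes as well as unreasonable or incomplete parts (argument validation, for example). It is for reference only.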
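
As one possible way to add the missing argument validation, the thresholds could be parsed with strtod instead of atof, which makes malformed input detectable. This is a minimal sketch and not part of the original plugin. The helper name parse_percent is hypothetical, and rejecting negative or non-finite values is an assumption about what counts as a valid threshold.

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: parse a threshold given as a percentage.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_percent(const char *text, double *out)
{
    char *end = NULL;
    double value;

    if (text == NULL || *text == '\0') {
        return -1;
    }
    errno = 0;
    value = strtod(text, &end);
    /* reject range errors, empty conversions and trailing garbage */
    if (errno != 0 || end == text || *end != '\0') {
        return -1;
    }
    /* assumption: a threshold must be a finite, non-negative number */
    if (!isfinite(value) || value < 0.0) {
        return -1;
    }
    *out = value;
    return 0;
}
```

In main, a failed parse of argv[2] or argv[3] could then print the usage message instead of silently treating the input as 0, which is what atof does.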

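Finally, a word about the time units in the code: I use microseconds (us) throughout. The CPU time returned is in nanoseconds (ns), as the definition of the virDomainInfo structure shows: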

struct virDomainInfo {
    unsigned char      state;     /* the running state, one of virDomainState */
    unsigned long      maxMem;    /* the maximum memory in KBytes allowed */
    unsigned long      memory;    /* the memory in KBytes used by the domain */
    unsigned short     nrVirtCpu; /* the number of virtual CPUs for the domain */
    unsigned long long cpuTime;   /* the CPU time used in nanoseconds */
};

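The time that gettimeofday stores in a timeval structure is split into two fields with different units, tv_sec (seconds) and tv_usec (microseconds), as the definition of the timeval structure shows: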

struct timeval {
    time_t      tv_sec;     /* seconds */
    suseconds_t tv_usec;    /* microseconds */
};
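
Because the two fields use different units, the difference between two timeval values has to be combined into a single unit before it can be compared with the CPU time. A minimal sketch of that conversion follows. The helper name elapsed_us is hypothetical, and it assumes that end was sampled after start.

```c
#include <sys/time.h>

/* Hypothetical helper: microseconds elapsed from *start to *end.
 * The seconds difference is scaled to microseconds in 64-bit arithmetic,
 * then the (possibly negative) microseconds difference is added. */
long long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return 1000000LL * (end->tv_sec - start->tv_sec)
         + (end->tv_usec - start->tv_usec);
}
```

Note that the tv_usec difference may be negative (for example from 1.9 s to 3.1 s), which is why both parts are summed rather than handled separately.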

 

When reposting, please credit: 爱开源 » Getting the CPU Usage of a Xen Virtual Machine with the libvirt API
