
Learning the PCRE library

Posted by admin under pcre

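Overview

PCRE (Perl Compatible Regular Expressions) is a regular expression library whose syntax and semantics follow Perl's. nginx uses this library for its regex support.

System: Mac OS X
PCRE version: 8.38 2015-11-23
Install path: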

$brew list pcre
/usr/local/Cellar/pcre/8.38/bin/pcre-config
/usr/local/Cellar/pcre/8.38/bin/pcregrep
/usr/local/Cellar/pcre/8.38/bin/pcretest
/usr/local/Cellar/pcre/8.38/include/ (6 files)
/usr/local/Cellar/pcre/8.38/lib/libpcre.1.dylib
/usr/local/Cellar/pcre/8.38/lib/libpcre16.0.dylib
/usr/local/Cellar/pcre/8.38/lib/libpcre32.0.dylib
/usr/local/Cellar/pcre/8.38/lib/libpcrecpp.0.dylib
/usr/local/Cellar/pcre/8.38/lib/libpcreposix.0.dylib
/usr/local/Cellar/pcre/8.38/lib/pkgconfig/ (5 files)
/usr/local/Cellar/pcre/8.38/lib/ (10 other files)
/usr/local/Cellar/pcre/8.38/share/doc/ (64 files)
/usr/local/Cellar/pcre/8.38/share/man/ (103 files)

Let's start with a simple example that prints the PCRE version:

$cat pretest.c
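#define PCRE_STATIC  /* only matters when linking statically on Windows */
#include <stdio.h>
#include <pcre.h>

int main(void) {
    /* pcre_version() returns the version number and release date as a string */
    const char *s = pcre_version();
    printf("version %s\n", s);
    return 0;
}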

Compile and run:

gcc pretest.c -I /usr/local/Cellar/pcre/8.38/include/ -L /usr/local/Cellar/pcre/8.38/lib/ -lpcre

$./a.out
version 8.38 2015-11-23

PCRE API

  • pcre_compile
pcre *pcre_compile(const char *pattern, int options,
           const char **errptr, int *erroffset,
           const unsigned char *tableptr);

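Compiles a regular expression into an internal structure. Compiling once and reusing the result speeds things up when the same pattern is matched against many strings.
Parameters:
pattern: a C string containing the regular expression
options: 0, or a bitwise OR of option flags
errptr: where the error message is returned
erroffset: offset in the pattern at which the error was found
tableptr: pointer to character tables, or NULL to use the built-in tables
See the man page for details.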

  • pcre_fullinfo
int pcre_fullinfo(const pcre *code, const pcre_extra *extra,
            int what, void *where);

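Returns information about a compiled pattern.
Parameters:
code: the compiled pattern, i.e. the return value of pcre_compile
extra: the return value of pcre_study(), or NULL
what: which piece of information to return (a PCRE_INFO_* constant)
where: pointer to the variable that receives the result
See the man page for details.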

  • pcre_study
pcre_extra *pcre_study(const pcre *code, int options,
            const char **errptr);

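Studies a compiled pattern and extracts information that can speed up matching.
Parameters:
code: the compiled pattern
options: option bits (0 for none)
errptr: where the error message is returned; set to NULL on success
See the man page for details.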

  • pcre_exec
int pcre_exec(const pcre *code, const pcre_extra *extra,
            const char *subject, int length, int startoffset,
            int options, int *ovector, int ovecsize);

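Matches a subject string against a compiled pattern, using an algorithm similar to Perl's. On success it returns one more than the highest-numbered captured substring (the whole match counts as number 0) and stores the match offsets in ovector. It returns 0 if ovector is too small, and a negative error code such as PCRE_ERROR_NOMATCH on failure.
Parameters:
code: the compiled pattern
extra: pointer to a pcre_extra structure; may be NULL
subject: the string to match against
length: length of the subject string, in bytes
startoffset: offset in the subject at which to start matching
options: option bits
ovector: pointer to an int array that receives the offsets
ovecsize: number of elements in ovector (should be a multiple of 3)
See the man page for details.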

Example

#define PCRE_STATIC
#include <stdio.h>
#include <string.h>
#include <pcre.h>

int main(void) {
    const char *err;
    int erroffset;
    const char *s = "<title>Hello World</title>";
    const char *p = "<title>(.*)</title>";
    size_t infosize;        /* PCRE_INFO_SIZE writes a size_t, not an int */
    int ovector[30] = {0};  /* room for 10 offset pairs; size is a multiple of 3 */

    /* Compile the pattern once. */
    pcre *re = pcre_compile(p, 0, &err, &erroffset, NULL);
    if (re == NULL) {
        printf("compile err: %s %d\n", err, erroffset);
        return 1;
    }

    /* Query the size of the compiled pattern, in bytes. */
    int n = pcre_fullinfo(re, NULL, PCRE_INFO_SIZE, &infosize);
    if (n < 0) {
        printf("fullinfo err: %d\n", n);
        pcre_free(re);
        return 1;
    }
    printf("fullinfo res: %zu\n", infosize);

    /* On success rc is the number of offset pairs set: whole match plus groups. */
    int rc = pcre_exec(re, NULL, s, (int)strlen(s), 0, 0, ovector, 30);
    if (rc < 0) {
        pcre_free(re);
        printf("pcre_exec %d\n", rc);
        return 1;
    }
    for (int i = 0; i < rc; i++) {
        /* Pair i holds the start and end offsets of substring i. */
        const char *substring_start = s + ovector[2*i];
        int substring_length = ovector[2*i+1] - ovector[2*i];
        printf("$%2d: %.*s\n", i, substring_length, substring_start);
    }
    pcre_free(re);
    return 0;
}

Run:

$./a.out
fullinfo res: 111
$ 0: <title>Hello World</title>
$ 1: Hello World

Source: https://github.com/vislee/leevis.com/issues/65

When reposting, please credit: 爱开源 » Learning the PCRE library
