Perl Compatible Regular Expressions (PCRE) is a library written in C that implements a regular expression engine inspired by the capabilities of the Perl programming language. Philip Hazel started writing PCRE in summer 1997. PCRE's syntax is much more powerful and flexible than either of the POSIX regular expression flavors and than that of many other regular-expression libraries. While PCRE originally aimed at feature-equivalence with Perl, the two implementations are not fully equivalent. During the PCRE 7.x and Perl 5.9.x phase, the two projects coordinated development, with features being ported between them in both directions. A number of prominent open-source programs, such as the Apache and Nginx HTTP servers and the PHP and R scripting languages, incorporate the PCRE library; proprietary software can do likewise, as the library is BSD-licensed. As of Perl 5.10, PCRE is also available as a replacement for Perl's default regular-expression engine through the re::engine::PCRE module. The library can be built on Unix, Windows, and several other environments. PCRE is distributed with a POSIX C wrapper, a native C++ wrapper, several test programs, and the utility program pcregrep, which is built in tandem with the library.
Features
;Just-in-time compiler support
;Flexible memory management
;Consistent escaping rules
;Extended character classes
;Minimal matching
;Unicode character properties
;Multiline matching
;Newline/linebreak options
(*LF): Newline is a linefeed character. Corresponding linebreaks can be matched with \n.
(*CR): Newline is a carriage return. Corresponding linebreaks can be matched with \r.
(*CRLF): Newline/linebreak is a carriage return followed by a linefeed. Corresponding linebreaks can be matched with \r\n.
(*ANYCRLF): Any of the above encountered in the data will trigger newline processing. Corresponding linebreaks can be matched with (?:\r\n?|\n) or with \R. See below for configuration and options concerning what matches backslash-R.
(*ANY): Any of the above plus special Unicode linebreaks. When not in UTF-8 mode, corresponding linebreaks can be matched with (?:\r\n?|\n|\x0B|\f|\x85) or with \R. In UTF-8 mode, two additional characters are recognized as line breaks with (*ANY): LS (line separator, U+2028) and PS (paragraph separator, U+2029). On Windows, in non-Unicode data, some of the ANY linebreak characters have other meanings. For example, \x85 can represent a horizontal ellipsis, and if it is encountered while the ANY newline convention is in effect, it triggers newline processing. See below for configuration and options concerning what matches backslash-R.
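The linebreak sets described above can also be written out as explicit alternations. The sketch below uses Python's standard `re` module purely as a stand-in, since it is not PCRE and has no newline-convention options or \R. The names `ANYCRLF_BREAK`, `ANY_BREAK_8BIT`, `ANY_BREAK_UNICODE` and `split_lines` are hypothetical and chosen only for illustration.

```python
import re

# NOTE: Python's re module is not PCRE. These patterns are spelled-out
# equivalents of the linebreak sets, not PCRE newline options.

# CR, LF, or CRLF. "\r\n?" is tried first so that CRLF counts as one break.
ANYCRLF_BREAK = re.compile(r"(?:\r\n?|\n)")

# The above plus VT (\x0b), FF (\f) and NEL (\x85), as in non-UTF-8 data.
ANY_BREAK_8BIT = re.compile(r"(?:\r\n?|\n|\x0b|\f|\x85)")

# Adds LS (U+2028) and PS (U+2029), which apply to Unicode text.
ANY_BREAK_UNICODE = re.compile(r"(?:\r\n?|\n|\x0b|\f|\x85|\u2028|\u2029)")


def split_lines(text, linebreak=ANYCRLF_BREAK):
    """Split text on whichever linebreak set the given pattern describes."""
    return linebreak.split(text)


if __name__ == "__main__":
    print(split_lines("one\r\ntwo\rthree\nfour"))
    print(split_lines("a\x85b\u2028c", ANY_BREAK_UNICODE))
```

Ordering the alternation so that `\r\n?` comes before `\n` is what keeps a CRLF pair from being counted as two separate breaks.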
;Backslash-R options
;Beginning of pattern options
;Named subpatterns
;Backreferences
;Subroutines
;Atomic grouping
;Look-ahead and look-behind assertions
;Escape sequences for zero-width assertions
;Comments
;Recursive patterns
;Generic callouts
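Several of the listed features can be seen together in one small pattern: a named subpattern, a backreference to it, a zero-width \b assertion, a look-ahead, and an inline comment. The sketch below is only an illustration. It restricts itself to syntax that, to the best of our knowledge, both PCRE and Python's `re` module accept, and it runs on Python's engine rather than on PCRE itself. `DOUBLED_WORD` and `find_doubled_final_word` are hypothetical names.

```python
import re

# Illustration only: executed by Python's re, not by PCRE.
DOUBLED_WORD = re.compile(
    r"(?P<word>\b\w+\b)"      # named subpattern "word"
    r"(?#an inline comment)"  # comment group, ignored by the engine
    r"\s+(?P=word)\b"         # backreference to the named subpattern
    r"(?=\s*[.!?])"           # look-ahead: sentence punctuation must follow
)


def find_doubled_final_word(text):
    """Return a word repeated just before sentence punctuation, or None."""
    match = DOUBLED_WORD.search(text)
    return match.group("word") if match else None


if __name__ == "__main__":
    print(find_doubled_final_word("It was the the. End"))
    print(find_doubled_final_word("the the cat."))
```

The look-ahead checks for the punctuation without consuming it, so the match itself covers only the repeated words.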
Differences from Perl
Differences between PCRE and Perl include, but are not limited to:
;Recursive matches are atomic in PCRE and non-atomic in Perl
;The value of a capture buffer deriving from the ? quantifier when nested in another quantified capture buffer is different
;PCRE allows named capture buffers to be given numeric names; Perl requires the name to follow the rule of barewords
;PCRE allows alternatives within lookbehind to be different lengths
;PCRE does not support certain "experimental" Perl constructs
;PCRE and Perl are slightly different in their tolerance of erroneous constructs
;PCRE has a hard limit on recursion depth; Perl does not
With the exception of the above points, PCRE is capable of passing the tests in the Perl 't/op/re_tests' file, one of the main syntax-level regression tests for Perl's regular expression engine.