Comparison of regular expression engines
This is a comparison of regular expression engines.
Libraries
List of regular expression libraries
Name | Official website | Programming language | Software license | Used by |
Boost.Regex[Note 1] | Boost C++ Libraries | C++ | Boost | |
Boost.Xpressive | Boost C++ Libraries | C++ | Boost | |
CL-PPCRE | Edi Weitz | Common Lisp | BSD | |
cppre | Jeff Stuart | C++ | GPL | |
DEELX | RegExLab | C++ | Free personal and commercial use | |
FREJ[Note 2] | Fuzzy Regular Expressions for Java | Java | LGPL | |
GLib/GRegex[Note 3] | GLib reference manual | C | LGPL | |
GRETA | Microsoft Research | C++ | ? | |
ICU | International Components for Unicode | C, C++[Note 4] | ICU | Foundation (Apple and Swift open-source versions) |
Jakarta/Regexp | The Apache Jakarta Project | Java | Apache | |
java.util.regex | Java's User manual | Java | Java license | jEdit |
JRegex | JRegex | Java | BSD | |
Oniguruma | Kosako | C | BSD | Atom, Take Command Console, Tera Term, TextMate, Sublime Text and SubEthaEdit |
Pattwo | Stevesoft | Java (compatible with Java 1.0) | LGPL | |
PCRE | pcre.org | C, C++[Note 5] | BSD | Nginx, Julia, HHVM, Notepad++ |
Qt/QRegExp | Digia | C++ | Qt GNU GPL v. 3.0, Qt GNU LGPL v. 2.1, Qt Commercial | Kate, Kile |
regex - Henry Spencer's regular expression libraries | ArgList | C | BSD | |
RE2 | RE2 | C++ | BSD | |
Henry Spencer's Advanced Regular Expressions | Tcl | C | BSD | |
SubReg | Matt Bucknall | C | MIT | |
TRE[Note 2] | Ville Laurikari | C | BSD | |
TPerlRegEx | TPerlRegEx VCL Component | Object Pascal | MPLv1.1 | |
TRegExpr | RegExp Studio | Object Pascal | Dual-license: freeware, or LGPL with static linking exception | |
RGX | RGX | C++ based component library | P6R | |
XRegExp | XRegExp | JavaScript | MIT | |
Wolfram Language (Mathematica) | Wolfram Language Documentation Center | Wolfram Language | | Mathematica, the Wolfram Development Platform |
- Note 1: Formerly called Regex++.
- Note 2: One of the fuzzy regular expression engines.
- Note 3: Included since version 2.13.0.
- Note 4: ICU4J, the Java version, does not support regular expressions.
- Note 5: C++ bindings were developed by Google and became officially part of PCRE in 2006.
Languages
List of languages and frameworks including regular expression support
Language | Official website | Software license | Remarks |
.NET | MSDN | MIT License[Note 1][Note 2] | |
POSIX C (C) | libc/regex from BSD | BSD | According to regex(3), available from at least 4.4BSD (if not earlier) |
C++11 (C++) | C++ standards website | ? | Since ISO/IEC 14882:2011(E) |
D | D | Boost Software License[Note 3] | |
Go | Golang.org | BSD-style | |
Haskell | Haskell.org | BSD3 | Omitted in the language report, and in GHC's Hierarchical Libraries |
Java | Java | GNU General Public License | REs are written as strings in source code: all backslashes must be doubled, harming readability. |
JavaScript (ECMAScript) | ECMA-262 | BSD3 | Limited, but REs are first-class citizens of the language with a specific /.../mod syntax. |
Julia | JuliaLang.org | MIT License | REs are part of the language core library, which uses PCRE; an optional wrapper for ICU (C code) is also available. |
Lua | Lua.org | MIT License | Uses a simplified, limited dialect; can be bound to a more powerful library such as PCRE, or to an alternative parser such as LPeg. |
Mathematica | Wolfram | Proprietary | |
Free Pascal (Object Pascal) | www.freepascal.org | LGPL with static linking exception | Free Pascal 2.6+ ships with TRegExpr from Sorokin and two other regular expression libraries; see wiki.lazarus.freepascal.org/Regexpr. |
OCaml | Caml | LGPL | As of 2010, the standard module is generally regarded as deprecated;[1] often recommended libraries are pcre (with full support for PCRE) and re (which is not as complete but claims better performance and provides frontends to popular syntaxes: PCRE, Perl, POSIX, Emacs, shell globbing). |
Perl | Perl.com | Artistic License, or GNU General Public License | Full, central part of the language |
PHP | PHP.net | PHP License | Has two implementations, with PCRE being the more efficient in terms of speed and functions |
Python | python.org | Python Software Foundation License | Python has two major implementations: the built-in re and the regex library. |
Ruby | ruby-doc.org | GNU Library General Public License | Ruby 1.8 and 1.9 use different engines; 1.9 integrates Oniguruma. |
SAP ABAP | SAP.com | Proprietary | |
Tcl | tcl.tk | Tcl/Tk License (BSD-style) | Tcl library doubles as a regular expression library. |
ActionScript 3 | ActionScript Technology Center | Free | |
Wolfram Language | Wolfram Research | Proprietary; usable for free on a limited scale on the Wolfram Development platform. | |
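The remark on Java above, that regular expressions written as ordinary string literals need every backslash doubled, can be illustrated with a minimal sketch. Python is used here only as a stand-in language (an assumption of this example, not something the table prescribes): its ordinary string literals need the same doubling, while its raw string literals do not. The variable names are hypothetical.

```python
import re

# Ordinary string literal: every regex backslash has to be doubled,
# which is the readability cost the remark describes.
doubled = "\\d+\\.\\d+"

# Raw string literal: the pattern is written exactly as the engine sees it.
raw = r"\d+\.\d+"

# Both literals denote the same pattern text.
same_pattern = doubled == raw

# The pattern matches a decimal number such as "3.14".
match = re.search(doubled, "version 3.14 released")
```

Both spellings compile to the same expression; only the source-code readability differs.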
Language features
NOTE: An application that uses a library for regular expression support does not necessarily offer the library's full set of features; e.g. GNU grep, which uses PCRE, does not offer look-ahead support, though PCRE does.
Part 1
Language feature comparison (part 1)
| "+" quantifier | Negated character classes | Non-greedy quantifiers[Note 1] | Shy groups[Note 2] | Recursion | Look-ahead | Look-behind | Backreferences[Note 3] | >9 indexable captures |
Boost.Regex | Yes | Yes | Yes | Yes | Yes[Note 4] | Yes | Yes | Yes | Yes |
Boost.Xpressive | Yes | Yes | Yes | Yes | Yes[Note 5] | Yes | Yes | Yes | Yes |
CL-PPCRE | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
EmEditor | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | No |
FREJ | No[Note 6] | No | Some[Note 6] | Yes | No | No | No | Yes | Yes |
GLib/GRegex | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
GNU grep | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | ? |
Haskell | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
ICU Regex | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
Java | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
JavaScript (ECMAScript) | Yes | Yes | Yes | Yes | No | Yes | No | Yes | Yes |
JGsoft | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
Lua | Yes | Yes | Some[Note 7] | No | No | No | No | Yes | No |
.NET | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
OCaml | Yes | Yes | No | No | No | No | No | Yes | No |
OmniOutliner 3.6.2 | Yes | Yes | Yes | No | No | No | No | ? | ? |
PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Perl | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
PHP | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Python | Yes | Yes | Yes | Yes | Yes[Note 8] | Yes | Yes | Yes | Yes |
Qt/QRegExp | Yes | Yes | Yes | Yes | No | Yes | No | Yes | Yes |
R[Note 9] | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
RE2 | Yes | Yes | Yes | Yes | No | No | No | No | Yes |
Ruby | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
TRE | Yes | Yes | Yes | Yes | No | No | No | Yes | No |
Vim 7.4b.000 (28 July 2013) | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | No |
RGX | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
Tcl | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
TRegExpr | Yes | ? | Yes | ? | ? | ? | ? | ? | ? |
XRegExp | Yes | Yes | Yes | Yes | No | Yes | No | Yes | Yes |
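Several of the features compared in part 1 can be shown in a short sketch. Python's built-in re module is used purely for illustration (an assumption of this example; the table lists Python as supporting each of these features), and the sample strings and variable names are hypothetical.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Non-greedy quantifier: ".+?" stops at the first closing bracket,
# whereas the greedy ".+" runs to the last one.
greedy = re.findall(r"<.+>", text)
lazy = re.findall(r"<.+?>", text)

# Shy (non-capturing) group: "(?:...)" groups without creating a capture,
# so the only numbered group here is "(c)".
shy = re.match(r"(?:ab)+(c)", "ababc")

# Look-ahead and look-behind: assert surrounding context without consuming it.
ahead = re.findall(r"\w+(?=\()", "foo(1) + bar + baz(2)")
behind = re.findall(r"(?<=\$)\d+", "cost: $30, weight: 12")

# Backreference: "\1" must repeat whatever group 1 matched.
repeated = re.search(r"\b(\w+) \1\b", "this is is a test")
```

Here `lazy` holds the four individual tags while `greedy` holds a single match spanning the whole string, and `repeated` finds the duplicated word "is is".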
Part 2
Language feature comparison (part 2)
| Directives[Note 1] | Conditionals | Atomic groups[Note 2] | Named capture[Note 3] | Comments | Embedded code | Unicode property support [2] | Balancing groups[Note 4] | Variable-length look-behinds[Note 5] |
Boost.Regex | Yes | Yes | Yes | Yes | Yes | No | Some[Note 6] | No | No |
Boost.Xpressive | Yes | No | Yes | Yes | Yes | No | No | No | No |
CL-PPCRE | Yes | Yes | Yes | Yes | Yes | Yes | Some[Note 6] | No | No |
EmEditor | Yes | Yes | ? | ? | Yes | No | ? | No | No |
FREJ | No | No | Yes | Yes | Yes | No | ? | No | No |
GLib/GRegex | Yes | Yes | Yes | Yes | Yes | No | Some[Note 6] | No | No |
GNU grep | Yes | Yes | ? | Yes | Yes | No | No | No | No |
Haskell | ? | ? | ? | ? | ? | No | No | No | No |
ICU Regex | Yes | No | Yes | No | Yes | No | Yes | No | No |
Java | Yes | No | Yes | Yes[Note 7] | Yes | No | Some[Note 6] | No | No |
JavaScript (ECMAScript) | No | No | No | No | No | No | No | No | No |
JGsoft | Yes | Yes | Yes | Yes | Yes | No | Some[Note 6] | No | Yes |
Lua | No | No | No | No | No | No | No | No | No |
.NET | Yes | Yes | Yes | Yes | Yes | No | Some[Note 6] | Yes | Yes |
OCaml | No | No | No | No | No | No | No | No | No |
OmniOutliner 3.6.2 | ? | ? | ? | ? | No | No | ? | No | No |
PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
Perl | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
PHP | Yes | Yes | Yes | Yes | Yes | No | No | No | No |
Python | Yes | Yes | Yes[Note 8] | Yes | Yes | No | Yes[Note 9] | No | Yes[Note 8] |
Qt/QRegExp | No | No | No | No | No | No | No | No | No |
RE2 | Yes | No | ? | Yes | No | No | Some[Note 6] | No | No |
Ruby | Yes | Yes | Yes | Yes | Yes | Yes | Some[Note 6] | No | No |
Tcl | Yes | No | Yes | No | Yes | No | Yes | No | No |
TRE | Yes | No | No | No | Yes | No | ? | No | No |
Vim | Yes | No | Yes | No | No | No | No | No | Yes |
RGX | Yes | Yes | Yes | Yes | Yes | No | Yes | No | No |
XRegExp | Leading only | No | No | Yes | Yes | No | Yes | No | No |
- Note 1: Also known as flag modifiers or option letters. Example pattern: "(?i:test)".
- Note 2: Also called independent sub-expressions.
- Note 3: Similar to backreferences, but with names instead of indices.
- Note 4: A special feature that allows matching balanced constructs without recursion.
- Note 5: Refers to the possibility of including quantifiers in look-behinds, thus making their length unpredictable.
- Note 6: Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.
- Note 7: Available as of JDK7.
- Note 8: Supported by the optional regex library only.
- Note 9: May only be available in the regex library when used with Python versions after 3.3.
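A few of the part 2 features and notes can be sketched in code. Python's built-in re module is assumed here for illustration only; the scoped directive form "(?i:test)" from the first note requires Python 3.6 or later, and all variable names and sample strings are hypothetical.

```python
import re

# Directive (inline flag) scoped to a group, as in the pattern "(?i:test)":
# only "test" is case-insensitive, the trailing "ing" is not.
scoped = re.search(r"(?i:test)ing", "TESTing")
not_scoped = re.search(r"(?i:test)ing", "TESTING")

# Named capture: groups are retrieved by name instead of by index.
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2013-07")

# Comments: "(?#...)" is ignored by the engine.
commented = re.match(r"\d+(?#digits)px", "12px")

# Variable-length look-behind: the built-in re module rejects a quantifier
# inside a look-behind at compile time, consistent with the note that
# Python supports this only through the optional regex library.
try:
    re.compile(r"(?<=a+)b")
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False
```

The failed compile in the last step is the practical meaning of a "No" in the variable-length look-behinds column: the engine needs to know the look-behind's width in advance.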
API features
- Note 1: Means the format can be used internally without explicit conversion.
- Note 2: Partial match of the whole regular expression. For example, the pattern ".*END$" will match any string partially, but only strings ending with END fully.
- Note 3: Supports Unicode 4.0 standard from 2003; latest plans for JDK7 include Unicode 6.0 (2011) support.
- Note 4: Implementation uses original UCS-2 support/features, so it only recognizes 64K characters in total (vs. UTF-16's 1,112,064 characters). A Microsoft developer representative answered a bug report on this as "will not fix" in 2010.
- Note 5: Since version 8.30.
- Note 6: Tcl includes facilities to convert to and from UTF-8.
- Note 7: wxRegEx uses any system-supplied POSIX library or, if none is available and for Unicode mode, Henry Spencer's library.