Please note javascript is required for full website functionality.

Blog

A to Z of Excel Functions: The REGEXEXTRACT function

26 August 2024

Welcome back to our regular A to Z of Excel Functions blog.  Today we look at the REGEXEXTRACT function.

 

The REGEXEXTRACT function

The term “regex” is an abbreviation of “regular expressions” and is frequently implemented in various programming languages such as C, C++, Java, Python, VBScript – and now,Excel.

Microsoft has stated that the version of Regex coming to Excel uses a “flavor” (sic) called PCRE2 (PHP>=7.3) for those that need to know the underlying technical stuff. 

Clearly, we need to learn a little about “regular expressions” before continuing.  Alternatively referred to “rational expressions” upon occasion, a regular expression is a sequence of characters that specifies what is known as a “match pattern” in text.  You have most likely used this functionality in Excel already, with features such as “Find and Relace” or by using the FIND or SEARCH functions in Excel.  The purpose of this function is to help you match, locate and manage text (strings) in Excel.

The text is obvious but understanding patterns requires you to learn the syntax for regular expressions.  Here is a crash course table, which summarises some – but not all – of the main elements, usually referred to as “tokens”.

Now we are all experts in regex, let’s consider the REGEXEXTRACT function.  This function is used extract one or more strings that match a specified pattern from the text being analysed.  You may extract the first match, all matches or capturing groups from the first match.  Its syntax is as follows:

REGEXEXTRACT(text, pattern, [return_mode], [ignore_case])

It has the following three arguments:

  • text: this is required, and represents the text you are searching within
  • pattern: this is also required.  This is the regular expression to be applied
  • return_mode: the first of two optional arguments, this specifies which matches to return.  It has three alternatives:

 


    Capturing groups are part of a regular expression (“regex”) pattern surrounded by parentheses “(…)”.  They allow you to return separate parts of a single match individually
  • ignore_case: the final (optional) argument.  This determines whether the match should be case sensitive.  It has the following two [2] options:

This function always returns text values.  You may convert these results back to numerical values using the VALUE function.

Consider the following examples:


We’ll continue our A to Z of Excel Functions soon.  Keep checking back – there’s a new blog post every business day.  
A full page of the function articles can be found here


Newsletter