Hello, I made some regular expression functions in Power Query to enhance text processing capabilities.
Background:
Currently, Power Query primarily relies on basic text functions such as Text.Contains() and Text.Select(), which do not support complex text cleansing requirements. Although users can implement regular expression functionality via Python or custom functions, this approach increases complexity and reduces both readability and maintainability.
It is worth noting that Excel has already added three regular expression functions for Microsoft 365 users in 2024: REGEXTEST, REGEXEXTRACT, and REGEXREPLACE
https://techcommunity.microsoft.com/blog/microsoft365insiderblog/new-regular-expression-regex-functi...

Suggestions:
Based on Excel's implementation, I have already finished custom functions in Power Query for your reference:
Text.RegexTest
Text.RegexTest(text as text, pattern as text, optional case_sensitivity as logical) as logical

Tests if the specified text matches a given regular expression pattern. Returns true if it matches, otherwise returns false. case_sensitivity is optional, used to indicate whether to be case-sensitive.
let
// Implementation of Text.RegexTest function to check if a string matches a given regex pattern
Text.RegexTest = (text as text, pattern as text, optional caseSensitivity as number) =>
let
// Set default caseSensitive to 0 (case-insensitive) if not provided
caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
// Set regex modifier based on case sensitivity setting (0 = case-sensitive, otherwise case-insensitive)
regexModifier = if caseSensitivityValue = 0 then "" else "i",
// Construct the JavaScript regex expression
regexExpression = "/" & pattern & "/" & regexModifier,
// Create HTML with embedded JavaScript to evaluate the regex pattern
htmlContent =
"<script>" &
"var matches = """ & text & """.match(" & regexExpression & ");" &
"document.write(matches ? 'true' : 'false');" &
"</script>",
// Execute the JavaScript using Web.Page and extract the result
resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0},
// Convert string result to a logical value
resultValue = if resultText = "true" then true else false
in
resultValue
in
Text.RegexTest
Text.RegexExtract
Text.RegexExtract(text as text, pattern as text, optional return_mode as text, optional case_sensitivity as logical) as text

Extracts substrings from text that match the regular expression pattern. return_mode is optional, used to specify the return mode (e.g., return first match, complete match, or capturing groups), case_sensitivity is optional, used to control whether to be case-sensitive.
let
// Text.RegexExtract function to extract content matching a given regex pattern
Text.RegexExtract = (
text as text,
pattern as text,
optional returnMode as number,
optional caseSensitivity as number
) =>
let
// Set default return mode (0 = first match only) if not provided
returnModeValue = if returnMode is null then 0 else returnMode,
// Set default case sensitivity (0 = case-sensitive) if not provided
caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
// Set regex modifier based on case sensitivity (g = global, i = case-insensitive)
regexModifier = if caseSensitivityValue = 1 then "g" else "gi",
// Construct the JavaScript regex expression
regexExpression = "/" & pattern & "/" & regexModifier,
// Generate JavaScript logic based on return mode:
// 0 = first match only
// 1 = all matches as array
// other = capturing groups
javascriptLogic = if returnModeValue = 0 then
"var match = str.match(regex); var res = match ? match[0] : null;"
else if returnModeValue = 1 then
"var res = str.match(regex);"
else
"var match = regex.exec(str); res = match ? match.slice(1) : null;",
// Build HTML content with embedded JavaScript
htmlContent = Text.Combine({
"<script>",
"var regex = ", regexExpression, ";",
"var str = """, text, """;",
javascriptLogic,
"document.write(res)",
"</script>"
}),
// Execute JavaScript via Web.Page and extract the result
resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
// Return extracted text
resultText
in
Text.RegexExtract
Text.RegexReplace
Text.RegexReplace(text as text, pattern as text, replacement as text, optional occurrence as number, optional case_sensitivity as logical) as text

Replaces parts of the text that match the regular expression pattern with the specified replacement string. occurrence is optional, specifies the index of the match to replace (if omitted, replaces all matches by default), case_sensitivity is optional, used to indicate whether to be case-sensitive.
let
// Text.RegexReplace function to replace content matching a regex pattern with specified text
Text.RegexReplace = (
text as text, // Original text
pattern as text, // Regex pattern (e.g., "[0-9]{3}-")
replacement as text, // Replacement text
optional occurrence as number, // Which occurrence to replace (0 = all, default)
optional caseSensitivity as number // Case sensitivity (0 = case-sensitive, default)
) =>
let
// Handle default parameters
occurrenceValue = if occurrence = null then 0 else occurrence,
caseSensitivityValue = if caseSensitivity = null then 0 else caseSensitivity,
// Generate JavaScript regex with appropriate modifiers
regexModifier = if caseSensitivityValue = 0 then "g" else "gi",
regexExpression = "/" & pattern & "/" & regexModifier,
// Build JavaScript replacement logic
javascriptCode =
"var regex = " & regexExpression & ";" &
"var str = '" & text & "';" &
"var replacement = '" & replacement & "';" &
"var occ = " & Text.From(occurrenceValue) & ";" &
"
var matches = [];
var match;
while ((match = regex.exec(str)) !== null) {
matches.push({
index: match.index,
length: match[0].length
});
}
if (occ === 0) {
// Replace all occurrences
document.write(str.replace(regex, replacement));
} else {
// Calculate target position
var targetIndex;
if (occ > 0) {
targetIndex = occ - 1;
} else {
targetIndex = matches.length + occ;
}
if (targetIndex < 0 || targetIndex >= matches.length) {
document.write(str);
} else {
var target = matches[targetIndex];
var res = str.substring(0, target.index) +
replacement +
str.substring(target.index + target.length);
document.write(res);
}
}
",
// Execute JavaScript and get result
htmlContent = "<script>" & javascriptCode & "</script>",
resultText = Web.Page(htmlContent)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
resultText
in
Text.RegexReplace
I think these custom functions will enable Power Query users to handle complex text pattern matching tasks more efficiently. I look forward to any further discussion with you!