Cash Talks - BS Walks
# Saturday, 01 January 2011

I had an interesting problem: I needed to protect text within quoted strings from a series of regex search-and-replace operations.  Writing even a simple regular expression replacement that excludes text in quotes was too painful for my understanding of regex.  I'm sure there are gurus who can toss that off, but I'm not one of them.

My job was more extensive than simple upper-casing, but as an illustration: what if you wanted to change all characters in a string to upper case except for the ones within double quotes?  For example:

The "quick brown fox" jumped over the "lazy dog"
would become
THE "quick brown fox" JUMPED OVER THE "lazy dog"

There are a number of quoted-string regular expressions out there, and I needed one that allows for an escaped " within the string, that is, a " preceded by a \.

The following regex replace call replaces each quoted string with the word STUFF:

[regex]::Replace($value,'"([^"\\]*(\\.[^"\\]*)*)"', 'STUFF')
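To see what this pattern actually picks up, here is a minimal sketch that lists the matches in a sample string and then runs the STUFF replacement. The sample text and the variable names ($pattern, $sample, $found, $replaced) are made up for illustration.

```powershell
# Same quoted-string pattern as above: an opening ", then any run of
# characters that are neither " nor \, with backslash-escaped pairs allowed.
$pattern = '"([^"\\]*(\\.[^"\\]*)*)"'

# Hypothetical sample text with one plain and one escaped quoted string.
$sample = 'say "hello" and "an \"escaped\" quote" here'

# Each match is a whole quoted string, including the outer quotes.
$found = @([regex]::Matches($sample, $pattern) | ForEach-Object { $_.Value })
$found
# "hello"
# "an \"escaped\" quote"

# A fixed replacement string swaps every quoted string for the same text.
$replaced = [regex]::Replace($sample, $pattern, 'STUFF')
$replaced
# say STUFF and STUFF here
```

The escaped quotes do not end the match early because each \" is consumed as a pair by the \\. part of the pattern.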

That is not much help by itself, but this is where the regex MatchEvaluator comes into play.  It's a function (in PowerShell, a script block cast to the delegate type) that gets called for each match with the Match object as its argument, and the text it returns is used as the replacement.  For example, what if I wanted to upper case only the characters within double quotes?  The following code makes quick work of it.

$value = 'The "quick brown fox" jumped over the "lazy dog"'

$QuotedTextMatchEvaluator = [System.Text.RegularExpressions.MatchEvaluator]{
       # $args[0] is the Match object; whatever the script block returns replaces the match
       $args[0].ToString().ToUpper()
}

[regex]::Replace($value,'"([^"\\]*(\\.[^"\\]*)*)"',$QuotedTextMatchEvaluator)

This produces
The "QUICK BROWN FOX" jumped over the "LAZY DOG"
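For the opening example (upper case everything except the quoted text), the same MatchEvaluator idea can also be applied in a single pass. The sketch below is an alternative technique, not the save-and-restore approach this post goes on to use: it adds a second alternative to the pattern that matches runs of unquoted text, and the evaluator decides what to do based on which alternative matched. The names $pattern, $UpperOutsideQuotes and $result are made up for illustration.

```powershell
$value = 'The "quick brown fox" jumped over the "lazy dog"'

# Group 1 is a quoted string (same shape as the pattern above);
# the second alternative is a run of text containing no double quote.
$pattern = '("[^"\\]*(?:\\.[^"\\]*)*")|[^"]+'

$UpperOutsideQuotes = [System.Text.RegularExpressions.MatchEvaluator]{
    param($m)
    # If group 1 took part in the match, this is quoted text: return it as-is.
    # Otherwise it is unquoted text: return it upper-cased.
    if ($m.Groups[1].Success) { $m.Value } else { $m.Value.ToUpper() }
}

$result = [regex]::Replace($value, $pattern, $UpperOutsideQuotes)
$result
# THE "quick brown fox" JUMPED OVER THE "lazy dog"
```

This works for a single transformation, but it gets awkward when a whole series of replacements has to skip the quoted text, which is the case the rest of this post deals with.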

Simple enough, but as I said, it was the opposite of what I needed, and I was doing more interesting work than generating upper case.  My idea was to grab all of the quoted strings, save them in an array, and then restore them when I was done.  The save-quotedstrings function replaces each quoted string with a unique token, and restore-quotedstrings replaces each token with the saved string.  While context will dictate what is possible as a token, I chose ASCII 1, a rarely used non-visible character, followed by an incrementing number for each quoted string.  save-quotedstrings returns the tokenized string and the array of saved quoted strings.  restore-quotedstrings takes a tokenized string and the saved string array and puts the quoted strings back.

 

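function save-quotedstrings([string]$value)
{
       # A list is used instead of "$array +=" because the evaluator script block may run in its
       # own scope, where += would only update a local copy and every token would come out as 1.
       $SavedStrings = New-Object System.Collections.ArrayList
       $QuotedTextMatchEvaluator = [System.Text.RegularExpressions.MatchEvaluator]{
              [void]$SavedStrings.Add($args[0].ToString())
              # Token: ASCII 1 followed by the 1-based index of the saved string
              "$([char]1)$($SavedStrings.Count)"
       }
       $tokenized = [regex]::Replace($value,'"([^"\\]*(\\.[^"\\]*)*)"',$QuotedTextMatchEvaluator)
       $tokenized,$SavedStrings.ToArray()
}

function restore-quotedstrings([string]$value,[array]$savedStrings)
{
       # Work from the highest token down so token 1 is never replaced inside token 10, 11, ...
       for ($i = $savedStrings.Count; $i -ge 1; $i--)
       {
              $value = $value.Replace("$([char]1)$i",$savedStrings[$i-1])
       }
       $value
}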

 

 

$value = 'This "is \"a\" test" of a "quoted string" saver'
$value
$value,$savedStrings = save-quotedstrings $value
$value
$value = $value.ToUpper()
$value
$value = restore-quotedstrings $value $savedStrings
$value

The preceding code generates the following output.

This "is \"a\" test" of a "quoted string" saver
This _1 of a _2 saver
THIS _1 OF A _2 SAVER
THIS "is \"a\" test" OF A "quoted string" SAVER

Note:  the _ is shown in place of the non-displayable ASCII 1.
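One thing to watch with a token made of a prefix character plus a number is that token 1 is also the start of token 10, and a digit that directly follows a quoted string in the text would run into the token number. A possible variation, sketched here under my own assumptions, closes each token with a second rarely used character (ASCII 2) and restores all tokens in one regex pass. The function names Save-DelimitedQuotedStrings and Restore-DelimitedQuotedStrings are hypothetical and exist only for this example.

```powershell
# Hypothetical variation: tokens look like <ASCII 1>n<ASCII 2>.
function Save-DelimitedQuotedStrings([string]$value)
{
    $saved = New-Object System.Collections.ArrayList
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
        param($m)
        [void]$saved.Add($m.Value)               # mutate the list; no reassignment
        "$([char]1)$($saved.Count)$([char]2)"    # token for this quoted string
    }
    $tokenized = [regex]::Replace($value, '"([^"\\]*(\\.[^"\\]*)*)"', $evaluator)
    $tokenized, $saved.ToArray()
}

function Restore-DelimitedQuotedStrings([string]$value, [array]$saved)
{
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
        param($m)
        # Group 1 holds the token number; the saved strings are 0-based.
        $saved[[int]$m.Groups[1].Value - 1]
    }
    [regex]::Replace($value, '\x01(\d+)\x02', $evaluator)
}

# Twelve quoted strings, so two-digit tokens are exercised.
$text = (1..12 | ForEach-Object { "item `"q$_`"" }) -join ' '
$tokenized, $saved = Save-DelimitedQuotedStrings $text
$upper = $tokenized.ToUpper()
$restored = Restore-DelimitedQuotedStrings $upper $saved
$restored
```

In this run the text outside the quotes comes back upper-cased while "q1" through "q12" keep their original case. The trade-off is that the text being processed must not already contain ASCII 1 or ASCII 2.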

Saturday, 01 January 2011 00:16:06 (GMT Standard Time, UTC+00:00)
Powershell
© Copyright 2017
Cash Foley