Match and Replace quotes in a string

Hi folks, I'm using regex in VB.NET to search through strings for quote characters and replace them with a new string: &quot;

So far I'm using this:

\B"\b([^"\x27]+?)\b"\B

with &quot;$1&quot; as the replacement string,

and as a test input string:

"Test" For "quote" "good then"

This returns: &quot;Test&quot; For &quot;quote&quot; &quot;good then&quot;

That's fine, unless other punctuation characters come after the quoted word but before the closing quote.

So, this string: "Test!" For "quote" "good then"

returns: "Test!" For &quot;quote&quot; &quot;good then&quot;

The two quotes wrapping the string 'Test!' are ignored.

I've found this to be true with other characters as well, and have tried to escape them inside the group, such as

\.\?, but that hasn't worked.

So, any ideas on how to match the quotes when the quoted string contains other punctuation?
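
For anyone reproducing this, here is a minimal VB.NET sketch that counts how many quoted runs the pattern finds in each test string. The module and function names (BoundaryDemo, QuoteMatches) are made up for illustration. The likely cause of the miss is the \b in front of the closing quote: \b only matches between a word character and a non-word character, and in Test!" the characters on either side of that position are ! and ", both non-word, so the match fails there.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BoundaryDemo
    ' Pattern from the question. Inside a VB string literal, "" is one literal quote.
    Const Pattern As String = "\B""\b([^""\x27]+?)\b""\B"

    ' Hypothetical helper: number of quoted runs the pattern matches.
    Function QuoteMatches(input As String) As Integer
        Return Regex.Matches(input, Pattern).Count
    End Function

    Sub Main()
        ' All three quoted runs end in a word character, so \b succeeds.
        Console.WriteLine(QuoteMatches("""Test"" For ""quote"" ""good then"""))   ' 3

        ' "Test!" is skipped: there is no word boundary between ! and the quote.
        Console.WriteLine(QuoteMatches("""Test!"" For ""quote"" ""good then"""))  ' 2
    End Sub
End Module
```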

Thanks!

[1185 byte] By [Gary7] at [2008-1-7]
# 1

Why can't you use a slight variation of your pattern? Match with

\x27([^x27]+?)\x27

then replace with

&quot;$1&quot;

It will render the target text

"Test!" For "quote" "good then"

to

&quot;Test!&quot; For &quot;quote&quot; &quot;good then&quot;

Your use of \b is ineffective, because you need to handle multi-word strings like

"good then"

which have white space in them, so \b loses its validity in this case.

SergeiZ at 2007-10-2 > top of Msdn Tech,.NET Development,Regular Expressions...
# 2

Typo in my previous post; I meant:

\x27([^\x27]+?)\x27

SergeiZ at 2007-10-2
# 3

Gary7 wrote:
I'm using regex in vb.net to search through strings for quote characters, and replace them with a new string: &quot;

If you really are just trying to replace every occurrence of the quote character with another string then using a regular expression seems to be overkill when you could just use String.Replace (String, String).


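' Each "" inside a VB string literal stands for one literal quote character.
Dim test1 As String = """Test!"" For ""quote"" ""good then"""

' String.Replace swaps every quote for the entity; no regex is needed.
Dim test2 As String = test1.Replace("""", "&quot;")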

FrankBoyne at 2007-10-2
# 4

Hi Sergei,

Did I miss something? Your expression doesn't return any matches.

Gary

Gary7 at 2007-10-2
# 5

Hi Frank, thanks for this!

I am going to include some routines like the ones you have indicated, but in reality I need to test for several different characters and replace them. This includes variants of characters as well, such as the differing versions of quotation marks, etc.

Which leads me to another question about search and replace:

Can one search for multiple characters of different types, and replace each with a different string, in one search pattern?

Thanks!

Gary

Gary7 at 2007-10-2
# 6

Yes, it was my fault: I used \x27 (single quote) instead of \x22 (double quote). Try this:

\x22([^\x22]+?)\x22
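
A minimal VB.NET sketch of this pattern used with Regex.Replace, assuming the replacement wanted is the HTML entity &quot; on each side of the captured text. The module and function names (QuoteReplaceDemo, EncodeQuotedRuns) are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteReplaceDemo
    ' Hypothetical helper: wraps the text of each double-quoted run in &quot; entities.
    Function EncodeQuotedRuns(input As String) As String
        ' \x22 is the double quote; $1 is the text captured between the quotes.
        Return Regex.Replace(input, "\x22([^\x22]+?)\x22", "&quot;$1&quot;")
    End Function

    Sub Main()
        Console.WriteLine(EncodeQuotedRuns("""Test!"" For ""quote"" ""good then"""))
        ' Prints: &quot;Test!&quot; For &quot;quote&quot; &quot;good then&quot;
    End Sub
End Module
```

Because the pattern no longer relies on \b, punctuation next to the closing quote does not affect the match.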

SergeiZ at 2007-10-2
# 7

Thanks for this, Sergei! I caught the hex; I think \x27 captures single quotes ('). Anyway, that does what I need and I thank you.

How about if I also want to match another character, with a separate replacement string? Can this be done in one expression, or should I trap them in another regex?

any ideas?

Thanks!

Gary7 at 2007-10-2
# 8

Please post more relevant samples of what you are up to, then we'll discuss. We cannot do it blindly, without target text.

SergeiZ at 2007-10-2
# 9

Hi Sergei,

I simply asked a question in principle. One could imagine any set of characters, I would think, but if not:

Suppose I want to search for more than just quotes in a string. Maybe I need to also find all the single quotes and apostrophes as well, and replace them with some other string of characters.

So with a single-item match, we used: \x22([^\x22]+?)\x22 which finds all standard double quotes by their hex value.

But what if the quotes are in a "pretty" state - inverted at one end of the word -- and thus aren't recognized by the hex value x22?

Since "pretty quotes" would be replaced the same way as regular quotes, it stands to reason we only need to match them along with regular quotes, and change both to the desired characters.

But what if I wanted to match ampersands and replace those with an HTML equivalent? Is this something that can be accomplished within the same expression, or would a new one need to be constructed?

That is the question, so do you need any more info?

Thanks!

Gary

Gary7 at 2007-10-2
# 10

it can be done in one regex, if that's your preference. the regex will be

\x22(?<group_inside_quotes>[^\x22]+?)\x22|(?<group_ampersand>&)

applied vs this target str:

"quoted text" and ampersand &

it will give you 2 matches:

quoted text

[and]

&

Next, pass a MatchEvaluator to Regex.Replace. It is called for each match and chooses the replacement based on the values of the two capturing groups:

<group_ampersand>

[and]

<group_inside_quotes>

implementing this logic:

if (length of <group_ampersand> > 0) {do replace}

if (length of <group_inside_quotes> > 0) {do replace}

*Pretty quotes* can be handled similarly, as another alternative in the OR-ed list of possibilities.
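
The evaluator logic above could be sketched in VB.NET roughly as follows. This is only an illustration under some assumptions: it uses &amp; and &quot; as the replacement strings, tests Group.Success instead of the group length, and the names EvaluatorDemo, ReplaceOne and Encode are made up.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EvaluatorDemo
    Const Pattern As String =
        "\x22(?<group_inside_quotes>[^\x22]+?)\x22|(?<group_ampersand>&)"

    ' Called once per match; picks the replacement from whichever group took part.
    Function ReplaceOne(m As Match) As String
        If m.Groups("group_ampersand").Success Then
            Return "&amp;"
        End If
        If m.Groups("group_inside_quotes").Success Then
            Return "&quot;" & m.Groups("group_inside_quotes").Value & "&quot;"
        End If
        Return m.Value ' Not reached with this pattern; kept as a safe default.
    End Function

    Function Encode(input As String) As String
        Return Regex.Replace(input, Pattern, New MatchEvaluator(AddressOf ReplaceOne))
    End Function

    Sub Main()
        Console.WriteLine(Encode("""quoted text"" and ampersand &"))
        ' Prints: &quot;quoted text&quot; and ampersand &amp;
    End Sub
End Module
```

Adding another alternative to the pattern, plus one more branch in ReplaceOne, extends the same idea to further characters.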

SergeiZ at 2007-10-2
# 11

Very nice!

Thank you, Sergei, for the discussion. It has done more than the countless articles, examples and sample chapters I've read over the past 4 days!

I'll be experimenting with this a bit. I'll certainly post back if needed!

Thanks again!

Gary7 at 2007-10-2
# 12

You're welcome, and happy regexing!

SergeiZ at 2007-10-2
# 13

Hi Sergei; I promised I'd have more questions, and of course I do!

So, running the expression you provided:

\x22(?<quotes_group>[^\x22]+?)\x22|(?<ampersand_group>&)

using this string:

"Test This Joe," "and Push & it & hard!"

returns this analysis:

Match 1: "Test This Joe," 0 16


Group "quotes_group": Test This Joe, 1 14


Group "ampersand_group" did not participate in the match


Match 2: "and Push & it & hard!" 17 23


Group "quotes_group": and Push & it & hard! 18 21


Group "ampersand_group" did not participate in the match

So group ampersand isn't finding the ampersand character.

Does this need to be built up with the brackets and caret? I thought the actual ampersand symbol didn't need to be escaped; no?
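
A likely explanation, offered as a sketch and not a definitive answer: both ampersands in this test string sit inside the second quoted run, so the first alternative consumes them as part of quotes_group and the regex never tries the second alternative at those positions. An ampersand outside any quotes would still be matched by ampersand_group, and no escaping is needed. One way to cover both cases, assuming &amp; and &quot; are the wanted replacements, is to encode the inner ampersands inside the evaluator. The names NestedAmpersandDemo, ReplaceOne and Encode are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NestedAmpersandDemo
    Const Pattern As String =
        "\x22(?<quotes_group>[^\x22]+?)\x22|(?<ampersand_group>&)"

    Function ReplaceOne(m As Match) As String
        If m.Groups("quotes_group").Success Then
            ' Ampersands inside the quoted run never become matches of their own,
            ' so encode them here before wrapping the text in &quot; entities.
            Dim inner As String = m.Groups("quotes_group").Value.Replace("&", "&amp;")
            Return "&quot;" & inner & "&quot;"
        End If
        ' Otherwise the match is a lone ampersand outside any quoted run.
        Return "&amp;"
    End Function

    Function Encode(input As String) As String
        Return Regex.Replace(input, Pattern, New MatchEvaluator(AddressOf ReplaceOne))
    End Function

    Sub Main()
        Console.WriteLine(Encode("""Test This Joe,"" ""and Push & it & hard!"""))
        ' Prints: &quot;Test This Joe,&quot; &quot;and Push &amp; it &amp; hard!&quot;
    End Sub
End Module
```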

Thanks!

Gary7 at 2007-10-2
# 14

I guess folks gave up at thank you!

I've carried this issue to Experts Exchange, and if anyone is interested:

http://tinyurl.com/3adlj4

Gary7 at 2007-10-2
