Regular expression to replace comma for a csv file import
Hi,
I am having trouble writing my own regular expression. I have tried a couple and am not even sure whether they are correct. If you know of any articles that explain regular expressions well, please let me know.
Here is my regular expression, in case you are able to fix it for me.
Regex.Replace(line, "([\w]+),([\w]+)", "$1<comma>$2")
I am trying to replace the commas in the text below with <comma>, and then restore them after splitting each column in the CSV file.
"Company","Job Title","First Name","Middle Name","Last Name","Business Street","Business Street 2","Business City","Business State","Business Postal Code","Business Country","Business Phone","Mobile Phone","Business Fax","E-mail Address","Web Page","Notes:
Dear customer,
I want to thank you for your purchase.
Sincerly,
Me.
"
I am able to delete all the line breaks, but I cannot tell which commas are delimiters and which are part of the text. Is there a way to replace a character and a line break together?
string.replace(",vbcrlf", "<comma><br>") << maybe?
"Company","Job Title","First Name","Middle Name","Last Name","Business
Street","Business Street 2","Business City","Business State","Business
Postal Code","Business Country","Business Phone","Mobile
Phone","Business Fax","E-mail Address","Web Page","Notes:Dear customer,I want to thank you for your purchase.Sincerly,Me."
[1915 byte] By [doank] at [2007-12-25]
You can use REs, but that is probably overkill for your needs. String.Replace should work for you. The only issue with it (and your RE) is that it'll pick up all commas.
String.Replace(",", "<comma>")
You cannot replace two different sets of tokens (commas and line breaks) in one operation. You'll have to call Replace twice.
However even this may be overkill for you. I assume you are reading a CSV file into memory and then processing it. It would probably be easier to use code similar to the following.
using (StreamReader rdr = new StreamReader(filename))
{
    string strLine = rdr.ReadLine();
    while (strLine != null)
    {
        // Break it up
        string[] tokens = strLine.Split(',');
        string strNewLine = String.Join("<comma>", tokens);

        // Do work

        // Next
        strLine = rdr.ReadLine();
    }
}
The above code relies on the framework to deal with end of lines and commas. It takes up a little more memory but is really efficient. Even better is the fact that you can preprocess each token before combining them back together. For example, if you need to support quoted commas, you can preprocess the returned tokens to combine tokens that start and end a quoted string.
Michael Taylor - 8/30/06
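The token preprocessing mentioned in the reply above could look roughly like the sketch below. It is only an illustration, not code from the thread: MergeQuoted and CountQuotes are hypothetical helper names, and it assumes quoted fields use plain double quotes and that a field never spans more than one line.

```csharp
using System;
using System.Collections.Generic;

static class CsvTokens
{
    // Hypothetical helper: rejoins the pieces produced by Split(',')
    // that belong to the same quoted field.
    public static List<string> MergeQuoted(string[] tokens)
    {
        var fields = new List<string>();
        string pending = null;

        foreach (string token in tokens)
        {
            // Put back the comma that Split removed if we are inside a quoted field.
            string current = pending == null ? token : pending + "," + token;

            // An odd number of quotes means the quoted field is still open.
            if (CountQuotes(current) % 2 == 1)
            {
                pending = current;
            }
            else
            {
                fields.Add(current);
                pending = null;
            }
        }

        // Keep an unterminated field rather than silently dropping it.
        if (pending != null)
        {
            fields.Add(pending);
        }
        return fields;
    }

    static int CountQuotes(string s)
    {
        int count = 0;
        foreach (char c in s)
        {
            if (c == '"') count++;
        }
        return count;
    }

    static void Main()
    {
        string line = "\"Acme, Inc.\",\"Sales\"";
        foreach (string field in MergeQuoted(line.Split(',')))
        {
            Console.WriteLine(field);
        }
    }
}
```

With this approach the <comma> placeholder is not needed at all, because the commas inside quotes are restored while the tokens are merged.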
I only want to replace the commas inside the double quotes so that I can use the String.Split(",") method. Before calling Split, I want to turn every comma inside double quotes into <comma> so that those commas are not split on. I believe a regular expression is the only way to tell them apart, or am I wrong? Either way I am still stuck, hehe. Thanks, Taylor.
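One way to do what is described above with a regular expression is to match each quoted field as a whole and replace the commas only inside the match. This is a minimal sketch rather than a confirmed solution from the thread: Protect is a hypothetical name, and it assumes the quotes on each line are balanced, so multi-line fields such as the Notes column would have to be joined into one line first.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuotedCommas
{
    // Hypothetical helper: replaces commas with <comma> only inside "..." fields.
    public static string Protect(string line)
    {
        // "\"[^\"]*\"" matches an opening quote, any non-quote characters,
        // and the closing quote. The evaluator rewrites just that match.
        return Regex.Replace(line, "\"[^\"]*\"",
            m => m.Value.Replace(",", "<comma>"));
    }

    static void Main()
    {
        string line = "\"Acme, Inc.\",\"Notes: Dear customer, thank you\"";

        // Only the delimiter comma is left, so Split yields two columns.
        string[] columns = Protect(line).Split(',');

        foreach (string column in columns)
        {
            // Restore the original commas after splitting.
            Console.WriteLine(column.Replace("<comma>", ","));
        }
    }
}
```

The pattern from the original post, ([\w]+),([\w]+), cannot make this distinction because it looks only at the characters next to each comma, not at whether the comma sits inside quotes.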
Are all linebreaks denoted as vbCrLf in ASP.NET?
line = line.Replace(vbCrLf, "<br>") ' Is there a reason why this line of code does not work?
It should replace this sentence:
"Hello World
Good Morning World
Good Afternoon World
Good Night World
"
Into:
"Hello World<br>Good Morning World<br>Good Afternoon World<br>Good Night World<br>"
But it doesn't, and I am not sure why.
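The thread does not say why the replacement fails, so the following is a guess. vbCrLf is the two-character sequence CR+LF, and it will never match if the text contains bare LF (or bare CR) line breaks, or if the text was read with ReadLine, which strips the line terminators. A sketch that normalizes all three variants before inserting <br> (ToHtmlBreaks is a hypothetical name):

```csharp
using System;

static class LineBreaks
{
    // Hypothetical helper: converts CR+LF, bare CR, and bare LF to <br>.
    public static string ToHtmlBreaks(string text)
    {
        return text
            .Replace("\r\n", "\n")   // collapse Windows line breaks first
            .Replace("\r", "\n")     // then any remaining bare CR
            .Replace("\n", "<br>");  // finally emit the HTML break
    }

    static void Main()
    {
        string text = "Hello World\nGood Morning World\r\nGood Night World\n";
        Console.WriteLine(ToHtmlBreaks(text));
    }
}
```

The order of the calls matters: replacing "\r\n" first prevents a Windows line break from turning into two <br> tags.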