Search for a set of words in a string with a regular expression
Hi,
I've got some strings that contain the following text:
1) "That film is very beautifull".
2) "How much beautifull is the film?".
3) "The film is boring".
If it is possible with Regex, I need
to check whether the string contains "film" AND "beautifull".
I've tried the following pattern: "film.*beaut.*",
but the IsMatch method returns true only for 1).
I'm not interested in the exact order of the words. Can I achieve this with a regex?
I know that I could use "(film.*beaut.*)|(beaut.*film.*)", but the number of words to search for is variable, so I don't want to write out every possible ordering of the words in the search pattern.
Thanks in advance
Marco
[810 byte] By [Abunet] at [2008-3-7]
The problem with your expression is that it only matches when film is followed by beautiful. Instead, use the expression
film|beautiful
Now use Matches to get all matches of either film or beautiful in the string.
Regex re = new Regex("film|beautiful");
foreach (Match match in re.Matches(input))
{
    Console.WriteLine(match.Value);
}
Michael Taylor - 11/13/06
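As a rough sketch of how the Matches approach could be turned into a yes/no answer for "are all the words present?", the distinct matched values can be collected and compared with the word list. The class and method names below are hypothetical, and the sketch assumes that no search word is contained in or overlaps another (with "art" and "party", for example, the alternation could consume "party" and never report "art").

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchesSketch
{
    // Hypothetical helper: true when every word in 'words' was matched in 'input'.
    // Assumes the words do not contain or overlap one another.
    public static bool ContainsAllWords(string input, string[] words)
    {
        // Escape each word so regex metacharacters are matched literally.
        string[] escaped = Array.ConvertAll(words, w => Regex.Escape(w));
        Regex re = new Regex(string.Join("|", escaped), RegexOptions.IgnoreCase);

        // Record each distinct word that was matched, ignoring case.
        HashSet<string> found = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match match in re.Matches(input))
        {
            found.Add(match.Value);
        }

        // Every wanted word must appear among the matched values.
        HashSet<string> wanted = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
        return wanted.IsSubsetOf(found);
    }
}
```

Calling MatchesSketch.ContainsAllWords(input, new[] { "film", "beaut" }) still scans the whole string once, but the pattern itself only grows by one alternative per word.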
Thanks Michael,
Are you saying that there is no pattern that checks whether n words (in any order) are present in a string?
Take this string, for example:
string input = "That film is very beautiful. One of the Best film I've ever seen";
I need to know whether film and beautiful are both present.
Regex re = new Regex("film|beautiful");
foreach (Match match in re.Matches(input))
{
    Console.WriteLine(match.Value);
}
Your test will return three matches (film, beautiful, film) in: "That film is very beautiful. One of the Best film I've ever seen"
Maybe you haven't understood the question. I know how to achieve my result. For example, I can use the String.IndexOf method to get the position of each string I'm searching for:
string input = "That film is very beautiful. One of the Best film I've ever seen";
string[] words = "film beautiful".Split(new string[] { " " }, System.StringSplitOptions.RemoveEmptyEntries);
bool foundAllWords = true;
foreach (string w in words)
{
    // Compare case-insensitively so that "Film" in the input also counts.
    if (input.IndexOf(w, System.StringComparison.OrdinalIgnoreCase) < 0)
    {
        foundAllWords = false;
        break; // one missing word is enough to fail
    }
}
if (foundAllWords)
    Console.WriteLine("The phrase matches the pattern");
else
    Console.WriteLine("The phrase doesn't match the pattern");
But unfortunately, it doesn't answer my question.
To test for both you can use this expression:
film.*beautiful|beautiful.*film
I believe it will work. The problem with your original expression was that it would only detect film followed by beautiful. The above expression should handle either.
Michael Taylor - 11/14/06
Michael,
In my first post, I already wrote that I can use a pattern like "film.*beautiful|beautiful.*film".
But what happens if I need to search for four words? Do I need to write every ordering of them into the pattern?
I wrote:
I know that I could do "(film.*beaut.*)|(beaut.*film.*)" but the number of the words to search is variable, so I don't want to perform all the possible combinations of the words in the search pattern.
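One way to avoid enumerating every ordering, offered here only as a sketch and not something proposed in the thread, is to build one positive lookahead per word, so the pattern grows linearly with the number of words instead of factorially (four words would otherwise need 24 alternatives). The names LookaheadSketch, BuildAllWordsPattern and ContainsAllWords are hypothetical; RegexOptions.Singleline is an assumption so that .* can also cross line breaks, and RegexOptions.IgnoreCase is an assumption about the desired matching.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class LookaheadSketch
{
    // Hypothetical helper: builds "^(?=.*film)(?=.*beaut)" from { "film", "beaut" }.
    public static string BuildAllWordsPattern(string[] words)
    {
        StringBuilder pattern = new StringBuilder("^");
        foreach (string w in words)
        {
            // Each lookahead asserts that the word occurs somewhere after the
            // start of the input, without consuming any characters, so the
            // order in which the words appear does not matter.
            pattern.Append("(?=.*").Append(Regex.Escape(w)).Append(")");
        }
        return pattern.ToString();
    }

    // True when every word occurs somewhere in 'input', in any order.
    public static bool ContainsAllWords(string input, string[] words)
    {
        Regex re = new Regex(BuildAllWordsPattern(words),
                             RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return re.IsMatch(input);
    }
}
```

With the three strings from the first post and the words film and beaut, LookaheadSketch.ContainsAllWords should report a match for 1) and 2) but not for 3). Like the original patterns, this matches substrings rather than whole words.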