Regular expressions, also known as regex, define patterns that are used in C# to match, find, or replace text inside strings. A regex pattern is a special string that describes the criteria by which a string is searched, matched, or replaced.
In addition to Regex, the String and Array classes in C# contain a variety of functions that provide string manipulation capabilities, for example Split(), IndexOf(), and Find().
Both Regex and the String/Array functions can handle tasks such as splitting strings and finding specific elements within a string.
In this article, you will see, with the help of examples, how to use Regex and String/Array functions to split a string, to find the index of a string within another string, and to find different elements within a string. At the end of each section, a final verdict is given on whether to use Regex or the String/Array functions for the corresponding task.
Table of Contents
- Regular Expressions vs Split()
- Regular Expressions vs IndexOf()
- Regular Expressions vs Find()
- Regular Expressions vs Contains()
- Regular Expressions vs StartsWith() and EndsWith()
- Regular Expressions vs Replace()
Regular Expressions vs Split()
The Split() functions from both the String class and the Regex class return a string split into tokens. In this section, you will see both ways of splitting strings in C#.
Regex.Split()
The Split() function from Regex can be used to split a string in C# at every position where a pattern matches.
Let's see a simple example of how to split a string with the Regex.Split() function.
    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Split on a literal asterisk; the delimiter is dropped.
                string text = "Hello*welcome to*C#";
                foreach (var token in Regex.Split(text, @"\*"))
                    Console.WriteLine(token);

                Console.WriteLine("\nWith delimiters =========\n");

                // The capturing group (...) keeps each delimiter in the result.
                string text2 = "Hello*welcome to*C#";
                foreach (var token in Regex.Split(text2, @"(\*)"))
                    Console.WriteLine(token);
            }
        }
    }

In the script above, the Regex.Split() function splits the text "Hello*welcome to*C#" using the asterisk as a delimiter. The first parameter to the Regex.Split() function is the text string, while the second parameter is the regex pattern to be matched. The pattern used is @"\*", where the backslash escapes the asterisk so that it is matched literally. This pattern splits the string at every index where an asterisk occurs. In the output, the asterisk itself is not included.
Next, in order to include the asterisk delimiter in the output, the pattern @"(\*)" is used to split the string. Notice the round brackets in the regular expression: they form a capturing group, and Split() includes the text captured by a group in the resulting array, so the delimiter appears in the split string.
Here is the output of the script:
    Hello
    welcome to
    C#

    With delimiters =========

    Hello
    *
    welcome to
    *
    C#

The output shows that the first call doesn't include the delimiters, while the second call includes them when splitting the string.
String.Split()
Alternatively, you can use the String.Split() function to split a string. With String.Split(), you simply pass the delimiter that you want to use to split the string. The delimiter is treated as literal text, not as a pattern.
Here is an example:
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Split on a single literal character.
                string text3 = "Hello*welcome to*C#";
                foreach (var token in text3.Split('*'))
                    Console.WriteLine(token);

                Console.WriteLine("With delimiters =========");

                // "(*)" is literal text here, not a pattern, so nothing is split.
                string text4 = "Hello*welcome to*C#";
                foreach (var token in text4.Split(new[] { "(*)" }, StringSplitOptions.None))
                    Console.WriteLine(token);
            }
        }
    }
Here is the output of the script above:
    Hello
    welcome to
    C#
    With delimiters =========
    Hello*welcome to*C#

In the script above, the first String.Split() call has successfully split the string. However, using "(*)" as a delimiter did not split the string at all, because String.Split() looks for the literal text "(*)", which does not occur in the input. This shows that, unlike Regex.Split(), String.Split() cannot be used to split a string while keeping the delimiters.
Final Verdict
If you have simple splitting tasks that involve splitting a string on simple delimiters such as commas or asterisks, use the String.Split() function. For complex string splitting involving multiple patterns, use Regex.Split(). Furthermore, if you need to include the delimiters in the split string, use Regex.Split(), since String.Split() doesn't include delimiters in the split string.
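As an illustration of the "multiple patterns" case, here is a minimal sketch that splits on commas or semicolons with optional surrounding whitespace. The class and method names (SplitDemo, SplitWithRegex, SplitWithString) and the sample text are made up for this example; only Regex.Split(), String.Split(), and Trim() come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class SplitDemo
    {
        // One pattern covers both delimiters and the optional whitespace around them.
        public static string[] SplitWithRegex(string input)
        {
            return Regex.Split(input, @"\s*[,;]\s*");
        }

        // String.Split() handles several single-character delimiters,
        // but the surrounding whitespace has to be trimmed separately.
        public static string[] SplitWithString(string input)
        {
            string[] tokens = input.Split(new[] { ',', ';' });
            for (int i = 0; i < tokens.Length; i++)
                tokens[i] = tokens[i].Trim();
            return tokens;
        }

        public static void Main(string[] args)
        {
            string text = "apple, banana;cherry ; date";
            Console.WriteLine(string.Join("|", SplitWithRegex(text)));
            Console.WriteLine(string.Join("|", SplitWithString(text)));
        }
    }
}
```

Both calls print apple|banana|cherry|date for this input. The regex version states the whole rule in one pattern, while the String.Split() version needs an extra trimming step.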
Regular Expressions vs IndexOf()
The String.IndexOf() function in C# returns the zero-based index of the first occurrence of a character or substring within an input string. Regex.Match() can also return the index of the first occurrence of a character or a substring within an input string. In this section, you will see both ways.
String.IndexOf()
Let's first see how the String.IndexOf() function returns the index of the first occurrence of a character or string.
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string text = "Welcome to CSharp";

                // Zero-based index of the first "a" in the text.
                int index = text.IndexOf("a");
                Console.WriteLine(index);
            }
        }
    }
The above script returns the index of the first occurrence of the character “a” which is 14 as shown in the output below:
14
Let's see another example, where we print the index of the first occurrence of any digit. To do so, we need to pass an array of the digits from 0 to 9 to the IndexOfAny() function.
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string text = "Welcome to CSharp 2020";

                // Index of the first character that is any of the digits 0-9.
                int index = text.IndexOfAny("0123456789".ToCharArray());
                Console.WriteLine(index);
            }
        }
    }
Here is the output of the script above:
18
Regex.Match()
In this section, you will see how the Regex.Match() function is used to return the index of the first occurrence of the matched character.
    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string text = "Welcome to CSharp";

                // Index of the first match of the pattern "a".
                int index = Regex.Match(text, "a").Index;
                Console.WriteLine(index);
            }
        }
    }
Here is the output.
14
Finally, you can again use the Regex.Match() function to return the index of the first digit. To do so, you can use the "\d" pattern, which matches any digit, as shown below:
    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string text = "Welcome to CSharp 2020";

                // \d matches any digit; Index is where the first match starts.
                int index = Regex.Match(text, @"\d").Index;
                Console.WriteLine(index);
            }
        }
    }
Here is the output:
18
Final Verdict
If you want to get the index of a single literal word or character, such as the digit "9" or the character "a", you should use the C# String.IndexOf() function. However, if you have a more general pattern to match, such as any digit, any word, or any special character, use the Regex.Match() function.
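One difference worth keeping in mind is what each approach reports when nothing is found: IndexOf() returns -1, whereas the Index of a failed Match is 0, which is indistinguishable from a match at the start of the string. The sketch below wraps Regex.Match() so that it follows the -1 convention; IndexDemo and IndexOfPattern are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class IndexDemo
    {
        // Index of the first match of the pattern, or -1 when there is none,
        // mirroring the convention used by String.IndexOf().
        public static int IndexOfPattern(string input, string pattern)
        {
            Match match = Regex.Match(input, pattern);
            return match.Success ? match.Index : -1;
        }

        public static void Main(string[] args)
        {
            string text = "Welcome to CSharp";

            Console.WriteLine(text.IndexOf('z'));               // no 'z' in the text
            Console.WriteLine(Regex.Match(text, @"\d").Index);  // failed match, Index is still 0
            Console.WriteLine(IndexOfPattern(text, @"\d"));     // failure reported as -1
            Console.WriteLine(IndexOfPattern("Welcome to CSharp 2020", @"\d"));
        }
    }
}
```

Checking match.Success before reading match.Index avoids mistaking "no match" for "match at position 0".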
Regular Expressions vs Find()
The Find() function in C# is an array function that returns the first item in an array that matches certain criteria. A similar task can be performed with the Regex.Match() or Regex.Matches() function. Let's see examples of both.
Array.Find()
The Array.Find() function is applicable to an array. The following script uses the Find() function to return the first word in the input string that starts with "20". You can see that the string has to be converted into an array of words before the Find() function can operate on it.
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Split() with no arguments splits on whitespace.
                string[] text = "Welcome to CSharp 2020 and 3050".Split();

                // First word for which the predicate returns true.
                var result = Array.Find(text, x => x.StartsWith("20"));
                Console.WriteLine(result);
            }
        }
    }
Here is the output:
2020
Regex.Matches()
Let's now use the Regex.Matches() function to return every run of digits in the input string, which here means all the words that are made up of digits.
    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string text = "Welcome to CSharp 2020 and 3050";

                // \d+ matches one or more consecutive digits.
                var result = Regex.Matches(text, @"(\d+)");
                foreach (Match match in result)
                    Console.WriteLine(match.Value);
            }
        }
    }
Here is the output of the script above.
    2020
    3050
Final Verdict
To find words within a text, Regex.Match() or Regex.Matches() is preferred over the Array.Find() function for two reasons. First, with Regex you do not need to split the input string into an array of words. Second, with Regex you can express generic matches, such as any digit or any word, directly in the pattern. With Find(), you have to write the matching logic yourself as a predicate, as StartsWith("20") does in the example above.
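For a side-by-side comparison on the same task, the sketch below collects every word that starts with a digit, once with Array.FindAll() (the array function that returns all matching items rather than just the first) and once with Regex.Matches(). The names FindDemo, DigitWordsWithArray, and DigitWordsWithRegex, as well as the sample sentence, are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class FindDemo
    {
        // Array approach: split into words first, then filter with a predicate.
        public static string[] DigitWordsWithArray(string input)
        {
            string[] words = input.Split(' ');
            return Array.FindAll(words, w => w.Length > 0 && char.IsDigit(w[0]));
        }

        // Regex approach: \b is a word boundary, \d one digit, \w* the rest of the word.
        public static List<string> DigitWordsWithRegex(string input)
        {
            var found = new List<string>();
            foreach (Match match in Regex.Matches(input, @"\b\d\w*"))
                found.Add(match.Value);
            return found;
        }

        public static void Main(string[] args)
        {
            string text = "Welcome to CSharp 2020 and 3050 or x86";
            Console.WriteLine(string.Join(",", DigitWordsWithArray(text)));
            Console.WriteLine(string.Join(",", DigitWordsWithRegex(text)));
        }
    }
}
```

Both lines print 2020,3050 for this sentence; "x86" is skipped because it does not begin with a digit.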
Regular Expressions vs Contains()
The Contains() method of the C# string class searches for a particular character or substring within another string and returns true if it is found; otherwise, it returns false. The string to be searched for is passed as a parameter to the Contains() method. The difference between Find() and Contains() is that Find() is an array method that returns the whole item that satisfies a condition, whereas Contains() is a string method that returns a boolean value indicating whether a substring occurs within another string. You can also use the Regex Match() method to check whether a string exists within another string. You will see examples of both methods in this section.
String.Contains()
The following script uses the Contains() method to see if the substring “red cat” exists within another string or not. Since the substring, “red cat” exists within the string “Big red cat jumps over lazy dog”, the method returns true.
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string sent = "Big red cat jumps over lazy dog";

                // True when the exact substring occurs anywhere in the sentence.
                if (sent.Contains("red cat"))
                {
                    Console.WriteLine("Cat found in the sentence");
                }
            }
        }
    }
Output:
Cat found in the sentence
Regex.Match()
The Match() method from the C# Regex class can also be used to check whether a string contains a substring. To do so, create an object of the Regex class and pass the substring to be searched for to its constructor. Next, call the Match() method on the Regex object and pass the input string to the method. The Match() method returns a Match object. To see whether the input contains the substring, check the value of the object's Success property. Look at the following example.
    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string sent = "Big red cat jumps over lazy dog";

                // The pattern is the literal substring to look for.
                Regex reg = new Regex("red cat");
                Match match = reg.Match(sent);

                // Success is true when the pattern occurs in the input.
                if (match.Success)
                {
                    Console.WriteLine("Cat found in the sentence");
                }
            }
        }
    }
Output:
Cat found in the sentence
With the Regex Match() function, you can search for a whole family of substrings at once by using wildcards. In the pattern (.*), the dot matches any character other than a newline and the asterisk repeats it zero or more times, so (.*) matches any run of text. For example, to search for some text followed by a space and the word "cat", you can use the regex "(.*) cat", as shown in the following script. The script tests the regex against three sentences. Since only the first two sentences contain text followed by " cat", match.Success is true for the first two sentences and false for the third.
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string sent1 = "Big red black cat jumps over lazy dog";
                string sent2 = "small green cat jumps over lazy dog";
                string sent3 = "green dog is cute";

                List<string> sents = new List<string>();
                sents.Add(sent1);
                sents.Add(sent2);
                sents.Add(sent3);

                // Any text, then a space, then "cat".
                Regex reg = new Regex("(.*) cat");
                foreach (string sent in sents)
                {
                    Match match = reg.Match(sent);
                    if (match.Success)
                    {
                        Console.WriteLine("Cat found in the sentence");
                    }
                }
            }
        }
    }
Output:
    Cat found in the sentence
    Cat found in the sentence
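Because (.*) is written in round brackets, the text it matched is also captured and can be read back from the Match object. The following sketch shows one way to do that with match.Groups[1]; the class CaptureDemo and the method TryGetTextBeforeCat are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class CaptureDemo
    {
        // Returns true when the pattern matches and hands back the text
        // captured by the first group, i.e. whatever (.*) matched.
        public static bool TryGetTextBeforeCat(string sentence, out string before)
        {
            Match match = Regex.Match(sentence, "(.*) cat");
            before = match.Success ? match.Groups[1].Value : "";
            return match.Success;
        }

        public static void Main(string[] args)
        {
            string[] sents =
            {
                "Big red black cat jumps over lazy dog",
                "small green cat jumps over lazy dog",
                "green dog is cute"
            };

            foreach (string sent in sents)
            {
                string before;
                if (TryGetTextBeforeCat(sent, out before))
                    Console.WriteLine("Text before cat: " + before);
                else
                    Console.WriteLine("No cat in: " + sent);
            }
        }
    }
}
```

For the three sentences above, the captured text is "Big red black" and "small green", and the third sentence reports no match.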
Final Verdict
If you want to search for a single string literal, it is preferable to use the Contains() method of the string class. If instead you have to search for multiple substrings or for complex patterns inside another string, you should use the Match() method from Regex.
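When only a yes/no answer is needed, Regex.IsMatch() gives it directly without going through a Match object. The sketch below shows it for a literal substring and for a case-insensitive whole-word search, which plain Contains() cannot express on its own. ContainsDemo, ContainsLiteral, and ContainsWord are hypothetical names for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class ContainsDemo
    {
        // Regex.Escape() neutralises characters such as '.', '*' or '('
        // so that the literal is matched as plain text.
        public static bool ContainsLiteral(string input, string literal)
        {
            return Regex.IsMatch(input, Regex.Escape(literal));
        }

        // \b marks a word boundary, so "cat" inside "concatenate" is not counted.
        public static bool ContainsWord(string input, string word)
        {
            string pattern = @"\b" + Regex.Escape(word) + @"\b";
            return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
        }

        public static void Main(string[] args)
        {
            string sent = "Big red cat jumps over lazy dog";

            Console.WriteLine(ContainsLiteral(sent, "red cat"));   // literal substring is present
            Console.WriteLine(ContainsWord(sent, "CAT"));          // case is ignored
            Console.WriteLine("concatenate".Contains("cat"));      // substring search finds it
            Console.WriteLine(ContainsWord("concatenate", "cat")); // whole-word search does not
        }
    }
}
```

The first three lines print True and the last prints False, which shows where a pattern adds something over a plain substring check.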
Regular Expressions vs StartsWith() and EndsWith()
You can use either the default string functions or the Regex functions to check whether a string starts or ends with a particular string or character. Let's see examples of both.
String.StartsWith() and String.EndsWith()
In C#, the StartsWith() function returns true if a string starts with the character or substring passed to it as a parameter. Similarly, the EndsWith() function checks whether a string ends with a particular string or character. Look at the following example.
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string sent1 = "Big red black cat jumps over lazy dog";
                if (sent1.StartsWith("B"))
                {
                    Console.WriteLine("the sentence starts with B");
                }

                string sent2 = "small green cat jumps over lazy dog";
                if (sent2.EndsWith("g"))
                {
                    Console.WriteLine("the sentence ends with g");
                }
            }
        }
    }
Output:
    the sentence starts with B
    the sentence ends with g
In the above script both the StartsWith() and EndsWith() functions return true since the first string starts with B while the second string ends with g.
Regex.Match()
With Regex, you can again use the Match() function to check whether a string starts or ends with a particular character or string. To check whether a string is present at the start, use the regex "^" followed by the string or character. For instance, if you want to check whether a string starts with the character "a", you can use the regex "^a". Once the string is passed to the Match() function, you can use the Success property of the object returned by Match() to see whether the regex matched. Furthermore, with the Match() function, you can also check for several characters or strings at once. For example, the following script uses the regex "^(B|s|g)", which matches if a string starts with "B", "s", or "g".
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string sent1 = "Big red black cat jumps over lazy dog";
                string sent2 = "small green cat jumps over lazy dog";
                string sent3 = "green dog is cute";

                List<string> sents = new List<string>();
                sents.Add(sent1);
                sents.Add(sent2);
                sents.Add(sent3);

                // ^ anchors the match to the start of the string.
                Regex reg = new Regex("^(B|s|g)");
                foreach (string sent in sents)
                {
                    Match match = reg.Match(sent);
                    if (match.Success)
                    {
                        Console.WriteLine("The sentence starts with B, s, or g");
                    }
                }
            }
        }
    }
Output:
    The sentence starts with B, s, or g
    The sentence starts with B, s, or g
    The sentence starts with B, s, or g
Similarly, to see if a string ends with a particular character or string, you can use the “$” expression. The character or string that you want to check, should be passed before the “$” sign. For instance, if you want to check if a string ends with the letter “a”, you should use “a$”. You can also check multiple strings or letters to see if any of them occurs at the end of a string. For example, you can use the following regex expression to see if a string ends with the letters “B”, “e”, or “g”.
Regex reg = new Regex("(B|e|g)$");
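To see the "$" pattern at work, here is a small runnable sketch built around the same regex. The class EndsWithDemo, the method EndsWithBeg, and the fourth test sentence are hypothetical additions for illustration; the first three sentences are the ones used earlier in this section.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class EndsWithDemo
    {
        // $ anchors the match to the end of the string.
        private static readonly Regex EndPattern = new Regex("(B|e|g)$");

        public static bool EndsWithBeg(string input)
        {
            return EndPattern.Match(input).Success;
        }

        public static void Main(string[] args)
        {
            string[] sents =
            {
                "Big red black cat jumps over lazy dog",
                "small green cat jumps over lazy dog",
                "green dog is cute",
                "hello world"
            };

            foreach (string sent in sents)
            {
                if (EndsWithBeg(sent))
                    Console.WriteLine("Ends with B, e, or g: " + sent);
                else
                    Console.WriteLine("No match: " + sent);
            }
        }
    }
}
```

The first three sentences end in "g", "g", and "e" and therefore match, while "hello world" ends in "d" and does not.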
Final Verdict
If you have to check the start or end of a string for a string literal, it is generally simpler and faster to use the default string methods StartsWith() and EndsWith(), respectively. On the other hand, if you want to check the start or end of a string for any of several characters or substrings, you can use the Regex Match() method.
Regular Expressions vs Replace()
The Replace() method of the string class replaces a substring within an input string with another substring. Both the substring to look for and its replacement are passed to the Replace() method as parameters.
String.Replace()
The following script replaces the substring "Green" with "Black" inside the input string "Green cat jumped over the lazy dog".
    using System;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string input = "Green cat jumped over the lazy dog";
                Console.WriteLine(input);

                // Replace() returns a new string; the input is left unchanged.
                string output = input.Replace("Green", "Black");
                Console.WriteLine(output);
            }
        }
    }
Output:
    Green cat jumped over the lazy dog
    Black cat jumped over the lazy dog
Regex.Replace()
The Regex class also contains a Replace() method, which can replace a substring within another string. With Regex, you can specify complex string patterns to replace. For instance, in the following script, the pattern "G...." matches "G" followed by any four characters, so the five-character word "Green" is replaced with the word "Black".
    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string input = "Green cat jumped over the lazy dog";
                Console.WriteLine(input);

                // Each dot matches any single character.
                string output = Regex.Replace(input, @"G....", "Black");
                Console.WriteLine(output);
            }
        }
    }
Output:
    Green cat jumped over the lazy dog
    Black cat jumped over the lazy dog
Final Verdict
You should use the Replace() method of the string class when you want to replace a string literal or a single string inside another string. However, for more complex patterns, use the Replace() method of the Regex class.
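Two cases where a pattern is hard to avoid are replacing every occurrence of a class of characters and rearranging parts of the matched text. The sketch below illustrates both with Regex.Replace(); the names ReplaceDemo, MaskDigits, and SwapNames, and the sample inputs, are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    // Hypothetical helper class, used only for this illustration.
    public static class ReplaceDemo
    {
        // Every run of digits becomes a single '#'.
        public static string MaskDigits(string input)
        {
            return Regex.Replace(input, @"\d+", "#");
        }

        // $1 and $2 in the replacement refer to the first and second capturing group.
        public static string SwapNames(string input)
        {
            return Regex.Replace(input, @"(\w+) (\w+)", "$2, $1");
        }

        public static void Main(string[] args)
        {
            Console.WriteLine(MaskDigits("Order 12 shipped on day 7"));
            Console.WriteLine(SwapNames("John Smith"));
        }
    }
}
```

The first call prints "Order # shipped on day #" and the second prints "Smith, John". Neither result can be produced by a single String.Replace() call, because the text to replace is not known in advance.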