Saturday, March 31, 2012

Trimming a string to remove special & non-numeric characters

the string format is "XXXXXX-06-X-1234". how can i trim it so that it removes all the dashes and non-numeric characters? the result after such a trim would be "061234"

thanks in advance.

Create a function that removes any characters you don't want. In this case, it's pretty easy:

Function ParseDigits(ByVal strRawValue As String) As String
    Dim strDigits As String = ""
    ' Nothing in, empty string out.
    If strRawValue Is Nothing Then Return strDigits
    For Each c As Char In strRawValue
        ' Char.IsDigit is a Shared method, so it takes the character as an argument.
        If Char.IsDigit(c) Then
            strDigits &= c
        End If
    Next c
    ' Return the digit string, or "" if there were no digits in the input.
    Return strDigits
End Function

You can always iterate through each character and build a new string (use the StringBuilder class). You could also try the Regex.Replace method to replace all non-numeric characters with an empty string. I don't know which method would perform faster; if I had to guess, I'd say the manual iteration approach.
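As a rough sketch of the manual iteration approach in C#, assuming only ASCII digits should be kept (the class and method names here are made up for illustration and do not come from the thread):

```csharp
using System;
using System.Text;

public static class DigitFilter
{
    // Hypothetical helper: copies only the characters '0' through '9'
    // into a StringBuilder and returns the result.
    public static string KeepDigits(string raw)
    {
        // Treat null or empty input as "no digits".
        if (string.IsNullOrEmpty(raw)) return string.Empty;

        // The result can never be longer than the input.
        var sb = new StringBuilder(raw.Length);
        foreach (char c in raw)
        {
            if (c >= '0' && c <= '9') sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this, DigitFilter.KeepDigits("XXXXXX-06-X-1234") gives "061234". Whether it beats a Regex in practice would have to be measured.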

Does anyone know an easier way to do this?


quickest way to do this (and best performance) is with a Regex:

C#:

string initialString = "XXXXXX-06-X-1234";
// \D matches any character that is not a digit.
System.Text.RegularExpressions.Regex nonNumericCharacters = new System.Text.RegularExpressions.Regex(@"\D");
string numericOnlyString = nonNumericCharacters.Replace(initialString, String.Empty);

VB:

Dim initialString As String = "XXXXXX-06-X-1234"
' \D matches any character that is not a digit.
Dim nonNumericCharacters As New System.Text.RegularExpressions.Regex("\D")
Dim numericOnlyString As String = nonNumericCharacters.Replace(initialString, String.Empty)

Enjoy.


I think you did that backwards. You're having him match all numbers and replace them with an empty string. He WANTS numbers!

C#:

string initialString = "XXXXXX-06-X-1234";
// [^0-9] matches any character other than 0 through 9.
System.Text.RegularExpressions.Regex nonNumericCharacters = new System.Text.RegularExpressions.Regex(@"[^0-9]");
string numericOnlyString = nonNumericCharacters.Replace(initialString, String.Empty);

VB:

Dim initialString As String = "XXXXXX-06-X-1234"
Dim nonNumericCharacters As New System.Text.RegularExpressions.Regex("[^0-9]")
Dim numericOnlyString As String = nonNumericCharacters.Replace(initialString, String.Empty)

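the "d" is capitalised:

\d = [0-9]

and

\D = [^0-9]

either way, you can use \D or [^0-9] (for plain ASCII input like this they behave the same; in .NET, \d and \D also take other Unicode decimal digits into account unless RegexOptions.ECMAScript is set)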
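One caveat on treating the shorthand class and [0-9] as interchangeable: in .NET, \d matches any Unicode decimal digit by default, and only narrows to 0 through 9 under RegexOptions.ECMAScript. A small sketch of the difference (the class name is made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitClassDemo
{
    public static void Main()
    {
        // U+0663 ARABIC-INDIC DIGIT THREE: a decimal digit, but not ASCII.
        string arabicThree = "\u0663";

        Console.WriteLine(Regex.IsMatch(arabicThree, @"\d"));                          // True
        Console.WriteLine(Regex.IsMatch(arabicThree, @"[0-9]"));                       // False
        Console.WriteLine(Regex.IsMatch(arabicThree, @"\d", RegexOptions.ECMAScript)); // False
    }
}
```

For strings like "XXXXXX-06-X-1234" this makes no difference; it only matters if the input can contain non-ASCII digits.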


bipro/ds2goat

thanks for the help! that function works great. one more question though. sometimes the first set of characters (XXXXXX-) can have a number. is there any way to totally ignore those first 6 characters, then trim the remaining characters using the function above?

example: "221300-06-D-1234" would come out as "061234"

thanks again for the help


Is this a consistent thing? I.e., do all of the strings you are working with have those six characters you want to ignore? What kind of consistent pattern is there in the data?

yes, it is a consistent pattern (AAANNN-NN-A-NNNN), where "A" is alpha & "N" is numeric. i want to always ignore the first 6 characters, whether they be alpha or numeric.

thanks again.


You can get rid of the first 6 characters by

string _test = "AAANNN-NN-A-NNNN";
_test = _test.Replace(_test.Substring(0,6), string.Empty);

Now apply the regular expression above to _test.

Thanks


works perfectly! here's my entire function:

Dim test As String = "SPM300-06-D-1234"
test = test.Replace(test.Substring(0, 6), String.Empty)
Dim nonNumericCharacters As New System.Text.RegularExpressions.Regex("\D")
Dim numericOnlyString As String = nonNumericCharacters.Replace(test, String.Empty)
Response.Write(numericOnlyString)

thanks again for all the help. i really appreciate it.

It seems to me that it makes more sense to do one operation than two. e_screw told you to do

test = test.Replace(test.Substring(0, 6), String.Empty)

but it would read better if you did this instead:

test = test.Substring(6)
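This will return everything from index 6 onward, i.e. starting at the seventh character. That seventh character is the first hyphen, which you don't need either, but the regex strips it anyway (or use Substring(7) to skip it up front). It also avoids a side effect of Replace, which removes every occurrence of those six characters, not just the leading one. Good programming is being lazy, and being lazy means writing less code to do what you want (less = more efficient). =)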
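Putting the two steps together in C#, a minimal sketch might look like the following. The class name, method name, and prefixLength parameter are hypothetical, and the length check is an added assumption, since Substring throws when the start index is past the end of the string:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SerialParser
{
    // Hypothetical helper: skips a fixed-length prefix, then strips every
    // non-digit character from the rest of the string.
    public static string DigitsAfterPrefix(string value, int prefixLength = 6)
    {
        // Guard against null or too-short input so Substring cannot throw.
        if (value == null || value.Length <= prefixLength) return string.Empty;

        string remainder = value.Substring(prefixLength);
        return Regex.Replace(remainder, @"\D", string.Empty);
    }
}
```

For the example from the thread, SerialParser.DigitsAfterPrefix("221300-06-D-1234") gives "061234".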
