Thursday, June 13, 2013

String manipulation in JavaScript through RegExp II

Problem 5 :: Swap the 1st and 3rd words in every group of 3 consecutive words in a given string. For example, "I am right" becomes "right am I", i.e. the 1st and 3rd words are swapped.
Solution ::

<script>
// String
var str = "We are the men in red. He is the best boy.";

// Define RegExp
var patt = /(\S+)\s+(\S+)\s+(\S+)/g;

// Apply RegExp
var res = str.replace( patt, "{$3} $2 {$1}" );

// Print in Console
console.log( res );
</script>


Output ::
{the} are {We} {red.} in {men} {the} is {He} best boy.

Explanation ::
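i) The regular expression (\S+)\s+(\S+)\s+(\S+) captures 3 adjacent words separated by one or more whitespace characters.
ii) The replace() function replaces each such match with "{$3} $2 {$1}", i.e. the 1st and 3rd words swapped. The braces only mark the swapped words in the output. The last two words ("best boy.") are left unchanged because they do not form a full group of 3.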
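
The same swap can also be written with a replacement function instead of a "$n" template. The sketch below is only an illustration: the helper name swapFirstAndThird is hypothetical and not part of the solution above, and it leaves out the braces so the result is the plain swapped text.

```javascript
// Hypothetical helper: swaps the 1st and 3rd word of every run of 3 words.
function swapFirstAndThird(text) {
  // Three groups of non-whitespace characters separated by whitespace
  var patt = /(\S+)\s+(\S+)\s+(\S+)/g;

  // The callback receives the whole match first, then each captured group
  return text.replace(patt, function (match, first, second, third) {
    return third + " " + second + " " + first;
  });
}

console.log(swapFirstAndThird("I am right"));
```

Note that this version joins the three words with single spaces, so any longer run of whitespace inside a matched group is collapsed.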

Problem 6 :: Given a string like "I have 10 dollars in my pocket", convert every numeric value ("10") to currency-formatted text like "$10.00".
Solution ::

<script>
// String
var str = "0+ 1 + 10 + 10.00 + $5.00 = $26.00, right?";

// Define RegExp
var patt = /(\$*\d+\.*\d*)/g;

// Apply RegExp
var res = str.replace( patt, function($1)
{
  // $1 holds the matched text; strip an existing "$" and ".00" first
  var p = $1.replace(".00","").replace("$","");
  return "$" + p + ".00";
} );

// Print in Console
console.log( res );
</script>


Output ::
$0.00+ $1.00 + $10.00 + $10.00 + $5.00 = $26.00, right?

Explanation ::
The capturing group (\$*\d+\.*\d*) implies the following :
i) $ may be present at the beginning
ii) Next, series of digits
iii) A dot (.) may appear next
iv) Again a series of digits might appear.

The RegExp above captures text like "$12", "12" or "$12.00". The function then removes the first "$" and the first ".00" from the matched text (replace() with a string argument only replaces the first occurrence) and builds the result by adding "$" at the beginning and ".00" at the end. The removal is required for text like "$10" or "$20.00", which would otherwise become "$$10.00" or "$20.00.00". Note that a value with other decimals, such as "10.50", is not handled and would come out as "$10.50.00".
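
A variant that also copes with values like "2.5" is sketched below. It is an assumption-laden illustration rather than the solution above: the helper name toCurrency is hypothetical, the pattern is tightened to an optional "$" plus a number, and the formatting is delegated to parseFloat() and toFixed(2).

```javascript
// Hypothetical helper: formats every number in the text as "$<amount>" with 2 decimals.
function toCurrency(text) {
  // Optional "$", digits, then an optional fractional part
  var patt = /\$?(\d+(?:\.\d+)?)/g;

  return text.replace(patt, function (match, amount) {
    // amount is the captured number without the "$" sign
    return "$" + parseFloat(amount).toFixed(2);
  });
}

console.log(toCurrency("I have 10 dollars and 2.5 cents, not $7.00"));
```

Because the number is parsed and formatted again, there is no need to strip "$" or ".00" by hand.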

Problem 7 :: This problem is featured on an MDN page: converting Celsius to Fahrenheit through RegExp. Check out the code below.
Solution ::

<script>
// String
var str = "Last 7 day's temperature : 14.5C 4C 0C -3C -5.6C";

// Define RegExp
var patt = /([-]*\d+\.*\d*[C|F])/g;

// Apply RegExp
var res = str.replace( patt, function($1)
{
  var v = $1; 
 
  // GET the numeric value
  var t = parseFloat( v.substr(0, v.length - 1 ) );

  if(v.substr(-1) == 'C')
  {
     // Convert to Fahrenheit
     var result = ( ( t * 1.8 ) + 32 ).toFixed(2) + "F";
  }

  if(v.substr(-1) == 'F')
  {
     // Convert to Celsius
     var result = ( ( t - 32 ) / 1.8 ).toFixed(2) + "C";
  }
  return $1 + "[" + result + "]";

} );

// Print in Console
console.log( res );
</script>


Output ::
 
Last 7 day's temperature : 14.5C[58.10F] 4C[39.20F] 0C[32.00F] -3C[26.60F] -5.6C[21.92F]

Explanation ::
This program converts Celsius to Fahrenheit and vice versa within a string. The output shows that every temperature in Celsius is followed by its Fahrenheit equivalent. For example, "14.5C" is followed by "[58.10F]". Let's dissect the RegExp "/([-]*\d+\.*\d*[C|F])/g".

i) [-]* means the number may start with a negative sign
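ii) \d+\.*\d* means a sequence of digits, optionally followed by a dot and more digits. This captures numbers like "1", "12", "1.23" or "13.2".
iii) [C|F] means the number must be followed by 'C' or 'F'. Inside a character class the "|" is a literal character, not an alternation, so this class also accepts "|"; [CF] would be the stricter form.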

The function passed to replace() is very simple. We use v.substr(0, v.length - 1) to extract "14.5" from "14.5C", and then two if blocks convert from Celsius to Fahrenheit or vice versa. We use toFixed() to round a floating point number to 2 decimal places, e.g. 58.100000000 becomes "58.10". The replace() function then turns "14.5C" into "14.5C[58.10F]".
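
The conversion can also be sketched with separate capture groups for the number and the unit, which removes the need for substr(). This is an illustrative variant, not the MDN or original code: the helper name annotateTemperatures is hypothetical, and the pattern is tightened to -? for the sign and [CF] for the unit.

```javascript
// Hypothetical helper: appends the converted value after every temperature.
function annotateTemperatures(text) {
  // Optional minus sign, digits, optional fraction, then exactly one of C or F
  var patt = /(-?\d+(?:\.\d+)?)([CF])/g;

  return text.replace(patt, function (match, number, unit) {
    var t = parseFloat(number);
    var converted = unit === "C"
      ? (t * 1.8 + 32).toFixed(2) + "F"    // Celsius to Fahrenheit
      : ((t - 32) / 1.8).toFixed(2) + "C"; // Fahrenheit to Celsius
    return match + "[" + converted + "]";
  });
}

console.log(annotateTemperatures("14.5C 4C 0C -3C -5.6C 212F"));
```

Since the unit is its own group, the callback can branch on it directly, and text such as "5|" is no longer matched.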


Problem 8 :: Let's find all words that are followed by a comma.
Solution ::

<script>
// String
var str = "We didn't do it, even they didn't, who did it then?";

// Define RegExp
var patt = /(\S+)(?=,)/g;

// Apply RegExp
var res = str.replace( patt, function($1){ return  "[" + $1 + "]"; } );

// Print in Console
console.log( res );
</script>


Output ::
We didn't do [it], even they [didn't], who did it then?

Explanation :: We have wrapped every matched word in square brackets. Let's check the RegExp /(\S+)(?=,)/g ...

i) 'g' modifier is used to denote a global search
ii) (\S+) means any series of non-whitespace characters
iii) (?=,) is a positive lookahead, which starts with ?=. It means 'followed by a comma', and the comma is not included in the resulting match.
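
To see exactly what the lookahead pattern matches, the matched words can be listed with match() instead of being replaced. This is a small sketch using the same string and pattern as above; the variable name words is only for illustration.

```javascript
var str = "We didn't do it, even they didn't, who did it then?";

// match() with a global RegExp returns an array of all matched substrings
var words = str.match(/(\S+)(?=,)/g);

console.log(words);
```

The array contains "it" and "didn't", without the commas, because the lookahead checks for the comma but does not consume it.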

However, the above RegExp has a problem: \S also matches commas, so it matches "iran,iraq" as a whole in a string like "iran,iraq, and Nepal". To get rid of this, we would change the pattern like this ::

var patt = /([^\s,]+)(?=,)/g;

Using "[^\s,]" instead of "\S" excludes commas from the word, so comma-joined words are matched as separate words. The output would contain "[iran]" and "[iraq]".

We could have written the regexp pattern without using the positive lookahead as this ::

var patt = /([^,\s]+),/g ;

It captures any run of characters excluding comma and whitespace, followed by a comma. The problem is that every match is returned with the comma at the end, so the comma has to be put back by hand, for example with "[$1]," as the replacement. For this reason, the positive lookahead is the simpler solution here.
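
The two approaches can be compared side by side. The sketch below assumes a word is any run of characters other than whitespace and commas, i.e. [^\s,]+; the variable names withLookahead and withCapture are only for illustration.

```javascript
var str = "iran,iraq, and Nepal";

// Variant 1: lookahead, the comma is checked but never consumed
var withLookahead = str.replace(/([^\s,]+)(?=,)/g, "[$1]");

// Variant 2: no lookahead, the comma is consumed and put back by hand
var withCapture = str.replace(/([^\s,]+),/g, "[$1],");

console.log(withLookahead);
console.log(withCapture);
```

Both variants print "[iran],[iraq], and Nepal". The difference only shows in what counts as the match: with the lookahead the match is the bare word, without it the match includes the trailing comma.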


See previous article for more info on JS RegExp.
