String Manipulation
This document was written by CS 290W TA David Corcoran and was last
modified
Perl is well known for making string manipulation easy. What might
take several lines of code in other programming languages can often
be done in Perl in one or two.
- The dot operator (.)
- This allows you to concatenate two strings together into one string.
# Example using the Dot Operator
$sFirstName = "Jim ";
$sLastName = "Fields";
$sFullName = $sFirstName.$sLastName;
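This takes the value of $sFirstName and appends the value of
$sLastName to it, so $sFullName becomes "Jim Fields". Note that the
space comes from the trailing space in "Jim "; the dot operator itself
puts nothing between the two strings.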
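As a small sketch of two related idioms, the dot operator can be chained to insert a literal separator, and the .= operator appends to a string in place. The variable $sGreeting is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;

# Names follow the style of the example above; no trailing space this time.
my $sFirstName = "Jim";
my $sLastName  = "Fields";

# Chain several dots to put a literal space between the two names.
my $sFullName = $sFirstName . " " . $sLastName;

# The .= operator appends to an existing string in place.
my $sGreeting = "Hello, ";   # hypothetical variable for illustration
$sGreeting .= $sFullName;

print "$sFullName\n";   # prints "Jim Fields"
print "$sGreeting\n";   # prints "Hello, Jim Fields"
```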
- The index function
- This allows you to find an occurrence of one string within another and
return the position of that occurrence.
# Example using the index() function
$sString = "CS290w Happens";
$sSubString = "Happens";
$iIndex = index($sString, $sSubString);
$sString contains the string that you are searching. The index()
function finds the position in the string where $sSubString begins.
Positions are counted from 0, so in the above example $iIndex will get
the value 7: the characters "CS290w" occupy positions 0 through 5, the
space is at position 6, and "Happens" starts at position 7. If
$sSubString cannot be found in $sString, then index() returns -1.
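The following sketch shows the not-found case and the optional third argument of index(), which gives the position at which to start searching. The variable names other than $sString are made up for this illustration.

```perl
use strict;
use warnings;

my $sString = "CS290w Happens";

# Positions are counted from 0, so "Happens" begins at position 7.
my $iFound = index($sString, "Happens");

# A substring that does not occur yields -1.
my $iMissing = index($sString, "Perl");

# The optional third argument says where to start searching.
# The first "p" is at position 9; searching from position 10 finds the next one.
my $iFirstP  = index($sString, "p");
my $iSecondP = index($sString, "p", $iFirstP + 1);

print "$iFound $iMissing $iFirstP $iSecondP\n";   # prints "7 -1 9 10"
```

Checking the result against -1 before using it is a good habit, since -1 is also a legal (negative) position for functions such as substr().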
- The substring function
- This allows you to copy part of a string from another by
specifying beginning position and (in one version) length of string.
# Example using the substr() function
# $sSubString = substr($sString, $iStart);
$sOldString = "Kelly Corcoran";
$sNewString = substr($sOldString, 4);
$sNewString will contain the new value. substr() treats
$sOldString as a sequence of characters numbered from 0. It goes to
the character at position 4, which is 'y', and takes it and the rest of
the string. The new string, "y Corcoran", is placed in $sNewString.
$sOldString is not changed.
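As a further sketch of the two-argument form, the start position can also be negative, in which case Perl counts back from the end of the string. The variable names $sFromSix and $sLastEight are hypothetical.

```perl
use strict;
use warnings;

my $sOldString = "Kelly Corcoran";

# Position 6 (counting from 0) is the 'C' of "Corcoran".
my $sFromSix = substr($sOldString, 6);

# A negative start counts back from the end: the last 8 characters.
my $sLastEight = substr($sOldString, -8);

print "$sFromSix\n";     # prints "Corcoran"
print "$sLastEight\n";   # prints "Corcoran"
print "$sOldString\n";   # prints "Kelly Corcoran" (unchanged)
```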
# Example using the substr() function
# $sSubString =
# substr($sString, $iStart, $iLength);
$sOldString = "Kelly Corcoran";
$sNewString = substr($sOldString, 1, 3);
$sNewString will contain the new value. substr() treats
$sOldString as a sequence of characters numbered from 0. It goes to
position 1 (the second character), which is 'e', and takes it and the
next 2 characters, making the new string 3 characters long. The new
string, "ell", is placed in $sNewString. $sOldString is not changed.
Combining index() and substr() is especially useful, as in the
following example. Say we want to get only the second word of the
phrase "CS290W Happens". We could use the following code:
# Example using index() and substr()
$sString = "CS290W Happens";
$sSubString = " ";
$iIndex = index($sString, $sSubString);
$sNewString = substr($sString, $iIndex +1);
In this example we obtain the second word of the phrase
"CS290W Happens". We do this by first finding the first occurrence of
a space using the index function. $iIndex will contain the position of
that space (6), and $iIndex + 1 is passed to the substr function as the
starting position. So, $sNewString will be "Happens".
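A slightly more defensive sketch of the same idea is shown here. It also extracts the first word, and it checks for -1, because if no space were found then $iIndex + 1 would be 0 and substr() would quietly return the whole string. The variables $sFirstWord and $sSecondWord are hypothetical names.

```perl
use strict;
use warnings;

my $sString = "CS290W Happens";
my $iIndex  = index($sString, " ");

my ($sFirstWord, $sSecondWord);
if ($iIndex == -1) {
    # No space at all: treat the whole string as the first word.
    $sFirstWord  = $sString;
    $sSecondWord = "";
} else {
    # Everything before the space, then everything after it.
    $sFirstWord  = substr($sString, 0, $iIndex);
    $sSecondWord = substr($sString, $iIndex + 1);
}

print "$sFirstWord\n";    # prints "CS290W"
print "$sSecondWord\n";   # prints "Happens"
```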
- The length function
- This function returns the length in characters of the string.
# Example using the length() function
$sString = "JavaCard";
$iLength = length ($sString);
$iLength will contain the length of the string $sString which is 8
characters long.
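Because positions start at 0, the last character of a string sits at position length - 1. A minimal sketch, using hypothetical variable names, of how length() and substr() can work together:

```perl
use strict;
use warnings;

my $sString = "JavaCard";
my $iLength = length($sString);

# The last valid position is one less than the length.
my $iLastPos  = $iLength - 1;
my $sLastChar = substr($sString, $iLastPos);

# Take the final four characters by starting four before the end.
my $sLastFour = substr($sString, $iLength - 4);

print "$iLength $iLastPos $sLastChar $sLastFour\n";   # prints "8 7 d Card"
```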
- The split function
- This allows you to split up one string into multiple strings by
looking for occurrences of a character or characters within it. The
results are stored into an array.
# Examples using the split() function
$sSearchString = "cows,cattle,bovine creatures,moo";
@asSearchFor = split (/,/, $sSearchString);
# IP Address 163.185.20.182
$sIpAddress = "163.185.20.182";
@asClassAddresses = split (/\./, $sIpAddress);
Notice the backslash in the second split. What is this for? The
/.../ is a place where a regular expression can appear. (These will
be discussed in the next section, Patterns and Transliteration.) The
backslash is necessary before the dot to keep the dot from being
interpreted as a special character in the regular expression. An
unescaped "." in a regular expression means match any character!
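To see why the backslash matters, this sketch runs the split both ways. With an unescaped dot, every character counts as a separator, so every field is empty, and since split() discards trailing empty fields the resulting array has no elements at all. The array names are hypothetical.

```perl
use strict;
use warnings;

my $sIpAddress = "163.185.20.182";

# Escaped dot: split on literal "." characters only.
my @asEscaped = split(/\./, $sIpAddress);

# Unescaped dot: every character is treated as a separator.
my @asUnescaped = split(/./, $sIpAddress);

print scalar(@asEscaped), "\n";     # prints 4
print scalar(@asUnescaped), "\n";   # prints 0
```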
What we are doing here is really useful. If you remember back from
CS190W, an IP Node Number is composed of A.B.C.Node. The above
script lets you tell the parts apart by placing A, B, C, and Node
into separate elements of the array @asClassAddresses. So if
someone asked you "What class B network do you belong to?", you would
simply print out the value of $asClassAddresses[1]. In the end you
have split the above string into 4 parts, removing the dots between
them:
$asClassAddresses[0] = "163"
$asClassAddresses[1] = "185"
$asClassAddresses[2] = "20"
$asClassAddresses[3] = "182"
Don't worry though, we won't be covering class networks until the
Networking portion of this course :-)
- The join function
- This does the opposite of split(). Here you give a delimiter string
and an array of scalars. Join will glue the elements together with the
delimiter between each pair.
# Examples using the join() function
@asSearchFor = ("cows", "cattle", "bovine creatures", "moo");
$sSearchString = join ("->", @asSearchFor);
# $sSearchString = "cows->cattle->bovine creatures->moo"
@asIpAddress = ("163","185","20","182");
$sFullIpAddress = join (".", @asIpAddress);
Join will take the individual values "163", "185", "20", and "182" and
glue them together using a dot. Your final result stored in string
$sFullIpAddress will be "163.185.20.182".
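As a sketch of how split() and join() mirror each other, the following splits an address apart and glues it back together, once with the original delimiter and once with a different one. The names $sRebuilt and $sDashed are hypothetical.

```perl
use strict;
use warnings;

my $sIpAddress = "163.185.20.182";

# Break the address into its four parts...
my @asParts = split(/\./, $sIpAddress);

# ...and glue them back together with the same delimiter.
my $sRebuilt = join(".", @asParts);

# Any string can serve as the delimiter, not just the original one.
my $sDashed = join("-", @asParts);

print "$sRebuilt\n";   # prints "163.185.20.182"
print "$sDashed\n";    # prints "163-185-20-182"
```

Note that the first argument to split() is a regular expression, while the first argument to join() is an ordinary string, so the dot needs no backslash in join().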