Programming with GNU Software


Basic String Usage

Constructing a string object is easy, and the following code shows a number of different ways to create strings, and perform some simple operations on them:

     /* string1.cc
        Compiled using g++ string1.cc -o string1 */
     #include <iostream>
     #include <string>
     
     using namespace std;
     
     int main()
     {
       const char *cstring = "third=c_string\0";
       string first("first");
       string second("2nd string", 3);
       string third(cstring);
       string fourth(4, '4');
       string five("This is the fifth");
       string fifth(five, five.find("fifth"));
     
       cout << first << ", " << second << ", "
            << third << ", " << fourth << ", " << fifth << endl;
     
       string line(first+", "+second+", "+third.substr(0, 5)+
     	      ", "+fourth.substr(3)+"th, "+fifth);
       cout << line << endl;
     }

     Example 3.26: string1.cc: examples of creating strings

The output is as follows:

     $ ./string1
     first, 2nd, third=c_string, 4444, fifth
     first, 2nd, third, 4th, fifth

We begin by creating a C string, then follow up by creating six string objects. The first string we create contains the character array we pass in - "first". The constructor for second takes a character array and a count, and initialises the string with the first 3 characters of the array, so it contains the string "2nd". The constructor for third takes the C string "third=c_string\0"; copying stops at the first terminating null, so third holds "third=c_string". The string variable fourth is created by passing a number and a character; the resultant string is made up of the character repeated that many times (so in this case, fourth is made up of "4444"). Finally, we create a string called five from the character array "This is the fifth", and use that string to initialise the string fifth. Notice that we are making a call to find within the constructor to fifth; find returns the index at which the string passed in to it first occurs, if it occurs at all. Since the string "fifth" indeed exists, its starting index - 12 - is returned, and fifth is created by copying the character at index 12 (and all beyond) of string five.
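The constructor forms described above can be isolated as follows. This is only a sketch: the function names first_n, repeated and from_index are hypothetical wrappers invented for illustration, and each one simply forwards to the corresponding std::string constructor.

```cpp
#include <string>

// Hypothetical wrapper: the first n characters of the array s.
// The array must hold at least n characters.
std::string first_n(const char *s, std::string::size_type n)
{
  return std::string(s, n);
}

// Hypothetical wrapper: the character c repeated n times.
// Note the argument order: the count comes first, then the character.
std::string repeated(std::string::size_type n, char c)
{
  return std::string(n, c);
}

// Hypothetical wrapper: everything in s from index pos to the end.
// The constructor throws std::out_of_range if pos > s.size().
std::string from_index(const std::string &s, std::string::size_type pos)
{
  return std::string(s, pos);
}
```

With the values from string1.cc, first_n("2nd string", 3) gives "2nd", repeated(4, '4') gives "4444", and from_index("This is the fifth", 12) gives "fifth".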

We then print all of these initialised strings (see the output above), and create a new string called line that will contain copies of the original strings, slightly modified. We call substr twice within the creation of line; substr, when passed 2 integers, returns the substring that starts at the position given by the first argument and contains as many characters as the second argument specifies. So calling substr(0, 5) on "third=c_string" will return "third". Passing just one integer to substr means that we take all characters starting from the position passed in. So substr(3) on "4444" will return the element at index 3 and beyond, which is a single '4'. The result is that line, when printed, comes out as

     first, 2nd, third, 4th, fifth

As you can see, we've performed some relatively complex string manipulations with just a few lines of code.
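One detail of substr is worth a sketch: a count that runs past the end of the string is quietly shortened, but a starting position beyond the end makes substr throw std::out_of_range. The helper below, tail_from, is a hypothetical name chosen for illustration; it returns an empty string rather than throwing.

```cpp
#include <string>

// Hypothetical helper: like s.substr(pos), but yields an empty string
// instead of throwing std::out_of_range when pos lies beyond the end of s.
std::string tail_from(const std::string &s, std::string::size_type pos)
{
  if (pos > s.size())
    return std::string();
  return s.substr(pos);
}
```

So tail_from("4444", 3) gives "4", while tail_from("4444", 10) gives an empty string where a plain substr(10) would have thrown.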

Let's look more closely now at finding items within a string. The previous example was contrived because we planned all along for things to go our way; by this, I mean that we knew that the find calls would return the values that we were interested in. But what value is returned when we fail to find a position within a string? The answer lies in looking at the value npos. npos is a static constant member of the string class (std::string::npos), and holds the largest value that std::string::size_type can represent, which is greater than any valid index. When a search function fails to find part of a string, it returns npos, which we need to check against in order to ascertain whether the find worked or not. At the surface level, it's very useful, although as we'll see in a minute, there are a few pitfalls to be wary of. First though, an example:

     /* string2.cc
        Compiled using g++ string2.cc -o string2 */
     #include <cstdlib>
     #include <iostream>
     #include <string>
     
     using namespace std;
     
     int main()
     {
       std::string::size_type i;
       string sentence("Mary had a little lamb, his"
                       " fleece was as white as snow...");
       i = sentence.find("You'll never find this...");
     
       if (i == std::string::npos)
         cout << "i == npos; failed to find string.\n";
     
       cout << sentence.substr(0, sentence.rfind(" lamb")) << " "
            << sentence.substr(sentence.find("fleece"), 6) << endl;
     
       cout << sentence.substr(0, sentence.rfind("Again, no such string"))
            << endl;
     
       i = 0;
       int num = 0;
       /* Find out how many 'a's there are in the string 'sentence': */
       while(i != std::string::npos)
         {
           i = sentence.find("a", i);
           if (i != std::string::npos)
     	{
     	  num++;
     	  i++;
     	}
         }
       cout << "Found " << num << " occurrences of 'a'" << endl;
       exit(0);
     }

     Example 3.27: string2.cc: finding things within a string

The output goes like this:

     $ ./string2
     i == npos; failed to find string.
     Mary had a little fleece
     Mary had a little lamb, his fleece was as white as snow...
     Found 7 occurrences of 'a'
     $

Let's discuss the code. To begin with we declare a variable i of type size_type. After declaring and initialising the string sentence, we run the find function on sentence, passing in the string "You'll never find this...". The result is assigned to i. Had the search succeeded, find would have returned the index of the first character of the match; but since the string we're searching for does not exist within sentence, npos is returned, and the test i == std::string::npos is true.
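The comparison against npos can be wrapped up in a small predicate. The sketch below uses two hypothetical function names, contains and npos_is_max, neither of which comes from the string class itself; the second simply confirms that npos is the largest value size_type can hold.

```cpp
#include <limits>
#include <string>

// Hypothetical helper: true when needle occurs somewhere in haystack.
bool contains(const std::string &haystack, const std::string &needle)
{
  return haystack.find(needle) != std::string::npos;
}

// Hypothetical helper: true when npos equals the largest value that
// std::string::size_type can represent.
bool npos_is_max()
{
  return std::string::npos
         == std::numeric_limits<std::string::size_type>::max();
}
```

Because npos is larger than any valid index, comparing with == or != is the only reliable test; a check such as "i >= 0" is always true for an unsigned type.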

The statement

       cout << sentence.substr(0, sentence.rfind(" lamb")) << " "
            << sentence.substr(sentence.find("fleece"), 6) << endl;

includes a new function call to rfind, as well as doing some more substring manipulation. rfind is similar to find, except that it searches back through the string in question instead of forwards. It returns the starting index of the last occurrence of the string it is searching for. Since the string " lamb" exists, rfind returns the relevant position and "Mary had a little" is retrieved as the substring. The second part of the cout statement uses the find method as we'd expect it to work, and "fleece" is extracted (recall that substr returns the string starting from the first argument and counting as many characters as the second argument specifies). The result is that the string "Mary had a little fleece" is printed.
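The difference between find and rfind only shows up when the search string occurs more than once. As a sketch, the hypothetical helper occurs_once below relies on that: find reports the lowest starting index and rfind the highest, so the two agree exactly when there is a single occurrence.

```cpp
#include <string>

// Hypothetical helper: true when needle occurs exactly once in s.
// find gives the first starting index and rfind the last; if both are
// the same valid index, there is only one occurrence.
bool occurs_once(const std::string &s, const std::string &needle)
{
  const std::string::size_type first = s.find(needle);
  const std::string::size_type last = s.rfind(needle);
  return first != std::string::npos && first == last;
}
```

In the sentence from string2.cc, " lamb" occurs once (find and rfind both return 17), whereas "as" occurs three times, so find returns 36 and rfind returns 48.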

However, the following statement

       cout << sentence.substr(0, sentence.rfind("Again, no such string"))
            << endl;

is different and deceiving; we use rfind to look for a string that clearly isn't in the string sentence. Since we're trying to create a substring from the start of sentence, to sentence.rfind("Again, no such string"), what will be printed out? The answer is that the entire sentence string will be printed, because rfind failed and as a result returned npos. Because npos is the maximum (unsigned) value of its type, and substr stops at the end of the string when asked for more characters than remain, the cout statement just prints out the string in its entirety.

What we should have done is something like this:

       std::string::size_type pos = sentence.rfind("Again, no such string");
       /* If we've found what we're looking for, print the string out,
          or do whatever else we want: */
       if (pos != std::string::npos)
         cout << sentence.substr(0, pos)
              << endl;
       else
         { /* The find failed, so do something else ... */ }

So, be warned! Always check the return value of a find() or rfind() method, to see if it is equal to npos or not. If it is equal to npos and you don't check for it, the example code above in string2.cc illustrates what could happen.
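One way to make the check hard to forget is to hide the search and the substr call behind a function that reports failure explicitly. The following is a minimal sketch under that assumption; prefix_before_last is a hypothetical name, not something the string class provides.

```cpp
#include <string>

// Hypothetical helper: copies the part of s that precedes the last
// occurrence of marker into out and returns true. Returns false, and
// leaves out untouched, when marker does not occur in s.
bool prefix_before_last(const std::string &s, const std::string &marker,
                        std::string &out)
{
  const std::string::size_type pos = s.rfind(marker);
  if (pos == std::string::npos)
    return false;
  out = s.substr(0, pos);
  return true;
}
```

A caller then has to look at the returned bool before using out, so a failed search can no longer masquerade as a request for the whole string.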

These are just a few of the operations we can perform with a string. There are plenty of others that are just as intuitive to use, such as insert(), erase() and replace(); since they are easy to understand, they are simply listed in the String Summary, where you can see the full range of operations available.
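To give a flavour of insert(), erase() and replace(), here is a short sketch. The function name edit_demo and the sample text are made up for illustration; the three member functions are called in their position-based forms.

```cpp
#include <string>

// Hypothetical demonstration of insert, erase and replace.
std::string edit_demo()
{
  std::string s("Mary had a lamb");

  std::string::size_type pos = s.find("lamb");
  if (pos == std::string::npos)
    return s;                    // nothing to edit

  s.insert(pos, "little ");      // "Mary had a little lamb"
  s.erase(0, 5);                 // drop "Mary ": "had a little lamb"

  pos = s.find("lamb");          // the index moved, so search again
  s.replace(pos, 4, "goat");     // "had a little goat"
  return s;
}
```

Note that any index obtained from find becomes stale once the string has been edited, which is why the position of "lamb" is looked up a second time before the call to replace.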

In the last section of string2.cc, we counted the number of occurrences of the character 'a' within sentence. What we're trying to achieve here is fairly routine, so no code breakdown is necessary. However, the loop is more code than such a simple operation deserves; we can greatly reduce it by using string iterators and a generic algorithm...