Tuesday, 28 May 2013

Reading Input with std::getline

A lot of beginners seem to have trouble with more complicated input, especially with reading in a loop until the end of a file (or end of input through std::cin). I thought about trying to go through all the various things I've seen people do wrong, but that could get pretty messy. So instead I thought I'd just show some examples that build up to doing it right, or at least in a way that works and that I think is fairly nice ("right" is open to a lot of interpretation).
Let's start simple. Read a user's name, and greet them:
#include <iostream>
#include <string>

std::string readName()
{
    std::cout << "What's your name? ";
    std::string name;
    std::cin >> name;
    return name;
}

void greet (const std::string& name)
{
    std::cout << "Hello, " << name << '\n';
}

int main()
{
    greet(readName());
}
(By the way, if you didn't already know, it is safe to bind a const T& to a temporary, as happens with greet's parameter here; a non-const T& will not bind to a temporary at all.) We run this:
$ ./hello1 
What's your name? Chris Sharpe
Hello, Chris
Pretty close, but reading with std::cin >> someString stops at the first whitespace. Instead we are going to use std::getline. The only change is in readName():
std::string readName()
{
    std::cout << "What's your name? ";
    std::string name;
    std::getline(std::cin, name); // Change here.
    return name;
}
Then:
$ ./hello2 
What's your name? Chris Sharpe
Hello, Chris Sharpe
That's what we want! What about a more complicated, and more useful, example? Say, reading a configuration file where each option might have a different type of value, for instance a filename (std::string) or a switch for some program behavior (bool). It is convenient to use the formatted input we get with >>, but we want to actually read from the input file with getline, so we read each line with getline and then apply >> to a std::istringstream built from that line.
The important functions are parseConfigFile() and parseConfigLine().
#include <fstream>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>

// Class to hold the program's configuration options.
class Config
{
    private:
        // Member data
        std::string someFilePath_;
        int         someSize_;
        bool        someSwitch_;

        // Static data: names of options
        static const std::string SOME_FILE_PATH;
        static const std::string SOME_SIZE;
        static const std::string SOME_SWITCH;

        // Private functions
        void parseConfigFile(const std::string&);
        void parseConfigLine(const std::string&);

    public:
        // Constructors
        Config();
        Config(const std::string&);

        // Accessors
        const std::string& someFilePath() const;
        int                someSize()     const;
        bool               someSwitch()   const;

        std::string dumpConfigAsString()  const;
};


int main(int argc, char* argv[])
{
    if (argc>1)
    {
        Config config{argv[1]};
        std::cout << config.dumpConfigAsString();
    }
    else
    {
        Config defaultConfig{};
        std::cout << defaultConfig.dumpConfigAsString();
    }
    return 0;
}


// class Config

const std::string Config::SOME_FILE_PATH {"some_file_path"};
const std::string Config::SOME_SIZE {"some_size"};
const std::string Config::SOME_SWITCH {"some_switch"};

Config::Config()
    // Set the defaults
    : someFilePath_{"default.png"}
    , someSize_{2}
    , someSwitch_{true}
{}

Config::Config(const std::string& configFilePath)
    : Config{} // Set the defaults
{
    parseConfigFile(configFilePath);
}


void Config::parseConfigFile(const std::string& configFilePath)
{
    std::ifstream inpFile{configFilePath};
    if (!inpFile.is_open())
    {
        std::cerr
            << "Could not open configuration file \""
            << configFilePath
            << "\"\n" // Note the string concatenation
               "Using default configuration options.\n";
        return;
    }

    std::string configLine;
    // Read a line at a time.
    // Doing this inside the loop condition means
    // we end correctly at the bottom of the file.
    while ( std::getline(inpFile, configLine) )
    {
        parseConfigLine(configLine);
    }
}

void Config::parseConfigLine(const std::string& configLine)
{
    // Ignore comment or empty lines
    if (configLine.empty() || '#' == configLine[0]) // check empty() before indexing
        return;

    // Split the line using >> operations
    std::istringstream iss {configLine};

    std::string configOption;
    iss >> configOption;

    // Compare against the known configurable options
    if ( SOME_FILE_PATH == configOption )
    {
        std::string tmpString;
        if (iss >> tmpString)
        {
            someFilePath_ = std::move(tmpString);
        }
        else // The read to std::string failed
        {
            std::cerr
                << "Failed to read configuration option \""
                << SOME_FILE_PATH
                << "\" as a string.\n";
        }
    }
    else if ( SOME_SIZE == configOption )
    {
        int tmpInt;
        if (iss >> tmpInt)
        {
            someSize_ = tmpInt;
        }
        else // The read to int failed
        {
            std::cerr
                << "Failed to read configuration option \""
                << SOME_SIZE
                << "\" as an integer.\n";
        }
    }
    else if ( SOME_SWITCH == configOption )
    {
        bool tmpBool;
        if (iss >> std::boolalpha >> tmpBool)
        {
            someSwitch_ = tmpBool;
        }
        else // The read to bool failed
        {
            std::cerr
                << "Failed to read configuration option \""
                << SOME_SWITCH
                << "\" as a boolean switch.\n";
        }
    }
    else
    {
        std::cerr
            << "Unrecognised configuration option \""
            << configOption
            << "\"\n";
    }
}

const std::string& Config::someFilePath() const
{
    return someFilePath_;
}

int                Config::someSize()     const
{
    return someSize_;
}

bool               Config::someSwitch()   const
{
    return someSwitch_;
}

std::string Config::dumpConfigAsString()  const
{
    std::ostringstream oss;
    oss
        << SOME_FILE_PATH << ' ' << someFilePath() << '\n'
        << SOME_SIZE      << ' ' << someSize()     << '\n'
        << SOME_SWITCH    << ' ' << std::boolalpha
                                 << someSwitch()   << '\n';
    return oss.str();
}
This is fairly long, just to make it a realistic example, but there are two absolutely key points:
  1. Don't mix use of operator>> and getline on the same stream. operator>> stops before the newline that ends the line and leaves it on the stream. Any following use of getline immediately hits that newline and gives you an empty string. This can really confuse people, especially when the reads happen far apart in code, so there is no obvious connection.
  2. Have the read as the loop condition when reading a whole file. This is partly because you want to check immediately after the read, and partly because you might otherwise be tempted to check stream.eof(), which only becomes true once a read has actually tried to go past the end of the file, not when the last read merely used up all the characters. You'll also get completely stuck if the remaining characters can't be read in the way you want (i.e. you are using formatted input with operator>>). Consider this badly broken example:
    #include <iostream>

    int main()
    {
        int tmpInt;
        while (!std::cin.eof())
        {
            std::cout << "Number: ";
            std::cin >> tmpInt;
            std::cout << "Read " << tmpInt << '\n';
        }
    }
    
    A run of this program might go like this:
    $ ./a.out 
    Number: 123
    Read 123
    Number: abc
    Read 0
    Number: Read 0
    Number: Read 0
    Number: Read 0
    ...
    
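    You can see it gets stuck in the loop. The failed read of "abc" leaves those characters on the stream and puts the stream into a failed state, so every later read fails straight away. But we technically haven't read past the end of the file, so eof() stays false and the loop never ends.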
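One way to repair that loop is sketched below. It is only a sketch, and it assumes the input has one number per line; readInts is a name made up for this illustration. The read is the loop condition, the stream itself is only ever read with getline, and the formatted >> is applied to a separate std::istringstream, so a bad line cannot wedge the real input stream.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects one integer per line until the input ends.
// Only the first token on each line is examined; lines that do not start
// with an integer are reported on std::cerr and skipped.
std::vector<int> readInts(std::istream& in)
{
    std::vector<int> values;
    std::string line;
    // The read is the loop condition, so the loop ends as soon as
    // getline fails, whether from end of input or a stream error.
    while (std::getline(in, line))
    {
        // Formatted input is applied to a per-line stream, so a failure
        // here never leaves the real input stream in a failed state.
        std::istringstream iss{line};
        int value;
        if (iss >> value)
            values.push_back(value);
        else
            std::cerr << "Ignoring bad line \"" << line << "\"\n";
    }
    return values;
}
```

Calling readInts(std::cin) from main and feeding it the same input as the run above (123, then abc) stores 123, reports abc as a bad line, and stops cleanly at the end of input instead of spinning.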
If you've been paying attention, you'll have noticed that the configuration parser above is broken if the file path has a space in it, because iss >> tmpString stops at the first whitespace. How could you fix that?
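One possible answer, as a sketch rather than the definitive fix: read the option name with >>, then take the rest of the line with getline. The helper name splitOptionLine is invented for this illustration and is not part of the Config class. Note that this does use >> and getline on the same stream, which is exactly what point 1 warns about. It is tolerable here only because std::ws explicitly discards the separator first, and because the istringstream holds exactly one line.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not part of the Config class above.
// Splits "option value with spaces" into "option" and "value with spaces".
// Returns false if the line has no option name or no value.
bool splitOptionLine(const std::string& line,
                     std::string& option,
                     std::string& value)
{
    std::istringstream iss{line};
    if (!(iss >> option))
        return false;

    // std::ws discards the whitespace between the option name and its
    // value; getline then takes everything up to the end of the line,
    // embedded spaces included. Trailing whitespace is kept as-is.
    if (!std::getline(iss >> std::ws, value))
        return false;

    return !value.empty();
}
```

In parseConfigLine the same idea would replace iss >> tmpString in the SOME_FILE_PATH branch with std::getline(iss >> std::ws, tmpString). The int and bool options can keep using >>, since their values never contain spaces.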
As an aside, you might have heard of the GNU Readline Library, which allows entry history, line editing, and other cool stuff. Recently I was working with a little test application that would be much easier to use with these features, but I couldn't make the changes to use Readline, so I started writing a little wrapper shell script. After wasting a fair amount of effort on something that worked OK, I found rlwrap. Guess I should have searched harder in the first place. There's always someone who's already solved your problem.
