Header <boost/persistent.hpp>

Introduction

Often, data must be stored for longer than the current program execution. C++ provides the I/O library (clause 27 of the C++ standard, [lib.input.output]) for that purpose. It is extensible: users can overload operator<< and operator>> for their own classes and thus store the values of previously unknown objects persistently.

Traditionally, these shift operators have been used to provide user-readable input and output. However, there is no guarantee that the output of operator<< can be used by operator>> to produce an object equivalent to the one written. For example, a string with an embedded newline is written verbatim, but reading it back with operator>> stops at the whitespace and therefore yields two strings.
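The round-trip problem can be reproduced with the standard stream operators alone. The following minimal sketch writes one string with operator<< and reads it back with operator>>; the helper name round_trip_with_shift is made up for this illustration and is not part of the header.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of <boost/persistent.hpp>): writes s with
// operator<<, then reads a single string back with operator>>.
std::string round_trip_with_shift(const std::string& s)
{
    std::stringstream strm;
    strm << s;            // written verbatim, including any whitespace
    std::string result;
    strm >> result;       // extraction stops at the first whitespace character
    return result;
}
```

A string without whitespace survives the round trip, while "two\nlines" comes back as just "two"; the remainder would be delivered by a second extraction.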

In contrast, the framework provided in this header file has been designed to allow storage and exact reproduction of objects. It tries to separate two orthogonal concepts as clearly as possible: first, iteration through some data structure; second, reversible encoding of individual data items for external storage.

Reversible encoding and output is handled by a Writer object, which must provide a write(const T&) member function for each data type T it wishes to handle. Likewise, input and decoding is handled by a Reader object, which must provide a read(T&) member function for each data type T it wishes to handle. Reader and Writer should agree on what data types they can handle.
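To make the Writer and Reader requirements concrete, here is a minimal sketch of a matching pair. The class names toy_writer and toy_reader and the helper toy_round_trip are invented for this example; they only illustrate the required write(const T&) and read(T&) member functions and are not the classes shipped in the header.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical Writer: a single write(const T&) template that forwards to
// operator<< and appends a separator.
class toy_writer
{
public:
    explicit toy_writer(std::ostream& strm) : strm_(strm) {}
    template<class T> void write(const T& x) { strm_ << x << ' '; }
private:
    std::ostream& strm_;
};

// Hypothetical Reader: the mirror image, with read(T&) for the same types.
class toy_reader
{
public:
    explicit toy_reader(std::istream& strm) : strm_(strm) {}
    template<class T> void read(T& x) { strm_ >> x; }
private:
    std::istream& strm_;
};

// Writes an int and a long, then reads them back in the same order.
inline bool toy_round_trip(int a, long b)
{
    std::stringstream strm;
    toy_writer w(strm);
    w.write(a);
    w.write(b);

    int a2 = 0;
    long b2 = 0;
    toy_reader r(strm);
    r.read(a2);
    r.read(b2);
    return a2 == a && b2 == b;
}
```

The pair agrees on the supported types simply because both sides are templates over the same stream operators; a Reader that lacked a read overload for some type the Writer accepts would break the round trip.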

Check out persistence_demo.cpp to get an impression of its features.

Synopsis

template<class Desc, class T>
void describe(Desc & desc, T& x);

template<class Writer, class ForwardIterator>
void save_sequence(Writer writer, ForwardIterator first, ForwardIterator last);
template<class Writer, class T>
void save(Writer writer, const T& x);
template<class Writer, class Container>
void save_file(const Container & cont, const std::string & filename);

template<class T, class Reader, class OutputIterator>
void load_sequence(Reader reader, OutputIterator out, std::size_t n);
template<class Reader, class T>
void load(Reader reader, T& x);
template<class Reader, class Container>
void load_file(Container & cont, const std::string & filename);

class shift_writer;
class shift_reader;
class binary_writer;
class binary_reader;

Saving data

The various save functions iterate through the provided data structure x of type T, leaving encoding and output of built-in types and types they have not been specialized for to the given Writer.

Overloaded save functions are provided for std::vector, std::list, std::deque, std::map, std::multimap, std::set, std::multiset, and std::pair. There is also a function to store an iterator range.
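As a rough illustration of how a container overload can delegate to the Writer, the sketch below stores an iterator range as an element count followed by the elements. The names text_writer, sketch_save and sketch_save_range are invented for this example, and the length-prefixed layout is an assumption, not necessarily the format the header produces.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical text writer: every item is followed by a single space.
class text_writer
{
public:
    explicit text_writer(std::ostream& strm) : strm_(strm) {}
    template<class T> void write(const T& x) { strm_ << x << ' '; }
private:
    std::ostream& strm_;
};

// Leaf case: hand the item to the writer unchanged.
template<class Writer, class T>
void sketch_save(Writer writer, const T& x) { writer.write(x); }

// Range case: element count first (assumed layout), then each element.
template<class Writer, class ForwardIterator>
void sketch_save_range(Writer writer, ForwardIterator first, ForwardIterator last)
{
    writer.write(std::distance(first, last));
    for (; first != last; ++first)
        sketch_save(writer, *first);
}

// Container overloads only iterate; encoding stays in the writer.
template<class Writer, class T, class Alloc>
void sketch_save(Writer writer, const std::vector<T, Alloc>& v)
{
    sketch_save_range(writer, v.begin(), v.end());
}

template<class Writer, class T, class Alloc>
void sketch_save(Writer writer, const std::list<T, Alloc>& l)
{
    sketch_save_range(writer, l.begin(), l.end());
}
```

With this sketch, saving a vector holding 3, 1, 4 through text_writer produces the text "3 3 1 4 ": the count, then the elements, each followed by a space.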

The convenience function save_file has the effects as if:

  std::ofstream file(filename.c_str());
  file.exceptions(std::ios::failbit|std::ios::badbit);
  // implementation-defined structure identifier for the container
  file << /* ... */;
  save(Writer(file), cont);

Loading data

The various load functions fill the provided data structure x of type T, leaving input and decoding of built-in types and types they have not been specialized for to the given Reader.

The load function overloads are provided for the same types as the save ones.
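The synopsis lists T as the first template parameter of load_sequence, which suggests it is meant to be named explicitly. The sketch below shows why that is useful: a generic output iterator does not expose the element type, so the caller has to supply it. The names text_reader, sketch_load_sequence and sketch_load are hypothetical, and reading an element count first is an assumed layout.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical reader: extracts whitespace-separated items with operator>>.
class text_reader
{
public:
    explicit text_reader(std::istream& strm) : strm_(strm) {}
    template<class T> void read(T& x) { strm_ >> x; }
private:
    std::istream& strm_;
};

// T is named explicitly by the caller because it cannot be deduced
// from the output iterator.
template<class T, class Reader, class OutputIterator>
void sketch_load_sequence(Reader reader, OutputIterator out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        T element = T();
        reader.read(element);
        *out++ = element;
    }
}

// Reads an element count (assumed layout), then that many elements.
template<class Reader, class T, class Alloc>
void sketch_load(Reader reader, std::vector<T, Alloc>& v)
{
    std::size_t n = 0;
    reader.read(n);
    v.clear();
    sketch_load_sequence<T>(reader, std::back_inserter(v), n);
}
```

Feeding the text "3 3 1 4 " to sketch_load fills a std::vector<int> with the three elements 3, 1 and 4.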

The convenience function load_file has the effects as if:

  std::ifstream file(filename.c_str());
  file.exceptions(std::ios::failbit|std::ios::badbit);
  // read and verify the implementation-defined structure identifier for the container
  file >> /* ... */;
  load(Reader(file), cont);
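One consequence of the exceptions() call in both convenience functions is that a file that cannot be opened is reported by an exception rather than by a silent stream error. The standalone sketch below shows this standard iostream behavior; the function name open_throws and the file name used with it are placeholders chosen for this example.

```cpp
#include <cassert>
#include <exception>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical probe: returns true if opening filename for reading and
// then enabling stream exceptions throws.
bool open_throws(const std::string& filename)
{
    try {
        std::ifstream file(filename.c_str());
        // If the open failed, failbit is already set, so enabling it in
        // the exception mask throws std::ios_base::failure right here.
        file.exceptions(std::ios::failbit | std::ios::badbit);
        return false;
    } catch (const std::exception&) {
        return true;
    }
}
```

Callers of load_file and save_file should therefore be prepared to catch std::ios_base::failure (or a base class of it) for missing files as well as for malformed content.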

Specifying Encoding/Decoding and Input/Output for Simple Types

Synopsis

class shift_writer
{
public:
  explicit shift_writer(std::ostream & strm, char delim_chr = '"',
                        char escape_chr = '\\');
  template<class T> void write(const T& x);
  void write(double d);
  void write(const std::string & s);
};

class shift_reader
{
public:
  explicit shift_reader(std::istream & strm, char delim_chr = '"',
                        char escape_chr = '\\');
  template<class T> void read(T & x);
  void read(std::string & x);
};

class binary_writer
{
public:
  explicit binary_writer(std::ostream & strm);
  template<class T> void write(const T& x);
  void write(const std::string& x);
};

class binary_reader
{
public:
  explicit binary_reader(std::istream & strm);
  template<class T> void read(T& x);
  void read(std::string & x);
};

Description

shift_writer and shift_reader use the traditional operator<< and operator>> for input/output and encoding/decoding. Data elements are separated by a single space on output; whitespace after each data element is ignored on input. shift_writer::write(double) ensures that the full precision of the given floating-point number is output. std::string is encoded specially: the string is enclosed in delim_chr, and any occurrence of delim_chr or escape_chr within the string is prefixed (escaped) by escape_chr.
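The string encoding described above can be sketched as a pair of free functions. The names sketch_encode, sketch_decode and sketch_encode_double are hypothetical, and the code is an illustration of the scheme rather than the header's implementation. The double helper shows one way to preserve full precision, assuming a C++11 library that provides max_digits10.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Encodes s as: delim, escaped characters, delim.
std::string sketch_encode(const std::string& s, char delim = '"', char escape = '\\')
{
    std::string out(1, delim);
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] == delim || s[i] == escape)
            out += escape;        // prefix the special character
        out += s[i];
    }
    out += delim;
    return out;
}

// Decodes one encoded string that starts with the opening delimiter.
std::string sketch_decode(const std::string& enc, char delim = '"', char escape = '\\')
{
    std::string out;
    // Skip the opening delimiter, stop at the first unescaped delimiter.
    for (std::string::size_type i = 1; i < enc.size() && enc[i] != delim; ++i) {
        if (enc[i] == escape && i + 1 < enc.size())
            ++i;                  // take the next character literally
        out += enc[i];
    }
    return out;
}

// One way to keep full precision for a double (assumes C++11).
std::string sketch_encode_double(double d)
{
    std::ostringstream strm;
    strm << std::setprecision(std::numeric_limits<double>::max_digits10) << d;
    return strm.str();
}
```

Because whitespace inside the delimiters is ordinary content, strings with embedded spaces or newlines survive the round trip, which plain operator<< and operator>> cannot guarantee.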

binary_writer and binary_reader use the binary in-memory representation of the data elements as the encoding, including compiler-dependent padding (if any). std::string is treated like std::vector<char>.
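A binary encoding of this kind can be sketched as follows. The classes raw_writer and raw_reader and the struct padded are hypothetical, and storing a string as its length followed by its characters is an assumption about what "treated like std::vector<char>" means. Copying the object representation is only meaningful for trivially copyable types, and the resulting files are not portable across compilers or platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical binary writer: copies the object representation verbatim,
// padding bytes included.
class raw_writer
{
public:
    explicit raw_writer(std::ostream& strm) : strm_(strm) {}
    template<class T> void write(const T& x)
    {
        strm_.write(reinterpret_cast<const char*>(&x), sizeof(T));
    }
    // Assumed layout for strings: element count, then the characters.
    void write(const std::string& s)
    {
        write(s.size());
        strm_.write(s.data(), static_cast<std::streamsize>(s.size()));
    }
private:
    std::ostream& strm_;
};

// Hypothetical binary reader: the exact inverse of raw_writer.
class raw_reader
{
public:
    explicit raw_reader(std::istream& strm) : strm_(strm) {}
    template<class T> void read(T& x)
    {
        strm_.read(reinterpret_cast<char*>(&x), sizeof(T));
    }
    void read(std::string& s)
    {
        std::string::size_type n = 0;
        read(n);
        s.resize(n);
        if (n > 0)
            strm_.read(&s[0], static_cast<std::streamsize>(n));
    }
private:
    std::istream& strm_;
};

// Example type that typically has padding between its members.
struct padded { char tag; int value; };
```

Writing a padded object emits sizeof(padded) bytes, not the sum of the member sizes, which is what "including compiler-dependent padding" amounts to in practice.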

Note: The framework does not provide for comment removal or other features useful for configuration file parsing.

Adapting the framework for user-defined types

Adaptation for user-defined types can be provided at several stages of the encoding and decoding processes, depending on the requirements of the type to be coded. The exposition below uses the following data structures:

struct user_type {
  int i;
  template<class Desc>
  void describe(Desc & d) { d & i; }
};
std::vector<user_type> v;

Calling save(writer, v) calls the following functions (omitting the Writer template parameters):
  1. save<T, Alloc>(writer, v) with T = std::vector<user_type>
  2. save_sequence<fwd iter of vector>(writer, v.begin(), v.end(), v.size());
  3. writer.start_sequence(v.size());
  4. for each element in v: save<T>(writer, element) with T = user_type
  5. T is not an arithmetic type, thus:
  6. save_descriptor d(writer); describe<T>(d, element) with T = user_type
  7. user_type::describe(save_descriptor)
  8. save<T>(writer, i) with T = int
  9. T is an arithmetic type, thus:
  10. writer.write(i);
  11. ...
  12. writer.end_sequence()
All global functions are called without qualification, so Koenig lookup (argument-dependent lookup) applies and you can overload them for your own types. Overloading is possible at each level. Because automatic deduction of user_type's layout is impossible in C++, you have to provide at least one of the above functions. Always choose the option with the highest number that fulfills your needs.
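The call chain can be traced with a small self-contained model. Everything prefixed with sk_ and the trace_writer class are hypothetical stand-ins for the header's functions, the dispatch on std::is_arithmetic assumes a C++11 <type_traits>, and the const_cast is a shortcut of this sketch; the real header may solve these points differently.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical writer that records the calls it receives as text.
class trace_writer
{
public:
    explicit trace_writer(std::ostream& strm) : strm_(strm) {}
    void start_sequence(std::size_t n) { strm_ << "[" << n << ":"; }
    void end_sequence() { strm_ << "]"; }
    template<class T> void write(const T& x) { strm_ << x << ","; }
private:
    std::ostream& strm_;
};

template<class Writer, class T> void sk_save(Writer writer, const T& x);

// Step 6: the descriptor forwards every member named via operator&.
template<class Writer>
class sk_save_descriptor
{
public:
    explicit sk_save_descriptor(Writer writer) : writer_(writer) {}
    template<class T> sk_save_descriptor& operator&(const T& x)
    {
        sk_save(writer_, x);      // step 8
        return *this;
    }
private:
    Writer writer_;
};

// Default describe(): call the member function (step 7).
template<class Desc, class T>
void sk_describe(Desc& desc, T& x) { x.describe(desc); }

// Steps 9-10: arithmetic types go straight to the writer.
template<class Writer, class T>
void sk_save_impl(Writer writer, const T& x, std::true_type) { writer.write(x); }

// Steps 5-6: everything else is described member by member.
template<class Writer, class T>
void sk_save_impl(Writer writer, const T& x, std::false_type)
{
    sk_save_descriptor<Writer> d(writer);
    sk_describe(d, const_cast<T&>(x));   // sketch shortcut, nothing is modified
}

template<class Writer, class T>
void sk_save(Writer writer, const T& x)
{
    sk_save_impl(writer, x, typename std::is_arithmetic<T>::type());
}

// Steps 1-4 and 12: the vector overload brackets its elements.
template<class Writer, class T, class Alloc>
void sk_save(Writer writer, const std::vector<T, Alloc>& v)
{
    writer.start_sequence(v.size());
    for (typename std::vector<T, Alloc>::const_iterator it = v.begin(); it != v.end(); ++it)
        sk_save(writer, *it);
    writer.end_sequence();
}

struct sk_user_type {
    int i;
    template<class Desc>
    void describe(Desc& d) { d & i; }
};
```

Saving a vector of two sk_user_type objects with i equal to 5 and 8 makes trace_writer record "[2:5,8,]", which mirrors the numbered steps: one start_sequence, one write per int member, one end_sequence.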

Loading works analogously, with save() replaced by load() and writer replaced by reader.

Writing describe() functions

describe() functions are useful for compound types such as structs. Writing a describe() function for some type is similar to writing an iostream-based operator<< for that type, except that the passed-in descriptor is used on the leftmost side and operator& is used instead of operator<<.
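The comparison with operator<< can be made concrete with a small sketch. The type point3 and the descriptor classes out_descriptor and in_descriptor are hypothetical and much simpler than whatever the header uses internally. The sketch also illustrates a benefit of the operator& form: a single describe() can serve both output and input, because the descriptor decides what happens to each member.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct point3 { int x; int y; int z; };   // hypothetical compound type

// One describe() lists the members; the descriptor decides what to do.
template<class Desc>
void describe(Desc& d, point3& p) { d & p.x & p.y & p.z; }

// Hypothetical descriptor that writes each member with operator<<.
class out_descriptor
{
public:
    explicit out_descriptor(std::ostream& strm) : strm_(strm) {}
    template<class T> out_descriptor& operator&(T& x) { strm_ << x << ' '; return *this; }
private:
    std::ostream& strm_;
};

// Hypothetical descriptor that fills each member with operator>>.
class in_descriptor
{
public:
    explicit in_descriptor(std::istream& strm) : strm_(strm) {}
    template<class T> in_descriptor& operator&(T& x) { strm_ >> x; return *this; }
private:
    std::istream& strm_;
};

// The equivalent iostream-style operator<<, for comparison.
std::ostream& operator<<(std::ostream& os, const point3& p)
{
    return os << p.x << ' ' << p.y << ' ' << p.z << ' ';
}
```

Here describe() with an out_descriptor produces the same text as operator<<, and the same describe() with an in_descriptor reads the three members back, so no separate operator>> has to be written and kept in sync.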

History and Acknowledgements

Thanks to Beman Dawes for the original idea and further suggestions for improvement. (This section needs updating!)


Jens Maurer, 2000-11-18