Copyright © 2006 Julio M. Merino Vidal
Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt.)
Table of Contents
C++ is a very powerful object oriented programming language: it is compiled into native code, supports generics through templates, allows multiple inheritance as well as lots of other programming techniques. However, a language on its own is not that useful: it has to provide some libraries that aid the developer to accomplish his task at hand. In this sense, the standard C++ library is currently limited to managing data collections and data streams.
If one looks at other languages and platforms — e.g. Java and .NET —, it is easy to see that they provide much more functionality in the standard APIs than C++ does. These additional features ease the developer's job a lot because he can focus on the problem he has to solve. They also make the language much more useful out of the box, attracting more developers that need to quickly implement some code.
It is clear that the standard C++ library lacks abstraction layers for the functionality provided by the underlying operating system. Developers must directly call the underlying platform to achieve many trivial tasks such as directory iteration, process management, interprocess communication or network access. This is a problem because the libraries included in the operating system (such as the POSIX libc or the Win32 API) do not integrate well with C++: they are not object oriented and are not developed following good C++ practices. As a result, applications using them become difficult to maintain, are unportable and grow fragile classes to interact with low-level details.
Although there are libraries that cover some of these gaps, there are still many subsystems for which no abstraction exists yet. One of these is process management. Many applications need to interact with other programs either because they are front-ends or simply because they need to take advantage of their functionality. This can make developers choose some language other than C++ that eases process interaction simply because they do not want to deal with low-level operating system details.
Boost has already addressed operating system abstraction layers area by providing some libraries that give access to specific functionality in a portable way. These libraries aim at the eventual inclusion in the C++ standard. For example, consider the Boost.Filesystem and Boost.Thread libraries.
It is clear that there is a need for a library to manage processes, and this library is well suited for eventual inclusion in Boost. As time passes, it might also be proposed for addition into one of the future C++ library revisions.
The Boost Process library provides a framework to manage processes and communicate with them using standard C++ streams. The framework is flexible enough to handle operating system specific functionality while keeping a minimum common portable set to be used in applications that need to run in different operating systems.
The following list of items describe the features provided by the library.
In the long term, Boost.Process' most inner goal is to provide a portable abstraction layer over the operating system that allows the programmer to manage any running processes, not only those spawned by it. The library is designed to allow this future extension, although it currently does not provide the necessary functionality to manage non-child processes due to its complexity. In other words, the library currently focuses on child process management only.
Boost.Process' most important feature is the ability to painlessly launch external applications, control them during their lifetime and set up communication channels between them and their parent. Traditionally, this has been a boring and hard task in C and C++ because there are many details to take into account: create the new process, launch the external binary, set up anonymous pipes to communicate with it, wait for its termination, inspect its exit status, etc.
To make things worse, each major operating system has its own process model and API to manage child processes. Writing an application that deals with both is time consuming and in most cases drives away the developer from his original goal — not related at all to interacting with the underlying operating system. Therefore, Boost.Process provides an abstraction to handle all these details on behalf of the developer.
An application launching a child process typically wants to communicate
with it by means of data transfers. This IPC happens at the level of file
handles and typically involves the standard input (
stdin ), the standard output
stdout ) and the standard error output (
stderr ). If the operating system supports it
(e.g. Unix), other unnamed data streams can be redirected.
In order to properly integrate with the standard C++ library, this IPC is morphed into the standard C++ streams making interaction with child processes a piece of cake.
Related to this, a process may emit data through multiple streams. The most typical example are applications that print regular messages through their standard output and error messages through the standard error output. However, it is sometimes desired to forget about all distinctions and treat several of these streams as a single one. Hence, the library allows merging different streams at the operating system level. This involves much less coding and overhead at the final application than manually handling different streams, because the receiver will be presented a single data flow.
Boost.Process allows a process to be managed in several different operation modes; these are described below:
The library provides the necessary functionality to model pipelines and handle them. A pipeline is a unidimensional chain of interconnected processes in which the output of one of them is connected to the input of the following.
Boost.Process is constructed with portability in mind. Both the POSIX process management model and the Win32 one are taken into account in the library and are supported by its API. More details are available in the platforms and compilers chapter.
To achieve this goal, the library provides a common and system-agnostic API that lets the developer manage processes without ever knowing the operating system the code is running under. However, it is a fact that each operating system provides specific and useful features that are not portable to others; these must not be banned to developers if they need them. Therefore, the library provides a non-portable way to access these features that is clearly separated from the portable API to avoid introducing portability bugs by mistake.
Boost.Process is modelled around a set of abstract concepts that are implemented in the form of C++ templates and classes. This allows the developer to replace any part of the library with custom implementations as long as they match the requirements of each concept.
This comes helpful if there is a need for extreme efficiency or highly OS-specific functionality not provided by the standard classes.
The library comes with a very complete set of regression tests to ensure that it can be correctly used in new code and that it behaves according to its specifications. These tests try to be as complete as possible and are an excellent tool to verify that the library works correctly on a new platform.
Although the library was not developed following the test driven development methodology, it is still interesting to have as much automated tests as possible. If the library misbehaves in some area, it should first be proved wrong by a new test and only then fixed to pass it.
The library is implemented with efficiency in mind although this is not a primary goal of its development. Improvements to efficiency are of course welcome but are second class items in front of the other goals. However, it is understandable that efficiency can be a very important feature for some developers. So, as mentioned in the modularity requirement, the library user is allowed to replace parts of it to improve the areas that may not be efficient enough.
Last revised: August 22, 2006 at 16:59:48 +0200