New: Oct 14, 2000, minor updates Sept. 1, 2002

Experiments with the Open Source Pthreads Library
and Some Comments

Chapter 10 (Edition 2) introduced the "condition variable" model, which is based upon the POSIX Pthreads standard. Pthreads is available in nearly every UNIX implementation, Linux, Compaq OpenVMS, and other operating systems. Elsewhere in Chapter 10 and these pages, I've commented upon the inescapable need for condition variables, or their equivalent. The Win32 events really are not well designed and cause a large amount of confusion, even among experienced programmers and writers. Furthermore, if Windows had a good Pthreads library, multithreaded code could be easily ported from other platforms.

Fortunately, there is an Open Source implementation, available at http://sources.redhat.com/pthreads-win32. Simply download it, follow the directions, and you have a Pthreads DLL and library. Does it work? Does it have good performance? How is it implemented? Sept 1, 2002: Experience to date indicates very positive answers to all of these questions, and the open source library can be recommended both for Windows-only development and especially when code must be portable to UNIX, Linux, and other systems.

First, here is a quick Pthreads summary.

Quick Pthreads Summary: The most significant aspect of Pthreads for this discussion is that it properly formalizes the condition variable model. Pthreads also supplies mutexes and thread management that are analogous to what Win32 delivers. Thread cancellation is another important Pthreads capability that improves on Win32. See Butenhof, Programming with POSIX Threads, AWL, 1997, for more information (this book is highly recommended). The important features for this discussion are:

  1. There are distinct data types, such as pthread_t, pthread_mutex_t, and pthread_cond_t, for different objects, whereas Win32 simply uses HANDLE everywhere (I prefer the Win32 approach).
  2. There is no direct equivalent to the Win32 event. Rather, there are condition variables (data type pthread_cond_t), and there is no distinction between auto and manual reset. Condition variables most closely resemble auto-reset events, but there are some differences.
  3. There are two functions used to wait on condition variables: pthread_cond_wait and pthread_cond_timedwait. The two functions require both a mutex and a CV argument, and they: a) unlock the mutex (it is an error to make the call without the mutex being locked), b) atomically wait on the CV (possibly with a timeout), and c) lock the mutex. There is no single equivalent Win32 function; you either need to use two (SignalObjectAndWait and WaitForSingleObject) or three (ReleaseMutex and two WaitForSingleObject) Win32 calls (all of this is discussed at the start of Chapter 10 - Second Edition).
  4. CVs are "signaled" using pthread_cond_signal (roughly equivalent to SetEvent on an auto-reset event - but be careful here) and pthread_cond_broadcast (roughly equivalent to PulseEvent on a manual-reset event - be careful here too).
  5. Thread management and mutexes are roughly similar to their Win32 equivalents for our purposes, but it is worth pointing out that all the objects have attributes (thread stack size, whether a mutex is recursive, whether a mutex or CV is process shared, and so on) that, in general, can be mapped to Win32 capabilities.
  6. In summary, Pthreads provides: a) source level portability to numerous platforms, b) an industry standard, c) a complete and well-designed API for multithreaded application development.
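
The condition variable model summarized in the list above can be sketched in a few lines of portable Pthreads code. This is a minimal illustration, not code from the book: the names guard, item_ready, ready_count, consumer_wait_for_item, producer_post_item, and run_cv_demo are hypothetical, and only standard Pthreads calls are used. The point to notice is the while loop around pthread_cond_wait, which retests the predicate after every wakeup.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state; every access is made with guard locked. */
static pthread_mutex_t guard = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t item_ready = PTHREAD_COND_INITIALIZER;
static int ready_count = 0;   /* the CV predicate is (ready_count > 0) */

/* Consumer: block until an item is available, then take it. */
void *consumer_wait_for_item(void *arg)
{
    int *taken = (int *)arg;

    pthread_mutex_lock(&guard);
    while (ready_count == 0) {
        /* Unlocks guard while waiting and relocks it before returning. */
        pthread_cond_wait(&item_ready, &guard);
    }
    assert(ready_count > 0);  /* the predicate holds and we own the mutex */
    ready_count--;
    *taken = 1;
    pthread_mutex_unlock(&guard);
    return NULL;
}

/* Producer: change the predicate under the mutex, then wake one waiter. */
void producer_post_item(void)
{
    pthread_mutex_lock(&guard);
    ready_count++;
    /* pthread_cond_broadcast would wake every waiter instead of one. */
    pthread_cond_signal(&item_ready);
    pthread_mutex_unlock(&guard);
}

/* Run one consumer and one producer; returns 0 on success, -1 on failure. */
int run_cv_demo(void)
{
    pthread_t consumer;
    int taken = 0;

    if (pthread_create(&consumer, NULL, consumer_wait_for_item, &taken) != 0)
        return -1;
    producer_post_item();
    if (pthread_join(consumer, NULL) != 0)
        return -1;
    return (taken == 1 && ready_count == 0) ? 0 : -1;
}
```

Because the consumer tests the predicate before it waits, the sketch behaves the same whether the producer runs before or after the consumer reaches pthread_cond_wait. That property is what makes the model robust, and it is what a Win32 event emulation has to preserve.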

The Open Source implementation passed some quick tests and provided good performance. I used a Pthreads version of the three-stage queuing system, ThreeStage.c (Program 10-5, Edition 2; contact me, jmhart@world.std.com, for the source code, or download the book's example code) to obtain data that extends the performance data in Table C-5, using both the broadcast and signal models (the code is included at the end, and the data is below). So, the library works (it worked right away; there were no problems, and source level portability appears to be feasible), and it gives comparable performance to the other implementations. Note: While the timing results are not shown here, the same tests run under Windows Me show that Me does not scale well when the number of threads is large.

(Sept 1, 2002) The fact that multithreaded applications are portable across a wide range of target architectures and systems is, to me, at least, a remarkable achievement. At one time, not so long ago, multitasking was considered to be among the most system-specific issues confronting an application developer. As someone who has to create portable applications, and port between Windows and UNIX/Linux, I'm grateful that I no longer have to use preprocessor macros to select between POSIX and Win32 code, and I no longer have to create painful emulations of condition variables when porting from UNIX/Linux to Windows.

I also ran the program using macros to implement the Pthreads API. The macros are very simple and are listed after the performance data.

How is the Open Source library implemented? You can read the code; in some cases, it is not simple. For example, look at the condition variable implementation; there is an intricate data structure with thread counters, and only manual-reset (MR) events and SetEvent are used (requiring ResetEvent calls). Incidentally, the Schmidt and Pyarali work (cited at the end of Chapter 10) was influential in the Open Source work. Is the Open Source implementation too complex? Why are the macros so simple? In short, the macros solve only part of a larger problem. Here are some explanatory comments:

  1. Sept. 1, 2002. The complexity is necessary due to the Win32 event model; thus the complexity is more a critique of the Win32 model than of the open source implementation. The open source developers had to do their best with what they were given. However, under WNT, 2000, and XP, SignalObjectAndWait can be used, simplifying some of the code. As W9x and WMe fade into the sunset over time, perhaps this whole question will become moot.
  2. The macros are only used in the context of Chapter 10's "condition variable model," in which CVs are always used in a loop where we "test the CV predicate and test it again" (as Butenhof urges you to do). A general implementation, however, cannot assure that the emulated CVs will always be used properly.
  3. The macros assume that, when a CV is created, it will always be signaled with just one of the two emulated calls: pthread_cond_signal (in which case, SetEvent is used with an auto-reset event) or pthread_cond_broadcast (in which case, PulseEvent is used with a manual-reset event). A general implementation, however, cannot make this assumption and must, therefore, use an MR event (to allow multiple threads to be released), as the event's type cannot be changed once it is created.
  4. The macro implementation does not need to be concerned with attributes.
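
To make the last point concrete, here is a sketch of the kind of attribute handling that a general implementation has to support and that the macros simply ignore (they discard the attribute argument). It is an illustration under stated assumptions, not code from the Open Source library: the helper names init_recursive_mutex, create_thread_with_stack, lock_twice, and run_attr_demo are hypothetical, and the 1 MB stack size is an arbitrary choice. Only standard Pthreads attribute calls are used.

```c
#define _XOPEN_SOURCE 700  /* expose PTHREAD_MUTEX_RECURSIVE under strict compiler modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Create a recursive mutex through a mutex attribute object.
 * Returns 0 on success, otherwise a Pthreads error number. */
int init_recursive_mutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Create a thread with an explicit stack size through a thread attribute
 * object. Returns 0 on success, otherwise a Pthreads error number. */
int create_thread_with_stack(pthread_t *thread, size_t stack_bytes,
                             void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(thread, &attr, start, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Hypothetical thread body: locks the same mutex twice, which is only
 * safe because the mutex was created as recursive. */
void *lock_twice(void *arg)
{
    pthread_mutex_t *mutex = (pthread_mutex_t *)arg;
    int rc = pthread_mutex_lock(mutex);

    assert(rc == 0);
    rc = pthread_mutex_lock(mutex);   /* second lock by the owning thread */
    assert(rc == 0);
    pthread_mutex_unlock(mutex);
    pthread_mutex_unlock(mutex);
    return NULL;
}

/* Exercise both helpers; returns 0 on success, -1 on failure. */
int run_attr_demo(void)
{
    pthread_mutex_t mutex;
    pthread_t thread;

    if (init_recursive_mutex(&mutex) != 0)
        return -1;
    if (create_thread_with_stack(&thread, 1024 * 1024, lock_twice, &mutex) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return (pthread_mutex_destroy(&mutex) == 0) ? 0 : -1;
}
```

Both attributes have natural Win32 counterparts: a stack size can be passed when the thread is created, and Win32 mutexes are already recursive. A general library still has to carry the attribute objects and honor each setting, which is part of the extra machinery the macros avoid.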

In conclusion, there are good reasons why the general purpose Open Source implementation is more complex than the simple macro implementation shown below. Nonetheless, I'm not convinced that the Open Source implementation could not be simplified, although a lot of smart people have worked on it. Personally, I have not had the time to think about it thoroughly. Opinions, comments, etc. appreciated, and watch this space. Furthermore, the Open Source code contains at least one comment about a possible deadlock; this comment is not reassuring. Sept. 1, 2002. I've had email conversations with two of the open source developers, and the above point is discussed at http://sources.redhat.com/ml/pthreads-win32/2002/msg00012.html. I've never had a problem and have used the open source library extensively (but, of course, failure to observe a defect does not prove that a defect does not exist!).

Performance Results for the Open Source Pthreads Emulation

Here is Table C-5 (from Edition 2) extended to include the Open Source Pthreads emulation. Contact me (jmhart@world.std.com) for a copy of ThreeStage.c implemented in Pthreads or download it from the book's example code. The implementation can be configured to use either the macros (next section) or the Open Source emulation, and results for both are included below. 

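The first six columns simply repeat the data that is already in the book. The rightmost two columns contain the new results for the two Pthreads configurations described above: the macro emulation and the Open Source emulation.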

In summary, the Open Source Pthreads emulation provides performance that is fully competitive with what can be achieved using the Win32 API directly (at least, for this one set of tests). Also, existing Pthreads applications can be ported painlessly to Win32.

[Image: tablec5.jpg, Table C-5 extended with the Pthreads results]

Simplified Pthread Emulation with Macros

I used these macros successfully to develop an application targeted for UNIX, Linux, and even OpenVMS, but I was able to develop and debug on my W2000 laptop (a real convenience). Furthermore, pthread_cond_signal was never used, simplifying the CV wait. Here are the macros (Sept. 1, 2002. BEWARE - as stated above, these macros are not a general solution to the problem of emulating Pthreads under Win32).

Nov 21, 2004. I've fixed these slightly and added macros for UNIX I/O. This way, I was able to write a threaded file processing program with a single source file that can be built on UNIX, Linux, and Windows with no source code changes and minimal conditional compilation.

#ifdef _WIN32
/* Windows requires definitions for the POSIX file I/O functions */
#include <io.h>
#define _INTEGRAL_MAX_BITS 64
#define read(fd,pbuffer,count) _read(fd,pbuffer,count)
#define write(fd,pbuffer,count) _write(fd,pbuffer,count)
#define open(fn,flag) _open(fn,flag)
#define open3(fn,flag,mode) _open(fn,flag,mode)
#define close(fd) _close(fd)
#define lseek64(handle,offset,origin) _lseeki64(handle,offset,origin)
#define sleep(t) Sleep(1000*(t))
#define sync() ;
#define off64_t __int64
#define size64_t __int64

#else
#ifdef _LINUX
#define open3(fn,flag,mode) open(fn,flag,mode)
#define lseek64(handle,offset,origin) lseek(handle,offset,origin)
#define FlushFileBuffers(i) 1
#define off64_t long long
#define size64_t long long

#else

#define open3(fn,flag,mode) open(fn,flag,mode)
#define FlushFileBuffers(i) 1
#define size64_t long long

#endif
#endif

#ifdef _WIN32
#define _WIN32_WINNT 0x500 /* Require Windows NT5 (2K, XP, 2K3) */
#include <windows.h>
#include <time.h>
#else
#include <unistd.h>
#include <sys/time.h>
#endif

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

/* Thread_emulation.h */
/* Author: Johnson M. Hart, July 27, 2000 */
/* Emulate the Pthreads model for both Win32 and Pthreads platforms */
/* The emulation is not complete, but it does provide a subset */
/* that will work with many well-behaved programs */
/* IF YOU ARE REALLY SERIOUS ABOUT THIS, USE THE OPEN SOURCE */
/* PTHREAD LIBRARY. YOU'LL FIND IT ON THE RED HAT SITE */
#ifndef _THREAD_EMULATION
#define _THREAD_EMULATION

/* Thread management macros */
#ifdef _WIN32
/* Win32 */
#define _WIN32_WINNT 0x500 /* WINBASE.H - Enable SignalObjectAndWait */
#include <process.h>
#include <windows.h>
#define THREAD_FUNCTION DWORD WINAPI
#define THREAD_FUNCTION_RETURN DWORD
#define THREAD_SPECIFIC_INDEX DWORD
#define pthread_t HANDLE
#define pthread_attr_t DWORD
#define pthread_create(thhandle,attr,thfunc,tharg) (int)((*thhandle=(HANDLE)_beginthreadex(NULL,0,(THREAD_FUNCTION)thfunc,tharg,0,NULL))==NULL)
#define pthread_join(thread, result) ((WaitForSingleObject((thread),INFINITE)!=WAIT_OBJECT_0) || !CloseHandle(thread))
#define pthread_detach(thread) if(thread!=NULL)CloseHandle(thread)
#define thread_sleep(nms) Sleep(nms)
#define pthread_cancel(thread) TerminateThread(thread,0)
#define ts_key_create(ts_key, destructor) {ts_key = TlsAlloc();};
#define pthread_getspecific(ts_key) TlsGetValue(ts_key)
#define pthread_setspecific(ts_key, value) TlsSetValue(ts_key, (void *)value)
#define pthread_self() GetCurrentThreadId()
#else
/* pthreads */
/* Nearly everything is already defined in <pthread.h> */
#include <pthread.h>
#define THREAD_FUNCTION void *
#define THREAD_FUNCTION_RETURN void *
#define THREAD_SPECIFIC_INDEX pthread_key_t
#define thread_sleep(nms) sleep((nms+500)/1000)
#define ts_key_create(ts_key, destructor) pthread_key_create (&(ts_key), destructor);
#endif

/* Synchronization macros: Win32 and Pthreads */
#ifdef _WIN32
#define pthread_mutex_t HANDLE
#define pthread_cond_t HANDLE
#define pthread_mutex_lock(pobject) WaitForSingleObject(*pobject,INFINITE)
#define pthread_mutex_unlock(pobject) ReleaseMutex(*pobject)
#define pthread_mutex_init(pobject,pattr) (*pobject=CreateMutex(NULL,FALSE,NULL))
#define pthread_cond_init(pobject,pattr) (*pobject=CreateEvent(NULL,FALSE,FALSE,NULL))
#define pthread_mutex_destroy(pobject) CloseHandle(*pobject)
#define pthread_cond_destroy(pobject) CloseHandle(*pobject)
#define CV_TIMEOUT INFINITE /* Tunable value */
/* USE THE FOLLOWING FOR WINDOWS 9X */
/* For additional explanation of the condition variable emulation and the use of the
* timeout, see the paper "Batons: A Sequential Synchronization Object" 
* by Andrew Tucker and Johnson M Hart. (Windows Developer’s Journal, 
* July, 2001, pp24 ff. www.wdj.com). */
//#define pthread_cond_wait(pcv,pmutex) {ReleaseMutex(*pmutex);WaitForSingleObject(*pcv,CV_TIMEOUT);WaitForSingleObject(*pmutex,INFINITE);};
/* You can use the following on Windows NT/2000/XP and avoid the timeout */
#define pthread_cond_wait(pcv,pmutex) {SignalObjectAndWait(*pmutex,*pcv,INFINITE,FALSE);WaitForSingleObject(*pmutex,INFINITE);};
//#define pthread_cond_broadcast(pcv) PulseEvent(*pcv)
#define pthread_cond_signal(pcv) SetEvent(*pcv)
static int OnceFlag;
//static DWORD ThId; /* This is ugly, but is required on Win9x for _beginthreadex */
#else
/* Not Windows. Assume pthreads */

#endif

#endif