Project Milestone Report:

Transactional Workload Characterization in PostgreSQL and in Oracle

 
Nikos Hardavellas (hardav@cs.cmu.edu)
Stephen Somogyi (ssomogyi@ece.cmu.edu)

April 2, 2004

Project Home Page
 

Major Changes

Our initial project (submitted March 3, 2004) proposed intelligent buffer pool management in distributed shared-memory multiprocessors. Unfortunately, we quickly discovered that tuning PostgreSQL was a far more involved effort than originally thought. Considering this, and given that our regular research (i.e. not class related) requires development of a TPC-C kit for Oracle, we decided to change the focus of our project. This new project consists of (a) installation, configuration, and tuning of both PostgreSQL and Oracle for TPC-C, and (b) characterization - both from a hardware and software perspective - of these workloads.

This will result in two major benefits over the original proposal. First, the characterization is something that we have been interested in performing for quite some time. However, as it is not directly related to our research, finding time for it was difficult. In retrospect (already!), this characterization is suited perfectly to a class project. Second, we will develop checkpoints for Simics, that are tuned configurations of PostgreSQL and Oracle for TPC-C. These will be extremely useful for our ongoing research into memory streaming in multiprocessor systems, and will be used by other groups as well.

Accomplishments

We acquired the TPC-C kit and adapted it to the newest version of PostgreSQL (compilation of the kit was broken, because of some type changes, but was relatively straightforward to correct). Significant time has been spent tuning PostgreSQL on a real machine (linux, dual x86 processors). Oracle has been acquired.

Milestone

The goals for the milestone were almost met. The missing piece is that PostgreSQL is not tuned on Solaris yet (only on Linux). This is low risk because PostgreSQL is known to perform similarly (given comparable hardware, of course) on the Solaris and Linux platforms.

Surprises

  1. Oracle (i.e. the physical media) was more difficult to acquire than anticipated. Since CMU has a site license, we expected this to be painless.
  2. PostgreSQL is still not entirely behaving itself. When running TPC-C, the system seems to be in one of two modes: active (approximately 5% CPU idle time) and broken (all processes are still running, but the CPUs are 100% idle). We have not yet determined the cause of these inactive periods, although it is likely they are due to locking or checkpointing.

Revised Schedule

Week
Tasks
4/5
Install and test Orcale. Adapt TPC-C kit for Oracle. Tune TPC-C on Oracle. Install and tune TPC-C on PostgreSQL running on Solaris. Axis 1 characterization of PostgreSQL.
4/12
Finalize tuning TPC-C on Oracle. Axis 1 characterization of Oracle. Install PostgreSQL on Simics/SimFlex. Warmup and obtain checkpoints. Axis 2 characterization of PostgreSQL.
4/19
Install Oracle on Simics/SimFlex. Warmup and obtain checkpoints. Axis 2 characterization of Oracle.
4/26
Oracle vs. PostgreSQL comparison. Final report and poster presentation.

Resources

We are supposed to get Oracle on April 2nd. Assuming this happens, we will have all resources that are required.