SPECweb2005 Release 1.20 E-commerce Workload Design Document

Version 1.20, Last modified 04/05/2006


Overview

The E-commerce workload in SPECweb2005 is designed to simulate a Web server that sells computer systems; this includes allowing end users to search, browse, customize, and purchase products.  The workload was developed by analyzing log files of actual E-commerce sites, as well as browsing popular Web stores to gather statistics such as average page size, image sizes and access frequencies (including If-Modified-Since caching from the browser side), and capturing form data a user typically fills out when purchasing products.

Dynamic pages

By observing the requests to typical E-commerce Web stores, it is apparent that three distinct phases are involved in an E-commerce transaction; these are described below.

Phase 1: Browse

The first phase involves a user visiting the home page, possibly selecting a country or region (and thus being directed to the appropriate localized set of pages), selecting the type of customer (home user; small/medium/large business; local/state/federal government; education; healthcare), and being redirected to the appropriate section of products for that customer.  The customer must then select the type of product they're interested in and choose a specific product model.  Alternatively, the customer can use the site's search functionality to find products of interest.

The dynamic pages in this phase must:

The name and purpose of each dynamic page in this phase are listed below.

Phase 2: Customization

The second phase is configuring a product, which requires user interaction via form submissions.  First, the user selects the internal components (e.g. processor speed, amount of memory, hard drive size); the next page offers further customizations, such as the type of warranty, service, and support to include; finally, the user selects any optional accessories (cables, printers, software, etc.).

The dynamic pages in this phase must:

This phase contains one page, but it is requested multiple times, as it returns different customizations depending upon which stage the user is in (1, 2, or 3).

Phase 3: Purchase

The third and final phase is the actual "E-commerce" phase.  Once the user has configured the product to his/her liking, the user adds the product to the shopping cart.  The cart page allows users to change item quantities, remove items, or save the cart for future retrieval.  When the user clicks "Checkout" from the cart, the session must become secure (this is accomplished via a redirect into HTTPS).  There are multiple pages in this SSL stage: login/registration, entering billing and shipping information, payment details, verifying and submitting the order, and the confirmation page.

The dynamic pages in this phase must handle:

The name and function of each page are described below:

Markov chain

SPECweb2005 is based upon a page-based model; that is, it issues a request to a dynamic page and requests all the images that would normally exist within the page as HTML image tags.  A Markov chain in the harness allows simulation of the relative page request frequencies as seen from the server side.  This is represented in the prime client's SPECweb_Ecommerce.config (see the STATE_n lines).  Below is a diagram of the likelihood of transitioning from one state into another:
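As a rough illustration of how such a chain can drive page selection, the Python sketch below walks a small transition table.  The state names and probabilities are hypothetical placeholders, not the values from the STATE_n lines of SPECweb_Ecommerce.config, and the helper names next_page and simulate_session are invented for this example.

```python
import random

# Hypothetical transition table: each state maps to (next_state, probability)
# pairs. These names and numbers are illustrative only; the real values live
# in the STATE_n lines of SPECweb_Ecommerce.config.
TRANSITIONS = {
    "index": [("search", 0.2), ("browse", 0.7), ("exit", 0.1)],
    "search": [("browse", 0.6), ("index", 0.2), ("exit", 0.2)],
    "browse": [("customize", 0.5), ("index", 0.3), ("exit", 0.2)],
    "customize": [("cart", 0.6), ("browse", 0.2), ("exit", 0.2)],
    "cart": [("exit", 1.0)],
}


def next_page(state, rng):
    """Pick the next state according to the current state's transition row."""
    targets, weights = zip(*TRANSITIONS[state])
    return rng.choices(targets, weights=weights, k=1)[0]


def simulate_session(rng, start="index", max_steps=50):
    """Walk the chain until "exit" is reached or max_steps pages are visited."""
    path = [start]
    while path[-1] != "exit" and len(path) < max_steps:
        path.append(next_page(path[-1], rng))
    return path


if __name__ == "__main__":
    print(simulate_session(random.Random(2005)))
```

Each row of the table sums to 1, so over many simulated sessions the relative frequency of each page converges on what the chain's probabilities imply.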

Static file set

The static portion of the E-commerce file set is generated by Wafgen.  Each workload has a fixed file set and a file set that scales with the number of simultaneous user sessions requested.

Fixed file set

The fixed file set consists of two types of files: images that an HTML page would reference via <IMG> tags (and that a browser would request upon receipt of a page), and "padding".  Padding consists of random text that is inserted at the bottom of a dynamic page to bring the page size up to what was observed with real-world E-commerce Web pages (which have, among other things, JavaScript code and numerous layout tags).  The page image sizes of the fixed file set were determined by analyzing and averaging observed file sizes.  The sizes range from very small (less than 100 bytes) to ~22 KB; the former are usually "spacer" images used throughout the site for aligning tables, while the latter tend to be photo-quality JPEG images.  The page images used in the E-commerce workload, along with their sizes and the percentage of requests for which the browser already holds a cached copy (i.e. the request receives an HTTP 304 Not Modified response from the SUT), are listed in the table below.

File Name   Size (bytes)   304 Request %
homepage1 43 30%
homepage2 48 30%
homepage3 54 30%
homepage4 78 30%
homepage5 82 20%
homepage6 147 30%
homepage7 166 30%
homepage8 167 30%
homepage9 173 30%
homepage10 187 30%
homepage11 592 30%
homepage12 616 30%
homepage13 738 30%
homepage14 1,022 30%
homepage15 1,186 30%
homepage16 1,259 30%
homepage17 1,360 30%
homepage18 1,550 30%
homepage19 1,593 30%
homepage20 1,761 30%
homepage21 1,809 30%
homepage22 18,346 20%
homepage23 19,207 20%
homepage24 21,838 20%
homepage25 22,612 20%
browse1 443 30%
browse2 59 30%
browse3 62 30%
browse4 63 30%
browse5 71 30%
browse6 132 30%
browse7 175 30%
browse8 224 30%
browse9 243 30%
browse10 523 30%
browse11 917 30%
browse12 1,167 30%
browse13 1,347 30%
browse14 1,402 30%
browse15 1,480 30%
browse16 1,504 30%
browse17 1,621 30%
browse18 1,738 30%
browse19 2,208 20%
browse20 11,918 20%
browse21 11,972 20%
browse22 14,408 20%
browse23 14,525 20%
browse24 14,748 20%
browse25 15,750 20%
browse_productline1 49 30%
browse_productline2 72 30%
browse_productline3 185 30%
browse_productline4 1,423 30%
browse_productline5 2,176 30%
browse_productline6 3,140 10%
browse_productline7 6,828 10%
browse_productline8 8,210 10%
browse_productline9 9,461 10%
browse_productline10 10,633 10%
browse_productline11 10,774 10%
browse_productline12 11,044 10%
productdetail1 43 30%
productdetail2 58 30%
productdetail3 71 30%
productdetail4 49 30%
productdetail5 121 30%
productdetail6 132 30%
productdetail7 187 30%
productdetail8 187 30%
productdetail9 2,154 10%
productdetail10 3,521 10%
customize1 43 30%
customize2 43 30%
customize3 49 30%
customize4 95 30%
customize5 114 30%
customize6 370 30%
customize7 1,373 10%
customize8 1,936 10%
customize9 1,994 10%
cart1 43 0%
cart2 43 0%
cart3 43 0%
cart4 44 0%
cart5 44 0%
cart6 48 0%
cart7 50 0%
cart8 57 0%
cart9 61 0%
cart10 65 0%
cart11 82 0%
cart12 83 0%
cart13 91 0%
cart14 97 0%
cart15 98 0%
cart16 130 0%
cart17 136 0%
cart18 223 0%
cart19 243 0%
cart20 251 0%
cart21 274 0%
cart22 278 0%
cart23 280 0%
cart24 283 0%
cart25 319 0%
cart26 329 0%
cart27 362 0%
cart28 363 0%
cart29 385 0%
cart30 523 0%
cart31 621 0%
cart32 1,848 0%
cart33 1,980 0%
cart34 7,894 0%
cart35 8,240 0%
cart36 8,255 0%
cart37 8,484 0%
cart38 8,886 0%
cart39 8,914 0%
cart40 43 0%
checkout1 43 0%
checkout2 43 0%
checkout3 43 0%
checkout4 44 0%
checkout5 48 0%
checkout6 59 0%
checkout7 59 0%
checkout8 59 0%
checkout9 60 0%
checkout10 65 0%
checkout11 82 0%
checkout12 83 0%
checkout13 97 0%
checkout14 102 0%
checkout15 136 0%
checkout16 217 0%
checkout17 356 0%
checkout18 385 0%
checkout19 523 0%
checkout20 1,648 0%
checkout21 1,913 0%
shipping1 61 0%
shipping2 378 0%
shipping3 515 0%
shipping4 518 0%
shipping5 922 0%
billing1 50 0%
billing2 159 0%
billing3 519 0%
billing4 897 0%
billing5 1,069 0%
search1 43 30%
search2 67 30%
search3 185 30%
search4 205 30%
search5 370 30%
search6 523 30%
search7 1,731 30%
confirm1 43 0%
confirm2 44 0%
confirm3 80 0%
confirm4 92 0%
confirm5 97 0%
confirm6 98 0%
confirm7 130 0%
confirm8 135 0%
confirm9 135 0%
confirm10 135 0%
confirm11 223 0%
confirm12 586 0%
confirm13 1,120 0%
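
To show how the 304 percentages affect the bytes actually served, the Python sketch below computes the expected image payload for one view of the customize page, using the nine customize rows from the table above.  It assumes that every image is requested on each page view and that a 304 response carries no body; header bytes are ignored.  The function name expected_body_bytes is invented for this example.

```python
# (size in bytes, fraction of requests answered with 304) for the nine
# customize images, copied from the table above.
CUSTOMIZE_IMAGES = [
    (43, 0.30), (43, 0.30), (49, 0.30), (95, 0.30), (114, 0.30),
    (370, 0.30), (1373, 0.10), (1936, 0.10), (1994, 0.10),
]


def expected_body_bytes(images):
    """Average image payload per page view, counting a 304 as zero body bytes."""
    return sum(size * (1.0 - p304) for size, p304 in images)


if __name__ == "__main__":
    total = sum(size for size, _ in CUSTOMIZE_IMAGES)
    print(total, round(expected_body_bytes(CUSTOMIZE_IMAGES), 1))
```

Under these assumptions, the images total 6,017 bytes and roughly 5.3 KB of that is expected to be transferred per page view.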

Scaling file set

Product images are the component of the E-commerce file set that scales as the number of requested simultaneous sessions increases.  Each directory represents a product line an E-commerce store would carry.  The number of directories is determined using the following formula:

directory count = 5 * SIMULTANEOUS_SESSIONS

During a benchmark run, a Zipf distribution is used to select which directory is accessed.  In a Zipf distribution, the probability of selecting the nth item is proportional to 1/n.  Zipf distributions are empirically associated with situations where there are many equal-cost alternatives.
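
A minimal Python sketch of the directory scaling and Zipf selection described above follows.  It assumes that a directory's rank is simply its index plus one; how the harness actually maps ranks to directories is not described here, and the function names are invented for this example.

```python
import random


def directory_count(simultaneous_sessions):
    """Number of product-line directories: 5 * SIMULTANEOUS_SESSIONS."""
    return 5 * simultaneous_sessions


def zipf_weights(n):
    """Unnormalized Zipf weights: the item of rank k (1-based) gets weight 1/k."""
    return [1.0 / k for k in range(1, n + 1)]


def pick_directory(n_dirs, rng):
    """Return a 0-based directory index with probability proportional to 1/rank."""
    return rng.choices(range(n_dirs), weights=zipf_weights(n_dirs), k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    n = directory_count(100)  # 500 directories for 100 sessions
    print(n, [pick_directory(n, rng) for _ in range(10)])
```

With these weights the first directory is selected twice as often as the second and three times as often as the third, so a small number of directories receives most of the requests.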

Each directory consists of 10 "products", and each product has three images associated with it, which represent different views of a product a customer is interested in.  Within a directory, one of the 10 products is chosen using a random distribution.  Then, three image requests are made for that product, one from each class.  The classes are shown in the table below:

Workload Class   File sizes           Stepping increment
Class 0          3521 - 17795 bytes   1586 bytes
Class 1          6710 - 39020 bytes   3590 bytes
Class 2          5327 - 40526 bytes   3911 bytes
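
The ranges and stepping increments in the table are consistent with ten evenly spaced sizes per class, one per product in a directory.  The Python sketch below enumerates them.  That products map to sizes in index order is an assumption made for illustration, and the function names are invented for this example.

```python
# (smallest size, stepping increment) in bytes for each class, from the table above.
CLASSES = {0: (3521, 1586), 1: (6710, 3590), 2: (5327, 3911)}
PRODUCTS_PER_DIRECTORY = 10


def class_sizes(cls):
    """The ten file sizes of a class: base, base + step, ..., base + 9 * step."""
    base, step = CLASSES[cls]
    return [base + i * step for i in range(PRODUCTS_PER_DIRECTORY)]


def product_image_sizes(product):
    """Sizes of the three images for a 0-based product index (assumed mapping)."""
    return [class_sizes(cls)[product] for cls in sorted(CLASSES)]


if __name__ == "__main__":
    for cls in sorted(CLASSES):
        print(cls, class_sizes(cls))
```

In each class, the tenth size equals the upper bound given in the table (17795, 39020, and 40526 bytes).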

The sizes, frequencies, and directory scaling factor were determined from aggregating server-side Web server logs and observing client-side Web browser caches.


Copyright © 2005-2006 Standard Performance Evaluation Corporation.  All rights reserved.