[AISWorld] FW: Benchmark sources from Capers Jones
Kappelman, Leon
Leon.Kappelman at unt.edu
Tue Aug 9 16:42:04 EDT 2011
Capers Jones is compiling a catalog of software benchmark data sources as a public service for the software community. If you publish or know of benchmarks that you would like included, please send them to Mr. Jones. Details in his note below ...
Hope all is good with you!
Best wishes,
Leon
"Anything less than a comprehensive economic plan that addresses not only the budget deficit but the underlying industrial, education, energy, tax and regulatory policies that have contributed to this crisis will leave us chasing our tails." - Michael Lewitt, 1-August-2011.
----------------------------------------------------------------------------------------------------------------------------------------------
Leon A. Kappelman, Ph.D.
Professor of Information Systems
Director Emeritus, Information Systems Research Center
Fellow, Texas Center for Digital Knowledge
College of Business, University of North Texas
Voice: 940-565-4698 Fax: 940-565-4935 Email: kapp at unt.edu
Website: http://courses.unt.edu/kappelman/
Founding Chair, Society for Information Management's Enterprise Architecture Working Group
----------------------------------------------------------------------------------------------------------------------------------------------
Please consider the environment before printing this e-mail
From: CJonesIII at cs.com [mailto:CJonesIII at cs.com]
Subject: Benchmark sources from Capers Jones
Leon,
I'm planning to publish a catalog of benchmark sources. If you or any colleagues publish benchmarks and want to be included, please write up what is available using a format similar to the ones already here.
This work is just starting out.
Thanks,
Capers Jones
SOURCES OF SOFTWARE BENCHMARKS
Version 5.0 August 6, 2011
Capers Jones & Associates LLC.
INTRODUCTION
Number of benchmark sources currently: 8
Quantitative software benchmark data is valuable for measuring process improvement programs, for calibrating software estimating tools, and for improving software quality levels.
There are a number of organizations that gather and report quantitative benchmark information. However, these organizations are independent, and some of them are in fact competitors.
This catalog of software benchmark data sources is produced as a public service for the software community.
There are many different kinds of benchmarks, including productivity and quality levels for specific projects; portfolio benchmarks for large numbers of projects; operational benchmarks for data center performance; security benchmarks; compensation and staffing benchmarks for human resource purposes; and software customer satisfaction benchmarks.
This catalog is intended to grow over time and to include all major sources of software benchmark information.
The information in this catalog is provided by the benchmark groups themselves and includes topics that the benchmark groups wish to be made available.
In this version the benchmark groups are listed in alphabetical order. In later versions the groups will be listed alphabetically for each type of benchmark.
TABLE OF CONTENTS
1. Capers Jones & Associates LLC
2. Galorath Incorporated
3. International Software Benchmark Standards Group (ISBSG)
4. PRICE Systems, LLC
5. Process-Fusion.net
6. QuantiMetrics
7. RBCS, Inc.
8. Software Benchmarking Organization
Appendix A: Survey of Software Benchmark Usage and Interest
Appendix B: A New Form of Software Benchmark
Capers Jones & Associates LLC
Web site URL: Under construction
Email: Capers.Jones3 at gmail.com
Sources of data: Primarily on-site interviews of software projects. Much of the data
is collected under non-disclosure agreements. Some self-reported
data is included from Capers Jones studies while working at IBM
and ITT corporations. Additional self-reported data from clients
taught by Capers Jones and permitted to use assessment and
benchmark questionnaires.
Data metrics: Productivity data is expressed in terms of function point
metrics as defined by the International Function Point
User's Group (IFPUG). Quality data is expressed in
terms of defects per function point.
Also collected is data on defect potentials, defect removal efficiency, delivered defects, and customer defect reports at 90
day and 12 month intervals.
Long-range data over a period of years is collected from a small group of clients to study total cost of ownership (TCO) and
cost of quality (COQ). Internal data from IBM is also used for
long-range studies due to the author's 12-year period at IBM.
At the request of specific clients some data is converted
into COSMIC function points, use-case points, story points,
or other metrics.
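To illustrate how these metrics relate, a minimal example calculation is shown below in Python; all figures are hypothetical, not actual benchmark data.

# Illustrative calculation of the metrics described above.
# All input values are hypothetical, not actual benchmark data.
size_fp = 1_000                        # application size in IFPUG function points
effort_staff_months = 125              # total development effort
defects_removed_before_release = 4_250 # found via reviews, inspections, and testing
defects_delivered = 250                # reported by customers after release

productivity_fp_per_month = size_fp / effort_staff_months
defect_potential = defects_removed_before_release + defects_delivered
defect_potential_per_fp = defect_potential / size_fp
delivered_defects_per_fp = defects_delivered / size_fp
defect_removal_efficiency = defects_removed_before_release / defect_potential

print(f"Productivity: {productivity_fp_per_month:.1f} function points per staff month")
print(f"Defect potential: {defect_potential_per_fp:.2f} per function point")
print(f"Delivered defects: {delivered_defects_per_fp:.2f} per function point")
print(f"Defect removal efficiency: {defect_removal_efficiency:.1%}")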
Data usage: Data is used to create software estimating tools and predictive
models for risk analysis. Data is also published in a number of
books including The Economics of Software Quality, Software
Engineering Best Practices, Applied Software Measurement,
Estimating Software Costs and 12 others. Data has also
been published in about 200 journal articles and monographs.
Data is provided to specific clients of assessment, baseline, and
benchmark studies. These studies compare clients against similar
companies in the same industry.
Data from Capers Jones is frequently cited in software litigation
for breach of contract lawsuits or for suits alleging poor quality.
Some data is also used in tax litigation dealing with the value of
software assets.
Data availability: Data is provided to clients of assessment and benchmark studies.
General data is published in books and journal articles.
Samples of data and some reports are available upon request.
Kinds of data: Software productivity levels and software quality levels
for projects ranging from 10 to 200,000 function points.
Data is primarily for individual software projects, but some
portfolio data is also collected. Data also supports activity-based
costing down to the level of 40 activities for development
and 25 activities for maintenance. Agile data is collected
for individual sprints. Unlike most Agile data collections
function points are used for both productivity and quality.
Some data comes from commissioned studies such as an
Air Force contract to evaluate the effectiveness of the CMMI
and from an AT&T study to identify occupations employed
within large software labs and development groups.
Volume of data: About 13,500 projects from 1978 through today.
New data is added monthly. Old data is retained,
which allows long-range studies at 5 and 10-year
intervals. New data is received at between 5 and
10 projects per month from client interviews.
Industry data: Data from systems and embedded software, military
software, commercial software, IT projects, civilian
government projects, and outsourced projects.
Industries include banking, insurance, manufacturing,
telecommunications, medical equipment, aerospace,
defense, and government at both state and national levels.
Data is collected primarily from large organizations with
more than 500 software personnel. Little data from small
companies due to the fact that data collection is on-site and
fee based.
Little or no data from the computer game industry or
the entertainment industry. Little data from open-source
organizations.
Methodology data: Data is collected for a variety of methodologies including
Agile, waterfall, Rational Unified Process (RUP), Team
Software Process (TSP), Extreme Programming (XP),
and hybrid methods that combine features of several methods.
Some data is collected on the impact of Six Sigma, Quality
Function Deployment (QFD), formal inspections, Joint
Application Design (JAD), static analysis, and 40 kinds of
testing.
Data is also collected for the five levels of the Capability
Maturity Model Integrated (CMMI(tm)) of the Software
Engineering Institute.
Language data: As is usual with large collections of data, a variety of
programming languages are included. The number of
languages per application ranges from 1 to 15, with an
average of about 2.5. The most common combinations
include COBOL and SQL, and Java and HTML.
Specific languages include Ada, Algol, APL, ASP.NET, BLISS,
C, C++, C#, CHILL, CORAL, Jovial, PL/I and many
derivatives, Objective-C, and Visual Basic.
More than 150 languages out of a world total of 2,500
are included.
Country data: About 80% of the data is from the U.S. Substantial data
comes from Japan, the United Kingdom, Germany, France, Norway,
Denmark, Belgium, and other major European countries.
Some data from Australia, South Korea, Thailand, Spain, and
Malaysia.
Little or no data from Russia, South America, Central America,
China, India, South East Asia, or the Middle East.
Unique data: Due to special studies Capers Jones data includes information
on more than 90 software occupation groups and more than 100
kinds of documents produced for large software projects. Also,
the data supports activity-based cost studies down to the levels
of 40 development activities and 25 maintenance tasks. Also
included is data on the defect removal efficiency levels of
65 kinds of inspection, static analysis, and test stages.
Some of the test data on unit testing and desk checking came
from volunteers who agreed to record information that is
normally invisible and unreported. When working as a
programmer Capers Jones was such a volunteer.
From longitudinal studies during development and after release
the Jones data also shows the rate at which software requirements
grow and change during development and after release. Change
rates exceed 1% per calendar month during
development and 8% per year after release.
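As a rough illustration of what those growth rates imply, here is a hypothetical worked example; the starting size and schedule are invented, not drawn from the Jones data (Python):

# Illustrative requirements-growth calculation using the rates cited above.
# The starting size and schedule are hypothetical.
initial_size_fp = 1_000             # size at the end of the requirements phase
development_months = 18             # remaining development schedule
monthly_growth = 0.01               # about 1% per calendar month during development
annual_growth_after_release = 0.08  # about 8% per year after release

size_at_release = initial_size_fp * (1 + monthly_growth) ** development_months
size_after_three_years = size_at_release * (1 + annual_growth_after_release) ** 3

print(f"Size at release: {size_at_release:,.0f} function points")                 # about 1,196
print(f"Size three years later: {size_after_three_years:,.0f} function points")   # about 1,507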
From working as an expert witness in 15 lawsuits, some special
data is available on litigation costs for plaintiffs and defendants.
From on-site data collection, carrying out interviews
with project teams, and then comparing the results to
corporate resource tracking systems, it has been noted
that "leakage" or missing data is endemic and approximates
50% of actual software effort. Unpaid overtime and the
work of managers and part-time specialists are the most common omissions.
Quality data also leaks and omits more than 70% of internal
defects. Most common omissions are those of desk checking,
unit testing, static analysis, and all defect removal activities
prior to release.
Leakage from both productivity and quality databases inside
corporations makes it difficult to calibrate estimating tools and
also causes alarm to higher executives when the gaps are revealed.
The best solution for leakage is activity-based cost collection.
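A minimal sketch of how such leakage might be quantified during an on-site study, comparing the tracking-system total against an activity-based total reconstructed in interviews; the activities and hours below are invented for illustration (Python):

# Hypothetical comparison of tracked effort vs. interview-derived,
# activity-based effort for one project. All numbers are illustrative.
tracked_effort_hours = 9_000   # what the corporate tracking system recorded

interview_effort_hours = {     # effort reconstructed activity by activity
    "requirements": 1_200,
    "design": 1_800,
    "coding": 4_000,
    "testing": 3_500,
    "documentation": 1_500,
    "unpaid overtime": 2_500,        # commonly untracked
    "management": 2_000,             # commonly untracked
    "part-time specialists": 1_500,  # commonly untracked
}

actual_effort_hours = sum(interview_effort_hours.values())
leakage = 1 - tracked_effort_hours / actual_effort_hours

print(f"Tracked effort: {tracked_effort_hours:,} hours")
print(f"Actual effort:  {actual_effort_hours:,} hours")
print(f"Leakage: {leakage:.0%} of actual effort missing from the tracking system")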
Future data: There are several critical areas which lack good sources of
quantitative data. These include studies of data quality,
studies of intangible value, and studies of multi-national
projects with geographically distributed development
locations.
Summary: Capers Jones has been collecting software data since working
for IBM in 1978. In 1984 he founded Software Productivity
Research and continued to collect data via SPR until 2000.
Capers Jones & Associates LLC was formed in 2001.
He owns several proprietary data collection questionnaires
that include both qualitative assessment information and
quantitative data on productivity and quality. The majority
of data comes from on-site interviews with software project
teams but self-reported data is also included, especially from
clients who have been trained and authorized to use the
Jones questionnaires.
More recently remote data collection has been carried
out via Skype and telephone conference calls using shorter forms of the data collection questionnaires.
Some self-reported or client-reported benchmark data
is included from companies taught by Capers Jones and
from consortium members.
Some self-reported data is also included from internal
studies carried out while at IBM and ITT, and also
from clients such as AT&T, Siemens, NASA, the Navy, and
the like.
Galorath Incorporated
Web site URL: www.galorath.com
Email: info at galorath.com
Sources of data: Repositories obtained through product customers, industry sources, and public domain and consulting. Galorath also maintains a partnership with ISBSG and offers the ISBSG data in SEER Historical Database (SEER-HD) format to subscribers.
Data metrics: Productivity expressed in function points and source lines of code. Most data has language, platform and application descriptors. Some of the data also contains dates, defects delivered, time-phased staffing, documentation pages, and detailed SEER parameter settings.
Data usage: Data is used to calibrate SEER knowledge bases, determine language factors, generate trend lines, benchmark organizations, and validate model relationships.
Data availability: Data is delivered in the form of knowledge bases, language factors, trend lines and estimating relationships to SEER customers. Where permitted data is provided in source form so SEER users can see their estimates compared with individual data points. SEER customers who submit data to Galorath receive their data in the SEER Historical Database format that can be used directly by the application.
Kinds of data: Records of completed projects at the program and project level. Size ranges from extremely small (less than one month duration) to very large (hundreds of staff, many years' duration.) Additionally metadata is available from some sources, showing items such as productivity ranges, size ranges, etc. where management is not willing to release raw data.
Volume of data: Over 20,000 records, with nearly 13,000 containing at least size in addition to other metrics; of these, approximately 8,500 records contain completed effort and a smaller number contain project duration. Volume increases regularly.
Industry data: Data from IT/business systems, industrial systems and embedded software, military, and commercial enterprises.
Methodology data: Project processes in collected data range from traditional waterfall through incremental and agile development.
Country data: Military and aerospace data is US and European. Commercial data is worldwide.
Summary: Galorath maintains an active data collection program and has done so for the last 20 years. It makes data available to users of SEER for Software in several forms such as its ProjectMiner data mining tool and Metrics and Benchmarking data visualization tool.
ISBSG Limited
Web site URL: www.isbsg.org
Email: peter.r.hill at isbsg.org
Sources of data: Voluntary submission of software projects data. The data is collected under an agreement of anonymity. Data is submitted either directly or by consultants with the permission of their customers.
Data metrics: Productivity data is expressed in terms of function point metrics as defined by the International Function Point User's Group (IFPUG); NESMA; COSMIC; and FiSMA. Quality data is expressed in terms of defects per function point or functional unit.
LOC sizes are stored if provided but are not validated and are not used for any analysis.
Data is also collected on defect occurrence during the project and within the first 30 days of operation.
Data has come from more than 25 countries and 20 major organization types.
Data usage: Data is used to create some basic ISBSG software estimating tools; used by commercial estimation tool vendors; and for benchmarking services. The data is also analysed and published in a number of analysis reports and books including Practical Software Project Estimation.
Data is provided to academics for use in approved research work.
Data availability: The ISBSG data is 'open' and can be licensed by anyone in its 'raw' form at a moderate cost.
General data and analysis results are published in books and journal articles. Samples of data and some reports are available upon request.
Kinds of data: Data on both Development and Enhancement projects.
Projects ranging from 10 to 10,000 function points in size, but with an average size of around 300 function points.
Data is primarily for individual software projects, but some
portfolio data is also collected.
Volume of data: About 6,000 projects from 1989 through today. New data is added monthly. Old data is retained, allowing long-range studies in four-year groups.
Industry data: Data from MIS/commercial software; real-time systems; IT projects; government projects; outsourced projects; and off-shore projects.
Industries include banking, insurance, legal, manufacturing,
telecommunications, accounting, sales, transport, government, and the public sector.
Data is submitted from a wide range of organizations of differing sizes. There is no data from the military.
Methodology data: Data is collected for a variety of methodologies including Agile, Waterfall, Joint Application Design (JAD), Rational Unified Process (RUP), Team Software Process (TSP), and hybrid methods that combine features of several methods.
Some data is collected on the impact of being compliant with the CMM and CMMI five levels and relevant ISO standards.
Country data: About 30% of the data is from the U.S.; 16% from Japan; 16% from Australia; 10% from Finland; 8% from the Netherlands; 6% from India; 5% from Canada; and 2% from China. There is also data from 12 other countries.
Summary: The ISBSG's formation in 1997 was built upon several years of previous cooperation by a group of national software metrics associations with the common aim of developing and promoting the use of IT industry history data to improve software processes and products, for the benefit of both businesses and governments worldwide.
Current members of the ISBSG represent IT and Metrics Associations based in the following countries: Australia, China, Finland, Germany, India, Italy, Japan, Korea, Netherlands, Spain, Switzerland, UK, USA.
Two independent repositories of IT industry data:
1. Software Development and Enhancement - over 5,700 projects
2. Software Maintenance and Support - ~500 applications
PRICE Systems, LLC
Web site URL: www.pricesystems.com
Email: arlene.minkiewicz at pricesystems.com
Sources of data:
Data comes from many sources. Proprietary data is collected from clients through consulting engagements, research engagements and training. Publicly available data is purchased from external sources. Public data is also mined from sources such as published technical reports and articles, as well as reports in the media.
Data metrics:
Data collected include software size (in Source Lines of Code (SLOC), Function Points or Use Cases), project effort, project cost, various technical parameters about the nature of the product being developed, the development team, and the developing organization.
Schedule duration for projects is collected as well.
Data usage:
Data is used for the creation, verification and validation of estimating relationships for cost, effort and schedule in the TruePlanning for Software product. Data is used to update knowledge bases which provide user guidance through various means within the product.
Data collected is also used to support research efforts, internally and externally funded, focused on specific areas of software development, technology, or process.
Data availability:
Raw data is generally unavailable due to its proprietary nature; summary data is available upon request.
Kinds of data:
Software project data for projects (or project components) ranging from 1,000 to 3,000,000 SLOC and projects (or project components) ranging from 25 to 1,000,000 Function Points. Data is primarily for individual software projects; some data is provided down to the component level, while other data is higher-level project data.
Volume of data:
Data collected from more than 3500 projects from 1977 through today. New data is added monthly. Old data is retained.
Industry data:
Data from systems and embedded software, military software, real time software, commercial software, IT projects, MIS systems, civilian government projects, and outsourced projects.
Industries include banking, insurance, manufacturing, accounting, transportation, telecommunications, aerospace, defense, and government at both state and national levels.
Data is collected primarily from large organizations or agencies.
Methodology data:
Data is collected for a variety of methodologies and development process models.
Language data: Data collected covers a wide range of programming languages.
Country data:
About 85% of the data comes from sources in the US. Additional data is provided from mostly European sources.
Unique data:
In addition to data that we have collected from our clients and other sources, PRICE has also been collecting data from our own software development efforts over the 30+ years we have been developing software applications. Having been an agile shop for much of the last decade, this data includes many of the common productivity metrics associated with agile development.
Summary:
PRICE Systems has been collecting software data since the mid-1970s, during the research and development of the PRICE Software model, which has evolved into the TruePlanning for Software model. Data collection continues to support regular upgrades to the software cost, schedule and effort estimating ability in this model.
PRICE Systems is a global leader in cost management solutions for Aerospace, Defense, Space and Information Systems, serving over 250 customers worldwide.
Process-Fusion.net
URL: www.Process-fusion.net
Email: ggack at process-fusion.net
Sources of data: Approximately 75% of this data set comes from formal inspections training event workshops in which participants inspect actual work products prepared prior to the workshop. Approximately 25% of the data are from post-training inspections of work products.
Data metrics: This data set includes approximately 1,600 individual inspections conducted by teams of 3 or 4 in most cases. All of these inspections are consistent with IEEE Std. 1028-2008. Data elements collected for each inspection include work product type (requirements, design, code), size in lines, orientation hours, preparation hours, team meeting hours, number of major and minor defects identified, and post-correction review hours.
Data usage: Summaries of these data may be used as a basis for estimating the number of defects likely to be found in the several work product types identified, the number of hours per defect (or defects per hour) likely to be required when using formal inspections consistent with the process defined by IEEE Std. 1028-2008 (a.k.a. "Fagan-style Inspections"). Standard deviation and distribution data are also available.
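As a rough illustration of how such inspection records might be summarized into defects-per-hour and hours-per-defect figures, here is a minimal sketch; the record layout and numbers are invented, not taken from the Process-Fusion data set (Python):

# Hypothetical summary of formal (IEEE Std. 1028-style) inspection records.
# Field layout and values are illustrative only.
inspections = [
    # (work product, size in lines, prep hours, meeting hours, major defects, minor defects)
    ("requirements", 900, 6.0, 4.0, 7, 12),
    ("design", 1200, 8.0, 5.0, 9, 15),
    ("code", 600, 5.0, 3.0, 11, 9),
]

for product, size, prep, meeting, majors, minors in inspections:
    effort = prep + meeting
    defects = majors + minors
    print(f"{product:>12}: {defects / effort:.1f} defects per hour, "
          f"{effort / defects:.2f} hours per defect, "
          f"{defects / (size / 1000):.1f} defects per 1,000 lines")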
Data availability: Summary data are available from the author upon request.
Volume of data: Approximately 1,600 inspections.
Industry data: Data from systems and embedded software, military
software, commercial software, IT projects and outsourced projects.
Data is collected primarily from large organizations with
more than 500 software personnel.
Language data: A variety of programming languages are included.
Country data: About 80% of the data is from the U.S.
QuantiMetrics
Web site URL: www.quantimetrics.net
Email: bram at quantimetrics.net
Sources of data: Organisations (public and private) who subscribe to the software benchmarking services of QuantiMetrics. Data submitted is subject to rigorous consistency, validity and reasonableness checks; measurement practices of subscribers are pre-audited to avoid potential data problems. In many instances QuantiMetrics carry out the size measurements using certified staff; where size measures are provided by subscribers, these are audited by QuantiMetrics.
Data metrics: Project data: size measures; resource and time use by development phase, resource use by activity type; costs; quality measures by quality step; methods, languages/tools and platforms used; project team staffing and experience; qualitative factors deemed to have affected performance. Measures reflecting plans at significant mile-stones, are also collected.
Applications data: size measures; annual resource used by activity type; volumes of changes, change and support requests; costs; application support team experience and ratings on maintainability and other factors; languages/tools and platforms used; application age; application instances/versions; user population; support window
Data usage: Application project and support performance benchmarking and estimating; in at least a third of instances this data has been used in the context of outsourcing contracts.
Data availability: Data is deemed to be proprietary - participants are provided with like-for-like benchmarks for all measures they provide for their projects and applications. Anonymity is preserved at all times.
Kinds of data: Project and Application support for business applications
Volume of data: Data accumulated goes back to the mid-1980s; early data is largely based on lines-of-code measures of size, while IFPUG functional size measurement has been dominant from the 1990s onward. The number of projects accumulated in total is about 10,000, yet benchmarks are only based on the most recent 3,000 or so projects, of which a large proportion are post-Y2K. Applications measured number about 1,000.
Industry data: Industry participants are largely from the private sector, with financial institutions the most dominant, followed by telecoms businesses, although the benchmarks are applicable to all business-type projects.
Methodology data: Project data includes information on approaches and methodology (proprietary or generic type). In the case of Agile-based methods ratings of the degree of adoption of practices also obtained.
Language data: No restriction is placed on the development languages (tools) used; where statistically reliable data is available, benchmarks are language specific; otherwise they are based on the type of language/tool. Early submissions were dominated by third generation languages, followed by a period when a variety of 4th generation tools was the norm, while now object-oriented languages and "visual" tools are more the norm. Projects also include those that configure and roll out business applications software.
Country data: The largest and dominant contributions are from the UK, Germany, South Africa, and India.
Unique data: QuantiMetrics use statistical techniques to model and normalize performance, particularly for the effects of: size and time in the case of projects; size, age and user population for applications support data.
Future data: Expect to see more use of COSMIC as a size measure.
Summary: QuantiMetrics have worked to ensure the database contains good-quality data and that the most statistically reliable like-for-like benchmarks are provided. Reports and presentations are aimed to identify systemic cause-effect relationships and, in particular, to suggest best practices and opportunities for improvement.
RBCS, Inc.
Web site URL: www.rbcs-us.com
Email: info at rbcs-us.com
Sources of data: On-site interviews of software testing and quality assurance teams and groups within larger organizations that develop software and/or systems for internal use or for sale to customers, or which provide testing and quality assurance services. Discussions with clients in public and private training sessions. The data is collected under non-disclosure agreements, and thus only anonymized and collective data is available for comparative purposes.
Data metrics: The metrics are some two dozen testing and quality related process, product, and project metrics. Examples include: defect detection effectiveness, cost per defect found, percentage coverage of risks and/or requirements, percentage of invalid defect reports, percentage of re-opened defect reports, consistency of test case documentation, accuracy of test estimates, percentage overhead of test environment setup/maintenance, and various others.
Data usage: Data is used to assess the capability and process improvement potential of organizations involved in creation of software or systems for their own use or for use by clients/customers, and of organizations engaged in providing testing services. Data and metrics are mentioned in some of Rex Black's books, such as Managing the Testing Process, Advanced Software Testing: Volume 2, and The Expert Test Manager (scheduled for publication in 2012). Data is also used in many of Rex Black's presentations and articles.
Data is provided to specific clients of assessment, baseline, and benchmark studies. These studies compare clients against similar companies.
Data is discussed in various RBCS training courses, especially test management courses.
Data is also discussed in various RBCS free webinars on topics related to test management.
Data availability: Data is provided to clients of assessment and benchmark studies. General data is published in books and journal articles.
Kinds of data: Software and system quality and testing capability measurements focused on project, process, and product. People-focused metrics are discussed in consultancy but not provided for general use, due to the potential for problems with such metrics.
Data comes from work with clients on assessments of their projects, processes, and products, as well as work with training clients.
Volume of data: RBCS has over 200 clients and has been in the software testing and quality assurance training, consulting, and outsource business since 1994.
Industry data: Data from systems and software projects spanning almost all applications of software and hardware/software systems, with no particular emphasis in terms of industry. Both private and government clients are included.
Data is collected from organizations ranging in size from small companies with fewer than a dozen employees to some of the largest organizations in the world (e.g., Sony, CA, etc.).
Because assessments are paid consulting services, open-source data is not included.
Methodology data: Data is collected for a variety of methodologies including Agile, waterfall, Rational Unified Process (RUP), Scrum, Extreme Programming (XP), and hybrid methods that combine features of several methods.
Data covers all types of testing, including functional testing, performance testing, usability testing, internationalization testing, manual testing, automated testing, regression testing, and so forth.
Language data: Many programming languages are included, but some of the older languages such as COBOL and FORTRAN are under-represented.
Country data: RBCS clients are distributed widely around the world, and include many multinationals. Emerging software testing countries are under-represented in some cases, but RBCS is heavily involved in software testing and quality training and consulting with emerging economies such as India, Bangladesh, Kenya, Nigeria, South Africa, Malaysia, and China.
Unique data: RBCS data is unique in the range of test-related metrics and data available to clients. While other consultancies provide proprietary test assessment and/or "maturity" services, many of these are almost entirely questionnaire-based and/or interview based, and thus suffer from the Rashomon Effect. RBCS assessments, being evidence-focused and data-focused, provide deeper insights to clients. In addition, RBCS assessments are based on an open-source framework published in Rex Black's book Critical Testing Processes.
Future data: RBCS intends to continue to provide quality and testing assessment services to clients around the world, and is working to expand the depth of the data through more assessments. RBCS is also considering developing an online assessment process using its proven e-learning platform.
Summary: RBCS is unique in test consultancies in being focused on an evidence-based and metrics-based approach to assessment, which provides clients with reliability benchmarks and assessment results. RBCS has been in the quality and testing consulting business since 1994, and founder Rex Black's background in software engineering extends back to 1983.
Software Benchmarking Organization
Web site URL: www.sw-benchmarking.org
Email: hans.sassenburg at sw-benchmarking.org
Sources of data: On-site benchmark studies and assessments of software projects. Data is always collected under non-disclosure agreements.
Data metrics: Data is collected for the following metrics:
- Schedule (calendar months)
- Effort (person months)
- Productivity (function points per hour using IFPUG definition)
- Cost of Quality (effort distribution over 4 areas)
- Deferral rate (ratio of deferred baselined features)
- Feature size (function points)
- Technical size (KLOC)
- Re-use level (ratio)
- Complexity (using McCabe definition)
- Test coverage (unit, integration, system testing)
- Defect density (defects per function point at release time)
- Defect removal efficiency (ratio of defects removed before releasing)
Data usage: Data is used:
- to benchmark "engineering capability" against industry averages and best-in-class figures in same industry, using data from SBO assessments as well as published data from other sources (like Capers Jones)
- to assess the feasibility of initial and running projects
Data availability: Data is provided to clients of studies and assessments
Kinds of data: Data is primarily for individual software projects
Volume of data: About 150 projects from 2001 through today.
New data is added frequently.
Industry data: Data from systems and embedded software, military
software, commercial software, IT projects, civilian
government projects, and outsourced projects.
Industries include banking, insurance, manufacturing,
telecommunications, medical equipment, aerospace,
defense, and government at both state and national levels.
Data is collected primarily from projects with 10 - 75 software engineers
Methodology data: Data is collected for a variety of methodologies including
Agile, waterfall, Rational Unified Process (RUP), and hybrid methods that combine features of several methods.
Data is also collected for the five levels of the Capability
Maturity Model Integrated (CMMI(tm)) of the Software
Engineering Institute.
Language data: As is usual with large collections of data, a variety of
programming languages are included. The number of
languages per application ranges from 1 to 5, with an
average of about 2. Most common combinations
include Assembler, C, C++, C# and Java.
Country data: Most of the data is from Western Europe (including The Netherlands, Germany, France, Switzerland, United Kingdom) with a strong focus on the embedded software industry.
Limited data from other countries/regions.
Unique data: The collected data is used to compute 16 KPIs, arranged in four different categories. This enables an organization to identify root causes of underperformance and estimate the effect of corrective measures.
Future data: Currently, limited data is available regarding complexity and test coverage. This will receive more attention.
Summary: SBO has been collecting software data since 2001 through benchmarking and assessment studies. SBO uses a network of accredited partners.
In addition, SBO delivers metrics workshops as well as supporting technology for metrics collection and analysis.
APPENDIX A: SURVEY OF BENCHMARK USAGE AND INTERESTS
SOFTWARE BENCHMARK USAGE SURVEY
Version 2.0 3/7/2010
Rating scale for each benchmark type:
"0" = No interest
"1" = Used today
"2" = Would use if available
1. Portfolio benchmarks
2. Industry benchmarks (banks, insurance, defense, etc.)
3. International benchmarks (US, UK, Japan, China, etc.)
4. Application class benchmarks (embedded, systems, IT, etc.)
5. Application size benchmarks (1, 10, 100, 1,000 function points, etc.)
6. Requirements creep benchmarks (monthly rates of change)
7. Data center and operations benchmarks (availability, MTTF, etc.)
8. Data quality benchmarks
9. Data base volume benchmarks
10. Staffing and specialization benchmarks
11. Staff turnover and attrition benchmarks
12. Staff compensation benchmarks
13. Organization structure benchmarks (matrix, small team, Agile, etc.)
14. Development productivity benchmarks
15. Software quality benchmarks
16. Software security benchmarks (cost of prevention, recovery, etc.)
17. Maintenance and support benchmarks
18. Legacy renovation benchmarks
19. Total cost of ownership (TCO) benchmarks
20. Cost of quality (COQ) benchmarks
21. Customer satisfaction benchmarks
22. Methodology benchmarks (Agile, RUP, TSP, etc.)
23. Tool usage benchmarks (Project management, static analysis, etc.)
24. Reusability benchmarks (volumes of various reusable deliverables)
25. Software usage benchmarks (by occupation, by function)
26. Outsource results benchmarks (domestic)
27. Outsource results benchmarks (international)
28. Schedule slip benchmarks
29. Cost overrun benchmarks
30. Project failure benchmarks (from litigation records)
31. Litigation cost benchmarks - breach of contract
32. Litigation cost benchmarks - taxable value of software
33. Litigation cost benchmarks - non competition violations
34. Litigation cost benchmarks - damages due to poor quality
35. Litigation cost benchmarks - intellectual property
APPENDIX B: A NEW FORM OF SOFTWARE BENCHMARK
Introduction
Normally software assessments and benchmarks are provided to specific companies and compare selected applications against similar applications from other companies in the same industry. This is useful information, but it does not provide any context or any information about the industry itself.
What would be both useful and salable would be a new kind of benchmark that would consolidate information about specific industries, the major companies within the industries, the software used by those companies, and also productivity and quality ranges derived from assessment and benchmark studies.
Using "banking" as an industry example here are some 55 kinds of assessment and benchmark information that would be provided:
Table 1: Fifty Five Data Topics for Industry-Specific Software Benchmarks
1. Number of large companies (source = Hoover's Guides to business)
2. Number of medium companies (source = Hoover's Guides to business)
3. Number of small companies (source = Hoover's Guides to business)
4. Regulatory agencies that control business sectors
5. Standards and safety regulations for the industry
6. Supply chains and related industries
7. Current government investigations involving the industry
8. Mergers and acquisitions within the industry
9. Start-up companies within the industry
10. Business failures, government takeovers, or bankruptcies within the industry
11. Recent patents and intellectual property filed by industry members
12. Ranges of industry profitability and economic health
13. Current litigation involving the industry
14. Domestic competitive situation of the major players within the industry
15. Global competitive situation of the major players within the industry
16. Professional associations that serve the industry
17. Major kinds of hardware platforms used within the industry
18. Major kinds of data utilized within the industry
19. Major kinds of software applications utilized by the industry
20. Major kinds of ERP applications utilized by the industry
21. Major COTS vendors that provide packages to the industry
22. Major open-source applications utilized by the industry
23. Major outsource vendors that service the industry
24. Major sources of reusable components serving the industry
25. Ranges of portfolio and application sizes within the industry
26. Ranges of data base and data warehouse sizes within the industry
27. Numbers of software users within the industry
28. Number of customer organizations served by software
29. Number of actual clients or users served by software
30. Numbers of software developers within the industry
31. Numbers of maintenance personnel within the industry
32. Numbers of technical specialists within the industry (quality, testing, etc.)
33. Rates of change for software personnel (expanding, shrinking, stable)
34. Software security issues within the industry
35. Statistics on data theft, denial of service, and other security breaches
36. Security policies, standards, and best practices for the industry
37. Software development productivity benchmarks with function points
38. Software maintenance/enhancement productivity benchmarks (function points)
39. Software total cost of ownership (TCO) benchmarks with function points
40. Software quality benchmarks with function points
41. Software cancelled project benchmarks by size in function points and type
42. Software costs and schedule overruns within the industry
43. Legacy application replacement strategy within the industry
44. Distribution of CMMI levels within the industry
45. Distribution of TickIt scores or maintainability scores within the industry
46. Major development methods used within the industry
47. Major maintenance methods used within the industry
48. Typical tool suites used within the industry
49. Best practices utilized by the industry
50. Average practices utilized by the industry
51. Worst practices utilized by the industry
52. Major quality control methods used within the industry
53. Future industry technology trends (cloud computing, SOA, etc.)
54. Future industry software trends
55. Major sources of industry data (web sites; periodicals, etc.)
For a bank to assemble all of this information by itself, it would be necessary to gather data from about a dozen industry and government sources, plus probably commission benchmarks on a sample of 10 to more than 25 applications. Competitive information from other banks would not be accessible. Essentially, this kind of information would not be gathered by individual banks because of a lack of organizational focus, plus the rather high costs involved.
Probable Clients for Software Benchmarks by Industry
As of 2011 the software benchmark business sector is divided into two subsets. One form of benchmark uses fairly simple questionnaires with the data being self-reported by clients.
Because self-reported benchmarks have no fees for providing information and consolidated benchmark reports are available for low cost, this form of benchmarking is widespread and popular. The International Software Benchmark Standards Group (ISBSG) is the major provider of self-reported benchmarks. The ISBSG clients consist mainly of project managers and some CIO's.
The second form of software benchmark uses more complicated questionnaires and also includes on-site data collection in order to gather and validate quantitative and qualitative information from samples of 10 to more than 25 projects at the same time.
These on-site benchmarks usually include non-disclosure agreements for data collection and distribution so the data is delivered to specific companies.
Because the costs for collecting and analyzing the data range from $25,000 to more than $50,000, these benchmarks require approval and funding from the level of a CIO or a CTO. The reports back to clients are of course used by first-line and project managers, but the funding is usually provided by a higher level of management.
Software assessments are also on-site consulting studies. For assessments using the model of the Software Engineering Institute (SEI), certified assessors are used. For other forms of assessment such as the author's, trained consultants are used. Here too the costs are fairly high, in the $25,000 to $75,000 range.
The data that would be assembled for the new kind of benchmarks discussed in this report would include a combination of self-reported data, on-site data collection, and subscriptions to a number of industry information providers such as Hoover Business Guides, Gartner Group, the Department of Commerce, plus access to other government statistics as well. At a nominal charge of $10,000 for such a benchmark report, funding approval would probably be at the CIO and CTO level.
While the costs of these benchmarks are less than the cost of today's on-site benchmarks and assessments for individual clients, these new kinds of benchmarks could be offered to dozens or even hundreds of clients so they would generate much greater revenues and profits than conventional single-client benchmarks.
In order to be useful, the benchmark reports would consolidate data from at least 10 companies and 100 projects within an industry, and then use extrapolation to cover other companies within the same industry. Of course larger samples would be desirable. Remote data might be gathered from 100 banks or more, while on-site data might be gathered from 20 banks or more.
The on-site data collection would probably be derived from conventional fee-based studies that provide information to specific clients. However once the data is sanitized and aggregated, it would be marketed to multiple clients.
Because of the richness of the data provided, these new benchmarks would attract a much wider and more diverse set of clients than normal self-reported or on-site software benchmarks. Obviously the information would be of use to CIO's and CTO's, but because of the in-depth industry coverage the information would also be of use to CEO's and to client executives as well.
For example these new benchmarks would be of use to VP's of marketing, sales, manufacturing, human resources, and research and development. In addition the information would no doubt be acquired by major consulting companies, by law firms that specialize in software litigation, by outsource vendors, and by other kinds of information providers such as journals and web sites.
In addition the new form of benchmark would also be useful to many related companies that provide services or products to banking clients: outsource vendors, software vendors, consulting companies, equipment manufacturers, personnel companies, major law firms, and government regulators. In fact sales of the new kind of benchmark to these ancillary companies would probably exceed sales to the banking community itself. For each benchmark study acquired by a bank, probably at least three studies would be acquired by banking service and product providers. This is a new and previously untapped market for benchmark studies.
The basic idea of the new form of benchmark is to elevate the value of benchmark information from data that is "useful but not essential" to the level of "we must have this information to stay competitive." A second goal is to elevate the target audience of the benchmark information from project managers, CIO's, and CTO's up to the level of CEO's and senior operating unit executives.
Once this new form of benchmark is launched, it will probably lead to a significant increase in other forms of benchmarking.
It is obvious that the initial launch within an industry such as banking needs to attract a fairly significant number of clients. Therefore the new form of benchmark should start with an industry where such information is already perceived as valuable; e.g., banking, defense, insurance, health care, medical equipment, and the like.
Once launched in the United States, these new benchmarks would certainly lead to an increase in overseas benchmarks using the same formats and data collection methods. However to facilitate overseas benchmark data collection, local subcontractors would probably be a desirable method of proceeding.
In addition, some overseas data might be gathered via on-line methods such as Skype, remote surveys, and perhaps Wiki sites. In fact a virtual benchmark environment using the same technology as Second Life is technically possible. In such an environment avatars of consultants and clients might have conversations and discuss data gathering methods without actual travel.
It is apparent that once the expanded benchmarks start being created, continuous collection of data and continuous updates will become part of the benchmark and assessment process.
Expanding the Sources of Benchmark Data
Currently the actual personnel who provide data for both assessments and benchmarks are primarily software engineers and technical workers, project managers, some higher-level managers, and occasionally executives at the level of CIO or CTO.
Once the value of the expanded benchmarks becomes apparent, it can be anticipated that additional information might be collected from a much wider variety of stakeholders, executives, software personnel, and actual users of software:
Table 2: Executive Sources of Benchmark Information
1. Corporate CEO's
2. Boards of directors or advisory boards
3. Government executives (state and Federal CIO's, agency chiefs, etc.)
4. Operating unit VP's (manufacturing, finance, etc.)
5. Agile embedded stakeholders
6. CIO's for information systems
7. CTO's for systems software
8. CTO's for embedded applications and hybrid systems
9. Customer executives who approve and acquire software
10. Outsource executives
11. Corporate attorneys
12. User association executives
In addition, technical information might be acquired from a larger spectrum of software technical personnel than is normal for today's assessments and benchmarks:
Table 3: Technical Sources of Benchmark Information
1. Architects
2. Business analysts
3. Data base analysts
4. Software quality assurance (SQA)
5. Project Office personnel
6. Test personnel
7. Scrum Masters
8. Integration and configuration control personnel
9. Embedded stakeholders
10. Software users and clients
11. Customer support personnel
A combination of remote interviews using simple questionnaire, more detailed questionnaires for on-site use, and conference calls would be used to gather the expanded forms of information from an expanded set of software stakeholders and software production personnel.
The basic idea is to consolidate information on what stakeholders, top executives, development personnel, maintenance personnel, quality assurance personnel, and actual clients or users think about software applications and the methods used to create them.
The expansion of information sources will obviously cause some additional effort in gathering information. But after a few months of trials and tuning, hopefully the additional effort will not cause more than about a 25% increase in total data collection effort.
International Software Benchmarks by Industry
As of 2011 there are benchmark studies and assessments performed overseas, and there are a few reports that compare performance by both country and industry. Of the ones that attempt to do so, sample sizes are small and the results are somewhat marginal in terms of economic breadth and reliability.
There is a very strong demand for reliable international benchmarks that would show the comparative performance of the countries where software production is a major economic topic. The probable revenues for international benchmarks would be proportional to the size of the software industries within the countries. Due to the logistical issues of carrying out on-site international benchmarks, the global market for such benchmarks would probably be three to five years behind the U.S. market.
That being said, the revenues from international software benchmarks by industry would probably be sufficient to fund expansion throughout the world. The more countries that provide data, the more valuable the overall collection of data will become.
Following are a few hypothetical examples of potential annual benchmark revenues by about 2020, assuming U.S. benchmarks as described in this report begin in 2012:
Table 4: Probable Software Benchmark Revenues by Country Circa 2020
China $45,000,000
India $45,000,000
Japan $35,000,000
Russia $30,000,000
Brazil $25,000,000
U.K. $20,000,000
Germany $20,000,000
France $20,000,000
Italy $15,000,000
Ukraine $10,000,000
Spain $10,000,000
Scandinavia $10,000,000
Australia $10,000,000
Mexico $10,000,000
South Korea $10,000,000
Canada $7,000,000
Taiwan $7,000,000
Israel $5,000,000
Netherlands $3,000,000
Belgium $3,000,000
TOTAL $340,000,000
Of course all of these countries could not be studied at the same time, but eventually the value of expanded global and industry software benchmarks has the potential to create a significant body of knowledge about software in every industrialized country and every major industry.
Over and above the countries shown in Table 4, benchmarks might also be provided for many other countries in Asia, Central and South America, Africa, and the Pacific regions such as Singapore and Malaysia.
Effort Needed to Create Benchmarks by Industry
Assembling the initial information needed to produce this example of an industry benchmark for banking would probably require about 90 days of full-time effort by at least one capable researcher. Two would be better in order to provide backup in case of illness.
The initial data collection would also require fee-based subscriptions to a variety of data sources such as Hoover, Gartner Group, the ISBSG, and other industrial information sources. However some of these sources of data provide short-term trials, so the initial expense would be fairly low for commercial data sources.
The probable initial cost of such an industry benchmark in a single industry such as banking would probably be in the range of $150,000. This includes data collection from clients, commercial information providers, analysis of the data, and production of an initial report.
Later expenses would include development of a web site, marketing materials, and other collateral materials that are not absolutely needed for the initial benchmark report.
Because the data in such a benchmark is dynamic and changes rapidly, continuous updates would be needed to keep the information current. Probably 36 days per year would be needed to refresh the information (i.e. 3 days per month per industry). Monthly or quarterly updates would be provided to clients.
The value of this kind of benchmark compendium would be high enough so that the benchmark report might be marketed at perhaps $10,000 for the initial report and annual subscription fees for updates of perhaps $2,500. Extracts and subsets from the report could be marketed individually for costs in the range of $500. These would appeal to smaller companies within the industry.
Information would also be made available via the web, with some samples of the data being provided for free, but extended data being fee-based. The fees could either be on a per-use basis or an annual subscription basis.
Another possible deliverable would be a daily web site entitled "Software Daily News" that resembles the very useful and popular current website entitled "Science Daily News." The science web site covers a wide range of scientific disciplines ranging from archeology through astrophysics and includes both news summaries and full-length articles.
For a major industry such as banking, a benchmark report of this nature might attract about 250 domestic banks and 75 overseas banks. It would also go to government agencies and major software vendors.
Each industry benchmark report would probably generate about $2,500,000 in the U.S. and about $750,000 abroad, or $3,250,000 in total. Recurring revenues would amount to perhaps $400,000 per industry per year. If there were 10 industries supported, the revenues would ascend to more than $32,500,000 for initial subscriptions and more than $4,000,000 per year in recurring revenues.
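A small sketch of the arithmetic behind these estimates, using the client counts above and the $10,000 report price proposed earlier in this report (Python):

# Revenue arithmetic for the industry benchmark reports described above.
# Client counts and prices are taken from the figures in this report.
price_per_report = 10_000
domestic_clients = 250             # domestic banks per industry report
overseas_clients = 75              # overseas banks per industry report
industries = 10
recurring_per_industry = 400_000   # estimated recurring revenue per industry per year

initial_per_industry = (domestic_clients + overseas_clients) * price_per_report
initial_all_industries = industries * initial_per_industry
recurring_all_industries = industries * recurring_per_industry

print(f"Initial revenue, one industry: ${initial_per_industry:,}")                   # $3,250,000
print(f"Initial revenue, 10 industries: ${initial_all_industries:,}")                # $32,500,000
print(f"Recurring revenue, 10 industries: ${recurring_all_industries:,} per year")   # $4,000,000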
Eventually on a global level these new benchmarks might have more than 5,000 major corporate clients, more than 1,000 government clients, and a large but unpredictable set of ancillary clients such as law firms, other consulting groups, universities, and the like.
Obviously such benchmarks would be most useful for industries that have a large amount of effort in the area of software and data processing. These industries include but are not limited to:
Table 5: Industry Candidates for Software Benchmarks
1. Aerospace
2. Agriculture
3. Airlines
4. Automotive
5. Banking
6. Chemicals
7. Computers and peripheral equipment
8. Cruise lines
9. Defense
10. Education - university
11. Education - primary and secondary
12. Entertainment
13. Energy and oil
14. Governments - state
15. Governments - Federal
16. Government - municipal
17. Health care
18. Hotels
19. Insurance
20. Manufacturing
21. Open-source
22. Process control
23. Pharmaceuticals
24. Public Utilities
25. Publishing
26. Retail trade
27. Software
28. Telecommunications
29. Transportation
30. Wholesale trade
A full suite of such benchmarks for major industries would probably generate in the range of $30,000,000 to $50,000,000 per year from U.S. clients. It would require a team of perhaps 10 researchers and 5 logistical support personnel, plus licenses and subscriptions for external sources of data.
If the company producing the benchmark reports also collected benchmark and assessment data itself, probably another 10 consultants would be needed. The data collection would probably generate about $5,000,000 to $10,000,000 per year.
Data collection could also be subcontracted to existing benchmark groups such as ISBSG, Software Productivity Research (SPR), the David Consulting Group, and the like. Marketing and sales personnel plus a small executive contingent would be needed. The total staff would probably be close to 35 personnel. However, subcontracts for collecting benchmark data might be issued to more than 25 companies in more than 25 countries.
It is conceivable that within 10 years of the initial launch, the new form of benchmark might involve more than 300 subcontract personnel in more than 25 countries.
The annual cost of operating the core group would probably be in the range of $4,000,000 per year. However, except for the $150,000 investment for the initial report, the organization should be self-sustaining and profitable because no heavy capital investments are needed.
The organization might eventually generate benchmark subscription revenues from more than 500 domestic companies, perhaps 300 overseas companies, and perhaps 100 government agencies. Assuming three to five subscriptions per company, revenues from benchmark subscriptions might be in the range of $10,000,000 to $15,000,000 per year, plus the initial cost of each subscription.
The consulting work of collecting data on a fee basis would probably bring in revenues of perhaps $8,000,000 per year. Total revenues from all sources might total $30,000,000 to $50,000,000 per year.
By contrast, as of 2011 software benchmarks are a niche industry, with perhaps 20 U.S. companies collecting data and combined annual revenues that are probably below $20,000,000. This is because clients perceive benchmarks as useful but not essential.
The business idea behind the new form of benchmark is to elevate benchmarks from being perceived as useful to being perceived as essential. To achieve this goal, more and better information needs to be provided to clients than is currently available.
As useful as today's benchmarks are for productivity and quality studies, the lack of context information limits their utility and restricts the potential audience.
Additional Tools and Services Beyond Benchmarks and Assessments
In addition to marketing benchmarks and performing assessment and benchmark consulting studies, the organization would also be positioned to market and perhaps develop several kinds of tools and ancillary products and services:
* Benchmark tools for data collection that clients might use to facilitate their own data collection and analysis.
* Predictive tools for estimating software schedules, efforts, costs, quality, reliability, maintenance, and enhancements.
* Special studies that evaluate important topics. For example, large-scale studies comparing development methods such as waterfall, iterative, object-oriented, Agile, RUP, TSP, XP, and others would be both useful to the industry and highly salable.
* Training in skills that the benchmarks show need improvement within an industry, such as quality control, test case design, change control, legacy migration, and a number of others.
* Portfolio analyses are seldom performed because of the difficulty of sizing and analyzing as many as 3,000 applications that might total more than 7,500,000 function points. Further, between 25% and more than 50% of a typical portfolio consists of COTS packages or open-source software that cannot be sized using standard methods because the vendors do not provide the needed inputs. It is possible to size such a portfolio using one or more of the new high-speed function point sizing methods (a minimal sizing sketch follows this list). Portfolio studies would no doubt be covered by non-disclosure agreements and hence marketed only to specific companies.
* Litigation support depends upon accurate data to determine industry averages for quality, productivity, and other benchmarks. The kinds of data discussed herein would probably be widely used as background information in breach of contract litigation between clients and outsource vendors. It would also be used in litigation against software companies for poor quality or damages.
* Outsource contracts frequently include clauses dealing with quality, schedule, productivity, reliability, change control, and the like. Without accurate benchmark data some contracts contain clauses that are probably impossible to achieve. Accurate benchmarks would be very useful for developing outsource contracts that are mutually agreeable.
* Mergers and acquisitions occur almost daily in many industries. The data in this new form of benchmark would be of significant interest to business brokers and to companies considering either acquiring or being acquired. It would also be of interest to companies that are merely seeking partnerships, distributors, subcontractors, or potential clients.
* Venture-backed start-up businesses have been in decline due to the recession, but are starting to ramp up again. The data contained in this new form of benchmark report should be of some interest to both entrepreneurs and venture capitalists considering starting new businesses.
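As noted in the portfolio-analysis item above, the following minimal sketch works through the portfolio-sizing arithmetic. The 3,000-application count, the 7,500,000 function point total, and the 25% to 50% COTS/open-source share come from the text; the 2,500 function point average per application is simply their quotient, and the split between conventionally counted and approximated applications is an illustrative assumption, not any vendor's actual high-speed sizing method.

    # Illustrative portfolio-sizing arithmetic; figures are the report's own
    # assumptions, with the COTS/open-source share shown at both bounds.
    APPLICATIONS = 3_000
    TOTAL_FUNCTION_POINTS = 7_500_000
    AVG_FUNCTION_POINTS = TOTAL_FUNCTION_POINTS // APPLICATIONS  # 2,500 per app

    for cots_share in (0.25, 0.50):
        cots_apps = int(APPLICATIONS * cots_share)    # need high-speed (approximate) sizing
        custom_apps = APPLICATIONS - cots_apps        # can be sized with standard counting
        print(f"COTS/open-source share {cots_share:.0%}: "
              f"{cots_apps:,} applications need approximate sizing, "
              f"{custom_apps:,} can be counted conventionally, "
              f"portfolio total about {TOTAL_FUNCTION_POINTS:,} function points "
              f"(roughly {AVG_FUNCTION_POINTS:,} per application)")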
Tool and special study revenues would be over and above the revenues already discussed. They are currently unpredictable because the suite of tools is not fully defined. However, annual revenues in excess of $10,000,000 per tool would not be uncommon, and special studies could easily top $35,000,000 per year.
Benchmark Company Exit Strategies
Normally companies have four possible end games: 1) they go public; 2) they are acquired by larger companies; 3) they go out of business; or 4) they continue indefinitely with only marginal growth.
Options 1 and 2 are the most likely end games for such a benchmark organization as the one described here. While such benchmarks might be created by a non-profit group or perhaps by a university, it is more likely that a for-profit organization would be the best choice.
A for-profit company is most likely because, if the idea of the new form of benchmark expands and becomes successful, the benchmark production group would be an attractive acquisition candidate for large data providers such as Gartner Group, Accenture, Google, or similar large corporations for which information is a valuable commodity.
Software Benchmarks Circa 2011
As of early 2011 none of the normal benchmark sources or consulting companies provides this kind of information, so there is no real competition. Gartner Group and the International Data Group (IDG) provide subsets of the kinds of information discussed, but not actual benchmarks.
* The International Software Benchmark Standards Group (ISBSG) provides useful benchmarks, but does not cover embedded software or systems software and has no data on large applications above 10,000 function points in size.
* The Software Engineering Institute (SEI) provides assessments, but is sparse with benchmark data and provides little or no information about companies, industries, data, and other key topics.
* The Information Technology Metrics and Productivity Institute (ITMPI) provides many useful reports and webinars on specific topics, but does not provide assessment and benchmark information or any context information about various industry segments. Some of the CAI tools might well be useful for collecting benchmark data.
* The government-sponsored Data Analysis Center for Software (DACS) provides useful information in a government and defense context, but no benchmarks or assessments.
* The Standish Group publishes interesting statistics on software failures, but does not provide conventional benchmarks and assessments.
* A number of privately held consulting companies such as the David Consulting Group, Quantitative Software Management (QSM), Software Productivity Research (SPR), the Software Improvement Group (SIG) in Amsterdam, and several others provide benchmarks and assessments for individual clients. These groups occasionally publish studies using data from multiple clients, but the sample sizes are fairly small.
* Universities tend to provide small-scale studies on specific topics but are not funded or equipped to produce large-scale industry-wide studies.
* The software metrics associations such as IFPUG and COSMIC provide the current rules for counting functional metrics, but seldom produce benchmarks and do not perform assessments at all.
The bottom line is that the new kind of benchmark discussed in this report has little competition circa 2011. Gartner Group is the best positioned to compete, but to date has not attempted specific software benchmarks or assessments. Gartner Group aims at CEOs and top executives, but does not get down to the level of assessments and benchmarks.
It is apparent that the overall benchmark containing all the forms of data shown in this report might be divided into special-purpose subsets that could be marketed separately or offered by subscription to specialized communities. Examples of such special reports might include, but are not limited to:
Table 6: Special Benchmark Reports
1. Software Quality Assurance (SQA)
2. Software Estimation Tool Analysis
3. Software Development
4. Software Maintenance and Enhancement
5. Software Security and Safety
6. Software Failure Rates by Industry
7. Software Project Management
8. Software Project Offices
9. Data Quality by Industry
10. Database Development
11. Web Site Development
12. Software Process Improvement
13. Software Education and Training
14. Analysis of Cloud Computing
15. Best Practice Analysis Based on Empirical Results
These separate subsets would clearly generate additional revenues over and above those discussed for the entire assessment and benchmark report. However, it is premature to attempt to quantify the numbers of subscribers and revenues for these subsets of information.
Summary and Conclusions about Software Benchmarks
The software industry circa 2011 is characterized by many project failures, by frequent cost and schedule overruns, by poor quality, and by lack of reliable information as to what the phrase "best practices" really means in terms of results.
The software industry is also subject to frequent fads and fallacies as new development methods surface, are touted as panaceas, and then gradually demonstrate only marginal improvements, if any, over alternate development methods. Poor measurement practices and inadequate benchmarks are what make these fads and fallacies endemic problems for the software community.
Benchmarks are very useful for minimizing these common problems, but fewer than 30% of large U.S. software companies have either commissioned benchmarks or use benchmark data. Among small and mid-sized U.S. companies, fewer than 10% utilize benchmark data. In fact, many corporations not only fail to use benchmarks and assessments, they have never even heard of the SEI, ISBSG, ITMPI, and the other major sources of assessment and benchmark information.
The new form of benchmark discussed in this paper is aimed at expanding the information contained in software benchmarks from basic productivity and quality levels up to a complete condensation of critical industry topics where software is part of the equation. The additional kinds of data and information will hopefully elevate benchmarks and assessments from useful but optional studies to mandatory business practices that are demanded by a majority of CEOs and top operational executives.
Once such information is published for a specific industry such as banking, it is expected that demands from other industries will drive the business growth for similar benchmarks in other industries.
The goal for the new form of benchmark is to reach close to 100% of major corporations, more than 50% of medium corporations, and perhaps 25% of smaller corporations. It is difficult to predict government penetration at all levels, but no doubt all 50 states would want to subscribe to this new form of benchmark if state government data were included. A number of Federal agencies would also want to have access to the kinds of data provided by the new benchmarks.
It is hoped that the kinds of information included in these new benchmarks will not only lead to a profitable business, but will assist the software industry in overcoming its traditional problems of cost and schedule overruns combined with poor quality.