4. Why do organizations conduct assessment?
Organizations use assessment tools and procedures to help them
perform the following human resource functions:
- Selection. Organizations want to be able to
identify and hire the best people for the job and the organization
in a fair and efficient manner. A properly developed assessment
tool may provide a way to select successful sales people,
concerned customer service representatives, and effective workers
in many other occupations.
- Placement. Organizations also want to be able
to assign people to the appropriate job level. For example, an
organization may have several managerial positions, each having a
different level of responsibility. Assessment may provide
information that helps organizations achieve the best fit between
employees and jobs.
- Training and development. Tests are used to
find out whether employees have mastered training materials. They
can help identify those applicants and employees who might benefit
from either remedial or advanced training. Information gained from
testing can be used to design or modify training programs. Test
results also help individuals identify areas in which
self-development activities would be useful.
- Promotion. Organizations may use tests to
identify employees who possess managerial potential or
higher-level capabilities, so that these employees can be promoted to
assume greater duties and responsibilities.
- Career exploration and guidance. Tests are
sometimes used to help people make educational and vocational
choices. Tests may provide information that helps individuals
choose occupations in which they are likely to be successful.
- Program evaluation. Tests may provide
information that the organization can use to determine whether
employees are benefiting from training and development programs.
7. Limitations of personnel tests and procedures: fallibility
of test scores
Professionally developed tests and procedures that are used as
part of a planned assessment program may help you select and hire
more qualified and productive employees. However, it is essential to
understand that all assessment tools are subject to errors, both in
measuring a characteristic, such as verbal ability, and in
predicting performance criteria, such as success on the job. This is
true for all tests and procedures, regardless of how objective or
standardized they might be.
- Do not expect any test or procedure to
measure a personal trait or ability with perfect accuracy for
every single person.
- Do not expect any test or procedure to be
completely accurate in predicting performance.
There will be cases where a test score or procedure predicts that
someone will be a good worker who, in fact, is not. There will also
be cases where an individual who receives a low score is rejected
but would, in fact, have been a capable and good worker. Such
errors in the assessment context are called selection errors.
Selection errors cannot be completely avoided in any assessment
program.
Why do organizations conduct testing despite these
errors? The answer is that appropriate use of
professionally developed assessment tools on average enables
organizations to make more effective employment-related decisions
than use of simple observations or random decision making.
1. Title VII of the Civil Rights Act (CRA) of 1964 (as amended in
1972); Tower Amendment to Title VII
Title VII is landmark
legislation that prohibits unfair discrimination in all terms and
conditions of employment based on race, color, religion, sex, or
national origin. Other subsequent legislation, for example, ADEA and
ADA, has added age and disability, respectively, to this list. Women
and men, people age 40 and older, people with disabilities, and
people belonging to a racial, religious, or ethnic group are
protected under Title VII and other employment laws. Individuals in
these categories are referred to as members of a protected group.
The employment practices covered by this law include the following:
- performance appraisal
- disciplinary action
- job classification
- union or other membership
- fringe benefits.
Employers having 15 or more employees, employment agencies, and
labor unions are subject to this law.
The Tower Amendment to this act stipulates that
professionally developed workplace tests can be used to make
employment decisions. However, only instruments that do not
discriminate against any protected group can be used. Use only tests
developed by experts who have demonstrated qualifications in this area.
- Uniform Guidelines on Employee Selection Procedures (1978);
adverse or disparate impact, approaches to determine existence of
adverse impact, four-fifths rule, job-relatedness, business
necessity, biased assessment procedures
In 1978, the EEOC and three other federal agencies, namely the Civil
Service Commission (predecessor of the Office of Personnel
Management) and the Labor and Justice Departments, jointly
issued the Uniform Guidelines on Employee Selection Procedures...
The Guidelines cover all employers employing 15 or more employees,
labor organizations, and employment agencies. They also cover
contractors and subcontractors to the federal government and
organizations receiving federal assistance. They apply to all tests,
inventories and procedures used to make employment decisions.
Employment decisions include hiring, promotion, referral,
disciplinary action, termination, licensing, and certification.
Training may be included as an employment decision if it leads to
any of the actions listed above. The Guidelines have significant
implications for personnel assessment.
One of the basic principles of the Uniform Guidelines is
that it is unlawful to use a test or selection procedure that
creates adverse impact, unless justified. Adverse impact
occurs when there is a substantially different rate of selection in
hiring, promotion, or other employment decisions that works to the
disadvantage of members of a race, sex, or ethnic group.
Different approaches exist that can be used to determine whether
adverse impact has occurred. Statistical techniques may provide
information regarding whether or not the use of a test results in
adverse impact. Adverse impact is normally indicated when the
selection rate for one group is less than 80% (4/5) that of another.
This measure is commonly referred to as the four-fifths or 80% rule.
However, variations in sample size may affect the interpretation of
the calculation. For example, the 80% rule may not be accurate in
detecting substantially different rates of selection in very large
or small samples. When determining whether there is adverse impact
in very large or small samples, more sensitive tests of statistical
significance should be employed.
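The four-fifths calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the applicant counts are hypothetical and do not come from the Guidelines, and a ratio below 0.8 should be read as an indicator that calls for closer review, not as a legal finding.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of a group's applicants who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be a positive number")
    return selected / applicants


def impact_ratio(rate_a: float, rate_b: float) -> float:
    """Lower selection rate divided by the higher selection rate."""
    higher = max(rate_a, rate_b)
    if higher == 0:
        raise ValueError("no one was selected in either group")
    return min(rate_a, rate_b) / higher


def indicates_adverse_impact(rate_a: float, rate_b: float,
                             threshold: float = 0.8) -> bool:
    """True when the impact ratio falls below the four-fifths threshold."""
    return impact_ratio(rate_a, rate_b) < threshold


# Hypothetical applicant counts, for illustration only.
rate_group_a = selection_rate(selected=48, applicants=80)
rate_group_b = selection_rate(selected=12, applicants=40)

ratio = impact_ratio(rate_group_a, rate_group_b)
flagged = indicates_adverse_impact(rate_group_a, rate_group_b)
print(f"impact ratio = {ratio:.2f}, below four-fifths: {flagged}")
```

The sketch deliberately stops at the ratio. For very large or very small samples, a test of statistical significance would be needed in addition, as noted above.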
When there is no charge of adverse impact, the Guidelines
do not require that you show the job-relatedness of your assessment
procedures. However, you are strongly encouraged to use
only job-related assessment tools.
3. Interpretation of reliability information from test
manuals and reviews
Test manuals and independent reviews of tests provide information
on test reliability. The following discussion will help you
interpret the reliability information about any test.
The reliability of a test is indicated by the reliability coefficient.
It is denoted by the letter r, and is expressed as a
number ranging between 0 and 1.00, with r = 0 indicating no
reliability, and r = 1.00 indicating perfect reliability. Do not
expect to find a test with perfect reliability. Generally, you
will see the reliability of a test as a decimal, for example, r =
.80 or r = .93. The larger the reliability coefficient, the more
repeatable or reliable the test scores. Table 1 serves as a
general guideline for interpreting test reliability. However, do
not select or reject a test solely based on the size of its
reliability coefficient. To evaluate a test's reliability,
you should consider the type of test, the type of reliability
estimate reported, and the context in which the test will be used.
Table 1. General Guidelines for Interpreting Reliability Coefficients
0.80 - 0.89
0.70 - 0.79
may have limited applicability
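To make the reliability coefficient more concrete, the following Python sketch estimates one common type of reliability, test-retest reliability, as the Pearson correlation between two administrations of the same test. The choice of test-retest reliability, the function name, and the scores are assumptions made for illustration; test manuals may report other types of reliability estimates computed in other ways.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical scores for the same six people tested on two occasions.
first_administration = [72, 85, 90, 64, 78, 88]
second_administration = [70, 88, 91, 60, 80, 85]

reliability = pearson_r(first_administration, second_administration)
print(f"test-retest reliability r = {reliability:.2f}")
```

A sample this small gives only a rough estimate; published coefficients are normally based on much larger groups.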
5. Personality inventories
In addition to abilities, knowledge, and skills, job success also
depends on an individual's personal characteristics.
Personality inventories designed for use in employment contexts are
used to evaluate such characteristics as motivation,
conscientiousness, self-confidence, or how well an employee might
get along with fellow workers. Research has shown that, in
certain situations, use of personality tests with other assessment
instruments can yield helpful predictions.
7. Ensuring both efficiency and diversity
To help ensure both efficiency and diversity in your workforce,
apply the whole-person approach to assessment. Use a variety of
assessment tools to obtain a comprehensive picture of the skills and
capabilities of applicants and employees. This approach to
assessment will help you make sure you don't miss out on some
very qualified individuals who could enhance your organization.
9. How to interpret validity information from test manuals
and independent reviews
To determine if a particular test is valid for your intended use,
consult the test manual and available independent reviews. (Chapter
5 offers sources for test reviews.) The information below can help
you interpret the validity evidence reported in these publications.
- In evaluating validity information, it is important to
determine whether the test can be used in the specific way you
intended, and whether your target group is similar to the group
on which the test was developed.
Test manuals and reviews should describe:
- Available validation evidence supporting use of the test
for specific purposes. The manual should include a thorough
description of the procedures used in the validation studies
and the results of those studies.
- The possible valid uses of the test. The purposes for which
the test can legitimately be used should be described, as well
as the performance criteria that can validly be predicted.
- The sample group(s) on which the test was developed. For
example, was the test developed on a sample of high school
graduates, managers, or clerical workers? What was the racial,
ethnic, age, and gender mix of the sample?
- The group(s) for which the test may be used.
- The criterion-related validity of a test is measured by the
validity coefficient. It is reported as a number between 0 and
1.00 that indicates the magnitude of the relationship, r,
between the test and a measure of job performance (criterion). The
larger the validity coefficient, the more confidence you can have
in predictions made from the test scores. However, a single test
can never fully predict job performance because success on the job
depends on so many varied factors. Therefore, validity
coefficients, unlike reliability coefficients, are rarely high. As a
general rule, the higher the validity coefficient the more
beneficial it is to use the test. Validity coefficients of r=.21
to r=.35 are typical for a single test. Validities for selection
systems that use multiple tests will probably be higher because
you are using different tools to measure/predict different aspects
of performance, where a single test is more likely to measure or
predict fewer aspects of total performance. Table 3 serves as a
general guideline for interpreting test validity for a single
test. Evaluating test validity is a sophisticated task, and you
might require the services of a testing expert. In addition to the
magnitude of the validity coefficient, you should also consider at
a minimum the following factors:
Table 3. General Guidelines for Interpreting Validity Coefficients
Validity coefficient value    Interpretation
0.21 - 0.35                   likely to be useful
0.11 - 0.20                   depends on circumstances
below 0.11                    unlikely to be useful
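As a rough illustration of the validity coefficient discussed above, the Python sketch below correlates test scores with a measure of job performance. The scores, the supervisor ratings, and the function name are hypothetical. The squared coefficient is included as a standard statistical reading (the share of variation in the criterion associated with the test) and is not taken from this guide.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("values must vary for a correlation to be defined")
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical data: test scores at hiring and later supervisor ratings
# (the criterion) for the same eight employees.
test_scores = [52, 61, 70, 45, 66, 58, 74, 49]
performance_ratings = [3.1, 3.4, 3.3, 3.0, 2.7, 3.6, 3.2, 2.9]

validity = pearson_r(test_scores, performance_ratings)
variance_explained = validity ** 2  # share of criterion variation tied to the test

print(f"validity coefficient r = {validity:.2f}")
print(f"share of performance variation explained = {variance_explained:.0%}")
```

Even a coefficient that counts as useful leaves most of the variation in performance unexplained, which is consistent with the point that a single test can never fully predict job performance.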