Psychometric properties of a test: Reliability, validity and norms

Introduction

Organizations and institutions worldwide use assessments, such as cognitive and aptitude tests, to evaluate prospective employees and students. These tests claim to provide insights into candidates’ behavior, reasoning, and other skills, and the data collected informs selection and development decisions. But how do organizations know these tests measure what they claim to measure?

This is where the psychometric properties of a test come in. Psychometric properties identify and define critical aspects of an assessment, such as its suitability or reliability for use in a specific circumstance. For instance, if a test is presented as an appropriate tool for measuring a particular skill set or trait, its psychometric properties give test creators and users evidence of whether the instrument does what it claims. A good psychometric test must have three fundamental properties: reliability, validity, and norming. When hiring or developing employees, choosing the correct set of assessments is pivotal.

Why are psychometric properties important?

As discussed above, different psychometric properties provide distinct insights into a test’s meaningfulness, appropriateness, and usefulness. Some psychometric characteristics speak about the quality of the whole test, while others give weight to its constituent parts and sections. When considered in totality, the psychometric properties of an assessment could reveal whether the test assesses a single construct or multiple constructs.

Psychometric properties are most often expressed quantitatively, with a numerical quantity such as a coefficient or an index representing each property. Knowing the psychometric properties of a test helps ensure that the information it yields provides a firm foundation for making the right decisions.

What are the psychometric properties of a standardized test?

A standardized test is administered and scored in a consistent, or standard, manner. Such tests are designed to keep the questions, administration conditions, scoring procedures, and interpretations the same for every test-taker. Standardized tests can consist of true-false questions, multiple-choice questions, authentic assessments, or essay questions. Almost any form of assessment can be shaped into a standardized test.

Here are the three psychometric characteristics that must be considered when creating or standardizing tests:

What is reliability in psychometrics?

Psychometric reliability is the extent to which test scores are consistent and free from random measurement error. A reliable test yields precise, consistent scores every time it is taken. An assessment is considered reliable only if it produces similar results across multiple testing occasions, different editions of the test, or multiple raters grading the participant’s responses. Reliability is an essential component of a sound assessment.

Over the years, scholars and researchers have developed multiple ways to check for psychometric reliability. Some include testing the same participants at different points in time or presenting the participants with different versions of the same test to evaluate how consistent their scores are. A test cannot be valid unless it is reliable, although reliability alone does not guarantee validity.

What are the four types of reliability?

The four types of psychometric reliability are:

1. Parallel forms reliability: Two equivalent versions of the test, built to measure the same construct with different items, are given to the same test-takers. Reliability is high when each test-taker scores similarly on both versions.

2. Internal consistency reliability: Items within the test are examined to see whether they all measure the same construct. Reliability is high when responses to the items correlate strongly with one another.

3. Inter-rater reliability: Also called inter-scorer consistency, this is high when two or more raters independently score the same responses similarly.

4. Test-retest reliability: The same test is administered to the same test-takers at different points in time. Reliability is high when their scores stay consistent across administrations.
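
As a rough illustration of how two of these reliability types are quantified, the sketch below computes Cronbach's alpha (a common internal consistency coefficient) and Cohen's kappa (a common inter-rater agreement index). The function names and the sample data are hypothetical and are not taken from any Mercer | Mettl assessment. Parallel forms and test-retest reliability are usually reported as a correlation between the two sets of scores.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(item_scores):
    """Internal consistency. item_scores holds one row per test-taker,
    and each row holds that person's score on every item."""
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores grouped by item
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


def cohen_kappa(rater_a, rater_b):
    """Inter-rater agreement for two raters, corrected for chance.
    Undefined (division by zero) if both raters use a single category."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical data: 5 test-takers answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(responses), 2))

# Hypothetical data: two raters grading the same four essays.
print(cohen_kappa(["pass", "pass", "fail", "fail"],
                  ["pass", "pass", "fail", "pass"]))
```

Both coefficients have a maximum of 1, and values closer to 1 indicate stronger consistency or agreement.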

What is validity?

Validity is the degree to which the test measures what it claims to measure. As per the definition put forward by The Standards for Educational and Psychological Testing (2014), validity is the ‘degree to which evidence and theory support the interpretations of test scores for proposed uses of tests.’

Even though an assessment might be reliable, it may fail to provide the correct measure of the test-takers’ traits if it is not valid. Since the assessor will make decisions about the test takers based on the assessment, the validity inferred from it is crucial. Four types of validity can be measured, and all four should be considered to ensure a test is valid.

The four types of validity are:

Content validity: Does the content appropriately represent all aspects of the construct?
Construct validity: How well does the test measure the particular construct it is designed to measure?
Face validity: Does the test appear, on the surface, to measure what it intends to measure?
Criterion validity: Do the test results correspond to an external criterion, such as a benchmark test or a later measure of performance?
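
Criterion validity is typically reported as a correlation coefficient between test scores and the criterion. The sketch below is a minimal example under that assumption: it computes a Pearson correlation between hypothetical assessment scores and hypothetical supervisor performance ratings. The function name and all numbers are illustrative only.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


# Hypothetical data: assessment scores at hiring and supervisor
# performance ratings (1-5) collected later for the same six employees.
test_scores = [62, 75, 58, 90, 81, 70]
performance = [3.1, 3.8, 2.9, 4.5, 3.6, 3.9]

validity_coefficient = pearson_r(test_scores, performance)
print(round(validity_coefficient, 2))
```

A coefficient near 0 would suggest the test tells us little about the criterion, while larger positive values indicate that higher test scores tend to go with better outcomes.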

What are norms?

Norms are the reference scores obtained from a sample of test-takers, known as the norm group, who represent the intended population for the assessment. An individual’s score is interpreted by comparing it with these reference scores.

Norming helps the test designer understand the group they are assessing and identify what is considered normal within the target group. For example, a test designed to evaluate Java coding skills and used to hire coders with five years of experience will have a norming group comprising Java programmers with five years of experience.
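
To make the idea concrete, the sketch below shows one common way a raw score can be interpreted against a norm group: as a z-score and as a percentile rank. The function names and the norm group scores are hypothetical, and real norming uses far larger samples.

```python
from statistics import mean, stdev


def z_score(raw, norm_scores):
    """Distance of a raw score from the norm group mean, in standard deviations."""
    return (raw - mean(norm_scores)) / stdev(norm_scores)


def percentile_rank(raw, norm_scores):
    """Percentage of the norm group scoring below the raw score,
    counting half of those who score exactly the same."""
    below = sum(s < raw for s in norm_scores)
    equal = sum(s == raw for s in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical norm group: test scores of Java programmers
# with five years of experience.
norm_group = [50, 60, 70, 80, 90]

candidate_score = 80
print(round(z_score(candidate_score, norm_group), 2))
print(percentile_rank(candidate_score, norm_group))
```

The same raw score can translate to a very different percentile rank against a different norm group, which is why the norm group should match the population the test is used on.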

Comprehensive assessment solutions from Mercer | Mettl

Mercer | Mettl, a global leader in online assessments, takes an exhaustive approach to assessment curation. For norming, data is collected from a representative sample of more than two thousand respondents of different ages, genders, job levels, and education. The convergent validity coefficients range from 0.4 to 0.67, and the reliability coefficients range from 0.63 to 0.73.

Furthermore, Mercer | Mettl assessments adhere to the Principles for the Validation and Use of Personnel Selection Procedures set by the Society for Industrial and Organizational Psychology (SIOP) and the Uniform Guidelines on Employee Selection Procedures adopted by the Equal Employment Opportunity Commission (EEOC). The tests also meet the Association of Test Publishers (ATP) and the American Psychological Association (APA) guidelines.

Here is how a Mercer | Mettl assessment curation process operates:

Competency framework creation:

Organizations share their requirements, and a dedicated team of in-house subject matter experts works closely with them to curate the list of competencies and sub-competencies that need to be measured.

Assessment tool mapping:

Once the skill or competency framework is finalized, these competencies are mapped to one or more relevant assessment tools. These range from psychometric tests to aptitude tests for specific industries or skills. The assessments can be in different formats, such as multiple-choice questions (MCQs), situational judgment tests (SJTs), and simulators.

Question creation and curation:

Once the relevant tools for assessment are mapped, the next and most critical step is creating the questions for the test. These questions can be sourced from Mercer | Mettl’s existing question bank of over one million questions covering 3000+ skills.

Assessment configuration:

The assessment platform is fully customizable. It allows organizations to select the difficulty level of the questions, set up the order of questions, allot specific time to each section and much more. They can also send invites to the candidates in bulk or during a selected time slot.

Customized reports:

The reports generated are specific to each candidate and offer directional feedback on strengths and areas of improvement. The reports can be shared in formats like PDF or HTML. All the reports are customizable.

Conclusion

When an organization uses an assessment to evaluate candidates, it must have confidence that the test measures what it is supposed to measure and does so consistently over time. The psychometric properties of an assessment give organizations the evidence for that confidence.

Originally published April 12 2018, Updated March 5 2024

Written by

Vaishali Parnami

Vaishali has been working as a content creator at Mercer | Mettl since 2022. Her deep understanding and hands-on experience in curating content for education and B2B companies help her find innovative solutions for key business content requirements. She uses her expertise, creative writing style, and industry knowledge to improve brand communications.

About This Topic

Psychometric Test/Assessment

Psychometric tests measure an individual’s personality traits and behavioral tendencies to predict job performance. Psychometric assessments gauge cultural fitment, trainability, motivations, preferences, dark characteristics, etc., to hire and develop the right people.
