SpeechX: An AI-powered English Proficiency Test Software Tool

Talent Assessment | 6 Min Read


A globally interconnected marketplace has lent English considerable clout in providing a level playing field for the exchange of information and ideas. English-speaking skills have assumed foremost importance for successfully undertaking any consumer-centric role in the service sector, cutting across industries from hospitality, retail, travel, and tourism to insurance, sales, and banking.

English plays a critical role in accurately communicating the nuances of a product or service up for sale. It is therefore of notable importance to recruiters scouting for the right talent to further their organization’s commercial objectives. The argument holds especially for BPO players, which must hire candidates with the necessary English-speaking skills by conducting an English proficiency test in order to maintain a favorable consumer interface. They are on the lookout for people who will become the first line of contact for their existing and prospective consumers, so the margin for error is virtually non-existent. This is why a test of spoken English skills is one of the foremost requirements for hiring in consumer-facing roles, especially in BPOs, and why every prospective hire is mandatorily subjected to a language assessment test for BPOs.

Hiring is a complicated, multi-layered process that demands the very best of human and technological skills. It involves several stages, from managing the logistics to ensuring the sanctity of the exercise itself.


The established guidelines for hiring in the BPO industry

In the established process, BPO companies mandate a language assessment test to evaluate candidates’ English-speaking skills, and they hire Voice and Accent (VNA) trainers/assessors to conduct it. These professionals are trained in a specific framework called the ‘Common European Framework of Reference for Languages’ (CEFR). (For the uninitiated, CEFR is a set of guidelines used to describe the achievements of learners of foreign languages throughout Europe and, increasingly, the world over.) While these trainers/assessors may be trained on other frameworks, they are primarily trained on the CEFR, a common standard for English language testing.


How VNA Trainers Conduct English Proficiency Assessments

VNA trainers, also known as voice and accent assessors, evaluate prospective hires’ English-speaking ability by giving them a language proficiency test. They evaluate candidates’ pronunciation, fluency, accent, intonation and grammar. Through the English language proficiency test, a VNA trainer/assessor looks for errors against these criteria and judges whether the candidate can be trained to overcome any discrepancies.

While conducting the English ability test, VNA trainers focus on discrepancies, which are divided into trainable and non-trainable errors. For instance, it would be counted as a non-trainable error if a candidate says ‘Soe’ instead of ‘Shoe.’ Similarly, a candidate saying ‘pleajure’ instead of ‘pleasure’ would constitute a non-trainable error. These non-trainable errors, also called fatal errors, may also be grammatical.

Conversely, trainable (non-fatal) errors are those that can be worked on and corrected in a relatively short time. For instance, pronouncing an overly definite ‘T’ or ‘D’ sound while speaking is a non-fatal, trainable error. This process enables VNAs to ascertain candidates’ English-speaking and comprehension skills and determine their suitability for employment.
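The trainable versus non-trainable split described above can be pictured as a simple lookup. The following is only a minimal sketch: the error labels, the `triage` function, and the rule that a single non-trainable error raises a flag are all assumptions made for illustration, not a description of how VNA assessors or any tool actually record errors.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical error labels, invented for this sketch. The groupings mirror
# the examples in the text: 'Soe' for 'Shoe' and 'pleajure' for 'pleasure'
# are non-trainable; an overly definite 'T' or 'D' sound is trainable.
NON_TRAINABLE = {"sh_as_s", "zh_as_j"}
TRAINABLE = {"hard_t", "hard_d"}


@dataclass
class Triage:
    trainable: List[str] = field(default_factory=list)
    non_trainable: List[str] = field(default_factory=list)
    unclassified: List[str] = field(default_factory=list)

    @property
    def has_fatal_error(self) -> bool:
        # Assumption: one non-trainable (fatal) error is enough to raise the flag.
        return bool(self.non_trainable)


def triage(errors: List[str]) -> Triage:
    """Sort observed error labels into trainable, non-trainable, or unclassified."""
    result = Triage()
    for label in errors:
        if label in NON_TRAINABLE:
            result.non_trainable.append(label)
        elif label in TRAINABLE:
            result.trainable.append(label)
        else:
            # Anything outside the rule table is left for a human assessor.
            result.unclassified.append(label)
    return result
```

Calling `triage(["sh_as_s", "hard_t"])` would place the first label under non-trainable errors and the second under trainable ones. In practice, the rule table would come from an organization’s own assessment guidelines.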


Challenges In The Existing Process of The English Proficiency Test by VNA Assessors

However, this process of giving an English competency test is not rigid and has its nuances. Over the years, each company/organization has built and developed its own set of nuances and best practices to make hiring a seamless and hassle-free exercise. They subject candidates to multiple rounds of screening using VNA trainers/assessors who have a defined set of processes to make hiring decisions.

  • Scaling for mass-hiring is resource-intensive: Employing VNA trainers/assessors works exceptionally well for BPO hiring and has therefore rightly been the established norm for a long time. Its efficacy at a small scale is undisputed, but BPOs often struggle when they plan to hire at a larger scale. Scaling the process is both resource-intensive and time-consuming, and it is highly likely to make the entire recruitment process sluggish, consequently affecting business plans and the company’s balance sheet.
  • Human-led methods are prone to bias: This is not to belittle their efforts, but a human-led process run by VNA trainers is not free from bias, as every VNA trainer has his/her own inherent understanding, likes and dislikes. The visible lack of consistency across multiple trainers is also a challenge in hiring at scale. The BPO industry faces high attrition rates and is perennially in hiring mode, and such a labor-intensive process is prone to errors. These errors may dilute the outcome, inadvertently lowering the quality of the hire.

Given these constraints in scaling the VNA-led process quickly and in a financially viable manner, BPOs usually rely on automated English assessment tools to aid the evaluation process for hiring the right set of candidates.


Is there an answer to the challenge of conducting mass assessment for consumer-facing roles in BPOs? 

BPOs routinely face the challenges mentioned above. However, there is a tool available on the market that addresses these pain points by auto-evaluating candidates’ English-speaking skills and simulating the VNA experience, offering a language proficiency test online. An Artificial Intelligence (AI)-based automatic English evaluation tool can bypass the problems BPOs face when undertaking mass hiring with VNA trainers/assessors. Such a tool uses English pronunciation test software to address the challenges detailed above.


Introducing SpeechX: An AI-powered English Proficiency Test Software Tool

Mercer | Mettl’s innovative tool, ‘SpeechX,’ addresses these existing BPO industry challenges. Powered by reliable Artificial Intelligence Speech Technology, this assessment tool is fully machine-administered and auto-graded to test a non-native speaker’s ability to speak and understand English. This English fluency test software is a scalable means of assessing prospective hires’ capability with a high level of accuracy by simulating a VNA trainer. It is also a beneficial and ready-to-use assessment solution for corporate houses to hire for critical client-facing roles and sales profiles. 

SpeechX is a video-proctored assessment that analyzes a candidate’s proficiency across two key dimensions – the ability to listen and to articulate clearly. It reviews linguistics to identify correct and incorrect information in the candidate’s speech and detects errors in reading sentences and extempore speech. It also undertakes para-linguistic voice analytics to measure the quality and clarity of a candidate’s statements.



Why does SpeechX simulate a VNA trainer? 

Just as a VNA trainer determines a candidate’s employability based on guidelines designed on the CEFR framework, SpeechX simulates a VNA trainer by giving an English language test online. SpeechX rates the candidate along the lines of a VNA trainer to establish the candidate’s level of English proficiency and employability. It checks the critical parameters of pronunciation, grammar, fluency, and listening skills, factoring in the nuances examined by VNA trainers, thereby simulating the entire process.


SpeechX and CEFR

The CEFR was put together at the beginning of the 1990s by the Council of Europe to foster continent-wide collaboration among language teachers. It is one of many frameworks used to assess a non-native speaker’s English-speaking proficiency. Interestingly, because the framework was designed for all European languages, it can also be applied to test a person’s French or Spanish-speaking skills!

The framework has three core dimensions: language activities, the domains in which they occur, and the competencies that are drawn upon when engaged in them. It principally divides learners into three segments, which are further divided into six levels of language proficiency. The three segments are basic users, independent users, and proficient users. A basic user is categorized into two levels, i.e., A1 (beginner) and A2 (elementary). An independent user is classified into B1 (intermediate) and B2 (upper-intermediate). Similarly, a proficient user is bracketed into C1 (advanced) and C2 (proficiency). A CEFR test determines the level of competency and provides a score to the candidate, which is utilized to decide on his/her employability.
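The three segments and six levels can be laid out as a small ordered structure. This is an illustrative sketch only: the `CefrLevel` enum and the `meets_minimum` helper are hypothetical names, and any cutoff level used with them is an assumption, since the minimum level for a given role is set by each organization.

```python
from enum import Enum


class CefrLevel(Enum):
    # Members are declared from lowest to highest proficiency.
    # Each value is (segment, level label) as described in the text.
    A1 = ("basic user", "beginner")
    A2 = ("basic user", "elementary")
    B1 = ("independent user", "intermediate")
    B2 = ("independent user", "upper-intermediate")
    C1 = ("proficient user", "advanced")
    C2 = ("proficient user", "proficiency")

    def __init__(self, segment: str, label: str) -> None:
        self.segment = segment
        self.label = label


# Definition order doubles as the proficiency ranking.
_ORDER = list(CefrLevel)


def meets_minimum(level: CefrLevel, minimum: CefrLevel) -> bool:
    """Return True if `level` is at or above the (hypothetical) cutoff `minimum`."""
    return _ORDER.index(level) >= _ORDER.index(minimum)
```

For example, `meets_minimum(CefrLevel.C1, CefrLevel.B2)` compares a candidate’s level against an assumed B2 cutoff, and `CefrLevel.B1.segment` gives the segment that level belongs to.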

SpeechX has incorporated the elements of CEFR and condensed them into four components: pronunciation, fluency, grammar, and listening comprehension.

Pronunciation:

To ascertain trainable and non-trainable errors.

Fluency:

To understand whether a candidate can speak fluent English while conversing.

Grammar:

To verify a candidate’s level of understanding of grammar and to detect trainable and non-trainable errors.

Listening Comprehension:

A candidate is subjected to listening comprehension and assessed on his/her ability to listen and comprehend.

High accuracy with Carnegie Speech 

SpeechX combines proprietary speech and voice analytics with Carnegie Speech’s patented Speech Recognition Engine and Pinpointing Technology. The platform is backed by over thirty years of experience in undertaking assessments. It can listen to a non-native speaker of English and identify errors down to the level of the individual sound, or phoneme. As a result, universities in the USA, including Stanford, Yale, and Northwestern, to name a few, often refer to Carnegie Speech’s engine as the most accurate system on the market. By processing hundreds of millions of speech assessments every year from countries worldwide, Carnegie Speech has gained unparalleled expertise in Speech Technology.


Incorporates best practices of the BPO industry

SpeechX has been designed after assimilating insights from various SMEs, VNA trainers, and BPO industry experts to combine the best practices of the BPO industry. On the lines of a VNA trainer who assesses trainable and non-trainable errors, SpeechX flags whether the gaps shown in the evaluation of a candidate are trainable. 

While evaluating pronunciation, it checks for more than eight critical non-trainable mistakes. SpeechX reviews fluency by measuring it across more than ten dimensions, such as speaking rate, prosody, intonation and pausing. It also evaluates a candidate’s understanding of grammar rules. Further, it assesses a candidate’s listening ability, looking for fact-based and inference-based knowledge.


Challenges With The Existing Tools In The Market

1. Impersonation and Cheating:

Firstly, impersonation and fraud are intrinsic risks in such large-scale recruitment. Despite the best intentions of VNA assessors, unfair means are in fact employed. Current tools on the market that rely on IVR (interactive voice response) are susceptible to cheating because the test cannot be monitored.

2. Accuracy:

There have been several reports of glaring discrepancies between the VNA’s results and those of existing tools. This mismatch raises questions about the validity of the test and calls for a re-evaluation.

3. The Lack of Ease of Use:

The IVR-based setup is challenging to administer, and the user experience often suffers too. It also requires putting in place considerable logistics, which can be time-consuming and resource-intensive for organizations.

How SpeechX, the most innovative English Speaking Test Software tool, solves these pain-points


1. Impersonation and Cheating:

SpeechX uses AI-based video monitoring to deter candidates from using unfair means, thereby ensuring the sanctity of the English proficiency exam. It auto-generates cheating flags using AI-proctoring technology.

2. Accuracy:

SpeechX is powered by Carnegie Speech’s world-class speech evaluation and recognition technology. This patented and reliable technology ensures a high degree of accuracy in the English test results.

3. The Lack of Ease of Use:

It is a computer-based process and does not face the challenges enumerated for the IVR process. It conducts a tool-based communication test, which offers a smoother, superior experience overall for both the test-taker and the test administrator.


Key Elements/USPs of SpeechX

1. A Detailed Report:

A candidate’s performance is summarized in a comprehensive, actionable and objective report, immediately available for action. The report can be accessed in real-time and compared across multiple applicants and across business and educational enterprises. It includes a CEFR and SpeechX score.

2. Accessibility:

It is accessible on the cloud. Therefore, the results of the online English proficiency test can be accessed at a moment’s notice, without the hassle of maintaining logistics. It can be used on computers and smartphones, providing a high degree of mobility to candidates.


Mercer | Mettl’s Communication-based Assessments

Thus far, we have detailed how voice and accent trainers conduct tests to assess candidates’ English-speaking proficiency and the methods employed for such evaluations. We also outlined the advantages of using AI-based tools. Now, let us briefly look at some of the most popular communication-based assessments for ascertaining employability, depending on organizations’ needs. Mercer | Mettl offers several scientifically validated and detailed assessments. They are:

  • Corporate communication skills assessment: This assessment uses subjective and objective questions to quantifiably evaluate candidate fitment for corporate roles involving direct consumer interactions. Such tests are domain-specific and offer deep insights into candidates’ abilities to listen intently and to speak fluently and coherently, using a writing simulator, a listening simulator and reading comprehension exercises. We encourage you to check it out.
  • Customer care representative: This assessment is specifically designed to help companies evaluate candidates’ technical and vocational skills. Employers seek candidates with a heady mix of technical and vocational skills to drive client-related conversations, solve queries, retain clientele and forge new relationships. This test gauges candidates’ attributes, customer service orientation, work management and cognitive abilities to help make informed decisions. Such an assessment is a must for those vying for the best-suited candidates for their job roles.
  • SpeechX: SpeechX is a call center assessment that objectively and scientifically evaluates candidates on four broad parameters: fluency, listening comprehension, pronunciation and grammar. It uses the CEFR framework employed by VNA assessors. You can use this assessment to conduct large-scale hiring with ease and in a very short timespan. Check out this assessment to hire the best candidates and watch your business grow.
  • English grammar, spelling and punctuation test: Such a test is critical not only for hiring but equally for learning and development. It offers a detailed understanding of candidates’ abilities to perform their roles without language-related impediments. It can also be used to understand current proficiency levels and to train and develop employees effectively. Do have a look.



Changing times call for newer methods of talent assessment. The BPO industry is one of the few sectors with a constant influx of workforce. It is challenged by a considerably high attrition rate, as much as ten percent higher than the industry average of 35 percent. It requires a cost-effective, smart and time-saving English testing system to improve efficiency and lower input costs. With an AI-backed solution, companies can be assured of hiring the right talent at scale, without overextending themselves financially or otherwise.



Originally published March 12 2021, Updated November 22 2023

Written by

Shashank has been working in the publishing and online industry for eight-plus years now. He has donned many hats and has reported on diverse industry verticals, including aviation, tourism, hospitality, etc. He is currently the senior editor at Mercer | Mettl.

About This Topic

English proficiency test software is used to evaluate the English-speaking and comprehension skills of an individual. Customer-facing industries like hospitality, retail, BPOs, etc., often use some variant of English proficiency test software to assess the pronunciation, fluency, accent, intonation and grammar of potential employees.
