The Science of Hiring

Hiring managers are afraid to make a bad hire. It’s a justified fear—a bad hire costs the company more than just money. So, hiring managers developed their own intricate hiring processes to ensure bad hires don’t get offers. And these processes work, right?

Bad news: the research is clear. Unless they are built on proven selection methods, complex and arbitrary hiring processes do not reliably predict the job performance of a potential hire.

Good news—you can feel more confident about your ability to make a good hire if you use a streamlined process based on selection methods that are proven to predict an individual’s on-the-job performance.

This article takes the research available and applies it specifically to hiring technical talent. Our focus is on best practices to implement and too-frequent poor practices to discontinue.

Selection methods are the methods we use to evaluate candidates.

Validity is a measure of how accurately a selection method predicts future job performance. Higher validity means more accurate prediction. Validities are usually expressed as a number between 0 and 1 (e.g., 0.61) and are sometimes read informally as percentages (e.g., 61%).
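
To make the idea concrete: in the research literature, a validity figure is typically a correlation coefficient between scores on a selection method and a later measure of job performance. The sketch below assumes that reading. The function name and the sample data are hypothetical, and published figures are usually statistically corrected, so a plain calculation like this is an illustration rather than a way to reproduce the numbers in this article.

```python
from math import sqrt

def validity(scores, performance):
    """Pearson correlation between selection scores and later performance ratings."""
    n = len(scores)
    if n != len(performance) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_s = sum(scores) / n
    mean_p = sum(performance) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(scores, performance))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_p = sum((p - mean_p) ** 2 for p in performance)
    return cov / sqrt(var_s * var_p)

# Hypothetical data: test scores at hiring time and manager ratings a year later.
test_scores = [55, 62, 70, 74, 81, 90]
ratings = [2.9, 3.4, 3.1, 3.8, 3.6, 4.4]
print(round(validity(test_scores, ratings), 2))
```

A value near 1 means candidates who scored higher also tended to perform better later; a value near 0 means the score said little about performance.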

While a data-backed hiring process might seem harsh and impersonal, it has other benefits besides candidate vetting. Using high-validity selection methods decreases risky hires, eliminates hiring bias, and increases diversity in the workplace.

We talk about our recommended hiring process in another article; in this article, we’ll explain what the research says, which selection methods have high validity, and considerations you should make for the candidate experience.

What the Research Says

The primary source for this article is a large body of research dedicated to effective hiring methods. The research spans a hundred years and tens of thousands of employees, and its findings reliably predict future job performance no matter the field or industry.

A 2016 meta-study [1] compiled the available research and yielded the following high-level observations:

  • Certain selection methods have a higher validity than others
  • Certain combinations of selection methods have a higher validity than others
  • Combining selection methods with little-to-no correlation produces the highest validity

An additional study in 2020 that examined the validity of brainteaser interview questions was used as a secondary source.

What selection methods have been studied?

Although many selection methods have been studied, these are the selection methods we’ll focus on, since they are most often used to hire technical talent:

  • GMA Tests: general mental ability (GMA) tests assess the candidate’s general intelligence or problem-solving ability.
  • Integrity Tests: used to identify counterproductive or risky behaviors in candidates like theft, drug use, or fighting while on the job; can also measure conscientiousness, agreeableness, and emotional stability.
  • Work Sample Tests: the candidate performs the tasks they would do on the job; these are standardized tests that closely mirror the actual work and working environment of the job.
  • Job Knowledge Tests: the candidate answers questions about how to perform specific aspects of the job; the answers reveal the depth of their job knowledge.
  • Unstructured Interviews: vary from candidate to candidate because there is no set list of questions to ask candidates applying for a certain role.
  • Structured Interviews: utilize an established list of questions with predetermined good and bad answers; interviewers stick to the list, which reduces the influence of bias when evaluating candidates.
  • Brainteaser Interviews: use unexpected or random questions in an attempt to test the candidate’s ability to problem solve and think quickly on their feet. The 2020 study [2] identified three types of brainteaser questions:
      • Justification – has numerous plausible answers but requires justification to answer fully
      • Definitive-answer – has a specific and objectively correct answer
      • Oddball – a random question with no right or wrong answer
  • Years of Previous Job Experience: evaluates the candidate’s previous work experience to determine how the candidate will perform in a job.
  • Years of Education: uses the candidate’s level of education as a basis for determining how the candidate will perform in a job.
  • Interests: uses the candidate’s extracurricular interests or hobbies to determine how the candidate will perform in a job. This is most effective when the interests considered are vocation related.

Predictive validity of selection methods

Table 1 – Predictive Validity of Selection Methods

Selection Method                        Predictive Validity
GMA Tests                               0.65
Employment Interviews (Structured)      0.58
Employment Interviews (Unstructured)    0.58
Job Knowledge Tests                     0.48
Integrity Tests                         0.46
Brainteasers (Justification)            0.42
Work Sample Tests                       0.33
Interests                               0.31
Brainteasers (Oddball)                  0.30
Brainteasers (Definitive-answer)        0.26
Years of Job Experience                 0.16
Years of Education                      0.10

The data reflected in this table represents the validity of the selection methods when used independently of one another. This data comprises two studies. See footnotes for more information.

Table 2 – Predictive Validity of Selection Methods when used with a GMA Test

Selection Method                        Predictive Validity
Integrity Tests                         0.78
Employment Interviews (Structured)      0.76
Employment Interviews (Unstructured)    0.73
Interests                               0.71
Years of Job Experience                 0.68
Job Knowledge Tests                     0.65
Work Sample Tests                       0.65
Years of Education                      0.65
Brainteasers (Oddball)                  0.47
Brainteasers (Justification)            0.46
Brainteasers (Definitive-answer)        0.39

The data reflected in this table represents the validity of each selection method when used alongside a GMA test. This data comprises two studies. See footnotes for more information.

GMA tests

GMA tests have the single highest predictive validity at 0.65. If you do nothing else in your hiring process, administer a GMA test—it gives you the best chance at predicting an applicant’s future job performance. The other selection methods should supplement your use of a GMA test.

An important aspect of these tests is that, in addition to measuring general intelligence and problem-solving ability, they are also a predictor of learning ability (Schmidt and Hunter, 1998). Learning ability is a large part of future job performance and internal promotion. Your team will benefit from having employees who are smart, can problem solve, and will continue to learn on the job as they are assigned new projects and challenges.

Integrity tests

Integrity tests are somewhat middle of the pack in terms of stand-alone validity (0.46). However, integrity tests have low correlation with cognitive ability tests, so when used in combination with a GMA test, the validity increases to 0.78. Using these two selection methods in tandem drastically improves your ability to predict the future job performance of candidates.

Integrity tests, aside from highlighting risky behaviors, also give insight into personality traits. High integrity often correlates with high conscientiousness and high agreeableness. In hiring, conscientiousness is generally defined as wanting to do your work well and thoroughly, and agreeableness as getting along well with others. Candidates who score well on an integrity test are worth interviewing because the score is both an indicator of their personality traits and a decent predictor of their job performance.

Unstructured vs structured interviews

When used on their own, unstructured and structured interviews have the same predictive validity (0.58). However, when used alongside a GMA test, structured interviews outperform unstructured interviews. This is because structured interviews don’t correlate as closely with GMA tests; structured interviews and GMA tests measure different things, and so the two used together are a stronger predictor of job performance.
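
The effect described above can be sketched with the standard formula for the multiple correlation of two predictors. This is a minimal illustration, not the meta-study's own calculation: the function name is hypothetical, and the intercorrelation values (`r12`) are made up, because the article does not report how strongly each interview type correlates with GMA tests.

```python
from math import sqrt

def combined_validity(r1, r2, r12):
    """Multiple correlation of two optimally weighted predictors with a criterion.

    r1, r2: validity of each predictor on its own.
    r12:    correlation between the two predictors.
    """
    if not -1 < r12 < 1:
        raise ValueError("r12 must be strictly between -1 and 1")
    return sqrt((r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2))

GMA = 0.65        # Table 1
INTERVIEW = 0.58  # Table 1, same for both interview types

# Hypothetical intercorrelations with the GMA test, for illustration only.
for r12 in (0.0, 0.2, 0.4, 0.6):
    print(f"r12={r12:.1f} -> combined={combined_validity(GMA, INTERVIEW, r12):.2f}")
```

With the stand-alone validities held fixed, the combined figure drops as the assumed overlap with the GMA test grows, which is the pattern the paragraph describes for structured versus unstructured interviews.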

The debate about the benefits of unstructured and structured interviews is ongoing, but the data does support using employment interviews to evaluate candidates. When used with GMA tests, the two interview types have the second and third highest validities in Table 2 (0.76 and 0.73), behind only integrity tests. Facet favors structured interviews because they allow us to fairly evaluate candidates and efficiently train and prepare our interviewers.


Interests

Earlier meta-studies reported low validity for interests as a predictor of job performance. However, the 2016 meta-study reanalyzed the data and found that vocation-related interests do predict job performance with some level of validity (0.31). While that is significantly lower than other selection methods when used alone, evaluating interests alongside a GMA test provides the fourth highest validity (0.71).

Remember that this only holds for vocation-related interests. For developers, that might look like personal coding projects or contributions to open-source projects.

Take this with a grain of salt: Stack Overflow reported that 72.87% of professional developers code as a hobby. If you conclude that a candidate is a good developer simply because they code outside of work, your bar isn’t high enough. The share of developers who code as a hobby likely says more about passion for coding than about technical expertise.

Work sample tests vs. Brainteasers

On its own, a work sample test has a predictive validity of 0.33, and when used with a GMA test, that number rises to 0.65. This data may have given rise to the frequent use of coding tests like LeetCode, HackerRank, or Byteboard to evaluate developers. As a hiring manager, you might think you are administering a work sample test by using these coding tests, but you aren’t. You aren’t modeling the environment or even the types of problems the job involves, and so these coding tests are no better than brainteasers.

Famously, Google used brainteaser questions to evaluate developers during interviews. But in 2013, their chief of human resources announced they were discontinuing the practice, saying the questions were a waste of time. Brainteasers have not been extensively tested, but a 2020 study examined their predictive validity and suggested that brainteasers correlate with cognitive ability rather than job performance.

This is reflected in the data in Table 2: the predictive validity of any type of brainteaser does not rise by a statistically significant amount when used with a GMA test, because the two are closely correlated. Besides that, their predictive validity is quite low compared to other selection methods. Your time and money as a hiring manager would be better spent on a tailored work sample test than on coding tests that function as brainteasers. Do this, and you can expect significantly better hires.

Job knowledge tests

Job knowledge tests are also middle of the pack when used on their own (0.48), and they reach 0.65 when used with a GMA test. Job knowledge tests might be a good alternative to work sample tests (and certainly to brainteaser coding tests) because they are easy to administer in person and don’t require the candidate to take a coding test during their personal time.

You can expect the validity of your hiring process to increase if you administer a job knowledge test during a structured or unstructured interview alongside a GMA test.

Job experience and years of education

Once you have reached a certain threshold, your performance depends more on your ability to learn on the job than on what you’ve done in the past. Ability to learn is best tested by a GMA test, not by your previous job experience (0.16 validity) or how many years you’ve gone to school (0.10 validity).

When used with a GMA test, these two selection methods have a drastically improved predictive validity (see Table 2). This is because the GMA test is an accurate predictor of job performance, not because previous experience or education are good predictors. Using these selection methods on their own to evaluate if the candidate will be a good hire is only barely better than randomly selecting a candidate’s name out of a hat. And if you are using a GMA test as a baseline, you are better off using one of the higher-validity methods to better predict job performance.
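
A rough way to see this is to subtract the stand-alone GMA validity from each combined figure in Table 2. The sketch below does only that simple subtraction, using the numbers from the two tables; the variable and function names are illustrative. Brainteasers are left out because footnote [2] notes they come from a separate study with a different cognitive-assessment baseline.

```python
GMA_ALONE = 0.65  # Table 1

# Table 2: validity of each method when used alongside a GMA test.
with_gma = {
    "Integrity Tests": 0.78,
    "Employment Interviews (Structured)": 0.76,
    "Employment Interviews (Unstructured)": 0.73,
    "Interests": 0.71,
    "Years of Job Experience": 0.68,
    "Job Knowledge Tests": 0.65,
    "Work Sample Tests": 0.65,
    "Years of Education": 0.65,
}

def gain_over_gma(combined, baseline=GMA_ALONE):
    """How much the second method adds on top of the GMA test alone."""
    return round(combined - baseline, 2)

gains = {method: gain_over_gma(value) for method, value in with_gma.items()}

# List the methods from largest to smallest gain.
for method, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{method:<38} +{gain:.2f}")
```

Read this way, years of education adds nothing to the 0.65 the GMA test already provides, and years of job experience adds 0.03, while integrity tests and structured interviews add 0.13 and 0.11.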

Combining Selection Methods

You’ll notice that, when combined with a GMA test, all the selection methods experienced an increase in validity. Previously low-validity selection methods, like brainteasers or job experience, become somewhat effective at predicting job performance.

That increase shouldn’t be used to justify using low-validity selection methods. The combinations that stand out are a GMA test plus an integrity test (0.78) and a GMA test plus a structured interview (0.76). And, as noted in the 2016 meta-study, “A further advantage of these two combinations is that they can be used for both entry level hiring and selection of experienced job applicants.”

These two combinations have the highest predictive validity because the selection methods have very low correlation, meaning they don’t test similar things. Either of these combinations will allow you to test for multiple characteristics in candidates with a high probability of predicting future job performance.
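
Why low correlation matters can be written down with the standard two-predictor multiple-correlation formula. The symbols here are assumptions introduced for illustration: r_1 and r_2 are the stand-alone validities of the two methods, r_12 is the correlation between them, and R is the combined validity.

```latex
R = \sqrt{\frac{r_1^2 + r_2^2 - 2\, r_1 r_2\, r_{12}}{1 - r_{12}^2}}
\qquad\text{and, when } r_{12} = 0,\qquad
R = \sqrt{r_1^2 + r_2^2}
```

In the uncorrelated case, each method contributes its full share. Plugging in the Table 1 values for a GMA test (0.65) and an integrity test (0.46) with r_12 = 0 gives roughly 0.80, in the neighborhood of the 0.78 in Table 2. The published figure comes from the meta-study's own data, so this simplified case is not expected to match it exactly.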

As you plan out your hiring process, combine selection methods that don’t correlate and are efficient to administer. Doing so will maximize the probability that any given hire will be a good one.


Interested in the hiring process that Facet recommends, or want to build your own scientific hiring process? Read the second article in this two-part series: How to Design a Scientific Hiring Process.

[1] The figures in this paper reflect the numbers from the 2016 study. As of November 2022, the 2016 study is still a working paper. It was prepared as an update to Schmidt and Hunter (1998). In some cases, the updated findings indicate that some selection methods were substantially underestimated for the 1998 publication. The more accurate updated validities are used here to promote a more effective hiring process.

[2] The effectiveness of brainteaser interviews is anecdotally reported, but a limited number of studies have been conducted, so there is little data to support the claim. However, a study published in December 2020 as part of a graduate thesis found that “Results indicate that brainteaser interview questions, particularly ‘oddball’ questions, are predictive of these applicant characteristics, but primarily through their covariation with other, typical selection assessments. . . . Further results showed that applicants have strong negative reactions to being asked brainteaser questions in an interview.” Although the cognitive assessment validity in this study is a bit higher (0.70) than the GMA validity reported by Schmidt and Hunter (1998), to maintain clarity of where the information came from, the brainteaser validities have not been adjusted to reflect that ratio.
