Notes on hiring and the stable marriage problem

26 Apr, 2024

Joe Mosby's blog

The stable marriage problem is the problem of finding a match between two equally sized sets of elements given an ordering of preferences for each element. The problem is defined as follows:

Given n men and n women, where each person has ranked all members of the opposite sex in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. When there are no such pairs of people, the set of marriages is deemed stable.

This definition of the problem is highly specific: there is only one match, and there have to be equivalent numbers of both sets. In the 1990s, the National Resident Matching Program, which matches medical residents to hospitals with residency programs, commissioned a more general solution to the problem that could take into account hospitals taking on more than one resident, and residents having more than one preference. This yielded a tweak to the problem definition:

There is no applicant A and program P such that both of the following are true: 1) A is unmatched or would prefer to go to P over the program to which A matched, and 2) P has a free slot or would prefer A over one of the other applicants matched to the program.

When I start looking at these problem statements, a similar problem starts to stick out to me: hiring. We could define the hiring problem as:

Given n applicants and m roles at k companies, where each applicant has ranked all companies and jobs at those companies in order of preference, and each company has ranked all applicants for those jobs in order of preference, match applicants, roles, and companies such that there are no applicant/role/company set where no applicant would prefer a different company or job, and no company would rather have that same applicant rather than their current one.

The original stable marriage problem was solved by the Gale-Shapley algorithm, which sought to apply the idea to college admissions. Below are my notes on this original approach.

Problem Definition

A college is considering a set of n applicants of which is can admit a quota of q. We can't just admit the top q applicants, because it's not guaranteed that all q will accept. In order to fill the quota q, the college must offer >q admissions letters, which also produces a challenge of over-filling.

The following factors are also unknown:

if an applicant has applied elsewhere
how the applicant ranks the colleges to which they have applied
which other colleges will admit the applicant

Due to all this uncertainty, colleges can only seek an optimization: that they will admit close to their quota q, and that they can be close to the reasonable optimum quality.

The authors seek to avoid this problem with an algorithmic solution that removes the uncertainties and assigns to each college precisely its quota. (the implication here is that this would be a centralized system)

Assignment Criteria

We need to distribute a set of n applicants across m colleges in a way that hits all the q_i quotas of each college. Each applicant ranks colleges in order of preference, including those the applicant would never accept. Each college similarly ranks applicants in order of their preference, including those it would never accept even if it meant not filling quota.

Our stability problem then looks like:

An assignment of applicants to colleges will be considered unstable if there are two applicants a and b who are assigned to colleges A and B although b prefers A to B and A prefers b to a.

This is deemed "unstable," because in this particular scenario, an applicant could simply transfer to the next college rather than stick around at the current one.

Assuming that we can find a stable arrangement, how should we choose between the various possible stable arrangements if multiples exist? We need to make another declaration:

A stable arrangement is considered optimal if every applicant is at least as well off under it as under any other stable assignment.

Continuing on this thread: this points us to a decision that the stable assignment is indeed unique, because any change would necessarily be suboptimal or instable.

Stable Marriage

	A	B	C
a	(1,3)	(2,2)	(3,1)
b	(3,1)	(1,3)	(2,2)
c	(2,2)	(3,1)	(1,3)

Consider the ranking matrix of 3 men (a, b, c) and 3 women (A, B, C) where the first number in the pair gives the ranking of women by the men, and the second number in the pair gives the ranking of men by the women. Thus a prefers A first, B second, and C third, and A prefers b first, c second, and a third. Pairing these up, there are six possible scenarios, three of which are stable.

Scenario 1: each man gets his first choice (a marries A, b marries B, c marries C). Each woman gets her last choice, but the pairings are stable.
Scenario 2: each woman gets her first choice (a marries C, b marries A, c marries B).
Scenario 3: everyone gets their second choice. (a marries B, b marries C, c marries A).

Any other pairings are unstable.

In order for this to be a viable problem, we need a working theorem: there always exists a stable set of marriages. We need to prove this.

Let each man propose to his preferred future wife. Each woman with more than one proposal rejects all but her favorite, but does not accept the final in case someone better comes along.

Now the rejected men propose to their second choice. Continue on. Eventually, every woman will have a proposal. As soon as this last woman gets her proposal, our process is over, and each woman is required to accept the man still remaining on a string.

The authors assert that this set of marriages is stable. Suppose j and k are not married to each other but j prefers to k to his current choice m. Then j must have proposed to k at some stage and was rejected in favor of someone k preferred. Thus k prefers her own husband to j, and there is no instability.

Extrapolating to the College Admissions Problem

Students apply to the college of their first choice. A college with a quota of q places on its waiting list the q who rank highest and reject the rest. Rejected applicants then apply to their second choice, colleges select the top q from the new applicants and the existing waiting list, and reject the rest. Process repeats until every applicant is either on a waiting list or has been rejected by every college. At this point, each college admits everyone on its waiting list and the stable assignment has been achieved.

We now need to show that this procedure is optimal, which leads us to a new theorem: every applicant is at least as well off under the assignment given by the deferred acceptance procedure as they would be under any other stable assignment.

Call a college "possible" for a particular applicant if there is a stable assignment that sends them there. Now assume that up to a given point in the procedure no applicant has been turned away by a possible college. At this point suppose that college A having received applications from a full quota of better applicants, rejects applicant a. We know that each student b preferred college A to any others that they applied to.

Now consider a hypothetical assignment that sends a to A* and everyone else to other possible colleges. At least one of the bs will go to a less desirable place than a. This arrangement is unstable, since that b and A could upset the arrangement by rejecting their current choices.

The conclusion is that the procedure only rejects applicants that don't work in a stable assignment - therefore what results is optimal.

Closing Remarks

The authors note that this is a mathematical exercise that's divorced from reality, which is why I think it's so interesting to consider in a hiring context. In order for this matching process to work optimally, you have to change process dramatically, and the algorithm still contains polynomial complexity after that. That says nothing about factoring in different dimensions (for example, an applicant might prefer a certain company, but not a certain job at a certain company).