When making a decision such as whether to hire, insure or lend to someone, is more data better? Actually, when it comes to fairness, the opposite is often true.
Consider a recent Harvard Business Review experiment, which involved sending 316 fake applications to the largest U.S. law firms. All the applicants were among the top 1 percent of students at their schools, but other information — such as their names, college clubs and hobbies — provided hints about their gender and social class.
The result: Upper-class males were four times as likely to get a callback as other candidates, including upper-class women. This suggests that even among equally qualified candidates, the added information gave potential employers something not to like, such as a lower-class background, a bad “cultural fit” or the possibility that a woman might decide to have children and leave the firm.
Proponents of big data tend to believe that such problems can be addressed by handing the decision-making over to an impartial computer. The idea is that with enough information, perhaps ranging from Facebook likes to ZIP codes, an algorithm should be able to choose the objectively best candidates.
Yet algorithms can be as flawed as the humans they replace, and the more data they use, the more opportunities arise for those flaws to emerge. Most of these algorithms essentially assign points to a candidate depending on the presence of certain attributes that are correlated with success, with no consideration for the nature and nuances of those correlations.
One issue is that the algorithms tend to use linear models, so they assume that more is always better, and way more is way better. This can be fine when dealing with attributes such as education or experience. Something like Facebook activity, by contrast, could have a golden mean — a reasonable amount might suggest engagement in a community, while an abundance could indicate addiction.
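To make the contrast concrete, here is a minimal sketch of the two ways a scoring rule could treat something like weekly hours of social media activity. Everything in it is an illustrative assumption: the function names, the weight, the "ideal" of five hours and the width of the curve are invented for this example and are not taken from any real hiring system.

```python
def linear_score(hours_per_week, weight=1.0):
    """Linear rule: every extra hour adds the same number of points."""
    return weight * hours_per_week


def golden_mean_score(hours_per_week, ideal=5.0, width=5.0):
    """Peaked rule: the score is highest at `ideal` and falls off on either side.

    `ideal` and `width` are hypothetical values chosen only for illustration.
    """
    distance = (hours_per_week - ideal) / width
    return max(0.0, 1.0 - distance ** 2)


if __name__ == "__main__":
    for hours in (0, 5, 40):
        print(hours, linear_score(hours), golden_mean_score(hours))
```

Under the linear rule, the candidate logging 40 hours a week outscores everyone else. Under the peaked rule, that same candidate scores no better than someone with no activity at all.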
More important, such algorithms will tend to discriminate against attributes that, though beyond people’s control, have historically been correlated with a lack of success. A marker of poverty or race, for example, can translate into a demerit, even if the person is eminently qualified — thus reinforcing the historical pattern that the algorithm finds in the data.
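A toy example shows how this can happen. The sketch below derives points for each attribute from a small, entirely made-up hiring history, then scores two equally qualified candidates. The data, the attribute names and the point formula (the gap in past hire rates with and without the attribute) are assumptions made for illustration, not a description of any actual system.

```python
# Invented history for illustration only: (top_grades, low_income_zip, was_hired)
HISTORY = [
    (1, 0, 1), (1, 0, 1), (0, 0, 1), (0, 0, 0),
    (1, 1, 0), (1, 1, 1), (0, 1, 0), (0, 1, 0),
]


def hire_rate(rows):
    """Fraction of the given past applicants who were hired."""
    return sum(row[2] for row in rows) / len(rows)


def points_for(index):
    """Points for an attribute: past hire rate with it minus past hire rate without it."""
    with_attr = [row for row in HISTORY if row[index] == 1]
    without_attr = [row for row in HISTORY if row[index] == 0]
    return hire_rate(with_attr) - hire_rate(without_attr)


POINTS = {
    "top_grades": points_for(0),
    "low_income_zip": points_for(1),
}


def score(top_grades, low_income_zip):
    """Add up the points for whichever attributes the candidate has."""
    return (POINTS["top_grades"] * top_grades
            + POINTS["low_income_zip"] * low_income_zip)


if __name__ == "__main__":
    print(POINTS)
    print(score(1, 0), score(1, 1))
```

Both candidates have top grades, yet the one from the low-income ZIP code ends up with the lower score, simply because people from that ZIP code were hired less often in the past.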
In short, handing decisions over to machine learning algorithms trained on historical data isn’t likely to improve on prejudiced humans. And more complex models that could capture such nuances, such as neural networks, require far more data, which is why they tend to be reserved for things like self-driving cars.
The Harvard Business Review study concludes that using less data would be better: Remove the information on gender and clubs altogether, and focus on performance in law school. It’s a lesson that should be valuable for big-data modelers, too.
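For modelers, the simplest version of "less data" is to drop everything except performance measures before a person or a model ever sees the application. The sketch below assumes an application arrives as a dictionary; the field names and the allow-list are hypothetical placeholders.

```python
# Hypothetical allow-list: only fields that describe law school performance.
ALLOWED_FIELDS = {"class_rank_percentile", "law_review"}


def redact(application):
    """Return a copy of the application containing only the allowed fields."""
    return {key: value for key, value in application.items()
            if key in ALLOWED_FIELDS}


if __name__ == "__main__":
    # Placeholder application, invented for this example.
    application = {
        "name": "Applicant A",
        "clubs": ["sailing team"],
        "class_rank_percentile": 99,
        "law_review": True,
    }
    print(redact(application))
```

An allow-list is used here rather than a block-list so that any new field is excluded by default until someone decides it belongs.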