FaiRIR: Mitigating Exposure Bias from Related Item Recommendations in Two-Sided Platforms

🔬 Research Summary by Abhisek Dash, a PhD student (TCS Research Fellow) at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur.

[Original paper by Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi]

Overview: Related Item Recommendation (RIR) algorithms are woven into the fabric of online platforms and have far-reaching effects on different stakeholders. While customers rely on them to explore products and services that appeal to them, producers depend on them for their livelihood. Such far-reaching effects call for these systems to be fair to all of these stakeholders. Although fairness in personalized recommendations has been well discussed in the community, fairness in RIRs has been overlooked. To this end, in the current work, we propose a novel suite of algorithms that aims to ensure fairness in related item recommendations while preserving the underlying relatedness of the recommended items.

Introduction

Related Item Recommendations recommend items in the context of an item viewed or consumed by the user. Popular examples include the “Customers who viewed this item also viewed” recommendations on Amazon and the “More like this” recommendations on Netflix. With many businesses and individuals depending on online platforms to earn their livelihood, the fairness of these related item recommendations has become a matter of focus in recent years. Even on the legislation front, recent regulations in India, as well as in the EU and the USA, mandate that e-commerce sites treat their sellers fairly.

As RIR algorithms recommend new items that are “related” or “similar” to the current item, there may arise situations where an item gets much more (or less) exposure than what it deserves. For example, a poor-quality item may be recommended as related to a popular good-quality item. Hence, the poor-quality item may get much more exposure than it deserves. We term this discrepancy between the observed item exposure (as induced by RIRs) and the desired item exposure (e.g., based on merit or any other criteria) as exposure bias.

In its generic form, the related item recommendation pipeline has three key steps: (a) item representation learning, which learns representations of items so that similar items lie closer together; (b) item similarity computation, which quantifies the similarity between two given items; and (c) related item selection, which selects a set of k related items to recommend. FaiRIR introduces fairness at these three stages by reconciling the underlying relatedness with additional desired exposure considerations. Since the notion of desired exposure is highly contextual, it cannot be tied to a single operational definition. Therefore, we operationalize ‘desired’ exposure in multiple ways, allowing for multiple contextual assumptions.
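The last two steps of this generic pipeline can be illustrated with a minimal sketch. It assumes the item representations have already been learned; the toy vectors, the helper names `cosine` and `related_items`, and the choice of cosine similarity are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def related_items(item_vectors, k):
    """For each item, return the k most similar other items."""
    related = {}
    for item, vec in item_vectors.items():
        # Step (b): similarity between this item and every other item.
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in item_vectors.items()
            if other != item
        ]
        # Step (c): keep the k highest-scoring items (ties broken by name).
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        related[item] = [other for _, other in scored[:k]]
    return related

# Step (a) is assumed done: these toy embeddings are made up.
item_vectors = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.0, 1.0],
    "D": [0.1, 0.9],
}
recs = related_items(item_vectors, k=1)
```

In this toy setup, items A and B recommend each other, as do C and D, because their vectors point in nearly the same direction.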

Key Insights

In this section, we summarise some of the most interesting observations made during this work. We use two standard related item recommendation approaches, rating-SVD and item2vec, to demonstrate our observations and the utility of our proposals. rating-SVD applies Singular Value Decomposition (SVD) to the user-item rating matrix and uses cosine similarity to evaluate item similarity and generate the related item recommendations. item2vec (an adaptation of word2vec) substitutes items for words and looks for co-occurrence patterns in user consumption to generate recommendations. In other words, the former captures relatedness based on people’s preferences toward different items, while the latter captures relatedness based on consumption in close temporal proximity.
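To make the "items as words" idea concrete, the sketch below generates the (target, context) pairs that a skip-gram style model such as item2vec would be trained on. It stops short of training any embedding; the function name `skipgram_pairs`, the window size, and the toy history are assumptions for illustration only.

```python
def skipgram_pairs(sequence, window):
    """Pair every item with the items consumed within `window` positions of it."""
    pairs = []
    for i, target in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sequence[j]))
    return pairs

# One user's consumption history, ordered by time (made-up item ids).
history = ["m1", "m2", "m3"]
pairs = skipgram_pairs(history, window=1)
```

Items that are consumed close together in time end up in many shared pairs, which is what pulls their learned representations together.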

Traditional RIR algorithms induce exposure bias: The existing RIR algorithms perform very well in finding related items. However, the exposure that different items get is found to be heavily skewed: in one of the datasets, the top 25% of items by exposure account for 75% of the entire exposure. Thus, a small fraction of the item set gets a very high fraction of the entire exposure; put differently, there is very low item-space coverage.
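A skew of this kind can be measured with a few lines of code. The sketch below assumes exposure is simply the number of related-item lists an item appears in, which is a simplification chosen for illustration; the helper names and the toy recommendation lists are hypothetical.

```python
from collections import Counter

def exposure_counts(related, all_items):
    """Count how often each item appears in some other item's related list."""
    counts = Counter({item: 0 for item in all_items})
    for neighbours in related.values():
        counts.update(neighbours)
    return counts

def top_share(counts, fraction):
    """Share of total exposure captured by the top `fraction` of items."""
    ordered = sorted(counts.values(), reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    n_top = max(1, int(len(ordered) * fraction))
    return sum(ordered[:n_top]) / total

# Made-up related-item lists for four items.
related = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A", "B"],
}
counts = exposure_counts(related, all_items=["A", "B", "C", "D"])
share = top_share(counts, fraction=0.25)
```

Here item D is never recommended at all, while the single most exposed item already collects 3 of the 8 recommendation slots.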

Are the top items deserving of exposure?: Based on this skewed exposure distribution, a plausible question can be raised: are these top 25% of items of very high quality? If so, one may argue that such items, being of better quality than others, deserve more exposure, i.e., the gap in quality can explain the skew in the exposure distribution. However, using the average user rating of an item as a proxy for its quality, we found that the quality of items does not necessarily explain the gap in exposure that different items get. Only 6% to 10% of the total items in the different datasets have comparable quality and exposure, with a stark disparity between quality and exposure for most items.
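One way to check whether quality explains exposure is to compare each item's share of exposure with its share of quality. The sketch below is only an illustration: the relative `tolerance` used to call two shares "comparable", the helper names, and the toy numbers are all assumptions rather than the definition used in the paper.

```python
def normalise(values):
    """Turn raw scores into shares that sum to 1."""
    total = sum(values.values())
    if total == 0:
        raise ValueError("scores must not sum to zero")
    return {key: value / total for key, value in values.items()}

def comparable_items(exposure, avg_rating, tolerance):
    """Items whose exposure share is within `tolerance` (relative) of their quality share."""
    observed = normalise(exposure)
    desired = normalise(avg_rating)  # average rating used as a quality proxy
    return [
        item
        for item in exposure
        if abs(observed[item] - desired[item]) <= tolerance * desired[item]
    ]

# Made-up exposure counts and average ratings.
exposure = {"A": 60, "B": 25, "C": 10, "D": 5}
avg_rating = {"A": 4.0, "B": 4.5, "C": 4.5, "D": 3.0}
matched = comparable_items(exposure, avg_rating, tolerance=0.2)
```

In this toy data only item B receives exposure roughly in line with its rating; A is heavily over-exposed, while C and D are under-exposed.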

FaiRIR proposal: To counter exposure bias, we propose FaiRIR, a novel suite of three algorithms applied at different stages of the RIR pipeline that can minimize exposure bias while preserving the underlying relatedness of the recommended items to the best possible extent. The algorithms introduce fairness at (a) the representation learning phase (FaiRIR(rl)), where related items with similar desired exposure are brought closer together in the representation space; (b) the similarity evaluation phase (FaiRIR(sim)), where the similarity evaluation considers not only relatedness-based similarity but also desirability-based similarity; and (c) the neighbor selection phase (FaiRIR(nbr)), where relatedness and desirability are reconciled when selecting neighbors.
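The idea of reconciling relatedness and desired exposure at neighbor selection time can be illustrated with a simple greedy re-ranking. This is a hypothetical sketch and not the FaiRIR(nbr) algorithm from the paper: the scoring rule, the `weight` parameter, and the function name `fair_select` are all assumptions made for this example.

```python
def fair_select(candidates, desired, k, weight):
    """Greedy selection that trades off similarity against exposure deficit.

    candidates: {item: [(neighbour, similarity), ...]}
    desired:    {item: desired share of total exposure}
    weight:     0 means pure relatedness; larger values favour under-exposed items
    """
    given = {item: 0 for item in desired}  # exposure handed out so far
    slots = k * len(candidates)            # total recommendation slots
    result = {}
    for item in sorted(candidates):
        pool = dict(candidates[item])
        chosen = []
        for _ in range(min(k, len(pool))):
            def score(n):
                # How far the neighbour is below its desired share.
                deficit = desired[n] - given[n] / slots
                return (1 - weight) * pool[n] + weight * deficit
            best = max(sorted(pool), key=score)
            chosen.append(best)
            given[best] += 1
            del pool[best]
        result[item] = chosen
    return result

# Made-up candidate neighbours with similarity scores.
candidates = {
    "A": [("B", 0.9), ("C", 0.8)],
    "B": [("A", 0.9), ("C", 0.85)],
    "C": [("B", 0.9), ("D", 0.8)],
    "D": [("B", 0.9), ("C", 0.8)],
}
desired = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

vanilla = fair_select(candidates, desired, k=1, weight=0.0)
balanced = fair_select(candidates, desired, k=1, weight=0.5)
```

With `weight=0.0` item B is recommended three times; with `weight=0.5` each item is recommended exactly once, at the cost of picking slightly less similar neighbors for C and D.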

Their effectiveness in mitigating exposure bias: We demonstrate the utility of the proposed approaches on two benchmark datasets: the MovieLens and Amazon product review datasets. Our extensive analyses show that the three fair interventions on RIR algorithms successfully reduce the exposure bias induced by the vanilla versions of the standard algorithms. While the performance of FaiRIR(rl) and FaiRIR(nbr) is significantly better than that of the vanilla algorithms, the improvement is less stable for FaiRIR(sim). A natural follow-up question is whether this reduction in exposure bias comes at the cost of a significant loss in relatedness or user satisfaction. To this end, we evaluate (a) the preservation of relatedness of the recommendations through genre/category overlap and (b) the utility of the recommendations through a user survey performed on Amazon Mechanical Turk (AMT). Across all evaluations, we observe that our proposed approaches successfully mitigate exposure bias without sacrificing much of the relatedness of the recommendations.
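A genre or category overlap check of this kind can be sketched as follows. The use of Jaccard overlap averaged over all (item, recommended item) pairs is an assumption made for illustration, as are the function name and the toy genres; the paper's exact measure may differ.

```python
def genre_overlap(related, genres):
    """Average Jaccard overlap between an item's genres and its recommendations' genres."""
    scores = []
    for item, neighbours in related.items():
        for neighbour in neighbours:
            a, b = genres[item], genres[neighbour]
            union = a | b
            scores.append(len(a & b) / len(union) if union else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Made-up genres and recommendation lists.
genres = {
    "A": {"Action", "Sci-Fi"},
    "B": {"Action"},
    "C": {"Romance"},
}
related = {"A": ["B"], "B": ["A", "C"]}
overlap = genre_overlap(related, genres)
```

Computing this score for both the vanilla and the fairness-aware recommendations gives a rough sense of how much relatedness is lost by the intervention.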

Between the lines

Traditionally, Information Retrieval (IR) research has been keyed to relevance. Recently, however, several studies have shown the inadvertent consequences of paying too much attention to relevance and user satisfaction. Especially in the post-pandemic world, where people’s dependence on platforms has deepened, IR research must take into account the well-being of different stakeholders. Although some efforts have been made in that direction in recent works, the role of platforms has been overlooked. For example, while exposure bias can emerge due to implicit reasons, as shown in the current work, other works have shown how it may emerge due to the special relationship of platforms with different entities hosted on them. For instance, private label or in-house products and content provide enough monetary incentive for platforms to prefer their own offerings. The methodologies proposed in the current work are initial steps toward removing such implicit and/or explicit exposure bias.
