Research summary: Learning to Diversify from Human Judgments - Research Directions and Open Challenges

Mini summary (scroll down for full summary):

Current algorithmic techniques frame diversity in terms of the presence of sensitive attributes in the result set, using that presence as a measure of whether representation is sufficient. Yet such an approach often strips these sensitive attributes, typically gender and race, of their deep social, cultural, and context-specific meanings and buckets them into discrete categories that are rigid, uni-dimensional, and determined algorithmically in the process of clustering.

The paper (by Denton et al.) presents a research direction that uses determinantal point processes (DPPs) as a mechanism for capturing diversity in a more subjective and individualized manner, drawing on individuals' own sense of whether they are well represented in the result set. In an embedding space, items that an individual feels represent them well are pulled closer together and pushed away from items that don't. Relying on individuals' perceptions to tailor these representations moves applications a step closer to capturing representation adequately. The authors do identify challenges with this approach, namely sourcing this information reliably at scale, especially given the limitations of how crowdsourcing platforms are structured today, but the paper still gives the research community some food for thought on how to capture diversity better.

Full summary:

Ranking and retrieval systems that present content to consumers are geared towards enhancing user satisfaction as defined by the platform companies, which usually entails some form of profit-maximization motive. In the process, they end up reflecting and reinforcing societal biases, disproportionately harming the already marginalized.

In fairness techniques applied today, the focus is on outcome distributions in the result set, while the categorical structures themselves, and the process of associating values with the categories, are usually de-centered. Instead, the authors advocate for a framework that does away with rigid, discrete, and ascribed categories and looks at subjective ones derived from a large pool of diverse individuals. Focusing on visual media, this work aims to open up the problem of underrepresentation of various groups in such media, which can harm those groups by deepening social inequities and oppressive world views. Given that much of the content people interact with online is governed by automated algorithmic systems, these systems end up significantly influencing people's cultural identities.

While there are some efforts to apply the notion of diversity to ranking and retrieval systems, they usually approach it from an algorithmic perspective and strip it of its deep cultural and contextual social meanings, instead referencing arbitrary heterogeneity. Demographic parity and equalized odds are some examples of this approach, which applies notions from social choice to score the diversity of data. Yet increasing diversity, say along gender lines, runs into the challenge of getting the question of representation right, especially when gender and race are reduced to discrete categories that are one-dimensional and ascribed by third parties or algorithms.

The authors instead propose sourcing this information from the individuals themselves, so that they have the flexibility to determine whether they feel sufficiently represented in the result set. This contrasts with prior approaches, which have focused on the degree to which sensitive attributes are present in the result set. From an algorithmic perspective, the authors advocate the use of a determinantal point process (DPP), a technique that assigns higher probability to sets with greater spread under a predefined distance metric.
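To make the idea of "higher probability for sets with higher spread" concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the common L-ensemble form of a DPP, in which the probability of a subset is proportional to the determinant of the corresponding submatrix of a similarity kernel. The RBF kernel, the `bandwidth` parameter, the function names, and the toy 2-D points are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(points, bandwidth=1.0):
    """Similarity kernel L[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).

    Assumption: an RBF kernel stands in for the 'predefined distance metric'.
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def dpp_unnormalized_score(kernel, subset):
    """Return det(L_S), which is proportional to the probability that an
    L-ensemble DPP assigns to the subset S."""
    idx = np.asarray(subset)
    return float(np.linalg.det(kernel[np.ix_(idx, idx)]))

# Toy embedding: items 0-2 sit close together, items 3 and 4 are far away.
points = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.1],
    [3.0, 0.0],
    [0.0, 3.0],
])
L = rbf_kernel(points)

clustered_score = dpp_unnormalized_score(L, [0, 1, 2])  # near-duplicate items
spread_score = dpp_unnormalized_score(L, [0, 3, 4])     # well-separated items

print(f"clustered set score: {clustered_score:.6f}")
print(f"spread set score:    {spread_score:.6f}")
```

Because near-duplicate items produce nearly identical rows in the kernel submatrix, its determinant shrinks towards zero, so the spread-out set receives the higher score.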

In this approach, items that an individual feels represent them well are placed closer together in the embedding space, while items they feel don't represent them well are moved away from those that do. Optimizing a triplet loss helps achieve this separation.
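The triplet objective mentioned above can be sketched as follows. This is a hedged illustration of one common form of the triplet loss, not the paper's training code. The mapping of rater judgments onto an anchor, a positive (an item the rater feels represents them), and a negative (an item they feel does not) is an assumption made for illustration, as are the `margin` value and the toy vectors.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

anchor = np.array([0.0, 0.0])

# Desired arrangement: the positive is near, the negative is far.
satisfied = triplet_loss(anchor, np.array([0.5, 0.0]), np.array([3.0, 0.0]))

# Undesired arrangement: the negative is closer than the positive.
violated = triplet_loss(anchor, np.array([2.0, 0.0]), np.array([1.0, 0.0]))

print(satisfied, violated)
```

Minimizing this quantity over many such triplets pushes the embedding towards the arrangement described above: a zero loss means the separation is already satisfied, and a positive loss signals a triplet that still needs to move.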

However, the proposed framework still leaves open the question of how to reliably source these ratings from individuals about what does and doesn't represent them well, and how to then encode them in a form that an algorithmic system can learn from.

Large-scale crowdsourcing platforms are the norm for seeking such ratings in the machine learning world, but their current structure precludes raters' identities and perceptions from consideration, which makes it particularly challenging to specify the rater pool. Nonetheless, the presented framework provides an interesting research direction for achieving more representation and inclusion in the algorithmic systems that we build.

Original piece by Denton et al.: https://drive.google.com/file/d/1lPynepBWoldRH6TS_a2UgOLu3y_QBDWs/view