Training Algorithms To Make Fair Decisions Using Private Data

by Julia Cohen

Published on February 3rd, 2023

Last updated on February 7th, 2023

“Artificial intelligence systems make decisions based on the data they observe,” said Shen Yan, a recently graduated Ph.D. student at USC Viterbi’s Information Sciences Institute (ISI) and co-author of FairFed: Enabling Group Fairness in Federated Learning, which will be presented at the 37th AAAI Conference on Artificial Intelligence, held in Washington, D.C. on Feb. 7-14, 2023.  

Decisions based on biased data can be biased; this is true whether it's a human or an artificial intelligence system making the decisions. The good news is that debiasing the information source can help mitigate the bias of machine learning (ML) algorithms. The bad news? That source data is not always available, as in federated learning, an ML technique used to train algorithms across multiple decentralized datasets without exchanging local data samples.

Federated Learning: Learning Without Seeing the Data

Because it does not require direct access to data, federated learning maintains privacy, making it a great solution for sensitive data (think: records at financial institutions or hospitals). 

Yan said, “Federated learning has been viewed as a promising solution for collaboratively training machine learning models among multiple parties while maintaining their local data privacy. However, federated learning also poses new challenges in mitigating the potential bias against certain populations (e.g., demographic groups), as this typically requires centralized access to the sensitive information (e.g., race, gender) of each datapoint.” 
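The core loop of federated learning is simple to sketch: each party trains on its own data and shares only model updates with a central server, which averages them. Below is a minimal, illustrative sketch of federated averaging in plain Python/NumPy; the logistic-regression model, function names, and training details are assumptions chosen for illustration, not the setup used in the paper.

```python
import numpy as np

def local_update(weights, features, labels, lr=0.1, epochs=5):
    """One client's local training (logistic regression via gradient descent).
    Raw features and labels never leave the client; only updated weights do."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1 / (1 + np.exp(-features @ w))
        grad = features.T @ (preds - labels) / len(labels)
        w -= lr * grad
    return w

def federated_averaging(global_w, clients, rounds=10):
    """Server loop: send global weights out, average the returned updates.
    Each client is a (features, labels) pair held locally and never uploaded."""
    for _ in range(rounds):
        local_ws = [local_update(global_w, X, y) for X, y in clients]
        sizes = np.array([len(y) for _, y in clients], dtype=float)
        props = sizes / sizes.sum()  # standard FedAvg: weight clients by dataset size
        global_w = sum(w_i * p for w_i, p in zip(local_ws, props))
    return global_w
```

In standard federated averaging, each client's contribution is weighted only by how much data it holds; FairFed's idea, described below, is to adjust those weights using fairness information as well.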

With the rise of both federated learning and the use of ML to guide critical decisions that involve private data, such as parole, job offers and medical treatments, it has become more important than ever to ensure fairness. “Machine learning can affect a person’s life or future,” said Yahya Ezzeldin, a postdoctoral researcher at USC Viterbi’s Ming Hsieh Department of Electrical and Computer Engineering and co-author of the paper.

Motivated by the importance and challenges of group fairness in federated learning, Yan, Ezzeldin, and their co-authors developed FairFed, a novel algorithm to enhance group fairness in federated learning. 

Sounds Great! How Does FairFed Do It?

Take the example of banks using machine learning to evaluate customers for loans:

First, each individual entity (in this example, a bank) performs local debiasing on its own dataset; that is, it debiases its algorithm using local population data. Each entity then calculates a local fairness metric, a measure of how fair its algorithm is with regard to its local population.
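As an illustration of what such a local metric can look like, here is a simple group-fairness measure, the statistical parity difference (the gap in positive-decision rates between two demographic groups), computed entirely on one bank's local data. The metric choice and names are assumptions for illustration; the paper's formulation may use other group-fairness metrics such as equal opportunity.

```python
import numpy as np

def statistical_parity_difference(predictions, group):
    """Local group-fairness metric: difference in positive-decision rates
    (e.g., loan approvals) between two demographic groups.
    `predictions` are 0/1 model decisions; `group` is a 0/1 sensitive attribute."""
    rate_g1 = predictions[group == 1].mean()
    rate_g0 = predictions[group == 0].mean()
    return rate_g1 - rate_g0  # 0 means equal approval rates for both groups

# Example: a bank evaluates its locally debiased model on its own customers
preds = np.array([1, 0, 1, 1, 0, 1])
grp   = np.array([1, 1, 0, 0, 0, 1])
local_fairness = statistical_parity_difference(preds, grp)
```

Crucially, this number can be computed without the sensitive attributes ever leaving the bank.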

Then, to amplify this local debiasing performance, the entities (individual banks) evaluate the fairness of the global model on their local datasets and collaborate with the server to adjust its model aggregation weights. 

The aggregation weights are a function of the mismatch between the global fairness metric (on the full dataset, which includes the data from all the banks in the global model) and the local fairness metric at each client, favoring clients whose local measures match the global fairness measure.  

Ezzeldin explained, “If a bank’s fairness metric is close to the global view, then that bank has a higher weight when we’re taking an average, because the debiasing that they did is actually in line with the global view.”
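A rough sketch of that weighting idea, continuing the local-metric example above: clients whose local fairness metric is far from the global one are down-weighted in the average. The exponential form and the `beta` parameter here are assumptions for illustration, not FairFed's exact update rule.

```python
import numpy as np

def fairness_aware_weights(local_metrics, global_metric, beta=1.0, sizes=None):
    """Down-weight clients whose local fairness metric deviates from the global one.
    `beta` controls how aggressively mismatches are penalized."""
    local_metrics = np.asarray(local_metrics, dtype=float)
    gaps = np.abs(local_metrics - global_metric)
    base = np.ones_like(local_metrics) if sizes is None else np.asarray(sizes, dtype=float)
    raw = base * np.exp(-beta * gaps)  # smaller gap -> larger aggregation weight
    return raw / raw.sum()

# Example: three banks, global fairness gap of 0.05
w = fairness_aware_weights([0.04, 0.20, 0.06], global_metric=0.05)
# The second bank's weight shrinks because its local metric mismatches the global view.
```

In the federated-averaging sketch shown earlier, these fairness-aware weights would replace the dataset-size weights when the server averages the clients' model updates.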

Did It Work?

The research team is satisfied with the results. “FairFed provides an efficient and effective approach to improve federated learning systems,” said Yan. 

Real-life decision-making scenarios involve “heterogeneous data,” data with a high variability of types and formats. Because of this, the team evaluated FairFed using heterogeneous data.

They found that FairFed outperformed state-of-the-art fair federated learning frameworks under high data heterogeneity, delivering fairer outcomes across different demographic groups.

Yan said, “In our research, we found a way to debias the federated learning system during the aggregation stage. It ensures that the system makes fair decisions without accessing an individual’s data.”

The research and findings will be presented at the upcoming 37th AAAI Conference on Artificial Intelligence. Run by the largest professional organization in the field, the AAAI conference aims to promote research in artificial intelligence and scientific exchange among AI researchers, practitioners, scientists, and engineers in affiliated disciplines. This year, the conference had an acceptance rate of 19.6%. 
