Article Content:-
Abstract
Generative artificial intelligence (AI), especially large language models (LLMs), is increasingly deployed in domains such as recruitment, content creation, and education. While these systems accelerate productivity, they also risk reproducing and amplifying societal biases (Ahuchogu et al., 2025). This project addresses the urgent challenge of identifying, quantifying, and mitigating gender bias in text-generative AI outputs, with a focus on job narratives. Building on our independent study of more than 11,000 job narratives generated with Gemini AI, we introduce a bias quantification framework that combines mean bias, mean absolute bias, sentiment skew (via TextBlob), and distributional measures (Kullback–Leibler divergence and related distances). Preliminary results show measurable gendered patterns across the generated narratives, supporting the hypothesis that LLM-generated text exhibits gender bias.
The proposed work extends this foundation in three directions: (1) expanding bias quantification using distances between probability distributions (Devisetti, 2024; Chung et al., 1989); (2) evaluating prompt-construction bias and comparing multiple models, including GPT-3, GPT-4, Gemini, and open-source LLMs (Blodgett et al., 2020); and (3) integrating interpretable embedding methods such as SPINE (Subramanian et al., 2017) to make downstream debiasing more transparent.
The expected contribution is both theoretical and practical: a robust bias quantification pipeline grounded in probability theory, and actionable strategies to mitigate bias in LLM-generated recruitment texts (Ferrara, 2024). Beyond recruitment, the proposed methodology is intended to serve as a standard for bias evaluation in generative AI applications more broadly.
A key part of this research is the creation of large datasets containing job narratives. These datasets not only help analyze bias in AI-generated content but also support other Natural Language Processing (NLP) tasks.
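As a rough illustration of the metrics named in the abstract, the following minimal Python sketch computes mean bias, mean absolute bias, and a Kullback–Leibler divergence between two word distributions. The abstract does not define these quantities precisely, so everything here is an assumption made for illustration: each narrative is assumed to carry a signed bias score in [-1, 1] (positive meaning male-leaning, negative meaning female-leaning), and the function names, the smoothing constant `alpha`, and the toy data are hypothetical rather than taken from the study.

```python
import math
from collections import Counter


def mean_bias(scores):
    """Average signed bias score; opposite-signed scores can cancel out."""
    return sum(scores) / len(scores)


def mean_absolute_bias(scores):
    """Average magnitude of bias, ignoring direction."""
    return sum(abs(s) for s in scores) / len(scores)


def smoothed_distribution(tokens, vocabulary, alpha=1.0):
    """Word distribution over a fixed vocabulary with add-alpha smoothing.

    Smoothing keeps every probability positive so that the KL divergence
    below stays finite. Tokens outside the vocabulary are ignored.
    """
    counts = Counter(t for t in tokens if t in set(vocabulary))
    total = sum(counts.values()) + alpha * len(vocabulary)
    return [(counts[w] + alpha) / total for w in vocabulary]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical per-narrative scores (assumed convention: sign gives direction).
scores = [0.5, -0.5, 0.25, -0.25]
avg = mean_bias(scores)
avg_abs = mean_absolute_bias(scores)

# Hypothetical descriptor words drawn from two groups of narratives.
vocabulary = ["leader", "caring", "analytical"]
p = smoothed_distribution(["leader", "leader", "analytical"], vocabulary)
q = smoothed_distribution(["caring", "caring", "leader"], vocabulary)
divergence = kl_divergence(p, q)
```

With these toy scores the mean bias is 0 while the mean absolute bias is 0.375, which is one reason to report both: a corpus can look balanced on average while individual narratives are strongly skewed. Note that KL divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.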
References:-
Magnus Chukwuebuka Ahuchogu, Gabriella Folashade Akenn Musa, Eric Howard, and Kashmira Mathur. 2025. AI and bias in recruitment: Ensuring fairness in algorithmic hiring. Journal of Informatics Education and Research, 5(3). Published July 18, 2025.
Abeba Birhane, Vinay Uday Prabhu, and Emmanuel Kahembwe. 2021. Multimodal datasets: misogyny, pornography, and malignant stereotypes. Preprint, arXiv:2110.01963.
Su Lin Blodgett, Solon Barocas, Hal Daumé III, and Hanna Wallach. 2020. Language (technology) is power: A critical survey of “bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5454–5476, Online. Association for Computational Linguistics.
Conrad Borchers, Dalia Gala, Benjamin Gilburt, Eduard Oravkin, Wilfried Bounsi, Yuki M Asano, and Hannah Kirk. 2022. Looking for a handsome carpenter! debiasing GPT-3 job advertisements. In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 212–224, Seattle, Washington. Association for Computational Linguistics.
J. K. Chung, P. L. Kannappan, C. T. Ng, and P. K. Sahoo. 1989. Measures of distance between probability distributions. Journal of Mathematical Analysis and Applications, 138(1):280–292.
Google DeepMind. 2024. Gemini 1.5 Flash: Lightweight, efficient, and scalable large language model. Available at: https://deepmind.google/technologies/gemini.
S. A. Devisetti. 2024. Bias in text generative open AI. Indian Journal of Artificial Intelligence and Neural Networking, 4(2):8–10.
Malika Dikshit, Houda Bouamor, and Nizar Habash. 2024. Investigating gender bias in STEM job advertisements. In Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 179–189, Bangkok, Thailand. Association for Computational Linguistics.
Emilio Ferrara. 2024. Fairness and bias in artificial intelligence: A brief survey of sources, impacts, and mitigation strategies. Sci, 6(1).
Solomon Kullback and R. A. Leibler. 1951. On information and sufficiency. Annals of Mathematical Statistics, 22:79–86.
Sihang Li, Kuangzheng Li, and Haibing Lu. 2023. National origin discrimination in deep-learning-powered automated resume screening. Preprint, arXiv:2307.08624.
Dena F. Mujtaba and Nihar R. Mahapatra. 2024. Fairness in AI-driven recruitment: Challenges, metrics, methods, and future directions. Preprint, arXiv:2405.19699.
Gideon Popoola, Khadijat-Kuburat Abdullah, Gerard Shu Fuhnwi, and Janet Agbaje. 2024. Sentiment analysis of financial news data using TF-IDF and machine learning algorithms. pages 1–6.
Ravender Singh Rana. 2023. Job dataset.
Anant Subramanian, Danish Pruthi, Harsh Jhamtani, Taylor Berg-Kirkpatrick, and Eduard Hovy. 2017. SPINE: Sparse interpretable neural embeddings. Preprint, arXiv:1711.08792.
Titus von der Malsburg, Till Poppels, and Roger P. Levy. 2020. Implicit gender bias in linguistic descriptions for expected events: The cases of the 2016 United States and 2017 United Kingdom elections. Psychological Science, 31(2):115–128. PMID: 31913768.
Kyra Wilson and Aylin Caliskan. 2024. Gender, race, and intersectional bias in resume screening via language model retrieval. Preprint, arXiv:2407.20371.
Mi Zhou, Vibhanshu Abhishek, Timothy Derdenger, Jaymo Kim, and Kannan Srinivasan. 2024. Bias in generative AI. Preprint, arXiv:2403.02726.