The "AI for Equality" Datathon and Webinar Series
In October 2024, HerWILL hosted the "AI for Equality" Datathon and Webinar Series, an initiative designed to empower women and marginalized communities in Artificial Intelligence (AI) and Data Science.
With over 900 registrants from countries including Syria, Egypt, Bangladesh, the United States, Pakistan, and Lebanon, the event aimed to promote equity in technology by addressing gender bias in AI systems. Participants engaged in a week-long competition following a preparatory webinar series led by experts in the field.
Example of event advertising seen across social media
In October 2024, I was invited to organize and moderate the "AI for Equality" Datathon and Webinar Series, an initiative designed to empower women and marginalized communities in Artificial Intelligence (AI) and Data Science. With over 900 registrants from 20 countries and 171 universities, the event aimed to address systemic gender bias in technology while equipping participants with critical skills in data science, coding, and machine learning. Now in its fourth year, this growing global event has consistently expanded its reach and impact, becoming a cornerstone initiative for HerWILL’s mission to foster inclusivity in technology.
My responsibilities included working closely with the HerWILL team to design the event structure, coordinate the technical aspects, and ensure the sessions aligned with the mission of promoting inclusivity. I moderated all four webinars, managing discussions and facilitating interactive Q&A sessions to enhance participant engagement. In preparation for each session, I met with speakers to refine their content, tailoring it to the diverse audience's skill levels and ensuring it was accessible and actionable. For the Datathon, I collaborated with HerWILL and Harvard researchers to shape the dataset and challenge scope, prepared guidance documents, and supported the evaluation process for participant submissions.
In collaboration with AI4OPT - AI Institute for Advances in Optimization and Harvard researchers, HerWILL designed this year’s Datathon with an emphasis on real-world challenges and cutting-edge technology. The initiative reflected a unique partnership between academic institutions, nonprofits, and industry leaders to ensure participants gained both technical expertise and actionable insights for their careers. This year marked a significant milestone with the creation of a custom dataset of over 100,000 lines of code, developed in-house by Harvard researchers, making this Datathon a standout event in the global data science community.
What is a Datathon?
A Datathon is a structured event where individuals or teams work together to solve real-world problems using data. Unlike hackathons, which address a broad range of technical challenges, Datathons are specifically focused on the analysis, management, and processing of data to extract insights and develop solutions. These solutions often take the form of predictive models, data visualizations, or applications designed to address a defined challenge.
Participants in a Datathon are tasked with working through large, often complex datasets to identify patterns, trends, or anomalies. The process involves data cleaning, integration, and analysis, requiring a blend of skills in programming, mathematics, statistics, data visualization, and communication. Success in a Datathon isn’t just about technical skills—it’s about being able to tell a story with the data, showcasing why the solution matters and how it addresses the problem.
The datasets used in Datathons vary widely in scope and origin, often sourced from corporate databases, government platforms, non-profits, or academic institutions. For example, challenges can range from improving fraud detection systems, optimizing pricing models, or enhancing recommendation algorithms to solving social equity issues. By offering a practical, hands-on experience, Datathons allow participants to apply their knowledge to real-world problems, collaborate with others, and showcase their abilities to peers, mentors, and industry leaders.
The "AI for Equality" Datathon
The HerWILL "AI for Equality" Datathon, held from October 21 to 27, 2024, was a week-long event that challenged participants to use their skills in AI and data science to tackle the pressing issue of gender bias in software.
This event encouraged participants to push the boundaries of Machine Learning (ML) and Deep Learning (DL) to address gender bias in text. Participants were challenged to develop models capable of accurately and efficiently identifying gender bias directed toward males, females, and non-binary individuals. Beyond just technical problem-solving, the Datathon aimed to spark meaningful conversations within the Social Sciences, Gender Studies, and Data Science communities.
Evaluation Criteria
To ensure a comprehensive evaluation, submissions were assessed by a panel of esteemed judges from Harvard University, UC Berkeley, Purdue University, BRAC University, and Georgia Tech. The judges considered the following factors:
Accuracy: How well the models detected and classified gender biases in the dataset.
Innovation: The creativity and originality of the solutions proposed.
Feasibility: The practicality and potential for real-world application of the mitigation strategies.
Presentation: The clarity and effectiveness with which participants communicated their approach, findings, and solutions.
Submissions were evaluated using a custom metric, F1NOP, that balanced model accuracy (via F1 score) with efficiency (via parameter count). This approach incentivized participants to create models that were not only effective but also computationally efficient.
The formula for F1NOP:
F1NOP = (0.4 × F1 Score) + (0.6 × 1/log(max(1, NOP)) + ε)
F1 Score: Assessed how well the model classified gender bias.
NOP (Number of Parameters): Encouraged participants to optimize their models for inference efficiency.
Presentation and Documentation: Teams could earn bonus points for submitting training loss plots, model comparisons, and well-documented code.
Participants followed strict guidelines for calculating NOP using the thop Python package, ensuring consistency across submissions. A sample script provided in the competition overview helped standardize the methodology.
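As a rough illustration, the F1NOP formula above can be sketched in Python. This is not the official competition script. The function name f1nop, the default value of ε, the use of the natural logarithm, and the placement of ε inside the denominator (so that a one-parameter model, where log(1) = 0, does not cause a division by zero) are all assumptions made for this sketch.

```python
import math


def f1nop(f1_score: float, nop: int, eps: float = 1e-8) -> float:
    """Hypothetical F1NOP helper: weighted sum of F1 and an efficiency term.

    Assumptions (not confirmed by the competition rules): natural log,
    and eps added to the logarithm in the denominator.
    """
    if not 0.0 <= f1_score <= 1.0:
        raise ValueError("f1_score must be between 0 and 1")
    # The efficiency term shrinks as the parameter count grows.
    efficiency = 1.0 / (math.log(max(1, nop)) + eps)
    return 0.4 * f1_score + 0.6 * efficiency


# Two hypothetical models with the same F1 but different sizes.
small_model = f1nop(f1_score=0.90, nop=1_000_000)
large_model = f1nop(f1_score=0.90, nop=100_000_000)
print(f"small: {small_model:.4f}, large: {large_model:.4f}")
```

Under these assumptions, the smaller model scores slightly higher at equal F1, which is the incentive the metric was meant to create.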
Judging
Example of what a webinar can look like from a participant's perspective. As a moderator, I begin and end each session while managing discussions and facilitating Q&A sessions.
The Webinar Series
The webinar series leading up to the Datathon was designed to prepare participants with the foundational skills necessary to successfully tackle the challenge. The sessions were tailored to teach critical skills in data science, coding, and AI that would not only benefit those participating in the Datathon but also offer practical knowledge applicable to academic and professional settings. Whether attendees were beginners or experienced professionals, the goal was to provide accessible, high-quality education that could be applied beyond the event itself.
Each class focused on an important aspect of data science and problem-solving, equipping participants with tools and techniques they would need to analyze the dataset, develop solutions, and present their findings. Taught by academic leaders and experts, the series ensured that every attendee, regardless of skill level, left with valuable insights. This inclusive approach ensured that even those not planning to compete in the Datathon would walk away with new skills to enhance their work or studies.
Webinar Schedule and Speakers
Python Essentials Unleashed
Date: October 11, 2024
Speaker:
Dr. Rabab Haider: Assistant Professor in Civil and Environmental Engineering at the University of Michigan. Dr. Haider introduced Python programming to over 320 participants, ensuring accessibility for all skill levels.
AI Basics and Advanced Text Processing
Date: October 14, 2024
Speakers:
Dr. Reza Zandehshahvar: Research Engineer II at Georgia Tech’s AI4OPT and H. Milton Stewart School of Industrial and Systems Engineering.
Thomas Bruys: Graduate Research Assistant in Electrical and Computer Engineering at Georgia Tech. This workshop explored machine learning models and text processing techniques, providing practical insights into the application of AI for text-based challenges.
Understanding Large Language Models (LLMs)
Date: October 18, 2024
Speaker:
Dr. Jin Xu: Research Faculty member in Artificial Intelligence at Georgia Tech. Dr. Xu explained the mechanics and applications of LLMs, focusing on their transformative potential across industries like healthcare and education.
Datathon Countdown: Strategic Planning
Date: October 19, 2024
Speaker:
Noor Mairukh Khan Arnob: Project Manager at HerWILL. Arnob provided participants with strategies for effective teamwork, time management, and presentation, equipping them for success in the Datathon.
The Challenge
Participants were tasked with using Machine Learning and Deep Learning techniques to classify passages based on their gender bias. Specifically, the competition required teams to:
Analyze the Dataset
Teams worked with a custom dataset, designed in collaboration with Harvard researchers, that included real-world programming scenarios and textual data. This dataset showcased examples of implicit and explicit biases in language, providing participants with an opportunity to explore how gender bias manifests in various contexts.
Develop Robust Models
The primary objective was to create models that could classify gender bias in text with high accuracy while remaining efficient. Teams were asked to predict the gender group (males, females, or non-binary individuals) that a passage was biased against. The competition also emphasized model efficiency by requiring participants to report the Number of Parameters (NOP) their models used during inference.
Propose Mitigation Strategies
In addition to detection, teams were encouraged to think about how these insights could be used to reduce biases in textual data. The competition aimed to foster creative, actionable solutions that could lead to fairer and more inclusive technologies.
Impact and Key Takeaways
The HerWILL "AI for Equality" Datathon was a powerful platform designed to provide women and individuals from marginalized communities with opportunities to engage in cutting-edge technology and tackle systemic issues in AI. With participants from 20 countries and 171 universities, the event offered a rare chance for individuals—particularly women from regions with limited access to STEM opportunities—to showcase their talents and work on solutions that address real-world challenges.
This event was especially significant for participants from underserved communities, where opportunities for women in technology are scarce. By focusing on gender bias in AI, the Datathon gave women the platform to demonstrate their skills in a space where their voices often go unheard. The success of Jawa Habib, a Syrian woman who won on the merit of her work, was a testament to the importance of creating equitable opportunities. Her achievement, unanimously recognized by an esteemed panel of judges, showed the power of talent and determination when given the chance to shine.
The Datathon also pushed participants to advance Machine Learning and Deep Learning practices by developing models that balance accuracy and efficiency. By addressing gender bias in text, the competition sparked conversations about inclusivity in technology and how AI can be used as a tool to foster equity.
Beyond technical skills, this event created a space for collaboration and global dialogue. Participants came together from diverse backgrounds to learn, innovate, and connect with mentors and peers, gaining not only recognition but also confidence in their abilities. For many, it was a rare and transformative experience that inspired them to pursue careers and leadership roles in STEM.
The HerWILL "AI for Equality" Datathon was a step toward making technology more inclusive by prioritizing marginalized voices and tackling systemic challenges. It showed that when opportunities are made accessible, innovation thrives, and meaningful change becomes possible.
This is our esteemed panel of judges…
Dr. Kevin Dalmeijer
Senior Research Associate
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech.
Dr. Reem F. Khir
Assistant Professor
Edwardson School of Industrial Engineering, Purdue University
Dr. Farig Yusuf Sadeque
Associate Professor
Computer Science and Engineering, BRAC University.
Dr. Paul Grigas
Associate Professor
Department of Industrial Engineering & Operations Research, UC Berkeley.
Dr. Intekhab Hossain
Senior Data Scientist
Analysis Group.
Conclusion
Through my work on the HerWILL "AI for Equality" Datathon, I had the privilege of contributing to an event that empowered women and individuals from marginalized communities to step into the spotlight and tackle one of the most pressing issues in technology today—bias in AI. From designing the challenge to collaborating with researchers and guiding participants, I helped create a space where diverse voices could be heard, skills could be showcased, and meaningful solutions could emerge. This Datathon wasn’t just about advancing technical expertise; it was about breaking barriers, sparking critical conversations, and demonstrating the power of inclusive opportunities. I’m proud to have played a role in an initiative that inspired so many to see their potential and pursue a future where technology works for everyone.