Insights on Workplace Sexual Harassment in Latin America
Summary
Utilized ELSA's dataset to analyze workplace sexual harassment trends using NLP and hypothesis testing, uncovering high tolerance levels and low reporting rates. Explored training's impact on tolerance and reporting behavior, and identified significant implications for employee well-being and productivity. Provided strategic recommendations for improvement.
Code: GitHub
In collaboration with ELSA
Background
I recently got the opportunity to work on a super impactful project in my people analytics class, using data analytics to address workplace harassment issues, and I wanted to share it.
A bit of backstory first: we got connected to a Peruvian startup called Espacios Laborales Sin Acoso (ELSA) through our professor. ELSA partners with organizations across Latin America to help tackle problems like sexual harassment affecting employees. They have one of the largest datasets documenting workplace sexual harassment in Latin America, covering 26 organizations across Peru, Bolivia, Colombia and Chile.
As part of their data collection, ELSA conducts employee surveys for the organizations they work with, asking crucial questions about employees' experiences with acts of sexual harassment, trainings, their relationship with their company and much more. To get started with this analysis, I defined the following metrics using the survey results provided by ELSA:
- Tolerance score: indicates an individual's tolerance towards acts of sexual harassment. The higher the tolerance score, the more tolerant the individual is, meaning they are less likely to recognize harmful behaviors as problematic.
- Reporting score: indicates whether an incident was reported, or considered for reporting, by an individual who has been a victim of or witness to workplace sexual harassment. It is on a scale of 0-5, where 0 represents an inconsequential action in response to the act and 5 represents the most advisable action.
I was able to dive into an anonymized dataset and let me tell you, the initial trends I spotted raised some major red flags:

1. Very few actions taken - We're talking almost 90% of harassment cases where victims or witnesses did pretty much nothing in response. Only around 2% officially reported it. I've attached a chart showing the full disturbing landscape.
2. Way too much tolerance from management - You'd hope leadership would take this seriously, but median "tolerance scores" (measuring how lax they were about harassment) were over 25%! And it spiked as high as 57% at some companies.

So obviously a lot of room for improvement! Based on the areas needing attention, I focused my deeper analysis on:
- Figuring out why so many cases go unreported. That'll help shape recommendations to get those reporting numbers up. No one should feel barriers to speaking out.
- Evaluating training effectiveness - do mandated programs make any impact? I wanted to see if better or more frequent approaches could shift attitudes.
- Business impact - Harassment obviously takes a huge toll on victims, but how does it affect productivity and performance overall? Quantifying that could really motivate management and leadership to implement stronger policies. No one wants to hurt their own bottom line!
I was super excited to dig into the analysis and brainstorm ways to move the needle through data-driven recommendations.
Analysis & Insights
Analysis 1: Understanding reporting barriers
The survey asked victims of workplace sexual harassment who chose not to report the incident why they made that choice. This was the only free-text question on the survey. I personally find free-text responses to be a gold mine when it comes to understanding the feelings around a topic, so I had to analyze it. I used NLP to run a word-count analysis, which produced the word cloud here. Words like “afraid”, “retaliation” and “troublemaker” came up.
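For anyone curious what a word-count pass like this can look like, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the stop-word list is a tiny hand-picked one, and the sample responses are made up for illustration (they are not ELSA's data).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"i", "a", "the", "to", "of", "my", "was", "as", "it", "and", "be"}

def word_counts(responses):
    """Count non-stop-word tokens across all free-text responses."""
    counts = Counter()
    for text in responses:
        # Lowercase, drop apostrophes so "didn't" becomes "didnt", then keep letter runs.
        tokens = re.findall(r"[a-z]+", text.lower().replace("'", ""))
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

# Made-up example responses, for illustration only.
sample = [
    "I was afraid of retaliation",
    "I didn't want to be seen as a troublemaker",
    "Afraid of losing my job",
]
counts = word_counts(sample)
print(counts.most_common(3))
```

The resulting counts are the kind of input a word cloud is typically drawn from: the more often a word appears, the larger it is rendered.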

To add more context, I also performed topic modelling using Latent Dirichlet Allocation (LDA) to discover any abstract topics or themes present in the responses. The results were:
Topic #0: 0.226*"harassment" + 0.115*"sure" + 0.115*"time" + 0.115*"case" + 0.115*"sexual" + 0.115*"workplace" + 0.033*"troublemaker" + 0.033*"work" + 0.033*"seen" + 0.025*"want"
Topic #1: 0.171*"cause" + 0.170*"trouble" + 0.170*"person" + 0.158*"didnt" + 0.158*"want" + 0.006*"useless" + 0.006*"seen" + 0.006*"think" + 0.006*"work" + 0.006*"troublemaker"
Topic #2: 0.089*"want" + 0.089*"didnt" + 0.083*"job" + 0.083*"type" + 0.083*"afraid" + 0.083*"losing" + 0.083*"retaliation" + 0.079*"troublemaker" + 0.079*"work" + 0.079*"seen"
Topic #3: 0.155*"shame" + 0.107*"afraid" + 0.058*"seemed" + 0.058*"malicious" + 0.058*"nothing" + 0.058*"retaliation" + 0.058*"type" + 0.058*"losing" + 0.058*"job" + 0.058*"partner"
Topic #4: 0.185*"think" + 0.185*"useless" + 0.185*"denounce" + 0.076*"malicious" + 0.076*"seemed" + 0.076*"nothing" + 0.022*"complaint" + 0.022*"know" + 0.022*"whether" + 0.022*"cases"
Each topic is represented as a list of words along with weights that indicate each word's importance within that topic. Here's how we can interpret the output:
- Topic #0: This topic seems to be about "workplace sexual harassment." Words like "harassment," "sexual," "workplace," and "case" are most dominant, indicating that this topic revolves around issues related to harassment at work.
- Topic #1: This topic seems related to "consequences or feelings." Words like "cause," "trouble," and "person" suggest concerns about the trouble or consequences that reporting might bring.
- Topic #2: Here, the words revolve around "fear of retaliation." Words like "afraid," "retaliation," "losing," "job," and "troublemaker" indicate a fear of repercussions for reporting incidents.
- Topic #3: This topic seems to involve "feelings and negative consequences." Words like "shame," "afraid," "losing," "job," and "retaliation" suggest emotional distress and fear of repercussions for reporting.
- Topic #4: This topic seems to be about "doubts and uncertainties." Words like "think," "useless," "denounce," and "complaint" indicate doubts about the effectiveness or utility of reporting the incident.
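To make reading those weighted word lists a bit more mechanical, here is a small sketch that parses a topic string in the format printed above and keeps only the dominant words. The 0.05 cutoff is my own arbitrary assumption for illustration, not something the LDA model provides.

```python
import re

def parse_topic(topic_string):
    """Turn '0.171*"cause" + 0.170*"trouble"' into [(word, weight), ...]."""
    pairs = re.findall(r'([0-9.]+)\*"([^"]+)"', topic_string)
    return [(word, float(weight)) for weight, word in pairs]

def top_words(topic_string, min_weight=0.05):
    """Keep only words whose weight is at least min_weight (an assumed cutoff)."""
    return [word for word, weight in parse_topic(topic_string) if weight >= min_weight]

# Topic #1 exactly as printed in the results above.
topic_1 = ('0.171*"cause" + 0.170*"trouble" + 0.170*"person" + 0.158*"didnt" + '
           '0.158*"want" + 0.006*"useless" + 0.006*"seen" + 0.006*"think" + '
           '0.006*"work" + 0.006*"troublemaker"')

print(top_words(topic_1))
```

For Topic #1 this separates the five heavily weighted words from the five that barely register, which is why the interpretation leans on words like "cause" and "trouble" and ignores "useless" in that topic.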
Insights: This shed light on the multi-faceted obstacles deterring victims from reporting workplace harassment and abuse. Three predominant themes emerged from employee descriptions of their challenges in coming forward:
- Fear of Reprisals - Apprehensions over being labeled a "troublemaker," losing one's job or facing other retaliation dominate. Victims rightfully worry about the professional and personal consequences of accusing their harassers.
- Social Stigma - Embarrassment, shame, and wanting to avoid "causing trouble" suggest cultural taboos around openly discussing harassment. Toxic notions still reinforce a code of silence rather than support.
- Process Doubts - Sentiments that reporting is "useless," along with uncertainty over whether cases are adequately investigated, signal systemic mistrust. Employees lack confidence that current protocols will deliver justice or stop repeat offenses.
These interwoven themes reflect a no-win situation where victims shoulder disbelief, blame and career threats for speaking up about the harassment they experienced.
Considering that most of the organizations have formal reporting channels and procedures in place and also mandate sexual harassment trainings, I wanted to understand how the trainings are failing.
Analysis 2: Impact of Trainings
Since mandatory trainings are a direct way to reach all employees, I next wanted to analyze the impact that the trainings have on employees. I conducted 5 hypothesis tests to look for relationships. The first set of tests used either correlation or t-tests.
As you can see the first three tests resulted in a p-value very close to 0, indicating that trainings have a significant impact on imparting knowledge about reporting channels, investigation procedures and policies. However, the hypothesis regarding trainings and reporting behavior yielded a non-significant result with a p-value of 0.89, suggesting that there’s no statistically significant difference in reporting scores between individuals who have received training and those who haven’t. This reveals that the current trainings may not be motivating individuals to report cases of sexual harassment.
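As a rough illustration of the kind of two-group comparison behind the reporting-score result, here is a sketch of a t-test statistic using only the standard library. I am assuming Welch's version (unequal variances) here, which may not be the exact variant used in the project, and the scores below are made up. Turning the statistic into a p-value needs the t distribution, which a library call such as `scipy.stats.ttest_ind(a, b, equal_var=False)` handles.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Per-group variance of the mean (sample variance divided by group size).
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1))
    return t, df

# Made-up reporting scores on the 0-5 scale, for illustration only.
trained = [0, 0, 1, 0, 5, 0, 0, 2]
untrained = [0, 1, 0, 0, 0, 4, 0, 3]
t_stat, dof = welch_t(trained, untrained)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A t statistic near zero, as in this toy data where both groups share the same mean, corresponds to a large p-value: no evidence that the groups differ.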

For the hypothesis "People who have received more trainings in the past year have lower tolerance to sexual harassment," I conducted an ANOVA test since it involved three groups. The results suggested that at least one group's tolerance is significantly different. To identify which groups differ, I conducted a Tukey's Honestly Significant Difference (HSD) test, which gave the following result:

group 0: Employees who received no training
group 1: Employees who received trainings once
group 2: Employees who received trainings more than once
This means that individuals who have never received any training have significantly lower mean tolerance levels compared to those who have received training at least once. However, there's no significant difference in mean tolerance levels between those who received training once and those who received training more than once.
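For readers who want to see what the ANOVA step computes, here is a minimal standard-library sketch of the one-way ANOVA F statistic for three groups. The tolerance scores are invented for illustration and are not ELSA's data. In practice a library would typically handle both steps, for example `scipy.stats.f_oneway` for the ANOVA and `scipy.stats.tukey_hsd` for the pairwise follow-up.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return the F statistic with its between- and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up tolerance scores (percent) for the three training groups.
group_0 = [20, 22, 24]   # no training
group_1 = [28, 30, 32]   # trained once
group_2 = [29, 30, 31]   # trained more than once
f_stat, df_b, df_w = one_way_anova_f(group_0, group_1, group_2)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F only says that at least one group mean differs from the others. It does not say which one, which is why a pairwise test like Tukey's HSD follows.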
Insights: This analysis concluded that trainings play a significant role in educating the workforce about the policies and procedures in place to deal with sexual harassment. They also have a significant impact on tolerance levels when compared to those of individuals who have undergone no training. But there are two big gaps jumping out where these trainings miss the mark:
- Training does not seem to lead to higher reporting rates despite imparting knowledge about reporting channels and procedures, indicating a gap in translating knowledge gained from training into actual reporting behavior.
- Taking the training more than once does not significantly impact tolerance levels. This suggests that repeated training has a diminished impact on reducing tolerance.
Analysis 3: Impact on performance and productivity
A concerning finding from the data provided by ELSA showed that sexual harassment takes a significant toll on employees' wellbeing and performance. Approximately 72% of employees who have experienced sexual harassment at work reported heightened stress, anxiety, lack of concentration, absenteeism or reduced productivity.
While the personal impacts are most troubling, the ripple effects on the broader workplace are also significant. Research conducted by the National Library of Medicine concluded that higher stress scores were associated with lower productivity scores. They conducted a t-test to reach this statistically significant conclusion, which resulted in a p-value of < 0.001 (Bui et al., 2021). A survey conducted by the Society for Human Resource Management also concluded that, aside from low productivity among absent employees, co-workers are perceived to be 29.5% less productive while covering for the absent employees. Supervisors spend an extra four hours per week planning around absenteeism and have a reduced productivity of 15.7% (Society for Human Resource Management, 2014). This indicates that an entire team or division can be impacted by victims' experiences, and logically, the more pervasive the issue, the more widespread the impacts on the day-to-day work, and the bottom line, of the organization.
Recommendations
Based on the above analysis and findings I came up with the following strategic recommendations to improve the way ELSA works with the organizations to reduce cases of workplace sexual harassment:
- Enhance Training for Impactful Behavioral Change: Simply imparting knowledge is inadequate - trainings should focus on driving impactful behavioral changes through simulations, role-plays and realistic scenarios.
  - Behavioral Modules for All Employees: Incorporate modules that promote proactive reporting behaviors by addressing real reasons behind non-reporting. Also introduce bystander training to instill broader responsibility towards preventing harassment.
  - Specialized Training for Managers: Equip managers with sensitivity and tools to foster a harassment-free work culture. Their broader influence calls for training focused on supporting employees in difficult situations.
  - Refresher Courses for Returning Employees: Develop refresher modules for returning trainees focused on improving tolerance and reinforcing reporting significance through complex case studies.
- Communicate Business Impact to Secure Buy-In: Complement the moral imperative by highlighting productivity, performance and costs impacts of unaddressed harassment.
  - Leadership Sensitization: Apprise leadership of harassment's tangible business consequences like reduced productivity, absenteeism and employee turnover. The goal is securing their buy-in.
  - Establish Zero Tolerance Ethos: Institutionalize zero tolerance not just as a moral mandate but also as vital for optimal workplace culture, team collaboration and performance.
- Provide Employee Support Infra and Services: Complement enhanced training with accessible support infra to encourage reporting and help victims recover.
  - External Counseling Assistance: Implement robust Employee Assistance Programs that provide counseling and coping support to harassed employees.
  - Respond to Reporting Apprehensions: Provide safe anonymous channels, clear anti-retaliation policies and general employee support to counter reporting fears.
Lastly, a heartfelt thank you to ELSA for their groundbreaking work in tackling workplace harassment in Latin America. Thank you for providing this invaluable data. Your work and commitment are truly inspiring!
And a huge thanks to Professor Heather Whiteman for introducing me to this exciting field of people analytics and for providing the opportunity to work with ELSA.