Employee Attrition Analysis: Analyzing an Employee Exit Survey

程序员文章站 2022-04-10 21:01:51

When analyzing employee sentiment data, which in our case is an employee exit survey, we have to look at four topics.

  1. Statistical rigor of the survey
  2. Demographic composition of survey respondents
  3. Overall sentiment for defined latent constructs
  4. Sentiment scores by respondents’ characteristics (e.g. gender, location, department)

First, keeping to this methodology will enable us to determine how well our survey measures what it is meant to measure. Second, by understanding who answered the survey in terms of respondent characteristics (e.g. gender, department), we can provide context for our analysis and results. Third, this methodology will help us determine the general sentiment of the respondents. Last but not least, it will help us determine not only which organizational initiatives might improve sentiment but also where those initiatives should be implemented.

Dataset

The dataset we’ll be using is a fictional employee exit survey that asks employees a series of questions about their organizational demographics (e.g. department), followed by 5-point Likert sentiment items (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree) such as “the organization offered plenty of promotional opportunities”. No open-ended questions were used.

Data Processing

import warnings

import pandas as pd
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
import seaborn as sns

warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)
%matplotlib inline

# Load the survey export and inspect column types and non-null counts
df = pd.read_csv('exit_data_final.csv')
df.info()

We have 33 items or questions which were asked of the employees. Before we can begin our analysis we have a bit of data cleaning to perform.

df.drop('Unnamed: 0', axis=1, inplace=True)

Let’s drop this odd “Unnamed” column as it serves no purpose.

# Print the unique values of every column to spot inconsistent encodings
for var in df.columns:
    print(var, df[var].unique())

By examining the unique values for each item we can see a few issues.

  1. Some items have missing values labeled correctly as np.nan, but others contain only a blank string (' ').
  2. Based on df.info(), we need to convert the Likert items to a numeric type, as they are currently stored as ‘object’.
  3. Finally, we need to transform some of the values to improve the readability of our visualizations.
# Replacing blank strings with np.nan so missing values are encoded consistently
df = df.replace(to_replace=' ', value=np.nan)

# Converting the Likert items from 'object' to float
likert_items = ['promotional_opportunities', 'performance_recognized',
                'feedback_offered', 'coaching_offered', 'mgmt_clear_mission',
                'mgmt_support_me', 'mgmt_support_team', 'mgmt_clear_comm',
                'direct_mgmt_satisfaction', 'job_stimulating', 'initiative_encouraged',
                'skill_variety', 'knowledge_variety', 'task_variety', 'fair_salary',
                'teamwork', 'team_support', 'team_comm', 'team_culture',
                'job_train_satisfaction', 'personal_train_satisfaction', 'org_culture',
                'grievances_resolution', 'co-worker_interaction',
                'workplace_conditions', 'job_stress', 'work/life_balance']
for col in likert_items:
    df[col] = pd.to_numeric(df[col], errors='coerce').astype('float64')

# Discretization of tenure into 5-year bands.
# include_lowest=True is added here so that a tenure of exactly 0 is not dropped.
bins = [0, 4, 9, 14, 19, 24]
labels = ['0-4yrs', '5-9yrs', '10-14yrs', '15-19yrs', '20+yrs']
df['tenure'] = pd.cut(df['tenure'], bins=bins, labels=labels, include_lowest=True)
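
One detail of the tenure discretization is worth checking on toy data: by default pd.cut builds right-closed intervals and leaves the lowest edge open, so a tenure of exactly 0 falls outside the first bin, and anything above the last edge (24) is left unlabeled even though the label reads '20+yrs'. The sketch below uses made-up tenure values (the `tenure` Series is hypothetical, not the survey data) to compare the default behaviour with `include_lowest=True`.

```python
import pandas as pd

bins = [0, 4, 9, 14, 19, 24]
labels = ['0-4yrs', '5-9yrs', '10-14yrs', '15-19yrs', '20+yrs']

# Hypothetical tenure values chosen to hit the bin edges
tenure = pd.Series([0, 4, 5, 19, 20, 30])

# Default: intervals are (0, 4], (4, 9], ... so 0 is not assigned to any bin
default = pd.cut(tenure, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left: [0, 4]
inclusive = pd.cut(tenure, bins=bins, labels=labels, include_lowest=True)

print(pd.DataFrame({'tenure': tenure, 'default': default, 'inclusive': inclusive}))
```

If tenures beyond 24 years can occur in the real data, the last bin edge would need to be raised (for example to `float('inf')`) for the '20+yrs' label to be accurate.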

Development of Latent Constructs

In my previous article, we reviewed the process of analyzing the statistical rigor (i.e. validity, reliability, factor analysis) of our survey. Feel free to review it, but let’s quickly recap what latent survey constructs are and how they are derived.

In order to develop survey items or questions which maintain good statistical rigor, we have to begin with scholarly literature. We want to find a theoretical model that describes the phenomena we wish to measure. For example, personality surveys very often use the Big-5 model (i.e. openness, conscientiousness, extraversion, agreeableness, and neuroticism) to develop the survey items. The survey developer will carefully craft 2–10 items (depending on the length of the survey) for each component of the model. The items which are meant to assess the same component are said to be measuring a “latent construct”. In other words, we are not measuring “extraversion” explicitly, as that would be an “observed construct”, but indirectly through the individual survey items. The survey is pilot tested with multiple samples of respondents until a certain level of rigor is attained. Once again, if you’re interested in the statistical analyses used to determine rigor, take a look at my previous article.

# Calculating latent variables: each construct is the row-wise mean of its items
constructs = {
    'employee_valued': ['promotional_opportunities', 'performance_recognized',
                        'feedback_offered', 'coaching_offered'],
    'mgmt_sati': ['mgmt_clear_mission', 'mgmt_support_me', 'mgmt_support_team',
                  'mgmt_clear_comm', 'direct_mgmt_satisfaction'],
    'job_satisfaction': ['job_stimulating', 'initiative_encouraged', 'skill_variety',
                         'knowledge_variety', 'task_variety'],
    'team_satisfaction': ['teamwork', 'team_support', 'team_comm', 'team_culture'],
    'training_satisfaction': ['job_train_satisfaction', 'personal_train_satisfaction'],
    'org_environment': ['org_culture', 'grievances_resolution',
                        'co-worker_interaction', 'workplace_conditions'],
    'work_life_balance': ['job_stress', 'work/life_balance'],
}
for name, items in constructs.items():
    df[name] = np.nanmean(df[items], axis=1)

# Grand average across all 27 Likert items (fair_salary belongs to no construct above)
all_items = [item for items in constructs.values() for item in items] + ['fair_salary']
df['overall_sati'] = np.nanmean(df[all_items], axis=1)

Our exit survey has also been developed to assess certain latent constructs. The items are averaged according to the latent factor each is meant to measure. Finally, we have calculated an “overall_sati” feature, which is the grand average across all items for each respondent.

Below is a list of the survey items and the latent construct they are meant to measure. Keep in mind each label for each item has been shortened significantly in order to help facilitate visualizations. You can imagine the items asking questions such as “On a scale of 1–5, I find my job stimulating”.

[Table: survey items and the latent construct each is meant to measure]
# Collapse the 5-point scale into three ordered categories for plotting
mappings = {1:'1) Dissatisfied', 2:'1) Dissatisfied', 3:'2) Neutral', 4:'3) Satisfied', 5:'3) Satisfied'}
likert = ['promotional_opportunities', 'performance_recognized',
          'feedback_offered', 'coaching_offered', 'mgmt_clear_mission',
          'mgmt_support_me', 'mgmt_support_team', 'mgmt_clear_comm',
          'direct_mgmt_satisfaction', 'job_stimulating', 'initiative_encouraged',
          'skill_variety', 'knowledge_variety', 'task_variety', 'fair_salary',
          'teamwork', 'team_support', 'team_comm', 'team_culture',
          'job_train_satisfaction', 'personal_train_satisfaction', 'org_culture',
          'grievances_resolution', 'co-worker_interaction',
          'workplace_conditions', 'job_stress', 'work/life_balance']
for col in likert:
    df[col+'_short'] = df[col].map(mappings)

df.head()

To aid with visualizations, we are going to create new features which aggregate ratings of 1 and 2 into Dissatisfied, 3 into Neutral, and 4 and 5 into Satisfied. This will enable us to create stacked bar plots with 3 unique portions.
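
To make the three-portion idea concrete, here is a minimal sketch of how the collapsed categories turn into the percentages behind a stacked bar. The toy DataFrame and its department values are made up for illustration; only the mappings dictionary and the `_short` naming follow the code in this section.

```python
import pandas as pd

mappings = {1: '1) Dissatisfied', 2: '1) Dissatisfied', 3: '2) Neutral',
            4: '3) Satisfied', 5: '3) Satisfied'}

# Hypothetical responses to one Likert item from two departments
toy = pd.DataFrame({
    'department': ['sales', 'sales', 'sales', 'sales', 'hr', 'hr'],
    'job_stimulating': [1, 2, 3, 5, 4, 4],
})
toy['job_stimulating_short'] = toy['job_stimulating'].map(mappings)

# Row-normalized crosstab: each department's three portions sum to 100%
shares = pd.crosstab(toy['department'], toy['job_stimulating_short'],
                     normalize='index') * 100
print(shares)

# shares.plot(kind='barh', stacked=True) would draw one stacked bar per department
```

The numeric prefixes in the labels ('1)', '2)', '3)') keep the columns in a sensible order, since crosstab sorts them alphabetically.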

Respondents’ Characteristics

Understanding the demographics of survey respondents helps to provide contextual information for our analysis. We also have to keep in mind that most employee exit surveys are completed on a voluntary basis. Because of this self-selection, we need to treat any insights gleaned from the data as “evidence” of organizational affairs rather than definitive “proof”. Employees might have been extremely happy or angry with the organization, and that attitude will surely be reflected in their answers. Finally, the survey consists of roughly 600 respondents, and we need to be careful not to treat this as the total number of terminations which occurred in the last 4 years. 600 might be only a small percentage of all terminations which have occurred.

In other words, our analysis aims to determine the level of satisfaction with several organizational factors among those employees who RESPONDED to the survey. We should not generalize our results to the broader organization or to all terminated employees.

def uni_plots(feature, text):
    # Count and percentage of respondents in each category of `feature`
    tmp_count = df[feature].dropna().value_counts().values
    tmp_percent = ((df[feature].dropna().value_counts()/len(df))*100).values
    df1 = pd.DataFrame({feature: df[feature].value_counts().index,
                        'Number of Employees': tmp_count,
                        'Percent of Employees': tmp_percent})

    f, ax = plt.subplots(figsize=(20,10))
    plt.title(text, fontsize=25, pad=30)
    plt.tick_params(axis='both', labelsize=15, pad=10)
    plt.xlabel(feature, fontsize=20)
    plt.xticks(size=18)
    plt.yticks(size=18)

    # Bars for the raw counts, annotated with their values
    sns.set_color_codes('pastel')
    count = sns.barplot(x=feature, y='Number of Employees', color='b', data=df1, label='Number of Employees')
    for p in count.patches:
        count.annotate(format(p.get_height(), '.1f'),
                       (p.get_x() + p.get_width() / 2., p.get_height()),
                       ha = 'center', va = 'center',
                       xytext = (0, 9),
                       textcoords = 'offset points', size = 20)

    # Bars for the percentages, drawn on the same axes
    sns.set_color_codes('muted')
    percent = sns.barplot(x=feature, y='Percent of Employees', color='b', data=df1, label='Percent of Employees')
    for i in percent.patches:
        percent.annotate(format(i.get_height(), '.1f'),
                         (i.get_x() + i.get_width() / 2., i.get_height()),
                         ha = 'center', va = 'center',
                         xytext = (0, 9), size = 20,
                         textcoords = 'offset points')

    ax.set_ylabel('')
    ax.legend(ncol=2, loc="upper right", fontsize=15, frameon=True)
    sns.despine(left=False, bottom=False)
    ax.set_xticklabels(ax.get_xticklabels(), rotation=45)
    plt.show()


def bi_cat_plot(feature1, feature2):
    # Row-normalized crosstab: share of each feature2 category within feature1
    ax = pd.crosstab(df[feature1], df[feature2], normalize='index')*100
    ax1 = ax.plot(kind='barh', stacked=True, figsize=(25,15), fontsize=25)
    for i in ax1.patches:
        width, height = i.get_width(), i.get_height()
        x, y = i.get_xy()
        ax1.text(x+width/2,
                 y+height/2,
                 '{:.0f} %'.format(width),
                 horizontalalignment='center',
                 verticalalignment='center',
                 size=25)

    plt.title('Percentage of Termination Reasons by {}'.format(feature1), fontsize=30, pad=25)
    plt.ylabel(' ')
    plt.legend(prop={'size':20})

The production, customer service, and sales departments make up over 65% of the survey respondents across 4 years (2017–2020) of data collection. The number of respondents varies widely from year to year: 2018 saw 342 respondents whereas 2020 has seen only 37.

A major insight stems from the fact that almost 50% of the respondents were voluntary terminations. Once again, it is difficult to generalize our results to the broader organization without conclusive HRIS data, but a discrepancy of this size in termination reasons points to the possibility that the company has a problem with voluntary turnover. We don’t have any employee performance data to distinguish constructive from regrettable voluntary turnover, which would have allowed us to focus our analysis specifically on regrettable voluntary turnover.

The largest age group of respondents is 56 and older which makes up almost 20% of the respondents. Slicing age by reason for termination we can see that this age group makes up 87% of retirements which would make sense. If we look at voluntary turnover and age, we can see an even distribution across all age groups.

Variables with a low number of categories, such as gender, typically don’t provide us with many insights unless the data is heavily skewed. It seems females responded to the survey at almost a 2:1 ratio compared to males. Looking at gender by reason for termination, we see percentages that mirror the overall gender ratio.

We have a fairly even distribution of respondents across job types. Managerial and executive job types make up a significantly smaller portion, but these positions make up a smaller portion of the organization in general. What is more interesting is that we see a spike in involuntary terminations (38%) for executives. We typically see lower sentiment for involuntary terminations compared to other reasons for termination; therefore, we can expect executives to score fairly low on overall sentiment.

Overall, voluntary terminations make up the largest share of respondents, who mainly come from the production, customer service, and sales departments and fall towards the latter years of their careers (41 and older).

Overall Respondent Sentiment

The overall respondent sentiment is calculated by first taking the average of all the individual sentiment items (i.e. the Likert items) for each respondent. A grand average of these per-respondent scores is then calculated for each group, depending on how the data is sliced.
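
The two-step averaging can be sketched on a tiny made-up table. The column names item_a and item_b and the department values are placeholders for illustration; overall_sati mirrors the feature name used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical responses: three respondents, two Likert items, one missing answer
toy = pd.DataFrame({
    'department': ['hr', 'hr', 'sales'],
    'item_a': [2.0, 4.0, 5.0],
    'item_b': [2.0, np.nan, 3.0],
})
items = ['item_a', 'item_b']

# Step 1: per-respondent average over the items (missing answers are skipped)
toy['overall_sati'] = toy[items].mean(axis=1)

# Step 2: grand average of those per-respondent scores within each slice
by_dept = toy.groupby('department')['overall_sati'].mean()
print(by_dept)
```

Because step 1 skips missing answers, a respondent who answered only a few items still receives a score, which is the same behaviour as np.nanmean.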

def overall_plot(feature):
    # Mean overall sentiment per category of `feature`, sorted ascending
    ax = round(df.groupby(feature)['overall_sati'].mean(),2).sort_values().plot(kind='barh', stacked=True,
                                                                                 figsize=(25,15), fontsize=25)
    for i in ax.patches:
        width, height = i.get_width(), i.get_height()
        x, y = i.get_xy()
        ax.text(x+width/2,
                y+height/2,
                '{:.2f}'.format(width),
                horizontalalignment='center',
                verticalalignment='center',
                size=25)
    plt.title('Overall Employee Sentiment by {}'.format(feature), fontsize=30, pad=25)
    plt.ylabel(' ')

Overall, HR has the lowest overall sentiment, but it also makes up the smallest percentage of respondents (5.8%). This is problematic because the average of a small group can be quickly swayed by just a few respondents who rated the Likert items very low. That said, we cannot ignore this low overall average sentiment. On the other hand, production and sales make up the two largest portions of the respondents, and they score 2nd and 3rd lowest. We will have to look into these three departments to determine which factors score particularly low.
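
One way to keep group size in view when comparing averages is to report the count and the standard error next to each mean. This is a minimal sketch on toy data; the `sentiment_summary` helper and the toy values are assumptions and not part of the original analysis.

```python
import pandas as pd

def sentiment_summary(data, feature, score='overall_sati'):
    """Count, mean and standard error of the mean of `score` per group."""
    summary = data.groupby(feature)[score].agg(['count', 'mean', 'sem'])
    return summary.sort_values('mean')

# Hypothetical scores: a small group and a larger one
toy = pd.DataFrame({
    'department': ['hr', 'hr', 'production', 'production', 'production', 'production'],
    'overall_sati': [2.0, 4.0, 3.0, 3.0, 4.0, 4.0],
})
summary = sentiment_summary(toy, 'department')
print(summary)
```

A group with few respondents shows up with a small count and a large standard error, which signals that its mean should be read with more caution.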

The fact that involuntary terminations scored the lowest on overall sentiment is not surprising. Being terminated (fired or laid off) by the organization typically produces adverse emotions which can affect how you respond to the survey. What is more interesting is that voluntary terminations, the most common reason for termination among respondents, scored second-lowest on overall sentiment. Understanding the potential causes of voluntary turnover can pay enormous dividends in retaining a high-performing workforce.

We can see that the 46–50 age group scored the lowest.

As expected, executives score particularly low on overall sentiment, as 38% of them were terminated involuntarily, but we also have to remember that 33% left voluntarily. We definitely have to look into which specific factors executives score particularly low on. Finally, machine_ops scored equally low (3.39) on overall sentiment, and 52% of the respondents in this group were voluntary terminations.

Finally, we don’t see any particular differences in general sentiment between tenure groups.

In summary, HR, production, and sales had the lowest overall sentiment, with HR scoring 3.39. Since production and sales have the highest numbers of survey respondents, we need to examine which specific factors these departments score the lowest on. Involuntary terminations had the lowest sentiment (3.42), which is no surprise, but voluntary terminations had the second-lowest average sentiment while having the highest number of survey respondents. Finally, the executive and machine ops job types had the lowest sentiment (3.39). Yet again, additional analyses are required to ascertain the nature of this relationship.

Average Overall Latent Factors Sentiment

# Average of each latent factor across all respondents
emp_value_avg = round(np.mean(df['employee_valued']),2)
mgmt_sati_avg = round(np.mean(df['mgmt_sati']),2)
job_sati_avg = round(np.mean(df['job_satisfaction']),2)
team_sati_avg = round(np.mean(df['team_satisfaction']),2)
training_sati_avg = round(np.mean(df['training_satisfaction']),2)
org_env_avg = round(np.mean(df['org_environment']),2)
work_life_avg = round(np.mean(df['work_life_balance']),2)
overall_sati = round(np.mean([emp_value_avg, mgmt_sati_avg, job_sati_avg, team_sati_avg,
                              training_sati_avg, org_env_avg, work_life_avg]), 2)
temp_dict = {'emp_value_avg': emp_value_avg, 'mgmt_sati_avg': mgmt_sati_avg,
             'job_sati_avg': job_sati_avg, 'team_sati_avg': team_sati_avg,
             'training_sati_avg': training_sati_avg, 'org_env_avg': org_env_avg,
             'work_life_avg': work_life_avg, 'overall_sati': overall_sati}
tmp_df = pd.DataFrame.from_dict(temp_dict, orient='index', columns=['average']).sort_values(by='average')

plt.figure(figsize=(25,15))
plt.title('Overall Latent Factor Averages', fontsize=28)
plt.ylabel('Average Employee Rating', fontsize=25)
ax = tmp_df['average'].plot(kind='barh', fontsize=25)
for i in ax.patches:
    width, height = i.get_width(), i.get_height()
    x, y = i.get_xy()
    ax.text(x+width/2,
            y+height/2,
            '{:.2f}'.format(width),
            horizontalalignment='center',
            verticalalignment='center',
            size=25)
plt.grid(False)
plt.show()

If we disregard any respondent demographics and look at our latent sentiment factors we see that fair_salary, emp_value, and org_env have scored the lowest. It is important to focus our analysis on these factors in order to understand why these factors are low but also where in the organization they are the lowest (ie. department, job type, etc.). Our results are confirmed for voluntary termination as well.

likert = ['promotional_opportunities', 'performance_recognized',
          'feedback_offered', 'coaching_offered', 'mgmt_clear_mission',
          'mgmt_support_me', 'mgmt_support_team', 'mgmt_clear_comm',
          'direct_mgmt_satisfaction', 'job_stimulating', 'initiative_encouraged',
          'skill_variety', 'knowledge_variety', 'task_variety', 'fair_salary',
          'teamwork', 'team_support', 'team_comm', 'team_culture',
          'job_train_satisfaction', 'personal_train_satisfaction', 'org_culture',
          'grievances_resolution', 'co-worker_interaction',
          'workplace_conditions', 'job_stress', 'work/life_balance']

# Average sentiment of every individual Likert item, ignoring missing answers
likert_avgs = []
for col in likert:
    likert_avgs.append(round(np.nanmean(df[col]),2))
likert_avgs_df = pd.DataFrame(list(zip(likert, likert_avgs)), columns=['likert_item', 'avg_sentiment'])
likert_avgs_df.set_index('likert_item', inplace=True)

plt.figure()
ax = likert_avgs_df.plot(kind='bar', figsize=(25,15), fontsize=20)
for i in ax.patches:
    width, height = i.get_width(), i.get_height()
    x, y = i.get_xy()
    ax.text(x+width-0.25,
            y+height+.1,
            '{:.2f}'.format(height),
            horizontalalignment='center',
            verticalalignment='center',
            size=20)
plt.title('Average Overall Likert Item Sentiment', fontsize=30, pad=25)
plt.legend().remove()
plt.xlabel(' ')

By examining the individual Likert items which make up each latent factor, we can see that promotional_opportunities and performance_recognized contribute the most to the low sentiment of emp_value. Although we would have liked more than one Likert item to assess salary satisfaction, we can examine fair_salary in greater detail to determine where this sentiment is particularly low. Finally, it seems org_culture and grievances_resolution contribute the most to the low sentiment of org_environment.

Latent Construct Sentiment by Respondent Characteristics

When analyzing survey data, it is quite easy to end up down a proverbial rabbit hole of charts and plots, only to lose sight of your goal. In other words, we need to narrow our focus. The ultimate goal of analyzing sentiment surveys is to identify areas of weakness where organizational initiatives can be implemented to improve them. The area we will mainly focus our efforts on is voluntary terminations. First, they make up almost 50% of the respondents. Second, this is the employee population where organizational initiatives can have the most significant impact. Finally, we want to limit the amount of voluntary turnover in order to limit the knowledge drain from the organization and minimize the recruiting and training costs associated with hiring replacement employees.

# plotting average likert sentiment by respondent characteristics for voluntary terminations
def bi_volterm_plot(feature1, feature2):
    # Keep only respondents whose reason for termination was voluntary
    tmp_df = df.loc[(df['reason_of_term']=='vol_term')]
    ax = round(tmp_df.groupby(feature1)[feature2].mean(),2).sort_values().plot(kind='barh', stacked=True,
                                                                               figsize=(25,15), fontsize=25)
    for i in ax.patches:
        width, height = i.get_width(), i.get_height()
        x, y = i.get_xy()
        ax.text(x+width/2,
                y+height/2,
                '{:.2f}'.format(width),
                horizontalalignment='center',
                verticalalignment='center',
                size=25)
    plt.title('Average {} Sentiment of Voluntary Terminations by {}'.format(feature2, feature1),fontsize=30, pad=25)
    plt.ylabel(' ')
    plt.legend(prop={'size':20})

Employee Valued Factor

Not surprisingly, voluntary terminations from the HR department had the lowest score on emp_value, as HR has the lowest overall sentiment. Additionally, purchasing, customer service, and production also scored relatively low. This is important as production and customer service have the largest number of survey respondents.

Since we know promotional opportunities and performance recognized are the main drivers behind low emp_value sentiment we are plotting these Likert items against department, age, and job type.

Filtering the data for voluntary terminations only, we can see that most departments score low on promotional opportunities, but customer_service, HR, and purchasing score particularly low. By age, voluntarily terminated employees between the ages of 21 and 50 score very low on promotional opportunities sentiment, and the 46–50 age group scores very low (2.62) on this Likert item. The executive, managerial, machine ops, and service worker job types scored low on promotional opportunities sentiment, with executives scoring very low (2.17). Finally, all tenure groups besides 1–14 yrs score low on promotional opportunities, but 5–9 yrs scored especially low (2.80).

Once again we see purchasing and HR score the lowest on performance recognized but IT and production have fairly low sentiment as well. From an employee age perspective, 26–50 have scored the lowest, plus these age groups also scored very low on promotional opportunities. Again we see executive, managerial, and machine ops job types score the lowest. Finally, much like we saw with promotional opportunities, tenure group 5–9 yrs has scored the lowest on performance recognized.

Fair Salary Factor

Fair_salary scored second-lowest on overall respondent sentiment. Now that we have filtered our data for voluntary terminations, let’s take a look at where salary sentiment is the lowest. Not surprisingly, the HR and purchasing departments scored the lowest. However, production, which tends to score low on many Likert items, actually scored relatively well. That said, we do see R&D and IT scoring low on this Likert item. The 46–50 age group again scores low, with an average sentiment of 2.76. Executive, managerial, and administrative job types score the lowest on fair_salary. Finally, tenure group 5–9 yrs scored the lowest.

Organizational Environment Factor

We know that the organizational environment factor scored third lowest in terms of overall sentiment. As we filter for voluntary terminations we see that HR and purchasing scored the lowest on this factor. Let’s break this factor into its individual Likert items.

We saw from our analysis above that org_culture and grievances_resolution were the two Likert items that seemed to contribute most to organizational environment’s low overall sentiment.

It would seem org_culture has scored low, and at times very low, across all departments. From an age perspective, org_culture is lowest for 46–50 year olds, but it is generally low for individuals between 26 and 50 years old. Executives have scored extremely low on this Likert item (1.80), followed by machine ops and managerial. That said, all job types besides sales have an average score below 3.00. Finally, from a tenure perspective, all tenure groups scored below 3.00, with the 5–9yrs group scoring particularly low.

Once again, purchasing, HR, customer service, and production scored the lowest on grievances_resolution. Among age groups, 46–50-year-old respondents once more show the lowest sentiment. Executives and machine ops had the lowest sentiment for grievances resolution. Finally, the 5–9yrs tenure group again scored the lowest on this Likert item, much as it did on org_culture.

In Summary

Respondents’ Characteristics

Voluntary terminations made up the largest group of survey respondents at almost 50%. This is wonderful news, as it gives us enough data to gain insight into the organizational sentiment of a particularly important population. Understanding the potential reasons why voluntary terminations are occurring can help the company reduce this turnover, thereby minimizing knowledge drain and hiring costs.

Production (32%), customer service (19%), and sales (16%) made up the majority of survey responses from a departmental perspective. Although the remaining departments account for significantly fewer responses, their numbers are not so low as to render their insights meaningless.

Respondents 56 years of age and older made up the largest group (23%), and the remaining groups were relatively even, apart from 20 years and younger, which made up only 2.3%. Slicing this data by reason for termination, we can see that 54% of respondents aged 56 and older were retiring.

Females answered the survey at almost 2:1 rate compared to males.

Job types were fairly evenly represented in the survey, with service workers (19%), professional (17%), and sales (14%) taking the top three. Executives answered the survey at the lowest rate (3%, or 24 responses). 24 is a relatively low number of responses, which can produce skewed results. Furthermore, the executive job type had the largest percentage (38%) of involuntary terminations, and we know that involuntary terminations tend to produce more negative overall sentiment. Any insights regarding the executive population will have to be scrutinized and validated with a much larger sample size.

在调查中,工作类型的分布相当平均,其中服务人员(19%),专业人员(17%)和销售人员(14%)排名前三。 高管回答率最低(3%)或24回答。 24是可能产生偏斜结果的相对较少的响应。 此外,行政工作类型的非自愿离职比例最大(38%),我们知道非自愿离职会产生总体负面情绪。 关于高管人群的任何见解都必须使用更大的样本量进行审查和验证。
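To make the small-sample concern concrete, the sketch below compares a t-based confidence interval for a mean score at n = 24 and n = 600. The scores are made up for illustration, the helper `mean_confidence_interval` is a hypothetical name, and treating 5-point Likert answers as interval data is itself an approximation.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(scores, confidence=0.95):
    """Return (mean, half_width) of a t-based confidence interval for the mean."""
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    std_error = scores.std(ddof=1) / np.sqrt(n)          # sample standard error
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)  # two-sided critical value
    return scores.mean(), t_crit * std_error


# Illustrative 5-point Likert answers, not the survey's data.
pattern = [1, 2, 3, 3, 4, 5]
executives = np.tile(pattern, 4)    # 24 responses
everyone = np.tile(pattern, 100)    # 600 responses

exec_mean, exec_half_width = mean_confidence_interval(executives)
all_mean, all_half_width = mean_confidence_interval(everyone)

print(f"n=24:  {exec_mean:.2f} +/- {exec_half_width:.2f}")
print(f"n=600: {all_mean:.2f} +/- {all_half_width:.2f}")
```

With the same spread of answers, the interval at n = 24 is several times wider than at n = 600, which is why group-level scores for executives deserve extra caution.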

Finally, organizational tenure yielded an even representation of responses, averaging about 20% per tenure group.

Overall Employee Sentiment

As expected, involuntary terminations had the lowest average sentiment, closely followed by voluntary terminations. The HR, production, and sales departments appear to have the lowest overall sentiment. Furthermore, respondents between the ages of 46–50, along with the executive, machine ops, and managerial job types, had the lowest overall sentiment.

The three latent factors with the lowest overall sentiment were salary satisfaction (3.14), employee valued (3.31), and organizational environment (3.46). If we examine the individual Likert items for these three latent factors, we can see that promotional opportunities and performance recognition had the lowest sentiment within employee valued. Fair salary had the lowest sentiment within salary satisfaction. Finally, organizational culture and grievance resolution had the lowest sentiment within organizational environment.
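One common way to arrive at factor-level and item-level scores like these is to average each respondent's items within a factor and then average across respondents. The sketch below assumes that approach; the item-to-factor mapping, the column names, and the numbers are hypothetical and do not reproduce the survey's results.

```python
import pandas as pd

# Hypothetical item-to-factor mapping; "other_salary_item" is a placeholder
# for whatever additional salary question the real survey contains.
factor_items = {
    "salary_satisfaction": ["fair_salary", "other_salary_item"],
    "employee_valued": ["promotional_opportunities", "performance_recognition"],
    "organizational_environment": ["organizational_culture", "grievance_resolution"],
}

# Toy responses coded 1 (Strongly Disagree) to 5 (Strongly Agree).
responses = pd.DataFrame({
    "fair_salary":               [2, 3, 3, 2],
    "other_salary_item":         [4, 3, 4, 3],
    "promotional_opportunities": [3, 3, 2, 4],
    "performance_recognition":   [3, 4, 3, 4],
    "organizational_culture":    [3, 4, 3, 4],
    "grievance_resolution":      [4, 4, 3, 4],
})

# Factor score per respondent = mean of that factor's items.
factor_scores = pd.DataFrame({
    factor: responses[items].mean(axis=1)
    for factor, items in factor_items.items()
})

# Average across respondents and rank factors from lowest to highest.
factor_means = factor_scores.mean().sort_values()

# Item-level means show which questions pull a factor down.
item_means = responses.mean().sort_values()

print(factor_means.round(2))
print(item_means.round(2))
```

Ranking factors first and then drilling into their items mirrors the order of the discussion above: find the weakest factors, then the weakest questions inside them.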

Voluntary Terminations Sentiment

Looking specifically at the voluntarily terminated population, we see results similar to those above. Salary satisfaction, employee value, and organizational environment had the lowest overall sentiment. Again, promotional opportunities and performance recognition were the driving factors behind low employee value sentiment. Fair salary was the main driver of low salary satisfaction. Finally, organizational culture and grievance resolution had the lowest scores on the organizational environment latent factor.

Promotional Opportunities (employee value)

Most departments need to improve access to promotional opportunities, but the purchasing, HR, customer service, and production departments have the greatest need if voluntary turnover is to be stemmed. Furthermore, sentiment toward promotional opportunities is low across most age groups, especially among 46–50-year-olds. The executive, managerial, machine ops, and service worker job types suffered most from a lack of promotional opportunities. Finally, employees with 5–9 years of tenure had the lowest sentiment toward promotional opportunities.
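Breakdowns like the one above come from restricting the data to voluntary terminations and averaging a single item within each group. A minimal sketch follows, with hypothetical column names and toy values; reporting the count next to the mean helps flag groups too small to trust.

```python
import pandas as pd

# Toy data; column names and values are hypothetical.
responses = pd.DataFrame({
    "termination_reason": ["Voluntary", "Voluntary", "Voluntary",
                           "Voluntary", "Involuntary", "Voluntary"],
    "department": ["Purchasing", "Purchasing", "Sales", "Sales", "Sales", "HR"],
    "promotional_opportunities": [2, 3, 4, 3, 1, 2],
})

# Keep only respondents who left voluntarily.
voluntary = responses[responses["termination_reason"] == "Voluntary"]

# Mean item score and number of respondents per department, lowest mean first.
by_department = (
    voluntary.groupby("department")["promotional_opportunities"]
    .agg(["mean", "count"])
    .sort_values("mean")
)

print(by_department)
```

The same pattern applies to age group, job type, or tenure by swapping the grouping column.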

Performance Recognition (employee value)

Purchasing and HR showed low sentiment toward performance recognition. Low sentiment on this item was shared by 46–50-year-olds and by the executive, managerial, and machine ops job types. Finally, employees with 5–9 years of tenure again show low sentiment on this item.

Fair Salary (salary satisfaction)

HR, purchasing, R&D, and IT had the lowest sentiment toward their salary. Older age groups, particularly 46–50-year-olds, had low sentiment. Executive, managerial, and administrative respondents scored the lowest on salary satisfaction.

Organizational Culture (org environment)

Organizational culture seems to have the lowest overall sentiment across the entire survey. All departments, especially purchasing, customer service, and production, seem to suffer from low sentiment. The same is true for age and job type, where 46–50-year-olds and the executive, machine ops, and managerial job types seem to score the lowest.

Grievance Resolution (org environment)

Sentiment on this item is generally higher than for organizational culture, but it is still a major contributor to low organizational environment sentiment. Again, the same departments (purchasing, HR, customer service, and production) score the lowest. The trend continues with age, as 46–50-year-olds again score the lowest. Executives, machine ops, and service workers score the lowest among job types.

Executive Summary

We focused our attention on analyzing an employee exit survey of roughly 600 employees, collected over the prior four years. The largest group of survey respondents (almost 50%) were voluntary terminations. The production, customer service, and sales departments made up the majority of respondents. The respondents’ age distribution was slightly skewed, as the largest group was aged 56 and older. Finally, job types were fairly evenly represented.

Involuntary terminations had the lowest overall sentiment, closely followed by voluntary terminations. Overall, salary satisfaction, employee value, and organizational environment had the lowest sentiment. These results held for the voluntarily terminated sample as well.

The analysis then focused specifically on the voluntarily terminated sample. To increase overall employee sentiment and potentially reduce voluntary turnover, it seems the organization needs to focus its initiatives on improving promotional opportunities, performance recognition, salary, organizational culture, and grievance resolution.

Any initiatives aimed at improving these areas in order to stem voluntary turnover would be best targeted at the purchasing, HR, customer service, and production departments (in order of need). Furthermore, employees between the ages of 46–50, 26–30, and 36–40 would also benefit from these initiatives. From a job type perspective, executives, machine ops, managerial, and service workers would benefit as well (in order of need). It is important to note that the executive job type had a small sample size (24), and only a third of those were voluntary terminations. The survey as a whole also had a relatively small sample size (600), so it is recommended that these results be validated with a larger survey sample. That said, these results do provide a small glimpse into the issues that may exist within the organization.

Translated from: https://towardsdatascience.com/analyzing-employee-exit-surveys-fbbe35ae151d
