Skip to content

Add a Challenging SQL & Pandas Problem: "Employee Performance Analysis" #95

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
iamAntimPal opened this issue Mar 31, 2025 · 0 comments

Comments

@iamAntimPal
Copy link
Owner

Add a Challenging SQL & Pandas Problem: "Employee Performance Analysis"

Description

I propose that we add a new challenging question to our repository that requires solving using both SQL and Pandas. The question involves real-world data analysis and combines multiple concepts. Below is the proposed problem statement.


Problem: Employee Performance Analysis

Context:
A company maintains a database of its employees along with their monthly performance metrics. The data is stored in two tables: Employees and Performance.

Table: Employees

Column Name Type Description
employee_id INT Unique identifier for each employee.
name VARCHAR Employee name.
department VARCHAR Department of the employee.

Table: Performance

Column Name Type Description
employee_id INT Foreign key referencing Employees.employee_id.
month DATE The month of the performance record (assume it is the first day of the month).
sales INT Number of sales made by the employee in that month.
targets_met INT Number of targets met by the employee in that month.

Task:
Write a solution that reports, for each department:

  • Employee Count: Total number of employees.
  • Average Monthly Sales: Average monthly sales per employee (across all available months).
  • Target Achievement Rate: The overall rate of target achievement calculated as:
    [
    \text{Target Achievement Rate} = \frac{\text{Total Targets Met}}{\text{Total Targets Assigned}}
    ]
    (Assume each employee has a fixed target of 10 per month.)

Return the result table with the following columns:
department, employee_count, avg_monthly_sales, target_achievement_rate.
Round avg_monthly_sales and target_achievement_rate to two decimal places.


Requirements:

  1. SQL Solution:

    • Write a query that computes the required results using appropriate JOINs, GROUP BY, and aggregation functions.
    • Ensure the query handles cases where performance records might be missing for some employees.
  2. Pandas (Python) Solution:

    • Write a function that accepts two DataFrames (one for Employees and one for Performance) and returns a DataFrame with the required results.
    • Use Pandas operations such as merge, groupby, and aggregation functions to compute the values.

Example (Simplified):

Given sample data, your output might look like:

department employee_count avg_monthly_sales target_achievement_rate
IT 5 250.75 0.92
Finance 3 180.50 0.88
HR 2 200.00 0.95

Additional Notes:

  • The SQL solution should utilize functions like ROUND() to format the results.
  • The Pandas solution should be encapsulated in a function that takes the two DataFrames as input and returns the resulting DataFrame.
  • This question combines multiple SQL concepts (JOINs, GROUP BY, aggregations, rounding) along with equivalent Pandas operations, making it a great practice problem for both SQL and Python skills.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant