31 – Real-World Python Projects – AI Resume Screener

Focus Areas: NLP, Machine Learning, Automation
Difficulty: Intermediate–Advanced
Real-World Use: HR teams receive hundreds of resumes. An AI Resume Screener automatically parses resumes, extracts skills, ranks candidates, and generates a shortlist.


🌟 What You Will Build

A Python program that:

  1. Reads multiple resumes (PDF/DOCX/TXT)
  2. Extracts candidate information
  3. Extracts skills using NLP
  4. Matches resume skills with job description skills
  5. Calculates a match score (0–100%)
  6. Sorts and displays best candidates
  7. Exports results to CSV

🧠 Tech Stack

  • python-docx (for DOCX reading)
  • PyPDF2 (for PDF reading)
  • spaCy (NLP for skill extraction)
  • pandas
  • re (regex for cleaning)
  • os

📁 Folder Structure

AI_Resume_Screener/
│── resumes/
│     ├── resume1.pdf
│     ├── resume2.docx
│── job_description.txt
│── screener.py
│── skills_library.txt

📘 skills_library.txt (example)

These skills will be matched:

python
machine learning
excel
power bi
communication
sql
customer service
java
react
salesforce
data analysis

You can add more, one skill per line.


📄 job_description.txt (example)

We are hiring a Senior Customer Service Representative.
Strong communication, client handling, CRM knowledge, and problem-solving required.
Experience with Excel and email support preferred.

🧠 Core Python Program: screener.py

import os
import re
import PyPDF2
import docx
import spacy
import pandas as pd

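# Load NLP model (requires: python -m spacy download en_core_web_sm)
# Note: the basic keyword matcher below does not call nlp yet; it is kept for the NER improvement
nlp = spacy.load("en_core_web_sm")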

# Load skills library (one skill per line; blank lines are skipped)
with open("skills_library.txt", "r", encoding="utf-8") as f:
    SKILL_LIST = [line.strip().lower() for line in f if line.strip()]

# Clean text
def clean_text(text):
    text = text.lower()
    text = re.sub(r'[^a-zA-Z0-9\s]', ' ', text)
    text = re.sub(r'\s+', ' ', text)
    return text

# Extract text from PDF
def read_pdf(path):
    text = ""
    with open(path, "rb") as pdf:
        reader = PyPDF2.PdfReader(pdf)
        for page in reader.pages:
            # Guard against pages with no extractable text (e.g. scanned images)
            text += (page.extract_text() or "") + " "
    return text

# Extract text from DOCX
def read_docx(path):
    doc = docx.Document(path)
    return " ".join([para.text for para in doc.paragraphs])

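# Read any resume (returns "" for unsupported file types)
def extract_text(path):
    ext = os.path.splitext(path)[1].lower()  # case-insensitive: .PDF, .Docx, ...
    if ext == ".pdf":
        return read_pdf(path)
    elif ext == ".docx":
        return read_docx(path)
    elif ext == ".txt":
        # Use a context manager so the file handle is always closed
        with open(path, "r", encoding="utf-8", errors="ignore") as f:
            return f.read()
    return ""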

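# Extract skills from resume
def extract_skills(text):
    # Match whole phrases so multi-word skills ("machine learning") are found;
    # a set of single words can never contain a two-word skill.
    cleaned = " " + clean_text(text).strip() + " "
    return [skill for skill in SKILL_LIST if skill and " " + skill + " " in cleaned]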

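# Calculate match score: percentage of job-description skills found in the resume
def calculate_match(resume_skills, jd_skills):
    required = set(jd_skills)
    if not required:  # no library skill appears in the job description
        return 0.0
    score = len(set(resume_skills) & required) / len(required) * 100
    return round(score, 2)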

# Load job description skills (same skills-library lookup as for resumes)
def get_jd_skills():
    with open("job_description.txt", "r", encoding="utf-8") as f:
        return extract_skills(f.read())

# Main screening function
def screen_resumes():
    jd_skills = get_jd_skills()
    results = []

    for file in os.listdir("resumes"):
        path = os.path.join("resumes", file)
        text = extract_text(path)
        skills = extract_skills(text)
        score = calculate_match(skills, jd_skills)

        results.append({
            "Resume Name": file,
            "Skills Found": ", ".join(skills),
            "Match Score": score
        })

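    # Naming the columns keeps the sort from failing when the resumes folder is empty
    df = pd.DataFrame(results, columns=["Resume Name", "Skills Found", "Match Score"])
    df = df.sort_values(by="Match Score", ascending=False)
    df.to_csv("screening_results.csv", index=False)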

    print(df)
    print("\nResults saved to screening_results.csv")

# Run only when executed as a script, not when imported
if __name__ == "__main__":
    screen_resumes()
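
To make the match score concrete, here is a small standalone sketch of the same formula: the share of job-description skills that also appear in the resume, as a percentage. The function name match_score and the sample skill lists are made up for illustration.

```python
def match_score(resume_skills, jd_skills):
    """Percentage of job-description skills that also appear in the resume."""
    required = set(jd_skills)
    if not required:  # nothing to match against
        return 0.0
    return round(len(set(resume_skills) & required) / len(required) * 100, 2)


# Illustrative data, not taken from a real resume
jd_skills = ["communication", "excel", "customer service"]
resume_skills = ["excel", "python", "communication"]

# 2 of the 3 required skills are present
print(match_score(resume_skills, jd_skills))  # 66.67
```

Extra skills on the resume (python here) do not raise the score; only coverage of the job description counts.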

🎯 Output Example

Resume          Skills Found                 Match Score
resume3.pdf     communication, excel, crm    85
resume1.docx    customer service, excel      70
resume2.pdf     python                       10

🏆 Real-World Improvements

Add these once the basic version works:

✔ Resume Ranking Model

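Score each resume against the full job description text with TF-IDF + cosine similarity, so ranking no longer depends only on exact skill-list matches.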
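
A minimal sketch of this idea using only the standard library, so it runs without extra installs. The helper names (tokenize, tfidf_vectors, rank_resumes) and the sample texts are assumptions for illustration. The IDF here is the smoothed form ln((1 + N) / (1 + df)) + 1; in a real project you would more likely reach for scikit-learn's TfidfVectorizer and cosine_similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document (raw count x smoothed IDF)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_description, resumes):
    """resumes is {name: text}; returns [(name, similarity)] sorted best first."""
    names = list(resumes)
    vectors = tfidf_vectors([job_description] + [resumes[n] for n in names])
    jd_vec, resume_vecs = vectors[0], vectors[1:]
    scored = [(n, cosine_similarity(jd_vec, v)) for n, v in zip(names, resume_vecs)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative texts, not real resumes
jd = "Customer service representative with strong communication and Excel skills"
resumes = {
    "a.txt": "Five years of customer service, daily Excel reporting, clear communication",
    "b.txt": "Backend developer experienced in Java, SQL, React",
}
ranking = rank_resumes(jd, resumes)
for name, similarity in ranking:
    print(name, round(similarity, 3))
```

Multiplying the similarity by 100 gives a number that can sit in the same Match Score column as the skill-based score.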

✔ Named Entity Recognition

Extract:

  • Name
  • Email
  • Phone
  • Experience years
  • Education
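
Some of these fields can be pulled out with plain regular expressions before any NER model is involved. The sketch below covers email, phone, and years of experience; the patterns are deliberately loose assumptions (the phone pattern can also match things like date ranges) and extract_contact_info is a hypothetical helper name. Names and education are harder and are a better fit for spaCy's entity labels, which this sketch does not attempt. Run it on the raw resume text, not on the output of clean_text, because cleaning strips the @ and + characters.

```python
import re

# Loose patterns for illustration; tune them for your own resume formats
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
YEARS_RE = re.compile(r"(\d{1,2})\+?\s*(?:years|yrs)", re.IGNORECASE)


def extract_contact_info(text):
    """Return the first email and phone found, plus the largest 'N years' mention."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    years = [int(y) for y in YEARS_RE.findall(text)]
    return {
        "Email": email.group(0) if email else None,
        "Phone": phone.group(0) if phone else None,
        "Experience Years": max(years) if years else None,
    }


# Made-up sample line
sample = "Jane Doe | jane.doe@example.com | +1 (555) 010-2030 | 6 years in customer support"
info = extract_contact_info(sample)
print(info)
```

Each key can be added as an extra column in the results dictionary inside screen_resumes.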

✔ Streamlit Web App

Upload resume → Get score instantly.
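
A rough sketch of what such a page could look like, assuming Streamlit is installed (pip install streamlit) and the file is started with streamlit run. To stay short it accepts only .txt uploads and uses a hard-coded skill list; score_text and render_page are hypothetical names, and PDF/DOCX uploads would need the readers from screener.py.

```python
import re

try:
    import streamlit as st  # third-party: pip install streamlit
except ImportError:
    st = None  # lets score_text be imported and tested without Streamlit


def score_text(resume_text, jd_skills):
    """Return (skills found, score in percent) using whole-phrase matching."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", resume_text.lower())
    padded = " " + " ".join(cleaned.split()) + " "
    found = [skill for skill in jd_skills if f" {skill} " in padded]
    score = round(len(found) / len(jd_skills) * 100, 2) if jd_skills else 0.0
    return found, score


def render_page():
    """Draw the upload widget and show the score for the uploaded resume."""
    st.title("AI Resume Screener")
    # Assumed skill list; in practice load it from the job description
    jd_skills = ["communication", "excel", "customer service"]
    uploaded = st.file_uploader("Upload a resume (.txt)", type=["txt"])
    if uploaded is not None:
        text = uploaded.read().decode("utf-8", errors="ignore")
        found, score = score_text(text, jd_skills)
        st.metric("Match Score", f"{score}%")
        st.write("Skills found:", ", ".join(found) or "none")


# Streamlit re-runs the script from top to bottom on every interaction
if st is not None:
    render_page()
```

Save it as, say, app.py and start it with: streamlit run app.py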

✔ HR Dashboard

Graphs for:

  • skill distribution
  • top candidates
  • missing skill analysis
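
Before drawing any graphs, the three views above reduce to a few small aggregations. The sketch below computes them with the standard library; the function names and the sample candidates are made up for illustration. The resulting counts could then be handed to pandas or matplotlib for plotting.

```python
from collections import Counter


def skill_distribution(candidates):
    """candidates is {resume name: list of skills}; counts resumes per skill."""
    return Counter(skill for skills in candidates.values() for skill in set(skills))


def missing_skills(candidates, jd_skills):
    """For each resume, the required skills it does not mention (in JD order)."""
    return {
        name: [skill for skill in jd_skills if skill not in skills]
        for name, skills in candidates.items()
    }


def top_candidates(candidates, jd_skills, n=3):
    """Resume names ranked by how many required skills they cover, best first."""
    required = set(jd_skills)
    coverage = {name: len(set(skills) & required) for name, skills in candidates.items()}
    return sorted(coverage, key=coverage.get, reverse=True)[:n]


# Made-up screening output
candidates = {
    "resume1.pdf": ["excel", "communication"],
    "resume2.docx": ["python", "sql"],
    "resume3.pdf": ["communication", "customer service", "excel"],
}
jd_skills = ["communication", "excel", "customer service"]

print(skill_distribution(candidates).most_common())
print(missing_skills(candidates, jd_skills))
print(top_candidates(candidates, jd_skills, n=2))
```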

✔ Multi-Job Screening

Screen the same pool of resumes against many job descriptions in one run (for example, 50 different roles).
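
One way to structure this, sketched under the assumption that skills have already been extracted for every resume and every job description: keep one skill list per role and score each resume against each role. The names coverage and screen_for_all_jobs, the role titles, and the data are hypothetical.

```python
def coverage(resume_skills, required_skills):
    """Percentage of a role's required skills that the resume covers."""
    required = set(required_skills)
    if not required:
        return 0.0
    return round(len(set(resume_skills) & required) / len(required) * 100, 2)


def screen_for_all_jobs(resumes, jobs):
    """resumes is {name: skills}, jobs is {role: required skills}.

    Returns {role: [(resume name, score), ...]} with the best match first.
    """
    return {
        role: sorted(
            ((name, coverage(skills, required)) for name, skills in resumes.items()),
            key=lambda pair: pair[1],
            reverse=True,
        )
        for role, required in jobs.items()
    }


# Made-up roles and resumes
jobs = {
    "Customer Service Rep": ["communication", "excel", "customer service"],
    "Data Analyst": ["python", "sql", "excel", "data analysis"],
}
resumes = {
    "resume1.pdf": ["excel", "communication"],
    "resume2.docx": ["python", "sql"],
}

shortlists = screen_for_all_jobs(resumes, jobs)
for role, ranked in shortlists.items():
    print(role, ranked)
```

Resume text is parsed once and reused for every role, so adding more roles costs only the cheap set comparisons.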

