Master’s Capstone Project Website

ExposureShield: Free Email Exposure Scan

This capstone presents a prototype scanner that uses publicly available breach sources and safe, ethical design choices to detect possible exposure and report risk in simple language.


Author: Eric Hiheglo • Program: Cybersecurity • Date:

Free Email Exposure Scan

Enter your email to check if it appears in known public breach databases. This scan is for awareness and education.

Technical note: This scan calls /api/free-scan, which runs server-side. The HIBP (Have I Been Pwned) API key stays on the server in Vercel environment variables and is not sent to the browser.

Abstract

Data breaches, credential leaks, and the resale of stolen data on illegal marketplaces frequently affect individuals. Large organizations can purchase advanced monitoring tools, but many people lack access to simple, affordable exposure detection. This capstone introduces ExposureShield, a prototype scanner that detects, classifies, and reports personal exposure using publicly available breach sources. The system generates a risk level and a clear report that helps a user understand what happened and what actions to take.

The prototype integrates data acquisition, preprocessing and normalization, a classification and prioritization layer, and a reporting workflow. Evaluation uses precision, recall, and F1-score together with a usability review, with an emphasis on privacy-first and ethical design.

1. Introduction

This section explains the problem, why it matters, and what the project delivers.

1.1 Background

Data breaches expose emails, passwords, and personal information. Attackers can resell stolen data and use it for account takeover, identity theft, phishing, and fraud.

1.2 Problem Statement

Most individuals do not have access to a reliable and ethical tool that detects and explains exposure using clear language, without requiring enterprise resources.

1.3 Purpose

The purpose of this project is to design and evaluate a prototype that can detect, classify, and report individual exposure in a reliable and ethical manner.

Research Question & Objectives

Research Question

How can a prototype scanner, built on publicly available breach and dark-web sources, effectively detect, classify, and report individual exposure in a reliable and ethical manner?

Objectives

Collect and normalize exposure-related data from public breach sources and controlled datasets.
Classify exposure content and prioritize severity to reduce false alerts.
Generate a clear report with recommended actions for the user.
Evaluate performance using precision, recall, and F1-score, plus usability review.

2. Literature Review

This section summarizes the research foundation that informed the system design. It covers ethical data handling, text classification in cybersecurity, risk scoring, and limits in existing monitoring solutions.

Topic Area	What the research shows	How ExposureShield uses it
Ethical research	Ethics and privacy are critical when analyzing breach or illicit ecosystem data.	Minimize collection, avoid harmful data handling, and protect keys server-side.
ML text classification	Text models can detect threat-related patterns and reduce manual review.	Use classification concepts to support reporting and prioritization.
Risk scoring	Prioritization reduces alert fatigue and improves practical response.	Assign simple risk levels and provide clear actions.
Existing tools	Many tools detect exposure but do not explain results clearly for users.	Focus on simple reporting and usability.

Update note: Replace this summary with short, specific statements from your final peer-reviewed sources and cite them in your capstone paper.

3. Methodology

This section describes the research design, data sources, processing steps, and evaluation approach.

3.1 Research Design

Applied research using a build-and-evaluate approach
Prototype development with controlled testing
Quantitative metrics plus a usability review

3.2 Data and Processing

Public breach data sources (permitted use)
Normalization and de-duplication
Text preprocessing for classification and reporting
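
The normalization and de-duplication step can be pictured with a minimal sketch. The record shape (`BreachRecord`), its field names, and the choice to lowercase the whole address are assumptions made for illustration; the prototype's actual data model is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreachRecord:
    """Hypothetical record shape; the field names are illustrative only."""
    email: str
    breach_name: str


def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase the address (a simplifying assumption)."""
    return raw.strip().lower()


def normalize_and_dedupe(records: list[BreachRecord]) -> list[BreachRecord]:
    """Normalize each record, then drop repeats while keeping first-seen order."""
    seen: set[tuple[str, str]] = set()
    unique: list[BreachRecord] = []
    for record in records:
        cleaned = BreachRecord(
            email=normalize_email(record.email),
            breach_name=record.breach_name.strip(),
        )
        # Compare breach names case-insensitively so "Adobe" and "adobe" collapse.
        key = (cleaned.email, cleaned.breach_name.casefold())
        if key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique
```

Keeping the first-seen record preserves the source's original spelling of the breach name, which may read better in a user-facing report.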

Evaluation Plan

3.3 Metrics

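Precision: the share of reported exposures that are real; higher precision means fewer false positives and incorrect alerts
Recall: the share of real exposures that the scanner detects; higher recall means fewer missed exposures
F1-score: the harmonic mean of precision and recall, a single balanced measure of performance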
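
As a rough illustration of how these three metrics relate, the sketch below computes them from counts of true positives, false positives, and false negatives. The function name and the convention of returning 0.0 when a denominator is zero are assumptions, and any counts passed in are hypothetical rather than results from this project.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts.

    tp: real exposures that were flagged
    fp: flagged items that were not real exposures
    fn: real exposures that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is often reported alongside the other two rather than in place of them.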

3.4 Usability

Clarity of results and risk explanation
Actionability of the recommendations
Time to understand the outcome

4. Implementation

This section explains how the prototype is built and how data moves through the system.

4.1 High-Level Architecture

Frontend: user enters email and views results
Backend API: server endpoint calls breach source API securely
Risk layer: simple risk scoring and reporting
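
One way a simple risk layer could work is sketched below. The thresholds, the set of "sensitive" data classes, and the action text are all assumptions chosen for illustration, not the prototype's actual rules. The sketch assumes each breach is a dictionary with a `DataClasses` list, as in the breach model returned by the HIBP API.

```python
# Illustrative choice of data classes treated as high impact (an assumption).
SENSITIVE_CLASSES = {"passwords", "credit cards", "social security numbers"}

# Hypothetical guidance shown with each risk level.
RECOMMENDED_ACTIONS = {
    "Low": ["Keep using unique passwords and enable multi-factor authentication."],
    "Medium": ["Change the password for the affected accounts.",
               "Watch for phishing emails that mention the breached service."],
    "High": ["Change reused passwords immediately.",
             "Enable multi-factor authentication on important accounts.",
             "Review financial statements for unfamiliar activity."],
}


def risk_level(breaches: list[dict]) -> str:
    """Map a list of breach records to "Low", "Medium", or "High"."""
    if not breaches:
        return "Low"
    exposes_sensitive = any(
        data_class.lower() in SENSITIVE_CLASSES
        for breach in breaches
        for data_class in breach.get("DataClasses", [])
    )
    # Assumed rule: sensitive data or three or more breaches is high risk.
    if exposes_sensitive or len(breaches) >= 3:
        return "High"
    return "Medium"


def build_report(breaches: list[dict]) -> dict:
    """Bundle the risk level with matching actions for display."""
    level = risk_level(breaches)
    return {"risk": level, "actions": RECOMMENDED_ACTIONS[level]}
```

A rule-based layer like this is easy to explain to users, at the cost of ignoring factors such as breach age.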

4.2 Data Flow

User submits an email.
Frontend calls /api/free-scan.
Server queries the breach source API and returns results.
Frontend shows risk level, breaches, and recommended actions.
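
The server-side part of this flow might look roughly like the following sketch. It is written in Python for illustration only and is not the deployed handler. The names `free_scan` and `fetch_breaches`, the user-agent string, and the simple email pattern are assumptions. The request itself follows the HIBP v3 `breachedaccount` endpoint, which expects a `hibp-api-key` header and a user agent and answers 404 when an account has no known breaches. The fetch function is passed in so the logic can be exercised without network access.

```python
import json
import re
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable

HIBP_URL = ("https://haveibeenpwned.com/api/v3/breachedaccount/"
            "{account}?truncateResponse=false")
# Deliberately loose pattern: something@something.something (an assumption).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def fetch_breaches(email: str, api_key: str) -> list[dict]:
    """Query HIBP for one account; an HTTP 404 means no known breaches."""
    url = HIBP_URL.format(account=urllib.parse.quote(email))
    request = urllib.request.Request(
        url,
        headers={"hibp-api-key": api_key,
                 "user-agent": "ExposureShield-capstone"},  # hypothetical name
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return []
        raise


def free_scan(email: str, api_key: str,
              fetch: Callable[[str, str], list[dict]] = fetch_breaches) -> dict:
    """Validate the email, look it up, and return a JSON-ready result."""
    cleaned = email.strip().lower()
    if not EMAIL_RE.match(cleaned):
        return {"status": 400, "error": "Invalid email address"}
    breaches = fetch(cleaned, api_key)
    return {
        "status": 200,
        "email": cleaned,
        "breaches": [breach.get("Name", "") for breach in breaches],
    }
```

In a deployment, the key would be read on the server from an environment variable (for example a hypothetical `HIBP_API_KEY`) and passed to `free_scan`, so it never appears in frontend code.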

5. Evaluation & Results

Replace the placeholders below with your final results from testing.

Metric	Value	Meaning
Precision	—	Lower false positives
Recall	—	Fewer missed exposures
F1-score	—	Overall balance

Discussion

What the prototype detects well
Common false positives and why they happen
What improved clarity for users
Limitations and future improvements

6. Ethics, Privacy, and Compliance

ExposureShield is designed to reduce harm and protect users. The scan endpoint keeps API keys server-side and focuses on awareness and protective guidance.

Minimal data retention
No illegal content collection
Server-side API key protection
Clear user guidance and safe reporting
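
Minimal data retention can be supported in code by never writing raw email addresses to logs. The sketch below shows two common options, a keyed hash for correlating events and a masked form for display. Both helpers and the idea that the prototype logs anything at all are assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymize_email(email: str, secret: bytes) -> str:
    """Return a keyed SHA-256 digest so logs can correlate scans
    without storing the address itself. `secret` stays on the server."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep only the first character of the local part for display."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"
```

A keyed hash is preferable to a plain hash here because email addresses are guessable, and an unkeyed digest could be reversed by hashing candidate addresses.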


Appendix

Appendix A: System Architecture Diagram
Appendix B: Model Configuration Parameters
Appendix C: Sample Report Output