Catch-You/CatchYou-AI

Criminal Composite Sketch Generation Framework Using Generative Models

This project introduces a framework that uses generative AI models to create composite sketches reflecting both concrete and abstract witness descriptions.

Introduction

Despite the widespread use of CCTV systems, composite sketches remain crucial evidence in criminal investigations, especially in cases where CCTV footage is unavailable or unclear. Traditional composite sketch ("montage") creation methods in Korea rely heavily on the "recall" principle: witnesses describe specific facial features, which sketch artists then combine.

Our project addresses the limitations of traditional methods by developing a framework that:

  • Automatically generates sketches from textual descriptions
  • Captures both concrete facial features and abstract impressions
  • Reduces creation time to under 30 seconds
  • Allows for partial text modifications to refine images

Check out our paper/report

Features

  • Text-to-Image Generation: Convert witness descriptions directly into composite sketches
  • Comprehensive Description Processing: Handle both specific features ("round face", "thick eyebrows") and abstract impressions ("looks trustworthy", "appears meticulous")
  • Rapid Generation: Create sketches in under 30 seconds
  • Interactive Refinement: Modify specific parts of the description to update corresponding features in the image
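
One way to picture the description handling and interactive refinement listed above is to keep the witness description as structured fields, so that a single feature can be edited and the prompt rebuilt. The following is a minimal sketch under that assumption, not the project's implementation: the `WitnessDescription` class, its field names, and the "thin eyebrows" edit are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WitnessDescription:
    """Hypothetical container for a witness description (not the project's API)."""
    concrete: dict        # named facial features, e.g. {"face": "round face"}
    abstract: tuple = ()  # overall impressions, e.g. ("looks trustworthy",)

    def to_prompt(self) -> str:
        # Concrete features first, then abstract impressions, comma-separated.
        return ", ".join(list(self.concrete.values()) + list(self.abstract))

    def with_feature(self, name: str, text: str) -> "WitnessDescription":
        # Return a copy with one concrete feature changed; everything else is kept.
        return replace(self, concrete={**self.concrete, name: text})


original = WitnessDescription(
    concrete={"face": "round face", "eyebrows": "thick eyebrows"},
    abstract=("looks trustworthy",),
)
revised = original.with_feature("eyebrows", "thin eyebrows")

print(original.to_prompt())
print(revised.to_prompt())
```

Only the edited feature differs between the two prompts, which is the property a partial-modification workflow depends on; how the image model turns that text change into a localized image change is outside this sketch.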

Generative Models

We developed and compared two generative models:

  1. DALL-E with VQ-GAN:

    • Uses klue/roberta-large as the text encoder for Korean language support
    • Employs VQ-GAN as the image decoder for improved detail rendering

  2. Stable Diffusion with LoRA:

    • Fine-tunes the pre-trained Korean Stable Diffusion model (my-korean-stable-diffusion-v1-5)
    • Applies Low-Rank Adaptation (LoRA) so that training fits within limited computing resources
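
To illustrate why LoRA helps under limited computing resources, the sketch below shows the general idea in plain Python: the pre-trained weight W stays frozen, and only two small matrices A (r × d_in) and B (d_out × r) are trained, giving an effective weight W + (alpha / r)·B·A. The layer sizes, rank, and alpha are illustrative assumptions, not values taken from this project, and the helper functions are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    scale = alpha / len(a)
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Trainable parameter counts for one layer: (full fine-tuning, LoRA)."""
    return d_out * d_in, r * (d_in + d_out)


# Tiny illustrative layer (sizes are assumptions, chosen to keep the demo small).
d_out, d_in, r, alpha = 6, 4, 2, 4.0
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
b = [[0.0] * r for _ in range(d_out)]                               # trainable, zero-initialized

# With B initialized to zero, the adapted layer starts out identical to the base layer.
w_eff = lora_effective_weight(w, a, b, alpha)

# Parameter counts for a hypothetical 768 x 768 projection with rank 4.
full, lora = trainable_params(768, 768, 4)
print(full, lora)
```

For the hypothetical 768 × 768 layer, the low-rank update trains roughly 1% of the parameters that full fine-tuning would, which is the source of the memory savings.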

Model Comparison

Model             TIFA Score   User Rating
Stable Diffusion  0.50/1       3.17/5
DALL-E            0.53/1       2.96/5

We selected Stable Diffusion as our primary model. DALL-E scored slightly higher on TIFA (0.53 vs. 0.50), but Stable Diffusion received the higher user rating (3.17 vs. 2.96), particularly for handling the nuances of montage creation.
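
For readers unfamiliar with the two metrics: TIFA is generally described as the fraction of questions, derived from the text prompt, that a visual question answering (VQA) model answers correctly when shown the generated image, and the user rating is an average of human scores. The sketch below shows only that final arithmetic. The question generation and VQA steps are assumed to have happened elsewhere, and all answers and ratings in it are made-up illustrations, not project data.

```python
from statistics import mean


def tifa_style_score(vqa_answers, expected_answers):
    """Fraction of prompt-derived questions the VQA model answered as expected."""
    if len(vqa_answers) != len(expected_answers):
        raise ValueError("answer lists must have the same length")
    if not expected_answers:
        raise ValueError("at least one question is required")
    correct = sum(
        got.strip().lower() == want.strip().lower()
        for got, want in zip(vqa_answers, expected_answers)
    )
    return correct / len(expected_answers)


# Hypothetical answers for one generated sketch (not project data).
expected = ["round", "yes", "thick", "no"]
answered = ["round", "yes", "thin", "no"]
score = tifa_style_score(answered, expected)  # 3 of 4 answers match

# Hypothetical 1-5 ratings from four evaluators (not project data).
ratings = [3, 4, 3, 2]
average_rating = mean(ratings)

print(score, average_rating)
```

Because the two numbers measure different things (automatic text-image faithfulness versus human judgment), they can rank models differently, as they do in the table above.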

⚠️ Limitations

Current limitations include:

  • Occasional image distortions characteristic of generative models
  • Incomplete representation of very lengthy descriptions
  • Imperfect image updates in response to text modifications

Team

  • Chaeyeon Yang (Front-end)
  • Janghyeon Roh (AI engineering)
  • Mir An (AI engineering)
  • Seungyeon An (Back-end)
  • Sujin Hwang (AI engineering)

Department of Data Technology, Convergence Software School, Myongji University
