real-time system that provides live instructor feedback in AR when spikes in negative student emotion are detected. implemented in 3 weeks (really just 1) for a VR special topics course

speculative project - only viable, of course, if headsets become super lightweight. it could also run on simple glasses that display text

inspired by my research experience with AR/VR applications that augment everyday life

system overview

  • calibrate a baseline for the class's average emotion levels at the start of lecture
    • YOLO-face, finetuned on a face dataset, detects faces in the camera feed
    • detected faces are passed one by one to an emotion classification convolutional neural network (CNN) - only negatively classified emotions are tracked (detection/classification sketched after this list)
    • spikes in negative emotion are detected via standard deviation over a rolling window of emotion classifications (spike rule sketched below)
  • concurrently, Whisper runs live speech-to-text transcription, storing timestamped transcript chunks via Pandas dataframes
    • when an emotion spike is detected, we retrieve the past five minutes of lecture content and query an OpenAI model for likely student questions (transcript/query sketch below)
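a minimal sketch of the per-frame detection/classification loop. the checkpoint paths, the 48x48 crop size, and the label set are all placeholder assumptions, not the actual implementation:

```python
import cv2
import torch
from ultralytics import YOLO

# assumed FER-style label set; the classifier's head order is a placeholder
LABELS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
NEGATIVE = {"angry", "disgust", "fear", "sad"}

detector = YOLO("yolo-face.pt")  # finetuned face detector (placeholder path)
# placeholder path; assumes the whole emotion CNN module was pickled
classifier = torch.load("emotion_cnn.pt", weights_only=False)
classifier.eval()

def negative_fraction(frame) -> float:
    """Fraction of detected faces in a frame classified as a negative emotion."""
    boxes = detector(frame, verbose=False)[0].boxes.xyxy.int().tolist()
    faces, negatives = 0, 0
    for x1, y1, x2, y2 in boxes:
        crop = cv2.resize(frame[y1:y2, x1:x2], (48, 48))  # 48x48 input is an assumption
        x = torch.from_numpy(crop).permute(2, 0, 1).float()[None] / 255
        with torch.no_grad():
            label = LABELS[classifier(x).argmax().item()]
        faces += 1
        negatives += label in NEGATIVE
    return negatives / faces if faces else 0.0
```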
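and a sketch of the calibration + spike rule. the window length, sampling rate, and sigma threshold are assumptions; the sample fed in would be something like the per-frame negative fraction above:

```python
from collections import deque
import statistics

WINDOW = 60   # samples in the rolling window (assumed: one sample per second)
SIGMAS = 2.0  # how far above the baseline mean counts as a spike (assumed)

history = deque(maxlen=WINDOW)

def is_spike(sample: float) -> bool:
    """Flag a negative-emotion sample that sits well above the rolling baseline."""
    if len(history) < WINDOW:  # still calibrating at the start of lecture
        history.append(sample)
        return False
    mean = statistics.mean(history)
    std = statistics.stdev(history) or 1e-6  # guard against flat windows
    history.append(sample)
    return sample > mean + SIGMAS * std
```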
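finally, a sketch of the transcript store and the five-minute retrieval/query step. how audio gets chunked for "live" Whisper, the model names, and the prompt wording are all assumptions:

```python
import time
import pandas as pd
import whisper
from openai import OpenAI

asr = whisper.load_model("base")  # Whisper model size is an assumption
client = OpenAI()                 # expects OPENAI_API_KEY in the environment
rows = []                         # accumulating timestamped transcript chunks

def transcribe_chunk(audio_path: str) -> None:
    """Transcribe one audio chunk and store it with a wall-clock timestamp."""
    text = asr.transcribe(audio_path)["text"]
    rows.append({"timestamp": time.time(), "text": text})

def likely_questions() -> str:
    """On an emotion spike, ask an OpenAI model what students may be confused about."""
    df = pd.DataFrame(rows, columns=["timestamp", "text"])
    recent = df[df["timestamp"] > time.time() - 300]  # past five minutes
    context = " ".join(recent["text"])
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[{
            "role": "user",
            "content": "Students seem confused by this lecture excerpt. "
                       "List the questions they most likely have:\n\n" + context,
        }],
    )
    return response.choices[0].message.content
```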

open issues

  • students entering or leaving mid-lecture would shift the calibrated baseline
  • correcting past erroneous transcriptions to avoid accumulating bad context