Medical Dialogue Summarization Using Linear Support Vector Classification Technique

Published in CLEF 2023: Conference and Labs of the Evaluation Forum, 2023

This paper explores medical dialogue summarization using a linear support vector classification technique. The study focuses on accurately identifying topics within doctor-patient conversations to streamline clinical documentation. A machine learning pipeline classifies snippets of medical dialogue into predefined section headers such as “Assessment,” “Diagnosis,” and “Past Medical History.”

Key features of the approach include:

  • Data Preprocessing: Removing digits, punctuation, and stopwords to produce clean text input.
  • Modeling: Training a Linear Support Vector Classifier (SVC) with random oversampling to address class imbalance.
  • Validation: Evaluating the classifier on unseen test data to show its practical use in organizing and summarizing clinical notes.
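
The steps above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual code: the tiny stopword list, the lowercasing step, the TF-IDF features, and the helper names `preprocess`, `random_oversample`, and `train_classifier` are all assumptions made for the example. Only `train_classifier` needs scikit-learn; the other two helpers use the standard library alone.

```python
import random
import re
import string
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the list used in the paper).
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "with", "has", "was"}

# Map every punctuation character to a space so adjacent words stay separated.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Lowercase (assumed step), then drop digits, punctuation, and stopwords."""
    text = re.sub(r"\d+", " ", text.lower())
    text = text.translate(_PUNCT_TABLE)
    return " ".join(tok for tok in text.split() if tok not in STOPWORDS)


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == label]
        extra = rng.choices(pool, k=target - count)  # empty when already at target
        out_texts.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_texts, out_labels


def train_classifier(texts, labels):
    """Fit TF-IDF features (assumed) plus a linear SVC on oversampled training data.

    Requires scikit-learn, imported here so the helpers above work without it.
    """
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    clean = [preprocess(t) for t in texts]
    X, y = random_oversample(clean, labels)
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(X, y)
    return model  # call model.predict([preprocess(snippet)]) on new dialogue
```

Oversampling is applied only to the training split here, so that duplicated snippets cannot leak into the unseen test data used for validation.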

The research was presented at CLEF 2023 in Thessaloniki, Greece.

Recommended citation: Dhanya Krishnan, Divya Srinivasan, and Kavitha Srinivasan. (2023). "Medical Dialogue Summarization Using Linear Support Vector Classification Technique." CLEF 2023: Conference and Labs of the Evaluation Forum.