Smart Video Capsule Endoscopy: Raw Image-Based Localization for Enhanced GI Tract Investigation

Abstract

This work introduces an ultra-low-power AI solution for Smart Video Capsule Endoscopy that addresses the severe battery-life limits of medical edge devices. It classifies organs directly on raw Bayer images with 93.06% accuracy, using a compact Convolutional Neural Network (63,000 parameters) followed by Viterbi decoding, and so skips the energy-intensive conversion to RGB. Implemented on a customized PULPissimo RISC-V System-on-Chip with a hardware accelerator, the approach requires only 5.31 $\mu$J per image and delivers average energy savings of 89.9% before the capsule enters the small intestine.

Report

Key Highlights

  • Application Focus: Enhanced Smart Video Capsule Endoscopy (VCE) for investigating the small intestine.
  • Energy Efficiency: Consumes just 5.31 $\mu$J of energy per image classification.
  • Significant Savings: Demonstrates an average energy saving of 89.9% before the capsule enters the small intestine, dramatically extending battery life.
  • AI Method: Classification is performed directly on raw Bayer images, eliminating the energy cost of converting to RGB format.
  • Performance: Achieves a final organ classification accuracy of 93.06%.
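
To put the per-image figure in context, the sketch below scales 5.31 $\mu$J to a stream of frames. Only the 5.31 $\mu$J value comes from this summary; the frame rate and duration in the usage lines are hypothetical placeholders, and the result covers classification alone, not image capture, storage, or radio transmission.

```python
# Energy per classified image, as reported in this summary.
ENERGY_PER_IMAGE_UJ = 5.31


def classification_energy_mj(n_frames: int) -> float:
    """Total classification energy in millijoules for n_frames images."""
    if n_frames < 0:
        raise ValueError("n_frames must be non-negative")
    return n_frames * ENERGY_PER_IMAGE_UJ / 1000.0  # uJ -> mJ


# Hypothetical operating point (NOT from the source): 2 frames/s for 1 hour.
ASSUMED_FPS = 2
ASSUMED_DURATION_S = 3600
n_frames = ASSUMED_FPS * ASSUMED_DURATION_S  # 7200 frames

print(f"{n_frames} frames -> {classification_energy_mj(n_frames):.1f} mJ")
```

Under those assumed values (7,200 frames), classification alone would cost about 38.2 mJ.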

Technical Details

  • Hardware Platform: A customized PULPissimo System-on-Chip (SoC).
  • Processing Core: Utilizes a standard RISC-V core.
  • Acceleration: Features an ultra-low power dedicated hardware accelerator specifically designed for AI/image classification tasks.
  • Model Size: Employs a highly compact Convolutional Neural Network (CNN) consisting of only 63,000 parameters.
  • Enhancement Technique: Uses Viterbi decoding for time-series analysis to leverage temporal context and improve classification/localization accuracy.
  • Input Data: The neural network is trained and deployed to process raw Bayer color filter array data directly, avoiding the need for demosaicing.
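
The Viterbi step can be illustrated with a minimal sketch that smooths per-frame CNN probabilities into a single organ sequence. It assumes a left-to-right model in which the capsule either stays in the current organ or advances to the next one, and that it starts in the first organ. The function name, the `p_stay` value, and the three-class example are hypothetical; this summary does not give the authors' class list or transition model.

```python
import math

NEG_INF = float("-inf")


def viterbi_decode(frame_probs, p_stay=0.99):
    """Return the most likely organ index per frame.

    frame_probs: list of per-frame class probabilities (e.g. CNN softmax),
                 with classes listed in anatomical order.
    p_stay:      assumed probability of remaining in the same organ
                 between consecutive frames (hypothetical value).
    """
    n_states = len(frame_probs[0])
    log_stay = math.log(p_stay)
    log_next = math.log(1.0 - p_stay)

    def log_emit(p):
        return math.log(max(p, 1e-12))  # guard against log(0)

    # Assumption: the capsule starts in the first organ (state 0).
    score = [NEG_INF] * n_states
    score[0] = log_emit(frame_probs[0][0])
    backptr = []

    for probs in frame_probs[1:]:
        new_score, ptr = [], []
        for s in range(n_states):
            # The last organ is absorbing, so staying there costs nothing.
            stay_cost = 0.0 if s == n_states - 1 else log_stay
            best_prev, best = s, score[s] + stay_cost
            # Only forward moves are allowed: s-1 -> s.
            if s > 0 and score[s - 1] + log_next > best:
                best_prev, best = s - 1, score[s - 1] + log_next
            new_score.append(best + log_emit(probs[s]))
            ptr.append(best_prev)
        backptr.append(ptr)
        score = new_score

    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Made-up probabilities for three classes; frame 3 flickers back to class 0.
frames = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.20, 0.70, 0.10],
    [0.60, 0.30, 0.10],
    [0.10, 0.80, 0.10],
    [0.10, 0.80, 0.10],
]
decoded = viterbi_decode(frames, p_stay=0.9)
print(decoded)
```

Because backward transitions are disallowed, an isolated misclassification such as the fourth frame here cannot pull the decoded sequence back to an earlier organ, which is the kind of temporal context the bullet above refers to.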

Implications

  • RISC-V Validation in Medical Edge: This research supports the use of RISC-V architectures in extremely power-constrained, mission-critical medical applications, including ingestible and implantable devices.
  • PULP Ecosystem Success: The successful implementation on a customized PULPissimo SoC underscores the versatility and efficiency of the open-source PULP platform, accelerating specialized embedded development within the RISC-V ecosystem.
  • Architectural Efficiency Paradigm: The work argues for an edge computing architecture in which raw sensor data (Bayer format) is processed directly by optimized low-parameter CNNs, and shows that bypassing traditional processing steps such as RGB conversion can be key to reaching microjoule-level efficiency.
  • Hardware/Software Co-design: It highlights the necessity of tightly integrating customized ultra-low power hardware accelerators alongside the RISC-V core to meet the severe energy constraints required for long-duration medical monitoring.
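
One common way to feed a raw Bayer mosaic to a CNN without demosaicing is to regroup each 2x2 tile into four half-resolution planes. The sketch below assumes an RGGB layout and uses the hypothetical helper name `split_bayer_planes`; this summary does not state the sensor's pattern or how the authors' network ingests the data, so treat it as an illustration rather than their method.

```python
def split_bayer_planes(raw):
    """Regroup an H x W Bayer mosaic into four (H/2) x (W/2) planes.

    raw: list of rows (lists of pixel values), H and W both even.
    Assumes an RGGB tile (hypothetical):  R  G1
                                          G2 B
    Returns [R, G1, G2, B], each a list of rows.
    """
    height = len(raw)
    width = len(raw[0]) if height else 0
    if height == 0 or height % 2 or width % 2:
        raise ValueError("expected a non-empty mosaic with even height and width")
    if any(len(row) != width for row in raw):
        raise ValueError("all rows must have the same width")

    def plane(row_offset, col_offset):
        # Take every second row and column starting at the given offsets.
        return [row[col_offset::2] for row in raw[row_offset::2]]

    return [plane(0, 0), plane(0, 1), plane(1, 0), plane(1, 1)]


# Tiny 4x4 mosaic with values 0..15 so the regrouping is easy to follow.
mosaic = [[r * 4 + c for c in range(4)] for r in range(4)]
r_plane, g1_plane, g2_plane, b_plane = split_bayer_planes(mosaic)
print(r_plane, b_plane)
```

The regrouping only moves pixel values around, so it avoids the interpolation arithmetic of a full demosaicing step.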