Abstract:
Surgical telementoring is an advanced telemedicine concept in which an expert surgeon guides an on-site novice at a remote location. An efficient telementoring system requires wireless transmission of high-quality surgical video at a lower bitrate and in less time. The bitrate of the surgical video can be reduced by segmenting the surgical incision region and removing the background region. The High Efficiency Video Coding (HEVC) standard has shown promising results for surgical telementoring applications, but its Rate-Distortion Optimization (RDO) search process increases computational complexity, which in turn increases encoding time. We propose a method that segments the surgical incision region using the Kernelized Correlation Filter (KCF) object tracking technique. The segmented region is then encoded with a reduced-complexity Scalable HEVC (SHVC) encoder to match the resolution of the end-user device. The complexity of SHVC is reduced by using a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network to predict the Coding Tree Unit (CTU) structure. The results show that the proposed method significantly decreases the bitrate of segmented surgical video sequences without degrading the Peak Signal-to-Noise Ratio (PSNR). These results were obtained for surgical video sequences with slow-moving objects. Furthermore, the CNN+LSTM approach reduces the encoding time of standard SHVC by 51% with negligible Rate-Distortion (RD) performance loss.
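
The background-removal and PSNR steps mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 8-bit grayscale frames stored as nested lists and a bounding box `(x, y, w, h)` of the kind a tracker such as KCF might report, and the names `mask_outside_box` and `psnr` are hypothetical. Tracking and SHVC encoding are not shown.

```python
import math

def mask_outside_box(frame, box, fill=0):
    """Keep pixels inside box (x, y, w, h); set every other pixel to `fill`.

    Assumption: `box` stands in for the region a tracker would report.
    """
    x, y, w, h = box
    return [
        [px if (x <= c < x + w and y <= r < y + h) else fill
         for c, px in enumerate(row)]
        for r, row in enumerate(frame)
    ]

def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized grayscale frames."""
    count = 0
    squared_error = 0
    for ref_row, dec_row in zip(reference, decoded):
        for a, b in zip(ref_row, dec_row):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical frames: no distortion
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 4x4 frame; the 2x2 box plays the role of the incision region.
frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
segmented = mask_outside_box(frame, (1, 1, 2, 2))
```

Setting the background to a constant value is one simple way to make it cheap to encode; the PSNR of interest would then be measured between the segmented frame and its decoded counterpart.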