Related papers: Medical Image Deidentification, Cleaning and Compression Using Pylogik

URL: http://arxiv.org/abs/2304.12322v5
Date: Wed, 10 May 2023 13:55:49 GMT
Title: Medical Image Deidentification, Cleaning and Compression Using Pylogik
Authors: Adrienne Kline, Vinesh Appadurai, Yuan Luo, Sanjiv Shah
Abstract summary: PyLogik is a Python library for cleaning and de-identifying ultrasound images, determining the region of interest (ROI), and compressing the resulting files. On a random sample of 50 echocardiograms, its output reached an average Dice coefficient of 0.976 against manual segmentations by an expert user, and processed data was on average ~72% smaller. Variants of the pipeline have also been created for use with other medical imaging data types.
Score: 4.515386176265859
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Leveraging medical record information in the era of big data and machine learning comes with the caveat that data must be cleaned and de-identified. Facilitating data sharing and harmonization for multi-center collaborations is particularly difficult when protected health information (PHI) is contained or embedded in image meta-data. We propose a novel Python library, called PyLogik, to help alleviate this issue for ultrasound images, which are particularly challenging because PHI is frequently included directly on the images. PyLogik processes the image volumes through a series of text detection/extraction, filtering, thresholding, morphological, and contour comparison steps. This methodology de-identifies the images, reduces file sizes, and prepares image volumes for applications in deep learning and data sharing. To evaluate its effectiveness in processing ultrasound data, a random sample of 50 cardiac ultrasounds (echocardiograms) was processed through PyLogik, and the outputs were compared with manual segmentations by an expert user. The Dice coefficient between the two approaches averaged 0.976. Next, an investigation was conducted to ascertain the degree of information compression achieved using the algorithm. The resulting data was found to be on average ~72% smaller after processing by PyLogik. Our results suggest that PyLogik is a viable methodology for data cleaning and de-identification, determining ROI, and file compression, which will facilitate efficient storage, use, and dissemination of ultrasound data. Variants of the pipeline have also been created for use with other medical imaging data types.
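The two evaluation measures named in the abstract, the Dice coefficient and the percentage reduction in data size, can be illustrated with a minimal Python sketch. This is not PyLogik's code or API: the function names `dice_coefficient` and `percent_size_reduction`, the nested-list mask representation, and the toy masks and byte counts are all assumptions made for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values (a hypothetical representation
    chosen to keep the sketch dependency-free).
    """
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def percent_size_reduction(original_bytes, processed_bytes):
    """Percentage by which the processed data is smaller than the original."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (original_bytes - processed_bytes) / original_bytes


# Toy example: an automated ROI mask versus a manual one (made-up data).
automated = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
manual = [[0, 1, 1],
          [0, 1, 0],
          [0, 0, 0]]

print(round(dice_coefficient(automated, manual), 3))  # 2*3 / (4+3) -> 0.857
print(percent_size_reduction(1000, 280))              # -> 72.0
```

A Dice value of 1.0 means the automated and manual regions coincide exactly, so the reported average of 0.976 indicates near-complete overlap. The byte counts above are chosen only to reproduce a 72% reduction and do not come from the paper.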

Related papers

Efficient Curation of Invertebrate Image Datasets Using Feature Embeddings and Automatic Size Comparison [5.480305055542485]
We present a method for curating large-scale image datasets of invertebrates. Our approach is based on extracting feature embeddings with pretrained deep neural networks. Also, we show that a simple area-based size comparison approach is able to find a lot of common erroneous images.
arXiv Detail & Related papers (2024-12-20T12:35:41Z)
Efficient Medical Image Retrieval Using DenseNet and FAISS for BIRADS Classification [0.0]
We propose an approach to medical image retrieval using DenseNet and FAISS. DenseNet is well-suited for feature extraction in complex medical images. FAISS enables efficient handling of high-dimensional data in large-scale datasets.
arXiv Detail & Related papers (2024-11-03T08:14:31Z)
Speckle Noise Reduction in Ultrasound Images using Denoising Auto-encoder with Skip Connection [0.19116784879310028]
Ultrasound images often contain speckle noise which can lower their resolution and contrast-to-noise ratio. This can make it more difficult to extract, recognize, and analyze features in the images. Researchers have proposed several speckle reduction methods, but no single method takes all relevant factors into account.
arXiv Detail & Related papers (2024-03-05T08:08:59Z)
Robust Medical Image Classification from Noisy Labeled Data with Global and Local Representation Guided Co-training [73.60883490436956]
We propose a novel collaborative training paradigm with global and local representation learning for robust medical image classification. We employ the self-ensemble model with a noisy label filter to efficiently select the clean and noisy samples. We also design a novel global and local representation learning scheme to implicitly regularize the networks to utilize noisy samples.
arXiv Detail & Related papers (2022-05-10T07:50:08Z)
Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists. We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
Voice-assisted Image Labelling for Endoscopic Ultrasound Classification using Neural Networks [48.732863591145964]
We propose a multi-modal convolutional neural network architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure. Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels.
arXiv Detail & Related papers (2021-10-12T21:22:24Z)
Deep data compression for approximate ultrasonic image formation [1.0266286487433585]
In ultrasonic imaging systems, data acquisition and image formation are performed on separate computing devices. Deep neural networks are optimized to preserve the image quality of a particular image formation method.
arXiv Detail & Related papers (2020-09-04T16:43:12Z)
A DICOM Framework for Machine Learning Pipelines against Real-Time Radiology Images [50.222197963803644]
Niffler is an integrated framework that enables the execution of machine learning pipelines at research clusters. Niffler uses the Digital Imaging and Communications in Medicine (DICOM) protocol to fetch and store imaging data. We present its architecture and three of its use cases: an inferior vena cava filter detection from the images in real-time, identification of scanner utilization, and scanner clock calibration.
arXiv Detail & Related papers (2020-04-16T21:06:49Z)
Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging [7.370841471918351]
We leverage DICOM metadata from ultrasound images to help learn representations of the ultrasound image. We demonstrate that the proposed method outperforms the non-metadata based approaches across different downstream tasks.
arXiv Detail & Related papers (2020-03-20T02:17:03Z)
TorchIO: A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning [68.8204255655161]
We present TorchIO, an open-source Python library to enable efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning. TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during training of neural networks. It includes a command-line interface which allows users to apply transforms to image files without using Python.
arXiv Detail & Related papers (2020-03-09T13:36:16Z)
VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images [121.31355003451152]
The Large Scale Vertebrae Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020. We present the results of this evaluation and further investigate the performance variation at vertebra-level, scan-level, and at different fields-of-view.
arXiv Detail & Related papers (2020-01-24T21:09:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.