Medical Image Deidentification, Cleaning and Compression Using Pylogik
- URL: http://arxiv.org/abs/2304.12322v5
- Date: Wed, 10 May 2023 13:55:49 GMT
- Title: Medical Image Deidentification, Cleaning and Compression Using Pylogik
- Authors: Adrienne Kline, Vinesh Appadurai, Yuan Luo, Sanjiv Shah
- Abstract summary: PyLogik is a Python library for cleaning, de-identification, determining ROI, and file compression of medical images.
Results show that PyLogik is a viable methodology for data cleaning and de-identification, determining ROI, and file compression.
Variants of the pipeline have also been created for use with other medical imaging data types.
- Score: 4.515386176265859
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Leveraging medical record information in the era of big data and machine
learning comes with the caveat that data must be cleaned and de-identified.
Facilitating data sharing and harmonization for multi-center collaborations is
particularly difficult when protected health information (PHI) is contained or
embedded in image meta-data. We propose a novel library in the Python
framework, called PyLogik, to help alleviate this issue for ultrasound images,
which are particularly challenging because of the frequent inclusion of PHI
directly on the images. PyLogik processes the image volumes through a series of
text detection/extraction, filtering, thresholding, morphological and contour
comparisons. This methodology de-identifies the images, reduces file sizes, and
prepares image volumes for applications in deep learning and data sharing. To
evaluate its effectiveness in processing ultrasound data, a random sample of 50
cardiac ultrasounds (echocardiograms) was processed through PyLogik, and the
outputs were compared with manual segmentations by an expert user. The Dice
coefficient between the two sets of segmentations averaged 0.976. Next, an
investigation was conducted to ascertain the degree of information compression
achieved using the algorithm. Resultant data was found to be on average ~72%
smaller after processing by PyLogik. Our results suggest that PyLogik is a
viable methodology for data cleaning and de-identification, determining ROI,
and file compression, which will facilitate efficient storage, use, and
dissemination of ultrasound data. Variants of the pipeline have also been
created for use with other medical imaging data types.
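The two figures quoted in the abstract (an average Dice coefficient of 0.976 and files that are on average ~72% smaller) can be made concrete with a minimal sketch. This is not PyLogik's own code: the function names `dice_coefficient` and `size_reduction_percent`, the toy masks, and the byte counts are hypothetical and chosen only to illustrate the standard definitions, Dice = 2|A ∩ B| / (|A| + |B|) and relative reduction = (original - processed) / original.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks of identical shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def size_reduction_percent(original_bytes, processed_bytes):
    """Percentage by which the processed file is smaller than the original."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 100.0 * (original_bytes - processed_bytes) / original_bytes


# Toy data (hypothetical): a 4-pixel manual ROI and a 6-pixel automatic ROI
# that fully contains it.
manual = np.zeros((4, 4), dtype=bool)
manual[1:3, 1:3] = True
auto = np.zeros((4, 4), dtype=bool)
auto[1:3, 1:4] = True

print(dice_coefficient(manual, auto))    # 2 * 4 / (4 + 6) = 0.8
print(size_reduction_percent(100, 28))   # 72.0
```

In an evaluation like the one described, the Dice value would be computed per image volume against the expert's manual segmentation and then averaged over the sample.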
Related papers
- Efficient Medical Image Retrieval Using DenseNet and FAISS for BIRADS Classification [0.0]
We propose an approach to medical image retrieval using DenseNet and FAISS.
DenseNet is well-suited for feature extraction in complex medical images.
FAISS enables efficient handling of high-dimensional data in large-scale datasets.
arXiv Detail & Related papers (2024-11-03T08:14:31Z)
- Speckle Noise Reduction in Ultrasound Images using Denoising Auto-encoder with Skip Connection [0.19116784879310028]
Ultrasound images often contain speckle noise which can lower their resolution and contrast-to-noise ratio.
This can make it more difficult to extract, recognize, and analyze features in the images.
Researchers have proposed several speckle reduction methods, but no single method takes all relevant factors into account.
arXiv Detail & Related papers (2024-03-05T08:08:59Z)
- Robust Medical Image Classification from Noisy Labeled Data with Global and Local Representation Guided Co-training [73.60883490436956]
We propose a novel collaborative training paradigm with global and local representation learning for robust medical image classification.
We employ the self-ensemble model with a noisy label filter to efficiently select the clean and noisy samples.
We also design a novel global and local representation learning scheme to implicitly regularize the networks to utilize noisy samples.
arXiv Detail & Related papers (2022-05-10T07:50:08Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Voice-assisted Image Labelling for Endoscopic Ultrasound Classification using Neural Networks [48.732863591145964]
We propose a multi-modal convolutional neural network architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure.
Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels.
arXiv Detail & Related papers (2021-10-12T21:22:24Z)
- Deep data compression for approximate ultrasonic image formation [1.0266286487433585]
In ultrasonic imaging systems, data acquisition and image formation are performed on separate computing devices.
Deep neural networks are optimized to preserve the image quality of a particular image formation method.
arXiv Detail & Related papers (2020-09-04T16:43:12Z)
- A DICOM Framework for Machine Learning Pipelines against Real-Time Radiology Images [50.222197963803644]
Niffler is an integrated framework that enables the execution of machine learning pipelines at research clusters.
Niffler uses the Digital Imaging and Communications in Medicine (DICOM) protocol to fetch and store imaging data.
We present its architecture and three of its use cases: an inferior vena cava filter detection from the images in real-time, identification of scanner utilization, and scanner clock calibration.
arXiv Detail & Related papers (2020-04-16T21:06:49Z)
- Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging [7.370841471918351]
We leverage DICOM metadata from ultrasound images to help learn representations of the ultrasound image.
We demonstrate that the proposed method outperforms the non-metadata based approaches across different downstream tasks.
arXiv Detail & Related papers (2020-03-20T02:17:03Z)
- TorchIO: A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning [68.8204255655161]
We present TorchIO, an open-source Python library to enable efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning.
TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during training of neural networks.
It includes a command-line interface which allows users to apply transforms to image files without using Python.
arXiv Detail & Related papers (2020-03-09T13:36:16Z)
- VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images [121.31355003451152]
Large Scale Vertebrae Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020.
We present the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view.
arXiv Detail & Related papers (2020-01-24T21:09:18Z)