Fugu-MT Paper Translation (Summary): Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models

Paper summary: Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models

arxiv url: http://arxiv.org/abs/2401.07957v1
Date: Mon, 15 Jan 2024 20:47:24 GMT
Status: Translation complete
Last updated in system: 2024-01-17 15:46:41.635928
Title: Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models
Title (reference translation): Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models
Authors: Dan Jacobellis, Daniel Cummings, Neeraja J. Yadwadkar
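Abstract summary: We evaluate how different approaches to lossy compression affect machine perception tasks. With generative compression, it is feasible to use highly compressed data while incurring only a negligible impact on machine perceptual quality. Pre-training on lossy compressed datasets can lead to counter-intuitive scenarios in which lossy compression improves machine perceptual quality rather than degrading it.
Reference score (independently computed attention score): 1.2584276673531931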
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the field of neural data compression, the prevailing focus has been on optimizing algorithms for either classical distortion metrics, such as PSNR or SSIM, or human perceptual quality. With increasing amounts of data consumed by machines rather than humans, a new paradigm of machine-oriented compression–which prioritizes the retention of features salient for machine perception over traditional human-centric criteria–has emerged, creating several new challenges to the development, evaluation, and deployment of systems utilizing lossy compression. In particular, it is unclear how different approaches to lossy compression will affect the performance of downstream machine perception tasks. To address this under-explored area, we evaluate various perception models–including image classification, image segmentation, speech recognition, and music source separation–under severe lossy compression. We utilize several popular codecs spanning conventional, neural, and generative compression architectures. Our results indicate three key findings: (1) using generative compression, it is feasible to leverage highly compressed data while incurring a negligible impact on machine perceptual quality; (2) machine perceptual quality correlates strongly with deep similarity metrics, indicating a crucial role of these metrics in the development of machine-oriented codecs; and (3) using lossy compressed datasets (e.g., ImageNet) for pre-training can lead to counter-intuitive scenarios where lossy compression increases machine perceptual quality rather than degrading it. To encourage engagement on this growing area of research, our code and experiments are available at: https://github.com/danjacobellis/MPQ.
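Abstract (reference translation): In the field of neural data compression, the focus has been on optimizing algorithms for classical distortion metrics such as PSNR and SSIM, or on human perceptual quality. As more data is consumed by machines rather than humans, a new paradigm of machine-oriented compression has emerged, one that prioritizes retaining features salient for machine perception over traditional human-centric criteria, and it creates several new challenges for the development, evaluation, and deployment of systems that use lossy compression. In particular, it is unclear how different approaches to lossy compression affect the performance of downstream machine perception tasks. To address this under-explored area, we evaluate a variety of perception models, including image classification, image segmentation, speech recognition, and music source separation, under severe lossy compression. We use several popular codecs spanning conventional, neural, and generative compression architectures. Our results indicate three key findings: (1) using generative compression, it is feasible to leverage highly compressed data while incurring a negligible impact on machine perceptual quality; (2) machine perceptual quality correlates strongly with deep similarity metrics, indicating a crucial role of these metrics in the development of machine-oriented codecs; and (3) using lossy compressed datasets (e.g., ImageNet) for pre-training can lead to counter-intuitive scenarios where lossy compression increases machine perceptual quality rather than degrading it. To encourage engagement with this growing area of research, the code and experiments are available at https://github.com/danjacobellis/MPQ.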
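The abstract names PSNR as one of the classical distortion metrics that codecs are usually optimized for. As a rough illustration only, the standard definition (10 times the base-10 logarithm of the squared peak value over the mean squared error) can be sketched in plain Python as follows. The function names `mse` and `psnr`, the 8-bit peak value of 255, and the sample pixel values are assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equal-length sequences of sample values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical 8-bit pixel values (flattened), before and after lossy coding.
original = [52, 55, 61, 59, 79, 61, 76, 61]
decoded = [54, 55, 60, 59, 77, 62, 76, 60]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

A metric like this measures sample-level error only, which is why the abstract contrasts it with deep similarity metrics and with downstream task performance.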

Related papers

GANCompress: GAN-Enhanced Neural Image Compression with Binary Spherical Quantization [0.0]
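GANCompress is a novel neural compression framework that combines Binary Spherical Quantization (BSQ) with Generative Adversarial Networks (GANs). GANCompress substantially improves compression efficiency, reducing file sizes by up to 100x while keeping visual distortion minimal.
Paper reference translation (metadata) (2025-05-19T00:18:27Z)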
Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
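Video transmission today serves not only the human visual system (HVS) but also machine perception for analysis. This paper proposes a Compression Distortion Representation Embedding (CDRE) framework that extracts machine-perception-related distortion representations and embeds them into downstream models. The framework effectively improves the rate-task performance of existing codecs with minimal overhead in execution time and parameter count.
Paper reference translation (metadata) (2025-03-27T13:01:53Z)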
Unlocking the Potential of Digital Pathology: Novel Baselines for Compression [31.13721473800084]
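Addresses differences in color and texture in pathological whole slide images (WSI). Deep learning models fine-tuned for perceptual quality outperform conventional compression schemes such as JPEG-XL and WebP for further compression of WSI. The study offers new insights for evaluating lossy compression schemes on WSI, encourages a unified evaluation of such schemes, and aims to accelerate the clinical adoption of digital pathology.
Paper reference translation (metadata) (2024-12-17T18:04:33Z)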
AlphaZip: Neural Network-Enhanced Lossless Text Compression [0.0]
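This paper proposes a lossless text compression approach that uses a Large Language Model (LLM). The method has two steps: first, prediction with a dense neural network architecture such as a transformer block; second, compression of the predicted ranks with standard compression algorithms such as Adaptive Huffman, LZ77, and Gzip.
Paper reference translation (metadata) (2024-09-23T14:21:06Z)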
Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
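This work proposes an enhanced neural compression method designed for optimal visual fidelity. The model incorporates a sophisticated semantic ensemble loss that integrates Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss. Experiments show that the method markedly improves the statistical fidelity of neural image compression.
Paper reference translation (metadata) (2024-01-25T08:11:27Z)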
Are Visual Recognition Models Robust to Image Compression? [23.280147529096908]
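We analyze the impact of image compression on visual recognition tasks. We consider a wide range of compression levels, from 0.1 to 2 bits per pixel (bpp). Across all three tasks studied, we find that recognition ability is significantly affected when strong compression is used.
Paper reference translation (metadata) (2023-04-10T11:30:11Z)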
Cross Modal Compression: Towards Human-comprehensible Semantic Compression [73.89616626853913]
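Cross Modal Compression (CMC) is a semantic compression framework for visual data. We show that the proposed CMC can achieve promising reconstruction results at ultra-high compression ratios.
Paper reference translation (metadata) (2022-09-06T15:31:11Z)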
Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
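This paper describes a search-free resizing framework that further improves the rate-distortion tradeoff of recent learned image compression models. The proposed method can improve the Bjontegaard-Delta rate (BD-rate) by up to 10%.
Paper reference translation (metadata) (2022-04-26T01:35:02Z)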
Implicit Neural Representations for Image Compression [103.78615661013623]
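Implicit Neural Representations (INRs) have attracted attention as a novel and effective representation for various data types. We propose the first comprehensive INR-based compression pipeline, including quantization, quantization-aware retraining, and entropy coding. We find that our approach to source compression with INRs vastly outperforms similar prior work.
Paper reference translation (metadata) (2021-12-08T13:02:53Z)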
Towards Compact CNNs via Collaborative Compression [166.86915086497433]
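We propose a collaborative compression scheme that combines channel pruning and tensor decomposition to compress CNN models. On ResNet-50, the method removes 52.9% of FLOPs and 48.4% of parameters.
Paper reference translation (metadata) (2021-05-24T12:07:38Z)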
Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
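This paper presents a unified study of the impact of JPEG compression on common tasks and datasets. It shows that heavy compression incurs a significant penalty on common performance metrics.
Paper reference translation (metadata) (2020-11-17T20:32:57Z)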
Improving Inference for Neural Image Compression [31.999462074510305]
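State-of-the-art methods build on hierarchical variational autoencoders to predict a compressible latent representation of each data point. We identify three approximation gaps that limit performance in conventional methods. For each of these three limitations, we propose remedies based on ideas related to iterative inference.
Paper reference translation (metadata) (2020-06-07T19:26:37Z)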

The related papers list is generated automatically from the titles and abstracts of papers on this site.
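Several entries in the list above state compression levels in bits per pixel (bpp), for example the 0.1 to 2 bpp range in "Are Visual Recognition Models Robust to Image Compression?". As a small worked illustration of that unit, the sketch below converts a compressed file size into bpp and into a compression ratio relative to uncompressed 24-bit RGB. The function names, the 512x512 image size, and the 8192-byte file size are hypothetical values chosen for the example, not figures from any of the listed papers.

```python
def bits_per_pixel(compressed_bytes, width, height):
    """Average number of coded bits spent per pixel of a width x height image."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return compressed_bytes * 8 / (width * height)


def compression_ratio(bpp, raw_bits_per_pixel=24):
    """Ratio of raw size to compressed size, assuming 24-bit RGB input by default."""
    if bpp <= 0:
        raise ValueError("bpp must be positive")
    return raw_bits_per_pixel / bpp


# Hypothetical example: a 512x512 RGB image stored in 8192 bytes.
bpp = bits_per_pixel(8192, 512, 512)
print(f"{bpp} bpp, {compression_ratio(bpp):.0f}:1 versus 24-bit RGB")
```

Under these assumptions the image is coded at 0.25 bpp, which sits inside the range quoted in that entry.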
