Fugu-MT Paper Translation (Summary): PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset

Paper summary: PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset

arxiv url: http://arxiv.org/abs/2112.07146v1
Date: Tue, 14 Dec 2021 03:58:00 GMT
Status: Translation complete
Last updated in system: 2021-12-15 15:24:03.832063
Title: PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset
Title (reference translation): PP-HumanSeg: Connectivity-Aware Portrait Segmentation Using a Large-Scale Teleconferencing Video Dataset
Authors: Lutao Chu, Yi Liu, Zewu Wu, Shiyu Tang, Guowei Chen, Yuying Hao, Juncai Peng, Zhiliang Yu, Zeyu Chen, Baohua Lai, Haoyi Xiong
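Abstract summary: This work is the first to construct a large-scale video portrait dataset, containing 291 videos from 23 conference scenes. It proposes Semantic Connectivity-aware Learning (SCL) for semantic segmentation, which introduces a semantic connectivity-aware loss. It also proposes an ultra-lightweight model with SCL that achieves the best trade-off between IoU and inference speed.
Reference score (independently computed attention score): 9.484150543390955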
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As the COVID-19 pandemic rampages across the world, the demand for video conferencing is surging. As a result, real-time portrait segmentation has become a popular feature for replacing the backgrounds of conferencing participants. While feature-rich datasets, models and algorithms have been offered for segmentation that extracts body postures from life scenes, portrait segmentation has not yet been well covered in a video conferencing context. To facilitate progress in this field, we introduce an open-source solution named PP-HumanSeg. This work is the first to construct a large-scale video portrait dataset that contains 291 videos from 23 conference scenes with 14K fine-labeled frames and extensions to multi-camera teleconferencing. Furthermore, we propose a novel Semantic Connectivity-aware Learning (SCL) for semantic segmentation, which introduces a semantic connectivity-aware loss to improve the quality of segmentation results from the perspective of connectivity. We also propose an ultra-lightweight model with SCL for practical portrait segmentation, which achieves the best trade-off between IoU and the speed of inference. Extensive evaluations on our dataset demonstrate the superiority of SCL and our model. The source code is available at https://github.com/PaddlePaddle/PaddleSeg.
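Abstract (reference translation): As the COVID-19 pandemic spreads around the world, demand for video conferencing is surging. As a result, real-time portrait segmentation has become a popular feature for replacing the backgrounds of conference participants. Feature-rich datasets, models, and algorithms have been offered for segmentation that extracts body postures from everyday scenes, but portrait segmentation has not been well covered in the video conferencing context. To facilitate progress in this field, the authors introduce an open-source solution named PP-HumanSeg. This work is the first to construct a large-scale video portrait dataset, containing 291 videos from 23 conference scenes with 14K finely labeled frames and extensions to multi-camera teleconferencing. The authors further propose a novel Semantic Connectivity-aware Learning (SCL) for semantic segmentation, which introduces a semantic connectivity-aware loss to improve the quality of segmentation results from the perspective of connectivity. They also propose an ultra-lightweight model with SCL for practical portrait segmentation, which achieves the best trade-off between IoU and inference speed. Extensive evaluations on the dataset show that SCL and the model are superior. The source code is available at https://github.com/paddlepaddle/paddleseg.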
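
The abstract names two quantities, IoU and connectivity, without defining them. The following is a minimal sketch of both on toy binary masks. It is not the paper's SCL loss, whose formulation is not given here. The function names `iou` and `count_components`, the use of 4-connectivity, and the toy masks are assumptions made for illustration only.

```python
from collections import deque


def iou(pred, target):
    """Intersection over union of two binary masks (lists of rows of 0/1)."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match (an assumed convention).
    return inter / union if union else 1.0


def count_components(mask):
    """Count 4-connected foreground regions using breadth-first search."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Toy ground truth: a single connected "portrait" region.
target = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
]
# Toy prediction: one missing pixel plus one stray blob in the corner.
pred = [
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
]

print(iou(pred, target))          # overlap score between the two masks
print(count_components(target))   # regions in the ground truth
print(count_components(pred))     # regions in the prediction
```

In this toy case the prediction has an extra disconnected region that the ground truth lacks. A pixel-overlap score such as IoU penalizes the stray pixel only as one wrong pixel among many, whereas a component count exposes it directly, which is the kind of error a connectivity-aware objective is meant to address.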

Related papers