# Inverses of Matérn Covariances on Grids

Inverses of Matern Covariances on Grids ( http://arxiv.org/abs/1912.11914v3 )

License: see the linked page
Joseph Guinness (reference translation) We study the aliased spectral densities of Matérn covariance functions on a regular grid of points, clarifying the properties of a popular approximation based on stochastic partial differential equations (SPDEs). While others have shown that it can approximate the covariance function well, we find that it assigns too much power at high frequencies and does not approximate the inverse with increasing accuracy as the grid spacing goes to zero, except in the one-dimensional exponential covariance case. We present numerical results supporting the theory and, in a simulation study, examine the implications for parameter estimation, finding that the SPDE approximation tends to overestimate spatial range parameters.

We conduct a study of the aliased spectral densities of Mat\'ern covariance functions on a regular grid of points, providing clarity on the properties of a popular approximation based on stochastic partial differential equations; while others have shown that it can approximate the covariance function well, we find that it assigns too much power at high frequencies and does not provide increasingly accurate approximations to the inverse as the grid spacing goes to zero, except in the one-dimensional exponential covariance case. We provide numerical results to support our theory, and in a simulation study, we investigate the implications for parameter estimation, finding that the SPDE approximation tends to overestimate spatial range parameters.
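The exception named in the abstract, the one-dimensional exponential covariance, is the Matérn case with smoothness 1/2. Restricted to a regular grid it is a stationary AR(1) process, whose inverse covariance is exactly tridiagonal. The Python sketch below checks that textbook fact numerically. It is not the paper's code; the function names and the parameters `h` (grid spacing), `rho` (range) and `sigma2` (variance) are illustrative assumptions, and the closed form assumes at least two grid points.

```python
import math

def exp_cov(n, h, rho, sigma2=1.0):
    """Exponential covariance sigma2 * exp(-|i-j| * h / rho) on n grid points."""
    return [[sigma2 * math.exp(-abs(i - j) * h / rho) for j in range(n)]
            for i in range(n)]

def ar1_precision(n, h, rho, sigma2=1.0):
    """Closed-form tridiagonal inverse of exp_cov (AR(1) precision), n >= 2."""
    phi = math.exp(-h / rho)                      # lag-one correlation
    scale = 1.0 / (sigma2 * (1.0 - phi * phi))
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # End points have diagonal 1, interior points 1 + phi^2.
        q[i][i] = scale * (1.0 if i in (0, n - 1) else 1.0 + phi * phi)
        if i + 1 < n:
            q[i][i + 1] = q[i + 1][i] = -scale * phi
    return q

def matmul(a, b):
    """Plain dense matrix product for square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_identity_error(n=8, h=0.1, rho=0.5):
    """Largest entry of |Sigma * Q - I|; near zero if Q inverts Sigma."""
    prod = matmul(exp_cov(n, h, rho), ar1_precision(n, h, rho))
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))
```

Calling `max_identity_error()` should return a value at the level of floating-point rounding, which confirms that a sparse (tridiagonal) inverse is exact in this special case.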
Translated: 2023-06-10 07:58:41 Published: 2021-03-01
# On the quantum correlations in two-qubit XYZ spin chains with Dzyaloshinsky-Moriya and Kaplan-Shekhtman-Entin-Wohlman-Aharony interactions

On the quantum correlations in two-qubit XYZ spin chains with Dzyaloshinsky-Moriya and Kaplan-Shekhtman-Entin-Wohlman-Aharony interactions ( http://arxiv.org/abs/2003.04542v2 )

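License: see the linked page
M. A. Yurischev (reference translation) The anisotropic Heisenberg two-spin-1/2 model in an inhomogeneous magnetic field, with both antisymmetric Dzyaloshinsky-Moriya and symmetric Kaplan-Shekhtman-Entin-Wohlman-Aharony interactions, is considered at thermal equilibrium. Using a group-theoretical approach, we find fifteen spin Hamiltonians and the corresponding Gibbs density matrices (quantum states) whose eigenvalues are expressed only through square roots. We also find local unitary transformations that connect nine of these fifteen states, one of which is the X quantum state. Since quantum correlations such as entanglement, quantum discord, and the one-way quantum work deficit are known for the X state, the quantum correlations of any member of the nine-state family can be obtained. Furthermore, we show that the remaining six quantum states are separable and are also connected by local unitary transformations, although known correlations beyond entanglement are in general not available for them.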

The anisotropic Heisenberg two-spin-1/2 model in an inhomogeneous magnetic field with both antisymmetric Dzyaloshinsky-Moriya and symmetric Kaplan-Shekhtman-Entin-Wohlman-Aharony cross interactions is considered at thermal equilibrium. Using a group-theoretical approach, we find fifteen spin Hamiltonians and as many corresponding Gibbs density matrices (quantum states) whose eigenvalues are expressed only through square radicals. We also find local unitary transformations that connect nine states of this fifteen-state collection, one of which is the X quantum state. Since quantum correlations such as quantum entanglement, quantum discord, and the one-way quantum work deficit are known for the X state, this allows one to obtain the quantum correlations for any member of the nine-state family. Further, we show that the remaining six quantum states are separable and are also connected by local unitary transformations; for this family, however, known correlations beyond entanglement are generally not available.
Translated: 2023-05-30 01:15:37 Published: 2021-03-01
# Geometric distinguishability measures limit quantum channel estimation and discrimination

Geometric distinguishability measures limit quantum channel estimation and discrimination ( http://arxiv.org/abs/2004.10708v2 )

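License: see the linked page
Vishal Katariya and Mark M. Wilde (reference translation) Quantum channel estimation and discrimination are fundamentally related information processing tasks in quantum information science. In this paper, we analyze these tasks using the right logarithmic derivative Fisher information and the geometric Rényi relative entropy, and we identify connections between these distinguishability measures. A key result of the paper is that a chain-rule property holds for the right logarithmic derivative Fisher information and for the geometric Rényi relative entropy on the interval $\alpha\in(0,1)$ of the Rényi parameter $\alpha$. In channel estimation, these results imply a condition for the unattainability of Heisenberg scaling; in channel discrimination, they lead to improved bounds on error rates in the Chernoff and Hoeffding error exponent settings. More generally, we introduce the amortized quantum Fisher information as a conceptual framework for analyzing general sequential protocols that estimate a parameter encoded in a quantum channel and, beyond the application above, use it to show that Heisenberg scaling is not possible when a parameter is encoded in a classical-quantum channel. We then identify a number of other conceptual and technical connections between the estimation and discrimination tasks and the distinguishability measures used to analyze each. As part of this work, we give a detailed overview of the geometric Rényi relative entropy of quantum states and channels and of its properties.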

Quantum channel estimation and discrimination are fundamentally related information processing tasks of interest in quantum information science. In this paper, we analyze these tasks by employing the right logarithmic derivative Fisher information and the geometric R\'enyi relative entropy, respectively, and we also identify connections between these distinguishability measures. A key result of our paper is that a chain-rule property holds for the right logarithmic derivative Fisher information and the geometric R\'enyi relative entropy for the interval $\alpha\in(0,1) $ of the R\'enyi parameter $\alpha$. In channel estimation, these results imply a condition for the unattainability of Heisenberg scaling, while in channel discrimination, they lead to improved bounds on error rates in the Chernoff and Hoeffding error exponent settings. More generally, we introduce the amortized quantum Fisher information as a conceptual framework for analyzing general sequential protocols that estimate a parameter encoded in a quantum channel, and we use this framework, beyond the aforementioned application, to show that Heisenberg scaling is not possible when a parameter is encoded in a classical-quantum channel. We then identify a number of other conceptual and technical connections between the tasks of estimation and discrimination and the distinguishability measures involved in analyzing each. As part of this work, we present a detailed overview of the geometric R\'enyi relative entropy of quantum states and channels, as well as its properties, which may be of independent interest.
Translated: 2023-05-22 10:58:29 Published: 2021-03-01
# Lanczos recursion on a quantum computer for the Green's function and ground state

Lanczos recursion on a quantum computer for the Green's function and ground state ( http://arxiv.org/abs/2008.05593v4 )

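License: see the linked page
Thomas E. Baker (reference translation) A state-preserving quantum counting algorithm is used to obtain the coefficients of a Lanczos recursion from a single ground-state wavefunction on a quantum computer. This is used to compute the continued fraction representation of an interacting Green's function for use in condensed matter physics, particle physics, and other areas. The wavefunction does not need to be re-prepared at each iteration. The quantum algorithm represents an exponential reduction in memory over known classical methods. An extension of the method to determining the ground state is also discussed.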

A state-preserving quantum counting algorithm is used to obtain coefficients of a Lanczos recursion from a single ground state wavefunction on the quantum computer. This is used to compute the continued fraction representation of an interacting Green's function for use in condensed matter, particle physics, and other areas. The wavefunction does not need to be re-prepared at each iteration. The quantum algorithm represents an exponential reduction in memory over known classical methods. An extension of the method to determining the ground state is also discussed.
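To make the objects in this abstract concrete, here is a minimal classical sketch of a Lanczos recursion and of the continued fraction it yields for a diagonal Green's function element. It illustrates only the textbook classical procedure on a small dense symmetric matrix, not the quantum counting algorithm of the paper; the function names are hypothetical and no reorthogonalization is performed.

```python
def lanczos_coefficients(h, v, steps):
    """Return (alphas, betas) of the Lanczos recursion for symmetric h and start vector v."""
    n = len(v)
    norm = sum(x * x for x in v) ** 0.5
    q = [x / norm for x in v]          # current Lanczos vector
    q_prev = [0.0] * n                 # previous Lanczos vector
    alphas, betas = [], []
    beta = 0.0
    for _ in range(steps):
        w = [sum(h[i][j] * q[j] for j in range(n)) for i in range(n)]
        alpha = sum(w[i] * q[i] for i in range(n))
        # Three-term recurrence: remove components along q and q_prev.
        w = [w[i] - alpha * q[i] - beta * q_prev[i] for i in range(n)]
        alphas.append(alpha)
        beta = sum(x * x for x in w) ** 0.5
        if beta < 1e-12:               # Krylov space exhausted
            break
        betas.append(beta)
        q_prev, q = q, [x / beta for x in w]
    return alphas, betas

def green_continued_fraction(z, alphas, betas):
    """Evaluate G(z) = 1 / (z - a0 - b0^2 / (z - a1 - ...)) from the bottom up."""
    g = z - alphas[-1]
    for k in range(len(alphas) - 2, -1, -1):
        g = z - alphas[k] - betas[k] ** 2 / g
    return 1.0 / g
```

For a normalized start vector `v`, `green_continued_fraction(z, *lanczos_coefficients(h, v, steps))` approximates the matrix element of the resolvent of `h` in the state `v`; the result is exact once the recursion terminates.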
Translated: 2023-05-06 11:16:37 Published: 2021-03-01
# Quantum non-Gaussian Photon Coincidences

Quantum non-Gaussian Photon Coincidences ( http://arxiv.org/abs/2009.04184v2 )

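License: see the linked page
Lukáš Lachman and Radim Filip (reference translation) Photon coincidences are an important resource for quantum technologies. They expose nonlinear quantum processes in matter and are essential for sources of entanglement. We derive broadly applicable criteria for quantum non-Gaussian two-photon coincidences that certify a new quality of photon sources. The criteria reject states emerging from Gaussian parametric processes, which often limit applications in quantum technologies. We also analyze the robustness of quantum non-Gaussian coincidences and compare it with the heralded quantum non-Gaussianity of single photons based on them.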

Photon coincidences represent an important resource for quantum technologies. They expose nonlinear quantum processes in matter and are essential for sources of entanglement. We derive broadly applicable criteria for quantum non-Gaussian two-photon coincidences that certify a new quality of photon sources. The criteria reject states emerging from Gaussian parametric processes, which often limit applications in quantum technologies. We also analyse the robustness of the quantum non-Gaussian coincidences and compare with the heralded quantum non-Gaussianity of single-photons based on them.
Translated: 2023-05-03 03:02:23 Published: 2021-03-01
# Direct formation of nitrogen-vacancy centers in nitrogen doped diamond along the trajectories of swift heavy ions

Direct formation of nitrogen-vacancy centers in nitrogen doped diamond along the trajectories of swift heavy ions ( http://arxiv.org/abs/2011.03656v2 )

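License: see the linked page
Russell E. Lake, Arun Persaud, Casey Christian, Edward S. Barnard, Emory M. Chan, Andrew A. Bettiol, Marilena Tomut, Christina Trautmann, Thomas Schenkel (reference translation) We report depth-resolved photoluminescence measurements of nitrogen-vacancy (NV$^-$) centers formed along the tracks of swift heavy ions (SHIs) in type Ib synthetic single-crystal diamonds doped with 100 ppm nitrogen during crystal growth. Analysis of the spectra shows that NV$^-$ centers form preferentially in regions where electronic stopping processes dominate, rather than at the end of the ion range, where elastic collisions lead to the formation of vacancies and defects. Thermal annealing further increases NV yields after SHI irradiation, preferentially in regions with high vacancy densities. NV centers formed along the tracks of single swift heavy ions can be isolated with lift-out techniques for exploring color-center qubits in quasi-1D registers, with an average qubit spacing of a few nanometers and of order 100 color centers per micrometer along percolation chains 10 to 30 micrometers long.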

We report depth-resolved photoluminescence measurements of nitrogen-vacancy (NV$^-$) centers formed along the tracks of swift heavy ions (SHIs) in type Ib synthetic single crystal diamonds that had been doped with 100 ppm nitrogen during crystal growth. Analysis of the spectra shows that NV$^-$ centers are formed preferentially within regions where electronic stopping processes dominate and not at the end of the ion range where elastic collisions lead to formation of vacancies and defects. Thermal annealing further increases NV yields after irradiation with SHIs preferentially in regions with high vacancy densities. NV centers formed along the tracks of single swift heavy ions can be isolated with lift-out techniques for explorations of color center qubits in quasi-1D registers with an average qubit spacing of a few nanometers and of order 100 color centers per micrometer along 10 to 30 micrometer long percolation chains.
Translated: 2023-04-25 01:21:18 Published: 2021-03-01
# A Nonlinear Charge- and Flux-Tunable Cavity Derived from an Embedded Cooper Pair Transistor

A Nonlinear Charge- and Flux-Tunable Cavity Derived from an Embedded Cooper Pair Transistor ( http://arxiv.org/abs/2011.06298v3 )

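License: see the linked page
B. L. Brock, Juliang Li, S. Kanhirathingal, B. Thyagarajan, William F. Braasch Jr., M. P. Blencowe, A. J. Rimberg (reference translation) We introduce the cavity-embedded Cooper pair transistor (cCPT), a device that behaves as a highly nonlinear microwave cavity whose resonant frequency can be tuned both by charging a gate capacitor and by threading flux through a SQUID loop. We characterize this device and find excellent agreement between theory and experiment. A key difficulty in this characterization is the presence of frequency fluctuations comparable in scale to the cavity linewidth, which deform the measured resonance circles in accordance with recent theoretical predictions [Brock et al., Phys. Rev. Applied 14, 054026 (2020)]. By measuring the power spectral density of these frequency fluctuations at carefully chosen points in parameter space, we find that they result primarily from the $1/f$ charge and flux noise common in solid-state devices. Notably, we also observe key signatures of frequency fluctuations induced by quantum fluctuations in the cavity field via the Kerr nonlinearity.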

We introduce the cavity-embedded Cooper pair transistor (cCPT), a device which behaves as a highly nonlinear microwave cavity whose resonant frequency can be tuned both by charging a gate capacitor and by threading flux through a SQUID loop. We characterize this device and find excellent agreement between theory and experiment. A key difficulty in this characterization is the presence of frequency fluctuations comparable in scale to the cavity linewidth, which deform our measured resonance circles in accordance with recent theoretical predictions [Brock et al., Phys. Rev. Applied 14, 054026 (2020)]. By measuring the power spectral density of these frequency fluctuations at carefully chosen points in parameter space, we find that they are primarily a result of the $1/f$ charge and flux noise common in solid state devices. Notably, we also observe key signatures of frequency fluctuations induced by quantum fluctuations in the cavity field via the Kerr nonlinearity.
Translated: 2023-04-24 07:48:36 Published: 2021-03-01
# Variational optimization of the quantum annealing schedule for the Lechner-Hauke-Zoller scheme

Variational optimization of the quantum annealing schedule for the Lechner-Hauke-Zoller scheme ( http://arxiv.org/abs/2012.01694v2 )

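License: see the linked page
Yuki Susa, Hidetoshi Nishimori (reference translation) The annealing schedule is optimized for a parameter in the Lechner-Hauke-Zoller (LHZ) scheme for quantum annealing, which is designed for the all-to-all-interacting Ising model that represents generic combinatorial optimization problems. We adapt the variational approach proposed by Matsuura et al. (arXiv:2003.09913) to the annealing schedule of the term representing the constraint on variables intrinsic to the LHZ scheme, keeping the annealing schedules of the other terms intact. Numerical results for a simple ferromagnetic model and the spin-glass problem show that nonmonotonic annealing schedules optimize the performance as measured by the residual energy and the final ground-state fidelity. This improvement is not accompanied by a notable increase in the instantaneous energy gap, which suggests the importance of a dynamical viewpoint, in addition to static analyses, in studying practically relevant diabatic processes in quantum annealing.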

The annealing schedule is optimized for a parameter in the Lechner-Hauke-Zoller (LHZ) scheme for quantum annealing designed for the all-to-all-interacting Ising model representing generic combinatorial optimization problems. We adapt the variational approach proposed by Matsuura et al. (arXiv:2003.09913) to the annealing schedule of a term representing a constraint for variables intrinsic to the LHZ scheme with the annealing schedule of other terms kept intact. Numerical results for a simple ferromagnetic model and the spin-glass problem show that nonmonotonic annealing schedules optimize the performance measured by the residual energy and the final ground-state fidelity. This improvement does not accompany a notable increase in the instantaneous energy gap, which suggests the importance of a dynamical viewpoint in addition to static analyses in the study of practically relevant diabatic processes in quantum annealing.
Translated: 2023-04-22 05:44:26 Published: 2021-03-01
# A master equation incorporating the system-environment correlations present in the joint equilibrium state

A master equation incorporating the system-environment correlations present in the joint equilibrium state ( http://arxiv.org/abs/2012.14853v2 )

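License: see the linked page
Ali Raza Mirza, Muhammad Zia, Adam Zaman Chaudhry (reference translation) We present a general master equation, correct to second order in the system-environment coupling strength, that takes the initial system-environment correlations into account. We assume that the system and its environment are in a joint thermal equilibrium state, after which a unitary operation is performed to prepare the desired initial system state; the system Hamiltonian may also change thereafter. We show that the effect of the initial correlations appears in the second-order master equation as an additional term, similar in form to the usual second-order term describing relaxation and decoherence in quantum systems. We apply this master equation to a generalization of the paradigmatic spin-boson model, namely a collection of two-level systems interacting with a common environment of harmonic oscillators, and to a collection of two-level systems interacting with a common spin environment. We demonstrate that, in general, the initial system-environment correlations must be accounted for to obtain the system dynamics accurately.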

We present a general master equation, correct to second order in the system-environment coupling strength, that takes into account the initial system-environment correlations. We assume that the system and its environment are in a joint thermal equilibrium state, and thereafter a unitary operation is performed to prepare the desired initial system state, with the system Hamiltonian possibly changing thereafter as well. We show that the effect of the initial correlations shows up in the second-order master equation as an additional term, similar in form to the usual second-order term describing relaxation and decoherence in quantum systems. We apply this master equation to a generalization of the paradigmatic spin-boson model, namely a collection of two-level systems interacting with a common environment of harmonic oscillators, as well as a collection of two-level systems interacting with a common spin environment. We demonstrate that, in general, the initial system-environment correlations need to be accounted for in order to accurately obtain the system dynamics.
Translated: 2023-04-18 11:58:56 Published: 2021-03-01
# ASBSO: An Improved Brain Storm Optimization With Flexible Search Length and Memory-Based Selection

ASBSO: An Improved Brain Storm Optimization With Flexible Search Length and Memory-Based Selection ( http://arxiv.org/abs/2101.11275v2 )

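License: see the linked page
Yang Yu, Shangce Gao, Yirui Wang, Jiujun Cheng and Yuki Todo (reference translation) Brain storm optimization (BSO) is a recently proposed population-based optimization algorithm that uses a logarithmic sigmoid transfer function to adjust its search range during convergence. However, this adjustment varies only with the current iteration number and lacks flexibility and variety, which gives BSO poor search efficiency and robustness. To alleviate this problem, an adaptive step length structure together with a success memory selection strategy is proposed for incorporation into BSO. The proposed method, adaptive step length based on memory selection BSO (ASBSO), applies multiple step lengths to modify the generation of new solutions, providing a flexible search that adapts to the problem and the convergence period. A novel memory mechanism, which evaluates and stores the degree of improvement of solutions, determines the selection possibility of the step lengths. A set of 57 benchmark functions is used to test ASBSO's search ability, and four real-world problems are adopted to show its application value. The test results indicate a remarkable improvement in the solution quality, scalability, and robustness of ASBSO.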

Brain storm optimization (BSO) is a newly proposed population-based optimization algorithm, which uses a logarithmic sigmoid transfer function to adjust its search range during the convergent process. However, this adjustment varies only with the current iteration number and lacks flexibility and variety, which results in poor search efficiency and robustness of BSO. To alleviate this problem, an adaptive step length structure together with a success memory selection strategy is proposed to be incorporated into BSO. The proposed method, adaptive step length based on memory selection BSO, namely ASBSO, applies multiple step lengths to modify the generation process of new solutions, thus supplying a flexible search according to the corresponding problems and convergent periods. The novel memory mechanism, which is capable of evaluating and storing the degree of improvement of solutions, is used to determine the selection possibility of step lengths. A set of 57 benchmark functions is used to test ASBSO's search ability, and four real-world problems are adopted to show its application value. All these test results indicate a remarkable improvement in the solution quality, scalability, and robustness of ASBSO.
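The baseline behaviour criticized in this abstract can be sketched in a few lines. The Python below follows the log-sigmoid step rule as it is commonly stated for the original BSO, xi(t) = logsig((0.5 T - t) / k) * rand(); the slope `k = 20`, the per-coordinate Gaussian perturbation and all function names are assumptions made for illustration. It does not implement ASBSO's adaptive step lengths or its success memory, whose details are not given here.

```python
import math
import random

def logsig(x):
    """Logarithmic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def bso_step_envelope(t, max_iter, k=20.0):
    """Deterministic part of the step size; it depends only on the iteration t."""
    return logsig((0.5 * max_iter - t) / k)

def bso_step(t, max_iter, k=20.0, rng=random):
    """Step size xi(t) = logsig((0.5 * T - t) / k) * U(0, 1)."""
    return bso_step_envelope(t, max_iter, k) * rng.random()

def perturb(solution, t, max_iter, rng=random):
    """New candidate: each coordinate plus xi(t) times standard Gaussian noise."""
    return [x + bso_step(t, max_iter, rng=rng) * rng.gauss(0.0, 1.0)
            for x in solution]
```

Because `bso_step_envelope` is a fixed function of `t`, every problem and every stage of convergence gets the same schedule, which is the inflexibility the abstract sets out to remove.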
Translated: 2023-04-13 20:20:48 Published: 2021-03-01
# Optimizing autonomous thermal machines powered by energetic coherence

Optimizing autonomous thermal machines powered by energetic coherence ( http://arxiv.org/abs/2101.11572v2 )

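License: see the linked page
Kenza Hammam, Yassine Hassouni, Rosario Fazio, Gonzalo Manzano (reference translation) The characterization and control of quantum effects in the performance of thermodynamic tasks may open new avenues for small thermal machines working at the nanoscale. We study the impact of coherence in the energy basis on the operation of a small thermal machine that can act either as a heat engine or as a refrigerator. We show that input coherence may enhance the machine's performance and allow it to operate in otherwise forbidden regimes. Moreover, our results indicate that in some cases coherence may also be detrimental, which makes the optimization of particular models a crucial task for benefiting from coherence-induced enhancements.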

The characterization and control of quantum effects in the performance of thermodynamic tasks may open new avenues for small thermal machines working in the nanoscale. We study the impact of coherence in the energy basis in the operation of a small thermal machine which can act either as a heat engine or as a refrigerator. We show that input coherence may enhance the machine performance and allow it to operate in otherwise forbidden regimes. Moreover, our results also indicate that, in some cases, coherence may also be detrimental, rendering optimization of particular models a crucial task for benefiting from coherence-induced enhancements.
Translated: 2023-04-13 19:59:58 Published: 2021-03-01
# Einstein-Podolsky-Rosen Steering in Two-sided Sequential Measurements with One Entangled Pair

Einstein-Podolsky-Rosen Steering in Two-sided Sequential Measurements with One Entangled Pair ( http://arxiv.org/abs/2102.02550v2 )

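License: see the linked page
Jie Zhu, Meng-Jun Hu, Guang-Can Guo, Chuan-Feng Li, and Yong-Sheng Zhang (reference translation) Non-locality and quantum measurement are two fundamental topics in quantum theory, and their interplay has attracted intensive attention since the discovery of Bell's theorem. Non-locality sharing among multiple observers has been predicted and experimentally observed. However, only the one-sided sequential case, i.e., one Alice and multiple Bobs, has been widely discussed, and little is known about the two-sided case. Here, we theoretically and experimentally explore non-locality sharing in the two-sided sequential measurement case, in which one entangled pair is distributed to multiple Alices and Bobs. We observe double EPR steering among four observers in a photonic system for the first time. When all observers adopt the same measurement strength, double EPR steering can be demonstrated simultaneously, whereas double Bell-CHSH inequality violations are shown to be impossible. An exact formula relating the Bell quantity to sequential weak measurements for arbitrarily many Alices and Bobs is also derived, showing that double Bell-CHSH inequality violations are not possible under the unbiased input condition. The results not only deepen our understanding of the relation between sequential measurements and non-locality but may also find important applications in quantum information tasks.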

Non-locality and quantum measurement are two fundamental topics in quantum theory, and their interplay has attracted intensive attention since the discovery of Bell's theorem. Non-locality sharing among multiple observers has been predicted and experimentally observed. However, only the one-sided sequential case, i.e., one Alice and multiple Bobs, has been widely discussed, and little is known about the two-sided case. Here, we theoretically and experimentally explore non-locality sharing in the two-sided sequential measurement case, in which one entangled pair is distributed to multiple Alices and Bobs. We experimentally observe double EPR steering among four observers in a photonic system for the first time. In the case that all observers adopt the same measurement strength, it is observed that double EPR steering can be demonstrated simultaneously, while double Bell-CHSH inequality violations are shown to be impossible. The exact formula relating the Bell quantity and sequential weak measurements for arbitrarily many Alices and Bobs is also derived, showing that no double Bell-CHSH inequality violation is possible under the unbiased input condition. The results not only deepen our understanding of the relation between sequential measurements and non-locality but also may find important applications in quantum information tasks.
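As background for the Bell-CHSH quantity mentioned above, the sketch below evaluates the textbook CHSH value for a singlet state, with an optional sharpness factor on each side to model unsharp (weak) measurements. It covers a single Alice-Bob pair only and is not the paper's sequential formula; the function names, the angle settings and the parameters `sharp_a` and `sharp_b` are illustrative assumptions.

```python
import math

def correlation(a, b, sharp_a=1.0, sharp_b=1.0):
    """Singlet-state correlation for measurement angles a and b (radians).

    sharp_a and sharp_b in [0, 1] model unsharp measurements; 1.0 is projective.
    """
    return -sharp_a * sharp_b * math.cos(a - b)

def chsh(sharp_a=1.0, sharp_b=1.0):
    """CHSH value |S| at the standard optimal angle settings."""
    a0, a1 = 0.0, math.pi / 2
    b0, b1 = math.pi / 4, 3 * math.pi / 4
    s = (correlation(a0, b0, sharp_a, sharp_b)
         - correlation(a0, b1, sharp_a, sharp_b)
         + correlation(a1, b0, sharp_a, sharp_b)
         + correlation(a1, b1, sharp_a, sharp_b))
    return abs(s)
```

In this simple model the classical bound of 2 is exceeded only when the product of the two sharpness factors is larger than 1/sqrt(2), which gives a feel for why weakening measurements to share a state among several observers quickly costs the Bell violation.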
Translated: 2023-04-12 20:01:27 Published: 2021-03-01
# Finite-time two-spin quantum Otto engines: shortcuts to adiabaticity vs. irreversibility

Finite-time two-spin quantum Otto engines: shortcuts to adiabaticity vs. irreversibility ( http://arxiv.org/abs/2102.11657v2 )

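License: see the linked page
Barış Çakmak (reference translation) We propose a quantum Otto cycle in a two spin-$1/2$ anisotropic XY model in a transverse external magnetic field. We first characterize the parameter regime in which the working medium operates as an engine in the adiabatic regime. We then consider the finite-time behavior of the engine with and without a shortcut to adiabaticity (STA) technique. STA schemes guarantee that the dynamics of a system follows the adiabatic path, at the expense of introducing an external control. We compare the performance of the non-adiabatic and STA engines for a fixed adiabatic efficiency but different parameters of the working medium. For certain parameter regimes, the irreversibility due to finite-time driving, as measured by the efficiency lags, is so low that the non-adiabatic engine performs quite close to the adiabatic engine, leaving the STA engine only marginally better than the non-adiabatic one. This suggests that by designing the working medium Hamiltonian one may avoid the difficulty of dealing with an external control protocol.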

We propose a quantum Otto cycle in a two spin-$1/2$ anisotropic XY model in a transverse external magnetic field. We first characterize the parameter regime that the working medium operates as an engine in the adiabatic regime. Then, we consider finite-time behavior of the engine with and without utilizing a shortcut to adiabaticity (STA) technique. STA schemes guarantee that the dynamics of a system follows the adiabatic path, at the expense of introducing an external control. We compare the performance of the non-adiabatic and STA engines for a fixed adiabatic efficiency but different parameters of the working medium. We observe that, for certain parameter regimes, the irreversibility, as measured by the efficiency lags, due to finite-time driving is so low that non-adiabatic engine performs quite close to the adiabatic engine, leaving the STA engine only marginally better than the non-adiabatic one. This suggests that by designing the working medium Hamiltonian one may spare the difficulty of dealing with an external control protocol.
Translated: 2023-04-10 03:23:32 Published: 2021-03-01
# Fiberized diamond-based vector magnetometers

Fiberized diamond-based vector magnetometers ( http://arxiv.org/abs/2102.11902v2 )

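License: see the linked page
Georgios Chatzidrosos, Joseph Shaji Rebeirro, Huijie Zheng, Muhib Omar, Andreas Brenneis, Felix M. Stürner, Tino Fuchs, Thomas Buck, Robert Rölver, Tim Schneemann, Peter Blümler, Dmitry Budker, Arne Wickenbrock (reference translation) We present two fiberized vector magnetic-field sensors based on nitrogen-vacancy (NV) centers in diamond. The sensors feature sub-nT/$\sqrt{\textrm{Hz}}$ magnetic sensitivity. We use commercially available components to construct sensors with a small sensor size, high photon collection, and minimal sensor-sample distance. Both sensors are located at the end of optical fibers, with the sensor head freely accessible and robust under movement. These features make them ideal for mapping magnetic fields with high sensitivity and spatial resolution ($\leq$\,mm). As a demonstration, we use one of the sensors to map the vector magnetic field inside the bore of a $\geq$ 100\,mT Halbach array. The vector field sensing protocol translates microwave spectroscopy data, addressing all diamond axes and including double quantum transitions, into a 3D magnetic field vector.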

We present two fiberized vector magnetic-field sensors, based on nitrogen-vacancy (NV) centers in diamond. The sensors feature sub-nT/$\sqrt{\textrm{Hz}}$ magnetic sensitivity. We use commercially available components to construct sensors with a small sensor size, high photon collection, and minimal sensor-sample distance. Both sensors are located at the end of optical fibres with the sensor-head freely accessible and robust under movement. These features make them ideal for mapping magnetic fields with high sensitivity and spatial resolution ($\leq$\,mm). As a demonstration we use one of the sensors to map the vector magnetic field inside the bore of a $\geq$ 100\,mT Halbach array. The vector field sensing protocol translates microwave spectroscopy data addressing all diamonds axes and including double quantum transitions to a 3D magnetic field vector.
Translated: 2023-04-10 03:05:22 Published: 2021-03-01
# Magnetic-Torque Enhanced by Tunable Dipolar interactions

Magnetic-Torque Enhanced by Tunable Dipolar interactions ( http://arxiv.org/abs/2103.00836v1 )

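License: see the linked page
C. Pellet-Mary, P. Huillery, M. Perdriat, G. Hétet (reference translation) We use tunable dipolar interactions between the spins of nitrogen-vacancy (NV) centers in diamond to rotate a diamond crystal. Specifically, we employ cross-relaxation between the electronic spins of pairs of NV centers in a trapped diamond to enhance the anisotropic NV paramagnetism and thus increase the associated spin torque. Our observations open a path toward the use of mechanical oscillators to detect paramagnetic defects that lack optical transitions, toward the investigation of angular momentum conservation in spin relaxation processes, and toward novel means of cooling the motion of mechanical oscillators.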

We use tunable dipolar-interactions between the spins of nitrogen-vacancy (NV) centers in diamond to rotate a diamond crystal. Specifically, we employ cross-relaxation between the electronic spin of pairs of NV centers in a trapped diamond to enhance the anisotropic NV paramagnetism and thus to increase the associated spin torque. Our observations open a path towards the use of mechanical oscillators to detect paramagnetic defects that lack optical transitions, to investigation of angular momentum conservation in spin relaxation processes and to novel means of cooling the motion of mechanical oscillators.
Translated: 2023-04-09 14:50:47 Published: 2021-03-01
# High-performance parallel classical scheme for simulating shallow quantum circuits

High-performance parallel classical scheme for simulating shallow quantum circuits ( http://arxiv.org/abs/2103.00693v1 )

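License: see the linked page
Shihao Zhang, Jiacheng Bao, Yifan Sun, Lvzhou Li, Houjun Sun, and Xiangdong Zhang (reference translation) Recently, constant-depth quantum circuits have been proved more powerful than their classical counterparts at solving certain problems, such as the two-dimensional (2D) hidden linear function (HLF) problem for a symmetric binary matrix. To further investigate the boundary between classical and quantum computing models, we propose a high-performance two-stage classical scheme for solving a full-sampling variant of the 2D HLF problem, which combines traditional classical parallel algorithms with a gate-based classical circuit model to simulate the target shallow quantum circuits exactly. Under reasonable parameter assumptions, a theoretical analysis shows that our classical simulator consumes less runtime than near-term quantum processors for most problem instances. Furthermore, we demonstrate typical all-connected 2D grid instances with moderate FPGA circuits and show that the parallel scheme is a practically scalable, highly efficient, and operationally convenient tool for simulating and verifying graph-state circuits executed on current quantum hardware.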

Recently, constant-depth quantum circuits are proved more powerful than their classical counterparts at solving certain problems, e.g., the two-dimensional (2D) hidden linear function (HLF) problem regarding a symmetric binary matrix. To further investigate the boundary between classical and quantum computing models, in this work we propose a high-performance two-stage classical scheme to solve a full-sampling variant of the 2D HLF problem, which combines traditional classical parallel algorithms and a gate-based classical circuit model together for exactly simulating the target shallow quantum circuits. Under reasonable parameter assumptions, a theoretical analysis reveals our classical simulator consumes less runtime than that of near-term quantum processors for most problem instances. Furthermore, we demonstrate the typical all-connected 2D grid instances by moderate FPGA circuits, and show our designed parallel scheme is a practically scalable, high-efficient and operationally convenient tool for simulating and verifying graph-state circuits performed by current quantum hardware.
Translated: 2023-04-09 14:49:49 Published: 2021-03-01
# Enhancing hierarchical surrogate-assisted evolutionary algorithm for high-dimensional expensive optimization via random projection

Enhancing hierarchical surrogate-assisted evolutionary algorithm for high-dimensional expensive optimization via random projection ( http://arxiv.org/abs/2103.00682v1 )

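License: see the linked page
Xiaodong Ren, Daofu Guo, Zhigang Ren, Yongsheng Liang, An Chen (reference translation) By remarkably reducing the number of real fitness evaluations, surrogate-assisted evolutionary algorithms (SAEAs), especially hierarchical SAEAs, have been shown to be effective in solving computationally expensive optimization problems. The success of hierarchical SAEAs derives mainly from the potential benefit of their global surrogate models, known as the "blessing of uncertainty", and from the high accuracy of their local models. However, their performance leaves room for improvement on high-dimensional problems, since it remains challenging to build sufficiently accurate local models in the huge solution space. To address this issue, this study proposes a new hierarchical SAEA that trains local surrogate models with the help of the random projection technique. Instead of training in the original high-dimensional solution space, the new algorithm first randomly projects the training samples onto a set of low-dimensional subspaces, then trains a surrogate model in each subspace, and finally evaluates candidate solutions by averaging the resulting models. Experimental results on six benchmark functions of 100 and 200 dimensions show that random projection significantly improves the accuracy of local surrogate models and that the new hierarchical SAEA has a clear edge over state-of-the-art SAEAs.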

By remarkably reducing real fitness evaluations, surrogate-assisted evolutionary algorithms (SAEAs), especially hierarchical SAEAs, have been shown to be effective in solving computationally expensive optimization problems. The success of hierarchical SAEAs mainly profits from the potential benefit of their global surrogate models, known as the "blessing of uncertainty", and the high accuracy of local models. However, their performance leaves room for improvement on high-dimensional problems, since it is still challenging to build sufficiently accurate local models due to the huge solution space. To address this issue, this study proposes a new hierarchical SAEA that trains local surrogate models with the help of the random projection technique. Instead of executing training in the original high-dimensional solution space, the new algorithm first randomly projects training samples onto a set of low-dimensional subspaces, then trains a surrogate model in each subspace, and finally evaluates candidate solutions by averaging the resulting models. Experimental results on six benchmark functions of 100 and 200 dimensions demonstrate that random projection can significantly improve the accuracy of local surrogate models and that the newly proposed hierarchical SAEA possesses an obvious edge over state-of-the-art SAEAs.
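The three steps described in this abstract (project onto random low-dimensional subspaces, train one surrogate per subspace, average the predictions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the Gaussian projection matrices, the nearest-neighbor model used as a stand-in for the local surrogate, and all names are hypothetical choices made to keep the example self-contained.

```python
import math
import random

def random_projection(dim_in, dim_out, rng):
    """Gaussian random matrix (dim_out x dim_in), scaled by 1/sqrt(dim_out)."""
    s = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * s for _ in range(dim_in)]
            for _ in range(dim_out)]

def project(matrix, x):
    """Map a high-dimensional point x into the low-dimensional subspace."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

class NearestNeighborSurrogate:
    """Stand-in local model: predicts the value of the closest training sample."""
    def __init__(self, points, values):
        self.points, self.values = points, values

    def predict(self, z):
        dists = [sum((a - b) ** 2 for a, b in zip(p, z)) for p in self.points]
        return self.values[dists.index(min(dists))]

def train_ensemble(xs, ys, dim_out, n_models, seed=0):
    """Train one surrogate per random low-dimensional subspace."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        m = random_projection(len(xs[0]), dim_out, rng)
        model = NearestNeighborSurrogate([project(m, x) for x in xs], ys)
        ensemble.append((m, model))
    return ensemble

def predict(ensemble, x):
    """Average the predictions of all subspace models."""
    return sum(model.predict(project(m, x)) for m, model in ensemble) / len(ensemble)
```

Each model sees only `dim_out` coordinates, so it can be fitted from far fewer samples than a model in the full space, and averaging over several subspaces reduces the variance introduced by any single projection.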
Translated: 2023-04-09 14:49:30 Published: 2021-03-01
# Direct measurement of ultrafast temporal wavefunctions

Direct measurement of ultrafast temporal wavefunctions ( http://arxiv.org/abs/2103.01020v1 )

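License: see the linked page
Kazuhisa Ogawa, Takumi Okazaki, Hirokazu Kobayashi, Toshihiro Nakanishi, and Akihisa Tomita (reference translation) The large capacity and robustness of information encoding in the temporal mode of photons are important in quantum information processing, for which characterizing temporal quantum states with high usability and time resolution is essential. We propose and demonstrate a method for directly measuring temporal complex wavefunctions of weak light at the single-photon level with subpicosecond time resolution. The direct measurement is realized by ultrafast metrology of the interference between the light under test and self-generated monochromatic reference light; no external reference light or complicated post-processing algorithm is required. Hence, the method is versatile and potentially widely applicable to temporal state characterization.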

The large capacity and robustness of information encoding in the temporal mode of photons is important in quantum information processing, in which characterizing temporal quantum states with high usability and time resolution is essential. We propose and demonstrate a direct measurement method of temporal complex wavefunctions for weak light at a single-photon level with subpicosecond time resolution. Our direct measurement is realized by ultrafast metrology of the interference between the light under test and self-generated monochromatic reference light; no external reference light or complicated post-processing algorithms are required. Hence, this method is versatile and potentially widely applicable for temporal state characterization.
Translated: 2023-04-09 14:45:10 Published: 2021-03-01
# Relativistic corrections to the Diósi-Penrose model

Relativistic corrections to the Diósi-Penrose model ( http://arxiv.org/abs/2103.00994v1 )

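License: see the linked page
Luis A. Poveda and Luis Grave de Peralta and Arquimedes Ruiz-Columbié (reference translation) The Diósi-Penrose model is explored in a relativistic context. Relativistic effects are considered within the recently proposed Grave de Peralta approach [L. Grave de Peralta, Results Phys. 18 (2020) 103318], which parametrizes the Schrödinger-like Hamiltonian so that the average kinetic energy of the system coincides with its relativistic kinetic energy. As a case study, the method is applied to a particle in a box, with good results. In the Diósi-Penrose model, we observe that the width of a quantum matter field confined by its own gravitational field [L. Diósi, Phys. Lett. 105A (1984) 199] drops sharply to zero for a mass of the order of the Planck mass, indicating a breakdown of the model at the Planck scale.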

The Diósi-Penrose model is explored in a relativistic context. Relativistic effects are considered within the recently proposed Grave de Peralta approach [L. Grave de Peralta, Results Phys. 18 (2020) 103318], which parametrizes the Schrödinger-like Hamiltonian so as to impose that the average kinetic energy of the system coincides with its relativistic kinetic energy. As a case study, the method is applied to a particle in a box, with good results. In the Diósi-Penrose model, we observe that the width of a quantum matter field confined by its own gravitational field [L. Diósi, Phys. Lett. 105A (1984) 199] drops sharply to zero for a mass of the order of the Planck mass, indicating a breakdown of the model at the Planck scale.
Translated: 2023-04-09 14:44:47 Published: 2021-03-01
# Resolution of the spin paradox in the Nilsson model

Resolution of the spin paradox in the Nilsson model ( http://arxiv.org/abs/2103.00979v1 )

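License: see the linked page
Hadi Sobhani, Hassan Hassanabadi, Dennis Bonatsos (reference translation) In the well-known Nilsson diagrams, which depict the dependence of the nuclear single-particle energy levels on quadrupole deformation, a spin paradox appears as the deformation sets in, leading from spherical shapes to prolate deformed shapes with cylindrical symmetry. Bunches of levels corresponding to a spherical shell model orbital, sharing the same orbital angular momentum and the same total angular momentum, appear to correspond to Nilsson energy levels, labeled by asymptotic quantum numbers in cylindrical coordinates, some of which have spin up and others spin down. Furthermore, for some orbitals the correspondence between spherical shell model quantum numbers and Nilsson asymptotic quantum numbers is not the same for protons and neutrons. Introducing a new rule of correspondence between the two sets of quantum numbers, we show that the spin paradox is resolved and that full agreement between the proton and neutron Nilsson diagrams is established. The form of the Nilsson diagrams as a function of quadrupole deformation remains unchanged; the only difference between the new diagrams and the traditional ones is the mutual exchange of the Nilsson labels for certain pairs of single-particle energy levels.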

In the well-known Nilsson diagrams, depicting the dependence of the nuclear single-particle energy levels on quadrupole deformation, a spin paradox appears as the deformation sets in, leading from spherical shapes to prolate deformed shapes with cylindrical symmetry. Bunches of levels corresponding to a spherical shell model orbital, sharing the same orbital angular momentum and the same total angular momentum, appear to correspond to Nilsson energy levels, labeled by asymptotic quantum numbers in cylindrical coordinates, some of which have spin up, while some others have spin down. Furthermore, for some orbitals the correspondence between spherical shell model quantum numbers and Nilsson asymptotic quantum numbers is not the same for protons and for neutrons. Introducing a new rule of correspondence between the two sets of quantum numbers, we show that the spin paradox is resolved and full agreement between the proton and neutron Nilsson diagrams is established. The form of the Nilsson diagrams as a function of the quadrupole deformation remains unchanged, the only difference between the new diagrams and the traditional ones being the mutual exchange of the Nilsson labels for certain pairs of single-particle energy levels.
翻訳日:2023-04-09 14:44:30 公開日:2021-03-01
# フェルミ・ディラックとボース・アインシュタイン量子統計学の情報幾何

Information geometry for Fermi-Dirac and Bose-Einstein quantum statistics ( http://arxiv.org/abs/2103.00935v1 )

ライセンス: Link先を確認
Pedro Pessoa, Carlo Cafaro(参考訳) 情報幾何(英: information geometry)は、リーマン微分幾何学構造を確率分布の空間に割り当てることからなる確率論の創発的分岐である。 本稿では,フェルミ・ディラックとボース・アインシュタイン量子統計に基づく気体の幾何学的研究について述べる。 各量子気体について、大標準アンサンブルに付随する曲面統計多様体の情報幾何について検討する。 フィッシャー・ラオ情報計量とスカラー曲率は、非相互作用粒子のフェルミオンモデルとボソニックモデルの両方で計算される。 特に,情報幾何解析における理想的なボソニックガスの基底状態を考慮して,凝縮領域におけるスカラー曲率の特異な挙動が消失することを発見した。 これは、曲率は常に相転移によって発散するという長い予想の反例である。

Information geometry is an emergent branch of probability theory that consists of assigning a Riemannian differential geometry structure to the space of probability distributions. We present an information geometric investigation of gases following the Fermi-Dirac and the Bose-Einstein quantum statistics. For each quantum gas, we study the information geometry of the curved statistical manifolds associated with the grand canonical ensemble. The Fisher-Rao information metric and the scalar curvature are computed for both fermionic and bosonic models of non-interacting particles. In particular, by taking into account the ground state of the ideal bosonic gas in our information geometric analysis, we find that the singular behavior of the scalar curvature in the condensation region disappears. This is a counterexample to a long held conjecture that curvature always diverges in phase transitions.
翻訳日:2023-04-09 14:43:46 公開日:2021-03-01
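For readers unfamiliar with the objects named in this abstract, the following is a brief sketch of the standard definitions, not taken from the paper itself. The symbols $\theta^i$, $\lambda_i$, and $a^i(x)$ are notation assumed here for illustration, and sign conventions for the scalar curvature $R$ differ between references.

```latex
% Fisher-Rao information metric on a family of distributions p(x | \theta)
g_{ij}(\theta) = \int p(x \mid \theta)\,
  \frac{\partial \ln p(x \mid \theta)}{\partial \theta^i}\,
  \frac{\partial \ln p(x \mid \theta)}{\partial \theta^j}\, dx ,
\qquad
R = g^{ij} R_{ij} .

% For a Gibbs-type family p(x | \lambda) = \exp(-\lambda_i a^i(x)) / Z(\lambda),
% the metric reduces to the covariance of the sufficient statistics:
g_{ij}(\lambda) = \frac{\partial^2 \ln Z}{\partial \lambda_i\, \partial \lambda_j}
  = \langle a^i a^j \rangle - \langle a^i \rangle \langle a^j \rangle .
```

In a grand canonical ensemble one natural choice, assumed here, is $a = (E, N)$ with $\lambda = (\beta, -\beta\mu)$, so the metric components are energy and particle-number fluctuations.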
# 量子応用のための非線形デバイスの改良

Improved non-linear devices for quantum applications ( http://arxiv.org/abs/2103.00873v1 )

ライセンス: Link先を確認
Jano Gil-Lopez, Matteo Santandrea, Benjamin Brecht, Gana\"el Roland, Raimund Ricken, Viktor Quiring and Christine Silberhorn(参考訳) 本稿では,量子光学技術に適したモード選択型集積和周波数生成装置の現状を概観する。 我々は,その性能を評価するためのベンチマークを調査し,これらのデバイスの現在の限界を議論し,それらを克服するための戦略を概説する。 最後に、新しい改良されたデバイスの作製とその特性について述べる。 本装置の製作品質を分析し,量子応用のための非線形デバイスの改良に向けた次のステップについて考察する。

In this paper, we review the state of the art of mode-selective, integrated sum-frequency generation devices tailored for quantum optical technologies. We explore benchmarks to assess their performance and discuss the current limitations of these devices, outlining possible strategies to overcome them. Finally, we present the fabrication of a new, improved device and its characterization. We analyse the fabrication quality of this device and discuss the next steps towards improved non-linear devices for quantum applications.
翻訳日:2023-04-09 14:42:44 公開日:2021-03-01
# 電子励起のQMC計算のためのCIPSI展開の調整--チオフェンのケーススタディ

Tailoring CIPSI expansions for QMC calculations of electronic excitations: the case study of thiophene ( http://arxiv.org/abs/2103.01158v1 )

ライセンス: Link先を確認
Monika Dash and Saverio Moroni and Claudia Filippi and Anthony Scemama(参考訳) 摂動選択された構成相互作用スキーム (CIPSI) は、Jastrow-Slater波動関数を用いた量子モンテカルロ (QMC) シミュレーションの行列展開を構築するのに特に有効である: 基底状態特性の高速かつ滑らかな収束、および異なる対称性の基底状態と励起状態のバランスの取れた記述が報告されている。 特に、正確な励起エネルギーは、各状態に対する2階摂動補正を伴うcipsi展開を用いるという重要な要件、すなわち、完全な構成相互作用の制限に関して、推定誤差が類似していることから得られる。 ここでは、基底状態と同じ対称性の励起状態に対するCIPSI選択基準について詳述し、共通の軌道集合から膨張を生成する。 これらの拡張をジャストロウ・スレーター波動関数の行列式成分としてqmcのこれらの拡張を用いて、チオフェンの最低で明るい励起状態を計算する。 結果として生じる垂直励起エネルギーは、数千個の行列式しか持たない最高の理論推定値の0.05〜eV以内である。 さらに、変分モンテカルロの対応する根に続く基底状態と励起状態の構造を緩和し、0.01~\AAよりも正確な結合長を得る。 したがって、このシステムのCIPSIレベルでの完全な処理は非常に要求されるが、QMCでは、比較的少ない、よく選択された行列式を持つ安価なCIPSI展開上に構築される高品質な励起エネルギーと励起状態構造パラメータを計算できる。

The perturbatively selected configuration interaction scheme (CIPSI) is particularly effective in constructing determinantal expansions for quantum Monte Carlo (QMC) simulations with Jastrow-Slater wave functions: fast and smooth convergence of ground-state properties, as well as balanced descriptions of ground and excited states of different symmetries, have been reported. In particular, accurate excitation energies have been obtained by the pivotal requirement of using CIPSI expansions with similar second-order perturbation corrections for each state, that is, similar estimated errors with respect to the full configuration interaction limit. Here we elaborate on the CIPSI selection criterion for excited states of the same symmetry as the ground state, generating expansions from a common orbital set. Using these expansions in QMC as determinantal components of Jastrow-Slater wave functions, we compute the lowest, bright excited state of thiophene, which is challenging due to its significant multireference character. The resulting vertical excitation energies are within 0.05~eV of the best theoretical estimates, already with expansions of only a few thousand determinants. Furthermore, we relax the ground- and excited-state structures following the corresponding root in variational Monte Carlo and obtain bond lengths which are accurate to better than 0.01~\AA. Therefore, while the full treatment at the CIPSI level of this system would be quite demanding, in QMC we can compute high-quality excitation energies and excited-state structural parameters building on affordable CIPSI expansions with relatively few, well-chosen determinants.
翻訳日:2023-04-09 14:35:09 公開日:2021-03-01
# 人工知能の臨床受容に関する考察

Reflections on the Clinical Acceptance of Artificial Intelligence ( http://arxiv.org/abs/2103.01149v1 )

ライセンス: Link先を確認
Jens Schneider, Marco Agus(参考訳) 本章では,人工知能(AI)の使用と臨床環境における受容について考察する。 我々は,AIと臨床実践を組み合わせたパイプラインモデルの形で臨床受け入れの障害を概観する。 次に、各課題をパイプラインの関連するステージにリンクし、各課題を克服するために必要な要件を議論します。 私たちはこの議論を、現在臨床ワークフローの周辺で見られるaiの機会の概要と共に補完します。

In this chapter, we reflect on the use of Artificial Intelligence (AI) and its acceptance in clinical environments. We develop a general view of hindrances to clinical acceptance in the form of a pipeline model combining AI and clinical practice. We then link each challenge to the relevant stage in the pipeline and discuss the requirements for overcoming each challenge. We complement this discussion with an overview of opportunities for AI, which we currently see at the periphery of clinical workflows.
翻訳日:2023-04-09 14:34:40 公開日:2021-03-01
# 異種外界におけるdzyaloshinsky-moriya相互作用とハイゼンベルク模型の熱的コヒーレンス

Thermal coherence of the Heisenberg model with Dzyaloshinsky-Moriya interactions in an inhomogenous external field ( http://arxiv.org/abs/2103.01130v1 )

ライセンス: Link先を確認
Manikandan Parthasarathy, Segar Jambulingam, Tim Byrnes and Chandrashekar Radhakrishnan(参考訳) Dzyaloshinsky-Moriya (DM) 相互作用を持つ2サイトXYZモデルの外部磁場における量子コヒーレンスについて検討した。 DM相互作用, 磁場, 測定ベースは, それぞれ異なる方向を向くことができ, 有限温度での量子コヒーレンスについて検討する。 スピンスピン相互作用パラメータに関して、測定基底の方向がスピンスピン相互作用の方向と同じである場合、量子コヒーレンスが減少することがわかった。 スピン格子相互作用が変化すると、コヒーレンスは常にその方向と測定基底の関係に関係なく増加する。 また、外部不均一磁場の変動に基づく量子コヒーレンスの類似解析を行い、測定基底の方向が外部磁場と同じである場合、コヒーレンスが減少することを発見した。

The quantum coherence of the two-site XYZ model with Dzyaloshinsky-Moriya (DM) interactions in an external inhomogeneous magnetic field is studied. The DM interaction, the magnetic field and the measurement basis can be along different directions, and we examine the quantum coherence at finite temperature. With respect to the spin-spin interaction parameter, we find that the quantum coherence decreases when the direction of the measurement basis is the same as that of the spin-spin interaction. When the spin-lattice interaction is varied, the coherence always increases irrespective of the relation between its direction and the measurement basis. A similar analysis of quantum coherence based on the variation of the external inhomogeneous magnetic field is also carried out, where we find that the coherence decreases when the direction of the measurement basis is the same as that of the external field.
翻訳日:2023-04-09 14:34:16 公開日:2021-03-01
# 単一光子量子ハードウェア--量子優位性を持つスケーラブルフォトニック量子技術に向けて

Single-photon quantum hardware: towards scalable photonic quantum technology with a quantum advantage ( http://arxiv.org/abs/2103.01110v1 )

ライセンス: Link先を確認
Ravitej Uppu, Leonardo Midolo, Xiaoyan Zhou, Jacques Carolan, Peter Lodahl(参考訳) 量子ハードウェアのスケールアップは、情報科学における量子技術の破壊的可能性を実現するための根本的な課題である。 たくさんのハードウェアプラットフォームの中で、photonicsはモジュラーなアプローチで際立っており、その主な課題は、十分な高品質のビルディングブロックを構築し、それらを効率的にインターフェースする手法を開発することである。 重要なのは、その後のスケールアップにより、photonic foundryのインフラストラクチャが提供する成熟した統合フォトニック技術をフル活用して、膨大な複雑性を持つ小さなフットプリント量子プロセッサを作り出すことだ。 完全コヒーレントで決定論的な光子-エミッタ界面は、量子フォトニクスのキーイネーブルであり、今日では量子アドバンテージと呼ばれる定量ベンチマークに到達した仕様を持つ固体量子エミッタで実現可能である。 この光間相互作用プライマーは、オンデマンドの単光子および多光子絡み源や光子-光子非線形量子ゲートを含む、量子フォトニックリソースと機能の範囲を実現する。 我々は,単一光子量子ハードウェアの現在の最先端と,スケールアップに必要な主要なフォトニックビルディングブロックを提示する。 さらに、量子通信とフォトニック量子コンピューティングにおけるハードウェアビルディングブロックの、特定の有望な応用を指摘し、真の量子優位性を提供する量子フォトニクスアプリケーションへの道を切り開く。

The scaling up of quantum hardware is the fundamental challenge ahead in order to realize the disruptive potential of quantum technology in information science. Among the plethora of hardware platforms, photonics stands out by offering a modular approach, where the main challenge is to construct sufficiently high-quality building blocks and develop methods to efficiently interface them. Importantly, the subsequent scaling-up will make full use of the mature integrated photonic technology provided by photonic foundry infrastructure to produce small-footprint quantum processors of immense complexity. A fully coherent and deterministic photon-emitter interface is a key enabler of quantum photonics, and can today be realized with solid-state quantum emitters with specifications reaching the quantitative benchmark referred to as Quantum Advantage. This light-matter interaction primer realizes a range of quantum photonic resources and functionalities, including on-demand single-photon and multi-photon entanglement sources, and photon-photon nonlinear quantum gates. We will present the current state of the art in single-photon quantum hardware and the main photonic building blocks required in order to scale up. Furthermore, we will point out specific promising applications of the hardware building blocks within quantum communication and photonic quantum computing, laying out the road ahead for quantum photonics applications that could offer a genuine quantum advantage.
翻訳日:2023-04-09 14:33:48 公開日:2021-03-01
# 高等教育における新たなデジタル第三宇宙プロフェッショナルの台頭 : 研究ソフトウェア工学の認識

The Rise of a New Digital Third Space Professional in Higher Education: Recognising Research Software Engineering ( http://arxiv.org/abs/2103.01041v1 )

ライセンス: Link先を確認
Shoaib Sufi(参考訳) 研究ソフトウェア工学(research software engineering)は、専門的なソフトウェアスキルを研究に応用する分野である。 これをやる人は、Research Software EngineersまたはRSEと呼ばれる。 RSEは独立した機能を提供するのではなく、研究者と緊密に連携して機能する(例えば、従来のIT労働者や図書館員は、研究、教育、管理機能など、大学コミュニティに一般的なサービスを提供しようとしている)。 RSEとRSEの経営陣を高等教育部門における新しいタイプの第三の宇宙専門家にする学術研究者と重複している。 我々は、高等教育におけるアイデンティティの再構築において、Whitchurch氏が構築したモデルに関連して、知識、関係、正当性、言語に関する側面を探求し、これらがRSEの役割とどのように関係しているかを探求し、RSEをより堅固な組織基盤に置くために解決を必要とするオープンな問題を強調する。

Research Software Engineering is the application of professional software skills to research problems. Those who do this are called Research Software Engineers or RSEs for short. RSEs work closely with researchers in a collaborative fashion rather than just offering a standalone function (cf. the traditional IT workforce or Librarians working to provide a general set of services to the University community such as research, teaching or administrative functions). It is this overlap with academic researchers that makes the RSE and the RSE management a new type of third space professional in the higher education sector. We explore aspects of knowledge, relationships, legitimacies and language in relation to the model constructed by Whitchurch in Reconstructing Identities in Higher Education to explore how these relate to the RSE role and go on to highlight open problems that need resolving to put RSEs on a firmer organisational footing.
翻訳日:2023-04-09 14:33:15 公開日:2021-03-01
# 大規模非平衡量子力学のためのスケーラブルハミルトン学習

Scalable Hamiltonian learning for large-scale out-of-equilibrium quantum dynamics ( http://arxiv.org/abs/2103.01240v1 )

ライセンス: Link先を確認
Agnes Valenti and Guliuxin Jin and Julian L\'eonard and Sebastian D. Huber and Eliska Greplova(参考訳) 大規模量子デバイスは古典的なシミュレーションの範囲を超えて洞察を提供する。 しかし、信頼性が高く検証可能な量子シミュレーションでは、量子デバイスの構築ブロックは精巧なベンチマークを必要とする。 この大規模動的量子システムのベンチマークは、シミュレーションに効率的なツールがないために大きな課題となっている。 ここでは、非平衡量子系におけるハミルトントモグラフィーのためのニューラルネットワークに基づくスケーラブルなアルゴリズムを提案する。 我々は,光格子中の超低温原子のフォアフロント量子シミュレーションプラットフォームのためのモデルを用いて,我々のアプローチを説明する。 具体的には, 任意の大きさの擬1次元ボソニック系のハミルトニアンを, 到達可能な実験値を用いて再構成できることを示す。 既知パラメータの精度を著しく向上させることができる。

Large-scale quantum devices provide insights beyond the reach of classical simulations. However, for a reliable and verifiable quantum simulation, the building blocks of the quantum device require exquisite benchmarking. This benchmarking of large-scale dynamical quantum systems represents a major challenge due to the lack of efficient tools for their simulation. Here, we present a scalable algorithm based on neural networks for Hamiltonian tomography in out-of-equilibrium quantum systems. We illustrate our approach using a model for a forefront quantum simulation platform: ultracold atoms in optical lattices. Specifically, we show that our algorithm is able to reconstruct the Hamiltonian of an arbitrary-size quasi-1D bosonic system using an accessible amount of experimental measurements. We are able to significantly increase the previously known parameter precision.
翻訳日:2023-04-09 14:26:27 公開日:2021-03-01
# リンドブラッド方程式の構造保存数値スキーム

Structure-preserving numerical schemes for Lindblad equations ( http://arxiv.org/abs/2103.01194v1 )

ライセンス: Link先を確認
Yu Cao, Jianfeng Lu(参考訳) リンドブラッド方程式に対する構造保存決定論的数値スキームの族を研究し,詳細な誤差解析と絶対安定性解析を行った。 誤差解析と絶対安定性解析の両方を数値例で検証する。

We study a family of structure-preserving deterministic numerical schemes for Lindblad equations, and carry out detailed error analysis and absolute stability analysis. Both error and absolute stability analysis are validated by numerical examples.
翻訳日:2023-04-09 14:24:59 公開日:2021-03-01
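To make "structure-preserving" concrete, here is a minimal sketch, assuming a single damped qubit, that contrasts a forward-Euler step with a first-order Kraus-form step. It is a generic textbook-style illustration and not necessarily one of the schemes analysed in the paper; all function names (`lindblad_rhs`, `euler_step`, `kraus_step`) are hypothetical. The Kraus-form step keeps the density matrix positive semidefinite with unit trace for any step size, whereas forward Euler preserves the trace but can produce negative populations.

```python
# Minimal sketch (stdlib only): matrices are nested lists of complex numbers.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def lindblad_rhs(rho, h, jumps):
    """-i[H, rho] + sum_j (L rho L^dag - 1/2 {L^dag L, rho}), with hbar = 1."""
    out = scale(-1j, add(matmul(h, rho), scale(-1, matmul(rho, h))))
    for l in jumps:
        ldl = matmul(dagger(l), l)
        out = add(out, matmul(matmul(l, rho), dagger(l)))
        out = add(out, scale(-0.5, add(matmul(ldl, rho), matmul(rho, ldl))))
    return out

def euler_step(rho, h, jumps, dt):
    """Forward Euler: trace-preserving, but positivity can be lost."""
    return add(rho, scale(dt, lindblad_rhs(rho, h, jumps)))

def kraus_step(rho, h, jumps, dt):
    """First-order Kraus-form step, renormalised to unit trace.

    K0 = I + dt * (-iH - 1/2 sum L^dag L), K_j = sqrt(dt) * L_j.
    The map rho -> sum K rho K^dag is completely positive by construction.
    """
    g = scale(-1j, h)
    for l in jumps:
        g = add(g, scale(-0.5, matmul(dagger(l), l)))
    k0 = add(identity(len(rho)), scale(dt, g))
    new = matmul(matmul(k0, rho), dagger(k0))
    for l in jumps:
        new = add(new, scale(dt, matmul(matmul(l, rho), dagger(l))))
    return scale(1.0 / trace(new).real, new)

# Example: spontaneous decay of an excited qubit (index 1 = excited state),
# with a deliberately large time step to expose the difference.
H = [[0j, 0j], [0j, 0j]]
L = [[0j, 1.0 + 0j], [0j, 0j]]          # lowering operator, decay rate 1
rho0 = [[0j, 0j], [0j, 1.0 + 0j]]
rho_euler = euler_step(rho0, H, [L], dt=1.5)
rho_kraus = kraus_step(rho0, H, [L], dt=1.5)
```

With this large step, the Euler update yields an excited-state population of -0.5, while the Kraus-form update stays a valid state. Both are only first-order accurate; the point of the sketch is the preserved structure, not the accuracy.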
# アフリカにおけるデータ共有のナラティブとカウンターナラティブ

Narratives and Counternarratives on Data Sharing in Africa ( http://arxiv.org/abs/2103.01168v1 )

ライセンス: Link先を確認
Rediet Abebe, Kehinde Aruleba, Abeba Birhane, Sara Kingsley, George Obaido, Sekou L. Remy, Swathi Sadagopan(参考訳) 機械学習とデータサイエンスのアプリケーションがますます普及するにつれ、特にアフリカ大陸の文脈では、データ共有とオープンデータイニシアチブに焦点が当てられるようになっている。 多くの人は、データ共有はアフリカの貧困、不平等、デリバティブ効果を軽減するための研究と政策設計を支援することができると主張している。 問題のデータセットはアフリカのコミュニティから抽出されることが多いが、アフリカのデータにアクセスし共有することの難しさに関する議論は、非アフリカ人の利害関係者によって引き起こされることが多い。 これらの視点は、しばしば、データエコシステムにおける摩擦の主因として、大陸における教育、訓練、技術資源の欠如に重点を置いている。 これらの物語はアフリカのデータ共有環境の複雑さを邪魔し歪めていると我々は主張する。 特に、アフリカのデータ専門家との一連のインタビューから構築された架空の人物によるストーリーテリングを用いて、支配的な物語を複雑にし、反ナラティブを提供する。 これらのペルソナを大陸内のデータプラクティスの研究と組み合わせることで、データ共有のメリットの分散における不適切なだけでなく、データ共有に対する繰り返し発生する障壁を特定します。 特に、植民地主義、民族中心主義、奴隷制の正当性から生じる権力不均衡、信頼構築への不信感、歴史的・現代の抽出慣行の認識の欠如、アフリカの文脈に不適な西洋中心の政策について議論する。 これらの問題を概説した後、大陸で生成されたデータを共有する際に対処するための道筋について論じる。

As machine learning and data science applications grow ever more prevalent, there is an increased focus on data sharing and open data initiatives, particularly in the context of the African continent. Many argue that data sharing can support research and policy design to alleviate poverty, inequality, and derivative effects in Africa. Despite the fact that the datasets in question are often extracted from African communities, conversations around the challenges of accessing and sharing African data are too often driven by non-African stakeholders. These perspectives frequently employ deficit narratives, often focusing on lack of education, training, and technological resources in the continent as the leading causes of friction in the data ecosystem. We argue that these narratives obfuscate and distort the full complexity of the African data sharing landscape. In particular, we use storytelling via fictional personas built from a series of interviews with African data experts to complicate dominant narratives and to provide counternarratives. Coupling these personas with research on data practices within the continent, we identify recurring barriers to data sharing as well as inequities in the distribution of data sharing benefits. In particular, we discuss issues arising from power imbalances resulting from the legacies of colonialism, ethno-centrism, and slavery, disinvestment in building trust, lack of acknowledgement of historical and present-day extractive practices, and Western-centric policies that are ill-suited to the African context. After outlining these problems, we discuss avenues for addressing them when sharing data generated in the continent.
翻訳日:2023-04-09 14:24:49 公開日:2021-03-01
# 量子ネットワークのためのベンチマーク手順

A benchmarking procedure for quantum networks ( http://arxiv.org/abs/2103.01165v1 )

ライセンス: Link先を確認
Jonas Helsen and Stephanie Wehner(参考訳) 量子ネットワークにおける量子プロセッサを接続する量子ネットワークリンクの品質を効率よくベンチマークする手法として,ネットワークベンチマークを提案する。 この手順は標準のランダム化ベンチマークプロトコルに基づいており、量子ネットワークリンクの忠実度の推定を提供する。 本稿では,ノイズ量子ネットワークのための専用シミュレータnetsquidを用いて,nv中心系にインスパイアされたシミュレーション実装とプロトコルの統計解析を行う。

We propose network benchmarking: a procedure to efficiently benchmark the quality of a quantum network link connecting quantum processors in a quantum network. This procedure is based on the standard randomized benchmarking protocol and provides an estimate for the fidelity of a quantum network link. We provide statistical analysis of the protocol as well as a simulated implementation inspired by NV-center systems using Netsquid, a special purpose simulator for noisy quantum networks.
翻訳日:2023-04-09 14:24:25 公開日:2021-03-01
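The abstract above builds on standard randomized benchmarking, in which the average survival probability decays exponentially with sequence length. As a rough sketch of that generic analysis step only (the paper's actual estimator and decay model may differ), the snippet below fits $p(m) = A f^m + B$ with the offset $B$ assumed known, then converts the decay parameter to the average fidelity of a depolarizing channel. The function names and the synthetic data are hypothetical.

```python
import math

def fit_rb_decay(lengths, survival, offset):
    """Least-squares fit of log(p - offset) = log(A) + m * log(f).

    Assumes the asymptotic offset B is known, which turns the
    exponential fit into ordinary linear regression.
    """
    ys = [math.log(p - offset) for p in survival]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, ys))
    slope = sxy / sxx
    amplitude = math.exp(mean_y - slope * mean_x)
    return amplitude, math.exp(slope)

def depolarizing_fidelity(f, dim=2):
    """Average fidelity of rho -> f * rho + (1 - f) * I / dim."""
    return f + (1.0 - f) / dim

# Synthetic, noise-free data for illustration only.
lengths = [1, 2, 4, 8, 16, 32]
survival = [0.5 * 0.97 ** m + 0.5 for m in lengths]
amplitude, decay = fit_rb_decay(lengths, survival, offset=0.5)
fidelity = depolarizing_fidelity(decay)
```

With real data the offset is usually fitted as well and the points carry statistical noise, so a weighted nonlinear fit would be the more careful choice.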
# ネジ転位時の中性非相対論的粒子の電気四極子モーメント

Electric quadrupole moment of a neutral non-relativistic particle in the presence of screw dislocation ( http://arxiv.org/abs/2103.01163v1 )

ライセンス: Link先を確認
H. Hassanabadi, S. Zare, J. Kr\'i\v{z}, B. C. L\"utf\"uo\u{g}lu(参考訳) 本研究では、トポロジカルな欠陥(スクリュー転位)を有する弾性媒体中を運動するスピンレス粒子の電気四極子モーメントと、電場および磁場との相互作用について検討する。 この相互作用を考慮したシュレーディンガー方程式を解析的に厳密に解く。 これにより、2つの配置に対する固有関数とエネルギー固有値が得られる。 また、媒体中のスクリュー転位によって角運動量量子数にシフトが生じ、系のエネルギー固有値と波動関数が変化することを示す。

In this contribution, we investigate the interaction of electric and magnetic fields with the electric quadrupole moment of a spinless particle moving in an elastic medium which has a topological defect (screw dislocation). By considering this interaction, the Schr\"odinger equation is solved exactly by analytical methods. Thus, the eigenfunctions and energy eigenvalues for two configurations are found. Moreover, the screw dislocation in the medium produces a shift in the angular momentum quantum number, which modifies the energy eigenvalues and the wave function of the system.
翻訳日:2023-04-09 14:24:18 公開日:2021-03-01
# 大規模多言語検索エンジンのための不偏文エンコーダ

Unbiased Sentence Encoder For Large-Scale Multi-lingual Search Engines ( http://arxiv.org/abs/2106.07719v1 )

ライセンス: Link先を確認
Mahdi Hajiaghayi, Monir Hajiaghayi, Mark Bolin(参考訳) 本稿では,クエリおよび文書エンコーダとして検索エンジンで使用可能な多言語文エンコーダを提案する。 この埋め込みにより、クエリとドキュメント間のセマンティックな類似性スコアが可能になり、ドキュメントのランク付けと関連性において重要な機能となる。 このようなカスタマイズされた文エンコーダをトレーニングするには、ユーザがクエリドキュメントクリックしたペアの形式でデータを検索するメリットがありますが、偏りがあるため、検索クリックデータに依存しすぎないようにしなくてはなりません。 検索データは短いクエリに対して大きく歪められており、長いクエリは小さく、しばしばうるさい。 目標は、すべてのケースで動作し、短いクエリと長いクエリの両方をカバーする、普遍的な多言語エンコーダを設計することだ。 我々は、異なる言語と翻訳データにおける多くの公開NLIデータセットを選択し、ユーザ検索データとともに、マルチタスクアプローチを用いて言語モデルを訓練する。 課題は、これらのデータセットがコンテンツ、サイズ、バランス比の点で均質ではないことである。 公開NLIデータセットは通常、正と負のペアの同じ部分に基づいて2文であるのに対し、ユーザ検索データは多文文書と正のペアのみを含むことができる。 マルチタスクトレーニングによって、これらのデータセットをすべて活用し、これらのタスク間の知識共有を活用できることを示す。

In this paper, we present a multi-lingual sentence encoder that can be used in search engines as a query and document encoder. This embedding enables a semantic similarity score between queries and documents that can be an important feature in document ranking and relevancy. To train such a customized sentence encoder, it is beneficial to leverage user search data in the form of query-document clicked pairs; however, we must avoid relying too much on search click data, as it is biased and does not cover many unseen cases. The search data is heavily skewed towards short queries, and for long queries it is small and often noisy. The goal is to design a universal multi-lingual encoder that works for all cases and covers both short and long queries. We select a number of public NLI datasets in different languages and translation data and, together with user search data, we train a language model using a multi-task approach. A challenge is that these datasets are not homogeneous in terms of content, size and balance ratio. While the public NLI datasets are usually two-sentence based with equal proportions of positive and negative pairs, the user search data can contain multi-sentence documents and only positive pairs. We show how multi-task training enables us to leverage all these datasets and exploit knowledge sharing across these tasks.
翻訳日:2023-04-09 14:17:12 公開日:2021-03-01
# CLPVG:時系列の新しいネットワークモデルとしての円限定透過可視グラフ

CLPVG: Circular limited penetrable visibility graph as a new network model for time series ( http://arxiv.org/abs/2104.13772v1 )

ライセンス: Link先を確認
Qi Xuan, Jinchao Zhou, Kunfeng Qiu, Dongwei Xu, Shilian Zheng and Xiaoniu Yang(参考訳) 可視性グラフ(VG)は時系列をグラフに変換し、高度なグラフデータマイニングアルゴリズムによる信号処理を容易にする。 本稿では,従来のlpvg法を基礎として,円周制限透過可視グラフ(clpvg)と呼ばれる新しい非線形マッピング手法を提案する。 典型的な時系列のグラフ上での次数分布とクラスタリング係数の検証により、我々のCLPVGが時系列の重要な特徴を効果的に捉え、従来のLPVGよりも優れたアンチノイズ能力を有することを示す。 実世界の無線信号と脳波の時系列データセット(EEG)に関する実験では、CLPVGが提供する構造的特徴はLPVGよりも時系列分類に有用であることが示唆され、精度が向上した。 また、この分類性能はサブグラフネットワーク(sgn)を採用することで構造的特徴拡張によりさらに向上することができる。 これらの結果はCLPVGモデルの有効性を検証した。

Visibility Graph (VG) transforms time series into graphs, facilitating signal processing by advanced graph data mining algorithms. In this paper, based on the classic Limited Penetrable Visibility Graph (LPVG) method, we propose a novel nonlinear mapping method named Circular Limited Penetrable Visibility Graph (CLPVG). The testing on degree distribution and clustering coefficient on the generated graphs of typical time series validates that our CLPVG is able to effectively capture the important features of time series and has better anti-noise ability than traditional LPVG. The experiments on real-world time-series datasets of radio signal and electroencephalogram (EEG) also suggest that the structural features provided by CLPVG, rather than LPVG, are more useful for time-series classification, leading to higher accuracy. And this classification performance can be further enhanced through structural feature expansion by adopting Subgraph Networks (SGN). All of these results validate the effectiveness of our CLPVG model.
翻訳日:2023-04-09 14:16:43 公開日:2021-03-01
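The abstract above does not spell out the circular criterion of CLPVG, so the sketch below illustrates only the classic baseline it builds on: the limited penetrable visibility graph (LPVG) with the usual straight-line visibility rule. Two samples are linked if at most `limit` intermediate samples reach or exceed the straight line between them; `limit=0` recovers the natural visibility graph. The function name `lpvg_edges` and the toy series are hypothetical.

```python
def lpvg_edges(series, limit=0):
    """Return the edge set of the limited penetrable visibility graph.

    Nodes are sample indices. Samples i < j are connected when the number
    of intermediate samples k that block the straight line from
    (i, series[i]) to (j, series[j]) is at most `limit`.
    """
    n = len(series)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            blocked = 0
            for k in range(i + 1, j):
                # Height of the sight line at position k.
                line = series[i] + (series[j] - series[i]) * (k - i) / (j - i)
                if series[k] >= line:
                    blocked += 1
                    if blocked > limit:
                        break
            if blocked <= limit:
                edges.add((i, j))
    return edges

def degrees(edges, n):
    """Node degrees, the basis for the degree distribution feature."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

toy = [1, 3, 2, 1, 3]
vg = lpvg_edges(toy, limit=0)     # natural visibility graph
lpvg = lpvg_edges(toy, limit=1)   # one blocking sample is tolerated
```

This brute-force version costs O(n^3) in the worst case, which is fine for short series; graph features such as the degree sequence from `degrees` can then be fed to a classifier.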
# 量子力学におけるユニタリ進化の場合のフロケの定理の簡単な証明

A simple proof of Floquet's theorem for the case of unitary evolution in quantum mechanics ( http://arxiv.org/abs/2104.07019v1 )

ライセンス: Link先を確認
J. D. D. Martin and A. N. Poertner(参考訳) 量子力学におけるユニタリ時間進化の特別な場合に対するフロケの定理の構成的証明を示す。 この証明は単純で、量子力学のコースでの研究に適している。

We present a constructive proof of Floquet's theorem for the special case of unitary time evolution in quantum mechanics. The proof is straightforward and suitable for study in courses on quantum mechanics.
翻訳日:2023-04-09 14:16:24 公開日:2021-03-01
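For orientation, the statement being proved can be summarised with one standard construction, sketched below under the assumptions $\hbar = 1$ and a finite-dimensional Hilbert space. This is not necessarily the argument used in the paper; the symbols $H_F$, $P(t)$, and $\varepsilon_k$ are notation chosen here.

```latex
% Setting: i \, dU/dt = H(t) U(t), \; U(0) = I, \; H(t + T) = H(t).
\begin{align}
  % Step 1: both sides solve the same equation and agree at t = 0.
  U(t + T) &= U(t)\, U(T), \\
  % Step 2: U(T) is unitary, so it has a spectral decomposition.
  U(T) &= \sum_k e^{-i \varepsilon_k T} \lvert k \rangle \langle k \rvert
        \equiv e^{-i H_F T},
  \qquad H_F = \sum_k \varepsilon_k \lvert k \rangle \langle k \rvert, \\
  % Step 3: define the periodic part and check its periodicity.
  P(t) &\equiv U(t)\, e^{i H_F t},
  \qquad
  P(t + T) = U(t)\, U(T)\, e^{i H_F T} e^{i H_F t} = P(t), \\
  % Conclusion: Floquet form of the propagator.
  U(t) &= P(t)\, e^{-i H_F t}.
\end{align}
```

The quasi-energies $\varepsilon_k$ are defined only modulo $2\pi/T$, which is why $H_F$ is not unique.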
# e-walletアプリケーションの容易性と有用性をテストするための技術受容モデル(TAM)と重要度・パフォーマンス分析(IPA)の実装

Implementation of Technology Acceptance Model (TAM) and Importance Performance Analysis (IPA) in Testing the Ease and Usability of E-wallet Applications ( http://arxiv.org/abs/2103.09049v1 )

ライセンス: Link先を確認
Dedi Saputra and Burcu G\"urb\"uz(参考訳) デジタル決済の革新は、現在コミュニティ、特に非現金決済の取引にますます必要とされている。 本研究の目的は、特にGoPayアプリケーションにおいて、e-walletデジタルウォレットサービスの容易性と有用性を知ることである。 本研究の人口はGO-JEKプラットフォーム上でのGo-Payサービスの利用者である。 本研究のサンプルは,既存の基準に基づくTAM (Modified Technology Acceptance Model) を用いて,西ジャワ州デポックで実施した124名のアンケートから得られた。 本研究におけるデータ処理は、Importance Performance Analysis (IPA)分析を用いる。 その結果,gap分析の結果,go-payユーザは現在のサービス品質に満足していないことがわかった。 IPA分析に基づいて、E-Wallet Go-Payの品質改善の優先度尺度を、ユーザの視点で最も優先度の高い尺度である、[1]、[4]、[5]、[6]をマップできる。 これら3つの項目は、ユーザの期待に応えるために、マネージャによって即座にアップグレードされなければなりません。 メンテナンスしなければならないGoPay E-Walletの成果やメリットとなる領域は、第2位、すなわち[2]と[3]です。 この説明から、一般的にE-Wallet GoPayサービスはサービスパフォーマンスを改善するために改善されなければならないと結論付けることができる。

Digital payment innovation is increasingly needed by the community, especially for non-cash payment transactions. The purpose of this research is to measure the ease of use and usefulness of e-wallet services, specifically the GoPay application. The population in this study consists of users of the Go-Pay service on the GO-JEK platform. The sample consisted of 124 respondents to questionnaires distributed in Depok, West Java, using a modified Technology Acceptance Model (TAM) based on existing references. The data were processed using Importance Performance Analysis (IPA). The gap analysis shows that, in general, Go-Pay users are not satisfied with the current service quality. Based on the IPA, the priority scale for improving E-Wallet Go-Pay quality can be mapped, where quadrant I is the highest priority from the user's perspective: [1], [4], [5], and [6]. These items must be upgraded immediately by the manager to meet user expectations. The areas that are achievements or advantages of the GoPay E-Wallet and must be maintained are in quadrant II, namely [2] and [3]. From this, it can be concluded that in general the E-Wallet GoPay service must be improved to raise its service performance.
翻訳日:2023-04-09 14:16:20 公開日:2021-03-01
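The quadrant mapping described in the abstract can be sketched as follows, assuming the common IPA convention in which the grand means of importance and performance split the plane, quadrant I means "high importance, low performance" (top priority), and quadrant II means "high importance, high performance" (maintain). The item names and scores below are made up for illustration and are not the study's data; `ipa_quadrants` is a hypothetical helper.

```python
def ipa_quadrants(scores):
    """Classify items into IPA quadrants and compute performance gaps.

    `scores` maps an item name to (importance, performance) mean ratings.
    Returns a dict: item -> (quadrant, gap), where gap = performance -
    importance; a negative gap means expectations are not met.
    """
    n = len(scores)
    mean_imp = sum(imp for imp, _ in scores.values()) / n
    mean_perf = sum(perf for _, perf in scores.values()) / n
    result = {}
    for item, (imp, perf) in scores.items():
        if imp >= mean_imp:
            quadrant = "II" if perf >= mean_perf else "I"    # maintain / fix first
        else:
            quadrant = "IV" if perf >= mean_perf else "III"  # overkill / low priority
        result[item] = (quadrant, perf - imp)
    return result

# Hypothetical questionnaire items on a 1-5 Likert scale.
example = {
    "top_up_is_easy": (4.5, 3.0),
    "payment_is_fast": (4.4, 4.2),
    "app_theme": (3.0, 3.1),
    "promo_banners": (3.1, 4.3),
}
classified = ipa_quadrants(example)
```

Here `top_up_is_easy` lands in quadrant I with a gap of -1.5, which is the kind of item the abstract says should be improved first.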
# アハルノフ・ボーム効果におけるゲージ決定と局所性の陰影

Gauge-underdetermination and shades of locality in the Aharonov-Bohm effect ( http://arxiv.org/abs/2103.02684v1 )

ライセンス: Link先を確認
Ruward A. Mulder(参考訳) 古典的電磁ポテンシャルは、アハロノフ・ボーム効果によって物理的に現実的であることが示される(私は「ポテンシャル論」と推測する)。 私はこの見解の歴史的・哲学的なプレゼンテーションを行い、これまでの文献よりも正確にその展望を評価します。 物理的実在としてポテンシャルを取ることは、プリマ・フェイシーを「ゲージ・アンダーグルーション」に導く: 異なるゲージの選択は、異なる物理的状態を表すものであり、それゆえ、異なる理論を表す。 次に、このテーマを、ポテンシャルの観点からのAB効果の基本的な洞察、すなわち、電場と磁場(私がワイド等価クラスと呼ぶ)に直接対応するゲージ同値類があまりに広すぎる、すなわち、ナロー等価クラスは追加の物理的自由度を符号化する:これらは多重連結空間においてのみ異なる役割を果たす。 説明力とゲージ対称性にはトレードオフがある。 この狭義の同値類は「局所的相互作用」の観点から説明を与えるが、信号局所性という意味では局所性は満たされない。 したがって、これらの狭い同値類の中でさえ区別されるようなデシデラタを求めることは知的に必須であり、例えば、そのような同値クラスのいくつかの要素を他のものよりも好む。 ベル局所性、局所相互作用ハミルトニアン、信号局所性など、様々な局所性の定式化を考える。 ベル局所性はゲージの自由を完全に修正した場合にのみ評価できることを示します。 しかし、信号の局所性の観点からはロレンツゲージで説明できる:ポテンシャルは有限速で波に伝播する。 したがって、ロレンツゲージポテンシャル理論 -- より狭いゲージ同値関係 -- を電気力学のオントロジーとして提案する。

I address the view that the classical electromagnetic potentials are shown by the Aharonov-Bohm effect to be physically real (which I dub: 'the potentials view'). I give a historico-philosophical presentation of this view and assess its prospects, more precisely than has so far been done in the literature. Taking the potential as physically real runs prima facie into 'gauge-underdetermination': different gauge choices represent different physical states of affairs and hence different theories. I then illustrate this theme by what I take to be the basic insight of the AB effect for the potentials view, namely that the gauge equivalence class that directly corresponds to the electric and magnetic fields (which I call the Wide Equivalence Class) is too wide, i.e., the Narrow Equivalence Class encodes additional physical degrees of freedom: these only play a distinct role in a multiply-connected space. There is a trade-off between explanatory power and gauge symmetries. Although this narrower equivalence class gives an explanation in terms of 'local interactions', locality is not satisfied in the sense of signal locality. It is therefore intellectually mandatory to seek desiderata that will distinguish even within these narrower equivalence classes, i.e. will prefer some elements of such an equivalence class over others. I consider various formulations of locality, such as Bell locality, local interaction Hamiltonians, and signal locality. I show that Bell locality can only be evaluated if one fixes the gauge freedom completely. Yet, an explanation in terms of signal locality can be accommodated by the Lorenz gauge: the potentials propagate in waves at finite speed. I therefore suggest the Lorenz gauge potentials theory -- an even narrower gauge equivalence relation -- as the ontology of electrodynamics.
翻訳日:2023-04-09 14:15:56 公開日:2021-03-01
# 新型コロナウイルス対ソーシャルメディアアプリ:プライバシーは本当に重要か?

COVID-19 vs Social Media Apps: Does Privacy Really Matter? ( http://arxiv.org/abs/2103.01779v1 )

ライセンス: Link先を確認
Omar Haggag, Sherif Haggag, John Grundy, Mohamed Abdelrazek(参考訳) 世界中の多くの人々が、新型コロナウイルス(covid-19)の接触追跡モバイルアプリの使用やダウンロードを心配している。 主な懸念事項はプライバシーと倫理の問題である。 同時に、新型コロナウイルス(COVID-19)アプリと同じようなプライバシー上の懸念もなく、パンデミックの間、人々は自発的にソーシャルメディアアプリをかなり高い速度で使用しています。 こうした異常な振る舞いをよりよく理解するために、最もよく使われている新型コロナウイルス、ソーシャルメディア、生産性アプリに関するプライバシーポリシー、条件、データ利用契約を分析しました。 また、これらのアプリの200万近いユーザレビューを抽出し分析するツールも開発しました。 以上の結果から、ソーシャルメディアと生産性アプリは、新型コロナウイルス(COVID-19)のアプリの大半に比べて、プライバシーと倫理上の問題がかなり高いことが分かる。 驚いたことに、多くの人がユーザーのレビューで、プライバシーはソーシャルメディアアプリよりもCOVID-19アプリの方が扱いやすいと感じている。 一方、新型コロナウイルス(COVID-19)のアプリの多くは、ほとんどのソーシャルメディアアプリに比べてアクセスしやすく、安定していない。 このパンデミックを効果的に対処するためには、医療関係者や技術者が、新型コロナウイルスのアプリ行動や信頼性に関する人々の認識を高める必要がある。 これにより、covid-19アプリの理解を深め、これらのアプリをダウンロードして使用するよう促すことができます。 さらに、さまざまな社会や文化からの幅広いユーザーがこれらのアプリにアクセスできるように、COVID-19アプリにはアクセシビリティーの強化が必要とされる。

Many people around the world are worried about using or even downloading COVID-19 contact tracing mobile apps. The main reported concerns are centered around privacy and ethical issues. At the same time, people are voluntarily using Social Media apps at a significantly higher rate during the pandemic without similar privacy concerns compared with COVID-19 apps. To better understand these seemingly anomalous behaviours, we analysed the privacy policies, terms & conditions and data use agreements of the most commonly used COVID-19, Social Media & Productivity apps. We also developed a tool to extract and analyse nearly 2 million user reviews for these apps. Our results show that Social Media & Productivity apps actually have substantially higher privacy and ethical issues compared with the majority of COVID-19 apps. Surprisingly, many people indicated in their user reviews that they feel more secure because their privacy is better handled in COVID-19 apps than in Social Media apps. On the other hand, most of the COVID-19 apps are less accessible and stable compared to most Social Media apps, which negatively impacted their store ratings and led users to uninstall COVID-19 apps more frequently. Our findings suggest that in order to effectively fight this pandemic, health officials and technologists will need to better raise awareness among people about COVID-19 app behaviour and trustworthiness. This will allow people to better understand COVID-19 apps and encourage them to download and use these apps. Moreover, COVID-19 apps need many accessibility enhancements to allow a wider range of users from different societies and cultures to access these apps.
翻訳日:2023-04-09 14:15:25 公開日:2021-03-01
# 朝か夜か? CS1学生の概日リズムの検討

Morning or Evening? An Examination of Circadian Rhythms of CS1 Students ( http://arxiv.org/abs/2103.01752v1 )

ライセンス: Link先を確認
Albina Zavgorodniaia, Raj Shrestha, Juho Leinonen, Arto Hellas and John Edwards(参考訳) 概日リズム(circadian rhythms)は、睡眠時と活動時の制御において重要な役割を果たす体内時計のサイクルである。 関連する概念はクロノタイプ(Chronotype)であり、これは特定の時間における活動に対する人の自然な傾向であり、個人が最も警戒的で生産的なときに通常支配する。 本研究では,導入型コンピュータプログラミング(cs1)コースの設定におけるクロノタイプについて検討する。 生徒から収集されたキーストロークデータを用いて教師なし学習によるクロノタイプの存在を調べる。 文献で報告された典型的な集団の年代型と一致し,その成果は特定の年代型と学術的成果との相関を支持する。 また、コンピュータプログラマが夜のフクロウとしてまだ人気があるステレオタイプのサポートがないこともわかりました。 分析は、2つの大学(米国とヨーロッパ)のデータに基づいて行われ、それぞれ異なる指導方法を使っている。 これら2つの文脈を比較して,先延ばしと努力の観点から,学生間のより良いプログラミングプラクティスを促進するプログラム割当設計と管理について考察する。

Circadian rhythms are the cycles of our internal clock that play a key role in governing when we sleep and when we are active. A related concept is chronotype, which is a person's natural tendency toward activity at certain times of day and typically governs when the individual is most alert and productive. In this work we investigate chronotypes in the setting of an Introductory Computer Programming (CS1) course. Using keystroke data collected from students, we investigate the existence of chronotypes through unsupervised learning. The chronotypes we find align with those of typical populations reported in the literature, and our results support correlations of certain chronotypes to academic achievement. We also find a lack of support for the still-popular stereotype of a computer programmer as a night owl. The analyses are conducted on data from two universities, one in the US and one in Europe, that use different teaching methods. Comparing the two contexts, we look into programming assignment design and administration that may promote better programming practices among students in terms of procrastination and effort.
翻訳日:2023-04-09 14:14:59 公開日:2021-03-01
# 地磁気における全光学アルカリ水蒸気磁気センサのヘッド誤差

Heading errors in all-optical alkali-vapor magnetometers in geomagnetic fields ( http://arxiv.org/abs/2103.01358v1 )

ライセンス: Link先を確認
W. Lee, V. G. Lucivero, M. V. Romalis, M. E. Limes, E. L. Foley, T. W. Kornack(参考訳) アルカリ金属原子磁気センサは、測定された磁場が磁場に対するセンサーの向きに依存するため、地磁気の方向誤差に悩まされる。 非線形ゼーマン分裂に加えて、2つの超微細基底状態におけるゼーマン共鳴の差は、初期スピン偏極に依存する方向誤差を生じさせる。 偏光された$^{87}\text{Rb}$原子の自由偏光を利用した全光学スカラー磁力計の方向誤差を、スピン偏光状態の異なる磁場の方向と大きさを変化させることで検討する。 高偏極限界において、下限の超微細基底状態$F = 1$がほとんど非人口化されている場合、方向誤差は解析式で補正でき、地球の磁場の2桁の誤差を減少させることができる。 また,計測されたゼーマン偏差周波数の磁場による線形性を検証する。 スピン偏極が小さくなると、2つの超微粒子状態に対するゼーマン共鳴の分裂は、測定された偏極周波数と磁場との非線型性に拍動を引き起こす。 我々は、2つの直交プローブビームがスピン沈降中の2つの超微細状態間の相対位相を測るユニークなプローブ幾何を用いて周波数シフトを補正する。

Alkali-metal atomic magnetometers suffer from heading errors in geomagnetic fields as the measured magnetic field depends on the orientation of the sensor with respect to the field. In addition to the nonlinear Zeeman splitting, the difference between Zeeman resonances in the two hyperfine ground states can also generate heading errors depending on initial spin polarization. We examine heading errors in an all-optical scalar magnetometer that uses free precession of polarized $^{87}\text{Rb}$ atoms by varying the direction and magnitude of the magnetic field at different spin polarization regimes. In the high polarization limit where the lower hyperfine ground state $F = 1$ is almost depopulated, we show that heading errors can be corrected with an analytical expression, reducing the errors by two orders of magnitude in Earth's field. We also verify the linearity of the measured Zeeman precession frequency with the magnetic field. With lower spin polarization, we find that the splitting of the Zeeman resonances for the two hyperfine states causes beating in the precession signals and nonlinearity of the measured precession frequency with the magnetic field. We correct for the frequency shifts by using the unique probe geometry where two orthogonal probe beams measure opposite relative phases between the two hyperfine states during the spin precession.
翻訳日:2023-04-09 14:14:42 公開日:2021-03-01
# 要件とソフトウェアエンジニアリングに対する価値の潜在的影響の調査

Investigating the potential impact of values on requirements and software engineering ( http://arxiv.org/abs/2103.01309v1 )

ライセンス: Link先を確認
Alistair Sutcliffe, Pete Sawyer, Wei Liu, Nelly Bencomo(参考訳) 本稿では,価値に基づくソフトウェア工学に関する調査を行い,設計的特徴を解釈した総合的価値分類法を提案する。 価値分類法は、Covid19症状トラッカーアプリケーションの設計を評価するために用いられる。

This paper describes an investigation into value-based software engineering and proposes a comprehensive value taxonomy with an interpretation of design feature implications. The value taxonomy is used to assess the design of Covid19 symptom tracker applications.
翻訳日:2023-04-09 14:14:19 公開日:2021-03-01
# スピンロック法による低磁場におけるJ結合分光

Homonuclear J-Coupling Spectroscopy at Low Magnetic Fields using Spin-Lock Induced Crossing ( http://arxiv.org/abs/2103.01289v1 )

ライセンス: Link先を確認
Stephen J. DeVience, Mason Greer, Soumyajit Mandal, Matthew S. Rosen(参考訳) 核磁気共鳴分光法(NMR)は通常、異なるプロトン種間でスペクトル分解を行うために高磁場を必要とする。 低磁場では、化学シフトの分散は種を分離するには不十分であり、スペクトルは単一の線しか示さない。 本研究では、スピンロック誘導交差(SLIC)と呼ばれる新しいパルスシーケンスを用いて、低磁場でスペクトルを取得できることを実証する。 これは弱いスピンロックパルスによって誘導されるエネルギーレベルの交差をプローブし、ほとんどの有機分子に対してユニークなJカップリングスペクトルを生成する。 他の低磁場Jカップリング分光法とは異なり、我々の技術はヘテロ核の存在を必要とせず、自然界のほとんどの化合物に使用できる。 我々は276kHzと20.8MHzの小さな分子上でSLIC分光を行い、SLICスペクトルは測定値とよく一致してシミュレート可能であることを示した。

Nuclear magnetic resonance (NMR) spectroscopy usually requires high magnetic fields to create spectral resolution among different proton species. At low fields, chemical shift dispersion is insufficient to separate the species, and the spectrum exhibits just a single line. In this work, we demonstrate that spectra can nevertheless be acquired at low field using a novel pulse sequence called spin-lock induced crossing (SLIC). This probes energy level crossings induced by a weak spin-locking pulse and produces a unique J-coupling spectrum for most organic molecules. Unlike other forms of low-field J-coupling spectroscopy, our technique does not require the presence of heteronuclei and can be used for most compounds in their native state. We performed SLIC spectroscopy on a number of small molecules at 276 kHz and 20.8 MHz, and we show that SLIC spectra can be simulated in good agreement with measurements.
翻訳日:2023-04-09 14:14:15 公開日:2021-03-01
# 性能のスパイク:量子化活性化関数を用いたハイブリッドスポーキングニューラルネットワークのトレーニング

A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions ( http://arxiv.org/abs/2002.03553v2 )

ライセンス: Link先を確認
Aaron R. Voelker and Daniel Rasmussen and Chris Eliasmith(参考訳) 機械学習コミュニティは、ニューラルネットワークのエネルギー効率に対する関心がますます高まっている。 スパイクニューラルネットワーク(snn)は、その活性化レベルが時間的にスパースな1ビットの値(すなわち「スパイク」イベント)に量子化され、重み付き製品に対する和を単純な重み付け(スパイクごとに1つの重み付け)に変換するので、エネルギー効率の高いコンピューティングへの有望なアプローチである。 しかし、非スパイキングネットワークをSNNに変換する際の最先端(SotA)の精度を維持するという目標は、主に1ビットの精度しか持たないスパイクのため、明らかに困難な課題のままである。 信号処理のツールを応用し、時間的に拡散した誤差を持つ量子化器としてニューラルアクティベーション機能を投入し、非スパイクとスパイクの仕組みをスムーズに補間しながらネットワークを訓練した。 本稿では,LSTM,GRU,NRUを含むSNNの繰り返しアーキテクチャを,平均3.74ビット,各重みを1.26ビットに減らしながら,SNNがStAの繰り返しアーキテクチャを高速化する最初の例を,レジェンダメモリユニット(LMU)に適用する。 ニューラルネットワークのエネルギー効率を大幅に向上させる方法について論じる。

The machine learning community has become increasingly interested in the energy efficiency of neural networks. The Spiking Neural Network (SNN) is a promising approach to energy-efficient computing, since its activation levels are quantized into temporally sparse, one-bit values (i.e., "spike" events), which additionally converts the sum over weight-activity products into a simple addition of weights (one weight for each spike). However, the goal of maintaining state-of-the-art (SotA) accuracy when converting a non-spiking network into an SNN has remained an elusive challenge, primarily due to spikes having only a single bit of precision. Adopting tools from signal processing, we cast neural activation functions as quantizers with temporally-diffused error, and then train networks while smoothly interpolating between the non-spiking and spiking regimes. We apply this technique to the Legendre Memory Unit (LMU) to obtain the first known example of a hybrid SNN outperforming SotA recurrent architectures -- including the LSTM, GRU, and NRU -- in accuracy, while reducing activities to at most 3.74 bits on average with 1.26 significant bits multiplying each weight. We discuss how these methods can significantly improve the energy efficiency of neural networks.
翻訳日:2023-01-02 08:18:48 公開日:2021-03-01
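
As a rough illustration of what "a quantizer with temporally-diffused error" can mean, the sketch below carries the rounding error of a constant activation from one time step to the next, so that the emitted integer spike counts average out to the real-valued activation over time. It is not the authors' implementation; the function name `diffuse_quantize` and the restriction to a single non-negative scalar activation are assumptions made for illustration.

```python
import math

def diffuse_quantize(activation, steps):
    """Quantize a constant non-negative activation over `steps` time steps.

    At each step the activation is added to a running residual, the integer
    part is emitted as a spike count, and the fractional part is carried
    forward so that the rounding error is spread over time.
    """
    residual = 0.0
    spikes = []
    for _ in range(steps):
        residual += activation
        count = math.floor(residual)   # number of spikes emitted this step
        residual -= count              # keep the quantization error for later
        spikes.append(count)
    return spikes

if __name__ == "__main__":
    out = diffuse_quantize(0.3, 1000)
    # The time-averaged output approaches the real-valued activation.
    print(sum(out) / len(out))
```

For an activation below 1 every step emits either zero or one spike, and the time average differs from the activation by at most about one spike divided by the number of steps.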
# カーネル固定点回避: ELU と GELU の無限ネットワークによる計算

Avoiding Kernel Fixed Points: Computing with ELU and GELU Infinite Networks ( http://arxiv.org/abs/2002.08517v3 )

ライセンス: Link先を確認
Russell Tsuchida, Tim Pearce, Chris van der Heide, Fred Roosta, Marcus Gallagher(参考訳) 無限に広いニューラルネットワークから生じるガウス過程の分析と計算は、最近人気が回復した。 それにもかかわらず、現代のネットワークで使用される活性化関数を持つネットワークの多くの明示的な共分散関数は未だ不明である。 さらに、ディープ・ネットワークのカーネルは反復的に計算できるが、ディープ・カーネルの理論的理解は特に固定点力学に関して欠如している。 まず,指数線形単位 (ELU) とガウス誤差線形単位 (GELU) との多層パーセプトロン (MLP) の共分散関数を導出し,いくつかのベンチマークでガウス過程の性能を評価する。 第二に、より一般的には、幅広い活性化関数に対応する繰り返しカーネルの固定点ダイナミクスを分析する。 これまでに研究されたニューラルネットワークカーネルとは異なり、これらの新しいカーネルは有限幅ニューラルネットワークにミラーされる非自明な不動点ダイナミクスを示す。 いくつかのネットワークに存在する不動点挙動は、過パラメータ深層モデルにおける暗黙の正則化のメカニズムを説明する。 本研究は, 静的Iidパラメータ共役カーネルと動的ニューラルタンジェントカーネル構築に関するものである。 github.com/RussellTsuchida/ELU_GELU_kernelsのソフトウェア。

Analysing and computing with Gaussian processes arising from infinitely wide neural networks has recently seen a resurgence in popularity. Despite this, many explicit covariance functions of networks with activation functions used in modern networks remain unknown. Furthermore, while the kernels of deep networks can be computed iteratively, theoretical understanding of deep kernels is lacking, particularly with respect to fixed-point dynamics. Firstly, we derive the covariance functions of multi-layer perceptrons (MLPs) with exponential linear units (ELU) and Gaussian error linear units (GELU) and evaluate the performance of the limiting Gaussian processes on some benchmarks. Secondly, and more generally, we analyse the fixed-point dynamics of iterated kernels corresponding to a broad range of activation functions. We find that unlike some previously studied neural network kernels, these new kernels exhibit non-trivial fixed-point dynamics which are mirrored in finite-width neural networks. The fixed point behaviour present in some networks explains a mechanism for implicit regularisation in overparameterised deep models. Our results relate to both the static iid parameter conjugate kernel and the dynamic neural tangent kernel constructions. Software at github.com/RussellTsuchida/ELU_GELU_kernels.
翻訳日:2022-12-30 06:43:07 公開日:2021-03-01
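
To make the notion of iterated-kernel fixed points concrete, the sketch below iterates the layer-to-layer correlation map of an infinitely wide ReLU network (the normalized degree-1 arc-cosine kernel), which is one of the previously studied kernels rather than the ELU or GELU kernels derived in the paper. It assumes weight variances chosen so that input norms are preserved; the function names are illustrative only. For this map every starting correlation drifts toward the fixed point at 1, the kind of trivial behaviour the abstract contrasts with.

```python
import math

def relu_correlation_map(rho):
    """One layer of the normalized ReLU (degree-1 arc-cosine) kernel.

    Maps the correlation `rho` between two inputs at one layer to their
    correlation at the next layer of an infinitely wide ReLU network.
    """
    rho = max(-1.0, min(1.0, rho))     # guard against rounding outside [-1, 1]
    theta = math.acos(rho)
    return (math.sin(theta) + (math.pi - theta) * math.cos(theta)) / math.pi

def iterate_kernel(rho, depth):
    """Return the correlations after 0, 1, ..., depth layers."""
    trajectory = [rho]
    for _ in range(depth):
        rho = relu_correlation_map(rho)
        trajectory.append(rho)
    return trajectory

if __name__ == "__main__":
    for layer, rho in enumerate(iterate_kernel(0.0, 50)):
        if layer % 10 == 0:
            print(layer, round(rho, 4))
```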
# スケールでの教育的質問マイニング:予測・分析・パーソナライゼーション

Educational Question Mining At Scale: Prediction, Analysis and Personalization ( http://arxiv.org/abs/2003.05980v2 )

ライセンス: Link先を確認
Zichao Wang, Sebastian Tschiatschek, Simon Woodhead, Jose Miguel Hernandez-Lobato, Simon Peyton Jones, Richard G. Baraniuk, Cheng Zhang(参考訳) オンライン教育プラットフォームにより、教師は質問などの多くの教育資源を共有でき、演習や学生のためのクイズを作成できる。 利用可能な質問が大量にある場合には,その属性を定量化し,生徒にインテリジェントに選び,効果的かつパーソナライズされた学習体験を実現するための自動化方法が重要である。 本研究では,大規模に教育的な質問から洞察を抽出する枠組みを提案する。 本研究では,最新のベイズディープラーニング手法,特に部分変分自動エンコーダ(p-VAE)を用いて,学生の回答を大量の質問に対して分析する。 p-vaeに基づいて,質問の質と難易度をそれぞれ定量化する2つの新しい指標と,学生の質問を適応的に選択するパーソナライズ戦略を提案する。 提案したフレームワークを,数万の質問と数千万の回答をオンライン教育プラットフォームから収集した実世界のデータセットに適用する。 我々のフレームワークは、統計メトリクスの観点から有望な結果を示すだけでなく、ドメインの専門家の評価と高度に一貫した結果を得る。

Online education platforms enable teachers to share a large number of educational resources such as questions to form exercises and quizzes for students. With large volumes of available questions, it is important to have an automated way to quantify their properties and intelligently select them for students, enabling effective and personalized learning experiences. In this work, we propose a framework for mining insights from educational questions at scale. We utilize the state-of-the-art Bayesian deep learning method, in particular partial variational auto-encoders (p-VAE), to analyze real students' answers to a large collection of questions. Based on p-VAE, we propose two novel metrics that quantify question quality and difficulty, respectively, and a personalized strategy to adaptively select questions for students. We apply our proposed framework to a real-world dataset with tens of thousands of questions and tens of millions of answers from an online education platform. Our framework not only demonstrates promising results in terms of statistical metrics but also obtains highly consistent results with domain experts' evaluation.
翻訳日:2022-12-24 15:59:40 公開日:2021-03-01
# 横断脳波分類のためのメタ更新を用いた超効率的な転送学習

Ultra Efficient Transfer Learning with Meta Update for Cross Subject EEG Classification ( http://arxiv.org/abs/2003.06113v3 )

ライセンス: Link先を確認
Tiehang Duan, Mihir Chauhan, Mohammad Abuzar Shaikh, Jun Chu, Sargur Srihari(参考訳) 脳波(EEG)信号のパターンは、被験者によって大きく異なり、脳波分類器の課題となる。 1) 学習した分類器を新しい主題に効果的に適応させる。 2) 適応後の既知の対象に関する知識の保持。 そこで本研究では,脳波の連続的分類のための,Meta UPdate Strategy (MUPS-EEG) と呼ばれる効率的な伝達学習手法を提案する。 モデルはメタ更新を用いて効果的な表現を学習し、新しい主題への適応を加速し、前の主題に対する知識の忘れを同時に軽減する。 提案するメカニズムはメタ学習から始まり,動作する。 1) 異なる対象に広く適合する特徴表現を見つけること。 2) 高速適応のための損失関数の感度を最大化する。 この方法は、ディープラーニング指向モデルすべてに適用できる。 2つの公開データセットに関する広範囲な実験により、提案されたモデルの有効性が示され、新しい主題への適応と学習対象の知識の保持という両面で、現在の芸術の水準を大きく上回っている。

The pattern of electroencephalogram (EEG) signals differs significantly across subjects, which poses challenges for EEG classifiers in terms of 1) effectively adapting a learned classifier to a new subject and 2) retaining knowledge of known subjects after the adaptation. We propose an efficient transfer learning method, named Meta UPdate Strategy (MUPS-EEG), for continuous EEG classification across different subjects. The model learns effective representations with meta update, which accelerates adaptation to new subjects and at the same time mitigates forgetting of knowledge about previous subjects. The proposed mechanism originates from meta learning and works to 1) find a feature representation that is broadly suitable for different subjects and 2) maximize the sensitivity of the loss function for fast adaptation to a new subject. The method can be applied to all deep learning oriented models. Extensive experiments on two public datasets demonstrate the effectiveness of the proposed model, which outperforms the current state of the art by a large margin in terms of both adapting to new subjects and retaining knowledge of learned subjects.
翻訳日:2022-12-24 01:22:17 公開日:2021-03-01
# メタ擬似ラベル

Meta Pseudo Labels ( http://arxiv.org/abs/2003.10580v4 )

ライセンス: Link先を確認
Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le(参考訳) 本稿では,imagenetにおける最新のtop-1精度である90.2%を達成するための半教師付き学習手法であるmeta pseudo labelsを提案する。 Pseudo Labelsのように、Meta Pseudo Labelsは教師ネットワークを持ち、未ラベルのデータに擬似ラベルを生成して学生ネットワークを教える。 しかし、教師が固定されたPseudo Labelsとは異なり、Meta Pseudo Labelsの教師は、ラベル付きデータセット上での生徒のパフォーマンスのフィードバックによって常に適応される。 その結果、教師は生徒に教えるためのより良い擬似ラベルを生成する。 私たちのコードはhttps://github.com/google-research/google-research/tree/master/meta_pseudo_labelsで利用可能です。

We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art. Like Pseudo Labels, Meta Pseudo Labels has a teacher network to generate pseudo labels on unlabeled data to teach a student network. However, unlike Pseudo Labels where the teacher is fixed, the teacher in Meta Pseudo Labels is constantly adapted by the feedback of the student's performance on the labeled dataset. As a result, the teacher generates better pseudo labels to teach the student. Our code will be available at https://github.com/google-research/google-research/tree/master/meta_pseudo_labels.
翻訳日:2022-12-20 23:40:49 公開日:2021-03-01
# inferential statisticians (inferential statisticians) のレアイベント予測モデリング入門 - ブレークスルー特許の予測への応用-

Introduction to Rare-Event Predictive Modeling for Inferential Statisticians -- A Hands-On Application in the Prediction of Breakthrough Patents ( http://arxiv.org/abs/2003.13441v2 )

ライセンス: Link先を確認
Daniel Hain, Roman Jurowetzki(参考訳) 近年は定量的手法が大幅に発展し、主にコンピュータサイエンスコミュニティが主導し、予測モデリングに重点を置く機械学習アプリケーションの開発を目標にしている。 しかし、これまでのところ、経済、経営、技術予測の研究は予測モデリング技術やワークフローの適用をためらっている。 本稿では,予測性能の最適化を目的とした定量的分析のための機械学習(ML)アプローチを提案する。 この2つのフィールド間の潜在的なシナジーについて、一見すると目標非互換性の背景から議論する。 本稿では,モデル検証,変数選択,モデル選択,一般化,ハイパーパラメータチューニング手順といった予測モデリングの基本概念について述べる。 我々は,コンピュータサイエンスの用語のデミスティフィケーションを目指して,定量的社会科学の聴衆に手持ちの予測モデルの導入を行っている。 特許品質推定の例 — サイエントメトリックスコミュニティではおなじみのトピックであるべき – を使って,さまざまなモデルクラスとデータ前処理,モデリング,検証手順を通じて読者をガイドしています。 まず、モデルクラス(ロジットと弾性ネット)を解釈し、あまり馴染みのない非パラメトリックなアプローチ(分類木、ランダムフォレスト、勾配ブーストツリー)で継続し、最後に、単純なフィードフォワード、そしてレアイベント予測のためのディープオートエンコーダという、ニューラルネットワークアーキテクチャを提示します。

Recent years have seen a substantial development of quantitative methods, mostly led by the computer science community with the goal of developing better machine learning applications, mainly focused on predictive modeling. However, economic, management, and technology forecasting research has so far been hesitant to apply predictive modeling techniques and workflows. In this paper, we introduce a machine learning (ML) approach to quantitative analysis geared towards optimizing predictive performance, contrasting it with standard practices in inferential statistics, which focus on producing good parameter estimates. We discuss the potential synergies between the two fields against the backdrop of this, at first glance, target-incompatibility. We discuss fundamental concepts in predictive modeling, such as out-of-sample model validation, variable and model selection, generalization, and hyperparameter tuning procedures. We provide a hands-on predictive modeling introduction for a quantitative social science audience while aiming at demystifying computer science jargon. We use the illustrative example of patent quality estimation - which should be a familiar topic of interest in the Scientometrics community - guiding the reader through various model classes and procedures for data pre-processing, modeling, and validation. We start off with more familiar, easy-to-interpret model classes (Logit and Elastic Nets), continue with less familiar non-parametric approaches (Classification Trees, Random Forest, Gradient Boosted Trees), and finally present artificial neural network architectures, first a simple feed-forward network and then a deep autoencoder geared towards rare-event prediction.
翻訳日:2022-12-18 06:51:39 公開日:2021-03-01
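
Two of the concepts named in the abstract, out-of-sample validation and rare-event prediction, can be sketched in a few lines. The example below is a minimal illustration, not the workflow from the paper: `stratified_split` and `precision_recall` are hypothetical helper names, and the data are synthetic. It holds out a test set that preserves the share of the rare class and scores predictions with precision and recall, since plain accuracy is uninformative when positives are rare.

```python
import random

def stratified_split(labels, test_fraction, seed=0):
    """Split indices into train/test so each class keeps its share in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (rare) class, labeled 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

if __name__ == "__main__":
    labels = [1] * 10 + [0] * 990            # synthetic data, 1% positives
    train_idx, test_idx = stratified_split(labels, 0.2)
    y_test = [labels[i] for i in test_idx]
    always_negative = [0] * len(y_test)      # trivial baseline "model"
    accuracy = sum(t == p for t, p in zip(y_test, always_negative)) / len(y_test)
    print(accuracy, precision_recall(y_test, always_negative))
```

The trivial baseline that never predicts the rare class scores high accuracy on the held-out set while its recall is zero, which is why rare-event models are usually judged on class-specific metrics.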
# 固定経路サービスにおける混合燃料公共交通のエネルギー利用の最小化

Minimizing Energy Use of Mixed-Fleet Public Transit for Fixed-Route Service ( http://arxiv.org/abs/2004.05146v4 )

ライセンス: Link先を確認
Amutheezan Sivagnanam, Afiya Ayman, Michael Wilbur, Philip Pugliese, Abhishek Dubey, Aron Laszka(参考訳) 住民が雇用、教育、その他のサービスにアクセスできるようにするため、公共交通サービスはコミュニティにとって不可欠である。 残念なことに、広範囲をカバーする交通サービスは比較的低い利用率に悩まされがちで、1マイルあたりの乗客あたりの燃料消費が増加し、運用コストと環境への影響が高まる。 電気自動車(ev)はエネルギーコストと環境への影響を低減できるが、ほとんどの公共交通機関はevの先行コストが高いため、従来の内燃機関車両と組み合わせる必要がある。 このような混合車両を最大限に利用するためには、交通機関はルート割り当てと充電スケジュールを最適化する必要がある。 本稿では,既存の固定経路の運行スケジュールを守りながら,車両を走行旅行に割り当て,充電を予定することで,燃料・電気使用を最小化するための新しい問題定式化を提案する。 本稿では,最適割当とスケジューリングのための整数プログラムを提案し,大規模ネットワークを対象とした多項式時間ヒューリスティックおよびメタヒューリスティックアルゴリズムを提案する。 交通機関から収集した運用データを用いて, チャタヌーガの公共交通機関におけるアルゴリズムの評価を行った。 その結果,提案手法はスケーラブルであり,エネルギー消費を低減し,環境影響や運用コストを低減できることがわかった。 チャタヌーガにとって、提案されているアルゴリズムは年間145,635ドルのエネルギーコストと576.7トンのCO2排出量を節約できる。

Affordable public transit services are crucial for communities since they enable residents to access employment, education, and other services. Unfortunately, transit services that provide wide coverage tend to suffer from relatively low utilization, which results in high fuel usage per passenger per mile, leading to high operating costs and environmental impact. Electric vehicles (EVs) can reduce energy costs and environmental impact, but most public transit agencies have to employ them in combination with conventional, internal-combustion engine vehicles due to the high upfront costs of EVs. To make the best use of such a mixed fleet of vehicles, transit agencies need to optimize route assignments and charging schedules, which presents a challenging problem for large transit networks. We introduce a novel problem formulation to minimize fuel and electricity use by assigning vehicles to transit trips and scheduling them for charging, while serving an existing fixed-route transit schedule. We present an integer program for optimal assignment and scheduling, and we propose polynomial-time heuristic and meta-heuristic algorithms for larger networks. We evaluate our algorithms on the public transit service of Chattanooga, TN using operational data collected from transit vehicles. Our results show that the proposed algorithms are scalable and can reduce energy use and, hence, environmental impact and operational costs. For Chattanooga, the proposed algorithms can save $145,635 in energy costs and 576.7 metric tons of CO2 emission annually.
翻訳日:2022-12-14 21:20:00 公開日:2021-03-01
# 銃の音: 銃のオーディオサンプルのデジタル鑑定が人工知能と出会う

Sound of Guns: Digital Forensics of Gun Audio Samples meets Artificial Intelligence ( http://arxiv.org/abs/2004.07948v2 )

ライセンス: Link先を確認
Simone Raponi, Isra Ali, Gabriele Oligeri(参考訳) 銃口の爆発に基づく武器の分類は、様々な安全保障や軍事分野に重要な応用をもたらす難しい課題である。 既存の研究の多くは、同じ銃弾の複数のレプリカを捉えるために、空間的に多様なマイクロフォンセンサーのアドホックな展開に依存しており、音響源の正確な検出と同定を可能にしている。 しかし、犯罪現場鑑定などのシナリオでは、慎重に制御された設定は入手が困難であり、前述の手法は適用不可能で実用的ではない。 本稿では,マイクロホンとシューターの相対的な位置を全く意識せず,記録装置の知識をゼロにする新しい手法を提案する。 われわれのソリューションは、YouTubeビデオから抽出された3655サンプルからなるデータセットで90%以上の精度で、銃のカテゴリ、口径、モデルを特定することができる。 本研究は,畳み込みニューラルネットワーク(cnn)をショット分類に適用することにより,アドホック設定の必要性をなくし,分類性能を大幅に向上させる効果と効率を示す。

Classifying a weapon based on its muzzle blast is a challenging task that has significant applications in various security and military fields. Most of the existing works rely on ad-hoc deployment of spatially diverse microphone sensors to capture multiple replicas of the same gunshot, which enables accurate detection and identification of the acoustic source. However, carefully controlled setups are difficult to obtain in scenarios such as crime scene forensics, making the aforementioned techniques inapplicable and impractical. We introduce a novel technique that requires zero knowledge about the recording setup and is completely agnostic to the relative positions of both the microphone and shooter. Our solution can identify the category, caliber, and model of the gun, reaching over 90% accuracy on a dataset composed of 3655 samples that are extracted from YouTube videos. Our results demonstrate the effectiveness and efficiency of applying Convolutional Neural Network (CNN) in gunshot classification eliminating the need for an ad-hoc setup while significantly improving the classification performance.
翻訳日:2022-12-13 04:32:42 公開日:2021-03-01
# 語彙表現のための一意的特徴表現

A Unified Feature Representation for Lexical Connotations ( http://arxiv.org/abs/2006.00635v2 )

ライセンス: Link先を確認
Emily Allaway and Kathleen McKeown(参考訳) 思想的態度や姿勢は、しばしば言葉や句の微妙な意味を通して表現される。 これらの意味を理解することは、話者の文化的、感情的な視点を理解する上で重要である。 本稿では,名詞や形容詞の意味を表す新しい語彙資源を作成するために,遠隔ラベリングを用いる。 我々の分析によると、それは人間の判断とよく一致している。 さらに,埋め込み空間内の含意をキャプチャする語彙表現を作成する手法を提案し,データ制限時の姿勢検出のタスクにおいて統計的に有意な改善をもたらすことを示す。

Ideological attitudes and stance are often expressed through subtle meanings of words and phrases. Understanding these connotations is critical to recognizing the cultural and emotional perspectives of the speaker. In this paper, we use distant labeling to create a new lexical resource representing connotation aspects for nouns and adjectives. Our analysis shows that it aligns well with human judgments. Additionally, we present a method for creating lexical representations that captures connotations within the embedding space and show that using the embeddings provides a statistically significant improvement on the task of stance detection when data is limited.
翻訳日:2022-11-26 12:49:08 公開日:2021-03-01
# 最長経路長を用いた計画長境界計算

Computing Plan-Length Bounds Using Lengths of Longest Paths ( http://arxiv.org/abs/2006.01011v2 )

ライセンス: Link先を確認
Mohammad Abdulaziz and Dominik Berger(参考訳) 我々は、古典的な計画に現れる状態空間のように、分解された状態空間における最も長い単純な経路の長さを正確に計算する手法を考案する。 この問題の複雑さはNEXP-Hardであるが,本手法は計画長の上限値の計算に有効であることを示す。 計算された上界は, 従来の境界技術による境界よりも有意に(多くの場合, 桁数)良く, SATに基づく計画の改善に有効であることを示す。

We devise a method to exactly compute the length of the longest simple path in factored state spaces, such as those encountered in classical planning. Although the complexity of this problem is NEXP-Hard, we show that our method can be used to compute practically useful upper bounds on the lengths of plans. We show that the computed upper bounds are significantly (in many cases, orders of magnitude) better than bounds produced by previous bounding techniques and that they can be used to improve SAT-based planning.
翻訳日:2022-11-26 07:16:57 公開日:2021-03-01
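
The quantity being bounded can be illustrated on a tiny, explicitly enumerated state space. A shortest plan never visits the same state twice, so its length cannot exceed the length of the longest simple path. The sketch below computes that length by exhaustive depth-first search; it is only a brute-force illustration on an explicit graph, not the method of the paper, which targets compactly represented (factored) state spaces. The function name and the dictionary-based graph encoding are assumptions.

```python
def longest_simple_path(graph):
    """Length (in edges) of the longest simple path in a directed graph.

    `graph` maps each state to the list of its successor states.
    Exhaustive depth-first search; exponential time in the worst case.
    """
    def extend(state, visited):
        best = 0
        for nxt in graph.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                best = max(best, 1 + extend(nxt, visited))
                visited.remove(nxt)
        return best

    return max((extend(s, {s}) for s in graph), default=0)

if __name__ == "__main__":
    # Three states on a cycle with one shortcut, plus an isolated state.
    state_space = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
    print(longest_simple_path(state_space))
```

In this toy state space no plan needs more than two actions, because no simple path has more than two edges.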
# 畳み込み変分オートエンコーダを用いた完全教師なしダイバーシティDenoising

Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders ( http://arxiv.org/abs/2006.06072v2 )

ライセンス: Link先を確認
Mangal Prakash, Alexander Krull, Florian Jug(参考訳) ディープラーニングベースの手法は、事実上すべての画像復元タスクにおいて、不可解なリーダとして現れています。 特に顕微鏡画像の領域では、取得したデータの解釈性を改善するために様々なコンテンツ認識画像復元(CARE)アプローチが用いられている。 当然、破損した画像で復元できるものには制限があり、すべての逆問題と同様に、多くの潜在的な解決策が存在し、そのうちの1つを選択する必要がある。 本稿では,完全畳み込み型変分オートエンコーダ(vaes)に基づくデノジング手法であるdivnoisingを提案する。 まず, 撮像ノイズモデルをデコーダに明示的に組み込むことで, 教師なしの雑音発生問題をVAEフレームワーク内に定式化する手法を提案する。 提案手法は完全に教師なしであり,ノイズ画像と画像雑音分布の適切な記述のみを要求できる。 このようなノイズモデルを計測したり、ノイズデータからブートストラップしたり、トレーニング中に共同学習したりすることが可能であることを示す。 もし望めば、コンセンサス予測は一連の分割予測から推測でき、他の教師なしの方法と競合し、時には教師なしの状態でも結果が得られる。 後方からサンプルを分離することで、多くの有用な応用が可能になる。 私たちは (i)13データセットの復調結果を示す。 (II)光学式文字認識(OCR)の応用が様々な予測の恩恵を享受し得るか、また、 (iii)多様な分割予測を用いた場合のインスタンスセルのセグメンテーションの改善効果を示す。

Deep Learning based methods have emerged as the indisputable leaders for virtually all image restoration tasks. Especially in the domain of microscopy images, various content-aware image restoration (CARE) approaches are now used to improve the interpretability of acquired data. Naturally, there are limitations to what can be restored in corrupted images, and like for all inverse problems, many potential solutions exist, and one of them must be chosen. Here, we propose DivNoising, a denoising approach based on fully convolutional variational autoencoders (VAEs), overcoming the problem of having to choose a single solution by predicting a whole distribution of denoised images. First we introduce a principled way of formulating the unsupervised denoising problem within the VAE framework by explicitly incorporating imaging noise models into the decoder. Our approach is fully unsupervised, only requiring noisy images and a suitable description of the imaging noise distribution. We show that such a noise model can either be measured, bootstrapped from noisy data, or co-learned during training. If desired, consensus predictions can be inferred from a set of DivNoising predictions, leading to competitive results with other unsupervised methods and, on occasion, even with the supervised state-of-the-art. DivNoising samples from the posterior enable a plethora of useful applications. We are (i) showing denoising results for 13 datasets, (ii) discussing how optical character recognition (OCR) applications can benefit from diverse predictions, and are (iii) demonstrating how instance cell segmentation improves when using diverse DivNoising predictions.
翻訳日:2022-11-23 05:24:07 公開日:2021-03-01
# 暗黙のカーネル注意

Implicit Kernel Attention ( http://arxiv.org/abs/2006.06147v3 )

ライセンス: Link先を確認
Kyungwoo Song, Yohan Jung, Dongjun Kim, Il-Chul Moon(参考訳) Attention は表現間の依存関係を計算し、重要な選択機能にフォーカスするようモデルに促す。 トランスフォーマやグラフアテンションネットワーク(gat)などのアテンションベースモデルがシーケンシャルデータやグラフ構造化データに広く利用されている。 本稿では,Transformer と GAT における注目の新たな解釈と一般化構造を提案する。 Transformer と GAT の注目点については、注意が2つの部分の積であることから導かれる。 1)RBFカーネルは2つのインスタンスの類似性を測定する。 2) 個々のインスタンスの重要性を計算するための$L^{2}$ normの指数関数。 この分解から注意を3つの方法で一般化する。 まず,手動のカーネル選択ではなく,暗黙のカーネル機能による暗黙のカーネル注意を提案する。 次に、$L^{2}$ normを$L^{p}$ normとして一般化する。 第3に,マルチヘッドの構造化に注意を向ける。 一般的な注意は,分類,翻訳,回帰タスクにおいて優れた性能を示す。

Attention computes the dependency between representations, and it encourages the model to focus on the important selective features. Attention-based models, such as Transformer and graph attention network (GAT), are widely utilized for sequential data and graph-structured data. This paper suggests a new interpretation and generalized structure of the attention in Transformer and GAT. For the attention in Transformer and GAT, we derive that the attention is a product of two parts: 1) the RBF kernel to measure the similarity of two instances and 2) the exponential of $L^{2}$ norm to compute the importance of individual instances. From this decomposition, we generalize the attention in three ways. First, we propose implicit kernel attention with an implicit kernel function instead of manual kernel selection. Second, we generalize $L^{2}$ norm as the $L^{p}$ norm. Third, we extend our attention to structured multi-head attention. Our generalized attention shows better performance on classification, translation, and regression tasks.
翻訳日:2022-11-22 09:44:25 公開日:2021-03-01
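
The decomposition described in the abstract can be checked numerically for the scaled dot-product form used in the Transformer. Because $q \cdot k = \tfrac{1}{2}(\lVert q \rVert^2 + \lVert k \rVert^2 - \lVert q - k \rVert^2)$, the unnormalized weight $\exp(q \cdot k / \sqrt{d})$ factors into an RBF kernel term and two terms that depend only on the norms of the query and the key. The sketch below verifies this identity on arbitrary example vectors; the function names are illustrative, and the GAT case and the paper's generalizations are not covered.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_score(q, k):
    """Unnormalized scaled dot-product attention weight exp(q.k / sqrt(d))."""
    d = len(q)
    return math.exp(dot(q, k) / math.sqrt(d))

def decomposed_score(q, k):
    """Same weight written as an RBF kernel times two norm-based factors."""
    d = len(q)
    scale = 2.0 * math.sqrt(d)
    diff = [a - b for a, b in zip(q, k)]
    rbf = math.exp(-dot(diff, diff) / scale)   # similarity of q and k
    q_term = math.exp(dot(q, q) / scale)       # magnitude of the query
    k_term = math.exp(dot(k, k) / scale)       # magnitude of the key
    return rbf * q_term * k_term

if __name__ == "__main__":
    q = [0.5, -1.0, 2.0]
    k = [1.5, 0.25, -0.75]
    print(attention_score(q, k), decomposed_score(q, k))
```

The two printed values agree up to floating-point rounding, which is the sense in which attention is "a product" of a similarity term and magnitude terms.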
# 人間-MARLチームにおける人間とマルチエージェントのコラボレーション

Human and Multi-Agent collaboration in a human-MARL teaming framework ( http://arxiv.org/abs/2006.07301v2 )

ライセンス: Link先を確認
Neda Navidi, Francoi Chabo, Saga Kurandwa, Iv Lutigma, Vincent Robt, Gregry Szrftgr, Andea Schuh(参考訳) 強化学習は、観察、報酬、エージェント間の内的相互作用から学習するエージェントに効果的な結果を与える。 本研究では,学習の源泉として人間とエージェントの相互作用を効率的に活用するオープンソースMARLフレームワークであるCOGMENTを提案する。 我々は、RLエージェントによって駆動される無人航空機による設計されたリアルタイム環境を用いて、人間と協調してこれらのイノベーションを実証する。 本研究の結果から,提案する協調パラダイムとオープンソースフレームワークは,人的努力と探査費用の両面で大幅な削減につながることが明らかとなった。

Reinforcement learning provides effective results with agents learning from their observations, received rewards, and internal interactions between agents. This study proposes a new open-source MARL framework, called COGMENT, to efficiently leverage human and agent interactions as a source of learning. We demonstrate these innovations by using a designed real-time environment with unmanned aerial vehicles driven by RL agents, collaborating with a human. The results of this study show that the proposed collaborative paradigm and the open-source framework lead to significant reductions in both human effort and exploration costs.
翻訳日:2022-11-22 03:26:36 公開日:2021-03-01
# マルチタスク変動情報ボトルネック

Multi-Task Variational Information Bottleneck ( http://arxiv.org/abs/2007.00339v4 )

ライセンス: Link先を確認
Weizhu Qian, Bowei Chen, Yichao Zhang, Guanghui Wen and Franck Gechter(参考訳) マルチタスク学習(MTL)は、機械学習と人工知能において重要な課題である。 コンピュータビジョン、信号処理、音声認識への応用は至るところで行われている。 この課題は最近かなりの注目を集めているが、既存のモデルの異なるタスクに対するパフォーマンスと堅牢性はバランスが取れていない。 本稿では、変動情報ボトルネック(VIB)のアーキテクチャに基づくMTLモデルを提案する。 敵攻撃下での3つの公開データセットの広範囲な観測により、提案モデルは予測精度に関する最先端のアルゴリズムと競合することが示された。 実験結果から,VIBとタスク依存不確実性を組み合わせることは,複数のタスクを達成するための入力特徴から有効な情報を抽象化する上で極めて有効な方法であることが示唆された。

Multi-task learning (MTL) is an important subject in machine learning and artificial intelligence. Its applications to computer vision, signal processing, and speech recognition are ubiquitous. Although this subject has attracted considerable attention recently, the performance and robustness of the existing models to different tasks have not been well balanced. This article proposes an MTL model based on the architecture of the variational information bottleneck (VIB), which can provide a more effective latent representation of the input features for the downstream tasks. Extensive observations on three public data sets under adversarial attacks show that the proposed model is competitive with the state-of-the-art algorithms in terms of prediction accuracy. Experimental results suggest that combining the VIB and the task-dependent uncertainties is a very effective way to abstract valid information from the input features for accomplishing multiple tasks.
翻訳日:2022-11-14 22:09:15 公開日:2021-03-01
# グラフベース連続学習

Graph-Based Continual Learning ( http://arxiv.org/abs/2007.04813v2 )

ライセンス: Link先を確認
Binh Tang, David S. Matteson(参考訳) 大幅な進歩にもかかわらず、継続学習モデルは、非定常分布から段階的に利用可能なデータに露出した場合、破滅的な忘れを被る。 リハーサルアプローチは、しばしば独立したメモリスロットの配列として実装される以前のサンプルの小さなエピソディックメモリを維持して再生することで問題を緩和する。 そこで本研究では,学習可能なランダムグラフを用いて,サンプル間のペアの類似性を捕捉し,新しいタスクの学習だけでなく,忘れることの防止にも利用することを提案する。 いくつかのベンチマークデータセットの実証結果から,最近提案されたタスクフリー連続学習のベースラインを一貫して上回る結果が得られた。

Despite significant advances, continual learning models still suffer from catastrophic forgetting when exposed to incrementally available data from non-stationary distributions. Rehearsal approaches alleviate the problem by maintaining and replaying a small episodic memory of previous samples, often implemented as an array of independent memory slots. In this work, we propose to augment such an array with a learnable random graph that captures pairwise similarities between its samples, and use it not only to learn new tasks but also to guard against forgetting. Empirical results on several benchmark datasets show that our model consistently outperforms recently proposed baselines for task-free continual learning.
翻訳日:2022-11-12 03:31:17 公開日:2021-03-01
# 米国トウモロコシベルトの作物収量予測を改良した機械学習と作物のモデリング

Coupling Machine Learning and Crop Modeling Improves Crop Yield Prediction in the US Corn Belt ( http://arxiv.org/abs/2008.04060v2 )

ライセンス: Link先を確認
Mohsen Shahhosseini, Guiping Hu, Sotirios V. Archontoulis, Isaiah Huber(参考訳) 本研究では,米トウモロコシベルトにおける作物モデルと機械学習(ml)の結合がトウモロコシ収量予測を改善するかを検討する。 主な目的は、ハイブリッドアプローチ(crop modeling + ml)がより良い予測をもたらすかどうかを調べ、最も正確な予測を提供するハイブリッドモデルの組み合わせを調査し、トウモロコシ収量予測のためにmlと統合するのが最も効果的である作物モデリングから特徴を決定することである。 5つのMLモデル(線形回帰、LASSO、LightGBM、ランダムフォレスト、XGBoost)と6つのアンサンブルモデルが研究課題に対処するために設計されている。 その結果,MLモデルに入力特徴としてシミュレーション作物モデル変数(APSIM)を追加することで,収率予測根平均二乗誤差(RMSE)を7~20%削減できることがわかった。 さらに, ML予測モデルにAPSIMの特徴を部分的に含み, 土壌水分関連APSIM変数がML予測に最も影響していること, そして作物関連および表現学関連変数について検討した。 最後に, 特徴量から, 成長期におけるAPSIM平均干ばつ応力と平均水テーブル深さがMLへの最も重要なAPSIM入力であることがわかった。 この結果は、気象情報だけでは不十分であり、mlモデルは収量予測を改善するためにより多くの水文入力を必要とすることを示している。

This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybrid models provide the most accurate predictions, and determine the features from the crop modeling that are most effective to be integrated with ML for corn yield prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost) and six ensemble models have been designed to address the research question. The results suggest that adding simulation crop model variables (APSIM) as input features to ML models can decrease yield prediction root mean squared error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models and found that soil-moisture-related APSIM variables are most influential on the ML predictions, followed by crop-related and phenology-related variables. Finally, based on feature importance measures, we observed that simulated APSIM average drought stress and average water table depth during the growing season are the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient and ML models need more hydrological inputs to make improved yield predictions.
Translated: 2022-11-06 03:18:21 Published: 2021-03-01
# Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs ( http://arxiv.org/abs/2008.00311v3 )

License: see the linked page
Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil and Srinivas Shakkottai

Many physical systems have underlying safety considerations that require the employed policy to satisfy a set of constraints. The analytical formulation usually takes the form of a Constrained Markov Decision Process (CMDP). We focus on the case where the CMDP is unknown, and RL algorithms obtain samples to discover the model and compute an optimal constrained policy. Our goal is to characterize the relationship between safety constraints and the number of samples needed to ensure a desired level of accuracy (for both objective maximization and constraint satisfaction) in a PAC sense. We explore two classes of RL algorithms, namely, (i) a generative-model-based approach, wherein samples are taken initially to estimate a model, and (ii) an online approach, wherein the model is updated as samples are obtained. Our main finding is that, compared to the best known bounds of the unconstrained regime, the sample complexity of constrained RL algorithms is increased by a factor that is logarithmic in the number of constraints, which suggests that the approach may be easily utilized in real systems.
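
For orientation, a CMDP with $K$ constraints is commonly written as the constrained optimization below. This is only a sketch: the discounted infinite-horizon form and the symbols $\gamma$, $r$, $c_i$ and $\bar{c}_i$ are assumptions chosen for illustration, since the abstract does not fix a horizon or a notation.

$$
\max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t)\Big]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} c_i(s_t, a_t)\Big] \le \bar{c}_i, \qquad i = 1, \dots, K.
$$

In this notation, the stated finding is that the sample complexity grows by a factor logarithmic in $K$ relative to the unconstrained case.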
Translated: 2022-11-04 00:19:52 Published: 2021-03-01
# Learned convex regularizers for inverse problems

Learned convex regularizers for inverse problems ( http://arxiv.org/abs/2008.02839v2 )

License: see the linked page
Subhadip Mukherjee, Sören Dittmer, Zakhar Shumaylov, Sebastian Lunz, Ozan Öktem, and Carola-Bibiane Schönlieb

We consider the variational reconstruction framework for inverse problems and propose to learn a data-adaptive input-convex neural network (ICNN) as the regularization functional. The ICNN-based convex regularizer is trained adversarially to discern ground-truth images from unregularized reconstructions. Convexity of the regularizer is desirable since (i) one can establish analytical convergence guarantees for the corresponding variational reconstruction problem and (ii) devise efficient and provable algorithms for reconstruction. In particular, we show that the optimal solution to the variational problem converges to the ground-truth if the penalty parameter decays sub-linearly with respect to the norm of the noise. Further, we prove the existence of a sub-gradient-based algorithm that leads to a monotonically decreasing error in the parameter space with iterations. To demonstrate the performance of our approach for solving inverse problems, we consider the tasks of deblurring natural images and reconstructing images in computed tomography (CT), and show that the proposed convex regularizer is at least competitive with and sometimes superior to state-of-the-art data-driven techniques for inverse problems.
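
To make the convexity requirement concrete, here is a minimal sketch of an input-convex network in plain Python. The class name `TinyICNN`, the two-layer shape, and the random initialization are hypothetical and not taken from the paper; the only point illustrated is the standard ICNN constraint that weights acting on hidden activations are non-negative while the direct input connections are unconstrained.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def affine(weights, x, bias):
    # Row-wise (weights @ x + bias) for plain Python lists.
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class TinyICNN:
    """Hypothetical two-layer input-convex network (illustration only)."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # Unconstrained weights on the direct connections from the input x.
        self.w_x0 = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden)]
        self.b0 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.w_x1 = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden)]
        self.b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        # Non-negative weights on the hidden path keep the output convex in x.
        self.w_z = [[abs(rng.gauss(0.0, 1.0)) for _ in range(hidden)] for _ in range(hidden)]
        self.w_out = [abs(rng.gauss(0.0, 1.0)) for _ in range(hidden)]

    def __call__(self, x):
        z1 = relu(affine(self.w_x0, x, self.b0))            # convex in x
        skip = affine(self.w_x1, x, self.b1)                # affine in x
        hidden = affine(self.w_z, z1, [0.0] * len(skip))    # non-negative mix of convex terms
        z2 = relu([h + s for h, s in zip(hidden, skip)])    # still convex in x
        return sum(w * z for w, z in zip(self.w_out, z2))   # non-negative sum


def is_midpoint_convex(f, dim, trials=200, seed=1):
    # Numerically checks f((x + y) / 2) <= (f(x) + f(y)) / 2 on random pairs.
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
        y = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
        mid = [(a + b) / 2.0 for a, b in zip(x, y)]
        if f(mid) > (f(x) + f(y)) / 2.0 + 1e-9:
            return False
    return True
```

Because ReLU is convex and non-decreasing, each layer preserves convexity as long as the hidden-path weights stay non-negative, which is why the check above should pass for any seed.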
Translated: 2022-11-02 06:44:36 Published: 2021-03-01
# Adversarial Training and Provable Robustness: A Tale of Two Objectives

Adversarial Training and Provable Robustness: A Tale of Two Objectives ( http://arxiv.org/abs/2008.06081v3 )

License: see the linked page
Jiameng Fan, Wenchao Li

We propose a principled framework that combines adversarial training and provable robustness verification for training certifiably robust neural networks. We formulate the training problem as a joint optimization problem with both empirical and provable robustness objectives and develop a novel gradient-descent technique that can eliminate bias in stochastic multi-gradients. We perform both a theoretical analysis of the convergence of the proposed technique and an experimental comparison with state-of-the-art methods. Results on MNIST and CIFAR-10 show that our method can consistently match or outperform prior approaches for provable $\ell_\infty$ robustness. Notably, we achieve 6.60% verified test error on MNIST at $\epsilon = 0.3$, and 66.57% on CIFAR-10 with $\epsilon = 8/255$.
Translated: 2022-10-30 22:36:36 Published: 2021-03-01
# Estimating Causal Effects with the Neural Autoregressive Density Estimator

Estimating Causal Effects with the Neural Autoregressive Density Estimator ( http://arxiv.org/abs/2008.07283v2 )

License: see the linked page
Sergio Garrido, Stanislav S. Borysov, Jeppe Rich, Francisco C. Pereira

Estimation of causal effects is fundamental in situations where the underlying system will be subject to active interventions. Part of building a causal inference engine is defining how variables relate to each other, that is, defining the functional relationship between variables given conditional dependencies. In this paper, we deviate from the common assumption of linear relationships in causal models by making use of neural autoregressive density estimators and use them to estimate causal effects within Pearl's do-calculus framework. Using synthetic data, we show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables.
Translated: 2022-10-28 02:49:48 Published: 2021-03-01
# Control on the Manifolds of Mappings with a View to the Deep Learning

Control on the Manifolds of Mappings with a View to the Deep Learning ( http://arxiv.org/abs/2008.12702v2 )

License: see the linked page
Andrei Agrachev, Andrey Sarychev

Deep learning of artificial neural networks (ANNs) can be treated as a particular class of interpolation problems. The goal is to find a neural network whose input-output map approximates the desired map well on a finite or infinite training set. Our idea is to take as the approximant the input-output map that arises from a nonlinear continuous-time control system. In the limit, such a control system can be seen as a network with a continuum of layers, each labelled by the time variable. The values of the controls at each instant of time are the parameters of the layer.
Translated: 2022-10-24 02:41:36 Published: 2021-03-01
# Boosting Share Routing for Multi-task Learning

Boosting Share Routing for Multi-task Learning ( http://arxiv.org/abs/2009.00387v2 )

License: see the linked page
Xiaokai Chen and Xiaoguang Gu and Libo Fu

Multi-task learning (MTL) aims to make full use of the knowledge contained in multi-task supervision signals to improve the overall performance. How to share the knowledge of multiple tasks appropriately is an open problem for MTL. Most existing deep MTL models are based on parameter sharing. However, a suitable sharing mechanism is hard to design because the relationship among tasks is complicated. In this paper, we propose a general framework called Multi-Task Neural Architecture Search (MTNAS) to efficiently find a suitable sharing route for a given MTL problem. MTNAS modularizes the sharing part into multiple layers of sub-networks. It allows sparse connections among these sub-networks, and soft sharing based on gating is enabled for a given route. Benefiting from this setting, each candidate architecture in our search space defines a dynamic sparse sharing route, which is more flexible than the full sharing of previous approaches. We show that existing typical sharing approaches are sub-graphs in our search space. Extensive experiments on three real-world recommendation datasets demonstrate that MTNAS achieves consistent improvement compared with single-task models and typical multi-task methods while maintaining high computation efficiency. Furthermore, in-depth experiments demonstrate that MTNAS can learn a suitable sparse route to mitigate negative transfer.
Translated: 2022-10-23 00:44:40 Published: 2021-03-01
# Alternating Direction Method of Multipliers for Quantization

Alternating Direction Method of Multipliers for Quantization ( http://arxiv.org/abs/2009.03482v2 )

License: see the linked page
Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra and Meisam Razaviyayn

Quantization of the parameters of machine learning models, such as deep neural networks, requires solving constrained optimization problems, where the constraint set is formed by the Cartesian product of many simple discrete sets. For such optimization problems, we study the performance of the Alternating Direction Method of Multipliers for Quantization ($\texttt{ADMM-Q}$) algorithm, which is a variant of the widely-used ADMM method applied to our discrete optimization problem. We establish the convergence of the iterates of $\texttt{ADMM-Q}$ to certain $\textit{stationary points}$. To the best of our knowledge, this is the first analysis of an ADMM-type method for problems with discrete variables/constraints. Based on our theoretical insights, we develop a few variants of $\texttt{ADMM-Q}$ that can handle inexact update rules, and have improved performance via the use of "soft projection" and "injecting randomness to the algorithm". We empirically evaluate the efficacy of our proposed approaches.
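
As a rough illustration of how ADMM can be applied to a discrete constraint set, the sketch below quantizes a weight vector by splitting the variable into a continuous copy and a copy projected onto the allowed levels. It is a generic scaled-form ADMM with a nearest-level projection, not necessarily the exact $\texttt{ADMM-Q}$ updates of the paper; the quadratic objective $\tfrac{1}{2}\|x - w\|^2$, the function names, and the parameters `rho` and `iters` are assumptions made for this example.

```python
def project(value, levels):
    # Nearest allowed quantization level for one coordinate.
    return min(levels, key=lambda q: abs(q - value))


def admm_quantize(w, levels, rho=1.0, iters=50):
    """Generic ADMM sketch for: minimize 0.5 * ||x - w||^2 with every x_i in `levels`."""
    y = [project(v, levels) for v in w]   # discrete copy of the variable
    u = [0.0] * len(w)                    # scaled dual variable
    for _ in range(iters):
        # x-update: closed-form minimizer of 0.5*(x - w)^2 + (rho/2)*(x - y + u)^2.
        x = [(wi + rho * (yi - ui)) / (1.0 + rho) for wi, yi, ui in zip(w, y, u)]
        # y-update: projection onto the discrete set, coordinate by coordinate.
        y = [project(xi + ui, levels) for xi, ui in zip(x, u)]
        # Dual update accumulates the remaining disagreement between x and y.
        u = [ui + xi - yi for ui, xi, yi in zip(u, x, y)]
    return y
```

The constraint set here is a Cartesian product of the same small level set per coordinate, which is why the projection step decomposes into independent one-dimensional lookups.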
Translated: 2022-10-20 21:38:31 Published: 2021-03-01
# Critical analysis on the reproducibility of visual quality assessment using deep features

Critical analysis on the reproducibility of visual quality assessment using deep features ( http://arxiv.org/abs/2009.05369v3 )

License: see the linked page
Franz Götz-Hahn and Vlad Hosu and Dietmar Saupe

Data used to train supervised machine learning models are commonly split into independent training, validation, and test sets. This paper illustrates that complex data leakage cases have occurred in the no-reference image and video quality assessment literature. Recently, papers in several journals reported performance results well above the best in the field. However, our analysis shows that information from the test set was inappropriately used in the training process in different ways and that the claimed performance results cannot be achieved. When correcting for the data leakage, the performances of the approaches drop even below the state-of-the-art by a large margin. Additionally, we investigate end-to-end variations to the discussed approaches, which do not improve upon the original.
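
The abstract does not spell out the leakage mechanisms, so the following is only a sketch of one common safeguard: splitting by source group so that items derived from the same underlying content never land on both sides of the split. The function name `group_split` and its parameters are hypothetical.

```python
import random


def group_split(items, group_of, test_fraction=0.2, seed=0):
    """Split items so that every group falls entirely in train or entirely in test."""
    groups = sorted({group_of(item) for item in items})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(test_fraction * len(groups)))
    test_groups = set(groups[:n_test])
    train = [item for item in items if group_of(item) not in test_groups]
    test = [item for item in items if group_of(item) in test_groups]
    return train, test


# Hypothetical data: 10 source videos with 3 derived clips each.
clips = [("video%d" % (i // 3), i) for i in range(30)]
train_clips, test_clips = group_split(clips, group_of=lambda clip: clip[0])
```

Any later step that fits parameters, including fine-tuning a feature extractor or choosing hyperparameters, would then need to see only the training side for the reported test performance to be meaningful.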
Translated: 2022-10-20 03:01:14 Published: 2021-03-01
# BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation ( http://arxiv.org/abs/2009.07641v5 )

License: see the linked page
Haisheng Su, Weihao Gan, Wei Wu, Yu Qiao, Junjie Yan

Generating human action proposals in untrimmed videos is an important yet challenging task with wide applications. Current methods often suffer from noisy boundary locations and from the inferior quality of the confidence scores used for proposal retrieval. In this paper, we present BSN++, a new framework which exploits a complementary boundary regressor and relation modeling for temporal proposal generation. First, we propose a novel boundary regressor based on the complementary characteristics of both starting and ending boundary classifiers. Specifically, we utilize a U-shaped architecture with nested skip connections to capture rich contexts and introduce a bi-directional boundary matching mechanism to improve boundary precision. Second, to account for the proposal-proposal relations ignored in previous methods, we devise a proposal relation block which includes two self-attention modules from the aspects of position and channel. Furthermore, we find that data imbalance inevitably exists in the positive/negative proposals and temporal durations, which harms the model performance on tail distributions. To relieve this issue, we introduce a scale-balanced re-sampling strategy. Extensive experiments are conducted on two popular benchmarks, ActivityNet-1.3 and THUMOS14, which demonstrate that BSN++ achieves state-of-the-art performance. The proposed BSN++ ranked 1st place in the CVPR19 - ActivityNet challenge leaderboard on the temporal action localization task.
Translated: 2022-10-18 06:35:03 Published: 2021-03-01
# Similarity Detection Pipeline for Crawling a Topic Related Fake News Corpus

Similarity Detection Pipeline for Crawling a Topic Related Fake News Corpus ( http://arxiv.org/abs/2009.13367v2 )

License: see the linked page
Inna Vogel, Jeong-Eun Choi, Meghana Meghana

Fake news detection is a challenging task aiming to reduce human time and effort to check the truthfulness of news. Automated approaches to combat fake news, however, are limited by the lack of labeled benchmark datasets, especially in languages other than English. Moreover, many publicly available corpora have specific limitations that make them difficult to use. To address this problem, our contribution is threefold. First, we propose a new, publicly available German topic related corpus for fake news detection. To the best of our knowledge, this is the first corpus of its kind. In this regard, we developed a pipeline for crawling similar news articles. As our third contribution, we conduct different learning experiments to detect fake news. The best performance was achieved using sentence level embeddings from SBERT in combination with a Bi-LSTM (k=0.88).
Translated: 2022-10-13 22:15:33 Published: 2021-03-01
# Rain-Code Fusion: Code-to-code ConvLSTM Forecasting Spatiotemporal Precipitation

Rain-Code Fusion : Code-to-code ConvLSTM Forecasting Spatiotemporal Precipitation ( http://arxiv.org/abs/2009.14573v6 )

License: see the linked page
Takato Yasuno, Akira Ishii, Masazumi Amakata

Recently, flood damage has become a social problem owing to unprecedented weather conditions arising from climate change. An immediate response to heavy rain is important for mitigating economic losses and for rapid recovery. Spatiotemporal precipitation forecasts may enhance the accuracy of dam inflow prediction more than 6 hours ahead for flood damage mitigation. However, in real-world precipitation forecasting, the ordinary ConvLSTM is limited to a predictable range of no more than 3 timesteps, owing to the irreducible bias between the target prediction and the ground-truth value. This paper proposes a rain-code approach for spatiotemporal precipitation code-to-code forecasting. We propose a novel rainy feature that represents the temporal rain process using multi-frame fusion to reduce the number of timesteps. We perform rain-code studies with various term ranges based on the standard ConvLSTM. We applied the approach to a dam region using hourly precipitation data from the Japanese rainy season (May to October of every year from 2006 to 2019), approximately 127 thousand hours. We use hourly radar analysis data covering a broader central region with an area of 136 x 148 km². Finally, we provide sensitivity studies relating the rain-code size to hourly accuracy across several forecasting ranges.
Translated: 2022-10-12 23:09:50 Published: 2021-03-01
# Automotive Radar Data Acquisition using Object Detection

Automotive Radar Data Acquisition using Object Detection ( http://arxiv.org/abs/2010.02367v2 )

License: see the linked page
Madhumitha Sakthi, Ahmed Tewfik

The growing urban complexity demands an efficient algorithm to acquire and process various sensor information from autonomous vehicles. In this paper, we introduce an algorithm to utilize object detection results from the image to adaptively sample and acquire radar data using Compressed Sensing (CS). This novel algorithm is motivated by the hypothesis that with a limited sampling budget, allocating more sampling budget to areas with the object as opposed to a uniform sampling ultimately improves relevant object detection performance. We improve detection performance by dynamically allocating a lower sampling rate to objects such as buses than pedestrians leading to better reconstruction than baseline across areas with objects of interest. We automate the sampling rate allocation using linear programming and show significant time savings while reducing the radar block size by a factor of 2. We also analyze a Binary Permuted Diagonal measurement matrix for radar acquisition which is hardware-efficient and show its performance is similar to Gaussian and Binary Permuted Block Diagonal matrix. Our experiments on the Oxford radar dataset show an effective reconstruction of objects of interest with 10% sampling rate. Finally, we develop a transformer-based 2D object detection network using the NuScenes radar and image data.
Translated: 2022-10-10 21:06:40 Published: 2021-03-01
# A survey of algorithmic recourse: definitions, formulations, solutions, and prospects

A survey of algorithmic recourse: definitions, formulations, solutions, and prospects ( http://arxiv.org/abs/2010.04050v2 )

License: see the linked page
Amir-Hossein Karimi, Gilles Barthe, Bernhard Schölkopf, Isabel Valera

Machine learning is increasingly used to inform decision-making in sensitive situations where decisions have consequential effects on individuals' lives. In these settings, in addition to requiring models to be accurate and robust, socially relevant values such as fairness, privacy, accountability, and explainability play an important role for the adoption and impact of said technologies. In this work, we focus on algorithmic recourse, which is concerned with providing explanations and recommendations to individuals who are unfavourably treated by automated decision-making systems. We first perform an extensive literature review, and align the efforts of many authors by presenting unified definitions, formulations, and solutions to recourse. Then, we provide an overview of the prospective research directions towards which the community may engage, challenging existing assumptions and making explicit connections to other ethical challenges such as security, privacy, and fairness.
Translated: 2022-10-09 11:05:34 Published: 2021-03-01
# Robust Localization in Wireless Networks From Corrupted Signals

Robust Localization in Wireless Networks From Corrupted Signals ( http://arxiv.org/abs/2010.16297v2 )

License: see the linked page
Muhammad Osama, Dave Zachariah, Satyam Dwivedi, Petre Stoica

We address the problem of timing-based localization in wireless networks, when an unknown fraction of data is corrupted by nonideal signal conditions. While timing-based techniques enable accurate localization, they are also sensitive to such corrupted data. We develop a robust method that is applicable to a range of localization techniques, including time-of-arrival, time-difference-of-arrival and time-difference in schedule-based transmissions. The method is nonparametric and requires only an upper bound on the fraction of corrupted data, thus obviating distributional assumptions of the corrupting noise distribution. The robustness of the method is demonstrated in numerical experiments.
Translated: 2022-10-09 06:37:52 Published: 2021-03-01
# Physical invariance in neural networks for subgrid-scale scalar flux modeling

Physical invariance in neural networks for subgrid-scale scalar flux modeling ( http://arxiv.org/abs/2010.04663v4 )

License: see the linked page
Hugo Frezat, Guillaume Balarac, Julien Le Sommer, Ronan Fablet, Redouane Lguensat

In this paper we present a new strategy to model the subgrid-scale scalar flux in a three-dimensional turbulent incompressible flow using physics-informed neural networks (NNs). When trained from direct numerical simulation (DNS) data, state-of-the-art neural networks, such as convolutional neural networks, may not preserve well known physical priors, which may in turn question their application to real case-studies. To address this issue, we investigate hard and soft constraints into the model based on classical transformation invariances and symmetries derived from physical laws. From simulation-based experiments, we show that the proposed transformation-invariant NN model outperforms both purely data-driven ones as well as parametric state-of-the-art subgrid-scale models. The considered invariances are regarded as regularizers on physical metrics during the a priori evaluation and constrain the distribution tails of the predicted subgrid-scale term to be closer to the DNS. They also increase the stability and performance of the model when used as a surrogate during a large-eddy simulation. Moreover, the transformation-invariant NN is shown to generalize to regimes that have not been seen during the training phase.
Translated: 2022-10-09 06:24:51 Published: 2021-03-01
# Improving BERT Performance for Aspect-Based Sentiment Analysis

Improving BERT Performance for Aspect-Based Sentiment Analysis ( http://arxiv.org/abs/2010.11731v2 )

License: see the linked page
Akbar Karimi, Leonardo Rossi, Andrea Prati

Aspect-Based Sentiment Analysis (ABSA) studies consumer opinions on market products. It involves examining the types of sentiment as well as the sentiment targets expressed in product reviews. Analyzing the language used in a review is a difficult task that requires a deep understanding of the language. In recent years, deep language models such as BERT have shown great progress in this regard. In this work, we propose two simple modules called Parallel Aggregation and Hierarchical Aggregation to be utilized on top of BERT for two main ABSA tasks, namely Aspect Extraction (AE) and Aspect Sentiment Classification (ASC), in order to improve the model's performance. We show that applying the proposed models eliminates the need for further training of the BERT model. The source code is available on the Web for further research and reproduction of the results.
Translated: 2022-10-04 05:12:37 Published: 2021-03-01
# Artificial Intelligence Systems applied to tourism: A Survey

Artificial Intelligence Systems applied to tourism: A Survey ( http://arxiv.org/abs/2010.14654v2 )

License: see the linked page
Luis Duarte, Jonathan Torres, Vitor Ribeiro, Inês Moreira

Artificial Intelligence (AI) has been improving the performance of systems for a diverse set of tasks and has introduced a more interactive generation of personal agents. Despite the current trend of applying AI in a great number of areas, we have not seen the same quantity of work being developed for the tourism sector. This paper reports on the main applications of AI systems developed for tourism and the current state of the art for this sector. The paper also provides an up-to-date survey of this field regarding several key works and systems that are applied to tourism, such as personal agents, for providing a more interactive experience. We also carried out in-depth research on systems for predicting human traffic flow, on more accurate recommendation systems, and on how geospatial approaches are trying to display tourism data in a more informative way and prevent problems before they arise.
Translated: 2022-10-02 12:16:13 Published: 2021-03-01
# Intrinsic Sliced Wasserstein Distances for Comparing Collections of Probability Distributions on Manifolds and Graphs

Intrinsic Sliced Wasserstein Distances for Comparing Collections of Probability Distributions on Manifolds and Graphs ( http://arxiv.org/abs/2010.15285v2 )

License: see the linked page
Raif M. Rustamov and Subhabrata Majumdar

Collections of probability distributions arise in a variety of statistical applications ranging from user activity pattern analysis to brain connectomics. In practice, these distributions are represented by histograms over diverse domain types including finite intervals, circles, cylinders, spheres, other manifolds, and graphs. This paper introduces an approach for detecting differences between two collections of histograms over such general domains. We propose the intrinsic slicing construction that yields a novel class of Wasserstein distances on manifolds and graphs. These distances are Hilbert embeddable, allowing us to reduce the histogram collection comparison problem to a more familiar mean testing problem in a Hilbert space. We provide two testing procedures, one based on resampling and another on combining p-values from coordinate-wise tests. Our experiments in a variety of data settings show that the resulting tests are powerful and the p-values are well-calibrated. Example applications to user activity patterns, spatial data, and brain connectomics are provided.
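
The abstract does not describe the intrinsic slicing construction itself. Sliced distances in general reduce a comparison to one-dimensional problems, so the sketch below shows only that one-dimensional building block: the 1-Wasserstein distance between two histograms on the same equal-width bins of an interval, computed from the gap between their cumulative sums. The function name and the `bin_width` parameter are assumptions for this example.

```python
def wasserstein1_hist(p, q, bin_width=1.0):
    """1-Wasserstein distance between two histograms on identical equal-width bins."""
    if len(p) != len(q):
        raise ValueError("histograms must share the same bins")
    total_p, total_q = float(sum(p)), float(sum(q))
    cdf_gap = 0.0    # running difference between the two normalized CDFs
    distance = 0.0
    for a, b in zip(p, q):
        cdf_gap += a / total_p - b / total_q
        distance += abs(cdf_gap) * bin_width
    return distance
```

For example, moving all mass from the first bin to the third of a three-bin histogram costs two bin widths, and identical histograms are at distance zero.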
Translated: 2022-10-02 05:55:52 Published: 2021-03-01
# Prediction against a limited adversary

Prediction against a limited adversary ( http://arxiv.org/abs/2011.01217v3 )

License: see the linked page
Erhan Bayraktar and Ibrahim Ekren and Xin Zhang

We study the problem of prediction with expert advice with adversarial corruption where the adversary can at most corrupt one expert. Using tools from viscosity theory, we characterize the long-time behavior of the value function of the game between the forecaster and the adversary. We provide lower and upper bounds for the growth rate of regret without relying on a comparison result. We show that depending on the description of regret, the limiting behavior of the game can significantly differ.
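
For readers unfamiliar with the setting, the sketch below shows the classical exponential-weights (Hedge) forecaster for prediction with expert advice and how regret is measured against the best expert. It is a textbook baseline, not the paper's method, and it does not model the corrupting adversary; the function name and the learning rate `eta` are assumptions for this example.

```python
import math


def hedge_regret(loss_rounds, eta=0.5):
    """Run exponential weights on per-round expert losses in [0, 1] and return the regret."""
    n_experts = len(loss_rounds[0])
    weights = [1.0] * n_experts
    cumulative = [0.0] * n_experts
    learner_loss = 0.0
    for losses in loss_rounds:
        total = sum(weights)
        # Expected loss of a forecaster that follows expert i with probability w_i / total.
        learner_loss += sum(w / total * loss for w, loss in zip(weights, losses))
        for i, loss in enumerate(losses):
            cumulative[i] += loss
            weights[i] *= math.exp(-eta * loss)
    # Regret: learner's loss minus the loss of the best single expert in hindsight.
    return learner_loss - min(cumulative)


# Expert 0 is better for 50 rounds, then expert 1 is better for 50 rounds.
rounds = [[0.0, 1.0]] * 50 + [[1.0, 0.0]] * 50
regret = hedge_regret(rounds, eta=0.5)
```

For losses in $[0, 1]$, the standard guarantee for this forecaster is a regret of at most $\ln(N)/\eta + \eta T/8$ with $N$ experts over $T$ rounds; the paper studies how the growth rate of regret behaves when an adversary may additionally corrupt an expert.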
Translated: 2022-10-01 05:12:48 Published: 2021-03-01
# Stochastic Linear Bandits with Protected Subspace

Stochastic Linear Bandits with Protected Subspace ( http://arxiv.org/abs/2011.01016v2 )

License: see the linked page
Advait Parulekar, Soumya Basu, Aditya Gopalan, Karthikeyan Shanmugam, Sanjay Shakkottai

We study a variant of the stochastic linear bandit problem wherein we optimize a linear objective function but rewards are accrued only orthogonal to an unknown subspace (which we interpret as a \textit{protected space}) given only zero-order stochastic oracle access to both the objective itself and protected subspace. In particular, at each round, the learner must choose whether to query the objective or the protected subspace alongside choosing an action. Our algorithm, derived from the OFUL principle, uses some of the queries to get an estimate of the protected space, and (in almost all rounds) plays optimistically with respect to a confidence set for this space. We provide a $\tilde{O}(sd\sqrt{T})$ regret upper bound in the case where the action space is the complete unit ball in $\mathbb{R}^d$, $s < d$ is the dimension of the protected subspace, and $T$ is the time horizon. Moreover, we demonstrate that a discrete action space can lead to linear regret with an optimistic algorithm, reinforcing the sub-optimality of optimism in certain settings. We also show that protection constraints imply that for certain settings, no consistent algorithm can have a regret smaller than $\Omega(T^{3/4}).$ We finally empirically validate our results with synthetic and real datasets.
Translated: 2022-09-30 12:31:36 Published: 2021-03-01
# Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation

Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation ( http://arxiv.org/abs/2011.02696v2 )

License: see link
Aurick Zhou, Sergey Levine

While deep neural networks provide good performance for a range of challenging tasks, calibration and uncertainty estimation remain major challenges, especially under distribution shift. In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks. Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle, but is computationally intractable to evaluate exactly for all but the simplest of model classes. We propose to use approximate Bayesian inference techniques to produce a tractable approximation to the CNML distribution. Our approach can be combined with any approximate inference algorithm that provides tractable posterior densities over model parameters. We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
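
The CNML scheme mentioned in this abstract can be illustrated on one of the "simplest model classes" where it is exactly computable: a Bernoulli model with no inputs. The sketch below is only that toy case, not the ACNML algorithm; the function name `cnml_bernoulli` is hypothetical. For each candidate label, the label is appended to the data, the maximum-likelihood parameter is refit, and the likelihood the refit model gives to that label is recorded; the two scores are then normalized.

```python
def cnml_bernoulli(num_ones, num_total):
    """Exact CNML prediction for a Bernoulli model (toy illustration).

    num_ones:  number of observed 1-labels
    num_total: number of observed labels in total
    Returns (p_zero, p_one).
    """
    # Append label 1, refit the MLE: theta = (k + 1) / (n + 1).
    # The refit model assigns probability theta to label 1.
    score_one = (num_ones + 1) / (num_total + 1)
    # Append label 0, refit the MLE: theta = k / (n + 1).
    # The refit model assigns probability 1 - theta to label 0.
    score_zero = (num_total - num_ones + 1) / (num_total + 1)
    # Normalize the two best-case likelihoods into a distribution.
    normalizer = score_one + score_zero
    return score_zero / normalizer, score_one / normalizer
```

In this toy case the result coincides with Laplace's rule of succession, (k + 1) / (n + 2), so even after seeing only 1-labels the prediction keeps some mass on 0, which is the conservative behavior the abstract associates with CNML.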
Translated: 2022-09-29 11:31:07 Published: 2021-03-01
# Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations?

Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations? ( http://arxiv.org/abs/2011.03178v2 )

License: see link
Chaoqi Wang, Shengyang Sun, Roger Grosse

While uncertainty estimation is a well-studied topic in deep learning, most such work focuses on marginal uncertainty estimates, i.e. the predictive mean and variance at individual input locations. But it is often more useful to estimate predictive correlations between the function values at different input locations. In this paper, we consider the problem of benchmarking how accurately Bayesian models can estimate predictive correlations. We first consider a downstream task which depends on posterior predictive correlations: transductive active learning (TAL). We find that TAL makes better use of models' uncertainty estimates than ordinary active learning, and recommend this as a benchmark for evaluating Bayesian models. Since TAL is too expensive and indirect to guide development of algorithms, we introduce two metrics which more directly evaluate the predictive correlations and which can be computed efficiently: meta-correlations (i.e. the correlations between the model's correlation estimates and the true values), and cross-normalized likelihoods (XLL). We validate these metrics by demonstrating their consistency with TAL performance and obtain insights about the relative performance of current Bayesian neural net and Gaussian process models.
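
The meta-correlation metric described in this abstract can be sketched as a plain correlation between two lists: the model's predicted correlations for a set of input pairs, and reference correlations for the same pairs. The code below is a minimal illustration under the assumption that Pearson correlation is used and that reference values are available; the paper's exact procedure may differ, and the names `pearson` and `meta_correlation` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def meta_correlation(predicted_corrs, reference_corrs):
    """Correlation between predicted and reference pairwise correlations.

    Each list holds one value per pair of input locations, in the same order.
    """
    return pearson(predicted_corrs, reference_corrs)
```

A value near 1 means the model ranks and scales pairwise dependencies much like the reference does, independently of how good its marginal variances are.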
Translated: 2022-09-29 04:23:02 Published: 2021-03-01
# Semi-Structured Deep Piecewise Exponential Models

Semi-Structured Deep Piecewise Exponential Models ( http://arxiv.org/abs/2011.05824v3 )

License: see link
Philipp Kopper, Sebastian Pölsterl, Christian Wachinger, Bernd Bischl, Andreas Bender, David Rügamer

We propose a versatile framework for survival analysis that combines advanced concepts from statistics with deep learning. The presented framework is based on piecewise exponential models and thereby supports various survival tasks, such as competing risks and multi-state modeling, and further allows for estimation of time-varying effects and time-varying features. To also include multiple data sources and higher-order interaction effects into the model, we embed the model class in a neural network and thereby enable the simultaneous estimation of both inherently interpretable structured regression inputs as well as deep neural network components which can potentially process additional unstructured data sources. A proof of concept is provided by using the framework to predict Alzheimer's disease progression based on tabular and 3D point cloud data and applying it to synthetic data.
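
For readers unfamiliar with piecewise exponential models, the core assumption is a hazard rate that is constant within each time interval, so the survival probability is the exponential of the negative accumulated hazard. The sketch below shows only that building block with fixed, hand-picked hazards; it is not the semi-structured deep model of the paper, and the function name and argument layout are assumptions made for illustration.

```python
import math


def survival_probability(t, cut_points, hazards):
    """Survival probability S(t) under a piecewise-constant hazard.

    cut_points: increasing interval start times, beginning with 0.0
    hazards:    hazards[j] applies from cut_points[j] up to the next cut
                point; the last hazard applies indefinitely
    """
    cumulative_hazard = 0.0
    for j, rate in enumerate(hazards):
        start = cut_points[j]
        end = cut_points[j + 1] if j + 1 < len(cut_points) else math.inf
        if t <= start:
            break  # later intervals contribute nothing before they begin
        # Time actually spent in interval j, times its constant hazard.
        cumulative_hazard += rate * (min(t, end) - start)
    return math.exp(-cumulative_hazard)
```

In a regression setting each interval hazard would be a function of covariates, which is where structured and deep components can enter.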
Translated: 2022-09-26 22:57:37 Published: 2021-03-01
# Finding optimal Pulse Repetition Intervals with Many-objective Evolutionary Algorithms

Finding optimal Pulse Repetition Intervals with Many-objective Evolutionary Algorithms ( http://arxiv.org/abs/2011.06913v2 )

License: see link
Paul Dufossé and Cyrille Enderli

In this paper we consider the problem of finding Pulse Repetition Intervals allowing the best compromises mitigating range and Doppler ambiguities in a Pulsed-Doppler radar system. We revisit a problem that was proposed to the Evolutionary Computation community as a real-world case to test Many-objective Optimization algorithms. We use it as a baseline to compare several Evolutionary Algorithms for black-box optimization with different metrics. Resulting data is aggregated to build a reference set of Pareto optimal points and is the starting point for further analysis and operational use by the radar designer.
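
Building a reference set of Pareto optimal points from aggregated results, as the abstract describes, amounts to filtering out dominated points. The sketch below assumes every objective is to be minimized and uses a simple quadratic-time filter; the function names are hypothetical and this is not the authors' code.

```python
def dominates(a, b):
    """True if point a is at least as good as b everywhere and strictly
    better in at least one objective (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Pooling the final populations of several algorithms and passing them through such a filter gives a common reference set against which each algorithm's output can be scored.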
Translated: 2022-09-26 00:00:51 Published: 2021-03-01
# (Reference translation) ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic

ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic ( http://arxiv.org/abs/2012.01462v3 )

License: CC BY 4.0
Hamdy Mubarak and Sabit Hassan

Over the past few months, there were huge numbers of circulating tweets and discussions about Coronavirus (COVID-19) in the Arab region. It is important for policy makers and many people to identify types of shared tweets to better understand public behavior, topics of interest, requests from governments, sources of tweets, etc. It is also crucial to prevent spreading of rumors and misinformation about the virus or bad cures. To this end, we present the largest manually annotated dataset of Arabic tweets related to COVID-19. We describe annotation guidelines, analyze our dataset and build effective machine learning and transformer based models for classification.
Translated: 2021-05-30 03:23:53 Published: 2021-03-01
# Comprehension and Knowledge

Comprehension and Knowledge ( http://arxiv.org/abs/2012.06561v2 )

License: see link
Pavel Naumov, Kevin Ros

The ability of an agent to comprehend a sentence is tightly connected to the agent's prior experiences and background knowledge. The paper suggests interpreting comprehension as a modality and proposes a complete bimodal logical system that describes an interplay between comprehension and knowledge modalities.
Translated: 2021-05-11 03:10:36 Published: 2021-03-01
# An efficient Quasi-Newton method for nonlinear inverse problems via learned singular values

An efficient Quasi-Newton method for nonlinear inverse problems via learned singular values ( http://arxiv.org/abs/2012.07676v2 )

License: see link
Danny Smyl, Tyler N. Tallman, Dong Liu, Andreas Hauptmann

Solving complex optimization problems in engineering and the physical sciences requires repetitive computation of multi-dimensional function derivatives. Commonly, this requires computationally-demanding numerical differentiation such as perturbation techniques, which ultimately limits the use for time-sensitive applications. In particular, in nonlinear inverse problems Gauss-Newton methods are used that require iterative updates to be computed from the Jacobian. Computationally more efficient alternatives are Quasi-Newton methods, where the repeated computation of the Jacobian is replaced by an approximate update. Here we present a highly efficient data-driven Quasi-Newton method applicable to nonlinear inverse problems. We achieve this by using the singular value decomposition and learning a mapping from model outputs to the singular values to compute the updated Jacobian. This enables a speed-up expected of Quasi-Newton methods without accumulating roundoff errors, enabling time-critical applications and allowing for flexible incorporation of prior knowledge necessary to solve ill-posed problems. We present results for the highly non-linear inverse problem of electrical impedance tomography with experimental data.
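
To make concrete what "replacing the repeated computation of the Jacobian by an approximate update" means, the sketch below shows the classical Broyden rank-one update. This is the textbook Quasi-Newton baseline, not the learned singular-value update proposed in the paper; the function name is hypothetical and plain nested lists stand in for matrices.

```python
def broyden_update(jacobian, dx, dy):
    """Classical Broyden rank-one update of an approximate Jacobian.

    jacobian: current approximation as a list of rows (m x n)
    dx:       parameter step taken (length n)
    dy:       observed change in model output (length m)
    Returns a new matrix J' satisfying the secant condition J' dx = dy.
    """
    n = len(dx)
    # What the current Jacobian predicts for the output change.
    predicted = [sum(row[k] * dx[k] for k in range(n)) for row in jacobian]
    step_norm_sq = sum(v * v for v in dx)
    # Spread the prediction error back along the step direction.
    return [
        [row[k] + (dy[i] - predicted[i]) * dx[k] / step_norm_sq
         for k in range(n)]
        for i, row in enumerate(jacobian)
    ]
```

The update needs only one new model evaluation per iteration, whereas a perturbation-based Jacobian needs one evaluation per parameter.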
Translated: 2021-05-08 14:12:48 Published: 2021-03-01
# (Reference translation) Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning

Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2012.09421v2 )

License: CC BY 4.0
Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng

We consider the problem of learning fair policies in (deep) cooperative multi-agent reinforcement learning (MARL). We formalize it in a principled way as the problem of optimizing a welfare function that explicitly encodes two important aspects of fairness: efficiency and equity. As a solution method, we propose a novel neural network architecture, which is composed of two sub-networks specifically designed for taking into account the two aspects of fairness. In experiments, we demonstrate the importance of the two sub-networks for fair optimization. Our overall approach is general as it can accommodate any (sub)differentiable welfare function. Therefore, it is compatible with various notions of fairness that have been proposed in the literature (e.g., lexicographic maximin, generalized Gini social welfare function, proportional fairness). Our solution method is generic and can be implemented in various MARL settings: centralized training and decentralized execution, or fully decentralized. Finally, we experimentally validate our approach in various domains and show that it can perform much better than previous methods.
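
One of the welfare functions named in this abstract, the generalized Gini social welfare function, can be sketched in a few lines: agent utilities are sorted from worst-off to best-off and combined with non-increasing weights, so the worst-off agents count the most. The code below illustrates that definition only; the function name and the choice of weights are assumptions, and it is not the paper's training objective as implemented.

```python
def generalized_gini_welfare(utilities, weights):
    """Generalized Gini social welfare of a utility vector.

    utilities: one utility per agent, in any order
    weights:   non-increasing positive weights; weights[0] is applied
               to the worst-off agent
    """
    if any(a < b for a, b in zip(weights, weights[1:])):
        raise ValueError("weights must be non-increasing")
    ordered = sorted(utilities)  # worst-off agent first
    return sum(w * u for w, u in zip(weights, ordered))
```

With weights `[1.0, 0.5, 0.25]`, the utilities `[3, 1, 2]` score 2.75 while the equal split `[2, 2, 2]` scores 3.5, even though both have the same total, which is how the function trades off efficiency and equity.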
Translated: 2021-05-02 21:25:52 Published: 2021-03-01
# Flow-based Generative Models for Learning Manifold to Manifold Mappings

Flow-based Generative Models for Learning Manifold to Manifold Mappings ( http://arxiv.org/abs/2012.10013v2 )

License: see link
Xingjian Zhen, Rudrasis Chakraborty, Liu Yang, Vikas Singh

Many measurements or observations in computer vision and machine learning manifest as non-Euclidean data. While recent proposals (like spherical CNN) have extended a number of deep neural network architectures to manifold-valued data, and this has often provided strong improvements in performance, the literature on generative models for manifold data is quite sparse. Partly due to this gap, there are also no modality transfer/translation models for manifold-valued data whereas numerous such methods based on generative models are available for natural images. This paper addresses this gap, motivated by a need in brain imaging -- in doing so, we expand the operating range of certain generative models (as well as generative models for modality transfer) from natural images to images with manifold-valued measurements. Our main result is the design of a two-stream version of GLOW (flow-based invertible generative models) that can synthesize information of a field of one type of manifold-valued measurements given another. On the theoretical side, we introduce three kinds of invertible layers for manifold-valued data, which are not only analogous to their functionality in flow-based generative models (e.g., GLOW) but also preserve the key benefits (determinants of the Jacobian are easy to calculate). For experiments, on a large dataset from the Human Connectome Project (HCP), we show promising results where we can reliably and accurately reconstruct brain images of a field of orientation distribution functions (ODF) from diffusion tensor images (DTI), where the latter has a $5\times$ faster acquisition time but at the expense of worse angular resolution.
Translated: 2021-05-01 18:14:45 Published: 2021-03-01
# Efficient On-Chip Learning for Optical Neural Networks Through Power-Aware Sparse Zeroth-Order Optimization

Efficient On-Chip Learning for Optical Neural Networks Through Power-Aware Sparse Zeroth-Order Optimization ( http://arxiv.org/abs/2012.11148v2 )

License: see link
Jiaqi Gu, Chenghao Feng, Zheng Zhao, Zhoufeng Ying, Ray T. Chen, David Z. Pan

Optical neural networks (ONNs) have demonstrated record-breaking potential in high-performance neuromorphic computing due to their ultra-high execution speed and low energy consumption. However, current learning protocols fail to provide scalable and efficient solutions to photonic circuit optimization in practical applications. In this work, we propose a novel on-chip learning framework to release the full potential of ONNs for power-efficient in situ training. Instead of deploying implementation-costly back-propagation, we directly optimize the device configurations with computation budgets and power constraints. We are the first to model the ONN on-chip learning as a resource-constrained stochastic noisy zeroth-order optimization problem, and propose a novel mixed-training strategy with two-level sparsity and power-aware dynamic pruning to offer a scalable on-chip training solution in practical ONN deployment. Compared with previous methods, we are the first to optimize over 2,500 optical components on chip. We can achieve much better optimization stability, 3.7x-7.6x higher efficiency, and save >90% power under practical device variations and thermal crosstalk.
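
Zeroth-order optimization, as referred to in this abstract, estimates gradients from loss evaluations alone, which suits hardware where back-propagation is costly. The sketch below is a generic sparse coordinate-wise estimator, assuming forward differences on a random subset of parameters; it is not the paper's mixed-training strategy or its power-aware pruning, and all names are hypothetical.

```python
import random


def sparse_zeroth_order_gradient(loss, params, num_coords, step=1e-3, rng=None):
    """Estimate a gradient using only loss evaluations.

    loss:       callable taking a list of floats and returning a float
    params:     current parameter values
    num_coords: how many randomly chosen coordinates to probe
    step:       perturbation size for the forward difference
    Coordinates that are not probed keep a zero gradient entry.
    """
    rng = rng or random.Random(0)
    base = loss(params)
    grad = [0.0] * len(params)
    for i in rng.sample(range(len(params)), num_coords):
        perturbed = list(params)
        perturbed[i] += step
        # Forward difference along coordinate i.
        grad[i] = (loss(perturbed) - base) / step
    return grad
```

Probing fewer coordinates per step reduces the number of loss queries, at the price of a noisier update; in a physical device each query corresponds to reconfiguring and measuring the circuit.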
Translated: 2021-04-27 06:19:24 Published: 2021-03-01
# (Reference translation) TenIPS: Inverse Propensity Sampling for Tensor Completion

TenIPS: Inverse Propensity Sampling for Tensor Completion ( http://arxiv.org/abs/2101.00323v2 )

License: CC0 1.0
Chengrun Yang, Lijun Ding, Ziyang Wu, Madeleine Udell

Tensors are widely used to represent multiway arrays of data. The recovery of missing entries in a tensor has been extensively studied, generally under the assumption that entries are missing completely at random (MCAR). However, in most practical settings, observations are missing not at random (MNAR): the probability that a given entry is observed (also called the propensity) may depend on other entries in the tensor or even on the value of the missing entry. In this paper, we study the problem of completing a partially observed tensor with MNAR observations, without prior information about the propensities. To complete the tensor, we assume that both the original tensor and the tensor of propensities have low multilinear rank. The algorithm first estimates the propensities using a convex relaxation and then predicts missing values using a higher-order SVD approach, reweighting the observed tensor by the inverse propensities. We provide finite-sample error bounds on the resulting complete tensor. Numerical experiments demonstrate the effectiveness of our approach.
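
The reweighting step named in this abstract can be illustrated on a single matrix slice: each observed entry is divided by its propensity and each missing entry is set to zero, so that the result equals the true slice in expectation. The sketch below shows only this step, with propensities supplied directly rather than estimated by convex relaxation, and without the higher-order SVD; the function name is hypothetical.

```python
def inverse_propensity_reweight(values, observed, propensities):
    """Reweight one observed matrix slice by inverse propensities.

    values:       entries of the slice (ignored where not observed)
    observed:     booleans, True where the entry was observed
    propensities: probability that each entry is observed, in (0, 1]
    """
    return [
        [v / p if seen else 0.0
         for v, seen, p in zip(value_row, mask_row, prop_row)]
        for value_row, mask_row, prop_row
        in zip(values, observed, propensities)
    ]
```

An entry that is observed only a quarter of the time is scaled up by four when it does appear, which compensates for the uneven sampling before a low-rank approximation is fitted.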
Translated: 2021-04-16 14:23:41 Published: 2021-03-01
# (Reference translation) Optimisation of Spectral Wavelets for Persistence-based Graph Classification

Optimisation of Spectral Wavelets for Persistence-based Graph Classification ( http://arxiv.org/abs/2101.05201v2 )

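Ka Man Yim, Jacob Leygonie

The spectral wavelet signature of a graph determines a filtration, and consequently a set of extended persistence diagrams. We propose a framework that optimises the choice of wavelet for a dataset of graphs. Since the spectral wavelet signature of a graph is derived from its Laplacian, the framework encodes geometric properties of graphs in their associated persistence diagrams and can be applied to graphs without a priori node attributes. We apply the framework to graph classification problems and obtain performance competitive with other persistence-based architectures. To provide the underlying theoretical foundations, we extend the differentiability result for ordinary persistent homology to extended persistent homology.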

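Translated: 2021-04-08 11:12:42 Published: 2021-03-01
# (Reference translation) Understanding the input-output relationship of neural networks in the time series forecasting radon levels at Canfranc Underground Laboratory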

Understanding the input-output relationship of neural networks in the time series forecasting radon levels at Canfranc Underground Laboratory ( http://arxiv.org/abs/2102.07616v2 )

License: CC BY 4.0
Iñaki Rodríguez-García and Miguel Cárdenas-Montes

Underground physics experiments such as dark matter direct detection need to keep control of the background contribution. Hosting these experiments in underground facilities helps to minimize certain background sources such as the cosmic rays. One of the largest remaining background sources is the radon emanated from the rocks enclosing the research facility. The radon particles could be deposited inside the detectors when they are opened to perform the maintenance operations. Therefore, forecasting the radon levels is a crucial task in an attempt to schedule the maintenance operations when radon level is minimum. In the past, deep learning models have been implemented to forecast the radon time series at the Canfranc Underground Laboratory (LSC), in Spain, with satisfactory results. When forecasting time series, the past values of the time series are taken as input variables. The present work focuses on understanding the relative contribution of these input variables to the predictions generated by neural networks. The results allow us to understand how the predictions of the time series depend on the input variables. These results may be used to build better predictors in the future.
Translated: 2021-04-06 07:18:34 Published: 2021-03-01
# From Quantifying Vagueness To Pan-niftyism

From Quantifying Vagueness To Pan-niftyism ( http://arxiv.org/abs/2103.03361v1 )

License: CC BY 4.0
In this short paper, we will introduce a simple model for quantifying philosophical vagueness. There is growing interest in this endeavor to quantify vague concepts of consciousness, agency, etc. We will then discuss some of the implications of this model including the conditions under which the quantification of `nifty' leads to pan-nifty-ism. Understanding this leads to an interesting insight - the reason a framework to quantify consciousness like Integrated Information Theory implies (forms of) panpsychism is because there is favorable structure already implicitly encoded in the construction of the quantification metric.
Translated: 2021-04-05 07:41:01 Published: 2021-03-01
# A Machine Learning Approach for Predicting Human Preference for Graph Layouts

A Machine Learning Approach for Predicting Human Preference for Graph Layouts ( http://arxiv.org/abs/2103.03665v1 )

License: see link
Understanding which graph layouts humans prefer, and why, is significant and challenging due to the highly complex visual perception and cognition system in the human brain. In this paper, we present the first machine learning approach for predicting human preference for graph layouts. In general, the data sets with human preference labels are limited and insufficient for training deep networks. To address this, we train our deep learning model by employing the transfer learning method, e.g., exploiting the quality metrics, such as shape-based metrics, edge crossing and stress, which are shown to be correlated to human preference on graph layouts. Experimental results using the ground truth human preference data sets show that our model can successfully predict human preference for graph layouts. To the best of our knowledge, this is the first approach for predicting qualitative evaluation of graph layouts using human preference experiment data.
Translated: 2021-04-05 00:53:06 Published: 2021-03-01
# Risk factor identification for incident heart failure using neural network distillation and variable selection

Risk factor identification for incident heart failure using neural network distillation and variable selection ( http://arxiv.org/abs/2102.12936v2 )

License: see link
Recent evidence shows that deep learning models trained on electronic health records from millions of patients can deliver substantially more accurate predictions of risk compared to their statistical counterparts. While this provides an important opportunity for improving clinical decision-making, the lack of interpretability is a major barrier to the incorporation of these black-box models in routine care, limiting their trustworthiness and preventing further hypothesis-testing investigations. In this study, we propose two methods, namely, model distillation and variable selection, to untangle hidden patterns learned by an established deep learning model (BEHRT) for risk association identification. Due to the clinical importance and diversity of heart failure as a phenotype, it was used to showcase the merits of the proposed methods. A cohort with 788,880 (8.3% incident heart failure) patients was considered for the study. Model distillation identified 598 and 379 diseases that were associated and dissociated with heart failure at the population level, respectively. While the associations were broadly consistent with prior knowledge, our method also highlighted several less appreciated links that are worth further investigation. In addition to these important population-level insights, we developed an approach to individual-level interpretation to take account of varying manifestation of heart failure in clinical practice. This was achieved through variable selection by detecting a minimal set of encounters that can maximally preserve the accuracy of prediction for individuals. Our proposed work provides a discovery-enabling tool to identify risk factors in both population and individual levels from a data-driven perspective. This helps to generate new hypotheses and guides further investigations on causal links.
Translated: 2021-04-05 00:41:35 Published: 2021-03-01
# On Efficient Training, Controllability and Compositional Generalization of Insertion-based Language Generators

On Efficient Training, Controllability and Compositional Generalization of Insertion-based Language Generators ( http://arxiv.org/abs/2102.11008v2 )

License: see link
Auto-regressive language models with the left-to-right generation order have been a predominant paradigm for language generation. Recently, out-of-order text generation beyond the traditional left-to-right paradigm has attracted extensive attention, with a notable variation of insertion-based generation, where a model is used to gradually extend the context into a complete sentence purely with insertion operations. However, since insertion operations disturb the position information of each token, it is often believed that each step of the insertion-based likelihood estimation requires a bi-directional \textit{re-encoding} of the whole generated sequence. This computational overhead prohibits the model from scaling up to generate long, diverse texts such as stories, news articles, and reports. To address this issue, we propose InsNet, an insertion-based sequence model that can be trained as efficiently as traditional transformer decoders while maintaining the same performance as that with a bi-directional context encoder. We evaluate InsNet on story generation and CleVR-CoGENT captioning, showing the advantages of InsNet in several dimensions, including computational costs, generation quality, the ability to perfectly incorporate lexical controls, and better compositional generalization.
Translated: 2021-04-05 00:37:26 Published: 2021-03-01
# Road Surface Translation Under Snow-covered and Semantic Segmentation for Snow Hazard Index

Road Surface Translation Under Snow-covered and Semantic Segmentation for Snow Hazard Index ( http://arxiv.org/abs/2101.05616v4 )

License: see link
In 2020, there was a record heavy snowfall owing to climate change. In reality, 2,000 vehicles were stuck on the highway for three days. Because of the freezing of the road surface, 10 vehicles had a billiard accident. Road managers are required to provide indicators to alert drivers regarding snow cover at hazardous locations. This study proposes a deep learning application with live image post-processing to automatically calculate a snow hazard ratio indicator. First, the road surface hidden under snow is translated using a generative adversarial network, pix2pix. Second, snow-covered and road surface classes are detected by semantic segmentation using DeepLabv3+ with MobileNet as a backbone. Based on these trained networks, we automatically compute the road to snow rate hazard index, indicating the amount of snow covered on the road surface. We demonstrate the applied results to 1,155 live snow images of the cold region in Japan. We mention the usefulness and the practical robustness of our study.
Translated: 2021-03-29 00:50:44 Published: 2021-03-01
# (Reference translation) The Next Decade of Telecommunications Artificial Intelligence

The Next Decade of Telecommunications Artificial Intelligence ( http://arxiv.org/abs/2101.09163v4 )

License: CC BY 4.0
It has been an exciting journey since mobile communications and artificial intelligence were conceived 37 years and 64 years ago. While both fields evolved independently and profoundly changed communications and computing industries, the rapid convergence of 5G and deep learning is beginning to significantly transform the core communication infrastructure, network management and vertical applications. The paper first outlines the individual roadmaps of mobile communications and artificial intelligence in the early stage, concentrating on the era from 3G to 5G when AI and mobile communications started to converge. With regard to telecommunications artificial intelligence, the paper further introduces in detail the progress of artificial intelligence in the ecosystem of mobile communications. The paper then summarizes the classifications of AI in telecom ecosystems along with its evolution paths specified by various international telecommunications standardization bodies. Towards the next decade, the paper forecasts the prospective roadmap of telecommunications artificial intelligence. In line with 3GPP and ITU-R timeline of 5G & 6G, the paper further explores the network intelligence following 3GPP and ORAN routes respectively, experience and intention driven network management and operation, network AI signalling system, intelligent middle-office based BSS, intelligent customer experience management and policy control driven by BSS and OSS convergence, evolution from SLA to ELA, and intelligent private network for verticals. The paper concludes with the vision that AI will reshape the future B5G or 6G landscape and that we need to pivot our R&D, standardizations, and ecosystem to fully take the unprecedented opportunities.
Translated: 2021-03-23 11:34:00 Published: 2021-03-01
# Analysing the Noise Model Error for Realistic Noisy Label Data

Analysing the Noise Model Error for Realistic Noisy Label Data ( http://arxiv.org/abs/2101.09763v2 )

License: see link
Distant and weak supervision make it possible to obtain large amounts of labeled training data quickly and cheaply, but these automatic annotations tend to contain a high amount of errors. A popular technique to overcome the negative effects of these noisy labels is noise modelling where the underlying noise process is modelled. In this work, we study the quality of these estimated noise models from the theoretical side by deriving the expected error of the noise model. Apart from evaluating the theoretical results on commonly used synthetic noise, we also publish NoisyNER, a new noisy label dataset from the NLP domain that was obtained through a realistic distant supervision technique. It provides seven sets of labels with differing noise patterns to evaluate different noise levels on the same instances. Parallel clean labels are available, making it possible to study scenarios where a small amount of gold-standard data can be leveraged. Our theoretical results and the corresponding experiments give insights into the factors that influence the noise model estimation like the noise distribution and the sampling technique.
Translated: 2021-03-16 09:19:48 Published: 2021-03-01
# Accounting for Variance in Machine Learning Benchmarks

Accounting for Variance in Machine Learning Benchmarks ( http://arxiv.org/abs/2103.03098v1 )

License: CC BY 4.0
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization, and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show the counter-intuitive result that adding more sources of variation to an imperfect estimator brings it closer to the ideal estimator at a 51-fold reduction in compute cost. Building on these results, we study the error rate of detecting improvements on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.
Translated: 2021-03-07 22:36:16 Published: 2021-03-01
# Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language ( http://arxiv.org/abs/2103.01242v1 )

License: CC BY 4.0
Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease. We present Cryptonite, a large-scale dataset based on cryptic crosswords, which is both linguistically complex and naturally sourced. Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires disambiguating semantic, syntactic, and phonetic wordplays, as well as world knowledge. Cryptic clues pose a challenge even for experienced solvers, though top-tier experts can solve them with almost 100% accuracy. Cryptonite is a challenging task for current models; fine-tuning T5-Large on 470k cryptic clues achieves only 7.6% accuracy, on par with the accuracy of a rule-based clue solver (8.6%).
Translated: 2021-03-06 07:52:07 Published: 2021-03-01
# Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues

Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues ( http://arxiv.org/abs/2103.00820v1 )

License: CC BY 4.0
Compared to traditional visual question answering, video-grounded dialogues require additional reasoning over dialogue context to answer questions in a multi-turn setting. Previous approaches to video-grounded dialogues mostly use dialogue context as a simple text input without modelling the inherent information flows at the turn level. In this paper, we propose a novel framework of Reasoning Paths in Dialogue Context (PDC). The PDC model discovers information flows among dialogue turns through a semantic graph constructed from the lexical components in each question and answer. It then learns to predict reasoning paths over this semantic graph. Our path prediction model predicts a path from the current turn through past dialogue turns that contain additional visual cues to answer the current question. Our reasoning model sequentially processes both visual and textual information along this reasoning path, and the propagated features are used to generate the answer. Our experimental results demonstrate the effectiveness of our method and provide additional insights into how models use semantic dependencies in a dialogue context to retrieve visual cues.
Translated: 2021-03-06 07:43:50 Published: 2021-03-01
# Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training ( http://arxiv.org/abs/2103.00673v1 )

License: CC BY 4.0
Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve robustness. For ConvNets, most existing methods are based on penalizing or normalizing weight matrices derived from concatenating or flattening the convolutional kernels. These methods often destroy or ignore the benign convolutional structure of the kernels; therefore, they are often expensive or impractical for deep ConvNets. In contrast, we introduce a simple and efficient "convolutional normalization" method that fully exploits the convolutional structure in the Fourier domain and serves as a simple plug-and-play module that can be conveniently incorporated into any ConvNet. Our method is inspired by recent work on preconditioning methods for convolutional sparse coding and can effectively promote each layer's channel-wise isometry. Furthermore, we show that convolutional normalization can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets. Applied to classification under noise corruptions and to generative adversarial networks (GANs), we show that convolutional normalization improves the robustness of common ConvNets such as ResNet and the performance of GANs. We verify our findings via extensive numerical experiments on CIFAR-10, CIFAR-100, and ImageNet.
Translated: 2021-03-06 03:43:15 Published: 2021-03-01
# BERT-based knowledge extraction method of unstructured domain text

BERT-based knowledge extraction method of unstructured domain text ( http://arxiv.org/abs/2103.00728v1 )

License: CC BY-SA 4.0
With the development and business adoption of knowledge graphs, there is an increasing demand for extracting the entities and relations of knowledge graphs from unstructured domain documents. This makes automatic knowledge extraction from domain text quite meaningful. This paper proposes a knowledge extraction method based on BERT, which automatically extracts knowledge points from unstructured domain-specific texts (such as insurance clauses in the insurance industry) to save manpower in knowledge graph construction. Unlike the commonly used methods based on rules, templates, or entity extraction models, this paper converts the domain knowledge points into question-and-answer pairs and uses the text around the answer in the documents as the context. The method adopts a BERT-based model similar to the one used for BERT's SQuAD reading comprehension task. The model is fine-tuned and then used to directly extract knowledge points from further insurance clauses. According to the test results, the model performs well.
Translated: 2021-03-06 01:48:19 Published: 2021-03-01
# Deep Bag-of-Sub-Emotions for Depression Detection in Social Media

Deep Bag-of-Sub-Emotions for Depression Detection in Social Media ( http://arxiv.org/abs/2103.01334v1 )

License: CC BY 4.0
This paper presents the Deep Bag-of-Sub-Emotions (DeepBoSE), a novel deep learning model for depression detection in social media. The model is formulated such that it internally computes a differentiable Bag-of-Features (BoF) representation that incorporates emotional information. This is achieved by reinterpreting classical weighting schemes such as term frequency-inverse document frequency as probabilistic deep learning operations. An important advantage of the proposed method is that it can be trained under the transfer learning paradigm, which is useful for enhancing conventional BoF models that cannot be directly integrated into deep learning architectures. Experiments were performed on the eRisk17 and eRisk18 datasets for the depression detection task; results show that DeepBoSE outperforms conventional BoF representations and is competitive with the state of the art, achieving an F1-score over the positive class of 0.64 on eRisk17 and 0.65 on eRisk18.
Translated: 2021-03-06 01:08:21 Published: 2021-03-01
# A Brief Summary of Interactions Between Meta-Learning and Self-Supervised Learning

A Brief Summary of Interactions Between Meta-Learning and Self-Supervised Learning ( http://arxiv.org/abs/2103.00845v1 )

License: CC BY 4.0
This paper briefly reviews the connections between meta-learning and self-supervised learning. Meta-learning can be applied to improve model generalization capability and to construct general AI algorithms. Self-supervised learning utilizes self-supervision from original data and extracts higher-level generalizable features through unsupervised pre-training or optimization of contrastive loss objectives. In self-supervised learning, data augmentation techniques are widely applied and data labels are not required since pseudo labels can be estimated from trained models on similar tasks. Meta-learning aims to adapt trained deep models to solve diverse tasks and to develop general AI algorithms. We review the associations of meta-learning with both generative and contrastive self-supervised learning models. Unlabeled data from multiple sources can be jointly considered even when data sources are vastly different. We show that an integration of meta-learning and self-supervised learning models can best contribute to the improvement of model generalization capability. Self-supervised learning guided by meta-learner and general meta-learning algorithms under self-supervision are both examples of possible combinations.
Translated: 2021-03-06 00:09:18 Published: 2021-03-01
# Privacy-Preserving Distributed SVD via Federated Power

Privacy-Preserving Distributed SVD via Federated Power ( http://arxiv.org/abs/2103.00704v1 )

License: CC BY 4.0
Singular value decomposition (SVD) is one of the most fundamental tools in machine learning and statistics. The modern machine learning community usually assumes that data come from and belong to small-scale device users. The low communication and computation power of such devices, and the possible privacy breaches of users' sensitive data, make the computation of SVD challenging. Federated learning (FL) is a paradigm enabling a large number of devices to jointly learn a model in a communication-efficient way without data sharing. In the FL framework, we develop a class of algorithms called FedPower for the computation of partial SVD in this modern setting. Based on the well-known power method, the local devices alternate between multiple local power iterations and one global aggregation to improve communication efficiency. In the aggregation, we propose to weight each local eigenvector matrix with the Orthogonal Procrustes Transformation (OPT). To account for stragglers in practice, the aggregation can involve either full or partial device participation; for the latter we propose two sampling and aggregation schemes. Further, to ensure strong privacy protection, we add Gaussian noise whenever communication happens, adopting the notion of differential privacy (DP). We theoretically establish a convergence bound for FedPower. The resulting bound is interpretable, with its parts corresponding to the effects of Gaussian noise, parallelization, and random sampling of devices, respectively. We also conduct experiments to demonstrate the merits of FedPower. In particular, the local iterations not only improve communication efficiency but also reduce the chance of privacy breaches.
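The alternation between local power iterations and a global aggregation can be sketched for the simplest case of a single leading eigenvector. This is only an illustration under strong assumptions, not the FedPower algorithm itself: it uses rank 1 (where the orthogonal Procrustes alignment reduces to a sign flip), omits the Gaussian noise and device sampling, and uses two hypothetical device matrices that were chosen to share the same leading eigenvector so the result is easy to check. All function names are invented for the example.

```python
import math


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def local_power_iterations(matrix, vector, steps):
    """Run several power-method steps on one device without communicating."""
    for _ in range(steps):
        vector = normalize(matvec(matrix, vector))
    return vector


def aggregate(vectors):
    """Align each local vector to the first one, then average and renormalise.

    In the rank-1 case the best orthogonal alignment is just a sign (+1 or -1).
    """
    reference = vectors[0]
    total = [0.0] * len(reference)
    for v in vectors:
        sign = 1.0 if sum(a * b for a, b in zip(reference, v)) >= 0 else -1.0
        total = [t + sign * x for t, x in zip(total, v)]
    return normalize(total)


def federated_power(local_matrices, start, rounds, local_steps):
    vector = normalize(start)
    for _ in range(rounds):
        local = [local_power_iterations(m, vector, local_steps) for m in local_matrices]
        vector = aggregate(local)
    return vector


# Hypothetical symmetric matrices held by two devices; both have the
# leading eigenvector (1, 1) / sqrt(2).
device_matrices = [[[2.0, 1.0], [1.0, 2.0]], [[5.0, 3.0], [3.0, 5.0]]]
top_direction = federated_power(device_matrices, [1.0, 0.0], rounds=10, local_steps=3)
```

Raising `local_steps` while lowering `rounds` trades communication for local computation, which is the efficiency argument the abstract makes.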
Translated: 2021-03-05 20:32:20 Published: 2021-03-01
# The Mathematics Behind Spectral Clustering And The Equivalence To PCA

The Mathematics Behind Spectral Clustering And The Equivalence To PCA ( http://arxiv.org/abs/2103.00733v1 )

License: CC BY-SA 4.0
Spectral clustering is a popular algorithm that clusters points using the eigenvalues and eigenvectors of Laplacian matrices derived from the data. For years, the reason why spectral clustering works has remained something of a mystery. This paper explains spectral clustering by dividing it into two categories based on whether the graph underlying the Laplacian is fully connected or not. For a fully connected graph, this paper explains the dimension reduction part by offering an objective function: the covariance between the original data points' similarities and the mapped data points' similarities. For a multi-connected graph, this paper proves that with a proper $k$, the first $k$ eigenvectors are the indicators of the connected components. This paper also proves there is an equivalence between spectral embedding and PCA.
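The claim about connected components can be checked by hand on a small graph. The sketch below assumes the unnormalised Laplacian $L = D - W$ (the abstract does not say which Laplacian the paper uses) and a hypothetical five-node graph made of a triangle and a separate edge. It verifies that each component's indicator vector is mapped to zero by $L$, that is, it is an eigenvector with eigenvalue 0.

```python
def graph_laplacian(weights):
    """Unnormalised Laplacian L = D - W for a symmetric weight matrix W."""
    size = len(weights)
    laplacian = []
    for i in range(size):
        degree = sum(weights[i])
        laplacian.append(
            [(degree if i == j else 0.0) - weights[i][j] for j in range(size)]
        )
    return laplacian


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Hypothetical graph: nodes 0-2 form a triangle, nodes 3-4 share one edge.
weights = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
laplacian = graph_laplacian(weights)

first_component = [1.0, 1.0, 1.0, 0.0, 0.0]
second_component = [0.0, 0.0, 0.0, 1.0, 1.0]

# Both products are the zero vector, so both indicators have eigenvalue 0.
image_of_first = matvec(laplacian, first_component)
image_of_second = matvec(laplacian, second_component)
```

A vector that is not constant on a component, such as the indicator of node 0 alone, is not mapped to zero, so with $k = 2$ the zero eigenspace here is spanned by exactly the two component indicators.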
Translated: 2021-03-05 20:31:09 Published: 2021-03-01
# Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels

Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels ( http://arxiv.org/abs/2103.01210v1 )

License: CC BY 4.0
We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. We show that under certain conditions, gradient descent achieves small error only if a related tangent kernel method achieves a non-trivial advantage over random guessing (a.k.a. weak learning), though this advantage might be very small even when gradient descent can achieve arbitrarily high accuracy. Complementing this, we show that without these conditions, gradient descent can in fact learn with small error even when no kernel method, in particular using the tangent kernel, can achieve a non-trivial advantage over random guessing.
Translated: 2021-03-05 20:25:15 Published: 2021-03-01
# A Multiclass Boosting Framework for Achieving Fast and Provable Adversarial Robustness

A Multiclass Boosting Framework for Achieving Fast and Provable Adversarial Robustness ( http://arxiv.org/abs/2103.01276v1 )

License: CC BY 4.0
Alongside the well-publicized accomplishments of deep neural networks there has emerged an apparent bug in their success on tasks such as object recognition: with deep models trained using vanilla methods, input images can be slightly corrupted in order to modify output predictions, even when these corruptions are practically invisible. This apparent lack of robustness has led researchers to propose methods that can help to prevent an adversary from having such capabilities. The state-of-the-art approaches have incorporated the robustness requirement into the loss function, and the training process involves taking stochastic gradient descent steps not using original inputs but on adversarially-corrupted ones. In this paper we propose a multiclass boosting framework to ensure adversarial robustness. Boosting algorithms are generally well-suited for adversarial scenarios, as they were classically designed to satisfy a minimax guarantee. We provide a theoretical foundation for this methodology and describe conditions under which robustness can be achieved given a weak training oracle. We show empirically that adversarially-robust multiclass boosting not only outperforms the state-of-the-art methods, it does so at a fraction of the training time.
Translated: 2021-03-05 19:16:27 Published: 2021-03-01
# Generative Particle Variational Inference via Estimation of Functional Gradients

Generative Particle Variational Inference via Estimation of Functional Gradients ( http://arxiv.org/abs/2103.01291v1 )

License: CC BY 4.0
Recently, particle-based variational inference (ParVI) methods have gained interest because they directly minimize the Kullback-Leibler divergence and do not suffer from approximation errors from the evidence-based lower bound. However, many ParVI approaches do not allow arbitrary sampling from the posterior, and the few that do allow such sampling suffer from suboptimality. This work proposes a new method for learning to approximately sample from the posterior distribution. We construct a neural sampler that is trained with the functional gradient of the KL-divergence between the empirical sampling distribution and the target distribution, assuming the gradient resides within a reproducing kernel Hilbert space. Our generative ParVI (GPVI) approach maintains the asymptotic performance of ParVI methods while offering the flexibility of a generative sampler. Through carefully constructed experiments, we show that GPVI outperforms previous generative ParVI methods such as amortized SVGD, and is competitive with ParVI as well as gold-standard approaches like Hamiltonian Monte Carlo for fitting both exactly known and intractable target distributions.
Translated: 2021-03-05 18:36:09 Published: 2021-03-01
# UCB Momentum Q-learning: Correcting the bias without forgetting

UCB Momentum Q-learning: Correcting the bias without forgetting ( http://arxiv.org/abs/2103.01312v1 )

License: CC BY 4.0
We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algorithm for reinforcement learning in tabular and possibly stage-dependent, episodic Markov decision processes. UCBMQ is based on Q-learning, to which we add a momentum term, and relies on the principle of optimism in the face of uncertainty to deal with exploration. The new technical ingredient of UCBMQ is the use of momentum to correct the bias that Q-learning suffers from while, at the same time, limiting the impact it has on the second-order term of the regret. For UCBMQ, we are able to guarantee a regret of at most $O(\sqrt{H^3SAT}+ H^4 S A)$, where $H$ is the length of an episode, $S$ the number of states, $A$ the number of actions, and $T$ the number of episodes, ignoring terms in $\mathrm{poly}\log(SAHT)$. Notably, UCBMQ is the first algorithm that simultaneously matches the lower bound of $\Omega(\sqrt{H^3SAT})$ for large enough $T$ and has a second-order term (with respect to the horizon $T$) that scales only linearly with the number of states $S$.
Translated: 2021-03-05 17:47:24 Published: 2021-03-01
# Cross Modal Focal Loss for RGBD Face Anti-Spoofing

Cross Modal Focal Loss for RGBD Face Anti-Spoofing ( http://arxiv.org/abs/2103.00948v1 )

License: CC BY 4.0
Automatic methods for detecting presentation attacks are essential to ensure the reliable use of facial recognition technology. Most of the methods available in the literature for presentation attack detection (PAD) fail to generalize to unseen attacks. In recent years, multi-channel methods have been proposed to improve the robustness of PAD systems. Often, only a limited amount of data is available for the additional channels, which limits the effectiveness of these methods. In this work, we present a new framework for PAD that uses RGB and depth channels together with a novel loss function. The new architecture uses complementary information from the two modalities while reducing the impact of overfitting. Essentially, a cross-modal focal loss function is proposed to modulate the loss contribution of each channel as a function of the confidence of the individual channels. Extensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed approach.
Translated: 2021-03-05 16:23:18 Published: 2021-03-01
# On the Fairness of Generative Adversarial Networks (GANs)

On the Fairness of Generative Adversarial Networks (GANs) ( http://arxiv.org/abs/2103.00950v1 )

License: CC BY 4.0
Generative adversarial networks (GANs) are one of the greatest advances in AI in recent years. Thanks to their ability to directly learn the probability distribution of data and then sample realistic synthetic data, many applications have emerged that use GANs to solve classical problems in machine learning, such as data augmentation, class imbalance, and fair representation learning. In this paper, we analyze and highlight fairness concerns about GAN models. We show empirically that GAN models may inherently prefer certain groups during training and are therefore unable to generate data homogeneously from different groups at test time. Furthermore, we propose solutions to this issue: conditioning the GAN model on the samples' group, or using an ensemble method (boosting) that allows the GAN model to leverage the distributed structure of the data during training and to generate groups at equal rates at test time.
Translated: 2021-03-05 16:10:07 Published: 2021-03-01
# Domain Generalization via Inference-time Label-Preserving Target Projections

Domain Generalization via Inference-time Label-Preserving Target Projections ( http://arxiv.org/abs/2103.01134v1 )

License: CC BY 4.0
Generalizing machine learning models trained on a set of source domains to unseen target domains with different statistics is a challenging problem. While many approaches have been proposed to solve this problem, they only utilize source data during training and do not take advantage of the fact that a single target example is available at inference time. Motivated by this, we propose a method that effectively uses the target sample during inference beyond mere classification. Our method has three components: (i) a label-preserving feature or metric transformation on the source data such that the source samples are clustered according to their class irrespective of their domain; (ii) a generative model trained on these features; and (iii) a label-preserving projection of the target point onto the source-feature manifold during inference, obtained by solving an optimization problem on the input space of the generative model using the learned metric. Finally, the projected target is used in the classifier. Since the projected target feature comes from the source manifold and has the same label as the real target by design, the classifier is expected to perform better on it than on the true target. We demonstrate that our method outperforms state-of-the-art domain generalization methods on multiple datasets and tasks.
Translated: 2021-03-05 15:41:15 Published: 2021-03-01
# Interpretable Artificial Intelligence through the Lens of Feature Interaction

Interpretable Artificial Intelligence through the Lens of Feature Interaction ( http://arxiv.org/abs/2103.03103v1 )

License: see the linked page
Interpretation of deep learning models is a very challenging problem because of their large number of parameters, complex connections between nodes, and unintelligible feature representations. Despite this, many view interpretability as a key solution to trustworthiness, fairness, and safety, especially as deep learning is applied to more critical decision tasks like credit approval, job screening, and recidivism prediction. There is an abundance of good research providing interpretability to deep learning models; however, many of the commonly used methods do not consider a phenomenon called "feature interaction." This work first explains the historical and modern importance of feature interactions and then surveys the modern interpretability methods which do explicitly consider feature interactions. This survey aims to bring to light the importance of feature interactions in the larger context of machine learning interpretability, especially in a modern context where deep learning models heavily rely on feature interactions.
Translated: 2021-03-05 14:52:13 Published: 2021-03-01
# Performance Variability in Zero-Shot Classification

Performance Variability in Zero-Shot Classification ( http://arxiv.org/abs/2103.01284v1 )

License: CC BY 4.0
Zero-shot classification (ZSC) is the task of learning predictors for classes not seen during training. Although the different methods in the literature are evaluated using the same class splits, little is known about their stability under different class partitions. In this work we show experimentally that ZSC performance exhibits strong variability under changing training setups. We propose the use of ensemble learning as an attempt to mitigate this phenomenon.
Translated: 2021-03-05 14:40:41 Published: 2021-03-01
# Token-Modification Adversarial Attacks for Natural Language Processing: A Survey

Token-Modification Adversarial Attacks for Natural Language Processing: A Survey ( http://arxiv.org/abs/2103.00676v1 )

License: CC BY 4.0
There are now many adversarial attacks for natural language processing systems. Of these, a vast majority achieve success by modifying individual document tokens, which we call here a token-modification attack. Each token-modification attack is defined by a specific combination of fundamental components, such as a constraint on the adversary or a particular search algorithm. Motivated by this observation, we survey existing token-modification attacks and extract the components of each. We use an attack-independent framework to structure our survey, which results in an effective categorisation of the field and an easy comparison of components. We hope this survey will guide new researchers to this field and spark further research into the individual attack components.
Translated: 2021-03-05 11:12:00 Published: 2021-03-01
# Towards Personalized Federated Learning

Towards Personalized Federated Learning ( http://arxiv.org/abs/2103.00710v1 )

License: CC BY 4.0
As artificial intelligence (AI)-empowered applications become widespread, there is growing awareness and concern for user privacy and data confidentiality. This has contributed to the popularity of federated learning (FL). FL applications often face data distribution and device capability heterogeneity across data owners. This has stimulated the rapid development of Personalized FL (PFL). In this paper, we complement existing surveys, which largely focus on the methods and applications of FL, with a review of recent advances in PFL. We discuss hurdles to PFL under the current FL settings, and present a unique taxonomy dividing PFL techniques into data-based and model-based approaches. We highlight their key ideas, and envision promising future trajectories of research towards new PFL architectural design, realistic PFL benchmarking, and trustworthy PFL approaches.
Translated: 2021-03-05 10:05:27 Published: 2021-03-01
# Non-Euclidean Differentially Private Stochastic Convex Optimization

Non-Euclidean Differentially Private Stochastic Convex Optimization ( http://arxiv.org/abs/2103.01278v1 )

License: CC BY 4.0
Differentially private (DP) stochastic convex optimization (SCO) is a fundamental problem, where the goal is to approximately minimize the population risk with respect to a convex loss function, given a dataset of i.i.d. samples from a distribution, while satisfying differential privacy with respect to the dataset. Most of the existing works in the literature of private convex optimization focus on the Euclidean (i.e., $\ell_2$) setting, where the loss is assumed to be Lipschitz (and possibly smooth) w.r.t. the $\ell_2$ norm over a constraint set with bounded $\ell_2$ diameter. Algorithms based on noisy stochastic gradient descent (SGD) are known to attain the optimal excess risk in this setting. In this work, we conduct a systematic study of DP-SCO for $\ell_p$-setups. For $p=1$, under a standard smoothness assumption, we give a new algorithm with nearly optimal excess risk. This result also extends to general polyhedral norms and feasible sets. For $p\in(1, 2)$, we give two new algorithms, whose central building block is a novel privacy mechanism, which generalizes the Gaussian mechanism. Moreover, we establish a lower bound on the excess risk for this range of $p$, showing a necessary dependence on $\sqrt{d}$, where $d$ is the dimension of the space. Our lower bound implies a sudden transition of the excess risk at $p=1$, where the dependence on $d$ changes from logarithmic to polynomial, resolving an open question in prior work [TTZ15] . For $p\in (2, \infty)$, noisy SGD attains optimal excess risk in the low-dimensional regime; in particular, this proves the optimality of noisy SGD for $p=\infty$. Our work draws upon concepts from the geometry of normed spaces, such as the notions of regularity, uniform convexity, and uniform smoothness.
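For orientation, one step of the Euclidean baseline mentioned above, noisy SGD, can be sketched as follows. This is a generic illustration, not the paper's new mechanism for $p\in(1,2)$: the clipping step is a common practical way to enforce the $\ell_2$ gradient bound that the Lipschitz assumption would otherwise provide, and `noise_std` is left as a free, hypothetical parameter because calibrating it to a privacy budget is outside the scope of this sketch.

```python
import math
import random


def clip_l2(gradient, max_norm):
    """Scale the gradient down so that its l2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in gradient]


def noisy_sgd_step(weights, gradient, learning_rate, max_norm, noise_std, rng):
    """One gradient step with independent Gaussian noise on every coordinate."""
    clipped = clip_l2(gradient, max_norm)
    noisy = [g + rng.gauss(0.0, noise_std) for g in clipped]
    return [w - learning_rate * g for w, g in zip(weights, noisy)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clipped_example = clip_l2([3.0, 4.0], 1.0)  # the norm-5 gradient is rescaled to norm 1
noiseless_step = noisy_sgd_step([1.0, 1.0], [3.0, 4.0], 0.5, 1.0, 0.0, rng)
private_step = noisy_sgd_step([1.0, 1.0], [3.0, 4.0], 0.5, 1.0, 0.1, rng)
```

Setting `noise_std` to zero recovers plain clipped SGD, which makes the effect of the added noise easy to isolate when experimenting.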
Translated: 2021-03-05 09:27:21 Published: 2021-03-01
# Assessing deep learning methods for the identification of kidney stones in endoscopic images

Assessing deep learning methods for the identification of kidney stones in endoscopic images ( http://arxiv.org/abs/2103.01146v1 )

License: CC BY 4.0
Knowing the type (i.e., the biochemical composition) of kidney stones is crucial for preventing relapses with an appropriate treatment. During ureteroscopies, kidney stones are fragmented and extracted from the urinary tract, and their composition is determined using a morpho-constitutional analysis. This procedure is time-consuming (the morpho-constitutional analysis results are only available after some days) and tedious (the fragment extraction lasts up to an hour). Identifying the kidney stone type using only the in-vivo endoscopic images would allow the fragments to be dusted, and the morpho-constitutional analysis could be avoided. Only a few contributions dealing with the in-vivo identification of kidney stones have been published. This paper discusses and compares five classification methods, including deep convolutional neural network (DCNN)-based approaches and traditional (non-DCNN-based) ones. Even though the best method is a DCNN approach with a precision and recall of 98% and 97% over four classes, this contribution shows that an XGBoost classifier exploiting well-chosen feature vectors can closely approach the performance of DCNN classifiers for a medical application with a limited amount of annotated data.
Translated: 2021-03-05 06:40:10 Published: 2021-03-01
# BEAUTY Powered BEAST

BEAUTY Powered BEAST ( http://arxiv.org/abs/2103.00674v1 )

License: CC BY 4.0
We study inference about the uniform distribution with the proposed binary expansion approximation of uniformity (BEAUTY) approach. Through an extension of the celebrated Euler's formula, we approximate the characteristic function of any copula distribution with a linear combination of means of binary interactions from marginal binary expansions. This novel characterization enables a unification of many important existing tests through an approximation from some quadratic form of symmetry statistics, where the deterministic weight matrix characterizes the power properties of each test. To achieve a uniformly high power, we study test statistics with data-adaptive weights through an oracle approach, referred to as the binary expansion adaptive symmetry test (BEAST). By utilizing the properties of the binary expansion filtration, we show that the Neyman-Pearson test of uniformity can be approximated by an oracle weighted sum of symmetry statistics. The BEAST with this oracle leads all existing tests we considered in empirical power against all complex forms of alternatives. This oracle therefore sheds light on the potential of substantial improvements in power and on the form of optimal weights under each alternative. By approximating this oracle with data-adaptive weights, we develop the BEAST that improves the empirical power of many existing tests against a wide spectrum of common alternatives while providing clear interpretation of the form of non-uniformity upon rejection. We illustrate the BEAST with a study of the relationship between the location and brightness of stars.
Translated: 2021-03-05 05:59:12 Published: 2021-03-01
# Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models

Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models ( http://arxiv.org/abs/2103.00747v1 )

License: CC BY 4.0
Misinformation about COVID-19 is prevalent on social media as the pandemic unfolds, and the associated risks are extremely high. Thus, it is critical to detect and combat such misinformation. Recently, deep learning models using natural language processing techniques, such as BERT (Bidirectional Encoder Representations from Transformers), have achieved great success in detecting misinformation. In this paper, we propose an explainable natural language processing model to combat misinformation about COVID-19; it is based on DistilBERT and SHAP (Shapley Additive exPlanations), which were chosen for their efficiency and effectiveness. First, we collected a dataset of 984 fact-checked claims about COVID-19. By augmenting the data using back-translation, we doubled the sample size of the dataset, and the DistilBERT model obtained good performance (accuracy: 0.972; area under the curve: 0.993) in detecting misinformation about COVID-19. Our model was also tested on a larger dataset from the AAAI2021 COVID-19 Fake News Detection Shared Task and obtained good performance (accuracy: 0.938; area under the curve: 0.985). The performance on both datasets was better than that of traditional machine learning models. Second, in order to boost public trust in model predictions, we employed SHAP to improve model explainability, which was further evaluated using a between-subjects experiment with three conditions, i.e., text (T), text+SHAP explanation (TSE), and text+SHAP explanation+source and evidence (TSESE). The participants were significantly more likely to trust and share information related to COVID-19 in the TSE and TSESE conditions than in the T condition. Our results have useful implications for detecting misinformation about COVID-19 and improving public trust.
翻訳日:2021-03-05 05:26:24 公開日:2021-03-01
# (Reference translation) Vy\=akarana: A Colorless Green Benchmark for Syntactic Evaluation

Vy\=akarana: A Colorless Green Benchmark for Syntactic Evaluation in Indic Languages ( http://arxiv.org/abs/2103.00854v1 )

License: CC BY 4.0
While there has been significant progress towards developing NLU datasets and benchmarks for Indic languages, syntactic evaluation has been relatively less explored. Unlike English, Indic languages have rich morphosyntax, grammatical genders, free linear word-order, and highly inflectional morphology. In this paper, we introduce Vy\=akarana: a benchmark of gender-balanced Colorless Green sentences in Indic languages for syntactic evaluation of multilingual language models. The benchmark comprises four syntax-related tasks: PoS Tagging, Syntax Tree-depth Prediction, Grammatical Case Marking, and Subject-Verb Agreement. We use the datasets from the evaluation tasks to probe five multilingual language models of varying architectures for syntax in Indic languages. Our results show that the token-level and sentence-level representations from the Indic language models (IndicBERT and MuRIL) do not capture the syntax in Indic languages as efficiently as the other highly multilingual language models. Further, our layer-wise probing experiments reveal that while mBERT, DistilmBERT, and XLM-R localize the syntax in middle layers, the Indic language models do not show such syntactic localization.
Translated: 2021-03-05 04:43:17 Published: 2021-03-01
# (Reference translation) Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task

Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task ( http://arxiv.org/abs/2103.01065v1 )

License: CC BY-SA 4.0
In this paper, we tackle the Nuanced Arabic Dialect Identification (NADI) shared task (Abdul-Mageed et al., 2021) and demonstrate state-of-the-art results on all four of its subtasks. The tasks are to identify the geographic origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances at both the country and province levels. Our final model is an ensemble of variants built on top of MARBERT that achieves an F1-score of 34.03% for DA on the country-level development set -- an improvement of 7.63% over previous work.
Translated: 2021-03-05 04:25:18 Published: 2021-03-01
# (Reference translation) Sentiment Analysis of User Reviews of COVID-19 Contact Tracing Apps Using a Benchmark Dataset

Sentiment Analysis of Users' Reviews on COVID-19 Contact Tracing Apps with a Benchmark Dataset ( http://arxiv.org/abs/2103.01196v1 )

License: CC BY 4.0
Contact tracing has been globally adopted in the fight to control the infection rate of COVID-19. Thanks to digital technologies, such as smartphones and wearable devices, contacts of COVID-19 patients can be easily traced and informed about their potential exposure to the virus. To this aim, several interesting mobile applications have been developed. However, there are ever-growing concerns over the working mechanism and performance of these applications. The literature already provides some interesting exploratory studies on the community's response to the applications by analyzing information from different sources, such as news and users' reviews of the applications. However, to the best of our knowledge, there is no existing solution that automatically analyzes users' reviews and extracts the evoked sentiments. In this work, we propose a pipeline starting from manual annotation via a crowd-sourcing study and concluding with the development and training of AI models for automatic sentiment analysis of users' reviews. In total, we employ eight different methods, achieving an average F1-score of up to 94.8%, which indicates the feasibility of automatic sentiment analysis of users' reviews on the COVID-19 contact tracing applications. We also highlight the key advantages, drawbacks, and users' concerns over the applications. Moreover, we collect and annotate a large-scale dataset composed of 34,534 manually annotated reviews of the contact tracing applications of 46 distinct countries. The presented analysis and the dataset are expected to provide a baseline/benchmark for future research in the domain.
Translated: 2021-03-05 04:18:38 Published: 2021-03-01
# (Reference translation) On the Effectiveness of Dataset Embeddings in Monolingual, Multilingual, and Zero-Shot Conditions

On the Effectiveness of Dataset Embeddings in Mono-lingual,Multi-lingual and Zero-shot Conditions ( http://arxiv.org/abs/2103.01273v1 )

License: CC BY 4.0
Recent complementary strands of research have shown that leveraging information on the data source through encoding their properties into embeddings can lead to performance increase when training a single model on heterogeneous data sources. However, it remains unclear in which situations these dataset embeddings are most effective, because they are used in a large variety of settings, languages and tasks. Furthermore, it is usually assumed that gold information on the data source is available, and that the test data is from a distribution seen during training. In this work, we compare the effect of dataset embeddings in mono-lingual settings, multi-lingual settings, and with predicted data source label in a zero-shot setting. We evaluate on three morphosyntactic tasks: morphological tagging, lemmatization, and dependency parsing, and use 104 datasets, 66 languages, and two different dataset grouping strategies. Performance increases are highest when the datasets are of the same language, and we know from which distribution the test-instance is drawn. In contrast, for setups where the data is from an unseen distribution, performance increase vanishes.
Translated: 2021-03-05 04:00:15 Published: 2021-03-01
# (Reference translation) Emotion Dynamics in Movie Dialogues

Emotion Dynamics in Movie Dialogues ( http://arxiv.org/abs/2103.01345v1 )

License: CC BY 4.0
Emotion dynamics is a framework for measuring how an individual's emotions change over time. It is a powerful tool for understanding how we behave and interact with the world. In this paper, we introduce a framework to track emotion dynamics through one's utterances. Specifically we introduce a number of utterance emotion dynamics (UED) metrics inspired by work in Psychology. We use this approach to trace emotional arcs of movie characters. We analyze thousands of such character arcs to test hypotheses that inform our broader understanding of stories. Notably, we show that there is a tendency for characters to use increasingly more negative words and become increasingly emotionally discordant with each other until about 90 percent of the narrative length. UED also has applications in behavior studies, social sciences, and public health.
Translated: 2021-03-05 03:07:32 Published: 2021-03-01
# (Reference translation) Explaining Adversarial Vulnerability with a Data Sparsity Hypothesis

Explaining Adversarial Vulnerability with a Data Sparsity Hypothesis ( http://arxiv.org/abs/2103.00778v1 )

License: CC BY 4.0
Despite many proposed algorithms to provide robustness to deep learning (DL) models, DL models remain susceptible to adversarial attacks. We hypothesize that the adversarial vulnerability of DL models stems from two factors. The first factor is data sparsity: in the high-dimensional data space, there are large regions outside the support of the data distribution. The second factor is the existence of many redundant parameters in the DL models. Owing to these factors, different models are able to come up with different decision boundaries with comparably high prediction accuracy. The appearance of the decision boundaries in the space outside the support of the data distribution does not affect the prediction accuracy of the model. However, it makes an important difference in the adversarial robustness of the model. We propose that the ideal decision boundary should be as far as possible from the support of the data distribution. In this paper, we develop a training framework for DL models to learn such decision boundaries spanning the space around the class distributions further from the data points themselves. Semi-supervised learning was deployed to achieve this objective by leveraging unlabeled data generated in the space outside the support of the data distribution. We measure the adversarial robustness of the models trained using this training framework against well-known adversarial attacks. We found that our results, other regularization methods and adversarial training also support our hypothesis of data sparsity. We show that the unlabeled data generated by noise using our framework is almost as effective as unlabeled data, sourced from existing data sets or generated by synthesis algorithms, on adversarial robustness. Our code is available at https://github.com/MahsaPaknezhad/AdversariallyRobustTraining.
Translated: 2021-03-05 01:22:46 Published: 2021-03-01
# (Reference translation) Improving the Performance of Steganalysis Schemes Using Contrastive Learning

Using contrastive learning to improve the performance of steganalysis schemes ( http://arxiv.org/abs/2103.00891v1 )

License: CC BY 4.0
To improve the detection accuracy and generalization of steganalysis, this paper proposes the Steganalysis Contrastive Framework (SCF) based on contrastive learning. The SCF improves the feature representation of steganalysis by maximizing the distance between features of samples of different categories and minimizing the distance between features of samples of the same category. To decrease the computational complexity of the contrastive loss in supervised learning, we design a novel Steganalysis Contrastive Loss (StegCL) based on the equivalence and transitivity of similarity. The StegCL eliminates the redundant computation in the existing contrastive loss. The experimental results show that the SCF improves the generalization and detection accuracy of existing steganalysis DNNs, with maximum improvements of 2% and 3%, respectively. Without decreasing the detection accuracy, the training time when using the StegCL is 10% of that when using the contrastive loss in supervised learning.
Translated: 2021-03-05 00:44:31 Published: 2021-03-01
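The pull-together, push-apart objective described in the abstract above can be sketched with a generic margin-based pairwise contrastive loss. This is not the paper's StegCL, whose exact form the abstract does not give; the function name, the margin value, and the "cover"/"stego" labels are assumptions made for illustration.

```python
import math
from itertools import combinations

def pairwise_contrastive_loss(features, labels, margin=1.0):
    """Average loss over all pairs: small distances for same-class pairs,
    distances of at least `margin` for different-class pairs."""
    total, count = 0.0, 0
    for i, j in combinations(range(len(features)), 2):
        dist = math.dist(features[i], features[j])
        if labels[i] == labels[j]:
            total += dist ** 2                      # pull same-class pairs together
        else:
            total += max(0.0, margin - dist) ** 2   # push different-class pairs apart
        count += 1
    return total / count

features = [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0)]
labels = ["cover", "cover", "stego"]
print(pairwise_contrastive_loss(features, labels))
```

The loop visits every pair, so the cost grows quadratically with batch size; that quadratic pairing is the kind of redundancy a cheaper loss would aim to remove.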
# (Reference translation) Measuring Inconsistency over Sequences of Business Rule Cases

Measuring Inconsistency over Sequences of Business Rule Cases ( http://arxiv.org/abs/2103.01108v1 )

License: CC BY 4.0
In this report, we investigate (element-based) inconsistency measures for multisets of business rule bases. Currently, related work allows individual rule bases to be assessed. However, as companies might encounter thousands of such instances daily, it becomes necessary to study not only individual rule bases separately but also their interrelations, especially with regard to determining suitable re-modelling strategies. We therefore present an approach to induce multiset-measures from arbitrary (traditional) inconsistency measures, propose new rationality postulates for a multiset use-case, and investigate the complexity of various aspects regarding multi-rule base inconsistency measurement.
Translated: 2021-03-05 00:32:37 Published: 2021-03-01
# (Reference translation) The Value of Communication for Planning in Ad Hoc Teamwork

Expected Value of Communication for Planning in Ad Hoc Teamwork ( http://arxiv.org/abs/2103.01171v1 )

License: CC BY 4.0
A desirable goal for autonomous agents is to be able to coordinate on the fly with previously unknown teammates. Known as "ad hoc teamwork", enabling such a capability has been receiving increasing attention in the research community. One of the central challenges in ad hoc teamwork is quickly recognizing the current plans of other agents and planning accordingly. In this paper, we focus on the scenario in which teammates can communicate with one another, but only at a cost. Thus, they must carefully balance plan recognition based on observations vs. that based on communication. This paper proposes a new metric for evaluating how similar two policies that a teammate may be following are: the Expected Divergence Point (EDP). We then present a novel planning algorithm for ad hoc teamwork that determines which query to ask and plans accordingly. We demonstrate the effectiveness of this algorithm in a range of increasingly general communication in ad hoc teamwork problems.
Translated: 2021-03-04 23:41:18 Published: 2021-03-01
# (Reference translation) Deep Learning Based Geometric Registration for Medical Images: How Accurate Can We Get Without Visual Features?

Deep learning based geometric registration for medical images: How accurate can we get without visual features? ( http://arxiv.org/abs/2103.00885v1 )

License: CC BY 4.0
As in other areas of medical image analysis, e.g. semantic segmentation, deep learning is currently driving the development of new approaches for image registration. Multi-scale encoder-decoder network architectures achieve state-of-the-art accuracy on tasks such as intra-patient alignment of abdominal CT or brain MRI registration, especially when additional supervision, such as anatomical labels, is available. The success of these methods relies to a large extent on the outstanding ability of deep CNNs to extract descriptive visual features from the input images. In contrast to conventional methods, the explicit inclusion of geometric information plays only a minor role, if at all. In this work we take a look at an exactly opposite approach by investigating a deep learning framework for registration based solely on geometric features and optimisation. We combine graph convolutions with loopy belief message passing to enable highly accurate 3D point cloud registration. Our experimental validation is conducted on complex key-point graphs of inner lung structures, strongly outperforming dense encoder-decoder networks and other point set registration methods. Our code is publicly available at https://github.com/multimodallearning/deep-geo-reg.
Translated: 2021-03-04 22:35:52 Published: 2021-03-01
# (Reference translation) Categorical Depth Distribution Network for Monocular 3D Object Detection

Categorical Depth Distribution Network for Monocular 3D Object Detection ( http://arxiv.org/abs/2103.01100v1 )

License: CC BY 4.0
Monocular 3D object detection is a key problem for autonomous vehicles, as it provides a solution with simple configuration compared to typical multi-sensor systems. The main challenge in monocular 3D detection lies in accurately predicting object depth, which must be inferred from object and scene cues due to the lack of direct range measurement. Many methods attempt to directly estimate depth to assist in 3D detection, but show limited performance as a result of depth inaccuracy. Our proposed solution, Categorical Depth Distribution Network (CaDDN), uses a predicted categorical depth distribution for each pixel to project rich contextual feature information to the appropriate depth interval in 3D space. We then use the computationally efficient bird's-eye-view projection and single-stage detector to produce the final output bounding boxes. We design CaDDN as a fully differentiable end-to-end approach for joint depth estimation and object detection. We validate our approach on the KITTI 3D object detection benchmark, where we rank 1st among published monocular methods. We also provide the first monocular 3D detection results on the newly released Waymo Open Dataset. The source code for CaDDN will be made publicly available before publication.
Translated: 2021-03-04 22:22:46 Published: 2021-03-01
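The step of projecting a pixel's feature to depth intervals according to a categorical depth distribution can be sketched as an outer product between the feature vector and the per-bin probabilities. This is a single-pixel illustration in plain Python; `lift_pixel`, the logits, and the bin count are hypothetical, and the network itself operates on full feature maps.

```python
import math

def softmax(logits):
    """Turn raw depth-bin scores into a categorical distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def lift_pixel(feature, depth_logits):
    """Outer product: one copy of the pixel feature per depth bin,
    weighted by the probability that the pixel lies in that bin."""
    probs = softmax(depth_logits)
    return [[p * f for f in feature] for p in probs]

# One pixel with a 2-channel feature and 3 depth bins; the last bin is favoured.
frustum = lift_pixel(feature=[1.0, -2.0], depth_logits=[0.0, 0.0, 5.0])
for depth_bin, weighted_feature in enumerate(frustum):
    print(depth_bin, weighted_feature)
```

Because the weights sum to one, a confident depth estimate places nearly the whole feature in one bin, while an uncertain estimate spreads it along the camera ray.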
# (Reference translation) Coarse-Fine Networks for Temporal Activity Detection in Videos

Coarse-Fine Networks for Temporal Activity Detection in Videos ( http://arxiv.org/abs/2103.01302v1 )

License: CC BY 4.0
In this paper, we introduce 'Coarse-Fine Networks', a two-stream architecture which benefits from different abstractions of temporal resolution to learn better video representations for long-term motion. Traditional video models process inputs at one (or a few) fixed temporal resolutions without any dynamic frame selection. However, we argue that processing multiple temporal resolutions of the input, and doing so dynamically by learning to estimate the importance of each frame, can largely improve video representations, especially in the domain of temporal activity localization. To this end, we propose (1) 'Grid Pool', a learned temporal downsampling layer to extract coarse features, and (2) 'Multi-stage Fusion', a spatio-temporal attention mechanism to fuse a fine-grained context with the coarse features. We show that our method can outperform the state of the art for action detection on public datasets including Charades with a significantly reduced compute and memory footprint.
Translated: 2021-03-04 22:05:16 Published: 2021-03-01
# (Reference translation) Heterogeneity for the Win: One-Shot Federated Clustering

Heterogeneity for the Win: One-Shot Federated Clustering ( http://arxiv.org/abs/2103.00697v1 )

License: CC BY-SA 4.0
In this work, we explore the unique challenges -- and opportunities -- of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, $k$-FED, based on the widely-used Lloyd's method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse $k$-FED under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device $(k')$ is smaller than the total number of clusters over the network $k$, $(k'\le \sqrt{k})$, we can use heterogeneity to our advantage -- significantly weakening the cluster separation requirements for $k$-FED. From a practical viewpoint, $k$-FED also has many desirable properties: it requires only one round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.
Translated: 2021-03-04 15:46:22 Published: 2021-03-01
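A one-shot federated clustering round of the kind described above might look like the following simplified sketch: each device runs Lloyd's method locally and uploads only its centers, and the server clusters those centers once. The farthest-first seeding, the helper names, and the toy data are assumptions for illustration and do not reproduce the paper's exact server-side procedure.

```python
import math

def mean(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def farthest_first(points, k):
    """Pick k seeds: the first point, then repeatedly the point farthest from all seeds."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds

def lloyd(points, k, iterations=10):
    """Lloyd's method for k-means with deterministic farthest-first seeding."""
    centers = farthest_first(points, k)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers

def one_shot_federated_clustering(device_data, k_local, k_global):
    # Single communication round: every device uploads only its local centers.
    uploaded = [c for data in device_data for c in lloyd(data, k_local)]
    # The server clusters the uploaded centers to obtain the global centers.
    return lloyd(uploaded, k_global)

devices = [
    [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)],   # holds clusters A and B
    [(0.0, 0.5), (0.0, 0.6), (20.0, 0.0), (20.0, 1.0)],   # holds clusters A and C
]
print(sorted(one_shot_federated_clustering(devices, k_local=2, k_global=3)))
```

No raw data leaves a device in this sketch, and a device that fails to report simply contributes no centers, which is one way to read the abstract's remark about partial participation.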
# (Reference translation) Adaptive Sampling for Minimax Fair Classification

Adaptive Sampling for Minimax Fair Classification ( http://arxiv.org/abs/2103.00755v1 )

License: CC BY 4.0
Machine learning models trained on imbalanced datasets can often end up adversely affecting inputs belonging to the underrepresented groups. To address this issue, we consider the problem of adaptively constructing training sets which allow us to learn classifiers that are fair in a minimax sense. We first propose an adaptive sampling algorithm based on the principle of optimism, and derive theoretical bounds on its performance. We then suitably adapt the techniques developed for the analysis of our proposed algorithm to derive bounds on the performance of a related $\epsilon$-greedy strategy recently proposed in the literature. Next, by deriving algorithm independent lower-bounds for a specific class of problems, we show that the performance achieved by our adaptive scheme cannot be improved in general. We then validate the benefits of adaptively constructing training sets via experiments on synthetic tasks with logistic regression classifiers, as well as on several real-world tasks using convolutional neural networks.
Translated: 2021-03-04 14:40:04 Published: 2021-03-01
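The $\epsilon$-greedy strategy mentioned in the abstract can be sketched as a rule for choosing which group the next training example is drawn from. The sketch assumes a dictionary of current per-group loss estimates is available; `choose_group` and the group names are hypothetical, and the authors' optimism-based algorithm is not shown here.

```python
import random

def choose_group(group_losses, epsilon, rng):
    """Pick the group to draw the next training example from."""
    groups = sorted(group_losses)
    if rng.random() < epsilon:
        return rng.choice(groups)                 # explore: any group, uniformly
    return max(groups, key=group_losses.get)      # exploit: currently worst-off group

rng = random.Random(0)
losses = {"group_a": 0.10, "group_b": 0.35, "group_c": 0.20}
picks = [choose_group(losses, epsilon=0.1, rng=rng) for _ in range(1000)]
print({g: picks.count(g) for g in sorted(losses)})  # group_b should dominate
```

Sampling mostly from the group with the highest estimated loss is what pushes the training set toward a minimax-fair classifier, while the exploration step keeps the loss estimates of the other groups from going stale.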
# (Reference translation) STUDD: A Student-Teacher Method for Unsupervised Concept Drift Detection

STUDD: A Student-Teacher Method for Unsupervised Concept Drift Detection ( http://arxiv.org/abs/2103.00903v1 )

License: CC BY 4.0
Concept drift detection is a crucial task in evolving data stream environments. Most state-of-the-art approaches designed to tackle this problem monitor the loss of predictive models. However, this approach falls short in many real-world scenarios, where the true labels are not readily available to compute the loss. In this context, there is increasing attention to approaches that perform concept drift detection in an unsupervised manner, i.e., without access to the true labels. We propose a novel approach to unsupervised concept drift detection based on a student-teacher learning paradigm. Essentially, we create an auxiliary model (student) to mimic the behaviour of the primary model (teacher). At run-time, our approach is to use the teacher for predicting new instances and to monitor the mimicking loss of the student for concept drift detection. In a set of experiments using 19 data streams, we show that the proposed approach can detect concept drift and presents competitive behaviour relative to state-of-the-art approaches.
Translated: 2021-03-04 13:08:39 Published: 2021-03-01
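The student-teacher monitoring loop described above can be sketched as follows. The teacher and student here are toy threshold functions, and the change detector is a small Page-Hinkley test chosen for illustration; the class, the function names, and the parameter values are assumptions, not the authors' implementation.

```python
class PageHinkley:
    """Signals when the running mean of a stream rises by more than `threshold`."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta, self.threshold = delta, threshold
        self.count, self.mean = 0, 0.0
        self.cumulative, self.minimum = 0.0, 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold

def detect_drift(stream, teacher, student, detector):
    """Return the index at which the student stops mimicking the teacher, or None."""
    for index, x in enumerate(stream):
        mimic_loss = float(teacher(x) != student(x))   # needs no true label
        if detector.update(mimic_loss):
            return index
    return None

teacher = lambda x: int(x * x > 1)   # primary model
student = lambda x: int(x > 1)       # mimics the teacher wherever x >= -1
stream = [0.0, 2.0] * 50 + [-2.0] * 100   # inputs move to a new region at index 100
print(detect_drift(stream, teacher, student, PageHinkley()))
```

The student agrees with the teacher on the region it was fitted to, so a sustained rise in disagreement is taken as a sign that the inputs have moved, without ever consulting ground-truth labels.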
# (Reference translation) DTW-Merge: A Novel Data Augmentation Technique for Time Series Classification

DTW-Merge: A Novel Data Augmentation Technique for Time Series Classification ( http://arxiv.org/abs/2103.01119v1 )

License: CC BY 4.0
In recent years, neural networks have achieved much success in various applications. The main challenge in training deep neural networks is the lack of sufficient data to improve the model's generalization and avoid overfitting. One solution is to generate new training samples. This paper proposes a novel data augmentation method for time series based on Dynamic Time Warping. This method is inspired by the concept that warped parts of two time series have the same temporal properties. Applying the proposed approach with the recently introduced ResNet improves results on the 2018 UCR Time Series Classification Archive.
Translated: 2021-03-04 12:53:13 Published: 2021-03-01
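One way to read the augmentation idea above is: align two series of the same class with Dynamic Time Warping, pick a point on the warping path, and join the first part of one series to the remainder of the other. The sketch below follows that reading; `dtw_merge` and its `cut` parameter are hypothetical names, and the abstract does not specify how the merge point is chosen.

```python
def dtw_path(a, b):
    """Return the optimal DTW alignment between a and b as (i, j) index pairs."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + step
    # Backtrack from the end of both series to the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    return path[::-1]

def dtw_merge(a, b, cut):
    """Join a up to the cut-th aligned pair with the part of b that follows it."""
    i, j = dtw_path(a, b)[cut]
    return a[:i + 1] + b[j + 1:]

first = [0, 1, 2, 3]
second = [0, 1, 1, 2, 3]
print(dtw_path(first, second))
print(dtw_merge(first, second, cut=2))
```

Cutting at an aligned pair, rather than at the same raw index in both series, is what keeps the joined halves temporally compatible.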
# (Reference translation) An Automated Data-Driven Approach to Gap Filling in Time Series Using Evolutionary Learning

Automated data-driven approach for gap filling in the time series using evolutionary learning ( http://arxiv.org/abs/2103.01124v1 )

License: CC BY 4.0
Time series analysis is widely used in various fields of science and industry. However, the vast majority of time series obtained from real sources contain a large number of gaps, have a complex character, and can contain incorrect or missing parts. It is therefore useful to have a convenient, efficient, and flexible instrument for filling the gaps in time series. In this paper, we propose an approach for filling the gaps using evolutionary automated machine learning, which is implemented as part of the FEDOT framework. Automated identification of the optimal data-driven model structure allows the gap-filling strategy to be adapted to the specific problem. As a case study, a multivariate sea surface height dataset is used. In the experimental studies, the proposed approach was compared with other gap-filling methods, and the composite models achieved higher-quality gap restoration.
Translated: 2021-03-04 12:47:28 Published: 2021-03-01
# (Reference translation) Reasons, Values, Stakeholders: A Philosophical Framework for Explainable Artificial Intelligence

Reasons, Values, Stakeholders: A Philosophical Framework for Explainable Artificial Intelligence ( http://arxiv.org/abs/2103.00752v1 )

License: CC BY-SA 4.0
The societal and ethical implications of the use of opaque artificial intelligence systems for consequential decisions, such as welfare allocation and criminal justice, have generated a lively debate among multiple stakeholder groups, including computer scientists, ethicists, social scientists, policy makers, and end users. However, the lack of a common language or a multi-dimensional framework to appropriately bridge the technical, epistemic, and normative aspects of this debate prevents the discussion from being as productive as it could be. Drawing on the philosophical literature on the nature and value of explanations, this paper offers a multi-faceted framework that brings more conceptual precision to the present debate by (1) identifying the types of explanations that are most pertinent to artificial intelligence predictions, (2) recognizing the relevance and importance of social and ethical values for the evaluation of these explanations, and (3) demonstrating the importance of these explanations for incorporating a diversified approach to improving the design of truthful algorithmic ecosystems. The proposed philosophical framework thus lays the groundwork for establishing a pertinent connection between the technical and ethical aspects of artificial intelligence systems.
Translated: 2021-03-04 11:09:22 Published: 2021-03-01
# (Reference translation) A Bioinspired Approach-Sensitive Neural Network for Collision Detection in Cluttered and Dynamic Backgrounds

A Bioinspired Approach-Sensitive Neural Network for Collision Detection in Cluttered and Dynamic Backgrounds ( http://arxiv.org/abs/2103.00857v1 )

License: CC BY 4.0
Rapid, accurate and robust detection of looming objects in cluttered moving backgrounds is a significant and challenging problem for robotic visual systems to perform collision detection and avoidance tasks. Inspired by the neural circuit of elementary motion vision in the mammalian retina, this paper proposes a bioinspired approach-sensitive neural network (ASNN) with three main contributions. Firstly, a direction-selective visual processing module is built based on the spatiotemporal energy framework, which can estimate motion direction accurately via only two mutually perpendicular spatiotemporal filtering channels. Secondly, a novel approach-sensitive neural network is modeled as a push-pull structure formed by ON and OFF pathways, which responds strongly to approaching motion while remaining insensitive to lateral motion. Finally, a method of directionally selective inhibition is introduced, which is able to suppress the translational backgrounds effectively. Extensive synthetic and real robotic experiments show that the proposed model is able to not only detect collision accurately and robustly in cluttered and dynamic backgrounds but also extract more collision information, such as position and direction, for guiding rapid decision making.
Translated: 2021-03-04 11:08:24 Published: 2021-03-01
# (Reference translation) Unsupervised Depth and Ego-Motion Estimation for Monocular Thermal Video Using Multi-Spectral Consistency Loss

Unsupervised Depth and Ego-motion Estimation for Monocular Thermal Video using Multi-spectral Consistency Loss ( http://arxiv.org/abs/2103.00760v1 )

License: CC BY 4.0
Most deep-learning based depth and ego-motion networks have been designed for visible cameras. However, visible cameras rely heavily on the presence of an external light source. Therefore, it is challenging to use them under low-light conditions such as night scenes, tunnels, and other harsh conditions. A thermal camera is one solution to compensate for this problem because it detects Long Wave Infrared Radiation (LWIR) regardless of any external light sources. However, despite this advantage, neither depth nor ego-motion estimation for thermal cameras has been actively explored so far. In this paper, we propose an unsupervised learning method for all-day depth and ego-motion estimation. The proposed method exploits a multi-spectral consistency loss to give complementary supervision to the networks by reconstructing visible and thermal images with the depth and pose estimated from thermal images. The networks trained with the proposed method robustly estimate the depth and pose from monocular thermal video under low-light and even zero-light conditions. To the best of our knowledge, this is the first work to simultaneously estimate both depth and ego-motion from monocular thermal video in an unsupervised manner.
Translated: 2021-03-04 09:15:00 Published: 2021-03-01
# (Reference translation) Towards Unbiased COVID-19 Lesion Localisation and Segmentation via Weakly Supervised Learning

Towards Unbiased COVID-19 Lesion Localisation and Segmentation via Weakly Supervised Learning ( http://arxiv.org/abs/2103.00780v1 )

License: CC BY 4.0
Despite tremendous efforts, it is very challenging to generate a robust model to assist in the accurate quantification assessment of COVID-19 on chest CT images. Due to the nature of blurred boundaries, the supervised segmentation methods usually suffer from annotation biases. To support unbiased lesion localisation and to minimise the labeling costs, we propose a data-driven framework supervised by only image-level labels. The framework can explicitly separate potential lesions from original images, with the help of a generative adversarial network and a lesion-specific decoder. Experiments on two COVID-19 datasets demonstrate the effectiveness of the proposed framework and its superior performance to several existing methods.
Translated: 2021-03-04 09:00:06 Published: 2021-03-01
# (Reference translation) Emotion Pattern Detection in Facial Videos Using Functional Statistics

Emotion pattern detection on facial videos using functional statistics ( http://arxiv.org/abs/2103.00844v1 )

License: CC BY 4.0
There is increasing scientific interest in automatically analysing and understanding human behavior, with particular reference to the evolution of facial expressions and the recognition of the corresponding emotions. In this paper we propose a technique based on Functional ANOVA to extract significant patterns of facial muscle movements, in order to identify the emotions expressed by actors in recorded videos. We determine whether there are time-related differences in expressions among emotional groups by using a functional F-test. Such results are the first step towards the construction of a reliable automatic emotion recognition system.
Translated: 2021-03-04 08:51:40 Published: 2021-03-01
# (Reference translation) Automatic Stockpile Volume Monitoring Using Multi-View Stereo from SkySat Imagery

Automatic Stockpile Volume Monitoring using Multi-view Stereo from SkySat Imagery ( http://arxiv.org/abs/2103.00945v1 )

License: CC BY-SA 4.0
This paper proposes a system for automatic surface volume monitoring from time series of SkySat pushframe imagery. A specific challenge of building and comparing large 3D models from SkySat data is to correct inconsistencies between the camera models associated to the multiple views that are necessary to cover the area at a given time, where these camera models are represented as Rational Polynomial Cameras (RPCs). We address the problem by proposing a date-wise RPC refinement, able to handle dynamic areas covered by sets of partially overlapping views. The cameras are refined by means of a rotation that compensates for errors due to inaccurate knowledge of the satellite attitude. The refined RPCs are then used to reconstruct multiple consistent Digital Surface Models (DSMs) from different stereo pairs at each date. RPC refinement strengthens the consistency between the DSMs of each date, which is extremely beneficial to accurately measure volumes in the 3D surface models. The system is tested in a real case scenario, to monitor large coal stockpiles. Our volume estimates are validated with measurements collected on site in the same period of time.
Translated: 2021-03-04 08:45:41 Published: 2021-03-01
# (Reference translation) Deep Perceptual Image Quality Assessment for Compression

Deep Perceptual Image Quality Assessment for Compression ( http://arxiv.org/abs/2103.01114v1 )

License: CC BY 4.0
Lossy image compression is necessary for efficient storage and transfer of data. Typically, the trade-off between bit-rate and quality determines the optimal compression level. This makes the image quality metric an integral part of any imaging system. While the existing full-reference metrics such as PSNR and SSIM may be less sensitive to perceptual quality, the recently introduced learning methods may fail to generalize to unseen data. In this paper we propose the largest image compression quality dataset to date with human perceptual preferences, enabling the use of deep learning, and we develop a full-reference perceptual quality assessment metric for lossy image compression that outperforms the existing state-of-the-art methods. We show that the proposed model can effectively learn from thousands of examples available in the new dataset, and consequently it generalizes better to other unseen datasets of human perceptual preference.
Translated: 2021-03-04 08:37:15 Published: 2021-03-01
# (Reference translation) Geometry-Based Grasping of Vine Tomatoes

Geometry-Based Grasping of Vine Tomatoes ( http://arxiv.org/abs/2103.01272v1 )

License: CC BY 4.0
We propose a geometry-based grasping method for vine tomatoes. It relies on a computer-vision pipeline to identify the required geometric features of the tomatoes and of the truss stem. The grasping method then uses a geometric model of the robotic hand and the truss to determine a suitable grasping location on the stem. This approach allows for grasping tomato trusses without requiring delicate contact sensors or complex mechanistic models and under minimal risk of damaging the tomatoes. Lab experiments were conducted to validate the proposed methods, using an RGB-D camera and a low-cost robotic manipulator. The success rate was 83% to 92%, depending on the type of truss.
Translated: 2021-03-04 08:28:42 Published: 2021-03-01
# (Reference translation) Anticipation Next: System-Sensitive Technology Development and Integration in Work Contexts

Anticipation Next -- System-sensitive technology development and integration in work contexts ( http://arxiv.org/abs/2103.00923v1 )

License: CC BY 4.0
When discussing future concerns within socio-technical systems in work contexts, we often find descriptions of missed technology development and integration. The experience of technology that fails whilst being integrated is often rooted in dysfunctional epistemological approaches within the research and development process, ultimately leading to lasting distrust of technology in work contexts. This is true for organisations which integrate new technologies and for organisations that invent them. Organisations in which we find failed technology development and integration are by their very nature social systems. Nowadays, those complex social systems act within an even more complex environment. This calls for new anticipation methods for technology development and integration. Gathering and dealing with complex information in the described context is what we call Anticipation Next. This explorative work uses existing literature from the adjoining research fields of system theory, organizational theory, and socio-technical research to combine various concepts. We end by suggesting a conceptual framework that is intended to be used in the very early stages of technology development and integration for and in work contexts.
Translated: 2021-03-04 06:40:15 Published: 2021-03-01
# (参考訳) Eコマース検索のためのサイクル一貫性翻訳によるクエリ書き換え

Query Rewriting via Cycle-Consistent Translation for E-Commerce Search ( http://arxiv.org/abs/2103.00800v1 )

ライセンス: CC BY-SA 4.0
Nowadays e-commerce search has become an integral part of many people's shopping routines. One critical challenge in today's e-commerce search is the semantic matching problem, where the relevant items may not contain the exact terms in the user query. In this paper, we propose a novel deep neural network based approach to query rewriting in order to tackle this problem. Specifically, we formulate query rewriting as a cyclic machine translation problem to leverage abundant click log data. Then we introduce a novel cycle-consistent training algorithm in conjunction with state-of-the-art machine translation models to achieve the optimal performance in terms of query rewriting accuracy. In order to make it practical in industrial scenarios, we optimize the syntax tree construction to reduce computational cost and online serving latency. Offline experiments show that the proposed method is able to rewrite hard user queries into more standard queries that are more appropriate for the inverted index to retrieve. Compared with a human-curated rule-based method, the proposed model significantly improves query rewriting diversity while maintaining good relevancy. Online A/B experiments show that it improves core e-commerce business metrics significantly. Since the summer of 2020, the proposed model has been deployed in our production search engine, serving hundreds of millions of users.
Translated: 2021-03-04 04:59:35 Published: 2021-03-01
# (Reference translation) Manifold optimization for optimal transport

Manifold optimization for optimal transport ( http://arxiv.org/abs/2103.00902v1 )

License: CC BY 4.0
Optimal transport (OT) has recently found widespread interest in machine learning. It allows one to define novel distances between probability measures, which have shown promise in several applications. In this work, we discuss how to computationally approach OT problems within the framework of Riemannian manifold optimization. The basis of this is the manifold of doubly stochastic matrices (and its generalization). Even though the manifold geometry is not new, surprisingly, its usefulness for solving OT problems has not been considered. To this end, we specifically discuss optimization-related ingredients that allow modeling the OT problem on smooth Riemannian manifolds by exploiting the geometry of the search space. We also discuss extensions where we reuse the developed optimization ingredients. We make available the Manifold optimization-based Optimal Transport, or MOT, repository with code useful for solving OT problems in Python and Matlab. The code is available at https://github.com/SatyadevNtv/MOT.
Translated: 2021-03-04 04:36:14 Published: 2021-03-01
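As a rough illustration of the set of doubly stochastic matrices mentioned in the abstract above, the following Python sketch scales a strictly positive square matrix towards a doubly stochastic one by alternating row and column normalisation (Sinkhorn-Knopp scaling). This is not the Riemannian optimization method of the paper, nor the API of the MOT repository; the function name `sinkhorn_knopp` and the example matrix are assumptions made for illustration only.

```python
def sinkhorn_knopp(matrix, iterations=200, tol=1e-10):
    """Scale a strictly positive square matrix towards a doubly stochastic one.

    Illustrative sketch only: alternates row and column normalisation.
    """
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    for _ in range(iterations):
        # Normalise every row so that it sums to 1.
        for i in range(n):
            s = sum(m[i])
            m[i] = [v / s for v in m[i]]
        # Normalise every column so that it sums to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        for i in range(n):
            m[i] = [m[i][j] / col_sums[j] for j in range(n)]
        # Columns now sum to 1; stop once the rows do as well.
        if max(abs(sum(row) - 1.0) for row in m) < tol:
            break
    return m


# Hypothetical 3x3 example with strictly positive entries.
P = sinkhorn_knopp([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]])
```

The result `P` has positive entries whose rows and columns each sum to one up to the tolerance, which is the constraint set on which the manifold geometry is defined.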
# (Reference translation) Explainable AI in Credit Risk Management

Explainable AI in Credit Risk Management ( http://arxiv.org/abs/2103.00949v1 )

License: CC BY 4.0
Artificial Intelligence (AI) has created the single biggest technology revolution the world has ever seen. For the finance sector, it provides great opportunities to enhance customer experience, democratize financial services, ensure consumer protection and significantly improve risk management. While it is easier than ever to run state-of-the-art machine learning models, designing and implementing systems that support real-world finance applications has been challenging, in large part because such systems lack transparency and explainability, which are important factors in establishing reliable technology. This paper contributes to the research on this topic with a specific focus on applications in credit risk management. In this paper, we implement two advanced post-hoc model-agnostic explainability techniques, Local Interpretable Model Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), for machine learning (ML)-based credit scoring models applied to the open-access data set offered by the US-based P2P lending platform Lending Club. Specifically, we use LIME to explain instances locally and SHAP to get both local and global explanations. We discuss the results in detail and present multiple comparison scenarios by using various kernels available for explaining graphs generated using SHAP values. We also discuss the practical challenges associated with the implementation of these state-of-the-art eXplainable AI (XAI) methods and document them for future reference. We have made an effort to document every technical aspect of this research, while at the same time providing a general summary of the conclusions.
Translated: 2021-03-04 04:26:57 Published: 2021-03-01
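To make the idea behind Shapley-value attributions concrete, here is a minimal Python sketch that computes exact Shapley values for a tiny model by averaging marginal contributions over all feature orderings. It does not use the SHAP library, which relies on approximations for realistic models, and the scoring function `toy_score`, its feature names, and the all-zero baseline are hypothetical choices for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley attributions, averaged over every feature ordering.

    Features not yet 'switched on' keep their baseline value.
    Cost grows factorially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]          # switch feature i to its actual value
            value = model(current)
            phi[i] += value - prev     # marginal contribution of feature i
            prev = value
    return [p / len(orderings) for p in phi]


def toy_score(features):
    """Hypothetical credit score with one interaction term."""
    income, debt, age = features
    return 0.5 * income - 0.8 * debt + 0.1 * income * age


x = [4.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
```

The attributions in `phi` sum to `toy_score(x) - toy_score(baseline)`, the efficiency property that makes Shapley values usable for both local and aggregated explanations; the interaction term is split equally between `income` and `age`.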
# (Reference translation) Noncoding RNAs and deep learning neural network discriminate multi-cancer types

Noncoding RNAs and deep learning neural network discriminate multi-cancer types ( http://arxiv.org/abs/2103.01179v1 )

License: CC BY 4.0
Detecting cancers at early stages can dramatically reduce mortality rates. Therefore, practical cancer screening at the population level is needed. Here, we develop a comprehensive detection system to classify all common cancer types. By integrating an artificial intelligence deep learning neural network and noncoding RNA biomarkers selected from massive data, our system can accurately detect cancer vs healthy objects with an AUC of ROC (Area Under Curve of a Receiver Operating Characteristic curve) of 96.3%. Intriguingly, with no more than 6 biomarkers, our approach can easily discriminate any individual cancer type vs normal with 99% to 100% AUC. Furthermore, a comprehensive marker panel can simultaneously multi-classify all common cancers with a stable accuracy of 78% at heterological cancerous tissues and conditions. This provides a valuable framework for large scale cancer screening. The AI models and plots of results are available at https://combai.org/ai/cancerdetection/
Translated: 2021-03-04 04:11:39 Published: 2021-03-01
# (Reference translation) Learners' languages

Learners' languages ( http://arxiv.org/abs/2103.01189v1 )

License: CC BY-SA 4.0
In "Backprop as functor", the authors show that the fundamental elements of deep learning -- gradient descent and backpropagation -- can be conceptualized as a strong monoidal functor $\mathbf{Para}(\mathbf{Euc})\to\mathbf{Learn}$ from the category of parameterized Euclidean spaces to that of learners, a category developed explicitly to capture parameter update and backpropagation. It was soon realized that there is an isomorphism $\mathbf{Learn}\cong\mathbf{Para}(\mathbf{SLens})$, where $\mathbf{SLens}$ is the symmetric monoidal category of simple lenses as used in functional programming. In this note, we observe that $\mathbf{SLens}$ is a full subcategory of $\mathbf{Poly}$, the category of polynomial functors in one variable, via the functor $A\mapsto Ay^A$. Using the fact that $(\mathbf{Poly},\otimes)$ is monoidal closed, we show that a map $A\to B$ in $\mathbf{Para}(\mathbf{SLens})$ has a natural interpretation in terms of dynamical systems (more precisely, generalized Moore machines) whose interface is the internal-hom type $[Ay^A,By^B]$. Finally, we review the fact that the category $p\text{-}\mathbf{Coalg}$ of dynamical systems on any $p\in\mathbf{Poly}$ forms a topos, and consider the logical propositions that can be stated in its internal language. We give gradient descent as an example, and we conclude by discussing some directions for future work.
Translated: 2021-03-04 04:02:27 Published: 2021-03-01
# (Reference translation) Offshore Software Maintenance Outsourcing Predicting Clients Proposal using Supervised Learning

Offshore Software Maintenance Outsourcing Predicting Clients Proposal using Supervised Learning ( http://arxiv.org/abs/2103.01223v1 )

License: CC BY-SA 4.0
In software engineering, software maintenance is the process of correcting, updating, and improving software products after they have been handed over to the customer. Through offshore software maintenance outsourcing (OSMO), clients can gain advantages such as reduced cost, saved time, and improved quality. In most cases, the OSMO vendor generates considerable revenue. However, the selection of an appropriate proposal among multiple clients is one of the critical problems for OSMO vendors. The purpose of this paper is to suggest an effective machine learning technique that can be used by OSMO vendors to assess or predict the OSMO client proposal. The dataset is generated through a survey of OSMO vendors working in a developing country. The results showed that supervised learning-based classifiers like Naïve Bayesian, SMO, and Logistics achieved 69.75, 81.81, and 87.27 percent testing accuracy, respectively. This study concludes that supervised learning is the most suitable technique to predict the OSMO client's proposal.
Translated: 2021-03-04 03:09:50 Published: 2021-03-01
# (Reference translation) Uncertainty Quantification by Ensemble Learning for Computational Optical Form Measurements

Uncertainty Quantification by Ensemble Learning for Computational Optical Form Measurements ( http://arxiv.org/abs/2103.01259v1 )

License: CC BY 4.0
Uncertainty quantification by ensemble learning is explored in terms of an application from computational optical form measurements. The application requires solving a large-scale, nonlinear inverse problem. Ensemble learning is used to extend a recently developed deep learning approach for this application in order to provide an uncertainty quantification of its predicted solution to the inverse problem. By systematically inserting out-of-distribution errors as well as noisy data, the reliability of the developed uncertainty quantification is explored. Results are encouraging, and the proposed application exemplifies the ability of ensemble methods to make trustworthy predictions on high dimensional data in a real-world application.
Translated: 2021-03-04 03:02:08 Published: 2021-03-01
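The basic mechanism of ensemble-based uncertainty quantification can be sketched in a few lines of Python: several independently trained models predict the same input, the mean serves as the prediction, and the spread across members serves as the uncertainty estimate. The three linear "models" below are hypothetical stand-ins for trained networks, and the sketch says nothing about the specific architecture or inverse problem treated in the paper.

```python
from statistics import mean, stdev


def ensemble_predict(models, x):
    """Return (mean prediction, sample standard deviation across members)."""
    preds = [m(x) for m in models]
    return mean(preds), stdev(preds)


# Hypothetical ensemble: three members that learned slightly different slopes.
models = [lambda x, a=a: a * x for a in (0.9, 1.0, 1.1)]

mu, sigma = ensemble_predict(models, 2.0)             # input close to the "training range"
mu_far, sigma_far = ensemble_predict(models, 20.0)    # input far outside it
```

In this toy setting the members disagree more on the distant input, so `sigma_far` exceeds `sigma`; this growing disagreement is the signal that ensemble methods use to flag out-of-distribution inputs.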
# (Reference translation) Wide Network Learning with Differential Privacy

Wide Network Learning with Differential Privacy ( http://arxiv.org/abs/2103.01294v1 )

License: CC BY 4.0
Despite intense interest and considerable effort, the current generation of neural networks suffers a significant loss of accuracy under most practically relevant privacy training regimes. One particularly challenging class of neural networks is the wide ones, such as those deployed for NLP typeahead prediction or recommender systems. Observing that these models share something in common--an embedding layer that reduces the dimensionality of the input--we focus on developing a general approach towards training these models that takes advantage of the sparsity of the gradients. More abstractly, we address the problem of differentially private Empirical Risk Minimization (ERM) for models that admit sparse gradients. We demonstrate that for non-convex ERM problems, the loss is logarithmically dependent on the number of parameters, in contrast with polynomial dependence for the general case. Following the same intuition, we propose a novel algorithm for privately training neural networks. Finally, we provide an empirical study of a DP wide neural network on a real-world dataset, which has rarely been explored in previous work.
Translated: 2021-03-04 02:48:42 Published: 2021-03-01
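For readers unfamiliar with differentially private training, the following Python sketch shows the generic building block used in DP-SGD-style methods: clip each per-example gradient to a maximum L2 norm, sum, add Gaussian noise scaled to that norm, and average. This is the standard Gaussian-mechanism step, not the sparsity-aware algorithm proposed in the paper; the function names and parameters are assumptions for illustration, and no privacy accounting is performed.

```python
import math
import random


def clip(grad, max_norm):
    """Rescale a gradient so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip per-example gradients, add Gaussian noise to their sum, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    # Noise standard deviation is proportional to the clipping norm.
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical sparse per-example gradients (most coordinates are zero).
grads = [[3.0, 0.0, 0.0, 4.0], [0.0, 0.1, 0.0, 0.0]]
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1,
                               rng=random.Random(0))
```

Note that the noise is added to every coordinate, including those where all clipped gradients are zero, which is one reason wide models with sparse gradients are a hard case for this generic recipe.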
# (Reference translation) Understanding & Predicting User Lifetime with Machine Learning in an Anonymous Location-Based Social Network

Understanding & Predicting User Lifetime with Machine Learning in an Anonymous Location-Based Social Network ( http://arxiv.org/abs/2103.01300v1 )

License: CC BY 4.0
In this work, we predict the user lifetime within the anonymous and location-based social network Jodel in the Kingdom of Saudi Arabia. Jodel's location-based nature leads to the establishment of disjoint communities country-wide and enables, for the first time, the study of user lifetime in the case of a large set of disjoint communities. A user's lifetime is an important measurement for evaluating and steering customer bases, as it can be leveraged to predict churn and possibly apply suitable methods to circumvent potential user losses. We train and test off-the-shelf machine learning techniques with 5-fold cross-validation to predict user lifetime as a regression and classification problem, identifying the Random Forest as providing very strong results. Discussing model complexity and quality trade-offs, we also dive deep into a time-dependent feature subset analysis, which does not work very well. Easing the classification problem into a binary decision (lifetime longer than timespan $x$) enables a practical lifetime predictor with very good performance. We identify implicit similarities across community models according to strong correlations in feature importance. A single countrywide model generalizes the problem and works equally well for any tested community; the overall model internally works similarly to the others, as also indicated by its feature importances.
Translated: 2021-03-04 02:09:28 Published: 2021-03-01
# (Reference translation) Dynamic covariate balancing: estimating treatment effects over time

Dynamic covariate balancing: estimating treatment effects over time ( http://arxiv.org/abs/2103.01280v1 )

License: CC BY 4.0
This paper discusses the problem of estimation and inference on time-varying treatments. We propose a method for inference on treatment histories by introducing a dynamic covariate balancing method. Our approach allows for (i) treatments to propagate arbitrarily over time; (ii) non-stationarity and heterogeneity of treatment effects; (iii) high-dimensional covariates, and (iv) unknown propensity score functions. We study the asymptotic properties of the estimator, and we showcase the parametric convergence rate of the proposed procedure. We illustrate in simulations and an empirical application the advantage of the method over state-of-the-art competitors.
Translated: 2021-03-03 23:57:09 Published: 2021-03-01
# (Reference translation) A practical tutorial on Variational Bayes

A practical tutorial on Variational Bayes ( http://arxiv.org/abs/2103.01327v1 )

License: CC BY 4.0
This tutorial gives a quick introduction to Variational Bayes (VB), also called Variational Inference or Variational Approximation, from a practical point of view. The paper covers a range of commonly used VB methods and an attempt is made to keep the materials accessible to the wide community of data analysis practitioners. The aim is that the reader can quickly derive and implement their first VB algorithm for Bayesian inference with their data analysis problem. An end-user software package in Matlab together with the documentation can be found at https://vbayeslab.github.io/VBLabDocs/
Translated: 2021-03-03 23:56:19 Published: 2021-03-01
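The identity underlying most VB methods can be written as follows, with $y$ the data, $\theta$ the model parameters, and $q(\theta)$ the variational approximation. The notation is an assumption made here and may differ from the tutorial's own.

```latex
\log p(y)
  = \underbrace{\mathbb{E}_{q(\theta)}\!\left[\log p(y,\theta) - \log q(\theta)\right]}_{\text{lower bound (ELBO)}}
  + \underbrace{\mathrm{KL}\!\left(q(\theta)\,\|\,p(\theta \mid y)\right)}_{\ge 0}
```

Because $\log p(y)$ does not depend on $q$, maximising the lower bound over $q$ is equivalent to minimising the KL divergence from $q$ to the posterior $p(\theta \mid y)$.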
# (Reference translation) Statistical learning and cross-validation for point processes

Statistical learning and cross-validation for point processes ( http://arxiv.org/abs/2103.01356v1 )

License: CC BY 4.0
This paper presents the first general (supervised) statistical learning framework for point processes in general spaces. Our approach is based on the combination of two new concepts, which we define in the paper: i) bivariate innovations, which are measures of discrepancy/prediction-accuracy between two point processes, and ii) point process cross-validation (CV), which we here define through point process thinning. The general idea is to carry out the fitting by predicting CV-generated validation sets using the corresponding training sets; the prediction error, which we minimise, is measured by means of bivariate innovations. Having established various theoretical properties of our bivariate innovations, we study in detail the case where the CV procedure is obtained through independent thinning and we apply our statistical learning methodology to three typical spatial statistical settings, namely parametric intensity estimation, non-parametric intensity estimation and Papangelou conditional intensity fitting. Aside from deriving theoretical properties related to these cases, in each of them we numerically show that our statistical learning approach outperforms the state of the art in terms of mean (integrated) squared error.
Translated: 2021-03-03 23:55:20 Published: 2021-03-01
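The cross-validation idea described above, splitting a point pattern by independent thinning, can be sketched in Python as follows. Each point is assigned to the validation set independently with a fixed probability and to the training set otherwise; which of the two sets the paper treats as the thinned process, and all names used here (`thinning_split`, `thinning_folds`, `retain_prob`), are assumptions for illustration.

```python
import random


def thinning_split(points, retain_prob, rng):
    """Independent thinning: each point goes to validation with prob. retain_prob."""
    training, validation = [], []
    for p in points:
        (validation if rng.random() < retain_prob else training).append(p)
    return training, validation


def thinning_folds(points, retain_prob, n_folds, seed=0):
    """Draw n_folds independent (training, validation) splits of one pattern."""
    rng = random.Random(seed)
    return [thinning_split(points, retain_prob, rng) for _ in range(n_folds)]


# Hypothetical planar point pattern in the unit square.
pattern = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.3), (0.4, 0.8), (0.7, 0.7)]
folds = thinning_folds(pattern, retain_prob=0.2, n_folds=4, seed=1)
```

Every fold partitions the observed pattern, so a model fitted to a training set can be scored on how well it predicts the matching validation set.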
# (Reference translation) A Hybrid Quantum-Classical Hamiltonian Learning Algorithm

A Hybrid Quantum-Classical Hamiltonian Learning Algorithm ( http://arxiv.org/abs/2103.01061v1 )

License: CC BY 4.0
Hamiltonian learning is crucial to the certification of quantum devices and quantum simulators. In this paper, we propose a hybrid quantum-classical Hamiltonian learning algorithm to find the coefficients of the Pauli operator components of the Hamiltonian. Its main subroutine is the practical log-partition function estimation algorithm, which is based on the minimization of the free energy of the system. Concretely, we devise a stochastic variational quantum eigensolver (SVQE) to diagonalize the Hamiltonians and then exploit the obtained eigenvalues to compute the free energy's global minimum using convex optimization. Our approach not only avoids the challenge of estimating von Neumann entropy in free energy minimization, but also reduces the quantum resources via importance sampling in Hamiltonian diagonalization, facilitating the implementation of our method on near-term quantum devices. Finally, we demonstrate our approach's validity by conducting numerical experiments with Hamiltonians of interest in quantum many-body physics.
Translated: 2021-03-03 22:24:39 Published: 2021-03-01
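The link between eigenvalues, the log-partition function, and the minimum free energy that the abstract relies on can be checked classically with a short Python sketch. It assumes the eigenvalues are already known (the paper obtains them with a variational quantum eigensolver, which is not modelled here) and uses the convention $F(\rho) = \mathrm{tr}(H\rho) - S(\rho)/\beta$; the function names are illustrative.

```python
import math


def log_partition(eigenvalues, beta):
    """log Z = log sum_i exp(-beta * E_i), using the log-sum-exp trick."""
    exponents = [-beta * e for e in eigenvalues]
    shift = max(exponents)
    return shift + math.log(sum(math.exp(v - shift) for v in exponents))


def free_energy(eigenvalues, beta):
    """Minimum free energy, F_min = -log(Z) / beta."""
    return -log_partition(eigenvalues, beta) / beta


def gibbs_functional(eigenvalues, beta):
    """Evaluate <E> - S / beta at the Gibbs distribution p_i = exp(-beta * E_i) / Z."""
    log_z = log_partition(eigenvalues, beta)
    probs = [math.exp(-beta * e - log_z) for e in eigenvalues]
    energy = sum(p * e for p, e in zip(probs, eigenvalues))
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return energy - entropy / beta


# Eigenvalues of a single-qubit Pauli-Z Hamiltonian.
energies = [1.0, -1.0]
beta = 1.0
```

For this example `log_partition(energies, beta)` equals `log(2 * cosh(beta))`, and `gibbs_functional` agrees with `free_energy`, reflecting that the Gibbs state attains the free-energy minimum.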
# (Reference translation) Unsupervised Classification of Voiced Speech and Pitch Tracking Using Forward-Backward Kalman Filtering

Unsupervised Classification of Voiced Speech and Pitch Tracking Using Forward-Backward Kalman Filtering ( http://arxiv.org/abs/2103.01173v1 )

License: CC BY 4.0
The detection of voiced speech, the estimation of the fundamental frequency, and the tracking of pitch values over time are crucial subtasks for a variety of speech processing techniques. Many different algorithms have been developed for each of the three subtasks. We present a new algorithm that integrates the three subtasks into a single procedure. The algorithm can be applied to pre-recorded speech utterances in the presence of considerable amounts of background noise. We combine a collection of standard metrics, such as the zero-crossing rate, to formulate an unsupervised voicing classifier. The estimation of pitch values is accomplished with a hybrid autocorrelation-based technique. We propose a forward-backward Kalman filter to smooth the estimated pitch contour. In experiments, we are able to show that the proposed method compares favorably with current, state-of-the-art pitch detection algorithms.
Translated: 2021-03-03 21:21:19 Published: 2021-03-01
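One of the standard metrics named in the abstract, the zero-crossing rate, is simple enough to sketch in Python. The function below counts sign changes between consecutive samples of a frame; the frame length, the sampling rate, and the test signals are hypothetical, and the sketch does not reproduce the paper's classifier, which combines several such metrics.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Hypothetical 20 ms frames at an assumed 8 kHz sampling rate.
voiced_like = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]  # 100 Hz tone
noise_like = [(-1.0) ** n for n in range(160)]                              # sign flips every sample

zcr_voiced = zero_crossing_rate(voiced_like)
zcr_noise = zero_crossing_rate(noise_like)
```

Voiced speech is dominated by low-frequency periodic energy and so tends to give a low rate, while noise-like frames give a high one, which is why the metric is a common ingredient of voicing decisions.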
# OmniNet: Omnidirectional Representations from Transformers

OmniNet: Omnidirectional Representations from Transformers ( http://arxiv.org/abs/2103.01075v1 )

License: check the linked page
This paper proposes Omnidirectional Representations from Transformers (OmniNet). In OmniNet, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in the entire network. This process can also be interpreted as a form of extreme or intensive attention mechanism that has the receptive field of the entire width and depth of the network. To this end, the omnidirectional attention is learned via a meta-learner, which is essentially another self-attention based model. In order to mitigate the computationally expensive costs of full receptive field attention, we leverage efficient self-attention models such as kernel-based (Choromanski et al.), low-rank attention (Wang et al.) and/or Big Bird (Zaheer et al.) as the meta-learner. Extensive experiments are conducted on autoregressive language modeling (LM1B, C4), Machine Translation, Long Range Arena (LRA), and Image Recognition. The experiments show that OmniNet achieves considerable improvements across these tasks, including achieving state-of-the-art performance on LM1B, WMT'14 En-De/En-Fr, and Long Range Arena. Moreover, using omnidirectional representation in Vision Transformers leads to significant improvements on image recognition tasks on both few-shot learning and fine-tuning setups.
Translated: 2021-03-03 17:35:40 Published: 2021-03-01
# LocalDrop: A Hybrid Regularization for Deep Neural Networks

LocalDrop: A Hybrid Regularization for Deep Neural Networks ( http://arxiv.org/abs/2103.00719v1 )

License: check the linked page
In neural networks, developing regularization algorithms to mitigate overfitting is one of the major study areas. We propose a new approach for the regularization of neural networks by the local Rademacher complexity, called LocalDrop. A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs), including drop rates and weight matrices, has been developed based on the proposed upper bound of the local Rademacher complexity through rigorous mathematical deduction. The analyses of dropout in FCNs and DropBlock in CNNs with keep rate matrices in different layers are also included in the complexity analyses. With the new regularization function, we establish a two-stage procedure to obtain the optimal keep rate matrix and weight matrix to realize the whole training model. Extensive experiments have been conducted to demonstrate the effectiveness of LocalDrop in different models by comparing it with several algorithms and the effects of different hyperparameters on the final performances.
Translated: 2021-03-03 17:34:06 Published: 2021-03-01
# Persistent Message Passing

Persistent Message Passing ( http://arxiv.org/abs/2103.01043v1 )

License: check the linked page
Graph neural networks (GNNs) are a powerful inductive bias for modelling algorithmic reasoning procedures and data structures. Their prowess was mainly demonstrated on tasks featuring Markovian dynamics, where querying any associated data structure depends only on its latest state. For many tasks of interest, however, it may be highly beneficial to support efficient data structure queries dependent on previous states. This requires tracking the data structure's evolution through time, placing significant pressure on the GNN's latent representations. We introduce Persistent Message Passing (PMP), a mechanism which endows GNNs with the capability of querying past state by explicitly persisting it: rather than overwriting node representations, it creates new nodes whenever required. PMP generalises out-of-distribution to more than 2x larger test inputs on dynamic temporal range queries, significantly outperforming GNNs which overwrite states.
Translated: 2021-03-03 17:33:50 Published: 2021-03-01
# Coordination Among Neural Modules Through a Shared Global Workspace

Coordination Among Neural Modules Through a Shared Global Workspace ( http://arxiv.org/abs/2103.01197v1 )

License: check the linked page
Deep learning has seen a movement away from representing examples with a monolithic hidden state towards a richly structured state. For example, Transformers segment by position, and object-centric architectures decompose images into entities. In all these architectures, interactions between different elements are modeled via pairwise interactions: Transformers make use of self-attention to incorporate information from other positions; object-centric architectures make use of graph neural networks to model interactions among entities. However, pairwise interactions may not achieve global coordination or a coherent, integrated representation that can be used for downstream tasks. In cognitive science, a global workspace architecture has been proposed in which functionally specialized components share information through a common, bandwidth-limited communication channel. We explore the use of such a communication channel in the context of deep learning for modeling the structure of complex environments. The proposed method includes a shared workspace through which communication among different specialist modules takes place, but because of limits on the communication bandwidth, specialist modules must compete for access. We show that capacity limitations have a rational basis in that (1) they encourage specialization and compositionality and (2) they facilitate the synchronization of otherwise independent specialists.
Translated: 2021-03-03 17:33:37 Published: 2021-03-01
# Statistically Significant Stopping of Neural Network Training

Statistically Significant Stopping of Neural Network Training ( http://arxiv.org/abs/2103.01205v1 )

License: check the linked page
The general approach taken when training deep learning classifiers is to save the parameters after every few iterations, train until either a human observer or a simple metric-based heuristic decides the network isn't learning anymore, and then backtrack and pick the saved parameters with the best validation accuracy. Simple methods are used to determine if a neural network isn't learning anymore because, as long as it's well after the optimal values are found, the condition doesn't impact the final accuracy of the model. However, from a runtime perspective, this is of great significance to the many cases where numerous neural networks are trained simultaneously (e.g. hyper-parameter tuning). Motivated by this, we introduce a statistical significance test to determine if a neural network has stopped learning. This stopping criterion appears to represent a happy medium compared to other popular stopping criteria, achieving comparable accuracy to the criteria that achieve the highest final accuracies in 77% or fewer epochs, while the criteria which stop sooner do so with an appreciable loss to final accuracy. Additionally, we use this as the basis of a new learning rate scheduler, removing the need to manually choose learning rate schedules and acting as a quasi-line search, achieving superior or comparable empirical performance to existing methods.
Translated: 2021-03-03 17:30:53 Published: 2021-03-01
# RAGA: Relation-aware Graph Attention Networks for Global Entity Alignment

RAGA: Relation-aware Graph Attention Networks for Global Entity Alignment ( http://arxiv.org/abs/2103.00791v1 )

License: check the linked page
Entity alignment (EA) is the task of discovering entities referring to the same real-world object from different knowledge graphs (KGs), which is the most crucial step in integrating multi-source KGs. The majority of the existing embedding-based entity alignment methods embed entities and relations into a vector space based on relation triples of KGs for local alignment. As these methods insufficiently consider the multiple relations between entities, the structure information of KGs has not been fully leveraged. In this paper, we propose a novel framework based on Relation-aware Graph Attention Networks to capture the interactions between entities and relations. Our framework adopts the self-attention mechanism to spread entity information to the relations and then aggregate relation information back to entities. Furthermore, we propose a global alignment algorithm to make one-to-one entity alignments with a fine-grained similarity matrix. Experiments on three real-world cross-lingual datasets show that our framework outperforms the state-of-the-art methods.
Translated: 2021-03-03 17:28:50 Published: 2021-03-01
# Counterfactual Zero-Shot and Open-Set Visual Recognition

Counterfactual Zero-Shot and Open-Set Visual Recognition ( http://arxiv.org/abs/2103.00887v1 )

License: check the linked page
We present a novel counterfactual framework for both Zero-Shot Learning (ZSL) and Open-Set Recognition (OSR), whose common challenge is generalizing to the unseen-classes by only training on the seen-classes. Our idea stems from the observation that the generated samples for unseen-classes are often out of the true distribution, which causes severe recognition rate imbalance between the seen-class (high) and unseen-class (low). We show that the key reason is that the generation is not Counterfactual Faithful, and thus we propose a faithful one, whose generation is from the sample-specific counterfactual question: What would the sample look like, if we set its class attribute to a certain class, while keeping its sample attribute unchanged? Thanks to the faithfulness, we can apply the Consistency Rule to perform unseen/seen binary classification, by asking: Would its counterfactual still look like itself? If "yes", the sample is from a certain class, and "no" otherwise. Through extensive experiments on ZSL and OSR, we demonstrate that our framework effectively mitigates the seen/unseen imbalance and hence significantly improves the overall performance. Note that this framework is orthogonal to existing methods; thus, it can serve as a new baseline to evaluate how ZSL/OSR models generalize. Code is available at https://github.com/yue-zhongqi/gcm-cf.
Translated: 2021-03-03 17:26:47 Published: 2021-03-01
# Learning Monopoly Gameplay: A Hybrid Model-Free Deep Reinforcement Learning and Imitation Learning Approach

Learning Monopoly Gameplay: A Hybrid Model-Free Deep Reinforcement Learning and Imitation Learning Approach ( http://arxiv.org/abs/2103.00683v1 )

License: check the linked page
Learning how to adapt and make real-time informed decisions in dynamic and complex environments is a challenging problem. To learn this task, Reinforcement Learning (RL) relies on an agent interacting with an environment and learning through trial and error to maximize the cumulative sum of rewards it receives. In the multi-player game Monopoly, players have to make several decisions every turn, which involve complex actions such as making trades. This makes decision-making harder and thus poses a highly complicated task for an RL agent that must play and learn winning strategies. In this paper, we introduce a Hybrid Model-Free Deep RL (DRL) approach that is capable of playing and learning winning strategies of the popular board game Monopoly. To achieve this, our DRL agent (1) starts its learning process by imitating a rule-based agent (that resembles human logic) to initialize its policy, and (2) learns the successful actions and improves its policy using DRL. Experimental results demonstrate intelligent behavior of our proposed agent, as it shows high win rates against different types of agent-players.
Translated: 2021-03-03 17:24:38 Published: 2021-03-01
# Listening to the city, attentively: A Spatio-Temporal Attention Boosted Autoencoder for the Short-Term Flow Prediction Problem

Listening to the city, attentively: A Spatio-Temporal Attention Boosted Autoencoder for the Short-Term Flow Prediction Problem ( http://arxiv.org/abs/2103.00983v1 )

License: check the linked page
In recent years, studying traffic flows and making predictions on alternative mobility (sharing services) has become increasingly important, as accurate and timely information on the travel flow is important for the successful implementation of systems that increase the quality of sharing services. This need has been accentuated by the current health crisis, which requires alternative transport mobility such as electric bike and electric scooter sharing. Considering the new approaches in the world of deep learning and the difficulty due to the strong spatial and temporal dependence of this problem, we propose a framework, called STREED-Net, with multi-attention (Spatial and Temporal) that is able to better mine the high-level spatial and temporal features. We conduct experiments on three real datasets to predict the Inflow and Outflow of the different regions into which the city has been divided. The results indicate that the proposed STREED-Net model improves the state-of-the-art for this problem.
Translated: 2021-03-03 17:24:15 Published: 2021-03-01
# Machine learning on small size samples: A synthetic knowledge synthesis

Machine learning on small size samples: A synthetic knowledge synthesis ( http://arxiv.org/abs/2103.01002v1 )

License: check the linked page
One of the increasingly important technologies dealing with the growing complexity of the digitalization of almost all human activities is artificial intelligence, more precisely machine learning. Despite the fact that we live in a Big Data world where almost everything is digitally stored, there are many real-world situations where researchers are faced with small data samples. The aim of the present study is to answer the following research question: What is the small data problem in machine learning and how is it solved? Our bibliometric study showed a positive trend in the number of research publications concerning the use of small datasets and substantial growth of the research community dealing with the small dataset problem, indicating that the research field is moving toward higher maturity levels. Despite notable international cooperation, the regional concentration of research literature production in economically more developed countries was observed.
Translated: 2021-03-03 17:23:58 Published: 2021-03-01
# Optimal Linear Combination of Classifiers

Optimal Linear Combination of Classifiers ( http://arxiv.org/abs/2103.01109v1 )

License: check the linked page
The question of whether to use one classifier or a combination of classifiers is a central topic in Machine Learning. We propose here a method for finding an optimal linear combination of classifiers derived from a bias-variance framework for the classification task.
Translated: 2021-03-03 17:23:34 Published: 2021-03-01
# Posterior Meta-Replay for Continual Learning

Posterior Meta-Replay for Continual Learning ( http://arxiv.org/abs/2103.01133v1 )

License: see the linked page
Continual Learning (CL) algorithms have recently received a lot of attention as they attempt to overcome the need to train with an i.i.d. sample from some unknown target data distribution. Building on prior work, we study principled ways to tackle the CL problem by adopting a Bayesian perspective and focus on continually learning a task-specific posterior distribution via a shared meta-model, a task-conditioned hypernetwork. This approach, which we term Posterior-replay CL, is in sharp contrast to most Bayesian CL approaches that focus on the recursive update of a single posterior distribution. The benefits of our approach are (1) an increased flexibility to model solutions in weight space and therewith less susceptibility to task dissimilarity, (2) access to principled task-specific predictive uncertainty estimates, that can be used to infer task identity during test time and to detect task boundaries during training, and (3) the ability to revisit and update task-specific posteriors in a principled manner without requiring access to past data. The proposed framework is versatile, which we demonstrate using simple posterior approximations (such as Gaussians) as well as powerful, implicit distributions modelled via a neural network. We illustrate the conceptual advance of our framework on low-dimensional problems and show performance gains on computer vision benchmarks.
Translated: 2021-03-03 17:23:29 Published: 2021-03-01
# Generating Probabilistic Safety Guarantees for Neural Network Controllers

Generating Probabilistic Safety Guarantees for Neural Network Controllers ( http://arxiv.org/abs/2103.01203v1 )

License: see the linked page
Neural networks serve as effective controllers in a variety of complex settings due to their ability to represent expressive policies. The complex nature of neural networks, however, makes their output difficult to verify and predict, which limits their use in safety-critical applications. While simulations provide insight into the performance of neural network controllers, they are not enough to guarantee that the controller will perform safely in all scenarios. To address this problem, recent work has focused on formal methods to verify properties of neural network outputs. For neural network controllers, we can use a dynamics model to determine the output properties that must hold for the controller to operate safely. In this work, we develop a method to use the results from neural network verification tools to provide probabilistic safety guarantees on a neural network controller. We develop an adaptive verification approach to efficiently generate an overapproximation of the neural network policy. Next, we modify the traditional formulation of Markov decision process (MDP) model checking to provide guarantees on the overapproximated policy given a stochastic dynamics model. Finally, we incorporate techniques in state abstraction to reduce overapproximation error during the model checking process. We show that our method is able to generate meaningful probabilistic safety guarantees for aircraft collision avoidance neural networks that are loosely inspired by Airborne Collision Avoidance System X (ACAS X), a family of collision avoidance systems that formulates the problem as a partially observable Markov decision process (POMDP).
Translated: 2021-03-03 17:23:04 Published: 2021-03-01
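The model-checking step in the abstract above can be illustrated with a toy sketch. It assumes the overapproximated policy is given as a set of allowed actions per state, and it bounds the probability of reaching an unsafe state by taking the worst allowed action at every step. The function name, the data layout, and the three-state example are hypothetical and are not taken from the paper.

```python
def worst_case_unsafe_probability(transitions, allowed_actions, unsafe, sweeps=100):
    """Value iteration for the probability of eventually reaching an unsafe state.

    transitions[s][a] is a list of (next_state, probability) pairs.
    allowed_actions[s] is the set of actions the overapproximated policy
    might take in state s; the maximum over that set is the worst case.
    """
    prob = {s: (1.0 if s in unsafe else 0.0) for s in transitions}
    for _ in range(sweeps):
        updated = {}
        for s in transitions:
            if s in unsafe:
                updated[s] = 1.0
                continue
            updated[s] = max(
                sum(p * prob[nxt] for nxt, p in transitions[s][a])
                for a in allowed_actions[s]
            )
        prob = updated
    return prob


# Toy dynamics (hypothetical): one decision state, one absorbing safe state,
# and one absorbing unsafe state.
toy_transitions = {
    "start": {
        "climb": [("safe", 0.9), ("crash", 0.1)],
        "level": [("safe", 0.7), ("crash", 0.3)],
    },
    "safe": {"stay": [("safe", 1.0)]},
    "crash": {},
}
loose_policy = {"start": {"climb", "level"}, "safe": {"stay"}}
tight_policy = {"start": {"climb"}, "safe": {"stay"}}
```

A tighter overapproximation (fewer allowed actions per state) can only lower the bound, which is the motivation for reducing overapproximation error. A fixed number of sweeps is used here for brevity; a real checker would iterate to a convergence tolerance.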
# Adversarial training in communication constrained federated learning

Adversarial training in communication constrained federated learning ( http://arxiv.org/abs/2103.01319v1 )

License: see the linked page
Federated learning enables model training over a distributed corpus of agent data. However, the trained model is vulnerable to adversarial examples, designed to elicit misclassification. We study the feasibility of using adversarial training (AT) in the federated learning setting. Furthermore, we do so assuming a fixed communication budget and non-iid data distribution between participating agents. We observe a significant drop in both natural and adversarial accuracies when AT is used in the federated setting as opposed to centralized training. We attribute this to the number of epochs of AT performed locally at the agents, which in turn affects (i) drift between local models; and (ii) convergence time (measured in number of communication rounds). To this end, we propose FedDynAT, a novel algorithm for performing AT in the federated setting. Through extensive experimentation we show that FedDynAT significantly improves both natural and adversarial accuracy, as well as model convergence time, by reducing the model drift.
Translated: 2021-03-03 17:22:38 Published: 2021-03-01
# AdaSpeech: Adaptive Text to Speech for Custom Voice

AdaSpeech: Adaptive Text to Speech for Custom Voice ( http://arxiv.org/abs/2103.00993v1 )

License: see the linked page
Custom voice, a specific text to speech (TTS) service in commercial speech platforms, aims to adapt a source TTS model to synthesize personal voice for a target speaker using few speech data. Custom voice presents two unique challenges for TTS adaptation: 1) to support diverse customers, the adaptation model needs to handle diverse acoustic conditions that could be very different from source speech data, and 2) to support a large number of customers, the adaptation parameters need to be small enough for each target speaker to reduce memory usage while maintaining high voice quality. In this work, we propose AdaSpeech, an adaptive TTS system for high-quality and efficient customization of new voices. We design several techniques in AdaSpeech to address the two challenges in custom voice: 1) To handle different acoustic conditions, we use two acoustic encoders to extract an utterance-level vector and a sequence of phoneme-level vectors from the target speech during training; in inference, we extract the utterance-level vector from a reference speech and use an acoustic predictor to predict the phoneme-level vectors. 2) To better trade off the adaptation parameters and voice quality, we introduce conditional layer normalization in the mel-spectrogram decoder of AdaSpeech, and fine-tune this part in addition to speaker embedding for adaptation. We pre-train the source TTS model on LibriTTS datasets and fine-tune it on VCTK and LJSpeech datasets (with different acoustic conditions from LibriTTS) with few adaptation data, e.g., 20 sentences, about 1 minute speech. Experiment results show that AdaSpeech achieves much better adaptation quality than baseline methods, with only about 5K specific parameters for each speaker, which demonstrates its effectiveness for custom voice. Audio samples are available at https://speechresearch.github.io/adaspeech/.
Translated: 2021-03-03 17:22:03 Published: 2021-03-01
# The Healthy States of America: Creating a Health Taxonomy with Social Media

The Healthy States of America: Creating a Health Taxonomy with Social Media ( http://arxiv.org/abs/2103.01169v1 )

License: see the linked page
Since the uptake of social media, researchers have mined online discussions to track the outbreak and evolution of specific diseases or chronic conditions such as influenza or depression. To broaden the set of diseases under study, we developed a Deep Learning tool for Natural Language Processing that extracts mentions of virtually any medical condition or disease from unstructured social media text. With that tool at hand, we processed Reddit and Twitter posts, analyzed the clusters of the two resulting co-occurrence networks of conditions, and discovered that they correspond to well-defined categories of medical conditions. This resulted in the creation of the first comprehensive taxonomy of medical conditions automatically derived from online discussions. We validated the structure of our taxonomy against the official International Statistical Classification of Diseases and Related Health Problems (ICD-11), finding matches of our clusters with 20 official categories, out of 22. Based on the mentions of our taxonomy's sub-categories on Reddit posts geo-referenced in the U.S., we were then able to compute disease-specific health scores. As opposed to counts of disease mentions or counts with no knowledge of our taxonomy's structure, we found that our disease-specific health scores are causally linked with the officially reported prevalence of 18 conditions.
Translated: 2021-03-03 17:21:31 Published: 2021-03-01
# Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection

Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection ( http://arxiv.org/abs/2103.00684v1 )

License: see the linked page
Neural network-based anomaly detection methods have been shown to achieve high performance. However, they require a large amount of training data for each task. We propose a neural network-based meta-learning method for supervised anomaly detection. The proposed method improves the anomaly detection performance on unseen tasks, which contain a few labeled normal and anomalous instances, by meta-training with various datasets. With a meta-learning framework, quick adaptation to each task and its effective backpropagation are important since the model is trained by the adaptation for each epoch. Our model enables them by formulating adaptation as a generalized eigenvalue problem with one-class classification; its global optimum solution is obtained, and the solver is differentiable. We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods on various datasets.
# Meta-learning representations for clustering with infinite Gaussian mixture models

Meta-learning representations for clustering with infinite Gaussian mixture models ( http://arxiv.org/abs/2103.00694v1 )

License: see the linked page
For better clustering performance, appropriate representations are critical. Although many neural network-based metric learning methods have been proposed, they do not directly train neural networks to improve clustering performance. We propose a meta-learning method that trains neural networks to obtain representations such that clustering performance improves when the representations are clustered by variational Bayesian (VB) inference with an infinite Gaussian mixture model. The proposed method can cluster unseen unlabeled data using knowledge meta-learned with labeled data that are different from the unlabeled data. For the objective function, we propose a continuous approximation of the adjusted Rand index (ARI), by which we can evaluate the clustering performance from soft clustering assignments. Since the approximated ARI and the VB inference procedure are differentiable, we can backpropagate the objective function through the VB inference procedure to train the neural networks. With experiments using text and image data sets, we demonstrate that our proposed method achieves a higher adjusted Rand index than existing methods.
Translated: 2021-03-03 17:17:09 Published: 2021-03-01
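To make the idea of an ARI computed from soft assignments concrete, here is one possible relaxation, written as a sketch. It uses the pair-counting form of the ARI and replaces the hard "same predicted cluster" indicator with the probability that two points share a cluster. This particular relaxation and the name `soft_ari` are assumptions for illustration; the paper's exact approximation may differ.

```python
from itertools import combinations


def soft_ari(true_labels, soft_assign):
    """Pair-counting ARI with a probabilistic 'same predicted cluster' term.

    true_labels: list of ground-truth labels, one per point.
    soft_assign: list of per-point cluster probability vectors.
    With one-hot rows this reduces to the ordinary adjusted Rand index.
    The value is undefined (division by zero) when both partitions put
    every point in a single cluster.
    """
    n = len(true_labels)
    n_pairs = n * (n - 1) / 2
    index = sum_true = sum_pred = 0.0
    for k, l in combinations(range(n), 2):
        same_true = 1.0 if true_labels[k] == true_labels[l] else 0.0
        # Probability that points k and l land in the same cluster when each
        # is assigned independently according to its soft assignment.
        same_pred = sum(p * q for p, q in zip(soft_assign[k], soft_assign[l]))
        index += same_true * same_pred
        sum_true += same_true
        sum_pred += same_pred
    expected = sum_true * sum_pred / n_pairs
    max_index = 0.5 * (sum_true + sum_pred)
    return (index - expected) / (max_index - expected)
```

Because every term is a product or sum of assignment probabilities, the score is smooth in the assignments, which is what allows gradients to flow back into the network that produced them.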
# Moment-Based Variational Inference for Stochastic Differential Equations

Moment-Based Variational Inference for Stochastic Differential Equations ( http://arxiv.org/abs/2103.00988v1 )

License: see the linked page
Existing deterministic variational inference approaches for diffusion processes use simple proposals and target the marginal density of the posterior. We construct the variational process as a controlled version of the prior process and approximate the posterior by a set of moment functions. In combination with moment closure, the smoothing problem is reduced to a deterministic optimal control problem. Exploiting the path-wise Fisher information, we propose an optimization procedure that corresponds to a natural gradient descent in the variational parameters. Our approach allows for richer variational approximations that extend to state-dependent diffusion terms. The classical Gaussian process approximation is recovered as a special case.
# Self-Supervised Multi-View Learning via Auto-Encoding 3D Transformations

Self-Supervised Multi-View Learning via Auto-Encoding 3D Transformations ( http://arxiv.org/abs/2103.00787v1 )

Xiang Gao, Wei Hu, Guo-Jun Qi

3D object representation learning is a fundamental challenge in computer vision for reasoning about the 3D world. Recent advances in deep learning have shown their efficiency in 3D object recognition, among which view-based methods have performed best so far. However, feature learning of multiple views in existing methods is mostly performed in a supervised fashion, which often requires a large amount of data labels with high costs. In contrast, self-supervised learning aims to learn multi-view feature representations without involving labeled data. To this end, we propose a novel self-supervised paradigm to learn Multi-View Transformation Equivariant Representations (MV-TER), exploring the equivariant transformations of a 3D object and its projected multiple views. Specifically, we perform a 3D transformation on a 3D object, and obtain multiple views before and after the transformation via projection. Then, we self-train a representation to capture the intrinsic 3D object representation by decoding 3D transformation parameters from the fused feature representations of multiple views before and after the transformation. Experimental results demonstrate that the proposed MV-TER significantly outperforms the state-of-the-art view-based approaches in 3D object classification and retrieval tasks, and generalizes to real-world datasets.
# Embedded Knowledge Distillation in Depth-level Dynamic Neural Network

Embedded Knowledge Distillation in Depth-level Dynamic Neural Network ( http://arxiv.org/abs/2103.00793v1 )

Shuchang Lyu, Ting-Bing Xu and Guangliang Cheng

In real applications, devices with different computational resources need networks of different depths (e.g., ResNet-18/34/50) with high accuracy. Usually, existing strategies either design multiple networks (nets) and train them independently, or use compression techniques (e.g., low-rank decomposition, pruning, and teacher-to-student) to evolve a trained large model into a small net. These methods suffer from the low accuracy of small nets, or from complicated training processes caused by the dependence on accompanying assistive large models. In this article, we propose an elegant Depth-level Dynamic Neural Network (DDNN) that integrates different-depth sub-nets of similar architectures. Instead of training individual nets with different-depth configurations, we train only a DDNN to dynamically switch between different-depth sub-nets at runtime using one set of shared weight parameters. To improve the generalization of sub-nets, we design the Embedded-Knowledge-Distillation (EKD) training mechanism for the DDNN to implement semantic knowledge transfer from the teacher (full) net to multiple sub-nets. Specifically, the Kullback-Leibler divergence is introduced to constrain the posterior class probability consistency between full-net and sub-net, and self-attention on the same-resolution features of different depths is used to drive richer feature representations in the sub-nets. Thus, we can obtain multiple high-accuracy sub-nets simultaneously in a DDNN via online knowledge distillation in each training iteration without extra computation cost. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet datasets demonstrate that sub-nets in a DDNN with EKD training achieve better performance than depth-level pruning or individual training while preserving the original performance of the full-net.
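The Kullback-Leibler consistency term mentioned in the abstract can be sketched as follows. This is a minimal illustration on plain lists of logits, assuming the divergence is taken from the full-net posterior to the sub-net posterior; the direction, the absence of a temperature, and the function names are assumptions rather than details confirmed by the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(full_logits, sub_logits):
    """Penalty that is zero when the sub-net matches the full-net posterior."""
    return kl_divergence(softmax(full_logits), softmax(sub_logits))
```

In a shared-weight setup, a term like this would typically be added to each sub-net's classification loss, with the full-net output treated as a fixed target for that step.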
# Mind the box: $l_1$-APGD for sparse adversarial attacks on image classifiers

Mind the box: $l_1$-APGD for sparse adversarial attacks on image classifiers ( http://arxiv.org/abs/2103.01208v1 )

Francesco Croce, Matthias Hein

We show that when taking into account also the image domain $[0,1]^d$, established $l_1$-projected gradient descent (PGD) attacks are suboptimal as they do not consider that the effective threat model is the intersection of the $l_1$-ball and $[0,1]^d$. We study the expected sparsity of the steepest descent step for this effective threat model and show that the exact projection onto this set is computationally feasible and yields better performance. Moreover, we propose an adaptive form of PGD which is highly effective even with a small budget of iterations. Our resulting $l_1$-APGD is a strong white box attack showing that prior work overestimated their $l_1$-robustness. Using $l_1$-APGD for adversarial training we get a robust classifier with SOTA $l_1$-robustness. Finally, we combine $l_1$-APGD and an adaptation of the Square Attack to $l_1$ into $l_1$-AutoAttack, an ensemble of attacks which reliably assesses adversarial robustness for the threat model of $l_1$-ball intersected with $[0,1]^d$.
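The effective threat model described above is the set of perturbations $z$ with $\|z\|_1 \le \epsilon$ and $x + z \in [0,1]^d$. The sketch below projects a candidate perturbation onto that set by bisection on a soft-threshold parameter. It is a simple numerical stand-in, not the exact projection algorithm of the paper, and the function name is hypothetical.

```python
def project_l1_box(x, y, eps, iters=100):
    """Project perturbation y onto {z : ||z||_1 <= eps, 0 <= x + z <= 1}.

    x: clean input with entries in [0, 1]; y: unconstrained perturbation.
    For a fixed multiplier lam the per-coordinate minimiser is a
    soft-threshold of y_i followed by clipping to [-x_i, 1 - x_i];
    lam is found by bisection so that the l1 budget is met.
    """
    lo = [-xi for xi in x]        # most negative allowed z_i
    hi = [1.0 - xi for xi in x]   # most positive allowed z_i

    def shrink_and_clip(lam):
        z = []
        for yi, l, h in zip(y, lo, hi):
            mag = max(abs(yi) - lam, 0.0)
            zi = mag if yi >= 0 else -mag
            z.append(min(max(zi, l), h))
        return z

    z = shrink_and_clip(0.0)
    if sum(abs(zi) for zi in z) <= eps:
        return z  # clipping to the box already satisfies the l1 budget

    lam_lo, lam_hi = 0.0, max(abs(yi) for yi in y)
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(abs(zi) for zi in shrink_and_clip(lam)) > eps:
            lam_lo = lam
        else:
            lam_hi = lam
    return shrink_and_clip(lam_hi)  # lam_hi always satisfies the budget
```

Note how a coordinate that is already close to the box boundary absorbs little of the budget, leaving more for the other coordinates. Projecting onto the $l_1$-ball alone and clipping afterwards would ignore this, which is the suboptimality the abstract points out.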
# Scalable Scene Flow from Point Clouds in the Real World

Scalable Scene Flow from Point Clouds in the Real World ( http://arxiv.org/abs/2103.01306v1 )

Philipp Jund, Chris Sweeney, Nichola Abdo, Zhifeng Chen, Jonathon Shlens

Autonomous vehicles operate in highly dynamic environments necessitating an accurate assessment of which aspects of a scene are moving and where they are moving to. A popular approach to 3D motion estimation -- termed scene flow -- is to employ 3D point cloud data from consecutive LiDAR scans, although such approaches have been limited by the small size of real-world, annotated LiDAR data. In this work, we introduce a new large-scale benchmark for scene flow based on the Waymo Open Dataset. The dataset is $\sim$1,000$\times$ larger than previous real-world datasets in terms of the number of annotated frames and is derived from the corresponding tracked 3D objects. We demonstrate how previous works were bounded based on the amount of real LiDAR data available, suggesting that larger datasets are required to achieve state-of-the-art predictive performance. Furthermore, we show how previous heuristics for operating on point clouds such as artificial down-sampling heavily degrade performance, motivating a new class of models that are tractable on the full point cloud. To address this issue, we introduce a model architecture that provides real-time inference on the full point cloud. Finally, we demonstrate that this problem is amenable to techniques from semi-supervised learning by highlighting open problems for generalizing methods for predicting motion on unlabeled objects. We hope that this dataset may provide new opportunities for developing real-world scene flow systems and motivate a new class of machine learning problems.
# Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning ( http://arxiv.org/abs/2103.01315v1 )

Mamshad Nayeem Rizve, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

In many real-world problems, collecting a large number of labeled samples is infeasible. Few-shot learning (FSL) is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in presence of a limited number of samples. FSL tasks have been predominantly solved by leveraging the ideas from gradient-based meta-learning and metric learning approaches. However, recent works have demonstrated the significance of powerful feature representations with a simple embedding network that can outperform existing sophisticated FSL algorithms. In this work, we build on this insight and propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations. Equivariance or invariance has been employed standalone in the previous works; however, to the best of our knowledge, they have not been used jointly. Simultaneous optimization for both of these contrasting objectives allows the model to jointly learn features that are not only independent of the input transformation but also the features that encode the structure of geometric transformations. These complementary sets of features help generalize well to novel classes with only a few data samples. We achieve additional improvements by incorporating a novel self-supervised distillation objective. Our extensive experimentation shows that even without knowledge distillation our proposed method can outperform current state-of-the-art FSL methods on five popular benchmark datasets.
# High-Performance Training by Exploiting Hot-Embeddings in Recommendation Systems

High-Performance Training by Exploiting Hot-Embeddings in Recommendation Systems ( http://arxiv.org/abs/2103.00686v1 )

Muhammad Adnan, Yassaman Ebrahimzadeh Maboud, Divya Mahajan, Prashant J. Nair

Recommendation models are commonly used learning models that suggest relevant items to a user for e-commerce and online advertisement-based applications. Current recommendation models include deep-learning-based (DLRM) and time-based sequence (TBSM) models. These models use massive embedding tables to store a numerical representation of item's and user's categorical variables (memory-bound) while also using neural networks to generate outputs (compute-bound). Due to these conflicting compute and memory requirements, the training process for recommendation models is divided across CPU and GPU for embedding and neural network executions, respectively. Such a training process naively assigns the same level of importance to each embedding entry. This paper observes that some training inputs and their accesses into the embedding tables are heavily skewed, with certain entries being accessed up to 10000x more. This paper tries to leverage skewed embedding table accesses to efficiently use the GPU resources during training. To this end, this paper proposes a Frequently Accessed Embeddings (FAE) framework that exposes a dynamic knob to the software based on the GPU memory capacity and the input popularity index. This framework efficiently estimates and varies the size of the hot portions of the embedding tables within GPUs and reallocates the rest of the embeddings to the CPU. Overall, our framework speeds up the training of the recommendation models on Kaggle, Terabyte, and Alibaba datasets by 2.34x as compared to a baseline that uses Intel-Xeon CPUs and Nvidia Tesla-V100 GPUs, while maintaining accuracy.
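The hot/cold split described above can be illustrated with a small sketch: count how often each embedding row is accessed, keep as many of the most frequently accessed rows as fit in a given GPU memory budget, and leave the rest on the CPU. The function name, arguments, and the static one-shot split are hypothetical simplifications; the FAE framework itself estimates and varies the hot portion dynamically.

```python
from collections import Counter


def split_hot_cold(access_log, bytes_per_row, gpu_budget_bytes):
    """Split embedding rows into a GPU-resident hot set and a CPU-resident rest.

    access_log: iterable of row indices in the order they were accessed.
    Returns (hot_rows, cold_rows, fraction_of_accesses_served_by_hot_rows).
    """
    accesses = list(access_log)
    counts = Counter(accesses)
    max_hot_rows = gpu_budget_bytes // bytes_per_row
    # most_common returns rows sorted by descending access count.
    hot_rows = [row for row, _ in counts.most_common(max_hot_rows)]
    hot_set = set(hot_rows)
    cold_rows = sorted(row for row in counts if row not in hot_set)
    hot_hits = sum(counts[row] for row in hot_rows)
    return hot_rows, cold_rows, hot_hits / len(accesses)
```

With a heavily skewed access pattern, a small hot set covers most lookups, so most training batches can be served from GPU memory alone.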
# Autonomous Navigation of an Ultrasound Probe Towards Standard Scan Planes with Deep Reinforcement Learning

Autonomous Navigation of an Ultrasound Probe Towards Standard Scan Planes with Deep Reinforcement Learning ( http://arxiv.org/abs/2103.00718v1 )

Keyu Li, Jian Wang, Yangxin Xu, Hao Qin, Dongsheng Liu, Li Liu, Max Q.-H. Meng

Autonomous ultrasound (US) acquisition is an important yet challenging task, as it involves interpretation of the highly complex and variable images and their spatial relationships. In this work, we propose a deep reinforcement learning framework to autonomously control the 6-D pose of a virtual US probe based on real-time image feedback to navigate towards the standard scan planes under the restrictions in real-world US scans. Furthermore, we propose a confidence-based approach to encode the optimization of image quality in the learning process. We validate our method in a simulation environment built with real-world data collected in the US imaging of the spine. Experimental results demonstrate that our method can perform reproducible US probe navigation towards the standard scan plane with an accuracy of $4.91mm/4.65^\circ$ in the intra-patient setting, and accomplish the task in the intra- and inter-patient settings with a success rate of $92\%$ and $46\%$, respectively. The results also show that the introduction of image quality optimization in our method can effectively improve the navigation performance.
# Visualizing Rule Sets: Exploration and Validation of a Design Space

Visualizing Rule Sets: Exploration and Validation of a Design Space ( http://arxiv.org/abs/2103.01022v1 )

Jun Yuan, Oded Nov, Enrico Bertini

Rule sets are often used in Machine Learning (ML) as a way to communicate the model logic in settings where transparency and intelligibility are necessary. Rule sets are typically presented as a text-based list of logical statements (rules). Surprisingly, to date there has been limited work on exploring visual alternatives for presenting rules. In this paper, we explore the idea of designing alternative representations of rules, focusing on a number of visual factors we believe have a positive impact on rule readability and understanding. The paper presents an initial design space for visualizing rule sets and a user study exploring their impact. The results show that some design factors have a strong impact on how efficiently readers can process the rules while having minimal impact on accuracy. This work can help practitioners employ more effective solutions when using rules as a communication strategy to understand ML models.
# Multi-Objective Evolutionary Design of Composite Data-Driven Models

Multi-Objective Evolutionary Design of Composite Data-Driven Models ( http://arxiv.org/abs/2103.01301v1 )

Iana S. Polonskaia, Nikolay O. Nikitin, Ilia Revin, Pavel Vychuzhanin, Anna V. Kalyuzhnaya

In this paper, a multi-objective approach for the design of composite data-driven mathematical models is proposed. It allows automating the identification of graph-based heterogeneous pipelines that consist of different blocks: machine learning models, data preprocessing blocks, etc. The implemented approach is based on a parameter-free genetic algorithm (GA) for model design called GPComp@Free. It is developed to be part of automated machine learning solutions and to increase the efficiency of the modeling pipeline automation. A set of experiments was conducted to verify the correctness and efficiency of the proposed approach and substantiate the selected solutions. The experimental results confirm that a multi-objective approach to the model design allows achieving better diversity and quality of obtained models. The implemented approach is available as a part of the open-source AutoML framework FEDOT.
# Learning Proposals for Probabilistic Programs with Inference Combinators

Learning Proposals for Probabilistic Programs with Inference Combinators ( http://arxiv.org/abs/2103.00668v1 )

Sam Stites, Heiko Zimmermann, Hao Wu, Eli Sennesh, Jan-Willem van de Meent

We develop operators for construction of proposals in probabilistic programs, which we refer to as inference combinators. Inference combinators define a grammar over importance samplers that compose primitive operations such as application of a transition kernel and importance resampling. Proposals in these samplers can be parameterized using neural networks, which in turn can be trained by optimizing variational objectives. The result is a framework for user-programmable variational methods that are correct by construction and can be tailored to specific models. We demonstrate the flexibility of this framework in applications to advanced variational methods based on amortized Gibbs sampling and annealing.
# 電力消費予測のためのパネル半パラメトリック量子回帰ニューラルネットワーク

Panel semiparametric quantile regression neural network for electricity consumption forecasting ( http://arxiv.org/abs/2103.00711v1 )

Xingcai Zhou and Jiangyan Wang

China has made great achievements in the electric power industry during the long-term deepening of reform and opening up. However, owing to complex regional economic, social and natural conditions, electricity resources are not evenly distributed, which accounts for the electricity deficiency in some regions of China. It is desirable to develop a robust electricity forecasting model. Motivated by this, we propose a Panel Semiparametric Quantile Regression Neural Network (PSQRNN) that combines an artificial neural network with semiparametric quantile regression. The PSQRNN can explore potential linear and nonlinear relationships among the variables, interpret the unobserved provincial heterogeneity, and maintain the interpretability of parametric models simultaneously. The PSQRNN is trained by combining penalized quantile regression with LASSO, ridge regression and the backpropagation algorithm. To evaluate the prediction accuracy, an empirical analysis of provincial electricity consumption in China from 1999 to 2018 is conducted under three scenarios. The analysis shows that the PSQRNN model performs better for electricity consumption forecasting when economic and climatic factors are considered. Finally, forecasts of provincial electricity consumption in China for the next $5$ years (2019-2023) are reported.
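The quantile-regression objective mentioned above can be illustrated with the standard check (pinball) loss. The following is only a minimal sketch: the function names, the split into `linear_weights` and `network_weights`, and the choice of an L1 penalty on the former and an L2 penalty on the latter are assumptions made for illustration, since the abstract does not spell out how the penalties are assigned.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check (pinball) loss at quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        residual = y - q
        # Under-prediction costs tau per unit, over-prediction costs (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y_true)


def penalized_objective(y_true, y_pred, linear_weights, network_weights,
                        tau, lam_l1, lam_l2):
    """Pinball loss plus LASSO and ridge penalties (hypothetical assignment)."""
    l1 = sum(abs(w) for w in linear_weights)    # assumed: LASSO on the linear part
    l2 = sum(w * w for w in network_weights)    # assumed: ridge on the network part
    return pinball_loss(y_true, y_pred, tau) + lam_l1 * l1 + lam_l2 * l2
```

At `tau = 0.5` the check loss reduces to half the mean absolute error, which is why median regression is a special case of this objective.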
# CogDL: An Extensive Toolkit for Deep Learning on Graphs

CogDL: An Extensive Toolkit for Deep Learning on Graphs ( http://arxiv.org/abs/2103.00959v1 )

Yukuo Cen, Zhenyu Hou, Yan Wang, Qibin Chen, Yizhen Luo, Xingcheng Yao, Aohan Zeng, Shiguang Guo, Peng Zhang, Guohao Dai, Yu Wang, Chang Zhou, Hongxia Yang, Jie Tang

Graph representation learning aims to learn low-dimensional node embeddings for graphs. It is used in several real-world applications such as social network analysis and large-scale recommender systems. In this paper, we introduce CogDL, an extensive research toolkit for deep learning on graphs that allows researchers and developers to easily conduct experiments and build applications. It provides standard training and evaluation for the most important tasks in the graph domain, including node classification, link prediction, graph classification, and other graph tasks. For each task, it offers implementations of state-of-the-art models. The models in our toolkit are divided into two major parts, graph embedding methods and graph neural networks. Most of the graph embedding methods learn node-level or graph-level representations in an unsupervised way and preserve graph properties such as structural information, while graph neural networks capture node features and work in semi-supervised or self-supervised settings. The leaderboard results of all models implemented in our toolkit can be easily reproduced. Most models in CogDL are developed on top of PyTorch, and users can leverage the advantages of PyTorch to implement their own models. Furthermore, we demonstrate the effectiveness of CogDL for real-world applications in AMiner, which is a large academic database and system.
# Challenges and Opportunities in High-dimensional Variational Inference

Challenges and Opportunities in High-dimensional Variational Inference ( http://arxiv.org/abs/2103.01085v1 )

Akash Kumar Dhaka, Alejandro Catalina, Manushi Welandawe, Michael Riis Andersen, Jonathan Huggins, Aki Vehtari

We explore the limitations of and best practices for using black-box variational inference to estimate posterior summaries of the model parameters. By taking an importance sampling perspective, we are able to explain and empirically demonstrate: 1) why the intuitions about the behavior of approximate families and divergences for low-dimensional posteriors fail for higher-dimensional posteriors, 2) how we can diagnose the pre-asymptotic reliability of variational inference in practice by examining the behavior of the density ratios (i.e., importance weights), 3) why the choice of variational objective is not as relevant for higher-dimensional posteriors, and 4) why, although flexible variational families can provide some benefits in higher dimensions, they also introduce additional optimization challenges. Based on these findings, for high-dimensional posteriors we recommend using the exclusive KL divergence that is most stable and easiest to optimize, and then focusing on improving the variational family or using model parameter transformations to make the posterior more similar to the approximating family. Our results also show that in low to moderate dimensions, heavy-tailed variational families and mass-covering divergences can increase the chances that the approximation can be improved by importance sampling.
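To make the idea of inspecting density ratios concrete, here is a small sketch that turns log importance weights into an effective sample size (ESS). ESS is used here as a simple, widely known stand-in; it is not claimed to be the diagnostic used in the paper, and the function names are hypothetical.

```python
import math


def normalized_weights(log_weights):
    """Self-normalized importance weights computed stably from log weights."""
    shift = max(log_weights)  # subtract the max to avoid overflow in exp
    unnormalized = [math.exp(lw - shift) for lw in log_weights]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def effective_sample_size(log_weights):
    """ESS = 1 / sum(w_i^2); equals n for uniform weights, 1 if one weight dominates."""
    weights = normalized_weights(log_weights)
    return 1.0 / sum(w * w for w in weights)
```

Here each log weight would be `log p(theta_i, data) - log q(theta_i)` for a draw `theta_i` from the variational approximation `q`. An ESS close to the number of draws suggests the weights are well behaved, while an ESS near 1 suggests a few draws dominate.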
# Class Means as an Early Exit Decision Mechanism

Class Means as an Early Exit Decision Mechanism ( http://arxiv.org/abs/2103.01148v1 )

License: see the linked page
Alperen Gormez and Erdem Koyuncu

State-of-the-art neural networks with early exit mechanisms often need a considerable amount of training and fine-tuning to achieve good performance with low computational cost. We propose a novel early exit technique based on the class means of samples. Unlike most existing schemes, our method does not require gradient-based training of internal classifiers. This makes our method particularly useful for neural network training in low-power devices, as in wireless edge networks. In particular, given a fixed training time budget, our scheme achieves higher accuracy as compared to existing early exit mechanisms. Moreover, if there are no limitations on the training time budget, our method can be combined with an existing early exit scheme to boost its performance, achieving a better trade-off between computational cost and network accuracy.
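A class-mean exit rule can be sketched in a few lines. This is an illustrative assumption of how such a rule might look, not the authors' exact procedure: the Euclidean distance, the fixed `threshold`, and the function names are all hypothetical choices.

```python
import math


def class_means(features, labels):
    """Average the feature vectors of each class; no gradient training involved."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def early_exit_decision(feature, means, threshold):
    """Return (nearest_label, exit_now) for one intermediate-layer feature vector."""
    best_label, best_dist = None, math.inf
    for label, mean in means.items():
        dist = math.dist(feature, mean)
        if dist < best_dist:
            best_label, best_dist = label, dist
    # Assumed rule: exit early only when the nearest mean is close enough.
    return best_label, best_dist <= threshold
```

In use, `class_means` would be computed once per exit layer from training features. At inference time a sample leaves the network at the first layer where `exit_now` is true and otherwise continues to the next layer.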
# Acceleration via Fractal Learning Rate Schedules

When balancing the practical tradeoffs of iterative methods for large-scale optimization, the learning rate schedule remains notoriously difficult to understand and expensive to tune. We demonstrate the presence of these subtleties even in the innocuous case when the objective is a convex quadratic. We reinterpret an iterative algorithm from the numerical analysis literature as what we call the Chebyshev learning rate schedule for accelerating vanilla gradient descent, and show that the problem of mitigating instability leads to a fractal ordering of step sizes. We provide some experiments and discussion to challenge current understandings of the "edge of stability" in deep learning: even in simple settings, provable acceleration can be obtained by making negative local progress on the objective.
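For a convex quadratic whose curvatures lie in an interval [m, M], the classical Chebyshev construction takes step sizes equal to the reciprocals of the roots of a shifted Chebyshev polynomial. The sketch below is an illustration under stated assumptions: it uses a diagonal quadratic and applies the steps in their natural order, so it does not implement the fractal ordering the abstract refers to; all function names are hypothetical.

```python
import math


def chebyshev_step_sizes(m, M, T):
    """Step sizes 1/gamma_t, where gamma_t are Chebyshev nodes mapped to [m, M]."""
    mid, half = (M + m) / 2.0, (M - m) / 2.0
    return [1.0 / (mid - half * math.cos((2 * t - 1) * math.pi / (2 * T)))
            for t in range(1, T + 1)]


def gradient_descent_diag(eigenvalues, x0, step_sizes):
    """Gradient descent on f(x) = 0.5 * sum(lam_i * x_i^2), whose gradient is lam_i * x_i."""
    x = list(x0)
    for eta in step_sizes:
        x = [xi - eta * lam * xi for lam, xi in zip(eigenvalues, x)]
    return x


def worst_case_error(m, M, step_sizes, grid=50):
    """Largest remaining |x_i| over a grid of curvatures in [m, M], starting from all ones."""
    eigs = [m + (M - m) * k / (grid - 1) for k in range(grid)]
    final = gradient_descent_diag(eigs, [1.0] * grid, step_sizes)
    return max(abs(v) for v in final)


def chebyshev_bound(m, M, T):
    """Theoretical worst-case contraction 1 / T_T((M + m) / (M - m)) after T steps."""
    return 1.0 / math.cosh(T * math.acosh((M + m) / (M - m)))
```

With `m = 1`, `M = 100` and `T = 8`, the Chebyshev steps leave a worst-case error of about 0.39, against about 0.85 for the constant step `2 / (M + m)`. Several individual Chebyshev steps exceed `2 / M` and temporarily increase the error along high-curvature directions, which is why the order of the steps matters for numerical stability.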
# GEBT: Drawing Early-Bird Tickets in Graph Convolutional Network Training

Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs. However, it remains notoriously challenging to train GCNs and run inference with them over large graph datasets, limiting their application to large real-world graphs and hindering the exploration of deeper and more sophisticated GCN graphs. This is because as the graph size grows, the sheer number of node features and the large adjacency matrix can easily explode the required memory and data movements. To tackle the aforementioned challenge, we explore the possibility of drawing lottery tickets when sparsifying GCN graphs, i.e., subgraphs that largely shrink the adjacency matrix yet are capable of achieving accuracy comparable to or even better than their corresponding full graphs. Specifically, we for the first time discover the existence of graph early-bird (GEB) tickets that emerge at the very early stage when sparsifying GCN graphs, and propose a simple yet effective detector to automatically identify the emergence of such GEB tickets. Furthermore, we develop a generic efficient GCN training framework dubbed GEBT that can significantly boost the efficiency of GCN training by (1) drawing joint early-bird tickets between the GCN graphs and models and (2) enabling simultaneous sparsification of both GCN graphs and models, paving the way for training and inference on large GCN graphs to handle real-world graph datasets. Experiments on various GCN models and datasets consistently validate our GEB finding and the effectiveness of our GEBT, e.g., our GEBT achieves up to 80.2% ~ 85.6% and 84.6% ~ 87.5% savings of GCN training and inference costs while achieving comparable or even better accuracy than state-of-the-art methods. Code is available at https://github.com/RICE-EIC/GEBT
# Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning

In this work, we introduce panoramic panoptic segmentation as the most holistic scene understanding both in terms of field of view and image level understanding. A complete surrounding understanding provides a maximum of information to the agent, which is essential for any intelligent vehicle in order to make informed decisions in a safety-critical dynamic environment such as real-world traffic. In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain. Using our proposed method, we manage to achieve significant improvements of over 5% measured in PQ over non-adapted models on our Wild Panoramic Panoptic Segmentation (WildPPS) dataset. We show that our proposed Panoramic Robust Feature (PRF) framework is not only suitable to improve performance on panoramic images but can be beneficial whenever model training and deployment are executed on data taken from different distributions. As an additional contribution, we publish WildPPS: The first panoramic panoptic image dataset to foster progress in surrounding perception.
# DR-TANet: Dynamic Receptive Temporal Attention Network for Street Scene Change Detection

Street scene change detection continues to capture researchers' interests in the computer vision community. It aims to identify the changed regions of the paired street-view images captured at different times. The state-of-the-art network based on the encoder-decoder architecture leverages the feature maps at the corresponding level between two channels to gain sufficient information of changes. Still, the efficiency of feature extraction, feature correlation calculation, and even the whole network requires further improvement. This paper proposes temporal attention and explores the impact of the dependency-scope size of temporal attention on the performance of change detection. In addition, based on the Temporal Attention Module (TAM), we introduce a more efficient and light-weight version - Dynamic Receptive Temporal Attention Module (DRTAM) and propose the Concurrent Horizontal and Vertical Attention (CHVA) to improve the accuracy of the network on specific challenging entities. On street scene datasets 'GSV', 'TSUNAMI' and 'VL-CMU-CD', our approach gains excellent performance, establishing new state-of-the-art scores without bells and whistles, while maintaining high efficiency applicable in autonomous vehicles.
# Robust 3D U-Net Segmentation of Macular Holes

Macular holes are a common eye condition that results in visual impairment. We look at the application of deep convolutional neural networks to the problem of macular hole segmentation. We use the 3D U-Net architecture as a basis and experiment with a number of design variants. Manually annotating and measuring macular holes is time consuming and error prone. Previous automated approaches to macular hole segmentation take minutes to segment a single 3D scan. Our proposed model generates significantly more accurate segmentations in less than a second. We found that an approach of architectural simplification, by greatly simplifying the network capacity and depth, exceeds both expert performance and state-of-the-art models such as residual 3D U-Nets.
# There Is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Attributes of sound inherent to objects can provide valuable cues to learn rich representations for object detection and tracking. Furthermore, the co-occurrence of audiovisual events in videos can be exploited to localize objects over the image field by solely monitoring the sound in the environment. Thus far, this has only been feasible in scenarios where the camera is static and for single object detection. Moreover, the robustness of these methods has been limited as they primarily rely on RGB images which are highly susceptible to illumination and weather changes. In this work, we present the novel self-supervised MM-DistillNet framework consisting of multiple teachers that leverage diverse modalities including RGB, depth and thermal images, to simultaneously exploit complementary cues and distill knowledge into a single audio student network. We propose the new MTA loss function that facilitates the distillation of information from multimodal teachers in a self-supervised manner. Additionally, we propose a novel self-supervised pretext task for the audio student that enables us to not rely on labor-intensive manual annotations. We introduce a large-scale multimodal dataset with over 113,000 time-synchronized frames of RGB, depth, thermal, and audio modalities. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods while being able to detect multiple objects using only sound during inference and even while moving.
# Contrastive Separative Coding for Self-Supervised Representation Learning

To extract robust deep representations from long sequential modeling of speech data, we propose a self-supervised learning approach, namely Contrastive Separative Coding (CSC). Our key finding is to learn such representations by separating the target signal from contrastive interfering signals. First, a multi-task separative encoder is built to extract shared separable and discriminative embedding; secondly, we propose a powerful cross-attention mechanism performed over speaker representations across various interfering conditions, allowing the model to focus on and globally aggregate the most critical information to answer the "query" (current bottom-up embedding) while paying less attention to interfering, noisy, or irrelevant parts; lastly, we form a new probabilistic contrastive loss which estimates and maximizes the mutual information between the representations and the global speaker vector. While most prior unsupervised methods have focused on predicting the future, neighboring, or missing samples, we take a different perspective of predicting the interfered samples. Moreover, our contrastive separative loss is free from negative sampling. The experiment demonstrates that our approach can learn useful representations achieving a strong speaker verification performance in adverse conditions.
# Sandglasset: A Light Multi-Granularity Self-Attentive Network for Time-Domain Speech Separation

One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-path segmentation technique, where the size of each segment remains unchanged throughout all layers. In contrast, our key finding is that multi-granularity features are essential for enhancing contextual modeling and computational efficiency. We introduce a self-attentive network with a novel sandglass-shape, namely Sandglasset, which advances the state-of-the-art (SOTA) SS performance at significantly smaller model size and computational cost. Forward along each block inside Sandglasset, the temporal granularity of the features gradually becomes coarser until reaching half of the network blocks, and then successively turns finer towards the raw signal level. We also find that residual connections between features with the same granularity are critical for preserving information after passing through the bottleneck layer. Experiments show our Sandglasset with only 2.3M parameters has achieved the best results on two benchmark SS datasets -- WSJ0-2mix and WSJ0-3mix, where the SI-SNRi scores have been improved by absolute 0.6 dB and 2.4 dB, respectively, compared to the prior SOTA results.
# Latent Linear Dynamics in Spatiotemporal Medical Data

Spatiotemporal imaging is common in medical imaging, with applications in e.g. cardiac diagnostics, surgical guidance and radiotherapy monitoring. In this paper, we present an unsupervised model that identifies the underlying dynamics of the system, only based on the sequential images. The model maps the input to a low-dimensional latent space wherein a linear relationship holds between a hidden state process and the observed latent process. Knowledge of the system dynamics enables denoising, imputation of missing values and extrapolation of future image frames. We use a Variational Auto-Encoder (VAE) for the dimensionality reduction and a Linear Gaussian State Space Model (LGSSM) for the latent dynamics. The model, known as a Kalman Variational Auto-Encoder, is end-to-end trainable and the weights, both in the VAE and LGSSM, are simultaneously updated by maximizing the evidence lower bound of the marginal log likelihood. Our experiment, on cardiac ultrasound time series, shows that the dynamical model provides better reconstructions than a similar model without dynamics. It also makes it possible to impute missing samples and to extrapolate future ones.
# Long Document Summarization in a Low-Resource Setting Using Pretrained Language Models

Abstractive summarization is the task of compressing a long document into a coherent short document while retaining salient information. Modern abstractive summarization methods are based on deep neural networks which often require large training datasets. Since collecting summarization datasets is an expensive and time-consuming task, practical industrial settings are usually low-resource. In this paper, we study a challenging low-resource setting of summarizing long legal briefs with an average source document length of 4268 words and only 120 available (document, summary) pairs. To account for data scarcity, we used a modern pretrained abstractive summarizer BART (Lewis et al., 2020), which only achieves 17.9 ROUGE-L as it struggles with long documents. We thus attempt to compress these long documents by identifying salient sentences in the source which best ground the summary, using a novel algorithm based on GPT-2 (Radford et al., 2019) language model perplexity scores, that operates within the low resource regime. On feeding the compressed documents to BART, we observe a 6.0 ROUGE-L improvement. Our method also beats several competitive salience detection baselines. Furthermore, the identified salient sentences tend to agree with an independent human labeling by domain experts.
# A Data-Driven Approach to Estimating User Satisfaction in Multi-Turn Dialogues

The evaluation of multi-turn dialogues remains challenging. The common approach of labeling the user satisfaction with the experience on the dialogue level does not reflect the task's difficulty. Therefore assigning the same experience score to two tasks with different complexity levels is misleading. Another approach, which suggests evaluating each dialogue turn independently, ignores each turn's long-term influence over the final user experience with the dialogue. We instead develop a new method to estimate the turn-level satisfaction for dialogue, which is context-sensitive and has a long-term view. Our approach is data-driven, which makes it easily personalized. The interactions between users and dialogue systems are formulated using a budget consumption setup. We assume the user has an initial interaction budget for a conversation based on the task complexity, and each dialogue turn has a cost. When the task is completed or the budget has run out, the user will quit the interaction. We demonstrate the effectiveness of our method by extensive experimentation with a simulated dialogue platform and a realistic dialogue dataset.
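The budget consumption setup can be pictured with a toy simulation. This sketch only mirrors the verbal description above; the function name, the numeric budget and costs, and the rule that completion is checked before budget exhaustion are assumptions, and the abstract does not say how budgets or turn costs are estimated from data.

```python
def simulate_dialogue(initial_budget, turn_costs, completed_at=None):
    """Play out one dialogue under a budget; return (turns_taken, outcome).

    initial_budget: hypothetical budget the user grants, based on task complexity.
    turn_costs: hypothetical cost charged by each dialogue turn, in order.
    completed_at: 1-based turn at which the task is completed, if any.
    """
    budget = initial_budget
    for turn, cost in enumerate(turn_costs, start=1):
        budget -= cost
        if completed_at is not None and turn == completed_at:
            return turn, "completed"   # the task finished, so the user leaves satisfied
        if budget <= 0:
            return turn, "quit"        # the budget ran out before completion
    return len(turn_costs), "ongoing"  # neither condition was reached
```

For example, `simulate_dialogue(2.0, [1.0, 1.5, 1.0], completed_at=3)` ends with `"quit"` at turn 2, because the second turn exhausts the budget before the task is done.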
# ToxCCIn: Toxic Content Classification with Interpretability

Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans. Explanations are particularly important for tasks like offensive language or toxicity detection on social media because a manual appeal process is often in place to dispute automatically flagged content. In this work, we propose a technique to improve the interpretability of these models, based on a simple and powerful assumption: a post is at least as toxic as its most toxic span. We incorporate this assumption into transformer models by scoring a post based on the maximum toxicity of its spans and augmenting the training process to identify correct spans. We find that this approach is effective and can produce explanations that exceed the quality of those provided by Logistic Regression analysis (often regarded as a highly-interpretable model), according to a human study.
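The "most toxic span" assumption amounts to a max over span scores, with the arg-max span serving as the explanation. The sketch below illustrates that scoring rule only: `toy_span_scorer` and `TOY_LEXICON` are hypothetical stand-ins for a learned transformer scorer, and the span length limit is an arbitrary choice.

```python
def enumerate_spans(tokens, max_len):
    """Yield (start, end) index pairs for all contiguous spans up to max_len tokens."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield start, end


def score_post(tokens, score_span, max_len=3):
    """Post score = maximum span score; the arg-max span doubles as the explanation."""
    if not tokens:
        return 0.0, None
    best = max(enumerate_spans(tokens, max_len),
               key=lambda se: score_span(tokens[se[0]:se[1]]))
    return score_span(tokens[best[0]:best[1]]), best


# Hypothetical toy scorer standing in for a learned model.
TOY_LEXICON = {"idiot": 0.9, "stupid": 0.8}


def toy_span_scorer(span_tokens):
    """Highest lexicon score in the span, with a small penalty for longer spans."""
    peak = max(TOY_LEXICON.get(tok, 0.0) for tok in span_tokens)
    return peak - 0.01 * (len(span_tokens) - 1)
```

Calling `score_post("you are an idiot".split(), toy_span_scorer)` selects the single-token span `(3, 4)`, so the word that drove the decision can be shown to a moderator or to the user appealing the flag.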
# Accurate Extraction of Small-Target Motion Information with a Bioinspired Retinal Neural Network

Robust and accurate detection of small moving targets in cluttered moving backgrounds is a significant and challenging problem for robotic visual systems to perform search and tracking tasks. Inspired by the neural circuitry of elementary motion vision in the mammalian retina, this paper proposes a bioinspired retinal neural network based on a new neurodynamics-based temporal filtering and multiform 2-D spatial Gabor filtering. This model can estimate motion direction accurately via only two perpendicular spatiotemporal filtering signals, and respond to small targets of different sizes and velocities by adjusting the dendrite field size of the spatial filter. Meanwhile, an algorithm of directionally selective inhibition is proposed to suppress the target-like features in the moving background, which can reduce the influence of background motion effectively. Extensive synthetic and real-data experiments show that the proposed model works stably for small targets of a wider size and velocity range, and has better detection performance than other bioinspired models. Additionally, it can also extract the information of motion direction and motion energy accurately and rapidly.
# Model-Agnostic Defense for Lane Detection against Adversarial Attack

Susceptibility of neural networks to adversarial attack prompts serious safety concerns for lane detection efforts, a domain where such models have been widely applied. Recent work on adversarial road patches has successfully induced perception of lane lines with arbitrary form, presenting an avenue for rogue control of vehicle behavior. In this paper, we propose a modular lane verification system that can catch such threats before the autonomous driving system is misled while remaining agnostic to the particular lane detection model. Our experiments show that implementing the system with a simple convolutional neural network (CNN) can defend against a wide gamut of attacks on lane detection models. With a 10% impact to inference time, we can detect 96% of bounded non-adaptive attacks, 90% of bounded adaptive attacks, and 98% of patch attacks while preserving accurate identification of at least 95% of true lanes, indicating that our proposed verification system is effective at mitigating lane detection security risks with minimal overhead.
# FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation

Scene understanding based on LiDAR point cloud is an essential task for autonomous cars to drive safely, which often employs spherical projection to map 3D point cloud into multi-channel 2D images for semantic segmentation. Most existing methods simply stack different point attributes/modalities (e.g. coordinates, intensity, depth, etc.) as image channels to increase information capacity, but ignore distinct characteristics of point attributes in different image channels. We design FPS-Net, a convolutional fusion network that exploits the uniqueness and discrepancy among the projected image channels for optimal point cloud segmentation. FPS-Net adopts an encoder-decoder structure. Instead of simply stacking multiple channel images as a single input, we group them into different modalities to first learn modality-specific features separately and then map the learned features into a common high-dimensional feature space for pixel-level fusion and learning. Specifically, we design a residual dense block with multiple receptive fields as a building block in the encoder which preserves detailed information in each modality and learns hierarchical modality-specific and fused features effectively. In the FPS-Net decoder, we use a recurrent convolution block likewise to hierarchically decode fused features into output space for pixel-level classification. Extensive experiments conducted on two widely adopted point cloud datasets show that FPS-Net achieves superior semantic segmentation as compared with state-of-the-art projection-based methods. In addition, the proposed modality fusion idea is compatible with typical projection-based methods and can be incorporated into them with consistent performance improvements.
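The spherical projection mentioned at the start of the abstract maps each 3D point to a pixel of a range image using its yaw and pitch angles. The following is a generic sketch of that mapping, not code from FPS-Net; the image size and the vertical field of view (3 degrees up, 25 degrees down) are assumed values typical of a 64-beam sensor, and the function name is hypothetical.

```python
import math


def spherical_project(point, width=2048, height=64,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map a 3D point (x, y, z) to (row, col, depth) in a range image."""
    x, y, z = point
    depth = math.sqrt(x * x + y * y + z * z)
    if depth == 0.0:
        raise ValueError("the sensor origin has no defined direction")
    yaw = math.atan2(y, x)        # horizontal angle in [-pi, pi]
    pitch = math.asin(z / depth)  # vertical angle in [-pi/2, pi/2]
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down       # total vertical field of view
    u = 0.5 * (1.0 - yaw / math.pi) * width          # column from yaw
    v = (1.0 - (pitch - fov_down) / fov) * height    # row from pitch, top row = fov_up
    # Clamp so that points outside the field of view land on the image border.
    col = min(width - 1, max(0, int(math.floor(u))))
    row = min(height - 1, max(0, int(math.floor(v))))
    return row, col, depth
```

Each pixel can then hold several channels for the point that falls into it, such as the coordinates, intensity and depth listed in the abstract, which are the modalities that the network groups and fuses.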
# NeuTex: Neural Texture Mapping for Volumetric Neural Rendering

Recent work has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. However, these methods entangle geometry and appearance in a "black-box" volume that cannot be edited. Instead, we present an approach that explicitly disentangles geometry--represented as a continuous 3D volume--from appearance--represented as a continuous 2D texture map. We achieve this by introducing a 3D-to-2D texture mapping (or surface parameterization) network into volumetric representations. We constrain this texture mapping network using an additional 2D-to-3D inverse mapping network and a novel cycle consistency loss to make 3D surface points map to 2D texture points that map back to the original 3D points. We demonstrate that this representation can be reconstructed using only multi-view image supervision and generates high-quality rendering results. More importantly, by separating geometry and texture, we allow users to edit appearance by simply editing 2D texture maps.
# Towards Precise and Efficient Image Guided Depth Completion

Image guided depth completion is the task of generating a dense depth map from a sparse depth map and a high quality image. In this task, how to fuse the color and depth modalities plays an important role in achieving good performance. This paper proposes a two-branch backbone that consists of a color-dominant branch and a depth-dominant branch to exploit and fuse two modalities thoroughly. More specifically, one branch inputs a color image and a sparse depth map to predict a dense depth map. The other branch takes as inputs the sparse depth map and the previously predicted depth map, and outputs a dense depth map as well. The depth maps predicted from two branches are complementary to each other and therefore they are adaptively fused. In addition, we also propose a simple geometric convolutional layer to encode 3D geometric cues. The geometric encoded backbone conducts the fusion of different modalities at multiple stages, leading to good depth completion results. We further implement a dilated and accelerated CSPN++ to refine the fused depth map efficiently. The proposed full model ranks 1st in the KITTI depth completion online leaderboard at the time of submission. It also infers much faster than most of the top ranked methods. The code of this work will be available at https://github.com/JUGGHM/PENet_ICRA2021.
# Detection and Rectification of Arbitrary Shaped Scene Texts Using Text Keypoints and Links

Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc. This paper presents a mask-guided multi-task network that detects and rectifies scene texts of arbitrary shapes reliably. Three types of keypoints are detected which specify the centre line and so the shape of text instances accurately. In addition, four types of keypoint links are detected of which the horizontal links associate the detected keypoints of each text instance and the vertical links predict a pair of landmark points (for each keypoint) along the upper and lower text boundary, respectively. Scene texts can be located and rectified by linking up the associated landmark points (giving localization polygon boxes) and transforming the polygon boxes via thin plate spline, respectively. Extensive experiments over several public datasets show that the use of text keypoints is tolerant to the variation in text orientations, lengths, and curvatures, and it achieves superior scene text detection and rectification performance as compared with state-of-the-art methods.
# A 3D Model-Based Approach for Fitting Masks to Faces in the Wild

Face recognition research now requires a large number of labelled masked face images in the era of this unprecedented COVID-19 pandemic. Unfortunately, the rapid spread of the virus has left us little time to prepare such a dataset in the wild. To circumvent this issue, we present a 3D model-based approach called WearMask3D for augmenting face images of various poses to the masked face counterparts. Our method proceeds by first fitting a 3D morphable model on the input image, second overlaying the mask surface onto the face model and warping the respective mask texture, and last projecting the 3D mask back to 2D. The mask texture is adapted based on the brightness and resolution of the input image. By working in 3D, our method can produce more natural masked faces of diverse poses from a single mask texture. To compare precisely between different augmentation approaches, we have constructed a dataset comprising masked and unmasked faces with labels called MFW-mini. Experimental results demonstrate that WearMask3D, which will be made publicly available, produces more realistic masked images, and that utilizing these images for training leads to improved recognition accuracy of masked faces compared to the state-of-the-art.
# Over-sampling De-occlusion Attention Network for Prohibited Items Detection in Noisy X-ray Images

Security inspection is the X-ray scanning of personal belongings in suitcases, which is important for public security but highly time-consuming for human inspectors. Fortunately, deep learning has greatly promoted the development of computer vision, offering a possible route to automatic security inspection. However, items within luggage overlap randomly, resulting in noisy X-ray images with heavy occlusions. Thus, traditional CNN-based models trained on common image recognition datasets fail to achieve satisfactory performance in this scenario. To address these problems, we contribute the first high-quality prohibited X-ray object detection dataset, named OPIXray, which contains 8885 X-ray images from 5 categories of the widely occurring prohibited item "cutters". The images were gathered from an airport, and the prohibited items were annotated manually by professional inspectors; the dataset can be used as a benchmark for model training and to facilitate future research. To improve occluded X-ray object detection, we further propose an over-sampling de-occlusion attention network (DOAM-O), which consists of a novel de-occlusion attention module and a new over-sampling training strategy. Specifically, our de-occlusion module, namely DOAM, simultaneously leverages the different appearance information of the prohibited items; the over-sampling training strategy forces the model to put more emphasis on hard samples containing items with high occlusion levels, which is more suitable for this scenario. We comprehensively evaluated DOAM-O on the OPIXray dataset; the results show that our model consistently improves the performance of well-known detection models such as SSD, YOLOv3, and FCOS, and outperforms many widely used attention mechanisms.
# MFST: Multi-Features Siamese Tracker

Siamese trackers have recently achieved interesting results due to their balance between accuracy and speed. This success is mainly due to the fact that deep similarity networks were specifically designed to address the image similarity problem. Therefore, they are inherently more appropriate than classical CNNs for the tracking task. However, Siamese trackers rely on the last convolutional layers for similarity analysis and target search, which restricts their performance. In this paper, we argue that using a single convolutional layer as feature representation is not the optimal choice within the deep similarity framework, as multiple convolutional layers provide several abstraction levels in characterizing an object. Starting from this motivation, we present the Multi-Features Siamese Tracker (MFST), a novel tracking algorithm exploiting several hierarchical feature maps for robust deep similarity tracking. MFST proceeds by fusing hierarchical features to ensure a richer and more efficient representation. Moreover, we handle appearance variation by calibrating deep features extracted from two different CNN models. Based on this advanced feature representation, our algorithm achieves high tracking accuracy, while outperforming several state-of-the-art trackers, including standard Siamese trackers. The code and trained models are available at https://github.com/zhenxili96/MFST.
# DST: Data Selection and Joint Training for Learning with Noisy Labels

Training a deep neural network relies heavily on a large amount of training data with accurate annotations. To alleviate this problem, various methods have been proposed to annotate the data automatically. However, automatically generating annotations will inevitably yield noisy labels. In this paper, we propose a Data Selection and joint Training (DST) method to automatically select training samples with accurate annotations. Specifically, DST fits a mixture model according to the original annotation as well as the predicted label for each training sample, and the mixture model is used to dynamically divide the training dataset into a correctly labeled set, a correctly predicted set, and a wrong set. Then, DST is trained with these sets in a supervised manner. Because of the confirmation bias problem, we train two networks alternately, and each network establishes the data division used to teach the other network. In each iteration, the correctly labeled and correctly predicted labels are reweighted by their respective probabilities from the mixture model, and a uniform distribution is used to generate the probabilities of the wrong samples. Experiments on CIFAR-10, CIFAR-100 and Clothing1M demonstrate that DST is comparable or superior to state-of-the-art methods.
# Self-Supervised Low-Light Image Enhancement and Denoising

This paper proposes a self-supervised low-light image enhancement method based on deep learning, which improves image contrast and reduces noise at the same time, avoiding the blur caused by pre-/post-denoising. The method contains two deep sub-networks, an Image Contrast Enhancement Network (ICE-Net) and a Re-Enhancement and Denoising Network (RED-Net). The ICE-Net takes the low-light image as input and produces a contrast-enhanced image. The RED-Net takes the result of ICE-Net and the low-light image as input, and re-enhances the low-light image and denoises it at the same time. Both networks can be trained with low-light images only, which is achieved by a Maximum Entropy based Retinex (ME-Retinex) model and an assumption that noise is independently distributed. In the ME-Retinex model, a new constraint on the reflectance image is introduced: the maximum channel of the reflectance image conforms to the maximum channel of the low-light image, and its entropy should be the largest. This constraint converts the decomposition of reflectance and illumination in the Retinex model into a non-ill-conditioned problem and allows the ICE-Net to be trained in a self-supervised way. The loss functions of RED-Net are carefully formulated to separate noise and details during training; they are based on the idea that, if noise is independently distributed, then after smoothing (e.g., with a mean filter) the gradient of the noise part should be smaller than the gradient of the detail part. Experiments demonstrate, both qualitatively and quantitatively, that the proposed method is efficient.
# Learning Frequency Domain Approximation for Binary Neural Networks

Binary neural networks (BNNs) represent the original full-precision weights and activations as 1-bit values using the sign function. Since the gradient of the conventional sign function is zero almost everywhere and cannot be used for back-propagation, several approaches have been proposed to alleviate the optimization difficulty by using an approximate gradient. However, those approximations corrupt the main direction of the actual gradient. To this end, we propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs, namely frequency domain approximation (FDA). The proposed approach does not affect the low-frequency information of the original sign function, which occupies most of the overall energy, and high-frequency coefficients are ignored to avoid huge computational overhead. In addition, we embed a noise adaptation module into the training phase to compensate for the approximation error. Experiments on several benchmark datasets and neural architectures show that the binary network learned using our method achieves state-of-the-art accuracy.
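
The idea of approximating the sign function with a combination of sine functions can be illustrated with the classical Fourier series of a square wave. The sketch below is only an illustration under assumptions: the function names, the number of terms `n_terms`, and the base frequency `omega` are hypothetical and are not taken from the paper, and the noise adaptation module is not modelled.

```python
import math


def sign_fourier(x, n_terms=10, omega=1.0):
    """Truncated sine series of a square wave with period 2*pi/omega.

    For |x| < pi/omega this approximates sign(x).
    """
    return (4.0 / math.pi) * sum(
        math.sin((2 * i + 1) * omega * x) / (2 * i + 1) for i in range(n_terms)
    )


def sign_fourier_grad(x, n_terms=10, omega=1.0):
    """Term-by-term derivative of the truncated series.

    Unlike the true derivative of sign(x), this is non-zero around x = 0,
    so it could serve as a surrogate gradient in back-propagation.
    """
    return (4.0 * omega / math.pi) * sum(
        math.cos((2 * i + 1) * omega * x) for i in range(n_terms)
    )
```

Dropping the high-frequency terms (a small `n_terms`) keeps the cost of the sum low while retaining the low-frequency components that carry most of the energy of the square wave.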
# ADAADepth: Adapting Data Augmentation and Attention for Self-Supervised Monocular Depth Estimation

Self-supervised learning of depth has been a highly studied topic of research, as it alleviates the requirement of having ground-truth annotations for predicting depth. Depth is learnt as an intermediate solution to the task of view synthesis, utilising warped photometric consistency. Although it gives good results when trained using stereo data, the predicted depth is still sensitive to noise, illumination changes and specular reflections. Also, occlusion can be tackled better by learning depth from a single camera. We propose ADAA, utilising depth augmentation as depth supervision for learning accurate and robust depth. We propose a relational self-attention module that learns rich contextual features and further enhances depth results. We also optimize the auto-masking strategy across all losses by enforcing L1 regularisation over the mask. Our novel progressive training strategy first learns depth at a lower resolution and then progresses to the original resolution with slight training. We utilise a ResNet18 encoder, learning features for prediction of both depth and pose. We evaluate our predicted depth on the standard KITTI driving dataset and achieve state-of-the-art results for monocular depth estimation with a significantly lower number of trainable parameters in our deep learning framework. We also evaluate our model on the Make3D dataset, showing better generalization than other methods.
# Learning to Enhance Low-Light Images via Zero-Reference Deep Curve Estimation

This paper presents a novel method, Zero-Reference Deep Curve Estimation (Zero-DCE), which formulates light enhancement as a task of image-specific curve estimation with a deep network. Our method trains a lightweight deep network, DCE-Net, to estimate pixel-wise and high-order curves for dynamic range adjustment of a given image. The curve estimation is specially designed, considering pixel value range, monotonicity, and differentiability. Zero-DCE is appealing in its relaxed assumption on reference images, i.e., it does not require any paired or even unpaired data during training. This is achieved through a set of carefully formulated non-reference loss functions, which implicitly measure the enhancement quality and drive the learning of the network. Despite its simplicity, we show that it generalizes well to diverse lighting conditions. Our method is efficient as image enhancement can be achieved by an intuitive and simple nonlinear curve mapping. We further present an accelerated and light version of Zero-DCE, called Zero-DCE++, that takes advantage of a tiny network with just 10K parameters. Zero-DCE++ has a fast inference speed (1000/11 FPS on a single GPU/CPU for an image of size 1200*900*3) while keeping the enhancement performance of Zero-DCE. Extensive experiments on various benchmarks demonstrate the advantages of our method over state-of-the-art methods qualitatively and quantitatively. Furthermore, the potential benefits of our method to face detection in the dark are discussed. The source code will be made publicly available at https://li-chongyi.github.io/Proj_Zero-DCE++.html.
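
To make the three design constraints (pixel value range, monotonicity, differentiability) concrete, the sketch below uses a quadratic curve `x + a*x*(1-x)` with `a` in [-1, 1], applied several times to obtain a higher-order curve. Treat the exact curve family and the names `enhance_pixel` and `alphas` as assumptions for illustration: the abstract does not spell out the formula, and in the method the per-pixel curve parameters would come from the network rather than from a hand-written list.

```python
def enhance_pixel(x, alphas):
    """Iteratively apply the curve x + a*x*(1-x) to a pixel value in [0, 1].

    Each application keeps the value in [0, 1], is monotonically
    non-decreasing in x, and is differentiable, provided -1 <= a <= 1.
    """
    for a in alphas:
        if not -1.0 <= a <= 1.0:
            raise ValueError("each alpha must lie in [-1, 1]")
        x = x + a * x * (1.0 - x)
    return x
```

For example, `enhance_pixel(0.2, [0.5] * 8)` brightens a dark pixel while leaving the end points 0 and 1 fixed.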
# FineNet: Frame Interpolation and Enhancement for Face Video Deblurring

The objective of this work is to deblur face videos. We propose a method that tackles this problem from two directions: (1) enhancing the blurry frames, and (2) treating the blurry frames as missing values and estimating them by interpolation. These approaches are complementary, and their combination outperforms either one alone. We also introduce a novel module that leverages the structure of faces for finding positional offsets between video frames. This module can be integrated into the processing pipelines of both approaches, improving the quality of the final outcome. Experiments on three real and synthetically generated blurry video datasets show that our method outperforms the previous state-of-the-art methods by a large margin in terms of both quantitative and qualitative results.
# DF-VO: What Should Be Learnt for Visual Odometry?

Multi-view geometry-based methods have dominated monocular Visual Odometry over the last few decades thanks to their superior performance, but they are vulnerable to dynamic and low-texture scenes. More importantly, monocular methods suffer from the scale-drift issue, i.e., errors accumulate over time. Recent studies show that deep neural networks can learn scene depths and relative camera poses in a self-supervised manner without ground-truth labels. More surprisingly, they show that well-trained networks enable scale-consistent predictions over long videos, although their accuracy is still inferior to traditional methods because geometric information is ignored. Building on recent progress in computer vision, we design a simple yet robust VO system, DF-VO, by integrating multi-view geometry with deep learning on Depth and optical Flow. In this work, a) we propose a method to carefully sample high-quality correspondences from deep flows and recover accurate camera poses with a geometric module; b) we address the scale-drift issue by aligning geometrically triangulated depths to the scale-consistent deep depths, where dynamic scenes are taken into account. Comprehensive ablation studies show the effectiveness of the proposed method, and extensive evaluation results show the state-of-the-art performance of our system, e.g., ours (1.652%) vs. ORB-SLAM (3.247%) in terms of translation error on the KITTI Odometry benchmark. Source code is publicly available at https://github.com/Huangying-Zhan/DF-VO.
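
Aligning triangulated depths to network-predicted depths amounts to estimating a single scale factor per frame pair. The sketch below uses the median of per-point depth ratios, which is a common robust choice but an assumption here, since the abstract does not state the estimator; the function names and the `min_depth` cut-off are likewise hypothetical.

```python
from statistics import median


def align_scale(triangulated, predicted, min_depth=1e-6):
    """Estimate the factor that maps up-to-scale triangulated depths
    onto the scale of the predicted depths (median ratio, assumed)."""
    ratios = [
        p / t
        for t, p in zip(triangulated, predicted)
        if t > min_depth and p > min_depth  # skip invalid or degenerate points
    ]
    if not ratios:
        raise ValueError("no valid depth pairs to align")
    return median(ratios)


def rescale_translation(t_unit, scale):
    """Apply the recovered scale to an up-to-scale translation vector."""
    return [scale * c for c in t_unit]
```

Because the median ignores a minority of extreme ratios, points on moving objects or with poor triangulation have limited influence on the recovered scale.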
# A Little Energy Goes a Long Way: Energy-Efficient, Accurate Conversion from Convolutional Neural Networks to Spiking Neural Networks

Spiking neural networks (SNNs) offer an inherent ability to process spatial-temporal data, in other words real-world sensory data, but suffer from the difficulty of training high-accuracy models. A major thread of research on SNNs is converting a pre-trained convolutional neural network (CNN) to an SNN of the same structure. State-of-the-art conversion methods are approaching the accuracy limit, i.e., near-zero accuracy loss of the SNN against the original CNN. However, we note that this is made possible only when significantly more energy is consumed to process an input. In this paper, we argue that this trend of "energy for accuracy" is not necessary: a little energy can go a long way towards near-zero accuracy loss. Specifically, we propose a novel CNN-to-SNN conversion method that is able to use a reasonably short spike train (e.g., 256 timesteps for CIFAR10 images) to achieve near-zero accuracy loss. The new conversion method, named explicit current control (ECC), contains three techniques (current normalisation, thresholding for residual elimination, and consistency maintenance for batch-normalisation) that explicitly control the currents flowing through the SNN when processing inputs. We implement ECC in a tool nicknamed SpKeras, which can conveniently import Keras CNN models and convert them into SNNs. We conduct an extensive set of experiments with the tool, working with VGG16 and various datasets such as CIFAR10 and CIFAR100, and compare with state-of-the-art conversion methods. Results show that ECC is a promising method that can optimise energy consumption and accuracy loss simultaneously.
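
The link between spike-train length and accuracy can be seen in a single integrate-and-fire neuron: driven by a constant input current, its firing rate approximates a clipped ReLU of that current, and the approximation error shrinks as the number of timesteps grows. The sketch below shows only this generic rate-coding behaviour; it is not the ECC method, and the function name and the reset-by-subtraction rule are assumptions chosen for illustration.

```python
def if_neuron_rate(current, timesteps=256, threshold=1.0):
    """Simulate an integrate-and-fire neuron with reset by subtraction.

    Returns the firing rate (spikes per timestep), which approximates
    min(max(current / threshold, 0), 1) for a constant input current.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(timesteps):
        membrane += current
        if membrane >= threshold:
            membrane -= threshold  # keep the residual instead of resetting to 0
            spikes += 1
    return spikes / timesteps
```

Every spike costs energy, so a shorter spike train is cheaper but quantises the rate more coarsely; this is the trade-off the abstract refers to as "energy for accuracy".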
# Few-Shot Lifelong Learning

Many real-world classification problems often have classes with very few labeled training samples. Moreover, all possible classes may not be initially available for training, and may be given incrementally. Deep learning models need to deal with this two-fold problem in order to perform well in real-life situations. In this paper, we propose a novel Few-Shot Lifelong Learning (FSLL) method that enables deep learning models to perform lifelong/continual learning on few-shot data. Our method selects very few parameters from the model for training every new set of classes instead of training the full model. This helps in preventing overfitting. We choose the few parameters from the model in such a way that only the currently unimportant parameters get selected. By keeping the important parameters in the model intact, our approach minimizes catastrophic forgetting. Furthermore, we minimize the cosine similarity between the new and the old class prototypes in order to maximize their separation, thereby improving the classification performance. We also show that integrating our method with self-supervision improves the model performance significantly. We experimentally show that our method significantly outperforms existing methods on the miniImageNet, CIFAR-100, and CUB-200 datasets. Specifically, we outperform the state-of-the-art method by an absolute margin of 19.27% for the CUB dataset.
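
The two ingredients described above, updating only a few unimportant parameters and pushing class prototypes apart by lowering their cosine similarity, can be sketched as follows. Using the absolute value of a parameter as its importance score is an assumption made for this example (the abstract only says "currently unimportant"), and all function names are hypothetical.

```python
import math


def select_trainable(params, k):
    """Return the indices of the k parameters with the smallest magnitude.

    Magnitude as a proxy for importance is an assumption of this sketch.
    """
    order = sorted(range(len(params)), key=lambda i: abs(params[i]))
    return sorted(order[:k])


def masked_update(params, grads, trainable, lr=0.1):
    """Gradient step that changes only the selected parameters."""
    keep = set(trainable)
    return [
        p - lr * g if i in keep else p
        for i, (p, g) in enumerate(zip(params, grads))
    ]


def cosine_similarity(u, v):
    """Cosine similarity of two prototypes; minimising it separates them."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm
```

Leaving the large-magnitude parameters untouched is what preserves knowledge of the old classes, while the small trainable subset limits overfitting to the few new samples.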
# Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

While supervised learning is widely used for perception modules in conventional autonomous driving solutions, scalability is hindered by the huge amount of data labeling needed. In contrast, while end-to-end architectures do not require labeled data and are potentially more scalable, interpretability is sacrificed. We introduce a novel architecture that is trained in a fully self-supervised fashion for simultaneous multi-step prediction of a space-time cost map and road dynamics. Our solution replaces the manually designed cost function for motion planning with a learned high-dimensional cost map that is naturally interpretable and allows diverse contextual information to be integrated without manual data labeling. Experiments on real-world driving data show that our solution leads to fewer collisions and road violations over long planning horizons than the baselines, demonstrating the feasibility of fully self-supervised prediction without sacrificing either scalability or interpretability.
# Diverse Sample Generation for Accurate Data-Free Quantization

Quantization has emerged as one of the most prevalent approaches to compress and accelerate neural networks. Recently, data-free quantization has been widely studied as a practical and promising solution. It synthesizes data for calibrating the quantized model according to the batch normalization (BN) statistics of the FP32 model and significantly relieves the heavy dependency on real training data in traditional quantization methods. Unfortunately, we find that in practice, synthetic data identically constrained by BN statistics suffers serious homogenization at both the distribution level and the sample level, which in turn causes a significant performance drop in the quantized model. We propose a Diverse Sample Generation (DSG) scheme to mitigate the adverse effects caused by homogenization. Specifically, we slacken the alignment of feature statistics in the BN layer to relax the constraint at the distribution level, and design a layerwise enhancement to reinforce specific layers for different data samples. Our DSG scheme is versatile and can even be applied to state-of-the-art post-training quantization methods such as AdaRound. We evaluate the DSG scheme on the large-scale image classification task and consistently obtain significant improvements over various network architectures and quantization methods, especially when quantized to lower bits (e.g., up to 22% improvement on W4A4). Moreover, benefiting from the enhanced diversity, models calibrated by synthetic data perform close to those calibrated by real data and even outperform them on W4A4.
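
Relaxing the alignment of feature statistics can be pictured as adding a tolerance band around the stored BN statistics: synthetic data is penalised only when its batch statistics fall outside that band. The sketch below is a hedged illustration of that idea, not the DSG loss itself; the squared hinge form, the function name, and the `slack_mean` and `slack_var` parameters are all assumptions.

```python
def slack_bn_loss(synth_mean, synth_var, bn_mean, bn_var,
                  slack_mean=0.0, slack_var=0.0):
    """Penalise per-channel statistics only beyond a tolerance band.

    With both slack values at 0 this reduces to a plain squared-error
    match against the stored BN statistics.
    """
    loss = 0.0
    for sm, sv, bm, bv in zip(synth_mean, synth_var, bn_mean, bn_var):
        loss += max(abs(sm - bm) - slack_mean, 0.0) ** 2
        loss += max(abs(sv - bv) - slack_var, 0.0) ** 2
    return loss
```

Inside the band the gradient is zero, so different synthetic samples are free to settle at different statistics, which is one way to reduce homogenization at the distribution level.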
# Dual Attention Suppression Attack: Generating Adversarial Camouflage in the Physical World

Deep learning models are vulnerable to adversarial examples. As a more threatening type for practical deep learning systems, physical adversarial examples have received extensive research attention in recent years. However, without exploiting the intrinsic characteristics such as model-agnostic and human-specific patterns, existing works generate weak adversarial perturbations in the physical world, which fall short of attacking across different models and show visually suspicious appearance. Motivated by the viewpoint that attention reflects the intrinsic characteristics of the recognition process, this paper proposes the Dual Attention Suppression (DAS) attack to generate visually-natural physical adversarial camouflages with strong transferability by suppressing both model and human attention. As for attacking, we generate transferable adversarial camouflages by distracting the model-shared similar attention patterns from the target to non-target regions. Meanwhile, based on the fact that human visual attention always focuses on salient items (e.g., suspicious distortions), we evade the human-specific bottom-up attention to generate visually-natural camouflages which are correlated to the scenario context. We conduct extensive experiments in both the digital and physical world for classification and detection tasks on up-to-date models (e.g., Yolo-V5) and significantly demonstrate that our method outperforms state-of-the-art methods.
# P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching

Accurately describing and detecting 2D and 3D keypoints is crucial to establishing correspondences across images and point clouds. Although a plethora of learning-based 2D or 3D local feature descriptors and detectors have been proposed, the derivation of a shared descriptor and joint keypoint detector that directly matches pixels and points remains under-explored by the community. This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds. In order to directly match pixels and points, a dual fully convolutional framework is presented that maps 2D and 3D inputs into a shared latent representation space to simultaneously describe and detect keypoints. Furthermore, an ultra-wide reception mechanism in combination with a novel loss function is designed to mitigate the intrinsic information variations between pixel and point local regions. Extensive experimental results demonstrate that our framework shows competitive performance in fine-grained matching between images and point clouds and achieves state-of-the-art results for the task of indoor visual localization. Our source code will be available at [no-name-for-blind-review].
# Universal-Prototype Augmentation for Few-Shot Object Detection

Few-shot object detection (FSOD) aims to strengthen the performance of novel object detection with few labeled samples. To alleviate the constraint of few samples, enhancing the generalization ability of learned features for novel objects plays a key role. Thus, the feature learning process of FSOD should focus more on intrinsic object characteristics, which are invariant under different visual changes and are therefore helpful for feature generalization. Unlike previous attempts based on the meta-learning paradigm, in this paper we explore how to smooth object features with intrinsic characteristics that are universal across different object categories. We propose a new kind of prototype, namely the universal prototype, that is learned from all object categories. Besides the advantage of characterizing invariant characteristics, the universal prototypes alleviate the impact of unbalanced object categories. After augmenting object features with the universal prototypes, we impose a consistency loss to maximize the agreement between the augmented features and the original ones, which is beneficial for learning invariant object characteristics. Thus, we develop a new framework of few-shot object detection with universal prototypes (${FSOD}^{up}$) that has the merit of feature generalization towards novel objects. Experimental results on PASCAL VOC and MS COCO demonstrate the effectiveness of ${FSOD}^{up}$. In particular, for the 1-shot case of VOC Split2, ${FSOD}^{up}$ outperforms the baseline by 6.8% in terms of mAP. Moreover, we further verify ${FSOD}^{up}$ on a long-tail detection dataset, i.e., LVIS, where it outperforms the state-of-the-art method.
# Systematic Analysis and Removal of Circular Artifacts for StyleGAN

StyleGAN is one of the state-of-the-art image generators which is well-known for synthesizing high-resolution and hyper-realistic face images. Though images generated by vanilla StyleGAN model are visually appealing, they sometimes contain prominent circular artifacts which severely degrade the quality of generated images. In this work, we provide a systematic investigation on how those circular artifacts are formed by studying the functionalities of different stages of vanilla StyleGAN architecture, with both mechanism analysis and extensive experiments. The key modules of vanilla StyleGAN that promote such undesired artifacts are highlighted. Our investigation also explains why the artifacts are usually circular, relatively small and rarely split into 2 or more parts. Besides, we propose a simple yet effective solution to remove the prominent circular artifacts for vanilla StyleGAN, by applying a novel pixel-instance normalization (PIN) layer.
# InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Multi-Level Contextual Referring

Compared with the visual grounding in 2D images, the natural-language-guided 3D object localization on point clouds is more challenging due to the sparse and disordered property. In this paper, we propose a new model, named InstanceRefer, to achieve a superior 3D visual grounding through unifying instance attribute, relation and localization perceptions. In practice, based on the predicted target category from natural language, our model first filters instances from panoptic segmentation on point clouds to obtain a small number of candidates. Note that such instance-level candidates are more effective and rational than the redundant 3D object-proposal candidates. Then, for each candidate, we conduct the cooperative holistic scene-language understanding, i.e., multi-level contextual referring from instance attribute perception, instance-to-instance relation perception and instance-to-background global localization perception. Eventually, the most relevant candidate is localized effectively through adaptive confidence fusion. Experiments confirm that our InstanceRefer outperforms previous state-of-the-art methods by a large margin, i.e., 9.5% improvement on the ScanRefer benchmark (ranked 1st place) and 7.2% improvement on Sr3D.
# Multiple Convolutional Features in Siamese Networks for Object Tracking

Siamese trackers have demonstrated high performance in object tracking due to their balance between accuracy and speed. Unlike classification-based CNNs, deep similarity networks are specifically designed to address the image similarity problem, and thus are inherently more appropriate for the tracking task. However, Siamese trackers mainly use the last convolutional layers for similarity analysis and target search, which restricts their performance. In this paper, we argue that using a single convolutional layer as feature representation is not an optimal choice in a deep similarity framework. We present a Multiple Features-Siamese Tracker (MFST), a novel tracking algorithm exploiting several hierarchical feature maps for robust tracking. Since convolutional layers provide several abstraction levels in characterizing an object, fusing hierarchical features yields a richer and more efficient representation of the target. Moreover, we handle target appearance variations by calibrating the deep features extracted from two different CNN models. Based on this advanced feature representation, our method achieves high tracking accuracy, while outperforming the standard Siamese tracker on object tracking benchmarks. The source code and trained models are available at https://github.com/zhenxili96/MFST.
# Auto-Exposure Fusion for Single-Image Shadow Removal

Shadow removal is still a challenging task due to its inherent background-dependent and spatially variant properties, which lead to unknown and diverse shadow patterns. Even powerful state-of-the-art deep neural networks can hardly recover a traceless shadow-removed background. This paper proposes a new solution for this task by formulating it as an exposure fusion problem. Intuitively, we can first estimate multiple over-exposure images w.r.t. the input image so that the shadow regions in these images have the same color as the shadow-free areas in the input image. Then, we fuse the original input with the over-exposure images to generate the final shadow-free counterpart. Nevertheless, the spatially variant property of the shadow requires the fusion to be sufficiently "smart", that is, it should automatically select proper over-exposure pixels from different images to make the final output natural. To address this challenge, we propose the shadow-aware FusionNet, which takes the shadow image as input and generates fusion weight maps across all the over-exposure images. Moreover, we propose the boundary-aware RefineNet to further eliminate the remaining shadow trace. We conduct extensive experiments on the ISTD, ISTD+, and SRD datasets to validate our method's effectiveness, and show better performance in shadow regions and comparable performance in non-shadow regions relative to state-of-the-art methods. We release the model and code at https://github.com/tsingqguo/exposure-fusion-shadow-removal.
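
The fusion step described above is, at its core, a per-pixel weighted average of the input and its over-exposed versions. The sketch below shows that arithmetic on tiny grayscale images stored as nested lists. It is a simplified illustration under assumptions: a constant `gain` stands in for the estimated over-exposure, the weight maps are supplied by hand rather than predicted by a network, and all function names are hypothetical.

```python
def over_expose(image, gain):
    """Brighten an image (values in [0, 1]) by a constant gain, with clipping."""
    return [[min(gain * p, 1.0) for p in row] for row in image]


def fuse_pixel(values, weights):
    """Weighted average of one pixel across all candidate exposures."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * v for w, v in zip(weights, values)) / total


def fuse_images(images, weight_maps):
    """Fuse K same-sized images using K per-pixel weight maps."""
    height, width = len(images[0]), len(images[0][0])
    return [
        [
            fuse_pixel(
                [img[r][c] for img in images],
                [wm[r][c] for wm in weight_maps],
            )
            for c in range(width)
        ]
        for r in range(height)
    ]
```

In a shadow pixel the weights would favour an over-exposed candidate, and in a lit pixel they would favour the original input, which is the "smart" selection the abstract asks the fusion weights to perform.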
# Exploring the High-Dimensional Geometry of HSI Features

We explore feature space geometries induced by the 3-D Fourier scattering transform and a deep neural network with extended attribute profiles on four standard hyperspectral images. We examine the distances and angles of class means, the variability of classes, and their low-dimensional structures. These statistics are compared to those of raw features, and our results provide insight into the vastly different properties of these two methods. We also explore a connection with the newly observed deep learning phenomenon of neural collapse.
# Brain Programming Is Immune to Adversarial Attacks: Towards Accurate and Robust Image Classification Using Symbolic Learning

In recent years, the vulnerability of Deep Convolutional Neural Networks (DCNN) to Adversarial Attacks (AA), in the form of small modifications to the input image that are almost invisible to human vision, has raised security concerns and made their predictions untrustworthy. Therefore, it is necessary to provide robustness to adversarial examples in addition to an accurate score when developing a new classifier. In this work, we perform a comparative study of the effects of AA on the complex problem of art media categorization, which involves a sophisticated analysis of features to classify a fine collection of artworks. We tested a prevailing bag of visual words approach from computer vision, four state-of-the-art DCNN models (AlexNet, VGG, ResNet, ResNet101), and the Brain Programming (BP) algorithm. In this study, we analyze the algorithms' performance using accuracy. In addition, we use the accuracy ratio between adversarial examples and clean images to measure robustness. Moreover, we propose a statistical analysis of the confidence of each classifier's predictions to corroborate the results. We confirm that the change in BP's predictions was below 2% using adversarial examples computed with the fast gradient sign method. Also, under the multiple pixel attack, BP obtained four out of seven classes without changes and the rest with a maximum error of 4% in the predictions. Finally, under adversarial patches, BP also obtained four categories without changes and the remaining three classes with a variation of 1%. Additionally, the statistical analysis showed that the confidence of BP's predictions was not significantly different for each pair of clean and perturbed images in every experiment. These results demonstrate BP's robustness against adversarial examples compared to DCNN and handcrafted-feature methods, whose performance on art media classification was compromised by the proposed perturbations.
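
The robustness measure mentioned above, the ratio of accuracy on adversarial examples to accuracy on clean images, and a single fast gradient sign step can be written in a few lines. This is a minimal sketch: the function names are hypothetical, the loss gradient is passed in as a plain list because no model is defined here, and pixel values are assumed to lie in [0, 1].

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def robustness_ratio(clean_preds, adv_preds, labels):
    """Accuracy on adversarial inputs divided by accuracy on clean inputs.

    A value near 1 means the attack barely changed the classifier's accuracy.
    """
    clean_acc = accuracy(clean_preds, labels)
    if clean_acc == 0:
        raise ValueError("clean accuracy is zero; ratio is undefined")
    return accuracy(adv_preds, labels) / clean_acc


def fgsm_step(pixels, loss_grad, epsilon):
    """Fast gradient sign method: move each pixel by epsilon in the
    direction of the sign of the loss gradient, then clip to [0, 1]."""
    def sign(g):
        return (g > 0) - (g < 0)
    return [
        min(max(x + epsilon * sign(g), 0.0), 1.0)
        for x, g in zip(pixels, loss_grad)
    ]
```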
# Multiclass Burn Wound Image Classification Using Deep Convolutional Neural Networks

Millions of people are affected by acute and chronic wounds yearly across the world. Continuous wound monitoring is important for wound specialists to allow more accurate diagnosis and optimization of management protocols. Machine Learning-based classification approaches provide optimal care strategies resulting in more reliable outcomes, cost savings, healing time reduction, and improved patient satisfaction. In this study, we use a deep learning-based method to classify burn wound images into two or three different categories based on the wound conditions. A pre-trained deep convolutional neural network, AlexNet, is fine-tuned using a burn wound image dataset and utilized as the classifier. The classifier's performance is evaluated using classification metrics such as accuracy, precision, and recall as well as confusion matrix. A comparison with previous works that used the same dataset showed that our designed classifier improved the classification accuracy by more than 8%.
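
The evaluation metrics named above (accuracy, precision, recall, and the confusion matrix) are related as in the following sketch. It assumes integer class labels starting at 0 and reports precision and recall per class; the function names are hypothetical and the example makes no claim about the paper's actual evaluation code.

```python
def confusion_matrix(labels, preds, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for y, p in zip(labels, preds):
        matrix[y][p] += 1
    return matrix


def per_class_metrics(matrix):
    """Return overall accuracy plus per-class precision and recall."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    precision, recall = [], []
    for c in range(n):
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        actual = sum(matrix[c])                          # row sum
        precision.append(matrix[c][c] / predicted if predicted else 0.0)
        recall.append(matrix[c][c] / actual if actual else 0.0)
    return accuracy, precision, recall
```

For a three-category wound classifier, the off-diagonal entries of the matrix show which wound conditions are confused with each other, which a single accuracy figure hides.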
# Robust Learning under Clean-Label Attack

We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples into the training set to fool the algorithm into making mistakes on specific test instances at test time. The learning goal is to minimize the attackable rate (the probability mass of attackable test instances), which is more difficult than optimal PAC learning. As we show, any robust algorithm with diminishing attackable rate can achieve the optimal dependence on $\epsilon$ in its PAC sample complexity, i.e., $O(1/\epsilon)$. On the other hand, the attackable rate might be large even for some optimal PAC learners, e.g., SVM for linear classifiers. Furthermore, we show that the class of linear hypotheses is not robustly learnable when the data distribution has zero margin, and is robustly learnable in the case of positive margin but requires sample complexity exponential in the dimension. For a general hypothesis class with bounded VC dimension, if the attacker is limited to adding at most $t>0$ poison examples, the optimal robust learning sample complexity grows almost linearly with $t$.
# Meta-Learning an Inference Algorithm for Probabilistic Programs

We present a meta-algorithm for learning a posterior-inference algorithm for restricted probabilistic programs. Our meta-algorithm takes a training set of probabilistic programs that describe models with observations, and attempts to learn an efficient method for inferring the posterior of a similar program. A key feature of our approach is the use of what we call a white-box inference algorithm that extracts information directly from model descriptions themselves, given as programs in a probabilistic programming language. Concretely, our white-box inference algorithm is equipped with multiple neural networks, one for each type of atomic command in the language, and computes an approximate posterior of a given probabilistic program by analysing individual atomic commands in the program using these networks. The parameters of these networks are then learnt from a training set by our meta-algorithm. Our empirical evaluation for six model classes shows the promise of our approach.
# Automated Machine Learning on Graphs: A Survey

Machine learning on graphs has been extensively studied in both academia and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To solve this critical challenge, automated machine learning (AutoML) on graphs, which combines the strengths of graph machine learning and AutoML, is gaining attention from the research community. Therefore, we comprehensively survey AutoML on graphs in this paper, primarily focusing on hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We further overview libraries related to automated graph machine learning and discuss in depth AutoGL, the first dedicated open-source library for AutoML on graphs. Finally, we share our insights on future research directions for automated graph machine learning. To the best of our knowledge, this paper is the first systematic and comprehensive review of automated machine learning on graphs.
# Self-Supervised Auxiliary Learning for Graph Neural Networks via Meta-Learning

In recent years, graph neural networks (GNNs) have been widely adopted in representation learning of graph-structured data and have provided state-of-the-art performance in various applications such as link prediction and node classification. Simultaneously, self-supervised learning has been studied to some extent to leverage rich unlabeled data in representation learning on graphs. However, employing self-supervision tasks as auxiliary tasks to assist a primary task has been less explored in the literature on graphs. In this paper, we propose a novel self-supervised auxiliary learning framework to effectively learn graph neural networks. Moreover, we are the first to design meta-path prediction as a self-supervised auxiliary task for heterogeneous graphs. Our method learns to learn a primary task with various auxiliary tasks to improve generalization performance. The proposed method identifies an effective combination of auxiliary tasks and automatically balances them to improve the primary task. Our method can be applied to any graph neural network in a plug-in manner without manual labeling or additional data. It can also be extended to any other auxiliary tasks. Our experiments demonstrate that the proposed method consistently improves the performance of link prediction and node classification on heterogeneous graphs.
# Computationally Efficient Wasserstein Loss for Structured Labels

The problem of estimating the probability distribution of labels has been widely studied as a label distribution learning (LDL) problem, whose applications include age estimation, emotion analysis, and semantic segmentation. We propose a tree-Wasserstein distance regularized LDL algorithm, focusing on hierarchical text classification tasks. We propose predicting the entire label hierarchy using neural networks, where the similarity between predicted and true labels is measured using the tree-Wasserstein distance. Through experiments using synthetic and real-world datasets, we demonstrate that the proposed method successfully considers the structure of labels during training, and it compares favorably with the Sinkhorn algorithm in terms of computation time and memory usage.
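The tree-Wasserstein distance mentioned above has a well-known closed form on a tree: for every edge, multiply the edge weight by the absolute difference in probability mass that the two distributions place on the subtree below that edge, and sum over edges. The sketch below illustrates that formula only; it is not the paper's training loss, and the array-based tree encoding (`parent`, `edge_weight`) and the toy hierarchy are assumptions made for the example.

```python
def tree_wasserstein(parent, edge_weight, mu, nu):
    """Tree-Wasserstein distance between two distributions over tree nodes.

    parent[v] is the parent of node v, with node 0 as the root (parent[0] == -1).
    Nodes are assumed to be numbered so that parent[v] < v for every v > 0.
    edge_weight[v] is the weight of the edge between v and parent[v].
    """
    n = len(parent)
    if parent[0] != -1 or any(not 0 <= parent[v] < v for v in range(1, n)):
        raise ValueError("nodes must be ordered so that 0 <= parent[v] < v, with root 0")
    # diff[v] accumulates (mass of mu minus mass of nu) over the subtree rooted at v.
    diff = [m - q for m, q in zip(mu, nu)]
    total = 0.0
    for v in range(n - 1, 0, -1):  # visit children before their parents
        total += edge_weight[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


# Hypothetical label hierarchy: root(0) -> A(1), B(2); A -> a1(3), a2(4); B -> b1(5).
parent = [-1, 0, 0, 1, 1, 2]
edge_weight = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # edge_weight[0] is unused (root)
on_a1 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
on_a2 = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
on_b1 = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
print(tree_wasserstein(parent, edge_weight, on_a1, on_a2))  # 2.0: siblings are close
print(tree_wasserstein(parent, edge_weight, on_a1, on_b1))  # 4.0: different branches
```

The single pass over the nodes is linear in the tree size, which is consistent with the abstract's comparison against the iterative Sinkhorn algorithm in computation time and memory.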
# Secure Bilevel Asynchronous Vertical Federated Learning with Backward Updating

Vertical federated learning (VFL) attracts increasing attention due to the emerging demands of multi-party collaborative modeling and concerns about privacy leakage. In real VFL applications, usually only one party or a subset of the parties holds labels, which makes it challenging for all parties to collaboratively learn the model without privacy leakage. Meanwhile, most existing VFL algorithms are restricted to synchronous computation, which leads to inefficiency in their real-world applications. To address these challenging problems, we propose a novel {\bf VF}L framework integrated with a new {\bf b}ackward updating mechanism and a {\bf b}ilevel asynchronous parallel architecture (VF{${\textbf{B}}^2$}), under which three new algorithms, VF{${\textbf{B}}^2$}-SGD, -SVRG, and -SAGA, are proposed. We derive theoretical convergence rates for these three algorithms under both strongly convex and nonconvex conditions. We also prove the security of VF{${\textbf{B}}^2$} under semi-honest threat models. Extensive experiments on benchmark datasets demonstrate that our algorithms are efficient, scalable and lossless.
# Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing

Recent research has shown that Graph Neural Networks (GNNs) can learn policies for locomotion control that are as effective as a typical multi-layer perceptron (MLP), with superior transfer and multi-task performance (Wang et al., 2018; Huang et al., 2020). Results have so far been limited to training on small agents, with the performance of GNNs deteriorating rapidly as the number of sensors and actuators grows. A key motivation for the use of GNNs in the supervised learning setting is their applicability to large graphs, but this benefit has not yet been realised for locomotion control. We identify the weakness with a common GNN architecture that causes this poor scaling: overfitting in the MLPs within the network that encode, decode, and propagate messages. To combat this, we introduce Snowflake, a GNN training method for high-dimensional continuous control that freezes parameters in parts of the network that suffer from overfitting. Snowflake significantly boosts the performance of GNNs for locomotion control on large agents, now matching the performance of MLPs, and with superior transfer properties.
# A Predictive Maintenance Tool for Non-Intrusive Inspection Systems

Cross-border security is a top priority for societies. Economies lose billions each year due to counterfeiters and other threats. Security checkpoints such as airports, ports, border control and customs authorities are equipped with X-ray security systems, known as Non-Intrusive Inspection Systems (NIIS), and use them to tackle a myriad of threats by inspecting bags, air, land, sea and rail cargo, and vehicles. This reliance on X-ray scanning systems requires that they function continuously, 24/7. Hence their working condition needs to be closely monitored and preemptive actions taken to reduce overall X-ray system downtime. In this paper, we present a predictive maintenance decision support system, abbreviated as PMT4NIIS (Predictive Maintenance Tool for Non-Intrusive Inspection Systems), an augmented analytics platform that provides real-time AI-generated warnings of an upcoming risk of system malfunction that could lead to downtime. The industrial platform is the basis of a 24/7 Service Desk and Monitoring center for the working condition of various X-ray Security Systems.
# A Biased Graph Neural Network Sampler with Near-Optimal Regret

Graph neural networks (GNN) have recently emerged as a vehicle for applying deep network architectures to graph and relational data. However, given the increasing size of industrial datasets, in many practical situations, the message passing computations required for sharing information across GNN layers are no longer scalable. Although various sampling methods have been introduced to approximate full-graph training within a tractable budget, there remain unresolved complications such as high variances and limited theoretical guarantees. To address these issues, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem but with a newly-designed reward function that introduces some degree of bias designed to reduce variance and avoid unstable, possibly-unbounded payouts. And unlike prior bandit-GNN use cases, the resulting policy leads to near-optimal regret while accounting for the GNN training dynamics introduced by SGD. From a practical standpoint, this translates into lower variance estimates and competitive or superior test accuracy across several benchmarks.
# Counterfactual Explanations for Oblique Decision Trees: Exact, Efficient Algorithms

We consider counterfactual explanations, the problem of minimally adjusting features in a source input instance so that it is classified as a target class under a given classifier. This has become a topic of recent interest as a way to query a trained model and suggest possible actions to overturn its decision. Mathematically, the problem is formally equivalent to that of finding adversarial examples, which also has attracted significant attention recently. Most work on either counterfactual explanations or adversarial examples has focused on differentiable classifiers, such as neural nets. We focus on classification trees, both axis-aligned and oblique (having hyperplane splits). Although here the counterfactual optimization problem is nonconvex and nondifferentiable, we show that an exact solution can be computed very efficiently, even with high-dimensional feature vectors and with both continuous and categorical features, and demonstrate it in different datasets and settings. The results are particularly relevant for finance, medicine or legal applications, where interpretability and counterfactual explanations are particularly important.
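For the axis-aligned case, one way to see why an exact solution is cheap is that every leaf of the tree is a box in feature space, so the closest point inside a leaf is obtained by clipping the source instance to that box. The sketch below enumerates the leaves of the target class and keeps the nearest clipped point. It is an illustration under stated assumptions, not the authors' implementation: the tuple-based tree encoding, the Euclidean distance, the `eps` offset used to land strictly on the `>` side of a split, and the toy tree are all choices made for this example. Oblique splits and categorical features, which the paper also covers, are omitted because a leaf is then a general polytope rather than a box.

```python
import math


def predict(node, x):
    """Route x down the tree; a split sends x left when x[feature] <= threshold."""
    while node[0] != "leaf":
        _, feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node[1]


def leaf_boxes(node, lo, hi):
    """Yield (label, lo, hi) per leaf, where the leaf region is lo < x <= hi per feature."""
    if node[0] == "leaf":
        yield node[1], lo, hi
        return
    _, feature, threshold, left, right = node
    if lo[feature] < threshold:          # left region is non-empty
        new_hi = list(hi)
        new_hi[feature] = min(hi[feature], threshold)
        yield from leaf_boxes(left, lo, new_hi)
    if threshold < hi[feature]:          # right region is non-empty
        new_lo = list(lo)
        new_lo[feature] = max(lo[feature], threshold)
        yield from leaf_boxes(right, new_lo, hi)


def counterfactual(tree, x, target, eps=1e-6):
    """Closest point (Euclidean) to x that the tree classifies as target."""
    n = len(x)
    best, best_dist = None, math.inf
    for label, lo, hi in leaf_boxes(tree, [-math.inf] * n, [math.inf] * n):
        if label != target:
            continue
        # Clip x into the box; step eps past an exclusive lower bound.
        candidate = [l + eps if v <= l else min(v, h) for v, l, h in zip(x, lo, hi)]
        dist = math.dist(x, candidate)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best, best_dist


# Hypothetical tree: ("split", feature, threshold, left, right) or ("leaf", label).
tree = ("split", 0, 5.0,
        ("split", 1, 2.0, ("leaf", "reject"), ("leaf", "approve")),
        ("leaf", "approve"))
source = [3.0, 1.0]
point, distance = counterfactual(tree, source, "approve")
print(predict(tree, source), "->", predict(tree, point), point, distance)
```

The cost is one clipping operation per leaf, so the search scales with the number of leaves and the feature dimension rather than requiring an iterative optimizer.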
# Extreme Volatility Prediction in the Stock Market: When GameStop Meets Long Short-Term Memory Networks

The beginning of 2021 saw a surge in volatility for certain stocks such as GameStop (ticker GME on the NYSE). GameStop stock increased around 10-fold from its decade-long average to its peak at \$485. In this paper, we hypothesize that a buy-and-hold strategy can be outperformed in the presence of extreme volatility by predicting and trading consolidation breakouts. We investigate GME stock for its volatility and compare it to SPY as a benchmark (since it is a less volatile ETF) from February 2002 to February 2021. For strategy 1, we develop a Long Short-term Memory (LSTM) Neural Network to predict stock prices recurrently with a very short look-ahead period in the presence of extreme volatility. For strategy 2, we develop an LSTM autoencoder network specifically designed to trade only on consolidation breakouts after predicting anomalies in the stock price. When back-tested in our simulations, strategy 1 executes 863 trades for SPY and 452 trades for GME. Strategy 2 executes 931 trades for SPY and 325 trades for GME. We compare both strategies to buying and holding one single share for the period that we picked as a benchmark. In our simulations, SPY returns \$281.160 from buying and holding one single share, \$110.29 from strategy 1 with 53.5% success rate and \$4.34 from strategy 2 with 57.6% success rate. GME returns \$45.63 from buying and holding one single share, \$69.046 from strategy 1 with 47.12% success rate and \$2.10 from strategy 2 with 48% success rate. Overall, buying and holding outperforms all deep-learning assisted prediction models in our study except when the LSTM-based prediction model (strategy 1) is applied to GME. We hope that our study sheds more light on the field of extreme volatility prediction based on LSTMs to outperform a buy-and-hold strategy.
# SWIS -- Shared Weight bIt Sparsity for Efficient Neural Network Acceleration

Quantization is spearheading the increase in performance and efficiency of neural network computing systems making headway into commodity hardware. We present SWIS - Shared Weight bIt Sparsity, a quantization framework for efficient neural network inference acceleration delivering improved performance and storage compression through an offline weight decomposition and scheduling algorithm. SWIS can achieve up to 54.3% (19.8%) point accuracy improvement compared to weight truncation when quantizing MobileNet-v2 to 4 (2) bits post-training (with retraining), showing the strength of leveraging shared bit-sparsity in weights. The SWIS accelerator gives up to 6x speedup and 1.9x energy improvement over state-of-the-art bit-serial architectures.
# Knowledge-Guided Dynamic Systems Modeling: A Case Study on Modeling River Water Quality

Modeling real-world phenomena is a focus of many science and engineering efforts, such as ecological modeling and financial forecasting, to name a few. Building an accurate model for complex and dynamic systems improves understanding of underlying processes and leads to resource efficiency. Towards this goal, knowledge-driven modeling builds a model based on human expertise, yet is often suboptimal. At the opposite extreme, data-driven modeling learns a model directly from data, requiring extensive data and potentially generating overfitting. We focus on an intermediate approach, model revision, in which prior knowledge and data are combined to achieve the best of both worlds. In this paper, we propose a genetic model revision framework based on tree-adjoining grammar (TAG) guided genetic programming (GP), using the TAG formalism and GP operators in an effective mechanism to incorporate prior knowledge and make data-driven revisions in a way that complies with prior knowledge. Our framework is designed to address the high computational cost of evolutionary modeling of complex systems. Via a case study on the challenging problem of river water quality modeling, we show that the framework efficiently learns an interpretable model, with higher modeling accuracy than existing methods.
# Blockchain-Based Federated Learning in Mobile Edge Networks with Application to the Internet of Vehicles

The rapid increase of data scale in the Internet of Vehicles (IoV) paradigm opens up new possibilities for boosting the service quality of emerging applications through data sharing. Nevertheless, privacy concerns are a major bottleneck for data providers sharing private data in traditional IoV networks. To this end, federated learning (FL), an emerging learning paradigm in which data providers send only local model updates trained on their local raw data rather than uploading any raw data, has recently been proposed to build privacy-preserving data sharing models. Unfortunately, by analyzing the differences between local model updates uploaded by data providers, private information can still be divulged, and the performance of the system cannot be guaranteed when some federated nodes behave maliciously. Additionally, traditional cloud-based FL faces growing communication overhead with the rapid increase of terminal equipment in IoV systems. All these issues inspire us to propose, in this paper, an autonomous blockchain-empowered privacy-preserving FL framework in which mobile edge computing (MEC) technology is naturally integrated into the IoV system.
# Deep Learning with a Classifier System: Initial Results

This article presents the first results from using a learning classifier system capable of performing adaptive computation with deep neural networks. Individual classifiers within the population are composed of two neural networks. The first acts as a gating or guarding component, which enables the conditional computation of an associated deep neural network on a per-instance basis. Self-adaptive mutation is applied upon reproduction, and prediction networks are refined with stochastic gradient descent during lifetime learning. The use of fully-connected and convolutional layers is evaluated on handwritten digit recognition tasks where evolution adapts (i) the gradient descent learning rate applied to each layer; (ii) the number of units within each layer, i.e., the number of fully-connected neurons and the number of convolutional kernel filters; (iii) the connectivity of each layer, i.e., whether each weight is active; and (iv) the weight magnitudes, enabling escape from local optima. The system automatically reduces the number of weights and units while maintaining performance after achieving a maximum prediction error.
# Roosterize: Suggesting Lemma Names for Coq Verification Projects Using Deep Learning

Naming conventions are an important concern in large verification projects using proof assistants, such as Coq. In particular, lemma names are used by proof engineers to effectively understand and modify Coq code. However, providing accurate and informative lemma names is a complex task, which is currently often carried out manually. Even when lemma naming is automated using rule-based tools, generated names may fail to adhere to important conventions not specified explicitly. We demonstrate a toolchain, dubbed Roosterize, which automatically suggests lemma names in Coq projects. Roosterize leverages a neural network model trained on existing Coq code, thus avoiding manual specification of naming conventions. To allow proof engineers to conveniently access suggestions from Roosterize during Coq project development, we integrated the toolchain into the popular Visual Studio Code editor. Our evaluation shows that Roosterize substantially outperforms strong baselines for suggesting lemma names and is useful in practice. The demo video for Roosterize can be viewed at: https://youtu.be/HZ5ac7Q14rc.
# Single-Shot Motion Completion with Transformer

Motion completion is a challenging and long-discussed problem, which is of great significance in film and game applications. For different motion completion scenarios (in-betweening, in-filling, and blending), most previous methods deal with the completion problems with case-by-case designs. In this work, we propose a simple but effective method to solve multiple motion completion problems under a unified framework and achieve new state-of-the-art accuracy under multiple evaluation settings. Inspired by the recent great success of attention-based models, we treat completion as a sequence-to-sequence prediction problem. Our method consists of two modules: a standard transformer encoder with self-attention that learns long-range dependencies of input motions, and a trainable mixture embedding module that models temporal information and discriminates key-frames. Our method can run in a non-autoregressive manner and predict multiple missing frames within a single forward propagation in real time. Finally, we show the effectiveness of our method in music-dance applications.
# A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN

In autonomous driving, perceiving the driving behaviors of surrounding agents is important for the ego-vehicle to make a reasonable decision. In this paper, we propose a neural network model based on trajectory information for driving behavior recognition. Unlike existing trajectory-based methods that recognize the driving behavior using hand-crafted features or by directly encoding the trajectory, our model involves a Multi-Scale Convolutional Neural Network (MSCNN) module to automatically extract high-level features that are intended to encode rich spatial and temporal information. Given a trajectory sequence of an agent as the input, the Bi-directional Long Short Term Memory (Bi-LSTM) module and the MSCNN module first each process the input, generating two features, and the two features are then fused to classify the behavior of the agent. We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
# Representation Learning for Event-Based Visuomotor Policies

Event-based cameras are dynamic vision sensors that can provide asynchronous measurements of changes in per-pixel brightness at a microsecond level. This makes them significantly faster than conventional frame-based cameras, and an appealing choice for high-speed navigation. While an interesting sensor modality, this asynchronous data poses a challenge for common machine learning techniques. In this paper, we present an event variational autoencoder for unsupervised representation learning from asynchronous event camera data. We show that it is feasible to learn compact representations from spatiotemporal event data to encode the context. Furthermore, we show that such pretrained representations can be beneficial for navigation, allowing for usage in reinforcement learning instead of end-to-end reward driven perception. We validate this framework of learning visuomotor policies by applying it to an obstacle avoidance scenario in simulation. We show that representations learnt from event data enable training fast control policies that can adapt to different control capacities, and demonstrate a higher degree of robustness than end-to-end learning from event images.
# CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation

Navigation guided by natural language instructions is particularly suitable for Domestic Service Robots that interact naturally with users. This task involves the prediction of a sequence of actions that leads to a specified destination given a natural language navigation instruction. The task thus requires the understanding of instructions, such as "Walk out of the bathroom and wait on the stairs that are on the right". Visual and Language Navigation remains challenging, notably because it requires exploring the environment and accurately following the path specified by the instructions in order to model the relationship between language and vision. To address this, we propose the CrossMap Transformer network, which encodes the linguistic and visual features to sequentially generate a path. The CrossMap Transformer is tied to a Transformer-based speaker that generates navigation instructions. The two networks share common latent features for mutual enhancement through a double back-translation model: generated paths are translated into instructions, while generated instructions are translated into paths. The experimental results show the benefits of our approach in terms of instruction understanding and instruction generation.
# Fool Me Once: Robust Selective Segmentation via Out-of-Distribution Detection with Contrastive Learning

In this work, we train a network to simultaneously perform segmentation and pixel-wise Out-of-Distribution (OoD) detection, such that the segmentation of unknown regions of scenes can be rejected. This is made possible by leveraging an OoD dataset with a novel contrastive objective and data augmentation scheme. By combining data including unknown classes in the training data, a more robust feature representation can be learned with known classes represented distinctly from those unknown. When presented with unknown classes or conditions, many current approaches for segmentation frequently exhibit high confidence in their inaccurate segmentations and cannot be trusted in many operational environments. We validate our system on a real-world dataset of unusual driving scenes, and show that by selectively segmenting scenes based on what is predicted as OoD, we can increase the segmentation accuracy by an IoU of 0.2 with respect to alternative techniques.
# LADMM-Net: An Unrolled Deep Network for Spectral Image Fusion from Compressive Data

Hyperspectral (HS) and multispectral (MS) image fusion aims at estimating a high-resolution spectral image from a low-spatial-resolution HS image and a low-spectral-resolution MS image. Compressive spectral imaging (CSI) has emerged as an acquisition framework that captures the relevant information of spectral images using a reduced number of snapshots. Various spectral image fusion methods from multi-sensor CSI measurements have been proposed. Nevertheless, these methods exhibit high running times and face the drawback of choosing a representation transform. In this work, a deep learning architecture under the algorithm unrolling approach is proposed for solving the fusion problem from HS and MS compressive measurements. This architecture, dubbed LADMM-Net, casts each iteration of a linearized version of the alternating direction method of multipliers into a processing layer, and the concatenation of these layers forms a deep network. The linearized approach makes it possible to estimate the target variable without resorting to expensive matrix operations. This approach also estimates the image high-frequency component included in both the auxiliary variable and the Lagrange multiplier. The performance of the proposed technique is evaluated on two spectral image databases and one dataset captured in the laboratory. Extensive simulations show that the proposed method outperforms state-of-the-art approaches that fuse spectral images from compressive data.
# Recognition of Feasible Regions by Collaborating Aerial and Ground Robots via DPCN

Ground robots often collide with obstacles because they can sense danger and take action only once they are close to the obstacles, which is usually too late to avoid a crash and causes severe damage to the robots. To address this issue, we present a collaboration of aerial and ground robots for the recognition of feasible regions. Taking advantage of the aerial robot's large variation in viewpoints of the route that the ground robot is on, the collaboration provides the ground robot with global road segmentation information, enabling it to obtain the feasible region and adjust its pose ahead of time. Under normal circumstances, the transformation between the two devices can be obtained from GPS, but with considerable error, which directly degrades the recognition of the feasible region. We therefore use a state-of-the-art method for matching heterogeneous sensor measurements, the deep phase correlation network (DPCN), which performs very well on heterogeneous mapping, to refine the transformation. The network is lightweight and promising for better generalization. We use the Aero-Ground dataset, which consists of heterogeneous sensor images and aerial road segmentation images. The results show that our collaborative system achieves high accuracy, speed and stability.
# A Deep Emulator for Secondary Motion of 3D Characters

Fast and light-weight methods for animating 3D characters are desirable in various applications such as computer games. We present a learning-based approach to enhance skinning-based animations of 3D characters with vivid secondary motion effects. We design a neural network that encodes each local patch of a character simulation mesh where the edges implicitly encode the internal forces between the neighboring vertices. The network emulates the ordinary differential equations of the character dynamics, predicting new vertex positions from the current accelerations, velocities and positions. Being a local method, our network is independent of the mesh topology and generalizes to arbitrarily shaped 3D character meshes at test time. We further represent per-vertex constraints and material properties such as stiffness, enabling us to easily adjust the dynamics in different parts of the mesh. We evaluate our method on various character meshes and complex motion sequences. Our method can be over 30 times more efficient than ground-truth physically based simulation, and outperforms alternative solutions that provide fast approximations.
# Hierarchical and Partially Observable Goal-Driven Policy Learning with a Goals Relational Graph

We present a novel two-layer hierarchical reinforcement learning approach equipped with a Goals Relational Graph (GRG) for tackling partially observable goal-driven tasks, such as goal-driven visual navigation. Our GRG captures the underlying relations of all goals in the goal space through a Dirichlet-categorical process that helps: 1) the high-level network to raise a sub-goal towards achieving a designated final goal; 2) the low-level network to move towards an optimal policy; and 3) the overall system to generalize to unseen environments and goals. We evaluate our approach in two settings of partially observable goal-driven tasks -- a grid-world domain and a robotic object search task. Our experimental results show that our approach exhibits superior generalization performance on both unseen environments and new goals.
# Fast Threshold Optimization for Multi-Label Audio Tagging Using Surrogate Gradient Learning

Multi-label audio tagging consists of assigning sets of tags to audio recordings. At inference time, thresholds are applied to the confidence scores output by a probabilistic classifier in order to decide which classes are detected as active. In this work, we assume that a trained classifier is available, and we seek to automatically optimize the decision thresholds according to a performance metric of interest, in our case F-measure (micro-F1). We propose a new method, called SGL-Thresh for Surrogate Gradient Learning of Thresholds, that makes use of gradient descent. Since F1 is not differentiable, we propose to approximate the gradients of the thresholding operation with the gradients of a sigmoid function. We report experiments on three datasets, using state-of-the-art pre-trained deep neural networks. In all cases, SGL-Thresh outperformed three other approaches: a default threshold value (defThresh), a heuristic search algorithm and a method estimating F1 gradients numerically. It reached 54.9% F1 on AudioSet eval, compared to 50.7% with defThresh. SGL-Thresh is very fast and scalable to a large number of tags. To facilitate reproducibility, data and source code in PyTorch are available online: https://github.com/topel/SGL-Thresh
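The idea of using a sigmoid's gradient in place of the non-differentiable thresholding step can be illustrated with a small pure-Python sketch. This is not the released SGL-Thresh code (which is in PyTorch): the function names, the sigmoid `slope`, the learning rate, the toy scores, and the choice to keep the best thresholds seen so far are all assumptions of this example. The forward pass uses hard decisions to compute micro-F1; the backward pass treats each decision as if it were `sigmoid(slope * (score - threshold))`.

```python
import math


def micro_f1(scores, labels, thresholds):
    """Micro-averaged F1 after applying one threshold per class."""
    tp = fp = fn = 0
    for s_row, y_row in zip(scores, labels):
        for s, y, t in zip(s_row, y_row, thresholds):
            p = 1 if s > t else 0
            tp += p * y
            fp += p * (1 - y)
            fn += (1 - p) * y
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def surrogate_gradient(scores, labels, thresholds, slope=10.0):
    """d(micro-F1)/d(threshold) per class, with a sigmoid derivative as surrogate."""
    tp = total = 0
    for s_row, y_row in zip(scores, labels):
        for s, y, t in zip(s_row, y_row, thresholds):
            p = 1 if s > t else 0
            tp += p * y
            total += p + y               # equals 2*TP + FP + FN
    grads = [0.0] * len(thresholds)
    if total == 0:
        return grads
    for s_row, y_row in zip(scores, labels):
        for c, (s, y, t) in enumerate(zip(s_row, y_row, thresholds)):
            df_dp = (2 * y * total - 2 * tp) / total ** 2   # F1 = 2*TP / total
            sig = 1.0 / (1.0 + math.exp(-slope * (s - t)))
            dp_dt = -slope * sig * (1.0 - sig)              # surrogate for the step
            grads[c] += df_dp * dp_dt
    return grads


def optimize_thresholds(scores, labels, init=0.5, lr=0.05, steps=100, slope=10.0):
    """Gradient ascent on micro-F1; returns the best thresholds encountered."""
    thresholds = [init] * len(scores[0])
    best, best_f1 = list(thresholds), micro_f1(scores, labels, thresholds)
    for _ in range(steps):
        grads = surrogate_gradient(scores, labels, thresholds, slope)
        thresholds = [t + lr * g for t, g in zip(thresholds, grads)]
        f1 = micro_f1(scores, labels, thresholds)
        if f1 > best_f1:
            best, best_f1 = list(thresholds), f1
    return best, best_f1


# Hypothetical scores in [0, 1] for three clips and two tags.
scores = [[0.9, 0.2], [0.4, 0.8], [0.3, 0.1]]
labels = [[1, 0], [1, 1], [0, 0]]
print(micro_f1(scores, labels, [0.5, 0.5]))
print(optimize_thresholds(scores, labels))
```

Scores are assumed to lie in [0, 1], so the argument of `math.exp` stays small; with unbounded scores a numerically safer sigmoid would be needed.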
# Federated Learning without Revealing the Decision Boundaries

We consider recent privacy-preserving methods that train models not on original images, but on mixed images that look like noise and are hard to trace back to the original images. We explain that those mixed images will be samples on the decision boundaries of the trained model, and although such methods successfully hide the contents of images from the entity in charge of federated learning, they provide crucial information to that entity about the decision boundaries of the trained model. Once the entity has exact samples on the decision boundaries of the model, it may use them for effective adversarial attacks on the model during training and/or afterwards. If we have to hide our images from that entity, how can we trust it with the decision boundaries of our model? As a remedy, we propose a method to encrypt the images and to have a decryption module hidden inside the model. The entity in charge of federated learning will only have access to a set of complex-valued coefficients, but the model will first decrypt the images and then put them through the convolutional layers. This way, the entity will not see the training images and will not know the location of the decision boundaries of the model.
# Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models

As the need for edge computing grows, many modern consumer devices now contain edge machine learning (ML) accelerators that can compute a wide range of neural network (NN) models while still fitting within tight resource constraints. We analyze a commercial Edge TPU using 24 Google edge NN models (including CNNs, LSTMs, transducers, and RCNNs), and find that the accelerator suffers from three shortcomings, in terms of computational throughput, energy efficiency, and memory access handling. We comprehensively study the characteristics of each NN layer in all of the Google edge models, and find that these shortcomings arise from the one-size-fits-all approach of the accelerator, as there is a high amount of heterogeneity in key layer characteristics both across different models and across different layers in the same model. We propose a new acceleration framework called Mensa. Mensa incorporates multiple heterogeneous ML edge accelerators (including both on-chip and near-data accelerators), each of which caters to the characteristics of a particular subset of models. At runtime, Mensa schedules each layer to run on the best-suited accelerator, accounting for both efficiency and inter-layer dependencies. As we analyze the Google edge NN models, we discover that all of the layers naturally group into a small number of clusters, which allows us to design an efficient implementation of Mensa for these models with only three specialized accelerators. Averaged across all 24 Google edge models, Mensa improves energy efficiency and throughput by 3.0x and 3.1x over the Edge TPU, and by 2.4x and 4.3x over Eyeriss v2, a state-of-the-art accelerator.
# Online Anomaly Detection Using Statistical Leverage for Streaming Business Process Events

While several techniques for detecting trace-level anomalies in event logs in offline settings have appeared recently in the literature, such techniques are currently lacking for online settings. Event log anomaly detection in online settings can be crucial for discovering anomalies in process execution as soon as they occur and, consequently, for allowing corrective actions to be taken promptly. This paper describes a novel approach to event log anomaly detection on event streams that uses statistical leverage. Leverage has been used extensively in statistics to develop measures that identify outliers, and it is adapted in this paper to the specific scenario of event stream data. The proposed approach has been evaluated on both artificial and real event streams.
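For readers unfamiliar with statistical leverage, the sketch below computes classical leverage scores, i.e. the diagonal of the hat matrix H = X (XᵀX)⁺ Xᵀ. The abstract does not say how the paper adapts leverage to event streams, so the encoding of traces as activity-count vectors and the helper names are assumptions made only for illustration.

```python
import numpy as np

def encode_traces(traces, activities):
    """Hypothetical encoding: one row per trace, one activity-count column per activity."""
    index = {a: j for j, a in enumerate(activities)}
    X = np.zeros((len(traces), len(activities)))
    for i, trace in enumerate(traces):
        for activity in trace:
            X[i, index[activity]] += 1.0
    return X

def leverage_scores(X, tol=1e-10):
    """Diagonal of the hat matrix, computed from the left singular vectors of X."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    if s.size == 0 or s.max() == 0.0:
        return np.zeros(X.shape[0])
    U = U[:, s > tol * s.max()]       # keep directions that span the column space
    return np.sum(U**2, axis=1)       # squared row norms = leverage of each trace
```

With nine identical traces `["a", "b", "c"]` and one trace `["a", "z", "z", "z"]`, the unusual trace gets leverage 1 and each common trace gets 1/9; the scores always sum to the rank of the encoded matrix.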
# CARMI: A Cache-Aware Learned Index with a Cost-Based Construction Algorithm

Learned indexes, which use machine learning models to replace traditional index structures, have shown promising results in recent studies. However, our understanding of this new type of index structure is still at an early stage, with many details that need to be carefully examined and improved. In this paper, we propose a cache-aware learned index (CARMI) design to improve the efficiency of the Recursive Model Index (RMI) framework proposed by Kraska et al., together with a cost-based construction algorithm to construct the optimal indexes in a wide variety of application scenarios. We formulate the problem of finding the optimal design of a learned index as an optimization problem, and propose a dynamic programming algorithm for solving it and a partial greedy step to speed it up. Experiments show that our index construction strategy can construct indexes with significantly better performance than baselines under various data distributions and workload requirements. Among them, CARMI can obtain an average 2.52X speedup compared to a B-tree, while using only about 0.56X of the B-tree's memory space on average.
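The basic idea of a learned index can be sketched with a single linear model that predicts a key's position in a sorted array and then corrects the guess with a bounded local search. This is a minimal illustration only: the class name is hypothetical, it assumes at least two distinct keys, and it shows neither CARMI's cache-aware node layout nor its cost-based construction.

```python
import numpy as np

class LinearLearnedIndex:
    """One linear model over sorted keys, plus an error-bounded search window."""

    def __init__(self, keys):
        self.keys = np.sort(np.asarray(keys, dtype=float))
        positions = np.arange(len(self.keys))
        # Fit position ~ slope * key + intercept (least squares).
        self.slope, self.intercept = np.polyfit(self.keys, positions, 1)
        predicted = np.array([self._predict(k) for k in self.keys])
        # Largest prediction error on the stored keys bounds the search window.
        self.max_err = int(np.max(np.abs(predicted - positions)))

    def _predict(self, key):
        guess = int(round(self.slope * key + self.intercept))
        return min(max(guess, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        p = self._predict(key)
        lo = max(0, p - self.max_err)
        hi = min(len(self.keys), p + self.max_err + 1)
        i = lo + int(np.searchsorted(self.keys[lo:hi], key))
        if i < hi and self.keys[i] == key:
            return i
        return None
```

The better the model fits the key distribution, the smaller `max_err` becomes and the fewer elements each lookup has to touch, which is the source of the speedups that learned indexes aim for.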
# Diverse Critical Interaction Generation for Planning and Planner Evaluation

Generating diverse and comprehensive interacting agents to evaluate the decision-making modules of autonomous vehicles (AVs) is essential for safe and robust planning. Due to efficiency and safety concerns, most researchers choose to train adversary agents in simulators and generate test cases to interact with the evaluated AVs. However, most existing methods fail to provide both natural and critical interaction behaviors in various traffic scenarios. To tackle this problem, we propose a styled generative model, RouteGAN, that generates diverse interactions by controlling the vehicles separately with desired styles. By altering its style coefficients, the model can generate trajectories with different safety levels and serve as an online planner. Experiments show that our model can generate diverse interactions in various scenarios. We evaluate different planners with our model by testing their collision rate in interaction with RouteGAN planners of multiple critical levels.
# Computing the Information Content of Trained Neural Networks

How much information does a learning algorithm extract from the training data and store in a neural network's weights? Too much, and the network would overfit to the training data. Too little, and the network would not fit to anything at all. Naïvely, the amount of information the network stores should scale in proportion to the number of trainable weights. This raises the question: how can neural networks with vastly more weights than training data still generalise? A simple resolution to this conundrum is that the number of weights is usually a bad proxy for the actual amount of information stored. For instance, typical weight vectors may be highly compressible. Then another question occurs: is it possible to compute the actual amount of information stored? This paper derives both a consistent estimator and a closed-form upper bound on the information content of infinitely wide neural networks. The derivation is based on an identification between neural information content and the negative log probability of a Gaussian orthant. This identification yields bounds that analytically control the generalisation behaviour of the entire solution space of infinitely wide networks. The bounds have a simple dependence on both the network architecture and the training data. Corroborating the findings of Valle-Pérez et al. (2019), who conducted a similar analysis using approximate Gaussian integration techniques, the bounds are found to be both non-vacuous and correlated with the empirical generalisation behaviour at finite width.
# A Survey on Variational Autoencoders from a GreenAI Perspective

Variational AutoEncoders (VAEs) are powerful generative models that merge elements from statistics and information theory with the flexibility offered by deep neural networks to efficiently solve the generation problem for high-dimensional data. The key insight of VAEs is to learn the latent distribution of data in such a way that new meaningful samples can be generated from it. This approach has led to extensive research and many variations in the architectural design of VAEs, nourishing the recent field of research known as unsupervised representation learning. In this article, we provide a comparative evaluation of some of the most successful recent variations of VAEs. We focus the analysis in particular on the energy efficiency of the different models, in the spirit of the so-called Green AI, aiming to reduce both the carbon footprint and the financial cost of generative techniques. For each architecture we provide its mathematical formulation, the ideas underlying its design, a detailed model description, a running implementation, and quantitative results.
# Deep Unfolded Recovery of Sub-Nyquist Sampled Ultrasound Images

The most common technique for generating B-mode ultrasound (US) images is delay and sum (DAS) beamforming, where the signals received at the transducer array are sampled before an appropriate delay is applied. This necessitates sampling rates exceeding the Nyquist rate and the use of a large number of antenna elements to ensure sufficient image quality. Recently we proposed methods to reduce the sampling rate and the array size, relying on image recovery using iterative algorithms based on the compressed sensing (CS) and finite rate of innovation (FRI) frameworks. Iterative algorithms typically require a large number of iterations, making them difficult to use in real time. Here, we propose a reconstruction method from sub-Nyquist samples in the time and spatial domains that is based on unfolding the ISTA algorithm, resulting in an efficient and interpretable deep network. The inputs to our network are the subsampled beamformed signals after summation and delay in the frequency domain, requiring only a subset of the US signal to be stored for recovery. Our method allows reducing the number of array elements, the sampling rate, and the computational time while ensuring high-quality imaging performance. Using in vivo data, we demonstrate that the proposed method yields high-quality images while reducing the data volume traditionally used by up to 36 times. In terms of image resolution and contrast, our technique outperforms previously suggested methods as well as DAS and minimum-variance (MV) beamforming, paving the way to real-time applicable recovery methods.
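As background on what "unfolding the ISTA algorithm" means, the sketch below runs a fixed number of ISTA iterations written as network-like layers for the problem min_x 0.5‖Ax − y‖² + λ‖x‖₁. It is a real-valued toy under stated assumptions: the matrices `W` and `S` and the threshold `tau` are fixed from `A` here, whereas in an unfolded network they would typically be trained per layer. The paper's actual architecture and its frequency-domain inputs are not reproduced.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def unfolded_ista(y, A, n_layers=50, lam=0.1):
    """Run n_layers ISTA steps; each step has the form of one network layer."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the data-term gradient
    W = A.T / L                              # input weights (trainable in a learned variant)
    S = np.eye(A.shape[1]) - A.T @ A / L     # recurrent weights (trainable in a learned variant)
    tau = lam / L                            # threshold (trainable in a learned variant)
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(S @ x + W @ y, tau)
    return x
```

Fixing the number of layers in advance is what turns the iterative solver into a feed-forward computation with a predictable run time, which is the property that makes unfolding attractive for real-time use.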
# Can Machine Learning Catch the COVID-19 Recession?

Based on evidence gathered from a newly built large macroeconomic data set for the UK, labeled UK-MD and comparable to similar datasets for the US and Canada, it seems the most promising avenue for forecasting during the pandemic is to allow for general forms of nonlinearity by using machine learning (ML) methods. But not all nonlinear ML methods are alike. For instance, some cannot extrapolate (like regular trees and forests) and some can (when complemented with linear dynamic components). This and other crucial aspects of ML-based forecasting in unprecedented times are studied in an extensive pseudo-out-of-sample exercise.
# Maximal Function Pooling with Applications

Inspired by the Hardy-Littlewood maximal function, we propose a novel pooling strategy which is called maxfun pooling. It is presented both as a viable alternative to some of the most popular pooling functions, such as max pooling and average pooling, and as a way of interpolating between these two algorithms. We demonstrate the features of maxfun pooling with two applications: first in the context of convolutional sparse coding, and then for image classification.
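One way to see how a pooling rule inspired by the Hardy-Littlewood maximal function can interpolate between max and average pooling is the 1D sketch below. The abstract does not give the paper's definition, so everything here is an assumption: within each pooling window, the output is the largest mean over contiguous sub-windows of length at least `min_len`. Inputs are assumed non-negative (for example, post-ReLU activations); the classical maximal function would use absolute values.

```python
import numpy as np

def maxfun_pool_1d(x, window=4, min_len=2):
    """Hypothetical maxfun-style pooling: max of sub-window means in each window.

    min_len=1 reduces to max pooling; min_len=window reduces to average pooling.
    """
    x = np.asarray(x, dtype=float)
    n_out = len(x) // window
    out = np.empty(n_out)
    for i in range(n_out):
        block = x[i * window:(i + 1) * window]
        csum = np.concatenate(([0.0], np.cumsum(block)))  # prefix sums of the block
        best = -np.inf
        for length in range(min_len, window + 1):
            # Means of all contiguous sub-windows with this length.
            means = (csum[length:] - csum[:-length]) / length
            best = max(best, means.max())
        out[i] = best
    return out
```

For the input `[1, 3, 2, 0, 4, 4, 0, 0]` with `window=4`, the outputs are `[3, 4]` for `min_len=1`, `[2.5, 4]` for `min_len=2`, and `[1.5, 2]` for `min_len=4`, so the result always lies between the average and the maximum of each window.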
# SmartON: Just-in-Time Active Event Detection on Energy Harvesting Systems

We propose SmartON, a batteryless system that learns to wake up proactively at the right moment in order to detect events of interest. It does so by adapting the duty cycle to match the distribution of event arrival times under the constraints of harvested energy. While existing energy harvesting systems either wake up periodically at a fixed rate to sense and process the data, or wake up only in accordance with the availability of the energy source, SmartON employs a three-phase learning framework to learn the energy harvesting pattern as well as the pattern of events at run-time, and uses that knowledge to wake itself up when events are most likely to occur. The three-phase learning framework enables rapid adaptation to environmental changes in both the short and long term. Being able to remain asleep more often than a CTID (charging-then-immediate-discharging) wake-up system and to adapt to the event pattern, SmartON is able to reduce energy waste, increase energy efficiency, and capture more events. To realize SmartON we have developed a dedicated hardware platform whose power management module activates capacitors on the fly to dynamically increase its storage capacitance. We conduct both simulation-driven and real-system experiments to demonstrate that SmartON captures 1X to 7X more events and is 8X to 17X more energy-efficient than a CTID system.
# Error Estimates for the Variational Training of Neural Networks with Boundary Penalty

We establish estimates on the error made by the Ritz method for quadratic energies on the space $H^1(\Omega)$ in the approximation of the solution of variational problems with different boundary conditions. Special attention is paid to the case of Dirichlet boundary values, which are treated with the boundary penalty method. We consider arbitrary and in general nonlinear classes $V\subseteq H^1(\Omega)$ of ansatz functions and estimate the error as a function of the optimisation accuracy, the approximation capabilities of the ansatz class and, in the case of Dirichlet boundary values, the penalisation strength $\lambda$. For non-essential boundary conditions the error of the Ritz method decays with the same rate as the approximation rate of the ansatz classes. For the boundary penalty method we obtain that, given an approximation rate of $r$ in $H^1(\Omega)$ and an approximation rate of $s$ in $L^2(\partial\Omega)$ of the ansatz classes, the optimal decay rate of the estimated error is $\min(s/2, r) \in [r/2, r]$ and is achieved by choosing $\lambda_n\sim n^{s}$. We discuss how this rate can be improved, the relation to existing estimates for finite element functions, as well as the implications for ansatz classes which are given through ReLU networks. Finally, we use the notion of $\Gamma$-convergence to show that the Ritz method converges for a wide class of energies including nonlinear stationary PDEs like the $p$-Laplace.
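To make the boundary penalty method concrete, a model instance can be written down for the Poisson problem with Dirichlet data $g$. This specific energy is an assumed example of the "quadratic energies" the abstract refers to, not a formula quoted from the paper. The second line only spells out the arithmetic behind the stated interval: the rate $\min(s/2, r)$ lies in $[r/2, r]$ exactly when $s \ge r$.

```latex
% Assumed model problem: -\Delta u = f in \Omega, u = g on \partial\Omega.
E_\lambda(u) = \frac{1}{2}\int_\Omega |\nabla u|^2 \,dx
             - \int_\Omega f\,u \,dx
             + \lambda \int_{\partial\Omega} (u - g)^2 \,ds,
\qquad u \in V \subseteq H^1(\Omega).

% Rate arithmetic for \lambda_n \sim n^{s}:
s \ge r \;\Longrightarrow\; \frac{r}{2} \le \min\!\left(\frac{s}{2},\, r\right) \le r,
\qquad \text{e.g. } (r, s) = (1, 1) \mapsto \tfrac{1}{2}, \quad (r, s) = (1, 2) \mapsto 1.
```

Minimising $E_\lambda$ over the ansatz class $V$ enforces the boundary values only approximately, which is why the penalisation strength $\lambda$ enters the error estimate.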
# Information Discrepancy in Strategic Learning

We study a decision-making model where a principal deploys a scoring rule and the agents strategically invest effort to improve their scores. Unlike existing work in the strategic learning literature, we do not assume that the principal's scoring rule is fully known to the agents, and agents may form different estimates of the scoring rule based on their own sources of information. We focus on disparities in outcomes that stem from information discrepancies in our model. To do so, we consider a population of agents who belong to different subgroups, which determine their knowledge about the deployed scoring rule. Agents within each subgroup observe the past scores received by their peers, which allow them to construct an estimate of the deployed scoring rule and to invest their efforts accordingly. The principal, taking into account the agents' behaviors, deploys a scoring rule that maximizes the social welfare of the whole population. We provide a collection of theoretical results that characterize the impact of the welfare-maximizing scoring rules on the strategic effort investments across different subgroups. In particular, we identify sufficient and necessary conditions for when the deployed scoring rule incentivizes optimal strategic investment across all groups for different notions of optimality. Finally, we complement and validate our theoretical analysis with experimental results on the real-world datasets Taiwan-Credit and Adult.
# Reinforcement Learning for Adaptive Mesh Refinement

Large-scale finite element simulations of complex physical systems governed by partial differential equations crucially depend on adaptive mesh refinement (AMR) to allocate computational budget to regions where higher resolution is required. Existing scalable AMR methods make heuristic refinement decisions based on instantaneous error estimation and thus do not aim for long-term optimality over an entire simulation. We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning (RL) to train refinement policies directly from simulation. AMR poses a new problem for RL in that both the state dimension and the available action set change at every step, which we solve by proposing new policy architectures with differing generality and inductive bias. The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations. We demonstrate in comprehensive experiments on static function estimation and the advection of different fields that RL policies can be competitive with a widely used error estimator and generalize to larger, more complex, and unseen test problems.
# Operator Inference of Non-Markovian Terms for Learning Reduced Models from Partially Observed State Trajectories

This work introduces a non-intrusive model reduction approach for learning reduced models from partially observed state trajectories of high-dimensional dynamical systems. The proposed approach compensates for the loss of information due to the partially observed states by constructing non-Markovian reduced models that make future-state predictions based on a history of reduced states, in contrast to traditional Markovian reduced models that rely on the current reduced state alone to predict the next state. The core contributions of this work are a data sampling scheme to sample partially observed states from high-dimensional dynamical systems and a formulation of a regression problem to fit the non-Markovian reduced terms to the sampled states. Under certain conditions, the proposed approach recovers from data the very same non-Markovian terms that one obtains with intrusive methods that require the governing equations and discrete operators of the high-dimensional dynamical system. Numerical results demonstrate that the proposed approach leads to non-Markovian reduced models that are predictive far beyond the training regime. Additionally, in the numerical experiments, the proposed approach learns non-Markovian reduced models from trajectories with only 20% observed state components that are about as accurate as traditional Markovian reduced models fitted to trajectories with 99% observed components.
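The contrast between Markovian and non-Markovian reduced models can be illustrated with a small regression sketch: instead of predicting the next state from the current state alone, a linear model with memory is fitted to a window of past states by least squares. This is a generic illustration under assumed notation (the function name, the purely linear form, and the `memory` parameter are hypothetical); it is not the paper's sampling scheme or its operator-inference formulation.

```python
import numpy as np

def fit_history_model(X, memory):
    """Fit x[k+1] ~= sum_j A_j @ x[k-j], j = 0..memory-1, by least squares.

    X has shape (T, n): one observed (reduced) state per row.
    Returns the list [A_0, ..., A_{memory-1}], each of shape (n, n).
    """
    T, n = X.shape
    # Each regression row stacks the current state and its memory-1 predecessors.
    D = np.asarray([
        np.concatenate([X[k - j] for j in range(memory)])
        for k in range(memory - 1, T - 1)
    ])
    Y = X[memory:]                                   # targets x[k+1]
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)     # shape (memory * n, n)
    return [coef[j * n:(j + 1) * n].T for j in range(memory)]
```

Setting `memory=1` recovers an ordinary Markovian fit; larger values let the model use the state history to compensate for components that were never observed.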
# Gradient Coding with Dynamic Clustering for Distributed Learning

Distributed implementations are crucial in speeding up large-scale machine learning applications. Distributed gradient descent (GD) is widely employed to parallelize the learning task by distributing the dataset across multiple workers. A significant performance bottleneck for the per-iteration completion time in distributed synchronous GD is straggling workers. Coded distributed computation techniques have been introduced recently to mitigate stragglers and to speed up GD iterations by assigning redundant computations to workers. In this paper, we consider gradient coding (GC), and propose a novel dynamic GC scheme, which assigns redundant data to workers to acquire the flexibility to dynamically choose from among a set of possible codes depending on the past straggling behavior. In particular, we consider GC with clustering, and regulate the number of stragglers in each cluster by dynamically forming the clusters at each iteration; hence, the proposed scheme is called GC with dynamic clustering (GC-DC). Under a time-correlated straggling behavior, GC-DC gains from adapting to the straggling behavior over time such that, at each iteration, GC-DC aims at distributing the stragglers across clusters as uniformly as possible based on the past straggler behavior. For both homogeneous and heterogeneous worker models, we numerically show that GC-DC provides significant improvements in the average per-iteration completion time without an increase in the communication load compared to the original GC scheme.
# Natural Language Video Localization: A Revisit in Span-based Question Answering Framework

Natural Language Video Localization (NLVL) aims to locate a target moment from an untrimmed video that semantically corresponds to a text query. Existing approaches mainly solve the NLVL problem from the perspective of computer vision by formulating it as a ranking, anchor, or regression task. These methods suffer from large performance degradation when localizing on long videos. In this work, we address NLVL from a new perspective, i.e., span-based question answering (QA), by treating the input video as a text passage. We propose a video span localizing network (VSLNet), on top of the standard span-based QA framework (named VSLBase), to address NLVL. VSLNet tackles the differences between NLVL and span-based QA through a simple yet effective query-guided highlighting (QGH) strategy. QGH guides VSLNet to search for the matching video span within a highlighted region. To address the performance degradation on long videos, we further extend VSLNet to VSLNet-L by applying a multi-scale split-and-concatenation strategy. VSLNet-L first splits the untrimmed video into short clip segments; then, it predicts which clip segment contains the target moment and suppresses the importance of other segments. Finally, the clip segments are concatenated, with different confidences, to locate the target moment accurately. Extensive experiments on three benchmark datasets show that the proposed VSLNet and VSLNet-L outperform the state-of-the-art methods; VSLNet-L addresses the issue of performance degradation on long videos. Our study suggests that the span-based QA framework is an effective strategy to solve the NLVL problem.
# Understanding Robustness in Teacher-Student Setting: A New Perspective

Adversarial examples have appeared as a ubiquitous property of machine learning models, where a bounded adversarial perturbation can mislead the models into making arbitrarily incorrect predictions. Such examples provide a way to assess the robustness of machine learning models as well as a proxy for understanding the model training process. Extensive studies try to explain the existence of adversarial examples and provide ways to improve model robustness (e.g. adversarial training). While they mostly focus on models trained on datasets with predefined labels, we leverage the teacher-student framework and assume a teacher model, or oracle, to provide the labels for given instances. We extend Tian (2019) to the case of low-rank input data and show that student specialization (a trained student neuron is highly correlated with a certain teacher neuron at the same layer) still happens within the input subspace, but the teacher and student nodes could differ wildly outside the data subspace, which we conjecture leads to adversarial examples. Extensive experiments show that student specialization correlates strongly with model robustness in different scenarios, including students trained via standard training, adversarial training, confidence-calibrated adversarial training, and training with a robust feature dataset. Our studies could shed light on the future exploration of adversarial examples and on enhancing model robustness via principled data augmentation.
# Towards a Unified Framework for Fair and Stable Graph Representation Learning

As the representations output by Graph Neural Networks (GNNs) are increasingly employed in real-world applications, it becomes important to ensure that these representations are fair and stable. In this work, we establish a key connection between counterfactual fairness and stability and leverage it to propose a novel framework, NIFTY (uNIfying Fairness and stabiliTY), which can be used with any GNN to learn fair and stable representations. We introduce a novel objective function that simultaneously accounts for fairness and stability and develop a layer-wise weight normalization using the Lipschitz constant to enhance neural message passing in GNNs. In doing so, we enforce fairness and stability both in the objective function and in the GNN architecture. Further, we show theoretically that our layer-wise weight normalization promotes counterfactual fairness and stability in the resulting representations. We introduce three new graph datasets comprising high-stakes decisions in the criminal justice and financial lending domains. Extensive experimentation with the above datasets demonstrates the efficacy of our framework.
# Structured Prediction for CRiSP Inverse Kinematics Learning with Misspecified Robot Models

With the recent advances in machine learning, problems that traditionally would require accurate modeling to be solved analytically can now be successfully approached with data-driven strategies. Among these, computing the inverse kinematics of a redundant robot arm poses a significant challenge due to the non-linear structure of the robot, the hard joint constraints, and the non-invertible kinematics map. Moreover, most learning algorithms consider a completely data-driven approach, while useful information on the structure of the robot is often available and should be exploited. In this work, we present a simple, yet effective, approach for learning the inverse kinematics. We introduce a structured prediction algorithm that combines a data-driven strategy with the model provided by a forward kinematics function (even when this function is misspecified) to accurately solve the problem. The proposed approach ensures that predicted joint configurations are well within the robot's constraints. We also provide statistical guarantees on the generalization properties of our estimator as well as an empirical evaluation of its performance on trajectory reconstruction tasks.
# FjORD: Fair and Accurate Federated Learning under Heterogeneous Targets with Ordered Dropout

Federated Learning (FL) has been gaining significant traction across different ML tasks, ranging from vision to keyboard predictions. In large-scale deployments, client heterogeneity is a fact and constitutes a primary problem for fairness, training performance, and accuracy. Although significant effort has gone into tackling statistical data heterogeneity, the diversity in the processing capabilities and network bandwidth of clients, termed system heterogeneity, has remained largely unexplored. Current solutions either disregard a large portion of available devices or set a uniform limit on the model's capacity, restricted by the least capable participants. In this work, we introduce Ordered Dropout, a mechanism that achieves an ordered, nested representation of knowledge in Neural Networks and enables the extraction of lower-footprint submodels without the need for retraining. We further show that for linear maps our Ordered Dropout is equivalent to SVD. We employ this technique, along with a self-distillation methodology, in the realm of FL in a framework called FjORD. FjORD alleviates the problem of client system heterogeneity by tailoring the model width to the client's capabilities. Extensive evaluation on both CNNs and RNNs across diverse modalities shows that FjORD consistently leads to significant performance gains over state-of-the-art baselines, while maintaining its nested structure.
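The "ordered, nested" structure described above can be sketched as follows. This is a minimal illustration that assumes a width fraction `p` in (0, 1] and a two-layer ReLU network; the function names are hypothetical, and the way `p` is sampled during training and the self-distillation step are not shown because the abstract does not specify them.

```python
import math
import numpy as np

def ordered_dropout_mask(n_units, p):
    """Keep the first ceil(p * n_units) units and drop all later ones."""
    k = max(1, math.ceil(p * n_units))
    mask = np.zeros(n_units)
    mask[:k] = 1.0
    return mask

def extract_submodel(W1, W2, p):
    """Slice a two-layer MLP (W1: hidden x in, W2: out x hidden) down to width p."""
    k = max(1, math.ceil(p * W1.shape[0]))
    return W1[:k, :], W2[:, :k]

def forward(W1, W2, x, mask=None):
    """ReLU MLP forward pass with an optional mask on the hidden units."""
    h = np.maximum(W1 @ x, 0.0)
    if mask is not None:
        h = h * mask
    return W2 @ h
```

Because units are always dropped from the end, the masks for smaller widths are nested inside those for larger widths, and running the full model with a mask gives the same output as running the sliced submodel. That is what allows a less capable client to hold only the first `k` units without retraining.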
