# Picking a CHERI Allocator: Security and Performance Considerations ( http://arxiv.org/abs/2303.15130v2 ) License: see link | Jacob Bramley, Dejice Jacob, Andrei Lascu, Jeremy Singer, Laurence Tratt | (Summary) Several open-source memory allocators have been ported to CHERI, a hardware capability platform.
This paper examines the security and performance of these allocators when run under CheriBSD on Arm's experimental Morello platform.
Several open-source memory allocators have been ported to CHERI, a hardware capability platform. In this paper we examine the security and performance of these allocators when run under CheriBSD on Arm's experimental Morello platform. We introduce a number of security attacks and show that all but one allocator are vulnerable to some of the attacks - including the default CheriBSD allocator. We then show that while some forms of allocator performance are meaningful, comparing the performance of hybrid and pure capability (i.e. 'running in non-CHERI vs. running in CHERI modes') allocators does not appear to be meaningful. Although we do not fully understand the reasons for this, it seems to be at least as much due to factors such as immature compiler toolchains as it is due to the effects of capabilities on hardware.
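# A Systematic Mapping Study and Practitioner Insights on the Use of Software Engineering Practices to Develop MVPs ( http://arxiv.org/abs/2305.08299v1 ) License: see link | Silvio Alonso, Marcos Kalinowski, Bruna Ferreira, Simone D. J. Barbosa, Helio Lopes | (Summary) [Background] The MVP concept has influenced the way development teams apply software engineering practices.
[Goal] Our goal is to characterize the publication landscape on practices used in the context of software MVPs and to gather practitioner insights on the identified practices.
The practitioners in the focus group sessions reinforced confidence in the results on ideation and evaluation practices, and recognized most of the identified practices.
[Conclusion] Our analysis suggests that there are opportunities for solution proposals and evaluation studies that address gaps in the literature on technical feasibility assessment and effort estimation.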
Cancer is one of the leading causes of death worldwide. It is caused by a variety of genetic mutations, which makes every instance of the disease unique. Since chemotherapy can have extremely severe side effects, each patient requires a personalized treatment plan. Finding the dosages that maximize the beneficial effects of the drugs and minimize their adverse side effects is vital. Deep neural networks automate and improve drug selection. However, they require a lot of data to be trained on. Therefore, there is a need for machine-learning approaches that require less data. Hybrid quantum neural networks were shown to provide a potential advantage in problems where training data availability is limited. We propose a novel hybrid quantum neural network for drug response prediction, based on a combination of convolutional, graph convolutional, and deep quantum neural layers of 8 qubits with 363 layers. We test our model on the reduced Genomics of Drug Sensitivity in Cancer dataset and show that the hybrid quantum model outperforms its classical analog by 15% in predicting IC50 drug effectiveness values. The proposed hybrid quantum machine learning model is a step towards deep quantum data-efficient algorithms with thousands of quantum gates for solving problems in personalized medicine, where data collection is a challenge.
# BRF: eBPF Runtime Fuzzer ( http://arxiv.org/abs/2305.08782v1 ) License: see link | Hsin-Wei Hung and Ardalan Amiri Sani | (Summary) The eBPF technology in the Linux kernel has been widely adopted for applications such as networking, tracing, and security, thanks to the programmability it provides.
This paper introduces the BPF Runtime Fuzzer (BRF), a fuzzer that can satisfy the semantics and dependencies required by the verifier and the eBPF subsystem.
The eBPF technology in the Linux kernel has been widely adopted for different applications, such as networking, tracing, and security, thanks to the programmability it provides. By allowing user-supplied eBPF programs to be executed directly in the kernel, it greatly increases the flexibility and efficiency of deploying customized logic. However, eBPF also introduces a new and wide attack surface: malicious eBPF programs may try to exploit vulnerabilities in the eBPF subsystem in the kernel. Fuzzing is a promising technique to find such vulnerabilities. Unfortunately, our experiments with the state-of-the-art kernel fuzzer, Syzkaller, show that it cannot effectively fuzz the eBPF runtime, i.e., the components in charge of executing an eBPF program, for two reasons. First, the eBPF verifier (which is tasked with verifying the safety of eBPF programs) rejects many fuzzing inputs because (1) they do not comply with its required semantics or (2) they miss some dependencies, i.e., other syscalls that need to be issued before the program is loaded. Second, Syzkaller fails to attach and trigger the execution of eBPF programs most of the time. This paper introduces the BPF Runtime Fuzzer (BRF), a fuzzer that can satisfy the semantics and dependencies required by the verifier and the eBPF subsystem. Our experiments show that, in 48-hour fuzzing sessions, BRF can successfully execute 8x more eBPF programs than Syzkaller. Moreover, eBPF programs generated by BRF are much more expressive than Syzkaller's. As a result, BRF achieves 101% higher code coverage. Finally, BRF has so far found 4 vulnerabilities (some of which have been assigned CVE numbers) in the eBPF runtime, demonstrating its effectiveness.
# CompSuite: A Dataset of Java Library Upgrade Incompatibility Issues ( http://arxiv.org/abs/2305.08671v1 ) License: see link | Xiufeng Xu, Chenguang Zhu, Yi Li | (Summary) Modern software systems rely heavily on external libraries developed by third parties to ensure efficient development.
Modern software systems rely heavily on external libraries developed by third parties to ensure efficient development. However, frequent library upgrades can lead to compatibility issues between the libraries and their client systems. In this paper, we introduce CompSuite, a dataset that includes 123 real-world Java client-library pairs where upgrading the library causes an incompatibility issue in the corresponding client. Each incompatibility issue in CompSuite is associated with a test case authored by the developers, which can be used to reproduce the issue. The dataset also provides a command-line interface that simplifies the execution and validation of each issue. With this infrastructure, users can inspect any incompatibility issue with the push of a button, or reproduce an issue step by step for a more detailed investigation. We make CompSuite publicly available to promote open science. We believe that various software analysis techniques, such as compatibility checking, debugging, and regression test selection, can benefit from CompSuite.
# DevServOps: DevOps For Product-Oriented Product-Service Systems ( http://arxiv.org/abs/2305.08601v1 ) License: see link | Anas Dakkak, Jan Bosch and Helena Holmström Olsson | (Summary) For companies developing web-based applications, Dev and Ops refer to different groups with either an operational or a development focus.
For companies developing web-based applications, Dev and Ops refer to different groups with either an operational or a development focus. DevOps therefore helps these companies streamline software development and operations activities by emphasizing collaboration between the two groups. However, for companies producing software-intensive products, Ops would refer to the customers who use and operate the product. In addition, companies producing software-intensive products do not only offer products to customers but rather Product Service Systems (PSS), where product-related services play a key role in ensuring customer satisfaction, besides their significant revenue contribution. Thus, the context of product-oriented PSS is very different from that of web-based applications, making it difficult to apply DevOps without considering the role of the services. Therefore, based on a two-year participant observation case study conducted at a multinational telecommunications systems provider, we propose a novel approach called Development-Services-Operations (DevServOps), which incorporates services as a key player facilitating an end-to-end software flow toward customers in one direction and feedback toward developers in the other direction. Services become the glue that connects Dev and Ops, achieved by providing internal services to increase the precision of the development organization and external services to increase the speed of deployment and new content adoption on the customers' side.
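# RDF Surfaces: Computer Says No ( http://arxiv.org/abs/2305.08476v1 ) License: see link | Patrick Hochstenbach, Jos De Roo, Ruben Verborgh | (Summary) Logic can define how agents are provided or denied access to resources, how to interlink resources using mining processes, and how to provide users with choices for possible next steps in a workflow.
To exchange this internal logic, a portable Web logic is required, which the Semantic Web could provide.
RDF Surfaces span many use cases, including describing misuse of information, adding explainability and trust to reasoning, and providing scope for reasoning over streams of data and queries.
RDF Surfaces provide a direct translation of first-order logic (FOL) for the Semantic Web.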
Logic can define how agents are provided or denied access to resources, how to interlink resources using mining processes, and how to provide users with choices for possible next steps in a workflow. These decisions are for the most part hidden, internal to machines processing data. To exchange this internal logic, a portable Web logic is required, which the Semantic Web could provide. Combining logic and data provides insights into the reasoning process and creates a new level of trust on the Semantic Web. Current Web logics carry only a fragment of first-order logic (FOL) to keep exchange languages decidable or easily processable. But this comes at a cost: the portability of logic. Machines require implicit agreements to know which fragment of logic is being exchanged and need a strategy for how to cope with the different fragments. These choices could obscure insights into the reasoning process. We created RDF Surfaces in order to express the full expressivity of FOL, including saying explicitly 'no'. This vision paper provides basic principles and compares existing work. Even though FOL is only semi-decidable, we argue these problems are surmountable. RDF Surfaces span many use cases, including describing misuse of information, adding explainability and trust to reasoning, and providing scope for reasoning over streams of data and queries. RDF Surfaces provide a direct translation of FOL for the Semantic Web. We hope this vision paper attracts new implementers and opens the discussion of its formal specification.
# DAppSCAN: Building Large-Scale Datasets for Smart Contract Weaknesses in DApp Projects ( http://arxiv.org/abs/2305.08456v1 ) License: see link | Zibin Zheng, Jianzhong Su, Jiachi Chen, David Lo, Zhijie Zhong and Mingxi Ye | (Summary) The Smart Contract Weakness Classification Registry (SWC Registry) is a widely recognized list of smart contract weaknesses specific to the Ethereum platform.
The Smart Contract Weakness Classification Registry (SWC Registry) is a widely recognized list of smart contract weaknesses specific to the Ethereum platform. In recent years, significant research efforts have been dedicated to building tools to detect SWC weaknesses. However, evaluating these tools has proven challenging due to the absence of a large, unbiased, real-world dataset. To address this issue, we recruited 22 participants and spent 44 person-months analyzing 1,322 open-source audit reports from 30 security teams. In total, we identified 10,016 weaknesses and developed two distinct datasets, i.e., DAppSCAN-Source and DAppSCAN-Bytecode. The DAppSCAN-Source dataset comprises 25,077 Solidity files, featuring 1,689 SWC vulnerabilities sourced from 1,139 real-world DApp projects. The Solidity files in this dataset may not be directly compilable. To make the dataset compilable, we developed a tool capable of automatically identifying dependency relationships within DApps and completing missing public libraries. Using this tool, we created our DAppSCAN-Bytecode dataset, which consists of 8,167 compiled smart contract bytecodes with 895 SWC weaknesses. Based on the second dataset, we conducted an empirical study to assess the performance of five state-of-the-art smart contract vulnerability detection tools. The evaluation results revealed subpar performance for these tools in terms of both effectiveness and success detection rate, indicating that future development should prioritize real-world datasets over simplistic toy contracts.
# Improving ChatGPT Prompt for Code Generation ( http://arxiv.org/abs/2305.08360v1 ) License: see link | Chao Liu, Xuanlin Bao, Hongyu Zhang, Neng Zhang, Haibo Hu, Xiaohong Zhang, Meng Yan | (Summary) Automated code generation can be a powerful technique for software development, significantly reducing the time and effort developers need to create new code by generating it automatically from requirements.
Automated code generation can be a powerful technique for software development, significantly reducing developers' efforts and time required to create new code by generating it automatically based on requirements. Recently, OpenAI's language model ChatGPT has emerged as a powerful tool for generating human-like responses to a wide range of textual inputs (i.e., prompts), including those related to code generation. However, the effectiveness of ChatGPT for code generation is not well understood, and the generation performance could be heavily influenced by the choice of prompt. To answer these questions, we conducted experiments using the CodeXGlue dataset to evaluate ChatGPT's capabilities for two code generation tasks, including text-to-code and code-to-code generation. We designed prompts by leveraging the chain-of-thought strategy with multi-step optimizations. Our results showed that by carefully designing prompts to guide ChatGPT, the generation performance can be improved substantially. We also analyzed the factors that influenced the prompt design and provided insights that could guide future research.
# Negative Effects of Gamification in Education Software: Systematic Mapping and Practitioner Perceptions ( http://arxiv.org/abs/2305.08346v1 ) License: see link | Clauvin Almeida, Marcos Kalinowski, Anderson Uchoa, Bruno Feijo | (Summary) Context: While most research shows positive effects of gamification, the focus on its adverse effects is considerably smaller and further understanding is needed.
Results: The mapping study revealed 87 papers reporting undesired effects of game design elements.
Conclusions: Gamification, when properly applied, can have positive effects on education/learning software.
Context: While most research shows positive effects of gamification, the focus on its adverse effects is considerably smaller and further understanding is needed. Objective: To provide a comprehensive overview of research reporting negative effects of game design elements and to provide insights into developers' awareness of these effects and into how they could be considered in practice. Method: We conducted a systematic mapping study of the negative effects of game design elements on education/learning systems. We also held a focus group discussion with developers of gamified software, discussing the mapping study results with regard to their awareness and perceptions of the reported negative effects in practice. Results: The mapping study revealed 87 papers reporting undesired effects of game design elements. We found that badges, leaderboards, competitions, and points are the game design elements most often reported as causing negative effects. The most cited negative effects were lack of effect, worsened performance, motivational issues, lack of understanding, and irrelevance. The ethical issues of gaming the system and cheating were also often reported. As part of our results, we map the relations between game design elements and the negative effects that they may cause. The focus group revealed that developers were not aware of many of the possible negative effects and that they consider this type of information useful. The discussion revealed their agreement on some of those potential negative effects and also some positive counterparts. Conclusions: Gamification, when properly applied, can have positive effects on education/learning software. However, gamified software is also prone to generating harmful effects. Revealing and discussing potentially negative effects can help developers make more informed decisions that weigh these effects against the expected benefits.
# Slow Down, Move Over: A Case Study in Formal Verification, Refinement, and Testing of the Responsibility-Sensitive Safety Model for Self-Driving Cars ( http://arxiv.org/abs/2305.08812v1 ) License: see link | Megan Strauss and Stefan Mitsch | (Summary) Advances in technology give us hope of driving without human error, reducing vehicle emissions, and simplifying an everyday task in a future of self-driving cars.
We use the hybrid systems theorem prover KeYmaera X to formalize RSS as a hybrid system with nondeterministic control choices and a continuous motion model, and prove the absence of collisions.
Technology advances give us the hope of driving without human error, reducing vehicle emissions, and simplifying an everyday task in a future of self-driving cars. Making sure these vehicles are safe is very important to the continuation of this field. In this paper, we formalize the Responsibility-Sensitive Safety model (RSS) for self-driving cars and prove the safety and optimality of this model in the longitudinal direction. We utilize the hybrid systems theorem prover KeYmaera X to formalize RSS as a hybrid system with its nondeterministic control choices and continuous motion model, and prove the absence of collisions. We then illustrate the practicality of RSS through refinement proofs that turn the verified nondeterministic control envelopes into deterministic ones, followed by verified compilation to Python. The refinement and compilation are safety-preserving; as a result, safety proofs of the formal model transfer to the compiled code, while counterexamples discovered in testing the code of an unverified model transfer back. The resulting Python code makes it possible to test the behavior of cars following the motion model of RSS in simulation, to measure agreement between the model and simulation with monitors that are derived from the formal model, and to report counterexamples from simulation back to the formal model.
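As a rough illustration of the longitudinal rule that the abstract refers to, the sketch below computes the minimum safe following distance as it is commonly stated in the original RSS proposal. It is not the verified KeYmaera X model or the compiled Python from this paper; the function name, parameter names, and example values are assumptions made for illustration.

```python
def rss_min_longitudinal_gap(v_rear, v_front, rho, a_max_accel, b_min_brake, b_max_brake):
    """Minimum safe gap (m) between a rear and a front car driving the same way.

    Worst case assumed by RSS: the rear car accelerates at a_max_accel for the
    response time rho and then brakes at only b_min_brake, while the front car
    brakes as hard as b_max_brake. Speeds in m/s, accelerations in m/s^2.
    """
    # Speed of the rear car at the end of its response time.
    v_rear_after = v_rear + rho * a_max_accel
    gap = (
        v_rear * rho                              # distance covered during response
        + 0.5 * a_max_accel * rho ** 2            # extra distance from accelerating
        + v_rear_after ** 2 / (2 * b_min_brake)   # rear car's braking distance
        - v_front ** 2 / (2 * b_max_brake)        # front car's braking distance
    )
    # A negative value means any non-negative gap is already safe.
    return max(gap, 0.0)


if __name__ == "__main__":
    # Hypothetical values: both cars at 20 m/s, 1 s response time.
    print(rss_min_longitudinal_gap(20.0, 20.0, 1.0, 2.0, 4.0, 8.0))
```

A monitor in simulation could compare the measured gap against this value at every step and report a counterexample when the gap falls below it.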
# Risk stratification of malignant melanoma using neural networks ( http://arxiv.org/abs/2306.06195v1 ) License: see link | Julian Burghoff, Leonhard Ackermann, Younes Salahdine, Veronika Bram, Katharina Wunderlich, Julius Balkenhol, Thomas Dirschka and Hanno Gottschalk | (Summary) To improve the detection and classification of malignant melanoma, this paper describes an image-based method that can achieve AUROC values of up to 0.78 without additional clinical information.
Because alterations of scanner-specific properties such as brightness, contrast, or sharpness can have strong (negative) effects on the quality of machine learning predictions, two ways to overcome this domain gap are discussed. In order to improve the detection and classification of malignant melanoma, this paper describes an image-based method that can achieve AUROC values of up to 0.78 without additional clinical information. Furthermore, the importance of the domain gap between two different image sources is considered, as it is important to create usability independent of hardware components such as the high-resolution scanner used. Because alterations of scanner-specific properties such as brightness, contrast, or sharpness can have strong (negative) effects on the quality of machine learning predictions, two ways to overcome this domain gap are discussed in this paper. | Translated: 2023-06-18 12:39:52 Published: 2023-05-15 |
# Biomarker Discovery with Quantum Neural Networks: A Case-study in CTLA4-Activation Pathways ( http://arxiv.org/abs/2306.01745v1 ) License: see link | Nam Nguyen | (Summary) Biomarker discovery is a challenging task due to the massive search space.
The Maximum Relevance, Minimum Redundancy (mRMR) criterion is used to score biomarker candidate sets.
We demonstrate the proof of concept on four activation pathways associated with CTLA4, including (1) CTLA4-activation stand-alone, (2) CTLA4-CD8A-CD8B co-activation, (3) CTLA4-CD2 co-activation, and (4) CTLA4-CD2-CD48-CD53-CD58-CD84 co-activation.
The model indicates new biomarkers associated with the mutational activation of CTLA4-associated pathways, including CLIC4, CPE, ETS2, FAM107A, GPR116, HYOU1, LCN2, MACF1, MT1G, NAPA, NDUFS5, PAK1, PFN1, PGAP3, PPM1G, PSMD8, RNF213, SLC25A3, UBA1, and WLS.
The implementation is open-sourced at https://github.com/namnguyen0510/Biomarker-Discovery-with-Quantum-Neural-Networks. Biomarker discovery is a challenging task due to the massive search space. Quantum computing and quantum Artificial Intelligence (quantum AI) can be used to address the computational problem of biomarker discovery tasks. We propose a Quantum Neural Networks (QNNs) architecture to discover biomarkers for input activation pathways. The Maximum Relevance, Minimum Redundancy (mRMR) criterion is used to score biomarker candidate sets. Our proposed model is economical since the neural solution can be delivered on constrained hardware. We demonstrate the proof of concept on four activation pathways associated with CTLA4, including (1) CTLA4-activation stand-alone, (2) CTLA4-CD8A-CD8B co-activation, (3) CTLA4-CD2 co-activation, and (4) CTLA4-CD2-CD48-CD53-CD58-CD84 co-activation. The model indicates new biomarkers associated with the mutational activation of CTLA4-associated pathways, including 20 genes: CLIC4, CPE, ETS2, FAM107A, GPR116, HYOU1, LCN2, MACF1, MT1G, NAPA, NDUFS5, PAK1, PFN1, PGAP3, PPM1G, PSMD8, RNF213, SLC25A3, UBA1, and WLS. We open source the implementation at: https://github.com/namnguyen0510/Biomarker-Discovery-with-Quantum-Neural-Networks. | Translated: 2023-06-11 14:05:45 Published: 2023-05-15 |
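To make the mRMR criterion mentioned above more concrete, here is a minimal classical sketch that scores a candidate feature set as mean relevance minus mean redundancy, using mutual information on discrete values. It is one common variant of the criterion (redundancy averaged over distinct pairs), not the scoring code from the linked repository; the function names and the toy feature names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x,y) * log2(p(x,y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_score(features, target):
    """Mean relevance to the target minus mean pairwise redundancy.

    `features` maps a (hypothetical) feature name to its list of discrete values.
    """
    names = list(features)
    relevance = sum(mutual_information(features[f], target) for f in names) / len(names)
    pairs = list(combinations(names, 2))
    redundancy = (
        sum(mutual_information(features[a], features[b]) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return relevance - redundancy


if __name__ == "__main__":
    target = [0, 0, 1, 1]
    informative = [0, 0, 1, 1]    # fully determines the target
    uninformative = [0, 1, 0, 1]  # independent of the target
    print(mrmr_score({"gene_a": informative}, target))
    print(mrmr_score({"gene_a": informative, "gene_b": uninformative}, target))
    print(mrmr_score({"gene_a": informative, "gene_a_copy": informative}, target))
```

In this toy data, adding an uninformative feature lowers the score through relevance, while adding an exact copy lowers it through redundancy, which is the trade-off the criterion is meant to capture.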
# Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models ( http://arxiv.org/abs/2306.05182v1 ) License: see link | Krishna Sri Ipsit Mantri and Nevasini Sasikumar | (Summary) Fashionable image generation aims to synthesize images of the diverse fashion prevalent around the globe, helping fashion designers with real-time visualization by giving them a basic customized structure of how a specific design preference would look in real life and what further improvements can be made for enhanced customer satisfaction.
Our results indicate that using an LLM to refine the prompts to the latent diffusion model helps generate globally creative and culturally diversified fashion styles and reduce bias. Fashionable image generation aims to synthesize images of the diverse fashion prevalent around the globe, helping fashion designers with real-time visualization by giving them a basic customized structure of how a specific design preference would look in real life and what further improvements can be made for enhanced customer satisfaction. Moreover, users can interact on their own and generate fashionable images by giving just a few simple prompts. Recently, diffusion models have gained popularity as generative models owing to their flexibility and their generation of realistic images from Gaussian noise. Latent diffusion models are a type of generative model that use diffusion processes to model the generation of complex data, such as images, audio, or text. They are called "latent" because they learn a hidden representation, or latent variable, of the data that captures its underlying structure. We propose a method that exploits the equivalence between diffusion models and energy-based models (EBMs) and suggests ways to compose multiple probability distributions. We describe a pipeline showing how our method can be used specifically for new fashionable outfit generation and virtual try-on using LLM-guided text-to-image generation. Our results indicate that using an LLM to refine the prompts to the latent diffusion model helps generate globally creative and culturally diversified fashion styles and reduce bias. | Translated: 2023-06-11 13:28:26 Published: 2023-05-15 |
# Semantic Composition in Visually Grounded Language Models ( http://arxiv.org/abs/2305.16328v1 ) License: see link | Rohan Pandey | (Summary) What is sentence meaning and its ideal representation?
We close by discussing connections of our work to neuroscience, psycholinguistics, formal semantics, and philosophy. What is sentence meaning and its ideal representation? Much of the expressive power of human language derives from semantic composition, the mind's ability to represent meaning hierarchically and relationally over constituents. At the same time, much sentential meaning is outside the text and requires grounding in sensory, motor, and experiential modalities to be adequately learned. Although large language models display considerable compositional ability, recent work shows that visually-grounded language models drastically fail to represent compositional structure. In this thesis, we explore whether and how models compose visually grounded semantics, and how we might improve their ability to do so. Specifically, we introduce 1) WinogroundVQA, a new compositional visual question answering benchmark, 2) Syntactic Neural Module Distillation, a measure of compositional ability in sentence embedding models, 3) Causal Tracing for Image Captioning Models to locate neural representations vital for vision-language composition, 4) Syntactic MeanPool to inject a compositional inductive bias into sentence embeddings, and 5) Cross-modal Attention Congruence Regularization, a self-supervised objective function for vision-language relation alignment. We close by discussing connections of our work to neuroscience, psycholinguistics, formal semantics, and philosophy. | Translated: 2023-06-04 12:06:48 Published: 2023-05-15 |
# Integrating Generative Artificial Intelligence in Intelligent Vehicle Systems ( http://arxiv.org/abs/2305.17137v1 ) License: see link | Lukas Stappen, Jeremy Dillmann, Serena Striegel, Hans-Jörg Vögel, Nicolas Flores-Herr, Björn W. Schuller | (Summary) This paper aims to serve as a comprehensive guide for researchers and practitioners, offering insights into the current state, potential applications, and future research directions for generative artificial intelligence and foundation models within the context of intelligent vehicles.
By fostering collaboration and addressing these research areas, generative artificial intelligence can unlock its full potential, transforming the driving experience and shaping the future of intelligent vehicles. This paper aims to serve as a comprehensive guide for researchers and practitioners, offering insights into the current state, potential applications, and future research directions for generative artificial intelligence and foundation models within the context of intelligent vehicles. As the automotive industry progressively integrates AI, generative artificial intelligence technologies hold the potential to revolutionize user interactions, delivering more immersive, intuitive, and personalised in-car experiences. We provide an overview of current applications of generative artificial intelligence in the automotive domain, emphasizing speech, audio, vision, and multimodal interactions. We subsequently outline critical future research areas, including domain adaptability, alignment, and multimodal integration, among others, and address the challenges and risks associated with ethics. By fostering collaboration and addressing these research areas, generative artificial intelligence can unlock its full potential, transforming the driving experience and shaping the future of intelligent vehicles. | Translated: 2023-06-04 12:00:29 Published: 2023-05-15 |
# Certification Labels for Trustworthy AI: Insights From an Empirical Mixed-Method Study ( http://arxiv.org/abs/2305.18307v1 ) License: see link | Nicolas Scharowski, Michaela Benk, Swen J. Kühne, Léane Wettstein, Florian Brühlmann | (Summary) Auditing plays a pivotal role in the development of trustworthy AI.
Through interviews (N = 12) and a census-representative survey (N = 302), we investigated end-users' attitudes toward certification labels and their effectiveness in communicating trustworthiness in low- and high-stakes AI scenarios.
Our study provides valuable insights and recommendations for designing and implementing certification labels as a promising constituent within the trustworthy AI ecosystem. Auditing plays a pivotal role in the development of trustworthy AI. However, current research primarily focuses on creating auditable AI documentation, which is intended for regulators and experts rather than end-users affected by AI decisions. How to communicate to members of the public that an AI has been audited and considered trustworthy remains an open challenge. This study empirically investigated certification labels as a promising solution. Through interviews (N = 12) and a census-representative survey (N = 302), we investigated end-users' attitudes toward certification labels and their effectiveness in communicating trustworthiness in low- and high-stakes AI scenarios. Based on the survey results, we demonstrate that labels can significantly increase end-users' trust and willingness to use AI in both low- and high-stakes scenarios. However, end-users' preferences for certification labels and their effect on trust and willingness to use AI were more pronounced in high-stakes scenarios. Qualitative content analysis of the interviews revealed opportunities and limitations of certification labels, as well as facilitators and inhibitors for the effective use of labels in the context of AI. For example, while certification labels can mitigate data-related concerns expressed by end-users (e.g., privacy and data protection), other concerns (e.g., model performance) are more challenging to address. Our study provides valuable insights and recommendations for designing and implementing certification labels as a promising constituent within the trustworthy AI ecosystem. | Translated: 2023-06-04 11:40:31 Published: 2023-05-15 |
# NeuSTIP: A Novel Neuro-Symbolic Model for Link and Time Prediction in Temporal Knowledge Graphs ( http://arxiv.org/abs/2305.11301v1 ) License: see link | Ishaan Singh and Navdeep Kaur and Garima Gaur and Mausam | (Summary) While Knowledge Graph Completion (KGC) on static facts is a mature field, Temporal Knowledge Graph Completion (TKGC), which incorporates validity time into static facts, is still in its nascent stage.
Our empirical evaluation on two time-interval-based TKGC datasets suggests that our model outperforms state-of-the-art models on both link prediction and time interval prediction. While Knowledge Graph Completion (KGC) on static facts is a mature field, Temporal Knowledge Graph Completion (TKGC), which incorporates validity time into static facts, is still in its nascent stage. KGC methods fall into multiple categories, including embedding-based, rule-based, GNN-based, and pretrained language model based approaches. However, such dimensions have not been explored in TKG. To that end, we propose a novel temporal neuro-symbolic model, NeuSTIP, that performs link prediction and time interval prediction in a TKG. NeuSTIP learns temporal rules in the presence of the Allen predicates that ensure temporal consistency between neighboring predicates in a given rule. We further design a unique scoring function that evaluates the confidence of the candidate answers while performing link prediction and time interval prediction by utilizing the learned rules. Our empirical evaluation on two time-interval-based TKGC datasets suggests that our model outperforms state-of-the-art models on both link prediction and time interval prediction. | Translated: 2023-05-28 05:37:21 Published: 2023-05-15 |
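For readers unfamiliar with the Allen predicates named above, the sketch below classifies the relation between two time intervals according to Allen's interval algebra. It only illustrates the thirteen relations and says nothing about how NeuSTIP learns or scores rules; the function name, the (start, end) tuple representation, and the example years are assumptions.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a to interval b.

    Intervals are (start, end) tuples with start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    # From here on the two intervals share more than a single point.
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"


if __name__ == "__main__":
    # Hypothetical validity intervals (in years) of two facts.
    print(allen_relation((2000, 2005), (2005, 2010)))
    print(allen_relation((2006, 2008), (2005, 2010)))
```

A temporal rule could then require, for example, that the interval of one body predicate stands in a "before" or "meets" relation to the next, which is the kind of consistency constraint the abstract describes.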
# ChatGPT and the Labor Market: Unraveling the Effect of AI Discussions on Students' Earnings Expectations ( http://arxiv.org/abs/2305.11900v1 ) License: see link | Samir Huseynov | (Summary) This paper investigates the causal impact of negatively and positively framed ChatGPT Artificial Intelligence (AI) discussions on US students' anticipated labor market outcomes.
Educators, administrators, and policymakers may regularly engage with students to address their concerns and enhance educational curricula to better prepare them for a future that will inevitably be shaped by AI. This paper investigates the causal impact of negatively and positively framed ChatGPT Artificial Intelligence (AI) discussions on US students' anticipated labor market outcomes. Our findings reveal that students reduce their confidence regarding their future earnings prospects after exposure to AI debates, and this effect is more pronounced after reading discussion excerpts with a negative tone. Unlike STEM majors, students in non-STEM fields show asymmetric and pessimistic belief changes, suggesting that they might feel more vulnerable to emerging AI technologies. Pessimistic belief updates regarding future earnings are also prevalent across gender and GPA levels, indicating widespread AI concerns among all student subgroups. Educators, administrators, and policymakers may regularly engage with students to address their concerns and enhance educational curricula to better prepare them for a future that will inevitably be shaped by AI. | Translated: 2023-05-28 05:18:49 Published: 2023-05-15 |
# Neural information coding for efficient spike-based image denoising ( http://arxiv.org/abs/2305.11898v1 ) License: see link | Andrea Castagnetti, Alain Pegatoquet, Benoît Miramond | (Summary) In recent years, Deep Convolutional Neural Networks (DCNNs) have surpassed the performance of classical algorithms for image restoration tasks.
We propose a formal analysis of the information conversion processing carried out by Leaky Integrate and Fire (LIF) neurons and compare its performance with the classical rate-coding mechanism.
Our results show that SNNs with LIF neurons can provide competitive denoising performance at a reduced computational cost. In recent years, Deep Convolutional Neural Networks (DCNNs) have surpassed the performance of classical algorithms for image restoration tasks. However, most of these methods are not designed for computational efficiency and are therefore too expensive to be executed on embedded and mobile devices. In this work we investigate Spiking Neural Networks (SNNs) for Gaussian denoising, with the goal of approaching the performance of conventional DCNNs while reducing the computational load. We propose a formal analysis of the information conversion processing carried out by Leaky Integrate and Fire (LIF) neurons and compare its performance with the classical rate-coding mechanism. The neural coding schemes are then evaluated through experiments in terms of denoising performance and computation efficiency for a state-of-the-art deep convolutional neural network. Our results show that SNNs with LIF neurons can provide competitive denoising performance at a reduced computational cost. | Translated: 2023-05-28 05:18:33 Published: 2023-05-15 |
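As a small illustration of how a Leaky Integrate and Fire neuron turns a constant input into a spike train, the sketch below simulates a textbook discrete-time LIF neuron with a subtractive reset. It is not the coding scheme analysed in the paper; the function names, the leak factor, and the threshold are assumptions chosen for illustration.

```python
def lif_spike_train(input_current, steps, leak=0.9, threshold=1.0):
    """Simulate a discrete-time LIF neuron driven by a constant input.

    At each step the membrane potential decays by `leak`, integrates the input,
    and emits a spike (1) when it reaches `threshold`, after which the threshold
    is subtracted from the potential (soft reset).
    """
    v = 0.0
    spikes = []
    for _ in range(steps):
        v = leak * v + input_current
        if v >= threshold:
            spikes.append(1)
            v -= threshold
        else:
            spikes.append(0)
    return spikes


def firing_rate(spikes):
    """Fraction of time steps that contain a spike."""
    return sum(spikes) / len(spikes)


if __name__ == "__main__":
    # With leak=1.0 the neuron is a pure integrator, so the firing rate equals
    # the input intensity, which is the idea behind rate coding.
    train = lif_spike_train(0.25, 8, leak=1.0)
    print(train, firing_rate(train))
    # With a leak, the same kind of neuron encodes the input differently.
    print(lif_spike_train(0.75, 6, leak=0.5))
```

The contrast between the leak-free and leaky runs hints at why the choice of neural coding affects both accuracy and the number of spikes, and hence the computational cost.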
# 人工知能を用いたコミュニケーションの批判的評価 Critical Appraisal of Artificial Intelligence-Mediated Communication ( http://arxiv.org/abs/2305.11897v1 ) ライセンス: Link先を確認 | Dara Tafazoli | (参考訳) 過去20年間で、言語学習と教育における技術利用は著しく進歩し、現在はコンピュータ支援言語学習(CALL)と呼ばれている。
結論として,言語教師が CALL の教師教育や専門的開発に従事し,進化を続ける技術環境に追随し,教育効果を向上させることが重要であると論じる。 Over the last two decades, technology use in language learning and teaching has significantly advanced and is now referred to as Computer-Assisted Language Learning (CALL). Recently, the integration of Artificial Intelligence (AI) into CALL has brought about a significant shift in the traditional approach to language education both inside and outside the classroom. In line with this book's scope, I explore the advantages and disadvantages of AI-mediated communication in language education. I begin with a brief review of AI in education. I then introduce the ICALL and give a critical appraisal of the potential of AI-powered automatic speech recognition (ASR), Machine Translation (MT), Intelligent Tutoring Systems (ITSs), AI-powered chatbots, and Extended Reality (XR). In conclusion, I argue that it is crucial for language teachers to engage in CALL teacher education and professional development to keep up with the ever-evolving technology landscape and improve their teaching effectiveness. | 翻訳日:2023-05-28 05:18:17 公開日:2023-05-15 |
# パーソナライズド・ミュージック・セラピーに向けて : 神経計算モデリングの視点から Towards personalised music-therapy; a neurocomputational modelling perspective ( http://arxiv.org/abs/2305.14364v1 ) ライセンス: Link先を確認 | Nicole Lai, Marios Philiastides, Fahim Kawsar, Fani Deligianni | (参考訳) 音楽療法は、副作用のない幅広い神経疾患や気分障害において、患者の結果を改善するための介入として最近登場した。
この比較的調査されていない領域では、神経生理学的反応の測定を介するフィードバックループを通じて、個人のニーズやタスクに合った音楽の選択プロセスをパーソナライズし、自動化する方法を理解する必要があります。 Music therapy has emerged recently as a successful intervention that improves patient outcomes in a large range of neurological and mood disorders without adverse effects. Brain networks are entrained to music in ways that can be explained both via top-down and bottom-up processes. In particular, the direct interaction of auditory with the motor and the reward system via a predictive framework explains the efficacy of music-based interventions in motor rehabilitation. In this manuscript, we provide a brief overview of current theories of music perception and processing. Subsequently, we summarise evidence of music-based interventions primarily in motor, emotional and cardiovascular regulation. We highlight opportunities to improve quality of life and reduce stress beyond the clinic environment and in healthy individuals. This relatively unexplored area requires an understanding of how we can personalise and automate music selection processes to fit individuals' needs and tasks via feedback loops mediated by measurements of neuro-physiological responses. | 翻訳日:2023-05-28 04:50:02 公開日:2023-05-15 |
# 計算アーキテクチャに対する人間の脳のベンチマーク Benchmarking the human brain against computational architectures ( http://arxiv.org/abs/2305.14363v1 ) ライセンス: Link先を確認 | Céline van Valkenhoef, Catherine Schuman, Philip Walther | (参考訳) 人間の脳は、人工ニューラルネットワークやニューロモルフィックコンピュータのような古典的および量子コンピューティングアーキテクチャを補完する新しい概念にインスピレーションを与えたが、その性能がどのように比較されるかは明らかになっていない。
したがって、このフレームワークはブラックボックスとして考えることで、脳の計算効率に関するユニークな洞察を提供する。 The human brain has inspired novel concepts complementary to classical and quantum computing architectures, such as artificial neural networks and neuromorphic computers, but it is not clear how their performances compare. Here we report a new methodological framework for benchmarking cognitive performance based on solving computational problems with increasing problem size. We determine computational efficiencies in experiments with human participants and benchmark these against complexity classes. We show that a neuromorphic architecture with limited field-of-view size and added noise provides a good approximation to our results. The benchmarking also suggests there is no quantum advantage on the scales of human capability compared to the neuromorphic model. Thus, the framework offers unique insights into the computational efficiency of the brain by considering it a black box. | 翻訳日:2023-05-28 04:49:46 公開日:2023-05-15 |
# ビルディング・ポイント・クラウドからのクラッタ耐性フロアプラン生成のためのハイブリッド・セマンティクス・ジオメトリアプローチ A Hybrid Semantic-Geometric Approach for Clutter-Resistant Floorplan Generation from Building Point Clouds ( http://arxiv.org/abs/2305.15420v1 ) ライセンス: Link先を確認 | Seongyong Kim, Yosuke Yajima, Jisoo Park, Jingdao Chen, Yong K. Cho | (参考訳) 情報モデリング(BIM)技術の構築は、現代の建設工学とプロジェクト管理のワークフローの重要なコンポーネントである。
プロジェクトサイトの空間的現実を表すAs-is BIMモデルは、建設進捗監視、エラーチェック、メンテナンスのための重要な情報を提供することができる。
提案手法は,精度,リコール,インターセクション・オーバー・ユニオン(IOU),ベティ誤差,ワープ誤差の測定値を用いて評価する。 Building Information Modeling (BIM) technology is a key component of modern construction engineering and project management workflows. As-is BIM models that represent the spatial reality of a project site can offer crucial information to stakeholders for construction progress monitoring, error checking, and building maintenance purposes. Geometric methods for automatically converting raw scan data into BIM models (Scan-to-BIM) often fail to make use of higher-level semantic information in the data. In contrast, semantic segmentation methods only output labels at the point level without creating the object-level models that are necessary for BIM. To address these issues, this research proposes a hybrid semantic-geometric approach for clutter-resistant floorplan generation from laser-scanned building point clouds. The input point clouds are first pre-processed by normalizing the coordinate system and removing outliers. Then, a semantic segmentation network based on PointNet++ is used to label each point as ceiling, floor, wall, door, stair, or clutter. The clutter points are removed whereas the wall, door, and stair points are used for 2D floorplan generation. A region-growing segmentation algorithm paired with geometric reasoning rules is applied to group the points together into individual building elements. Finally, a 2-fold Random Sample Consensus (RANSAC) algorithm is applied to parameterize the building elements into 2D lines which are used to create the output floorplan. The proposed method is evaluated using the metrics of precision, recall, Intersection-over-Union (IOU), Betti error, and warping error. | 翻訳日:2023-05-28 04:41:08 公開日:2023-05-15 |
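Three of the evaluation metrics listed above (precision, recall, and IOU) can be sketched on rasterized floorplans. The example assumes each floorplan is given as a collection of occupied grid cells; that representation and the function name are assumptions made for illustration, not the paper's evaluation code.

```python
def precision_recall_iou(predicted_cells, reference_cells):
    """Compare two floorplans given as collections of occupied (row, col) cells."""
    predicted = set(predicted_cells)
    reference = set(reference_cells)
    true_positives = len(predicted & reference)
    union = len(predicted | reference)
    # Guard each ratio against an empty denominator.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    iou = true_positives / union if union else 0.0
    return precision, recall, iou
```

IOU penalizes both missing and spurious cells in a single number, which is why it is often reported alongside precision and recall rather than instead of them.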
# ニューラルネットワークを用いた自動評価スコーリングにおける動的損失関数の有効性 The Effectiveness of a Dynamic Loss Function in Neural Network Based Automated Essay Scoring ( http://arxiv.org/abs/2305.10447v1 ) ライセンス: Link先を確認 | Oscar Morris | (参考訳) ニューラルネットワーク、特に注意機構は、自動評価の分野に大きな進歩をもたらした。
我々の損失関数は, 学生評価自動評価データセットにおいて, 準重み付きカッパスコア0.752の成績を犠牲にすることなく, この目標を達成する。 Neural networks and in particular the attention mechanism have brought significant advances to the field of Automated Essay Scoring. Many of these systems use a regression-based model which may be prone to underfitting when the model only predicts the mean of the training data. In this paper, we present a dynamic loss function that creates an incentive for the model to predict with the correct distribution, as well as predicting the correct values. Our loss function achieves this goal without sacrificing any performance, achieving a Quadratic Weighted Kappa score of 0.752 on the Automated Student Assessment Prize Automated Essay Scoring dataset. | 翻訳日:2023-05-19 18:55:00 公開日:2023-05-15 |
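The Quadratic Weighted Kappa score reported above can be computed as in the following sketch, which uses the standard definition over integer score classes. The function name and the assumption that scores are integers in `0 .. n_classes - 1` are illustrative; this is not the paper's evaluation code.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Quadratic Weighted Kappa between two lists of integer scores.

    Assumes n_classes >= 2 and that the two raters do not both give a single
    identical score to every item (the expected disagreement would be zero).
    """
    n = len(rater_a)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Disagreement weight grows with the squared score distance.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two score distributions were independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n
    return 1.0 - weighted_observed / weighted_expected
```

Under this definition a model that always predicts the same score obtains a kappa of zero, which is one way to see why predicting only the mean of the training data is penalized by this metric.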
# 感情制御のためのガイドナラティブにおける心理的要素に基づく感情認識 Emotion Recognition based on Psychological Components in Guided Narratives for Emotion Regulation ( http://arxiv.org/abs/2305.10446v1 ) ライセンス: Link先を確認 | Gustave Cortal (LMF, LISN), Alain Finkel (LMF, IUF), Patrick Paroubek (LISN), Lina Ye (LMF) | (参考訳) 感情調節は感情的な出来事を扱う上で重要な要素であり、精神的健康に肯定的な影響を及ぼす。
また, 学習済み言語モデルを用いて, 感情成分の表現方法の相違を明らかにすることで, 特定の成分から個別の感情を予測できることを示す。 Emotion regulation is a crucial element in dealing with emotional events and has positive effects on mental health. This paper aims to provide a more comprehensive understanding of emotional events by introducing a new French corpus of emotional narratives collected using a questionnaire for emotion regulation. We follow the theoretical framework of the Component Process Model which considers emotions as dynamic processes composed of four interrelated components (behavior, feeling, thinking and territory). Each narrative is related to a discrete emotion and is structured based on all emotion components by the writers. We study the interaction of components and their impact on emotion classification with machine learning methods and pre-trained language models. Our results show that each component improves prediction performance, and that the best results are achieved by jointly considering all components. Our results also show the effectiveness of pre-trained language models in predicting discrete emotion from certain components, which reveal differences in how emotion components are expressed. | 翻訳日:2023-05-19 18:54:52 公開日:2023-05-15 |
# 記憶: 自己回帰型言語モデルによる暗号化 Memorization for Good: Encryption with Autoregressive Language Models ( http://arxiv.org/abs/2305.10445v1 ) ライセンス: Link先を確認 | Samuel Stevens and Yu Su | (参考訳) over-parameterized neural language models (lms)は、トレーニングデータの長いシーケンスを記憶し、引用することができる。
私たちのコードとデータセットはhttps://github.com/OSU-NLP-Group/SELMで公開されています。 Over-parameterized neural language models (LMs) can memorize and recite long sequences of training data. While such memorization is normally associated with undesired properties such as overfitting and information leaking, our work casts memorization as an unexplored capability of LMs. We propose the first symmetric encryption algorithm with autoregressive language models (SELM). We show that autoregressive LMs can encode arbitrary data into a compact real-valued vector (i.e., encryption) and then losslessly decode the vector to the original message (i.e., decryption) via random subspace optimization and greedy decoding. While SELM is not amenable to conventional cryptanalysis, we investigate its security through a novel empirical variant of the classic IND-CPA (indistinguishability under chosen-plaintext attack) game. Our code and datasets are available at https://github.com/OSU-NLP-Group/SELM. | 翻訳日:2023-05-19 18:54:37 公開日:2023-05-15 |
# OOD-Speech:アウトオブディストリビューションベンチマークのための大規模ベンガル音声認識データセット OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking ( http://arxiv.org/abs/2305.09688v1 ) ライセンス: Link先を確認 | Fazle Rabbi Rakib, Souhardya Saha Dip, Samiul Alam, Nazia Tasnim, Md. Istiak Hossain Shihab, Md. Nazmuddoha Ansary, Syed Mobassir Hossen, Marsia Haque Meghla, Mamunur Mamun, Farig Sadeque, Sayma Sultana Chowdhury, Tahsin Reasat, Asif Sushmit, Ahmed Imtiaz Humayun | (参考訳) 本稿では,ベンガル語自動音声認識(ASR)のための最初のOOD-Speechベンチマークデータセットを提案する。
OOD-Speechは、Bengaliの最初のアウト・オブ・ディストリビューションのASRベンチマークデータセットとともに、公開可能な最大の音声データセットである。 We present OOD-Speech, the first out-of-distribution (OOD) benchmarking dataset for Bengali automatic speech recognition (ASR). Being one of the most spoken languages globally, Bengali portrays large diversity in dialects and prosodic features, which demands ASR frameworks to be robust towards distribution shifts. For example, Islamic religious sermons in Bengali are delivered with a tonality that is significantly different from regular speech. Our training dataset is collected via massively online crowdsourcing campaigns which resulted in 1177.94 hours collected and curated from 22,645 native Bengali speakers from South Asia. Our test dataset comprises 23.03 hours of speech collected and manually annotated from 17 different sources, e.g., Bengali TV drama, Audiobook, Talk show, Online class, and Islamic sermons to name a few. OOD-Speech is jointly the largest publicly available speech dataset, as well as the first out-of-distribution ASR benchmarking dataset for Bengali. | 翻訳日:2023-05-18 19:12:21 公開日:2023-05-15 |
# データバイアス管理 Data Bias Management ( http://arxiv.org/abs/2305.09686v1 ) ライセンス: Link先を確認 | Gianluca Demartini and Kevin Roitero and Stefano Mizzaro | (参考訳) 日常生活におけるデータ駆動システムの普及により、バイアスや公平性といった概念は、産業とアカデミアの両方において、研究者や実践者の間で大きな注目を集めた。
データバイアスは、すべてのケースにおいて必ずしも取り除くべきものではないし、研究の注意は、偏見の除去から識別、測定、インデックス化、表面化、偏見の適応へとシフトすべきである、と私たちは主張する。 Due to the widespread use of data-powered systems in our everyday lives, concepts like bias and fairness have gained significant attention among researchers and practitioners, in both industry and academia. Such issues typically emerge from the data, which comes with varying levels of quality, used to train supervised machine learning systems. With the commercialization and deployment of such systems that are sometimes delegated to make life-changing decisions, significant efforts are being made towards the identification and removal of possible sources of data bias that may resurface to the final end user or in the decisions being made. In this paper, we present research results that show how bias in data affects end users and where bias originates, and provide a viewpoint about what we should do about it. We argue that data bias is not something that should necessarily be removed in all cases, and that research attention should instead shift from bias removal towards the identification, measurement, indexing, surfacing, and adapting for bias, which we name bias management. | 翻訳日:2023-05-18 19:12:02 公開日:2023-05-15 |
# 二次元魅力的なフェルミ・ハバードモデルにおける動的構造因子とペアリングギャップの測定法 Dynamical structure factor and a new method to measure the pairing gap in two-dimensional attractive Fermi-Hubbard model ( http://arxiv.org/abs/2305.09685v1 ) ライセンス: Link先を確認 | Huaisong Zhao, Peng Zou and Feng Yuan | (参考訳) ブリルアンゾーンの高対称性方向に沿った動的構造因子を計算することにより、ランダム位相近似に基づいて、2次元魅力的なフェルミ・ハバードモデルの動的励起を研究する。
特に移動運動量${\bf q}=\left[\pi,\pi\right]$では、動的構造因子は低エネルギー領域の鋭いボソニック分子励起ピークと高エネルギー領域の広い原子励起バンドからなる。
これらの理論的結果は、光学格子のペアリングギャップは${\bf q}=\left[\pi,\pi\right]$で力学構造因子を測定することによって実験的に得られることを示している。 By calculating the dynamical structure factor along the high symmetry directions in the Brillouin zone, the dynamical excitations in two-dimensional attractive Fermi-Hubbard model are studied based on the random-phase approximation. At the small transfer momentum, the sound speed can be obtained and is suppressed by the interaction strength. In particular, at the transfer momentum ${\bf q}=\left[\pi,\pi\right]$, the dynamical structure factor consists of a sharp bosonic molecular excitation peak in the low-energy region and a broad atomic excitation band in the higher energy region. Furthermore, as the hopping strength increases (the interaction strength decreases), the weight of the molecular excitation peak decreases monotonically while the weight of the atomic excitations increases quickly. The area of the molecular excitation peak scales with the square of the pairing gap, which also applies to the spin-orbit coupling case. These theoretical results show that the pairing gap in optical lattice can be obtained experimentally by measuring the dynamical structure factor at ${\bf q}=\left[\pi,\pi\right]$. | 翻訳日:2023-05-18 19:11:45 公開日:2023-05-15 |
# 減衰機能付き時系列異常検出の評価戦略 Evaluation Strategy of Time-series Anomaly Detection with Decay Function ( http://arxiv.org/abs/2305.09691v1 ) ライセンス: Link先を確認 | Yongwan Gim, Kyushik Min | (参考訳) 近年の時系列異常検出のアルゴリズムは、ポイント調整(PA)プロトコルを適用して評価されている。
本稿では,pa や pa%k のような既存プロトコルの過大かつ過大な評価問題をpadfプロトコルが解くことを理論的および実験的に示す。
ベンチマークデータセットでSOTAモデルの再評価を行うことにより,PAプロトコルは多数の異常セグメントの発見にのみ焦点をあてているのに対し,PAdfプロトコルのスコアは多数のセグメントの発見だけでなく,遅延なく迅速に異常を検出することを考慮している。 Recent algorithms of time-series anomaly detection have been evaluated by applying a Point Adjustment (PA) protocol. However, the PA protocol has a problem of overestimating the performance of the detection algorithms because it only depends on the number of detected abnormal segments and their size. We propose a novel evaluation protocol called the Point-Adjusted protocol with decay function (PAdf) to evaluate the time-series anomaly detection algorithm by reflecting the following ideal requirements: detect anomalies quickly and accurately without false alarms. This paper theoretically and experimentally shows that the PAdf protocol solves the over- and under-estimation problems of existing protocols such as PA and PA%K. By conducting re-evaluations of SOTA models in benchmark datasets, we show that the PA protocol only focuses on finding many anomalous segments, whereas the score of the PAdf protocol considers not only finding many segments but also detecting anomalies quickly without delay. | 翻訳日:2023-05-18 18:59:32 公開日:2023-05-15 |
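The overestimation attributed to the PA protocol above can be reproduced with a small sketch of the commonly used point-adjustment rule: if any point inside a ground-truth anomaly segment is flagged, the whole segment is counted as detected. The function names are illustrative, and the decay function of PAdf is deliberately not reproduced here because its exact form is not given in the abstract.

```python
def point_adjust(predictions, labels):
    """Apply point adjustment to binary predictions given binary labels."""
    adjusted = list(predictions)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            # Find the end of the current ground-truth anomaly segment.
            j = i
            while j < n and labels[j] == 1:
                j += 1
            # One hit anywhere in the segment marks the whole segment.
            if any(predictions[i:j]):
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


def f1_score(predictions, labels):
    """Point-wise F1 score for binary predictions."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A detector that flags only the last point of a long segment receives full credit for that segment after adjustment, regardless of how late the detection came, which is the behaviour a delay-aware protocol is meant to correct.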
# 合成キャプションと転送学習による音声キャプション学習のためのささやきトランスフォーマー A Whisper transformer for audio captioning trained with synthetic captions and transfer learning ( http://arxiv.org/abs/2305.09690v1 ) ライセンス: Link先を確認 | Marek Kadlčík, Adam Hájek, Jürgen Kieslich, Radosław Winiecki | (参考訳) 近年の音声キャプションの分野は、大規模オーディオデータセットの利用可能化とディープラーニング技術の進歩により、大きな進歩を遂げている。
私たちのコードとトレーニングされたモデルは、GitHubとHugging Face Hubで公開されています。 The field of audio captioning has seen significant advancements in recent years, driven by the availability of large-scale audio datasets and advancements in deep learning techniques. In this technical report, we present our approach to audio captioning, focusing on the use of a pretrained speech-to-text Whisper model and pretraining on synthetic captions. We discuss our training procedures and present our experiments' results, which include model size variations, dataset mixtures, and other hyperparameters. Our findings demonstrate the impact of different training strategies on the performance of the audio captioning model. Our code and trained models are publicly available on GitHub and Hugging Face Hub. | 翻訳日:2023-05-18 18:59:16 公開日:2023-05-15 |
# llmの隠れたリスク評価--ロバスト性、一貫性、信頼性に関する実証的研究 Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility ( http://arxiv.org/abs/2305.10235v1 ) ライセンス: Link先を確認 | Wentao Ye, Mingfeng Ou, Tianyi Li, Yipeng chen, Xuetao Ma, Yifan Yanggong, Sai Wu, Jie Fu, Gang Chen, Junbo Zhao | (参考訳) 近年の大規模言語モデル(LLM)の人気は、特にAPIやオープンソースモデル、プラグインといったオープンなエコシステムを通じて、バウンダリのないフィールドに大きな影響を与えている。
その際, LLMシステムの堅牢性, 一貫性, 信頼性について, 先駆的かつ先駆的な研究を行う。
そこで本研究では,LLM を用いた評価において,そのようなデータの有効性を大まかに決定する新たな指標を提案する。
上記の主張を支持するために広範な実証研究が行われている。 The recent popularity of large language models (LLMs) has brought a significant impact to boundless fields, particularly through their open-ended ecosystem such as the APIs, open-sourced models, and plugins. However, with their widespread deployment, there is a general lack of research that thoroughly discusses and analyzes the potential risks concealed. In that case, we intend to conduct a preliminary but pioneering study covering the robustness, consistency, and credibility of LLM systems. With most of the related literature in the era of LLM uncharted, we propose an automated workflow that copes with an upscaled number of queries/responses. Overall, we conduct over a million queries to the mainstream LLMs including ChatGPT, LLaMA, and OPT. The core of our workflow consists of a data primitive, followed by an automated interpreter that evaluates these LLMs under different adversarial metrical systems. As a result, we draw several, and perhaps unfortunate, conclusions that are quite uncommon in this trendy community. Briefly, they are: (i) the minor but inevitable error occurrence in the user-generated query input may, by chance, cause the LLM to respond unexpectedly; (ii) LLMs possess poor consistency when processing semantically similar query input. In addition, as a side finding, we find that ChatGPT is still capable of yielding the correct answer even when the input is polluted at an extreme level. While this phenomenon demonstrates the powerful memorization of the LLMs, it raises serious concerns about using such data for LLM-involved evaluation in academic development. To deal with it, we propose a novel index associated with a dataset that roughly decides the feasibility of using such data for LLM-involved evaluation. Extensive empirical studies are provided to support the aforementioned claims. | 翻訳日:2023-05-18 15:41:26 公開日:2023-05-15 |
# UNIQORN: RDF知識グラフと自然言語テキストに関する統一質問 UNIQORN: Unified Question Answering over RDF Knowledge Graphs and Natural Language Text ( http://arxiv.org/abs/2108.08614v6 ) ライセンス: Link先を確認 | Soumajit Pramanik, Jesujoba Alabi, Rishiraj Saha Roy, Gerhard Weikum | (参考訳) 知識グラフやその他のRDFデータに対する質問応答は大幅に進歩しており、自然言語の質問やテレグラフの問い合わせに対して簡潔な回答を提供するシステムも数多くある。
グラフベースの方法論は、完全な応答プロセスに対するユーザ解釈可能な証拠を提供する。 Question answering over knowledge graphs and other RDF data has been greatly advanced, with a number of good systems providing crisp answers for natural language questions or telegraphic queries. Some of these systems incorporate textual sources as additional evidence for the answering process, but cannot compute answers that are present in text alone. Conversely, systems from the IR and NLP communities have addressed QA over text, but such systems barely utilize semantic data and knowledge. This paper presents a method for complex questions that can seamlessly operate over a mixture of RDF datasets and text corpora, or individual sources, in a unified framework. Our method, called UNIQORN, builds a context graph on-the-fly, by retrieving question-relevant evidences from the RDF data and/or a text corpus, using fine-tuned BERT models. The resulting graph typically contains all question-relevant evidences but also a lot of noise. UNIQORN copes with this input by a graph algorithm for Group Steiner Trees, that identifies the best answer candidates in the context graph. Experimental results on several benchmarks of complex questions with multiple entities and relations, show that UNIQORN significantly outperforms state-of-the-art methods for heterogeneous QA. The graph-based methodology provides user-interpretable evidence for the complete answering process. | 翻訳日:2023-05-17 20:25:00 公開日:2023-05-15 |
# 非パラメトリックマニフォールド学習 Non-Parametric Manifold Learning ( http://arxiv.org/abs/2107.08089v3 ) ライセンス: Link先を確認 | Dena Marie Asta | (参考訳) ラプラス・ベルトラミ作用素のグラフラプラシアン推定に基づくコンパクトリーマン多様体における距離推定器を導入する。
我々は、グラフラプラシアン推定におけるスペクトル誤差および暗黙的に、多様体の幾何的性質の観点から、多様体距離の推定誤差、あるいはより正確には非可換幾何学における興味のある多様体距離のスペクトル切断変種の推定(cf. [connes and suijelekom, 2020])を上限する。
推定器は類似しており、実際に収束特性はコンヌ距離公式として知られるワッサーシュタイン距離のカントロヴィッチ双対再構成の特別な場合に由来する。 We introduce an estimator for distances in a compact Riemannian manifold based on graph Laplacian estimates of the Laplace-Beltrami operator. We upper bound the error in the estimate of manifold distances, or more precisely an estimate of a spectrally truncated variant of manifold distance of interest in non-commutative geometry (cf. [Connes and Suijelekom, 2020]), in terms of spectral errors in the graph Laplacian estimates and, implicitly, several geometric properties of the manifold. A consequence is a proof of consistency for (untruncated) manifold distances. The estimator resembles, and in fact its convergence properties are derived from, a special case of the Kantorovich dual reformulation of Wasserstein distance known as Connes' Distance Formula. | 翻訳日:2023-05-17 20:24:09 公開日:2023-05-15 |
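For readers unfamiliar with the formula named at the end of the abstract above, the textbook statements are sketched here. The notation ($M$ for the manifold, $D$ for a Dirac-type operator, $\operatorname{Lip}(f)$ for the Lipschitz constant) is an assumption of this sketch, and the spectrally truncated variant studied in the paper is not reproduced.

```latex
% Kantorovich-Rubinstein duality for the 1-Wasserstein distance:
W_1(\mu, \nu) = \sup_{\operatorname{Lip}(f) \le 1}
    \left( \int_M f \, d\mu - \int_M f \, d\nu \right)

% Taking point masses \mu = \delta_p and \nu = \delta_q gives the geodesic distance:
d(p, q) = \sup \{\, |f(p) - f(q)| \;:\; \operatorname{Lip}(f) \le 1 \,\}

% Connes' Distance Formula expresses the Lipschitz constraint through an operator norm:
d(p, q) = \sup \{\, |f(p) - f(q)| \;:\; f \in C^{\infty}(M),\ \| [D, f] \| \le 1 \,\}
```

The last two lines agree because, on a Riemannian manifold, the operator norm of the commutator $[D, f]$ equals the supremum norm of the gradient of $f$, which is its Lipschitz constant.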
# ディープラーニングの教訓を用いたニューラルネットワークの学習 Training Spiking Neural Networks Using Lessons From Deep Learning ( http://arxiv.org/abs/2109.12894v5 ) ライセンス: Link先を確認 | Jason K. Eshraghian and Max Ward and Emre Neftci and Xinxin Wang and Gregor Lenz and Girish Dwivedi and Mohammed Bennamoun and Doo Seok Jeong and Wei D. Lu | (参考訳) 脳はより効率的なニューラルネットワークを開発するためのインスピレーションを探すのに最適な場所だ。
本論文は, 深層学習, 勾配降下, バックプロパゲーション, 神経科学における数十年の研究から学んだ教訓を, 生物学的にもっともらしいスパイクニューラルネットワークに適用する方法を示すチュートリアルおよび視点として機能する。
https://snntorch.readthedocs.io/en/latest/tutorials/index.htmlを参照。 The brain is the perfect place to look for inspiration to develop more efficient neural networks. The inner workings of our synapses and neurons provide a glimpse at what the future of deep learning might look like. This paper serves as a tutorial and perspective showing how to apply the lessons learnt from several decades of research in deep learning, gradient descent, backpropagation and neuroscience to biologically plausible spiking neural networks. We also explore the delicate interplay between encoding data as spikes and the learning process; the challenges and solutions of applying gradient-based learning to spiking neural networks (SNNs); the subtle link between temporal backpropagation and spike timing dependent plasticity, and how deep learning might move towards biologically plausible online learning. Some ideas are well accepted and commonly used amongst the neuromorphic engineering community, while others are presented or justified for the first time here. The fields of deep learning and spiking neural networks evolve very rapidly. We endeavour to treat this document as a 'dynamic' manuscript that will continue to be updated as the common practices in training SNNs also change. A series of companion interactive tutorials complementary to this paper using our Python package, snnTorch, are also made available. See https://snntorch.readthedocs.io/en/latest/tutorials/index.html . | 翻訳日:2023-05-17 20:14:03 公開日:2023-05-15 |
# Pythonパッケージを伴う任意の超伝導量子回路の解析:SQcircuit Analysis of arbitrary superconducting quantum circuits accompanied by a Python package: SQcircuit ( http://arxiv.org/abs/2206.08319v2 ) ライセンス: Link先を確認 | Taha Rajabzadeh, Zhaoyou Wang, Nathan Lee, Takuma Makihara, Yudan Guo, Amir H. Safavi-Naeini | (参考訳) 超伝導量子回路は、フォールトトレラント量子コンピュータを実現するための有望なハードウェアプラットフォームである。
興味深い量子回路を解析し、スペクトル、コヒーレンス時間、遷移行列要素、結合作用素、固有関数の位相座標表現などの特徴を得る一連の例を示す。 Superconducting quantum circuits are a promising hardware platform for realizing a fault-tolerant quantum computer. Accelerating progress in this field of research demands general approaches and computational tools to analyze and design more complex superconducting circuits. We develop a framework to systematically construct a superconducting quantum circuit's quantized Hamiltonian from its physical description. As is often the case with quantum descriptions of multicoordinate systems, the complexity rises rapidly with the number of variables. Therefore, we introduce a set of coordinate transformations with which we can find bases to diagonalize the Hamiltonian efficiently. Furthermore, we broaden our framework's scope to calculate the circuit's key properties required for optimizing and discovering novel qubits. We implement the methods described in this work in an open-source Python package SQcircuit. In this manuscript, we introduce the reader to the SQcircuit environment and functionality. We show through a series of examples how to analyze a number of interesting quantum circuits and obtain features such as the spectrum, coherence times, transition matrix elements, coupling operators, and the phase coordinate representation of eigenfunctions. | 翻訳日:2023-05-17 20:05:57 公開日:2023-05-15 |
# プレフィックス条件付言語とラベルスーパービジョン Prefix Conditioning Unifies Language and Label Supervision ( http://arxiv.org/abs/2206.01125v2 ) ライセンス: Link先を確認 | Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister | (参考訳) 画像分類データセットは、画像認識モデルの事前学習に使用されている。
実験では、この簡易な手法により、ゼロショット画像認識精度と画像レベルの分布シフトに対するロバスト性が向上することを示す。 Image-classification datasets have been used to pretrain image recognition models. Recently, web-scale image-caption datasets have emerged as a source of powerful pretraining alternative. Image-caption datasets are more "open-domain", containing a wider variety of scene types and vocabulary words than traditional classification datasets, and models trained on these datasets have demonstrated strong performance on few- and zero-shot recognition tasks. When naively unifying image-classification and image-caption datasets, we show that such dataset biases negatively affect pre-training by reducing the generalizability of learned representations and thus jeopardizing zero-shot performance since the unification can tailor the model for the classification dataset, making it vulnerable to the distribution shift from the dataset. In this work, we address the problem by disentangling the dataset bias using prefix tokens that inform a language encoder of the type of the input dataset (e.g., image-classification or caption) at training time. This approach allows the language encoder to share the knowledge from two datasets as well as switch the mode of feature extraction, i.e., image-classification dataset or image-caption dataset tailored mode, where we use image-caption mode in the zero-shot evaluation. Our method is generic and can be easily integrated into existing VL pre-training objectives such as CLIP or UniCL. In experiments, we show that this simple technique improves the performance in zero-shot image recognition accuracy and robustness to the image-level distribution shift. | 翻訳日:2023-05-17 20:05:24 公開日:2023-05-15 |
# 原子間多重終端アハロノフ-ボーム干渉計 Atomtronic multi-terminal Aharonov-Bohm interferometer ( http://arxiv.org/abs/2205.01636v3 ) ライセンス: Link先を確認 | Jonathan Wei Zhong Lau, Koon Siang Gan, Rainer Dumke, Luigi Amico, Leong-Chuan Kwek, Tobias Haug | (参考訳) 本研究では,合成磁束により貫通する3端子リング回路からなる寒冷原子用多機能デバイスについて検討した。
私たちの研究は、量子技術における実用的な応用のための新しい原子トロンデバイスの可能性を開きます。 We study a multi-functional device for cold atoms consisting of a three-terminal ring circuit pierced by a synthetic magnetic flux, where the ring can be continuous or discretized. The flux controls the atomic current through the ring via the Aharonov-Bohm effect. Our device shows a flux-induced transition of reflections from an Andreev-like negative density to positive density. Further, the flux can direct the atomic current into specific output ports, realizing a flexible non-reciprocal switch to connect multiple atomic systems or sense rotations. By changing the flux linearly in time, we convert constant matter wave currents into an AC modulated current. This effect can be used to realize an atomic frequency generator and study fundamental problems related to the Aharonov-Bohm effect. We experimentally demonstrate Bose-Einstein condensation into the light-shaped optical potential of the three-terminal ring. Our work opens up the possibility of novel atomtronic devices for practical applications in quantum technologies. | 翻訳日:2023-05-17 20:04:46 公開日:2023-05-15 |
# Federated Progressive Sparsification (Purge, Merge, Tune)+ Federated Progressive Sparsification (Purge, Merge, Tune)+ ( http://arxiv.org/abs/2204.12430v2 ) ライセンス: Link先を確認 | Dimitris Stripelis, Umang Gupta, Greg Ver Steeg, Jose Luis Ambite | (参考訳) ニューラルネットワークのフェデレートトレーニングを改善するために,プログレッシブウェイトマグニチュードプルーニングに基づくスパシフィケーション戦略であるFedSparsifyを開発した。
我々のスパースモデルは、既存のプルーニングや非プルーニングのベースラインと比較して、同じまたはより良い精度で元のモデルの10分の1に達することができる。 To improve federated training of neural networks, we develop FedSparsify, a sparsification strategy based on progressive weight magnitude pruning. Our method has several benefits. First, since the size of the network becomes increasingly smaller, computation and communication costs during training are reduced. Second, the models are incrementally constrained to a smaller set of parameters, which facilitates alignment/merging of the local models and improved learning performance at high sparsification rates. Third, the final sparsified model is significantly smaller, which improves inference efficiency and optimizes operations latency during encrypted communication. We show experimentally that FedSparsify learns a subnetwork of both high sparsity and learning performance. Our sparse models can reach a tenth of the size of the original model with the same or better accuracy compared to existing pruning and nonpruning baselines. | 翻訳日:2023-05-17 20:04:32 公開日:2023-05-15 |
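The building block behind the progressive weight magnitude pruning mentioned above can be sketched as a single pruning step on a flat list of weights. The function name and the flat-list representation are assumptions for illustration; FedSparsify's actual schedule, merging rule, and tuning phase are not reproduced here.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_to_prune = int(len(weights) * sparsity)
    if n_to_prune == 0:
        return list(weights)
    # Rank positions by absolute value and drop the smallest ones.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_positions = set(order[:n_to_prune])
    return [0.0 if i in pruned_positions else w for i, w in enumerate(weights)]
```

In a progressive scheme, such a step would be repeated with a gradually increasing sparsity target across training rounds, so that weights that were already pruned stay among the smallest magnitudes and remain at zero.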
# ai倫理の物語を広める: インディクティブ・アート・ビュー Broadening AI Ethics Narratives: An Indic Art View ( http://arxiv.org/abs/2204.03789v5 ) ライセンス: Link先を確認 | Ajay Divakaran and Aparna Sridhar and Ramya Srinivasan | (参考訳) 学際的な視点を取り入れることは、人工知能(AI)倫理の強化に不可欠なステップであると考えられている。
(1)倫理的AIアルゴリズムに共感を取り入れることの必要性,(2)倫理的AIシステム設計と開発のためのマルチモーダルデータフォーマットを統合すること,(3)AI倫理を,価値の消滅なしに適応性を促進するための静的な自己完結型フレームワークとしてではなく,動的で多様性があり,累積的かつ共有的なプロセスとして見ること,(4)AI説明可能性を高める一貫した生涯学習の必要性を概説した。 Incorporating interdisciplinary perspectives is seen as an essential step towards enhancing artificial intelligence (AI) ethics. In this regard, the field of arts is perceived to play a key role in elucidating diverse historical and cultural narratives, serving as a bridge across research communities. Most of the works that examine the interplay between the field of arts and AI ethics concern digital artworks, largely exploring the potential of computational tools in being able to surface biases in AI systems. In this paper, we investigate a complementary direction: that of uncovering the unique socio-cultural perspectives embedded in human-made art, which in turn, can be valuable in expanding the horizon of AI ethics. Through semi-structured interviews across sixteen artists, art scholars, and researchers of diverse Indian art forms like music, sculpture, painting, floor drawings, dance, etc., we explore how non-Western ethical abstractions, methods of learning, and participatory practices observed in Indian arts, one of the most ancient yet perpetual and influential art traditions, can shed light on aspects related to ethical AI systems. Through a case study concerning the Indian dance system (i.e. the 'Natyashastra'), we analyze potential pathways towards enhancing ethics in AI systems. Insights from our study outline the need for (1) incorporating empathy in ethical AI algorithms, (2) integrating multimodal data formats for ethical AI system design and development, (3) viewing AI ethics as a dynamic, diverse, cumulative, and shared process rather than as a static, self-contained framework to facilitate adaptability without annihilation of values, and (4) consistent life-long learning to enhance AI accountability. | 翻訳日:2023-05-17 20:03:58 公開日:2023-05-15 |
# 偽ニュース検出のための偽ニュースのフェーキング:プロパガンダによるトレーニングデータ生成 Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation ( http://arxiv.org/abs/2203.05386v2 ) ライセンス: Link先を確認 | Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi and Heng Ji | (参考訳) 近年のニューラルモデルによる偽ニュースの検出の進歩にもかかわらず、その結果は人による偽情報の効果的な検出には適用できない。
実験の結果,プロパニューズで学習した偽ニュース検出器は,2つの公開データセットで3.62~7.69%のf1スコアで人文情報を検出するのに優れていることがわかった。 Despite recent advances in detecting fake news generated by neural models, their results are not readily applicable to effective detection of human-written disinformation. What limits the successful transfer between them is the sizable gap between machine-generated fake news and human-authored ones, including the notable differences in terms of style and underlying intent. With this in mind, we propose a novel framework for generating training examples that are informed by the known styles and strategies of human-authored propaganda. Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles, while also incorporating propaganda techniques, such as appeal to authority and loaded language. In particular, we create a new training dataset, PropaNews, with 2,256 examples, which we release for future use. Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets. | 翻訳日:2023-05-17 20:03:14 公開日:2023-05-15 |
# 混雑依存型大規模避難計画のためのシミュレーション支援最適化 Simulation-Assisted Optimization for Large-Scale Evacuation Planning with Congestion-Dependent Delays ( http://arxiv.org/abs/2209.01535v5 ) ライセンス: Link先を確認 | Kazi Ashik Islam, Da Qi Chen, Madhav Marathe, Henning Mortveit, Samarth Swarup, Anil Vullikanti | (参考訳) 避難計画は災害管理の重要な部分である。
テキサス州ヒューストンのハリス郡を研究地域として用いる。
さらに, MIP-LNS-SIMは, MIP-LNSと比較して, 推定避難完了時間のパーセント誤差が有意に低い。 Evacuation planning is a crucial part of disaster management. However, joint optimization of its two essential components, routing and scheduling, with objectives such as minimizing average evacuation time or evacuation completion time, is a computationally hard problem. To approach it, we present MIP-LNS, a scalable optimization method that combines heuristic search with mathematical optimization and can optimize a variety of objective functions. We also present the method MIP-LNS-SIM, where we combine agent-based simulation with MIP-LNS to estimate delays due to congestion and to find optimized plans that account for such delays. We use Harris County in Houston, Texas, as our study area. We show that, within a given time limit, MIP-LNS finds better solutions than existing methods in terms of three different metrics. However, when congestion-dependent delay is considered, MIP-LNS-SIM outperforms MIP-LNS in multiple performance metrics. In addition, MIP-LNS-SIM has a significantly lower percent error in estimated evacuation completion time compared to MIP-LNS. | 翻訳日:2023-05-17 19:57:03 公開日:2023-05-15 |
# 破滅的忘却に関する一般的な仮定への挑戦 Challenging Common Assumptions about Catastrophic Forgetting ( http://arxiv.org/abs/2207.04543v2 ) ライセンス: Link先を確認 | Timothée Lesort, Oleksiy Ostapenko, Diganta Misra, Md Rifat Arefin, Pau Rodríguez, Laurent Charlin, Irina Rish | (参考訳) 知識を段階的に学習し蓄積できる学習エージェントの構築は、継続学習(CL)研究分野のコア目標である。
そこで我々は,知識蓄積(KA)を調べるための新しいフレームワークSCoLe (Scaling Continual Learning) を提案し,SGDで訓練したDNNでは破滅的忘却の影響が限定的であることを見出した。
各種データ発生頻度の異なるDNNにおけるKAを実験的に検討し,DNNにおける知識蓄積を高めるためのシンプルでスケーラブルな戦略を提案する。 Building learning agents that can progressively learn and accumulate knowledge is the core goal of the continual learning (CL) research field. Unfortunately, training a model on new data usually compromises the performance on past data. In the CL literature, this effect is referred to as catastrophic forgetting (CF). CF has been largely studied, and a plethora of methods have been proposed to address it on short sequences of non-overlapping tasks. In such setups, CF always leads to a quick and significant drop in performance in past tasks. Nevertheless, despite CF, recent work showed that SGD training on linear models accumulates knowledge in a CL regression setup. This phenomenon becomes especially visible when tasks reoccur. We might then wonder if DNNs trained with SGD or any standard gradient-based optimization accumulate knowledge in such a way. Such phenomena would have interesting consequences for applying DNNs to real continual scenarios. Indeed, standard gradient-based optimization methods are significantly less computationally expensive than existing CL algorithms. In this paper, we study the progressive knowledge accumulation (KA) in DNNs trained with gradient-based algorithms in long sequences of tasks with data re-occurrence. We propose a new framework, SCoLe (Scaling Continual Learning), to investigate KA and discover that catastrophic forgetting has a limited effect on DNNs trained with SGD. When trained on long sequences with data sparsely re-occurring, the overall accuracy improves, which might be counter-intuitive given the CF phenomenon. We empirically investigate KA in DNNs under various data occurrence frequencies and propose simple and scalable strategies to increase knowledge accumulation in DNNs. | 翻訳日:2023-05-17 19:55:37 公開日:2023-05-15 |
# 安定化器PEPSにおける測定ベース量子ワイヤの分類 Classification of measurement-based quantum wire in stabilizer PEPS ( http://arxiv.org/abs/2207.00616v3 ) ライセンス: Link先を確認 | Paul Herringer, Robert Raussendorf | (参考訳) 我々は、安定化器対称性を持つ並進不変な2次元テンソルネットワーク状態のクラスを考察し、これを安定化器PEPSと呼ぶ。
さらに、他の12のクラスも識別する。 We consider a class of translation-invariant 2D tensor network states with a stabilizer symmetry, which we call stabilizer PEPS. The cluster state, GHZ state, and states in the toric code belong to this class. We investigate the transmission capacity of stabilizer PEPS for measurement-based quantum wire, and arrive at a complete classification of transmission behaviors. The transmission behaviors fall into 13 classes, one of which corresponds to Clifford quantum cellular automata. In addition, we identify 12 other classes. | 翻訳日:2023-05-17 19:54:52 公開日:2023-05-15 |
# ソーシャルメディアトピック分類のための非パラメトリック時間適応 Non-Parametric Temporal Adaptation for Social Media Topic Classification ( http://arxiv.org/abs/2209.05706v2 ) ライセンス: Link先を確認 | Fatemehsadat Mireshghallah, Nikolai Vogler, Junxian He, Omar Florez, Ahmed El-Kishky, Taylor Berg-Kirkpatrick | (参考訳) 新しいトレンドがオンラインの議論に影響を与え、個人情報がプライバシーの懸念から削除されるにつれ、ユーザー生成のソーシャルメディアデータは絶えず変化している。
そこで本研究では, 逐次的ハッシュタグ予測の課題を通して時間適応を考察し, 単純かつ効果的な解として, 再学習を必要としない非パラメトリック高密度検索手法を提案する。
我々の高密度検索アプローチは、データプライバシ法に従って動的に削除されるユーザデータにも適しており、計算コストと性能損失は無視できる。 User-generated social media data is constantly changing as new trends influence online discussion and personal information is deleted due to privacy concerns. However, most current NLP models are static and rely on fixed training data, which means they are unable to adapt to temporal change -- both test distribution shift and deleted training data -- without frequent, costly re-training. In this paper, we study temporal adaptation through the task of longitudinal hashtag prediction and propose a non-parametric dense retrieval technique, which does not require re-training, as a simple but effective solution. In experiments on a newly collected, publicly available, year-long Twitter dataset exhibiting temporal distribution shift, our method improves by 64.12% over the best parametric baseline without any of its costly gradient-based updating. Our dense retrieval approach is also particularly well-suited to dynamically deleted user data in line with data privacy laws, with negligible computational cost and performance loss. | 翻訳日:2023-05-17 19:44:55 公開日:2023-05-15 |
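The abstract above describes a non-parametric retrieval approach that needs no re-training and tolerates deleted user data. The following is a minimal sketch of that general idea only, not the authors' implementation: the class name `RetrievalClassifier`, the cosine similarity, the majority vote over the top-k neighbours, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class RetrievalClassifier:
    """Hypothetical k-nearest-neighbour labeller over a mutable datastore."""

    def __init__(self):
        self.store = {}  # item_id -> (embedding, label)

    def add(self, item_id, embedding, label):
        # New examples become usable immediately; no gradient update is needed.
        self.store[item_id] = (embedding, label)

    def delete(self, item_id):
        # A deleted example stops influencing predictions right away.
        self.store.pop(item_id, None)

    def predict(self, query, k=3):
        ranked = sorted(
            self.store.values(),
            key=lambda entry: cosine(query, entry[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[:k])
        return votes.most_common(1)[0][0] if votes else None


clf = RetrievalClassifier()
clf.add("t1", [1.0, 0.0], "#worldcup")
clf.add("t2", [0.9, 0.1], "#worldcup")
clf.add("t3", [0.0, 1.0], "#oscars")
before = clf.predict([1.0, 0.05], k=1)
clf.delete("t1")
clf.delete("t2")
after = clf.predict([1.0, 0.05], k=1)
```

Because the datastore is the only state, temporal adaptation and deletion both reduce to dictionary updates; in a real system the embeddings would come from a frozen encoder and the linear scan would be replaced by an approximate nearest-neighbour index.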
# 線形力学系の観測予測における公平性 Fairness in Forecasting of Observations of Linear Dynamical Systems ( http://arxiv.org/abs/2209.05274v4 ) ライセンス: Link先を確認 | Quan Zhou, Jakub Marecek, Robert N. Shorten | (参考訳) 機械学習では、トレーニングデータはしばしば、下層の人間集団の複数のサブグループの振る舞いを捉えている。
保険への応用に動機づけられた偏りのあるデータセットと,よく知られたCOMPASデータセットに対する実験結果から,本手法の有効性が示された。 In machine learning, training data often capture the behaviour of multiple subgroups of some underlying human population. This behaviour can often be modelled as observations of an unknown dynamical system with an unobserved state. When the training data for the subgroups are not controlled carefully, however, under-representation bias arises. To counter under-representation bias, we introduce two natural notions of fairness in time-series forecasting problems: subgroup fairness and instantaneous fairness. These notions extend predictive parity to the learning of dynamical systems. We also show globally convergent methods for the fairness-constrained learning problems using hierarchies of convexifications of non-commutative polynomial optimisation problems. We further show that by exploiting sparsity in the convexifications, we can reduce the run time of our methods considerably. Our empirical results on a biased data set motivated by insurance applications and the well-known COMPAS data set demonstrate the efficacy of our methods. | 翻訳日:2023-05-17 19:44:36 公開日:2023-05-15 |
# 薬物応答予測のためのハイブリッド量子ニューラルネットワーク Hybrid quantum neural network for drug response prediction ( http://arxiv.org/abs/2211.05777v2 ) ライセンス: Link先を確認 | Asel Sagingalieva, Mohammad Kordzanganeh, Nurbolat Kenbayev, Daria Kosichkina, Tatiana Tomashuk, Alexey Melnikov | (参考訳) がんは世界中の死因の1つである。
提案されたハイブリッド量子機械学習モデルは、データ収集が課題であるパーソナライズ医療における問題を解決するために、数千の量子ゲートを持つ深層量子データ効率アルゴリズムへの一歩である。 Cancer is one of the leading causes of death worldwide. It is caused by a variety of genetic mutations, which makes every instance of the disease unique. Since chemotherapy can have extremely severe side effects, each patient requires a personalized treatment plan. Finding the dosages that maximize the beneficial effects of the drugs and minimize their adverse side effects is vital. Deep neural networks automate and improve drug selection. However, they require a lot of data to be trained on. Therefore, there is a need for machine-learning approaches that require less data. Hybrid quantum neural networks were shown to provide a potential advantage in problems where training data availability is limited. We propose a novel hybrid quantum neural network for drug response prediction, based on a combination of convolutional, graph convolutional, and deep quantum neural layers of 8 qubits with 363 layers. We test our model on the reduced Genomics of Drug Sensitivity in Cancer dataset and show that the hybrid quantum model outperforms its classical analog by 15% in predicting IC50 drug effectiveness values. The proposed hybrid quantum machine learning model is a step towards deep quantum data-efficient algorithms with thousands of quantum gates for solving problems in personalized medicine, where data collection is a challenge. | 翻訳日:2023-05-17 19:37:41 公開日:2023-05-15 |
# 蛍光強度三重相関によるab initio空間位相回復 Ab Initio Spatial Phase Retrieval via Fluorescence Intensity Triple Correlations ( http://arxiv.org/abs/2210.03793v2 ) ライセンス: Link先を確認 | Nolan Peard, Kartik Ayyer, and Henry N. Chapman | (参考訳) 非コヒーレントなエミッタからの2次強度相関は、その空間分布のフーリエ変換の絶対値を明らかにできるが、実空間への完全に一般的なフーリエ逆変換を可能にする位相の回復は依然として困難である。
本稿では, 強度三重相関を用いた ab initio 位相回復の一般的な手法について述べる。
この研究により、フーリエ逆変換を直接実行し、遠方場の強度相関のみから、独立したエミッタの任意の配列の画像を再構成することが可能になった。 Second-order intensity correlations from incoherent emitters can reveal the Fourier transform modulus of their spatial distribution, but retrieving the phase to enable completely general Fourier inversion to real space remains challenging. Phase retrieval via the third-order intensity correlations has relied on special emitter configurations which simplified an unaddressed sign problem in the computation. Without a complete treatment of this sign problem, the general case of retrieving the Fourier phase from a truly arbitrary configuration of emitters is not possible. In this paper, a general method for ab initio phase retrieval via the intensity triple correlations is described. Simulations demonstrate accurate phase retrieval for clusters of incoherent emitters which could be applied to imaging stars or fluorescent atoms and molecules. With this work, it is now finally tractable to perform Fourier inversion directly and reconstruct images of arbitrary arrays of independent emitters via far-field intensity correlations alone. | 翻訳日:2023-05-17 19:35:16 公開日:2023-05-15 |
# 言語間転移のための驚くほど簡単なラベル投影 Frustratingly Easy Label Projection for Cross-lingual Transfer ( http://arxiv.org/abs/2211.15613v4 ) ライセンス: Link先を確認 | Yang Chen, Chao Jiang, Alan Ritter, Wei Xu | (参考訳) 訓練データを多くの言語に翻訳することは、言語間転移を改善するための実用的な解決策として現れてきた。
近年, 原文中のラベル付きスパンの周囲に特別なマーカーを挿入することで翻訳と投影を同時に行う, 単純なmark-then-translate手法がいくつか試みられている。
すべてのコードとデータを公開します。 Translating training data into many languages has emerged as a practical solution for improving cross-lingual transfer. For tasks that involve span-level annotations, such as information extraction or question answering, an additional label projection step is required to map annotated spans onto the translated texts. Recently, a few efforts have utilized a simple mark-then-translate method to jointly perform translation and projection by inserting special markers around the labeled spans in the original sentence. However, as far as we are aware, no empirical analysis has been conducted on how this approach compares to traditional annotation projection based on word alignment. In this paper, we present an extensive empirical study across 57 languages and three tasks (QA, NER, and Event Extraction) to evaluate the effectiveness and limitations of both methods, filling an important gap in the literature. Experimental results show that our optimized version of mark-then-translate, which we call EasyProject, is easily applied to many languages and works surprisingly well, outperforming the more complex word alignment-based methods. We analyze several key factors that affect the end-task performance, and show EasyProject works well because it can accurately preserve label span boundaries after translation. We will publicly release all our code and data. | 翻訳日:2023-05-17 19:27:35 公開日:2023-05-15 |
# 超大語彙を持つ大規模事前学習モデル:ヘブライ語のBERTモデルの対比分析と、その全てを上回る新しいモデル Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All ( http://arxiv.org/abs/2211.15199v2 ) ライセンス: Link先を確認 | Eylon Gueta, Avi Shmidman, Shaltiel Shmidman, Cheyn Shmuel Shmidman, Joshua Guedalia, Moshe Koppel, Dan Bareket, Amit Seker, Reut Tsarfaty | (参考訳) 我々は,従来の標準的なヘブライ語PLMよりもはるかに大きな語彙(128K項目)を用いた,現代ヘブライ語のための新しい事前学習言語モデル(PLM)であるAlephBERTGimmelを提案する。
我々は,従来のヘブライ語 PLM (mBERT, heBERT, AlephBERT) に対して,このモデルを対照的に解析し,より大きな語彙がタスク性能に与える影響を評価する。
総じて、この新しいモデルは、Morphological Segmentation、POS Tagging、Full Morphological Analysis、NER、Sentiment Analysisを含む、利用可能なすべてのヘブライ語ベンチマークで新たなSOTAを達成した。
制限のない使用のために、新しいモデルを公開しています。 We present a new pre-trained language model (PLM) for modern Hebrew, termed AlephBERTGimmel, which employs a much larger vocabulary (128K items) than standard Hebrew PLMs before. We perform a contrastive analysis of this model against all previous Hebrew PLMs (mBERT, heBERT, AlephBERT) and assess the effects of larger vocabularies on task performance. Our experiments show that larger vocabularies lead to fewer splits, and that reducing splits is better for model performance, across different tasks. All in all this new model achieves new SOTA on all available Hebrew benchmarks, including Morphological Segmentation, POS Tagging, Full Morphological Analysis, NER, and Sentiment Analysis. Subsequently we advocate for PLMs that are larger not only in terms of number of layers or training data, but also in terms of their vocabulary. We release the new model publicly for unrestricted use. | 翻訳日:2023-05-17 19:27:12 公開日:2023-05-15 |
# c-TPE:高コストなハイパーパラメータ最適化のための不等式制約付き木構造Parzen推定器 c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for Expensive Hyperparameter Optimization ( http://arxiv.org/abs/2211.14411v3 ) ライセンス: Link先を確認 | Shuhei Watanabe, Frank Hutter | (参考訳) ハイパーパラメータ最適化(HPO)は、ディープラーニングアルゴリズムの高い性能に不可欠であり、現実世界のアプリケーションは、性能要件に加えて、メモリ使用量やレイテンシといった制約を課すことが多い。
本研究では,広く用いられている汎用的なベイズ最適化手法である木構造Parzen推定器(tree-structured Parzen estimator, TPE)を拡張し,これらの制約を扱えるようにした制約付きTPE (c-TPE) を提案する。
ベースラインが存在しないため,ハード制約付き最適化への本手法の適用可能性については付録Dでのみ論じる。 Hyperparameter optimization (HPO) is crucial for strong performance of deep learning algorithms and real-world applications often impose some constraints, such as memory usage, or latency on top of the performance requirement. In this work, we propose constrained TPE (c-TPE), an extension of the widely-used versatile Bayesian optimization method, tree-structured Parzen estimator (TPE), to handle these constraints. Our proposed extension goes beyond a simple combination of an existing acquisition function and the original TPE, and instead includes modifications that address issues that cause poor performance. We thoroughly analyze these modifications both empirically and theoretically, providing insights into how they effectively overcome these challenges. In the experiments, we demonstrate that c-TPE exhibits the best average rank performance among existing methods with statistical significance on 81 expensive HPO with inequality constraints. Due to the lack of baselines, we only discuss the applicability of our method to hard-constrained optimization in Appendix D. | 翻訳日:2023-05-17 19:26:36 公開日:2023-05-15 |
# ウェブスクレイプしたマルチモーダルデータで事前学習された対照的言語・視覚AIモデルは性的対象化バイアスを示す Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias ( http://arxiv.org/abs/2212.11261v2 ) ライセンス: Link先を確認 | Robert Wolfe, Yiwei Yang, Bill Howe, Aylin Caliskan | (参考訳) Contrastive Language-Image Pretraining (CLIP) の目的関数を用いてウェブスクレイプデータで訓練された9つの言語・視覚AIモデルについて、心理学者が研究してきたバイアス、すなわち少女と女性の性的対象化の証拠があるかを評価する。性的対象化は、感情のような人間的特徴が無視され、その人が身体として扱われるときに生じる。
埋め込み関連テスト (EATs) は怒り (d > 0.80) と悲しみ (d > 0.50) の両方に対して有意な効果量を返し、完全に服を着た被験者の画像を感情と関連付ける。
自動画像キャプション生成器(Antarctic Captions)が感情を示す単語を含める頻度は、部分的に服を着た女性の画像では、完全に服を着た女性の画像の50%未満である。
第4の実験では、"a [age] year old girl"というプロンプトが、VQGAN-CLIPとStable Diffusionにおいて最大73%の割合で性的な画像(NSFW分類器による判定)を生成する。
この証拠は、ウェブスクレイプで訓練された言語・視覚AIモデルが、下流のアプリケーションに伝播する性的対象化のバイアスを学習することを示している。 Nine language-vision AI models trained on web scrapes with the Contrastive Language-Image Pretraining (CLIP) objective are evaluated for evidence of a bias studied by psychologists: the sexual objectification of girls and women, which occurs when a person's human characteristics, such as emotions, are disregarded and the person is treated as a body. We replicate three experiments in psychology quantifying sexual objectification and show that the phenomena persist in AI. A first experiment uses standardized images of women from the Sexual OBjectification and EMotion Database, and finds that human characteristics are disassociated from images of objectified women: the model's recognition of emotional state is mediated by whether the subject is fully or partially clothed. Embedding association tests (EATs) return significant effect sizes for both anger (d > 0.80) and sadness (d > 0.50), associating images of fully clothed subjects with emotions. GRAD-CAM saliency maps highlight that CLIP gets distracted from emotional expressions in objectified images. A second experiment measures the effect in a representative application: an automatic image captioner (Antarctic Captions) includes words denoting emotion less than 50% as often for images of partially clothed women as for images of fully clothed women. A third experiment finds that images of female professionals (scientists, doctors, executives) are likely to be associated with sexual descriptions relative to images of male professionals. A fourth experiment shows that a prompt of "a [age] year old girl" generates sexualized images (as determined by an NSFW classifier) up to 73% of the time for VQGAN-CLIP and Stable Diffusion; the corresponding rate for boys never surpasses 9%. The evidence indicates that language-vision AI models trained on web scrapes learn biases of sexual objectification, which propagate to downstream applications. | 翻訳日:2023-05-17 19:19:39 公開日:2023-05-15 |
# 適応型ポリトープによるニューラルネットワーク制御システムの到達可能性自動解析 Automated Reachability Analysis of Neural Network-Controlled Systems via Adaptive Polytopes ( http://arxiv.org/abs/2212.07553v3 ) ライセンス: Link先を確認 | Taha Entesari, Mahyar Fazlyab | (参考訳) 力学系の到達可能集合を過大近似することは、安全性検証とロバスト制御合成における基本的な問題である。
本稿では,ニューラルネットワーク制御による線形システムの到達可能性解析における提案手法の有用性について述べる。 Over-approximating the reachable sets of dynamical systems is a fundamental problem in safety verification and robust control synthesis. The representation of these sets is a key factor that affects the computational complexity and the approximation error. In this paper, we develop a new approach for over-approximating the reachable sets of neural network dynamical systems using adaptive template polytopes. We use the singular value decomposition of linear layers along with the shape of the activation functions to adapt the geometry of the polytopes at each time step to the geometry of the true reachable sets. We then propose a branch-and-bound method to compute accurate over-approximations of the reachable sets by the inferred templates. We illustrate the utility of the proposed approach in the reachability analysis of linear systems driven by neural network controllers. | 翻訳日:2023-05-17 19:17:10 公開日:2023-05-15 |
# D-Adaptationによる学習率不要の学習 Learning-Rate-Free Learning by D-Adaptation ( http://arxiv.org/abs/2301.07733v4 ) ライセンス: Link先を確認 | Aaron Defazio and Konstantin Mishchenko | (参考訳) D-Adaptationは学習率を自動的に設定する手法であり、バックトラッキングや直線探索を行わず、ステップごとの追加の関数値評価や勾配評価も必要とせずに、凸リプシッツ関数の最小化における最適な収束率を漸近的に達成する。
オープンソース実装が利用可能だ。 D-Adaptation is an approach to automatically setting the learning rate which asymptotically achieves the optimal rate of convergence for minimizing convex Lipschitz functions, with no back-tracking or line searches, and no additional function value or gradient evaluations per step. Our approach is the first hyper-parameter free method for this class without additional multiplicative log factors in the convergence rate. We present extensive experiments for SGD and Adam variants of our method, where the method automatically matches hand-tuned learning rates across more than a dozen diverse machine learning problems, including large-scale vision and language problems. An open-source implementation is available. | 翻訳日:2023-05-17 19:08:32 公開日:2023-05-15 |
# ViT-AE++:自己教師型医用画像表現のための視覚変換器オートエンコーダの改良 ViT-AE++: Improving Vision Transformer Autoencoder for Self-supervised Medical Image Representations ( http://arxiv.org/abs/2301.07382v2 ) ライセンス: Link先を確認 | Chinmay Prabhakar, Hongwei Bran Li, Jiancheng Yang, Suprosana Shit, Benedikt Wiestler, and Bjoern Menze | (参考訳) 自己教師付き学習は、アノテーションなしでデータからデータ駆動表現を学ぶことで注目を集めている。
He et al. (2021) による視覚トランスフォーマーベースのオートエンコーダ (ViT-AE) は、パッチマスキング戦略を用いて有意義な潜在空間を学習する。
コードはこちら。 https://github.com/chinmay5/vit_ae_plus_plus.git。 Self-supervised learning has attracted increasing attention as it learns data-driven representation from data without annotations. Vision transformer-based autoencoder (ViT-AE) by He et al. (2021) is a recent self-supervised learning technique that employs a patch-masking strategy to learn a meaningful latent space. In this paper, we focus on improving ViT-AE (nicknamed ViT-AE++) for a more effective representation of 2D and 3D medical images. We propose two new loss functions to enhance the representation during training. The first loss term aims to improve self-reconstruction by considering the structured dependencies and indirectly improving the representation. The second loss term leverages contrastive loss to optimize the representation from two randomly masked views directly. We extended ViT-AE++ to a 3D fashion for volumetric medical images as an independent contribution. We extensively evaluate ViT-AE++ on both natural images and medical images, demonstrating consistent improvement over vanilla ViT-AE and its superiority over other contrastive learning approaches. Codes are here: https://github.com/chinmay5/vit_ae_plus_plus.git. | 翻訳日:2023-05-17 19:08:17 公開日:2023-05-15 |
# Pic2Word:ゼロショット合成画像検索のための単語への画像マッピング Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval ( http://arxiv.org/abs/2302.03084v2 ) ライセンス: Link先を確認 | Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister | (参考訳) 合成画像検索(cir)では、ユーザはクエリ画像をテキストと組み合わせ、目的とするターゲットを記述する。
本研究では,ラベル付き三重項学習を必要とせずにCIRモデルを構築することを目的とした,Zero-Shot Composed Image Retrieval (ZS-CIR) という重要な課題について検討する。
コードはhttps://github.com/google-research/composed_image_retrievalで公開される予定だ。 In Composed Image Retrieval (CIR), a user combines a query image with text to describe their intended target. Existing methods rely on supervised learning of CIR models using labeled triplets consisting of the query image, text specification, and the target image. Labeling such triplets is expensive and hinders broad applicability of CIR. In this work, we propose to study an important task, Zero-Shot Composed Image Retrieval (ZS-CIR), whose goal is to build a CIR model without requiring labeled triplets for training. To this end, we propose a novel method, called Pic2Word, that requires only weakly labeled image-caption pairs and unlabeled image datasets to train. Unlike existing supervised CIR models, our model trained on weakly labeled or unlabeled datasets shows strong generalization across diverse ZS-CIR tasks, e.g., attribute editing, object composition, and domain conversion. Our approach outperforms several supervised CIR methods on the common CIR benchmark, CIRR and Fashion-IQ. Code will be made publicly available at https://github.com/google-research/composed_image_retrieval. | 翻訳日:2023-05-17 18:59:38 公開日:2023-05-15 |
# ChatGPTを用いたゼロショット臨床エンティティ認識 Zero-shot Clinical Entity Recognition using ChatGPT ( http://arxiv.org/abs/2303.16416v2 ) ライセンス: Link先を確認 | Yan Hu, Iqra Ameer, Xu Zuo, Xueqing Peng, Yujia Zhou, Zehan Li, Yiming Li, Jianfu Li, Xiaoqian Jiang, Hua Xu | (参考訳) 本研究では,2010 年の i2b2 チャレンジで定義された臨床固有表現認識タスクに対して,OpenAI が開発した大規模言語モデル ChatGPT の可能性を,2 つの異なるプロンプト戦略を用いたゼロショット設定で検討した。
その結果,ChatGPT はゼロショット設定で GPT-3 を上回り,F1 スコアは厳密一致で 0.418 (vs. 0.250),緩和一致で 0.620 (vs. 0.480) であった。
ChatGPTの性能は、教師付きBioClinicalBERTモデルよりも依然として低かった(緩和一致のF1スコアは0.620 vs. 0.888)が、本研究は、アノテーションを必要としないゼロショット設定での臨床NERタスクに対するChatGPTの大きな可能性を示した。 In this study, we investigated the potential of ChatGPT, a large language model developed by OpenAI, for the clinical named entity recognition task defined in the 2010 i2b2 challenge, in a zero-shot setting with two different prompt strategies. We compared its performance with GPT-3 in a similar zero-shot setting, as well as a fine-tuned BioClinicalBERT model using a set of synthetic clinical notes from MTSamples. Our findings revealed that ChatGPT outperformed GPT-3 in the zero-shot setting, with F1 scores of 0.418 (vs. 0.250) and 0.620 (vs. 0.480) for exact- and relaxed-matching, respectively. Moreover, prompts affected ChatGPT's performance greatly, with relaxed-matching F1 scores of 0.628 vs. 0.541 for two different prompt strategies. Although ChatGPT's performance was still lower than that of the supervised BioClinicalBERT model (i.e., relaxed-matching F1 scores of 0.620 vs. 0.888), our study demonstrates the great potential of ChatGPT for clinical NER tasks in a zero-shot setting, which is much more appealing as it does not require any annotation. | 翻訳日:2023-05-17 18:41:29 公開日:2023-05-15 |
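The abstract above reports separate F1 scores for exact and relaxed matching. As a rough illustration of why the two numbers can differ, here is a minimal sketch. It assumes that exact matching requires identical span boundaries and entity type, and that relaxed matching only requires overlapping spans of the same type; the abstract does not spell out the scoring rules, so the function `span_f1` and these definitions are hypothetical and not the official i2b2 evaluation script.

```python
def span_f1(gold, pred, relaxed=False):
    """F1 over entity spans given as (start, end, type), with end exclusive."""

    def match(a, b):
        if a[2] != b[2]:
            return False
        if relaxed:
            # Any character overlap between the two spans counts as a hit.
            return a[0] < b[1] and b[0] < a[1]
        return a[0] == b[0] and a[1] == b[1]

    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = {(0, 2, "problem"), (5, 7, "test")}
pred = {(0, 3, "problem"), (5, 7, "test")}  # first span is one token too long
exact_score = span_f1(gold, pred)
relaxed_score = span_f1(gold, pred, relaxed=True)
```

In this toy case the boundary error on the first span halves the exact score while leaving the relaxed score untouched, which is the kind of gap the reported 0.418 vs. 0.620 figures reflect.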
# Mind the Backbone:ロバストオブジェクト検出のためのバックボーン歪みの最小化 Mind the Backbone: Minimizing Backbone Distortion for Robust Object Detection ( http://arxiv.org/abs/2303.14744v2 ) ライセンス: Link先を確認 | Kuniaki Saito, Donghyun Kim, Piotr Teterwak, Rogerio Feris, Kate Saenko | (参考訳) ドメインシフトにロバストなオブジェクト検出器の構築は、現実世界のアプリケーションにとって非常に重要です。
以前のアプローチでは、事前トレーニングされたバックボーンを微調整し、それをin-distribution (id)データにオーバーフィットさせ、out-of-distribution (ood) 一般化に有用な特徴を歪めるリスクを負う。
本稿では,特徴の歪みに対するバックボーンの脆弱性を測る指標としてRelative Gradient Norm (RGN)を用いることを提案し,高いRGNがOOD性能の低下と実際に相関していることを示す。
RGNの分析は興味深い結果をもたらす: 一部のバックボーンは微調整中にOODロバスト性を失うが、他のバックボーンは、そのアーキテクチャがパラメータの初期モデルからの過度な変化を防ぐため、ロバスト性を獲得する。
コードはhttps://github.com/visionlearninggroup/mind_backで入手できる。 Building object detectors that are robust to domain shifts is critical for real-world applications. Prior approaches fine-tune a pre-trained backbone and risk overfitting it to in-distribution (ID) data and distorting features useful for out-of-distribution (OOD) generalization. We propose to use Relative Gradient Norm (RGN) as a way to measure the vulnerability of a backbone to feature distortion, and show that high RGN is indeed correlated with lower OOD performance. Our analysis of RGN yields interesting findings: some backbones lose OOD robustness during fine-tuning, but others gain robustness because their architecture prevents the parameters from changing too much from the initial model. Given these findings, we present recipes to boost OOD robustness for both types of backbones. Specifically, we investigate regularization and architectural choices for minimizing gradient updates so as to prevent the tuned backbone from losing generalizable features. Our proposed techniques complement each other and show substantial improvements over baselines on diverse architectures and datasets. Code is available at https://github.com/VisionLearningGroup/mind_back. | 翻訳日:2023-05-17 18:40:25 公開日:2023-05-15 |
# 合成体験リプレイ Synthetic Experience Replay ( http://arxiv.org/abs/2303.06614v2 ) ライセンス: Link先を確認 | Cong Lu, Philip J. Ball, Yee Whye Teh, Jack Parker-Holder | (参考訳) 過去10年の主なテーマは、大規模なニューラルネットワークと大規模なデータセットを組み合わせることで、素晴らしい結果が得られることだ。
deep reinforcement learning(rl)では、このパラダイムは経験リプレイを通じて一般的に実現され、過去の経験のデータセットがポリシやバリュー関数のトレーニングに使用される。
本研究では,生成モデルにおける最近の大きな進歩を活かし,エージェントが収集した経験を柔軟にアップサンプリングする拡散ベースのアプローチであるSynthetic Experience Replay (SynthER) を提案する。
最後に、コードを https://github.com/conglu1997/SynthER でオープンソース化します。 A key theme in the past decade has been that when large neural networks and large datasets combine they can produce remarkable results. In deep reinforcement learning (RL), this paradigm is commonly made possible through experience replay, whereby a dataset of past experiences is used to train a policy or value function. However, unlike in supervised or self-supervised learning, an RL agent has to collect its own data, which is often limited. Thus, it is challenging to reap the benefits of deep learning, and even small neural networks can overfit at the start of training. In this work, we leverage the tremendous recent progress in generative modeling and propose Synthetic Experience Replay (SynthER), a diffusion-based approach to flexibly upsample an agent's collected experience. We show that SynthER is an effective method for training RL agents across offline and online settings, in both proprioceptive and pixel-based environments. In offline settings, we observe drastic improvements when upsampling small offline datasets and see that additional synthetic data also allows us to effectively train larger networks. Furthermore, SynthER enables online agents to train with a much higher update-to-data ratio than before, leading to a significant increase in sample efficiency, without any algorithmic changes. We believe that synthetic training data could open the door to realizing the full potential of deep learning for replay-based RL algorithms from limited data. Finally, we open-source our code at https://github.com/conglu1997/SynthER. | 翻訳日:2023-05-17 18:39:13 公開日:2023-05-15 |
# シナジー関数の分散: 機械学習説明可能性のためのゲーム理論的相互作用手法の統合 Distributing Synergy Functions: Unifying Game-Theoretic Interaction Methods for Machine-Learning Explainability ( http://arxiv.org/abs/2305.03100v2 ) ライセンス: Link先を確認 | Daniel Lundstrom and Meisam Razaviyayn | (参考訳) ディープラーニングはコンピュータビジョンから自然言語処理まで、機械学習の多くの領域に革命をもたらしたが、これらの高性能モデルは一般に「ブラックボックス」である。
したがって、コミュニティは属性とインタラクションメソッドを開発し、採用する際に、目標とコンテキストを特定する必要がある。 Deep learning has revolutionized many areas of machine learning, from computer vision to natural language processing, but these high-performance models are generally "black box." Explaining such models would improve transparency and trust in AI-powered decision making and is necessary for understanding other practical needs such as robustness and fairness. A popular means of enhancing model transparency is to quantify how individual inputs contribute to model outputs (called attributions) and the magnitude of interactions between groups of inputs. A growing number of these methods import concepts and results from game theory to produce attributions and interactions. This work presents a unifying framework for game-theory-inspired attribution and $k^\text{th}$-order interaction methods. We show that, given modest assumptions, a unique full account of interactions between features, called synergies, is possible in the continuous input setting. We identify how various methods are characterized by their policy of distributing synergies. We also demonstrate that gradient-based methods are characterized by their actions on monomials, a type of synergy function, and introduce unique gradient-based methods. We show that the combination of various criteria uniquely defines the attribution/interaction methods. Thus, the community needs to identify goals and contexts when developing and employing attribution and interaction methods. | 翻訳日:2023-05-17 18:21:20 公開日:2023-05-15 |
# LLT: 線形法則に基づく特徴空間変換のためのRパッケージ LLT: An R package for Linear Law-based Feature Space Transformation ( http://arxiv.org/abs/2304.14211v2 ) ライセンス: Link先を確認 | Marcell T. Kurbucz, P\'eter P\'osfay, Antal Jakov\'ac | (参考訳) 線形法則に基づく特徴空間変換(LLT)アルゴリズムの目標は、単変量および多変量時系列の分類を支援することである。
LLT Rパッケージと適切なデータ構造を持つサンプルデータセットはGitHubで公開されている。 The goal of the linear law-based feature space transformation (LLT) algorithm is to assist with the classification of univariate and multivariate time series. The presented R package, called LLT, implements this algorithm in a flexible yet user-friendly way. This package first splits the instances into training and test sets. It then utilizes time-delay embedding and spectral decomposition techniques to identify the governing patterns (called linear laws) of each input sequence (initial feature) within the training set. Finally, it applies the linear laws of the training set to transform the initial features of the test set. These steps are performed by three separate functions called trainTest, trainLaw, and testTrans. Their application requires a predefined data structure; however, for fast calculation, they use only built-in functions. The LLT R package and a sample dataset with the appropriate data structure are publicly available on GitHub. | 翻訳日:2023-05-17 18:19:51 公開日:2023-05-15 |
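Time-delay embedding, one of the techniques the abstract names, can be sketched in a few lines. This is an illustration in Python, not the LLT R package's API: the function name `time_delay_embed` and its parameters `dim` and `lag` are assumptions, and the spectral decomposition step that extracts the linear laws is not shown.

```python
def time_delay_embed(series, dim, lag=1):
    """Return the rows [x_i, x_{i+lag}, ..., x_{i+(dim-1)*lag}] of a series."""
    span = (dim - 1) * lag  # distance between the first and last entry of a row
    return [
        [series[i + k * lag] for k in range(dim)]
        for i in range(len(series) - span)
    ]


series = [1, 2, 3, 4, 5]
embedded = time_delay_embed(series, dim=3)
```

Each row is a short window of the input sequence; stacking the rows gives the matrix on which a decomposition could then be run.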
# 不均衡ラベルサンプル分布を用いたファッション検出のためのデータ効率向上 Data Efficient Training with Imbalanced Label Sample Distribution for Fashion Detection ( http://arxiv.org/abs/2305.04379v3 ) ライセンス: Link先を確認 | Xin Shen, Praful Agrawal, Zhongwei Cheng | (参考訳) マルチラベル分類モデルは、視覚に基づくラベル予測や言語に基づく感情分類など、Eコマースに幅広い応用がある。
ファッション業界で人気のファッション属性タイプであるスリーブタイプとアーチタイプを用いた新しい重み付け機構の堅牢性をさらに評価した。 Multi-label classification models have a wide range of applications in E-commerce, including visual-based label predictions and language-based sentiment classifications. A major challenge in achieving satisfactory performance for these tasks in the real world is the notable imbalance in data distribution. For instance, in fashion attribute detection, there may be only six 'puff sleeve' clothes among 1000 products in most E-commerce fashion catalogs. To address this issue, we explore more data-efficient model training techniques rather than acquiring a huge amount of annotations to collect sufficient samples, which is neither economic nor scalable. In this paper, we propose a state-of-the-art weighted objective function to boost the performance of deep neural networks (DNNs) for multi-label classification with long-tailed data distribution. Our experiments involve image-based attribute classification of fashion apparels, and the results demonstrate favorable performance for the new weighting method compared to non-weighted and inverse-frequency-based weighting mechanisms. We further evaluate the robustness of the new weighting mechanism using two popular fashion attribute types in today's fashion industry: sleevetype and archetype. | 翻訳日:2023-05-17 18:10:05 公開日:2023-05-15 |
# 自律型GIS:次世代AI搭載GIS Autonomous GIS: the next-generation AI-powered GIS ( http://arxiv.org/abs/2305.06453v2 ) ライセンス: Link先を確認 | Zhenlong Li, Huan Ning | (参考訳) ChatGPTのような大規模言語モデル(LLM)は、人間の自然言語を強く理解し、推論、創造的記述、コード生成、翻訳、情報検索など様々な分野で研究され、応用されてきた。
我々は,Python 環境で GPT-4 API を用いた LLM-Geo というプロトタイプシステムを開発した。
我々は,GIScienceコミュニティに対して,自律型GISの研究・開発により多くの努力を払って,空間分析をより容易に,より早く,よりアクセスしやすいものにすることを提唱する。 Large Language Models (LLMs), such as ChatGPT, demonstrate a strong understanding of human natural language and have been explored and applied in various fields, including reasoning, creative writing, code generation, translation, and information retrieval. By adopting LLM as the reasoning core, we introduce Autonomous GIS as an AI-powered geographic information system (GIS) that leverages the LLM's general abilities in natural language understanding, reasoning and coding for addressing spatial problems with automatic spatial data collection, analysis and visualization. We envision that autonomous GIS will need to achieve five autonomous goals including self-generating, self-organizing, self-verifying, self-executing, and self-growing. We developed a prototype system called LLM-Geo using GPT-4 API in a Python environment, demonstrating what an autonomous GIS looks like and how it delivers expected results without human intervention using two case studies. For both case studies, LLM-Geo returned accurate results, including aggregated numbers, graphs, and maps, significantly reducing manual operation time. Although still lacking several important modules such as logging and code testing, LLM-Geo demonstrates a potential path towards next-generation AI-powered GIS. We advocate for the GIScience community to dedicate more effort to the research and development of autonomous GIS, making spatial analysis easier, faster, and more accessible to a broader audience. | 翻訳日:2023-05-17 18:01:53 公開日:2023-05-15 |
# SoGAR:自己監督型時空間注意に基づく社会集団活動認識 SoGAR: Self-supervised Spatiotemporal Attention-based Social Group Activity Recognition ( http://arxiv.org/abs/2305.06310v2 ) ライセンス: Link先を確認 | Naga VS Raviteja Chappa, Pha Nguyen, Alexander H Nelson, Han-Seok Seo, Xin Li, Page Daniel Dobbs, Khoa Luu | (参考訳) 本稿では,未ラベル映像データを効果的に活用できる自己教師型トランスフォーマーネットワークを用いた社会集団活動認識(SoGAR)への新たなアプローチを提案する。
提案手法は,JRDB-PAR,NBA,Volleyballの3つのグループ活動認識ベンチマークにおいて,F1スコア,MCA,MPCAの3指標を上回り,最先端の成果を得た。 This paper introduces a novel approach to Social Group Activity Recognition (SoGAR) using a self-supervised Transformer network that can effectively utilize unlabeled video data. To extract spatio-temporal information, we created local and global views with varying frame rates. Our self-supervised objective ensures that features extracted from contrasting views of the same video are consistent across spatio-temporal domains. Our proposed approach is efficient in using transformer-based encoders to alleviate the weakly supervised setting of group activity recognition. By leveraging the benefits of transformer models, our approach can model long-term relationships along spatio-temporal dimensions. Our proposed SoGAR method achieved state-of-the-art results on three group activity recognition benchmarks, namely JRDB-PAR, NBA, and Volleyball datasets, surpassing the current numbers in terms of F1-score, MCA, and MPCA metrics. | 翻訳日:2023-05-17 18:00:43 公開日:2023-05-15 |
# Semantic Embedded Deep Neural Network: マルチラベル画像分類性能向上のためのジェネリックアプローチ Semantic Embedded Deep Neural Network: A Generic Approach to Boost Multi-Label Image Classification Performance ( http://arxiv.org/abs/2305.05228v2 ) ライセンス: Link先を確認 | Xin Shen, Xiaonan Zhao, Rui Luo | (参考訳) 細粒度のマルチラベル分類モデルは、ファッション属性の検出からブランド認識まで、視覚的なラベル予測など、amazonのプロダクション機能に幅広く応用されている。
我々は,全ラベルのAUCスコアにおいて,ベースライン手法と比較して平均15.27%の相対的改善を観測した。
結果は我々のアプローチに好成績を示した。 Fine-grained multi-label classification models have broad applications in Amazon production features, such as visual-based label predictions ranging from fashion attribute detection to brand recognition. One challenge in achieving satisfactory performance for those classification tasks in the real world is the cluttered visual background, whose irrelevant pixels make it hard for the model to focus on the region of interest and base its prediction on that region. In this paper, we introduce a generic semantic-embedding deep neural network that applies spatially aware semantic features and incorporates a channel-wise attention-based model, leveraging localization guidance to boost model performance for multi-label prediction. We observed an average relative improvement of 15.27% in terms of AUC score across all labels compared to the baseline approach. The core experiments and ablation studies involve multi-label fashion attribute classification performed on Instagram fashion apparel images. We compared the model performances among our approach, the baseline approach, and 3 alternative approaches to leveraging semantic features. Results show favorable performance for our approach. | 翻訳日:2023-05-17 17:58:56 公開日:2023-05-15 |
# 超伝導量子ビットにおける量子ゲートの誤差源 Error Sources of Quantum Gates in Superconducting Qubits ( http://arxiv.org/abs/2305.08916v1 ) ライセンス: Link先を確認 | Miha Papi\v{c}, Adrian Auer, In\'es de Vega | (参考訳) トランスモンベースの超伝導量子ビットアーキテクチャは、大規模量子計算の実現に最も期待できる候補の1つであるため、実装された量子ゲートにおけるエラーの主な原因は何かを知ることが重要である。
さらに,実験結果の少ない一連のゲートの不確実性に対する各ノイズ源の寄与を抽出できる学習ベースフレームワークを提供する。 As transmon-based superconducting qubit architectures are one of the most promising candidates for the realization of large-scale quantum computation, it is crucial to know the main sources of error in the implemented quantum gates. In this work we make a realistic assessment of the contributions of physical error sources to the infidelities of both single and two-qubit gates, where we focus on the non-adiabatic implementation of the CZ gate with tunable couplers. We consider all relevant noise sources, including non-Markovian noise, electronics imperfections and the effect of tunable couplers on the error of the computation. Furthermore, we provide a learning-based framework that allows one to extract the contribution of each noise source to the infidelity of a series of gates with a small number of experimental measurements. | 翻訳日:2023-05-17 17:43:16 公開日:2023-05-15 |
# キラルエッジ状態のトポロジー保護と非線形干渉 Non-Linear Interference Challenging Topological Protection of Chiral Edge States ( http://arxiv.org/abs/2305.08912v1 ) ライセンス: Link先を確認 | Benjamin Michen, Jan Carl Budich | (参考訳) 我々は,カイラルエッジモードで伝播するウェーブパケットのトポロジカル保護の概念に挑戦する非線形散乱効果について報告する。
まず, 強度依存性の光学指標を用いて非線形性が設計されているフォトニック結晶設定法について, 実験結果から予測を検証できる2つの物理プラットフォームを提案する。
第2に、非線形グロス・ピタエフスキー方程式によって制御される光学ハニカム格子内の低温原子のボース・アインシュタイン凝縮は、多体相互作用を効果的に説明できる。 We report on a non-linear scattering effect that challenges the notion of topological protection for wave packets propagating in chiral edge modes. Specifically, in a Floquet topological system with a non-linear potential, we demonstrate how a wave packet propagating in a chiral edge mode may be irreversibly deflected by scattering off a localized wave-packet, or pass the collision region virtually unaffected in an approximately linear fashion. An experimentally accessible knob to tune between those two scenarios is provided by the relative phase between the involved wave-packets. This genuinely non-linear interference phenomenon is in stark contrast to linear scattering off a static impurity, which cannot destroy a topological edge state. Besides corroborating our findings with numerically exact simulations, we propose two physical platforms where our predictions may be verified with state of the art experimental techniques: First, a photonic crystal setting where non-linearity has been engineered via an intensity-dependent optical index. Second, a Bose-Einstein condensate of cold atoms in an optical Honeycomb lattice governed by a non-linear Gross-Pitaevskii equation that effectively accounts for many-body interactions. | 翻訳日:2023-05-17 17:43:05 公開日:2023-05-15 |
# Colloquium:量子と古典的な離散時間結晶 Colloquium: Quantum and Classical Discrete Time Crystals ( http://arxiv.org/abs/2305.08904v1 ) ライセンス: Link先を確認 | Michael P. Zaletel, Mikhail Lukin, Christopher Monroe, Chetan Nayak, Frank Wilczek, Norman Y. Yao | (参考訳) 時間翻訳対称性の自発的な崩壊は、離散時間結晶という新しい物質相の発見につながった。
本稿では, 離散時間結晶の鍵となるエルゴディディティの破壊と, ACジョセフソン効果, 結合地図格子, ファラデー波など, 離散時間結晶の性質の多くを共通する多数の現象の源泉としてのエルゴディディティの遅延に着目した。
最後に、このコロキウムは、この分野における卓越した挑戦と、実験と理論の両面での新しい方向性のビジョンを説明することで結論付ける。 The spontaneous breaking of time translation symmetry has led to the discovery of a new phase of matter - the discrete time crystal. Discrete time crystals exhibit rigid subharmonic oscillations, which result from a combination of many-body interactions, collective synchronization, and ergodicity breaking. This Colloquium reviews recent theoretical and experimental advances in the study of quantum and classical discrete time crystals. We focus on the breaking of ergodicity as the key to discrete time crystals and the delaying of ergodicity as the source of numerous phenomena that share many of the properties of discrete time crystals, including the AC Josephson effect, coupled map lattices, and Faraday waves. Theoretically, there exists a diverse array of strategies to stabilize time crystalline order in both closed and open systems, ranging from localization and prethermalization to dissipation and error correction. Experimentally, many-body quantum simulators provide a natural platform for investigating signatures of time crystalline order; recent work utilizing trapped ions, solid-state spin systems, and superconducting qubits will be reviewed. Finally, this Colloquium concludes by describing outstanding challenges in the field and a vision for new directions on both the experimental and theoretical fronts. | 翻訳日:2023-05-17 17:42:45 公開日:2023-05-15 |
# 共通拡散騒音スケジューリングとサンプルステップの欠陥 Common Diffusion Noise Schedules and Sample Steps are Flawed ( http://arxiv.org/abs/2305.08891v1 ) ライセンス: Link先を確認 | Shanchuan Lin, Bingchen Liu, Jiashi Li, Xiao Yang | (参考訳) 一般的な拡散雑音のスケジュールは、信号対雑音比(snr)をゼロにする最後の時間ステップを強制せず、拡散サンプラーの実装のいくつかは、最後の時間ステップから開始しない。
安定拡散(Stable Diffusion)では、モデルが中輝度の画像のみを生成することを厳しく制限し、非常に明るく暗いサンプルを生成するのを防ぐ。
我々は,(1) ノイズスケジュールを再スケールして端末snrをゼロにする,(2) モデルをv予測でトレーニングする,(3) サンプリング器を最後の時間ステップから常に起動するように変更する,(4) 過度な露出を防止するための再スケール分類器フリーガイダンスを提案する。
これらの単純な変更により、トレーニングと推論の間に拡散プロセスが一致し、モデルは元のデータ分布に忠実なサンプルを生成することができる。 We discover that common diffusion noise schedules do not enforce the last timestep to have zero signal-to-noise ratio (SNR), and some implementations of diffusion samplers do not start from the last timestep. Such designs are flawed and do not reflect the fact that the model is given pure Gaussian noise at inference, creating a discrepancy between training and inference. We show that the flawed design causes real problems in existing implementations. In Stable Diffusion, it severely limits the model to only generate images with medium brightness and prevents it from generating very bright and dark samples. We propose a few simple fixes: (1) rescale the noise schedule to enforce zero terminal SNR; (2) train the model with v prediction; (3) change the sampler to always start from the last timestep; (4) rescale classifier-free guidance to prevent over-exposure. These simple changes ensure the diffusion process is congruent between training and inference and allow the model to generate samples more faithful to the original data distribution. | 翻訳日:2023-05-17 17:42:23 公開日:2023-05-15 |
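Fix (1), rescaling the noise schedule so that the terminal SNR is zero, can be sketched as follows. The abstract does not spell out the procedure, so the shift-and-scale of the square root of the cumulative alpha used here is an assumption, as are the function names and the linear beta schedule chosen as input.

```python
import math


def rescale_zero_terminal_snr(betas):
    """Rescale a beta schedule so the last cumulative alpha is exactly zero.

    Sketch under assumptions: shift sqrt(alpha_bar) so its last value is 0,
    then scale it so its first value is unchanged.
    """
    # alpha_bar_t is the running product of (1 - beta_s) for s <= t.
    alpha_bar = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)

    sqrt_ab = [math.sqrt(a) for a in alpha_bar]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]
    new_alpha_bar = [s * s for s in sqrt_ab]

    # Convert the rescaled cumulative products back to per-step betas.
    new_betas = [1.0 - new_alpha_bar[0]]
    for prev, cur in zip(new_alpha_bar, new_alpha_bar[1:]):
        new_betas.append(1.0 - cur / prev)
    return new_betas, new_alpha_bar


def snr(alpha_bar_t):
    """Signal-to-noise ratio alpha_bar / (1 - alpha_bar) at one timestep."""
    return alpha_bar_t / (1.0 - alpha_bar_t)


# A linear schedule chosen for illustration; its terminal SNR is small but nonzero.
betas = [1e-4 + (0.02 - 1e-4) * t / 999 for t in range(1000)]
new_betas, new_alpha_bar = rescale_zero_terminal_snr(betas)
terminal_snr = snr(new_alpha_bar[-1])
```

After rescaling, the final beta equals 1, so the last timestep carries no signal and matches the pure Gaussian noise the model receives at inference.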
# 差動畳み込みファジィ時系列予測 Differential Convolutional Fuzzy Time Series Forecasting ( http://arxiv.org/abs/2305.08890v1 ) ライセンス: Link先を確認 | Tianxiang Zhan, Yuanpeng He, Yong Deng, Zhen Li | (参考訳) ファジィ時系列予測(FTSF)は適用範囲が広い典型的な予測手法である。
最後に、DFCNNはFTSFを改善するためのさらなるアイデアを提供し、継続的な研究価値を保持している。 Fuzzy time series forecasting (FTSF) is a typical forecasting method with wide application. Traditional FTSF is regarded as an expert system, which causes it to lose the ability to recognize undefined features. This is the main reason for the poor forecasting performance of FTSF. To solve this problem, the proposed model, Differential Fuzzy Convolutional Neural Network (DFCNN), uses a convolutional neural network to re-implement FTSF with learnable ability. DFCNN is capable of recognizing potential information and improving forecasting accuracy. Thanks to the learnable ability of the neural network, the length of the fuzzy rules established in FTSF can be extended to an arbitrary length that an expert system cannot handle. At the same time, FTSF usually cannot achieve satisfactory performance on non-stationary time series because of their trend, which invalidates the fuzzy sets established by FTSF and causes the forecasting to fail. DFCNN uses the difference algorithm to weaken the non-stationarity of the time series, so that it can forecast, with low error, non-stationary time series that FTSF cannot forecast satisfactorily. In extensive experiments, DFCNN shows excellent prediction performance, ahead of existing FTSF and common time series forecasting algorithms. Finally, DFCNN provides further ideas for improving FTSF and holds continued research value. | 翻訳日:2023-05-17 17:42:02 公開日:2023-05-15 |
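The role of differencing in weakening non-stationarity can be shown with a minimal sketch. It assumes plain first-order differencing; the abstract does not say which variant DFCNN uses, and the helper names are hypothetical.

```python
def difference(series):
    """First-order difference: d_t = x_{t+1} - x_t."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, diffs):
    """Invert `difference` by cumulatively summing from the first value."""
    restored = [first_value]
    for d in diffs:
        restored.append(restored[-1] + d)
    return restored


# A series with a linear trend is non-stationary; its differences are constant.
trend = [10.0 + 2.0 * t for t in range(6)]
diffs = difference(trend)
restored = undifference(trend[0], diffs)
```

A forecaster can then be trained on the differenced values and its predictions mapped back to the original scale with the inverse step.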
# 新しいデータのための新しい方法? HRM研究のための定量誘導法の概要と実例 New methods for new data? An overview and illustration of quantitative inductive methods for HRM research ( http://arxiv.org/abs/2305.08889v1 ) ライセンス: Link先を確認 | Alain LACROUX (UP1 EMS) | (参考訳) 要するに「データは新しい石油」は、データが現在進行中の第4次産業革命の本質的な源であり、一部のコメンテーターは、データ量そのものを急速に富の源泉に同化させ、ビッグデータの発展を準直接的な利益源とみなすようになった。
本研究の目的は,HRM研究に利用可能なデータ駆動手法の概要を最初に提示し,潜在プロファイル分析とガウス図形モデルを用いた探索的研究からなる実証図面を提案することである。 "Data is the new oil": in short, data would be the essential source of the ongoing fourth industrial revolution, which has led some commentators to assimilate too quickly the quantity of data to a source of wealth in itself, and to consider the development of big data as a quasi-direct cause of profit. Human resources management is not escaping this trend, and the accumulation of large amounts of data on employees is perceived by some entrepreneurs as a necessary and sufficient condition for the construction of predictive models of complex work behaviors such as absenteeism or job performance. In fact, the analogy is somewhat misleading: unlike oil, there are no major issues here concerning the production of data (whose flows are generated continuously and at low cost by various information systems), but rather their "refining", i.e. the operations necessary to transform this data into a useful product, namely into knowledge. This transformation is where the methodological challenges of data valuation lie, both for practitioners and for academic researchers. Considerations on the methods applicable to take advantage of the possibilities offered by these massive data are relatively recent, and often highlight the disruptive aspect of the current "data deluge" to point out that this evolution would be the source of a revival of empiricism in a "fourth paradigm" based on the intensive and "agnostic" exploitation of massive amounts of data in order to bring out new knowledge, following a purely inductive logic. Although we do not adopt this speculative point of view, it is clear that data-driven approaches are scarce in quantitative HRM studies. However, there are well-established methods, particularly in the field of data mining, which are based on inductive approaches. This area of quantitative analysis with an inductive aim is still relatively unexplored in HRM (apart from typological analyses). The objective of this paper is first to give an overview of data-driven methods that can be used for HRM research, before proposing an empirical illustration which consists of an exploratory study combining a latent profile analysis and an exploration by Gaussian graphical models. | 翻訳日:2023-05-17 17:41:40 公開日:2023-05-15 |
# Covariate-Distance Weighted Regression (CWR):住宅価格推定のための事例研究 Covariate-distance Weighted Regression (CWR): A Case Study for Estimation of House Prices ( http://arxiv.org/abs/2305.08887v1 ) ライセンス: Link先を確認 | Hone-Jay Chu, Po-Hung Chen, Sheng-Mao Chang, Muhammad Zeeshan Ali, Sumriti Ranjan Patra | (参考訳) 地理的重み付き回帰(GWR)は回帰モデルにおける空間的不均一性をモデル化するための一般的なツールである。
cwrは従来の空間回帰モデルから推定誤差を効果的に低減し、空間推定のための新規かつ実現可能なモデルを提供する。 Geographically weighted regression (GWR) is a popular tool for modeling spatial heterogeneity in a regression model. However, the current weighting function used in GWR only considers the geographical distance, while the attribute similarity is totally ignored. In this study, we proposed a covariate weighting function that combines the geographical distance and attribute distance. The covariate-distance weighted regression (CWR) is the extension of GWR including geographical distance and attribute distance. House prices are affected by numerous factors, such as house age, floor area, and land use. A prediction model is used to help understand the characteristics of regional house prices. The CWR was used to understand the relationship between the house price and controlling factors. The CWR can consider the geographical and attribute distances, and produce accurate estimates of house price that preserve the weight matrix for geographical and attribute distance functions. Results show that the house attributes/conditions and the characteristics of the house, such as floor area and house age, might affect the house price. After factor selection, in which only house age and floor area of a building are considered, the RMSE of the CWR model can be improved by 2.9%-26.3% for skyscrapers when compared to the GWR. CWR can effectively reduce estimation errors from traditional spatial regression models and provide novel and feasible models for spatial estimation. | 翻訳日:2023-05-17 17:41:00 公開日:2023-05-15 |
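A weighting function that combines geographical and attribute distance might look like the sketch below. The abstract does not give the functional form, so the product of two Gaussian kernels, the bandwidth parameters, and the record layout (`xy`, `attrs`) are all assumptions made for illustration.

```python
import math


def gaussian_kernel(distance, bandwidth):
    """Weight that decays smoothly from 1 as the distance grows."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cwr_weight(target, neighbor, geo_bw, attr_bw):
    """Assumed form: geographic kernel multiplied by attribute kernel."""
    geo_d = euclidean(target["xy"], neighbor["xy"])
    attr_d = euclidean(target["attrs"], neighbor["attrs"])
    return gaussian_kernel(geo_d, geo_bw) * gaussian_kernel(attr_d, attr_bw)


# Attributes are (house age, floor area); in practice they would be standardized.
target = {"xy": (0.0, 0.0), "attrs": (10.0, 100.0)}
near_similar = {"xy": (1.0, 0.0), "attrs": (12.0, 105.0)}
near_different = {"xy": (1.0, 0.0), "attrs": (40.0, 300.0)}

w_similar = cwr_weight(target, near_similar, geo_bw=2.0, attr_bw=50.0)
w_different = cwr_weight(target, near_different, geo_bw=2.0, attr_bw=50.0)
```

Both neighbours are equally close on the map, so a purely geographical weight would treat them alike; adding the attribute kernel gives the similar house more influence on the local regression.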
# データマイニングによる建物のエネルギー消費・コスト削減要因の同定 Identification of the Factors Affecting the Reduction of Energy Consumption and Cost in Buildings Using Data Mining Techniques ( http://arxiv.org/abs/2305.08886v1 ) ライセンス: Link先を確認 | Hamed Khosravi, Hadi Sahebi, Rahim khanizad, Imtiaz Ahmed | (参考訳) エネルギー消費の最適化とユーティリティシステムの調整は、建築業界にとって長年の関心事であった。
これを実現するために,3つの回帰モデル (Lasso Regression, Decision Tree, Random Forest) を用いて一次燃料使用量, 電力消費量, コスト削減量を予測する。
最後に, 原燃料使用量, 電力消費量, コストを削減できるポテンシャル・非ポテンシャルビルの実用的特徴について概観する。 Optimizing energy consumption and coordination of utility systems have long been a concern of the building industry. Buildings are one of the largest energy consumers in the world, making their energy efficiency crucial for preventing waste and reducing costs. Additionally, buildings generate substantial amounts of raw data, which can be used to understand energy consumption patterns and assist in developing optimization strategies. Using a real-world dataset, this research aims to identify the factors that influence building cost reduction and energy consumption. To achieve this, we utilize three regression models (Lasso Regression, Decision Tree, and Random Forest) to predict primary fuel usage, electrical energy consumption, and cost savings in buildings. An analysis of the factors influencing energy consumption and cost reduction is conducted, and the decision tree algorithm is optimized using metaheuristics. By employing metaheuristic techniques, we fine-tune the decision tree algorithm's parameters and improve its accuracy. Finally, we review the most practical features of potential and nonpotential buildings that can reduce primary fuel usage, electrical energy consumption, and costs. | 翻訳日:2023-05-17 17:40:38 公開日:2023-05-15 |
# ニューラルネットワークのロバスト解釈可能性に関する因果解析 Causal Analysis for Robust Interpretability of Neural Networks ( http://arxiv.org/abs/2305.08950v1 ) ライセンス: Link先を確認 | Ola Ahmad, Nicolas Bereux, Vahid Hashemi, Freddy Lecue | (参考訳) ニューラルネットワークの内部機能を解釈することは、これらのブラックボックスモデルの信頼性の高い開発と展開に不可欠である。
さらに、基礎となる因果グラフはモデル内の神経相互作用を明らかにし、他のアプリケーション(例えばモデル修復)で有用なツールとなる。 Interpreting the inner function of neural networks is crucial for the trustworthy development and deployment of these black-box models. Prior interpretability methods focus on correlation-based measures to attribute model decisions to individual examples. However, these measures are susceptible to noise and spurious correlations encoded in the model during the training phase (e.g., biased inputs, model overfitting, or misspecification). Moreover, this process has proven to result in noisy and unstable attributions that prevent any transparent understanding of the model's behavior. In this paper, we develop a robust interventional-based method grounded by causal analysis to capture cause-effect mechanisms in pre-trained neural networks and their relation to the prediction. Our novel approach relies on path interventions to infer the causal mechanisms within hidden layers and isolate relevant and necessary information (to model prediction), avoiding noisy ones. The result is task-specific causal explanatory graphs that can audit model behavior and express the actual causes underlying its performance. We apply our method to vision models trained on classification tasks. On image classification tasks, we provide extensive quantitative experiments to show that our approach can capture more stable and faithful explanations than standard attribution-based methods. Furthermore, the underlying causal graphs reveal the neural interactions in the model, making it a valuable tool in other applications (e.g., model repair). | 翻訳日:2023-05-17 17:33:17 公開日:2023-05-15 |
# Bare Homography による画像マッチング Image Matching by Bare Homography ( http://arxiv.org/abs/2305.08946v1 ) ライセンス: Link先を確認 | Fabio Bellavia | (参考訳) 本稿では,シーンを粗い局所重なり面としてモデル化する,新しい非奥行き画像マッチングフレームワークslimeを提案する。
この分析によれば、この分野における印象的な進歩にもかかわらず、今後の研究で検討すべき改善の余地は広い。 This paper presents Slime, a novel non-deep image matching framework which models the scene as rough local overlapping planes. This intermediate representation sits in-between the local affine approximation of the keypoint patches and the global matching based on both geometrical and similarity constraints, providing a progressive pruning of the correspondences, as planes are easier to handle with respect to general scenes. Slime proceeds by selectively detecting, expanding, merging and refining the matches associated with almost-planar areas of the scene by exploiting homography constraints. As a result, both the coverage and stability of correct matches over the scene are amplified, allowing traditional hybrid matching pipelines to make up lost ground against recent end-to-end deep matching methods. In addition, the paper gives a thorough comparative analysis of the recent state of the art in image matching, represented by end-to-end deep networks and hybrid pipelines. The evaluation considers both planar and non-planar scenes, taking into account critical and challenging scenarios including abrupt temporal image changes and strong variations in relative image rotations. According to this analysis, despite the impressive progress made in this field, there is still ample room for improvement to be investigated in future research. | 翻訳日:2023-05-17 17:32:38 公開日:2023-05-15 |
# 潜在的な再正規化、ラムシフト、平均力ギブス状態 -- シフトするかシフトしないか? Potential renormalisation, Lamb shift and mean-force Gibbs state -- to shift or not to shift? ( http://arxiv.org/abs/2305.08941v1 ) ライセンス: Link先を確認 | Luis A. Correa and Jonas Glatthard | (参考訳) オープンシステムは、たとえ浴槽に弱結合しても、「再編成エネルギー」によって定量化され、無視できないポテンシャル再正規化を経験することができる。
したがって、量子熱力学における熱流の計算において、見過ごされている問題に光を当てた。 An open system, even if coupled weakly to a bath, can experience a non-negligible potential renormalisation, quantified by the 'reorganisation energy'. Often, the microscopic system-bath coupling gives rise to a counter term which adds to the bare Hamiltonian, exactly compensating for such potential distortion. On the other hand, when describing quantum dissipative dynamics with weak-coupling master equations, a number of 'Lamb-shift terms' appear which, contrary to popular belief, cannot be neglected. And yet, the practice of vanishing both the counter term and Lamb-shift contributions is almost universal; and, surprisingly, it gives excellent results. In this paper we use a damped quantum harmonic oscillator to analytically show that subtracting the reorganisation energy from the Hamiltonian and then suppressing the Lamb-shift terms from the resulting master equation does indeed yield an excellent approximation to the exact steady state and long-time dynamics. Put differently, those seemingly unjustified steps succeed at building the asymptotic mean-force Gibbs state -- or rather, its classical limit -- into the master equation. This can noticeably increase its accuracy, especially at moderate-to-low temperatures and even up to intermediate coupling. We thus shed light on an overlooked issue that becomes critical in the calculation of heat currents in quantum thermodynamics. | 翻訳日:2023-05-17 17:32:03 公開日:2023-05-15 |
# DopUS-Net:ドップラー信号に基づく高品質ロボット超音波イメージング DopUS-Net: Quality-Aware Robotic Ultrasound Imaging based on Doppler Signal ( http://arxiv.org/abs/2305.08938v1 ) ライセンス: Link先を確認 | Zhongliang Jiang, Felix Duelmer, Nassir Navab | (参考訳) 医用超音波(US)は、放射線のない利点のために、特に予備検診プログラムにおいて、血管疾患の評価とステージに広く用いられている。
実験の結果,再同定手法によるアプローチにより,セグメント化結果の精度とロバスト性が大幅に向上することがわかった(diceスコア:0.54から0.86、結合上の交差:0.47から0.78)。 Medical ultrasound (US) is widely used to evaluate and stage vascular diseases, in particular for the preliminary screening program, due to the advantage of being radiation-free. However, automatic segmentation of small tubular structures (e.g., the ulnar artery) from cross-sectional US images is still challenging. To address this challenge, this paper proposes the DopUS-Net and a vessel re-identification module that leverage the Doppler effect to enhance the final segmentation result. Firstly, the DopUS-Net combines the Doppler images with B-mode images to increase the segmentation accuracy and robustness of small blood vessels. It incorporates two encoders to exploit the maximum potential of the Doppler signal and recurrent neural network modules to preserve sequential information. Input to the first encoder is a two-channel duplex image representing the combination of the grey-scale Doppler and B-mode images to ensure anatomical spatial correctness. The second encoder operates on the pure Doppler images to provide a region proposal. Secondly, benefiting from the Doppler signal, this work first introduces an online artery re-identification module to qualitatively evaluate the real-time segmentation results and automatically optimize the probe pose for enhanced Doppler images. This quality-aware module enables the closed-loop control of robotic screening to further improve the confidence and robustness of image segmentation. The experimental results demonstrate that the proposed approach with the re-identification process can significantly improve the accuracy and robustness of the segmentation results (dice score: from 0.54 to 0.86; intersection over union: from 0.47 to 0.78). | 翻訳日:2023-05-17 17:31:37 公開日:2023-05-15 |
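The two segmentation metrics quoted in this abstract, the Dice score and intersection over union (IoU), can be computed from binary masks as sketched below. The masks are flattened to lists for brevity, and returning 1.0 when both masks are empty is a convention assumed here, not something the abstract specifies.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two flat binary masks of equal length."""
    pred_set = {i for i, v in enumerate(pred) if v}
    truth_set = {i for i, v in enumerate(truth) if v}
    inter = len(pred_set & truth_set)
    union = len(pred_set | truth_set)
    total = len(pred_set) + len(truth_set)
    # Two empty masks agree perfectly, so both scores are taken to be 1.0.
    dice = 2 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
dice, iou = dice_and_iou(pred, truth)
```

For a single pair of masks the two scores are tied by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU.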
# MIMEx: Masked Input Modelingの本質的なリワード MIMEx: Intrinsic Rewards from Masked Input Modeling ( http://arxiv.org/abs/2305.08932v1 ) ライセンス: Link先を確認 | Toru Lin, Allan Jabri | (参考訳) 高次元観測環境の探索は困難である。
この観点から,マスク分布を柔軟に調整し,条件付き予測タスクの難易度を制御できる,固有報酬(Masked Input Modeling for Exploration, MIMEx)を導出するための一般的なフレームワークを提案する。
我々は,sparse-reward visuomotorタスク群における競合ベースラインと比較して,mimexが優れた結果が得られることを示す。 Exploring in environments with high-dimensional observations is hard. One promising approach for exploration is to use intrinsic rewards, which often boils down to estimating "novelty" of states, transitions, or trajectories with deep networks. Prior works have shown that conditional prediction objectives such as masked autoencoding can be seen as stochastic estimation of pseudo-likelihood. We show how this perspective naturally leads to a unified view on existing intrinsic reward approaches: they are special cases of conditional prediction, where the estimation of novelty can be seen as pseudo-likelihood estimation with different mask distributions. From this view, we propose a general framework for deriving intrinsic rewards -- Masked Input Modeling for Exploration (MIMEx) -- where the mask distribution can be flexibly tuned to control the difficulty of the underlying conditional prediction task. We demonstrate that MIMEx can achieve superior results when compared against competitive baselines on a suite of challenging sparse-reward visuomotor tasks. | 翻訳日:2023-05-17 17:31:08 公開日:2023-05-15 |
# AF2-Mutation: タンパク質第3次構造予測におけるαFold2の逆配列変異 AF2-Mutation: Adversarial Sequence Mutations against AlphaFold2 on Protein Tertiary Structure Prediction ( http://arxiv.org/abs/2305.08929v1 ) ライセンス: Link先を確認 | Zhongju Yuan, Tao Shen, Sheng Xu, Leiye Yu, Ruobing Ren, Siqi Sun | (参考訳) AlphaFold2 (AF2)のような深層学習に基づくアプローチは、タンパク質第3次構造予測を著しく進歩させ、実際の生物学的実験手法に匹敵する結果を達成している。
さらに,特定のタンパク質であるSPNS2に適用した場合,タンパク質構造決定に必須な生物学的に有意な残基を同定し,代替コンフォメーションを示唆し,実験プロセスを著しく高速化した。 Deep learning-based approaches, such as AlphaFold2 (AF2), have significantly advanced protein tertiary structure prediction, achieving results comparable to real biological experimental methods. While AF2 has shown limitations in predicting the effects of mutations, its robustness against sequence mutations remains to be determined. Starting with the wild-type (WT) sequence, we investigate adversarial sequences generated via an evolutionary approach, which AF2 predicts to be substantially different from WT. Our experiments on CASP14 reveal that by modifying merely three residues in the protein sequence using a combination of replacement, deletion, and insertion strategies, the alteration in AF2's predictions, as measured by the Local Distance Difference Test (lDDT), reaches 46.61. Moreover, when applied to a specific protein, SPNS2, our proposed algorithm successfully identifies biologically meaningful residues critical to protein structure determination and potentially indicates alternative conformations, thus significantly expediting the experimental process. | 翻訳日:2023-05-17 17:30:48 公開日:2023-05-15 |
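The three mutation strategies named in the abstract (replacement, deletion, insertion) can be sketched as a single operator on an amino-acid string. This is only the mutation step: scoring a candidate by how far AF2's predicted structure moves from the wild type is not shown, and the function name, the uniform choice among strategies, and the example sequence are assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutate(sequence, rng):
    """Apply one random replacement, deletion, or insertion to a sequence."""
    kinds = ["insert"]
    if len(sequence) >= 1:
        kinds.append("replace")
    if len(sequence) > 1:
        kinds.append("delete")  # never delete the only remaining residue
    kind = rng.choice(kinds)

    if kind == "insert":
        pos = rng.randrange(len(sequence) + 1)
        return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos:]

    pos = rng.randrange(len(sequence))
    if kind == "delete":
        return sequence[:pos] + sequence[pos + 1:]

    # Replacement always picks a residue different from the current one.
    choices = [a for a in AMINO_ACIDS if a != sequence[pos]]
    return sequence[:pos] + rng.choice(choices) + sequence[pos + 1:]


rng = random.Random(0)
wild_type = "MKTAYIAKQR"  # hypothetical sequence for illustration
single = mutate(wild_type, rng)

triple = wild_type
for _ in range(3):
    triple = mutate(triple, rng)
```

In an evolutionary search, operators like this would generate candidates, and a structure-based score would decide which ones survive to the next generation.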
# 絶対安定な離散時間結晶 Absolutely Stable Discrete Time Crystals ( http://arxiv.org/abs/2305.08925v1 ) ライセンス: Link先を確認 | Krzysztof Giergiel, Jia Wang, Bryan J. Dalton, Peter Hannaford, Krzysztof Sacha | (参考訳) 回転格子ポテンシャルによって周期的に駆動される環上の相互作用ボゾンは、絶対安定な離散時間結晶をサポートすることができる。
しかし、蹴られたボソン模型は時間と空間対称性の破れの間のよりリッチな相互作用を示している。 We show that interacting bosons on a ring which are driven periodically by a rotating lattice potential can support absolutely stable discrete time crystals. The absolute stability is demonstrated by an exact mapping of discrete time crystal states to low-lying eigenstates of a time-independent model that reveals spontaneous breaking of space translation symmetry. The mapping ensures that there are no residual time-dependent terms that could lead to heating of the system and destruction of discrete time crystals. With the help of the Bethe ansatz solutions we also analyze periodically kicked bosons where the mapping is approximate only and cannot guarantee the absolute stability of discrete time crystals. However, the kicked boson model shows a richer interplay between time and space symmetry breaking. | 翻訳日:2023-05-17 17:30:27 公開日:2023-05-15 |
# 量子コンピュータの資源効率利用 Resource-efficient utilization of quantum computers ( http://arxiv.org/abs/2305.08924v1 ) ライセンス: Link先を確認 | Ijaz Ahamed Mohammad, Matej Pivoluska, Martin Plesch | (参考訳) 量子コンピューティングの現在の状態は一般に、ノイズの多い中間スケール量子時代と呼ばれる。
本手法は,水素分子の基底状態エネルギーを求めるために用いられる変分量子アルゴリズムの具体例で実証する。 The current state of quantum computing is commonly described as the Noisy Intermediate-Scale Quantum era. Available computers contain a few dozen qubits and can perform a few dozen operations before the inevitable noise erases all information encoded in the calculation. Even if the technology advances quickly over the next few years, any use of quantum computers will be limited to short and simple tasks, serving as subroutines of more complex classical procedures. Even for these applications the resource efficiency, measured in the number of quantum computer runs, will be a key parameter. Here we suggest a general optimization procedure for hybrid quantum-classical algorithms that allows finding the optimal approach with limited quantum resources. We demonstrate this procedure on a specific example of a variational quantum algorithm used to find the ground state energy of a hydrogen molecule. | 翻訳日:2023-05-17 17:30:16 公開日:2023-05-15 |
# U(1)対称系における高次相関関数の解析的アプローチ Analytical approach to higher-order correlation function in U(1) symmetric systems ( http://arxiv.org/abs/2305.08923v1 ) ライセンス: Link先を確認 | Zhi-Guang Lu, Cheng Shang, Ying Wu, and Xin-You Lü | (参考訳) 我々は、弱いコヒーレント状態入力の下で散乱行列(S-行列)を用いて、$n$次の等時相関関数のコンパクトな解析解を導出した。
さらに,Python のユーザフレンドリなオープンソースライブラリである quantum correlation solver を開発した。このツールは,上記の基準を満たす様々な散逸性量子システムを研究するための便利な手段を提供する。
本研究は,S行列を用いて光相関を解析し,複雑な系を探索する可能性を推し進めるための新たな基盤を打破する。 We derive a compact analytical solution of the $n$th-order equal-time correlation functions by using the scattering matrix (S-matrix) under a weak coherent state input. Our solution applies to any dissipative quantum system that satisfies the U(1) symmetry. We further extend our analytical solution into two categories depending on whether the input and output channels are identical. The first category provides a new path for studying cross-correlation and multiple drives cases, while the second category is instrumental in studying waveguide quantum electrodynamics systems. Our analytical solution allows for easy investigation of the statistical properties of multiple photons even in complex systems. Furthermore, we have developed a user-friendly open-source library in Python known as the quantum correlation solver, and this tool provides a convenient means to study various dissipative quantum systems that satisfy the abovementioned criterion. Our study breaks new ground for using the S-matrix to study the photonic correlation and advance the possibilities for exploring complex systems. | 翻訳日:2023-05-17 17:30:03 公開日:2023-05-15 |
# フェルミオン行列式を持たないゲージ場および物質に対する量子モンテカルロ Quantum Monte Carlo for Gauge Fields and Matter without the Fermion Determinant ( http://arxiv.org/abs/2305.08917v1 ) ライセンス: Link先を確認 | Debasish Banerjee and Emilie Huffman | (参考訳) 強相互作用するフェルミオン系のab-initioモンテカルロシミュレーションはフェルミオンサイン問題に苦しめられ、密度の高い量子物質の多くの興味深いレジーム、あるいは奇数のフェルミオンフレーバーの理論の非摂動的研究が困難である。
低温におけるガウスの法則の出現を、$(1+1)$次元の$U(1)$モデルで示す。 Ab-initio Monte Carlo simulations of strongly-interacting fermionic systems are plagued by the fermion sign problem, making the non-perturbative study of many interesting regimes of dense quantum matter, or of theories of odd numbers of fermion flavors, challenging. Moreover, typical fermion algorithms require the computation (or sampling) of the fermion determinant. We focus instead on the meron cluster algorithm, which can solve the fermion sign problem in a class of models without involving the determinant. We develop and benchmark new meron algorithms to simulate fermions coupled to $\mathbb{Z}_2$ and $U(1)$ gauge fields to uncover potential exotic properties of matter, particularly relevant for quantum simulator experiments. We demonstrate the emergence of Gauss's law at low temperatures for a $U(1)$ model in $(1+1)$-d. | 翻訳日:2023-05-17 17:29:47 公開日:2023-05-15 |
# Bi-CMOS電子フォトニック集積回路量子光検出器 A Bi-CMOS electronic-photonic integrated circuit quantum light detector ( http://arxiv.org/abs/2305.08990v1 ) ライセンス: Link先を確認 | Joel F. Tasker, Jonathan Frazer, Giacomo Ferranti, Jonathan C. F. Matthews | (参考訳) 補完的金属酸化物半導体(CMOS)互換量子技術は、量子コンピュータ構築に必要な古典的読み出しと制御エレクトロニクスとのスケーラブルな統合を可能にする。
ここでは,250nmのリソグラフィバイポーラCMOSプロセスで作製した$80~\mu\mathrm{m} \times 220~\mu\mathrm{m}$の量子ノイズ制限モノリシック電子・フォトニック集積型ホモダイン検出器について報告する。
エレクトロニクスとフォトニクスのモノリシックな統合により、全体の容量は抑制される -- これは量子光の高帯域幅測定の主要なボトルネックである。
これは、CMOS電子フォトニクス統合による量子フォトニクスの性能向上を示す。 Complementary metal-oxide-semiconductor (CMOS) compatible quantum technology enables scalable integration with the classical readout and control electronics needed to build quantum computers. Homodyne detectors have applications across quantum technologies including quantum computers, and they comprise photonics and electronics. Here we report a quantum noise limited monolithic electronic-photonic integrated homodyne detector, with an overall footprint of $80~\mu\mathrm{m} \times 220~\mu\mathrm{m}$, fabricated in a 250 nm lithography bipolar CMOS process. By monolithic integration of the electronics and photonics, overall capacitance is suppressed -- this is the main bottleneck to high bandwidth measurement of quantum light. We measure a 3 dB bandwidth of 19.8 GHz and a maximum shot noise clearance of 15 dB. This exceeds bandwidth limits of detectors with macroscopic electronic interconnects, including wirebonding and flip-chip bonding. This demonstrates CMOS electronic-photonic integration enhancing performance of quantum photonics. | 翻訳日:2023-05-17 17:23:51 公開日:2023-05-15 |
# LoViT:手術用位相認識用長ビデオトランス LoViT: Long Video Transformer for Surgical Phase Recognition ( http://arxiv.org/abs/2305.08989v1 ) ライセンス: Link先を確認 | Yang Liu, Maxence Boels, Luis C. Garcia-Peraza-Herrera, Tom Vercauteren, Prokar Dasgupta, Alejandro Granados and Sebastien Ourselin | (参考訳) オンラインの手術相認識は、パフォーマンスを定量化し、手術ワークフローの実行を監督するコンテキストツールを構築する上で重要な役割を果たす。
本稿では,Long Video Transformer (LoViT) と呼ばれる,時間的に豊富な空間的特徴抽出器と,自己意図に基づく2つのL-Transモジュールからなる大規模時間的アグリゲータを組み合わせた,短時間・長期の時間的情報を融合する2段階の手法を提案する。
以上の結果から,本手法は,異なる手術手順と時間的シークエンシング特性の2つのデータセット上での外科的位相認識の最先端化に有効であり,また,ロングビデオ対応のメカニズムも導入している。 Online surgical phase recognition plays a significant role towards building contextual tools that could quantify performance and oversee the execution of surgical workflows. Current approaches are limited since they train spatial feature extractors using frame-level supervision that could lead to incorrect predictions due to similar frames appearing at different phases, and poorly fuse local and global features due to computational constraints which can affect the analysis of long videos commonly encountered in surgical interventions. In this paper, we present a two-stage method, called Long Video Transformer (LoViT) for fusing short- and long-term temporal information that combines a temporally-rich spatial feature extractor and a multi-scale temporal aggregator consisting of two cascaded L-Trans modules based on self-attention, followed by a G-Informer module based on ProbSparse self-attention for processing global temporal information. The multi-scale temporal head then combines local and global features and classifies surgical phases using phase transition-aware supervision. Our approach outperforms state-of-the-art methods on the Cholec80 and AutoLaparo datasets consistently. Compared to Trans-SVNet, LoViT achieves a 2.39 pp (percentage point) improvement in video-level accuracy on Cholec80 and a 3.14 pp improvement on AutoLaparo. Moreover, it achieves a 5.25 pp improvement in phase-level Jaccard on AutoLaparo and a 1.55 pp improvement on Cholec80. Our results demonstrate the effectiveness of our approach in achieving state-of-the-art performance of surgical phase recognition on two datasets of different surgical procedures and temporal sequencing characteristics whilst introducing mechanisms that cope with long videos. | 翻訳日:2023-05-17 17:23:31 公開日:2023-05-15 |
# 調和データサイロによるフェデレーション学習 Federated Learning over Harmonized Data Silos ( http://arxiv.org/abs/2305.08985v1 ) ライセンス: Link先を確認 | Dimitris Stripelis and Jose Luis Ambite | (参考訳) Federated Learning(フェデレートラーニング)は、地理的に分散したデータサイロがデータを共有せずに共同で機械学習モデルを学習することを可能にする分散機械学習アプローチである。
そこで本研究では,データ調和とデータ計算の重要なステップを取り入れたエンドツーエンドのフェデレーション学習統合システムのアーキテクチャビジョンを提案し,データ管理情報システムと機械学習の交わりに関するさらなる研究を促進する。 Federated Learning is a distributed machine learning approach that enables geographically distributed data silos to collaboratively learn a joint machine learning model without sharing data. Most of the existing work operates on unstructured data, such as images or text, or on structured data assumed to be consistent across the different sites. However, sites often have different schemata, data formats, data values, and access patterns. The field of data integration has developed many methods to address these challenges, including techniques for data exchange and query rewriting using declarative schema mappings, and for entity linkage. Therefore, we propose an architectural vision for an end-to-end Federated Learning and Integration system, incorporating the critical steps of data harmonization and data imputation, to spur further research on the intersection of data management information systems and machine learning. | 翻訳日:2023-05-17 17:23:01 公開日:2023-05-15 |
# help the helper: aiによる実践とフィードバックによる相互カウンセラー支援 Helping the Helper: Supporting Peer Counselors via AI-Empowered Practice and Feedback ( http://arxiv.org/abs/2305.08982v1 ) ライセンス: Link先を確認 | Shang-Ling Hsu, Raj Sanjay Shah, Prathik Senthil, Zahra Ashktorab, Casey Dugan, Werner Geyer, Diyi Yang | (参考訳) 何百万というユーザーがオンラインのピアカウンセリングプラットフォームを訪れ、関係性ストレスから不安までさまざまなトピックのサポートを求めている。
また、ケアは特に初心者カウンセラーが困難な状況で反応するのに役立ちます。 Millions of users come to online peer counseling platforms to seek support on diverse topics ranging from relationship stress to anxiety. However, studies show that online peer support groups are not always as effective as expected largely due to users' negative experiences with unhelpful counselors. Peer counselors are key to the success of online peer counseling platforms, but most of them often do not have systematic ways to receive guidelines or supervision. In this work, we introduce CARE: an interactive AI-based tool to empower peer counselors through automatic suggestion generation. During the practical training stage, CARE helps diagnose which specific counseling strategies are most suitable in the given context and provides tailored example responses as suggestions. Counselors can choose to select, modify, or ignore any suggestion before replying to the support seeker. Building upon the Motivational Interviewing framework, CARE utilizes large-scale counseling conversation data together with advanced natural language generation techniques to achieve these functionalities. We demonstrate the efficacy of CARE by performing both quantitative evaluations and qualitative user studies through simulated chats and semi-structured interviews. We also find that CARE especially helps novice counselors respond better in challenging situations. | 翻訳日:2023-05-17 17:22:47 公開日:2023-05-15 |
# 代理ソーシャルメディアを用いたホームレスの地域レベル測定の評価 An assessment of measuring local levels of homelessness through proxy social media signals ( http://arxiv.org/abs/2305.08978v1 ) ライセンス: Link先を確認 | Yoshi Meke Bird, Sarah E. Grobe, Michael V. Arnold, Sean P. Rogers, Mikaela I. Fudolig, Julia Witte Zimmerman, Christopher M. Danforth, Peter Sheridan Dodds | (参考訳) 近年の研究では、ソーシャルメディアのアクティビティが、自然言語処理によって検出可能な、国家レベルの公衆衛生対策のプロキシとして機能することを示唆している。
ソーシャルメディア分析への計算的アプローチは、ホームレスやホームレス政策の全国的および地域的影響に関する情報が豊富な低コストでリアルタイムなデータセットを提供する可能性があるが、現実的な問題は多く、ソーシャルメディアが他のホームレス対策を補完するプロキシとしての可能性を制限することにある。 Recent studies suggest social media activity can function as a proxy for measures of state-level public health, detectable through natural language processing. We present results of our efforts to apply this approach to estimate homelessness at the state level throughout the US during the period 2010-2019 and 2022 using a dataset of roughly 1 million geotagged tweets containing the substring ``homeless.'' Correlations between homelessness-related tweet counts and ranked per capita homelessness volume, but not general-population densities, suggest a relationship between the likelihood of Twitter users to personally encounter or observe homelessness in their everyday lives and their likelihood to communicate about it online. An increase to the log-odds of ``homeless'' appearing in an English-language tweet, as well as an acceleration in the increase in average tweet sentiment, suggest that tweets about homelessness are also affected by trends at the nation-scale. Additionally, changes to the lexical content of tweets over time suggest that reversals to the polarity of national or state-level trends may be detectable through an increase in political or service-sector language over the semantics of charity or direct appeals. An analysis of user account type also revealed changes to Twitter-use patterns by accounts authored by individuals versus entities that may provide an additional signal to confirm changes to homelessness density in a given jurisdiction. While a computational approach to social media analysis may provide a low-cost, real-time dataset rich with information about nationwide and localized impacts of homelessness and homelessness policy, we find that practical issues abound, limiting the potential of social media as a proxy to complement other measures of homelessness. | 翻訳日:2023-05-17 17:22:26 公開日:2023-05-15 |
# インクリメンタル学習とコンセプトドリフト適応を用いたストリーミングデータのオートエンコーダによる異常検出 Autoencoder-based Anomaly Detection in Streaming Data with Incremental Learning and Concept Drift Adaptation ( http://arxiv.org/abs/2305.08977v1 ) ライセンス: Link先を確認 | Jin Li, Kleanthis Malialis, Marios M. Polycarpou | (参考訳) 現代のデジタル世界では、様々なアプリケーション領域で大量のデータがストリーミング形式で生成されています。
さらに比較研究を行い,提案手法が既存のベースライン法と先進法を著しく上回ることを示す。 In today's digital universe, enormous amounts of data are produced in a streaming manner in a variety of application areas. These data are often unlabelled. In this case, identifying infrequent events, such as anomalies, poses a great challenge. This problem becomes even more difficult in non-stationary environments, which can cause deterioration of the predictive performance of a model. To address the above challenges, the paper proposes an autoencoder-based incremental learning method with drift detection (strAEm++DD). Our proposed method strAEm++DD leverages the advantages of both incremental learning and drift detection. We conduct an experimental study using real-world and synthetic datasets with severe or extreme class imbalance, and provide an empirical analysis of strAEm++DD. We further conduct a comparative study, showing that the proposed method significantly outperforms existing baseline and advanced methods. | 翻訳日:2023-05-17 17:21:58 公開日:2023-05-15 |
# ハイブリッドトライアルにおける外部制御活用のための因果推論フレームワーク A Causal Inference Framework for Leveraging External Controls in Hybrid Trials ( http://arxiv.org/abs/2305.08969v1 ) ライセンス: Link先を確認 | Michael Valancius, Herb Pang, Jiawen Zhu, Stephen R Cole, Michele Jonsson Funk, Michael R Kosorok | (参考訳) 平均治療効果 (ate) を推定する効率を向上させるために, ランダム化試行からのデータを外部ソースからの制御データで拡張する場面において, 因果推論に関連する課題を検討する。
そこで本研究では,前回の治験から外部コントロール患者が存在する脊髄筋萎縮症患者の運動機能に対するrisdiplamの効果について検討した。 We consider the challenges associated with causal inference in settings where data from a randomized trial is augmented with control data from an external source to improve efficiency in estimating the average treatment effect (ATE). Through the development of a formal causal inference framework, we outline sufficient causal assumptions about the exchangeability between the internal and external controls to identify the ATE and establish the connection to a novel graphical criterion. We propose estimators, review efficiency bounds, develop an approach for efficient doubly-robust estimation even when unknown nuisance models are estimated with flexible machine learning methods, and demonstrate finite-sample performance through a simulation study. To illustrate the ideas and methods, we apply the framework to a trial investigating the effect of risdiplam on motor function in patients with spinal muscular atrophy for which there exists an external set of control patients from a previous trial. | 翻訳日:2023-05-17 17:21:44 公開日:2023-05-15 |
# 適応時間面を用いた足ロボットの動的運動追跡のためのイベントカメラによる視覚計測 Event Camera-based Visual Odometry for Dynamic Motion Tracking of a Legged Robot Using Adaptive Time Surface ( http://arxiv.org/abs/2305.08962v1 ) ライセンス: Link先を確認 | Shifan Zhu, Zhipeng Tang, Michael Yang, Erik Learned-Miller, Donghyun Kim | (参考訳) 本稿では,イベントとRGB-Dデータを組み合わせて,ダイナミックな移動動作やアクロバティックな動作におけるアジャイルレッグロボットの姿勢を推定する,直接スパース視覚計測法を提案する。
公共データセットと独自の四足ロボットデータセットの両方でフレームワークの性能を広範囲に評価し、動的動作中のアジャイルロボットの姿勢を正確に推定する効果を実証した。 Our paper proposes a direct sparse visual odometry method that combines event and RGB-D data to estimate the pose of agile-legged robots during dynamic locomotion and acrobatic behaviors. Event cameras offer high temporal resolution and dynamic range, which can eliminate the issue of blurred RGB images during fast movements. This unique strength holds a potential for accurate pose estimation of agile-legged robots, which has been a challenging problem to tackle. Our framework leverages the benefits of both RGB-D and event cameras to achieve robust and accurate pose estimation, even during dynamic maneuvers such as jumping and landing a quadruped robot, the Mini-Cheetah. Our major contributions are threefold: Firstly, we introduce an adaptive time surface (ATS) method that addresses the whiteout and blackout issue in conventional time surfaces by formulating pixel-wise decay rates based on scene complexity and motion speed. Secondly, we develop an effective pixel selection method that directly samples from event data and applies sample filtering through ATS, enabling us to pick pixels on distinct features. Lastly, we propose a nonlinear pose optimization formula that simultaneously performs 3D-2D alignment on both RGB-based and event-based maps and images, allowing the algorithm to fully exploit the benefits of both data streams. We extensively evaluate the performance of our framework on both public datasets and our own quadruped robot dataset, demonstrating its effectiveness in accurately estimating the pose of agile robots during dynamic movements. | 翻訳日:2023-05-17 17:21:26 公開日:2023-05-15 |
# バックプロパゲーションを伴わないニューラルネットワークのトレーニング: 尤度比法の詳細な考察 Training Neural Networks without Backpropagation: A Deeper Dive into the Likelihood Ratio Method ( http://arxiv.org/abs/2305.08960v1 ) ライセンス: Link先を確認 | Jinyang Jiang, Zeliang Zhang, Chenliang Xu, Zhaofei Yu, Yijie Peng | (参考訳) バックプロパゲーション(BP)は、ディープラーニングにおけるニューラルネットワークのトレーニングにおいて最も重要な勾配推定手法である。
これらの結果は、LR法が様々なニューラルネットワークのトレーニングに有効であることを示し、BP法に対する敵対攻撃下でのニューラルネットワークの堅牢性を大幅に向上することを示した。 Backpropagation (BP) is the most important gradient estimation method for training neural networks in deep learning. However, the literature shows that neural networks trained by BP are vulnerable to adversarial attacks. We develop the likelihood ratio (LR) method, a new gradient estimation method, for training a broad range of neural network architectures, including convolutional neural networks, recurrent neural networks, graph neural networks, and spiking neural networks, without recursive gradient computation. We propose three methods to efficiently reduce the variance of the gradient estimation in the neural network training process. Our experiments yield numerical results for training different neural networks on several datasets. All results demonstrate that the LR method is effective for training various neural networks and significantly improves the robustness of the neural networks under adversarial attacks relative to the BP method. | 翻訳日:2023-05-17 17:20:58 公開日:2023-05-15 |
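As a rough illustration of what a likelihood ratio (score-function) gradient estimate looks like, here is a minimal sketch for a single linear layer with Gaussian noise injected into its output. It is a generic textbook estimator, not the authors' implementation: the function names, the squared-error loss, and the antithetic sampling used to tame the variance are all assumptions made for this example, and the antithetic trick is not necessarily one of the three variance-reduction methods the abstract mentions.

```python
import random


def lr_gradient(W, x, y, n_samples=50_000, sigma=0.1, seed=0):
    """Estimate d E[loss(z)] / dW for z = W x + sigma * eps, eps ~ N(0, I).

    No derivative of the loss is taken. The estimate uses the
    likelihood-ratio identity with antithetic noise pairs:
        grad ~= mean( (L(mu + s*eps) - L(mu - s*eps)) / (2*s) * eps x^T )
    """
    rng = random.Random(seed)
    rows, cols = len(W), len(x)
    # Noise-free layer output mu = W x.
    mu = [sum(W[i][j] * x[j] for j in range(cols)) for i in range(rows)]

    def loss(z):
        # Hypothetical squared-error loss; only its value is ever used.
        return sum((zi - yi) ** 2 for zi, yi in zip(z, y))

    grad = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(rows)]
        plus = loss([m + sigma * e for m, e in zip(mu, eps)])
        minus = loss([m - sigma * e for m, e in zip(mu, eps)])
        weight = (plus - minus) / (2.0 * sigma)
        for i in range(rows):
            for j in range(cols):
                grad[i][j] += weight * eps[i] * x[j] / n_samples
    return grad


def exact_gradient(W, x, y):
    """Closed-form gradient of ||W x - y||^2, used only as a reference."""
    rows, cols = len(W), len(x)
    mu = [sum(W[i][j] * x[j] for j in range(cols)) for i in range(rows)]
    return [[2.0 * (mu[i] - y[i]) * x[j] for j in range(cols)] for i in range(rows)]
```

Because the estimator only evaluates the loss, the same pattern can in principle be applied to non-differentiable components; the price is Monte Carlo noise, which is why variance reduction matters in practice.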
# モジュラーモーションプログラムによるモーション質問応答 Motion Question Answering via Modular Motion Programs ( http://arxiv.org/abs/2305.08953v1 ) ライセンス: Link先を確認 | Mark Endo, Joy Hsu, Jiaman Li, Jiajun Wu | (参考訳) 現実世界で人間の行動を知覚し推論できる人工知能システムを構築するためには、まず、動きのシーケンス上で複雑な時空間推論を行うモデルを設計する必要がある。
さらに, 動作概念の学習, 属性・ニューラル演算, 時間的関係などを通じて, 記号的推論とモジュラー設計を用いて, 動作をグラウンド化するためのニューロシンボリック手法であるNSPoseを提案する。
我々は,NSPoseのHumanMotionQAタスクに対する適合性を実証し,すべてのベースライン手法より優れていることを示す。 In order to build artificial intelligence systems that can perceive and reason with human behavior in the real world, we must first design models that conduct complex spatio-temporal reasoning over motion sequences. Moving towards this goal, we propose the HumanMotionQA task to evaluate complex, multi-step reasoning abilities of models on long-form human motion sequences. We generate a dataset of question-answer pairs that require detecting motor cues in small portions of motion sequences, reasoning temporally about when events occur, and querying specific motion attributes. In addition, we propose NSPose, a neuro-symbolic method for this task that uses symbolic reasoning and a modular design to ground motion through learning motion concepts, attribute neural operators, and temporal relations. We demonstrate the suitability of NSPose for the HumanMotionQA task, outperforming all baseline methods. | 翻訳日:2023-05-17 17:20:43 公開日:2023-05-15 |
# 固有値問題に対するほぼ退化密度行列摂動理論の係数 Coefficients of almost-degenerate density matrix perturbation theory for eigenvalue problems ( http://arxiv.org/abs/2305.09026v1 ) ライセンス: Link先を確認 | Charles Arnal, Louis Garrigue | (参考訳) 固有値問題のほぼ退化摂動理論をスペクトルプロジェクタ、別名密度行列を用いて検討する。
級数の係数の表現におけるこれらの人工特異点を取り除き、固有値のギャップを任意に小さくし、結果の式で消えることさえできる。 We investigate almost-degenerate perturbation theory of eigenvalue problems, using spectral projectors, also named density matrices. When several eigenvalues are close to each other, the coefficients of the perturbative series become singular because inverses of differences between eigenvalues arise as some factors. We remove those artificial singularities in the expressions of the coefficients of the series, allowing eigenvalue gaps to be arbitrarily small and even vanishing in the resulting formulas. | 翻訳日:2023-05-17 17:12:45 公開日:2023-05-15 |
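To see where such singular factors come from, it may help to recall the textbook first-order term for the spectral projector of a single non-degenerate level. The notation below ($H_0$, $V$, $E_k$, $\phi_k$, $P_1$) is assumed for illustration and is not taken from the paper. For $H(\lambda) = H_0 + \lambda V$ with $H_0 \phi_k = E_k \phi_k$ and $P(\lambda) = |\phi_i\rangle\langle\phi_i| + \lambda P_1 + O(\lambda^2)$, standard Rayleigh-Schrödinger theory gives

$$
P_1 = \sum_{a \neq i} \frac{\langle \phi_a, V \phi_i \rangle}{E_i - E_a} \, |\phi_a\rangle\langle\phi_i| + \mathrm{h.c.}
$$

Each term carries a factor $1/(E_i - E_a)$, so the coefficient blows up as a neighbouring eigenvalue $E_a$ approaches $E_i$, even though the projector onto the whole cluster of nearby levels remains well behaved. This is the kind of artificial singularity the abstract refers to.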
# 多言語難読検索のためのソフトプロンプトデコーディング Soft Prompt Decoding for Multilingual Dense Retrieval ( http://arxiv.org/abs/2305.09025v1 ) ライセンス: Link先を確認 | Zhiqi Huang, Hansi Zeng, Hamed Zamani and James Allan | (参考訳) 本研究では,複数の言語に文書を格納する多言語情報検索(MLIR)タスクについて検討する。
これは、多言語コレクションの不均一で不均衡な性質のためである - いくつかの言語はコレクションで表現され、大規模なトレーニングデータの恩恵を受けている。
我々は、言語バイアスが少なく、新しい言語へのゼロショット転送能力が向上していることを示すため、広範囲な分析を行う。 In this work, we explore a Multilingual Information Retrieval (MLIR) task, where the collection includes documents in multiple languages. We demonstrate that applying state-of-the-art approaches developed for cross-lingual information retrieval to MLIR tasks leads to sub-optimal performance. This is due to the heterogeneous and imbalanced nature of multilingual collections -- some languages are better represented in the collection and some benefit from large-scale training data. To address this issue, we present KD-SPD, a novel soft prompt decoding approach for MLIR that implicitly "translates" the representation of documents in different languages into the same embedding space. To address the challenges of data scarcity and imbalance, we introduce a knowledge distillation strategy. The teacher model is trained on rich English retrieval data, and by leveraging bi-text data, our distillation framework transfers its retrieval knowledge to the multilingual document encoder. Therefore, our approach does not require any multilingual retrieval training data. Extensive experiments on three MLIR datasets with a total of 15 languages demonstrate that KD-SPD significantly outperforms competitive baselines in all cases. We conduct extensive analyses to show that our method has less language bias and better zero-shot transfer ability towards new languages. | 翻訳日:2023-05-17 17:12:37 公開日:2023-05-15 |
# It Takes Two to Tango: NLPタスクの概念化のナビゲートとパフォーマンスの測定 It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance ( http://arxiv.org/abs/2305.09022v1 ) ライセンス: Link先を確認 | Arjun Subramonian, Xingdi Yuan, Hal Daumé III, Su Lin Blodgett | (参考訳) NLPの進歩は、ベンチマークを通じてますます測定されるため、進歩を文脈化するには、いつ、なぜ実践者がベンチマークの有効性について意見が一致しないのかを理解する必要がある。
2) モデル性能の測定方法について検討する。
最後に,本分類に基づいて,ベンチマークを構築し,その限界を文書化する枠組みを提案する。 Progress in NLP is increasingly measured through benchmarks; hence, contextualizing progress requires understanding when and why practitioners may disagree about the validity of benchmarks. We develop a taxonomy of disagreement, drawing on tools from measurement modeling, and distinguish between two types of disagreement: 1) how tasks are conceptualized and 2) how measurements of model performance are operationalized. To provide evidence for our taxonomy, we conduct a meta-analysis of relevant literature to understand how NLP tasks are conceptualized, as well as a survey of practitioners about their impressions of different factors that affect benchmark validity. Our meta-analysis and survey across eight tasks, ranging from coreference resolution to question answering, uncover that tasks are generally not clearly and consistently conceptualized and benchmarks suffer from operationalization disagreements. These findings support our proposed taxonomy of disagreement. Finally, based on our taxonomy, we present a framework for constructing benchmarks and documenting their limitations. | 翻訳日:2023-05-17 17:12:13 公開日:2023-05-15 |
# DATED: エンジニアリング設計アプリケーションのための合成データセット作成ガイドライン DATED: Guidelines for Creating Synthetic Datasets for Engineering Design Applications ( http://arxiv.org/abs/2305.09018v1 ) ライセンス: Link先を確認 | Cyril Picard, Jürg Schiffmann and Faez Ahmed | (参考訳) ChatGPTとDALL-Eがデモした人工知能の最近の進歩を、現実世界のアプリケーションに展開するには、膨大な、ドメイン固有の、パブリックアクセス可能なデータセットが必要である。
さらに, ターボ圧縮機データセットの作成により, これらのガイドラインの実用的意義を示す。
データセットとメソッドのコードとデータはhttps://github.com/cyrilpic/radcompで公開されている。 Exploiting the recent advancements in artificial intelligence, showcased by ChatGPT and DALL-E, in real-world applications necessitates vast, domain-specific, and publicly accessible datasets. Unfortunately, the scarcity of such datasets poses a significant challenge for researchers aiming to apply these breakthroughs in engineering design. Synthetic datasets emerge as a viable alternative. However, practitioners are often uncertain about generating high-quality datasets that accurately represent real-world data and are suitable for the intended downstream applications. This study aims to fill this knowledge gap by proposing comprehensive guidelines for generating, annotating, and validating synthetic datasets. The trade-offs and methods associated with each of these aspects are elaborated upon. Further, the practical implications of these guidelines are illustrated through the creation of a turbo-compressors dataset. The study underscores the importance of thoughtful sampling methods to ensure the appropriate size, diversity, utility, and realism of a dataset. It also highlights that design diversity does not equate to performance diversity or realism. By employing test sets that represent uniform, real, or task-specific samples, the influence of sample size and sampling strategy is scrutinized. Overall, this paper offers valuable insights for researchers intending to create and publish synthetic datasets for engineering design, thereby paving the way for more effective applications of AI advancements in the field. The code and data for the dataset and methods are made publicly accessible at https://github.com/cyrilpic/radcomp . | 翻訳日:2023-05-17 17:11:56 公開日:2023-05-15 |
# Gaussian Process Port-Hamiltonian Systems:Bayesian Learning with Physics Prior Gaussian Process Port-Hamiltonian Systems: Bayesian Learning with Physics Prior ( http://arxiv.org/abs/2305.09017v1 ) ライセンス: Link先を確認 | Thomas Beckers, Jacob Seidman, Paris Perdikaris, George J. Pappas | (参考訳) データ駆動アプローチは、収集されたデータに基づく複雑なダイナミクスのモデリングにおいて顕著な結果をもたらす。
この省略は2つの点で好ましくない: モデルは物理的事前知識を組み込むことによって、よりデータ効率が良くないし、モデル自体が物理的に正しいものではないかもしれない。
ガウス過程ポートハミルトニアン系 (GP-PHS) を不確実性定量化を伴う物理情報に基づくベイズ学習手法として提案する。
さらに,提案手法はポートハミルトニアン系の構成的性質を保っている。 Data-driven approaches achieve remarkable results for the modeling of complex dynamics based on collected data. However, these models often neglect basic physical principles which determine the behavior of any real-world system. This omission is unfavorable in two ways: The models are not as data-efficient as they could be by incorporating physical prior knowledge, and the model itself might not be physically correct. We propose Gaussian Process Port-Hamiltonian systems (GP-PHS) as a physics-informed Bayesian learning approach with uncertainty quantification. The Bayesian nature of GP-PHS uses collected data to form a distribution over all possible Hamiltonians instead of a single point estimate. Due to the underlying physics model, a GP-PHS generates passive systems with respect to designated inputs and outputs. Further, the proposed approach preserves the compositional nature of Port-Hamiltonian systems. | 翻訳日:2023-05-17 17:11:36 公開日:2023-05-15 |
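For readers unfamiliar with the model class, the standard input-state-output form of a port-Hamiltonian system can be written as follows. The symbols $J$, $R$, $G$, $H$, $u$, $y$ are the conventional ones and are assumed here, since the abstract itself gives no equations:

$$
\dot{x} = \bigl(J(x) - R(x)\bigr)\nabla H(x) + G(x)\,u, \qquad y = G(x)^{\top} \nabla H(x),
$$

with $J(x) = -J(x)^{\top}$ and $R(x) \succeq 0$. Differentiating the Hamiltonian along trajectories gives

$$
\dot{H} = -\nabla H^{\top} R \,\nabla H + y^{\top} u \;\le\; y^{\top} u,
$$

which is the passivity property mentioned in the abstract: the stored energy cannot grow faster than the power supplied through the port $(u, y)$. In a GP-PHS, as described above, the Hamiltonian $H$ is the quantity given a distribution rather than a single point estimate.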
# 脳腫瘍分離(BraTS)チャレンジ2023: 腫瘍分離(BraSyn)のための脳MR画像合成 The Brain Tumor Segmentation (BraTS) Challenge 2023: Brain MR Image Synthesis for Tumor Segmentation (BraSyn) ( http://arxiv.org/abs/2305.09011v1 ) ライセンス: Link先を確認 | Hongwei Bran Li, Gian Marco Conte, Syed Muhammad Anwar, Florian Kofler, Koen van Leemput, Marie Piraud, Ivan Ezhov, Felix Meissen, Maruf Adewole, Syed Muhammad Anwar, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Ahmed W. Moawad, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, Chunhao Wang, Xinyang Liu, Zhifan Jiang, Ariana Familiar, Elaine Johanson, Zeke Meier, Christos Davatzikos, John Freymann, Justin Kirby, Michel Bilello, Hassan M. Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Rivka R. Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc André Weber, Abhishek Mahajan, Suyash Mohan, John Mongan, Christopher Hess, Soonmee Cha, Javier Villanueva, Meyer Errol Colak, Priscila Crivellaro, Andras Jakab, Jake Albrecht, Udunna Anazodo, Mariam Aboian, Thomas Yu, Verena Chung, Timothy Bergquist, James Eddy, Jake Albrecht, Ujjwal Baid, Spyridon Bakas, Marius George Linguraru, Bjoern Menze, Juan Eugenio Iglesias, Benedikt Wiestler | (参考訳) 自動脳腫瘍分割法は確立されており、明確な臨床的有用性を持つパフォーマンスレベルに達する。
したがって, これらのシナリオにおいて, セグメンテーション性能の回復に欠かせないモダリティを置換することは, 臨床ルーチンにおいて, より広く採用されるためには, 極めて望ましいものである。
画像データセットは多様で多様であり、様々な病院や研究機関と連携して作成された。 Automated brain tumor segmentation methods are well established, reaching performance levels with clear clinical utility. Most algorithms require four input magnetic resonance imaging (MRI) modalities, typically T1-weighted images with and without contrast enhancement, T2-weighted images, and FLAIR images. However, some of these sequences are often missing in clinical practice, e.g., because of time constraints and/or image artifacts (such as patient motion). Therefore, substituting missing modalities to recover segmentation performance in these scenarios is highly desirable and necessary for the more widespread adoption of such algorithms in clinical routine. In this work, we report the set-up of the Brain MR Image Synthesis Benchmark (BraSyn), organized in conjunction with the Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2023. The objective of the challenge is to benchmark image synthesis methods that realistically synthesize missing MRI modalities given multiple available images to facilitate automated brain tumor segmentation pipelines. The image dataset is multi-modal and diverse, created in collaboration with various hospitals and research institutions. | 翻訳日:2023-05-17 17:11:22 公開日:2023-05-15 |
# 物理学強化ガウス過程変分オートエンコーダ Physics-enhanced Gaussian Process Variational Autoencoder ( http://arxiv.org/abs/2305.09006v1 ) ライセンス: Link先を確認 | Thomas Beckers, Qirui Wu, George J. Pappas | (参考訳) 変分オートエンコーダは、高次元の入出力データに基づいて低次元の潜在空間を学習できる。
提案手法の利点は振動粒子を用いたシミュレーションで強調される。 Variational autoencoders make it possible to learn a lower-dimensional latent space based on high-dimensional input/output data. Using video clips as input data, the encoder may be used to describe the movement of an object in the video without ground truth data (unsupervised learning). Even though the object's dynamics are typically based on first principles, this prior knowledge is mostly ignored in the existing literature. Thus, we propose a physics-enhanced variational autoencoder that places a physics-enhanced Gaussian process prior on the latent dynamics to improve the efficiency of the variational autoencoder and to allow physically correct predictions. The physical prior knowledge, expressed as a linear dynamical system, is here reflected by the Green's function and included in the kernel function of the Gaussian process. The benefits of the proposed approach are highlighted in a simulation with an oscillating particle. | 翻訳日:2023-05-17 17:11:00 公開日:2023-05-15 |
# プラグアンドプレイ画像復元のための拡散モデル Denoising Diffusion Models for Plug-and-Play Image Restoration ( http://arxiv.org/abs/2305.08995v1 ) ライセンス: Link先を確認 | Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool | (参考訳) プラグアンドプレイ画像復元(IR)は,既往の暗黙のイメージとして市販のデノイザを用いて,様々な逆問題を解決するフレキシブルかつ解釈可能な方法として広く認識されている。
gaussian denoisersを識別するプラグイン・アンド・プレイ ir 法と比較して、diffpir は拡散モデルの生成能力を継承することが期待されている。
ソースコードは https://github.com/yuanzhi-zhu/DiffPIR で入手できる。 Plug-and-play Image Restoration (IR) has been widely recognized as a flexible and interpretable method for solving various inverse problems by utilizing any off-the-shelf denoiser as the implicit image prior. However, most existing methods focus on discriminative Gaussian denoisers. Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to serve as a generative denoiser prior for plug-and-play IR methods remains to be further explored. While several other attempts have been made to adopt diffusion models for image restoration, they either fail to achieve satisfactory results or typically require an unacceptable number of Neural Function Evaluations (NFEs) during inference. This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework. Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models. Experimental results on three representative IR tasks, including super-resolution, image deblurring, and inpainting, demonstrate that DiffPIR achieves state-of-the-art performance on both the FFHQ and ImageNet datasets in terms of reconstruction faithfulness and perceptual quality with no more than 100 NFEs. The source code is available at https://github.com/yuanzhi-zhu/DiffPIR | 翻訳日:2023-05-17 17:10:44 公開日:2023-05-15 |
# 機械学習を用いた制御フローグラフによるマルウェア解析 Survey of Malware Analysis through Control Flow Graph using Machine Learning ( http://arxiv.org/abs/2305.08993v1 ) ライセンス: Link先を確認 | Shaswata Mitra, Stephen A. Torri, Sudip Mittal | (参考訳) マルウェアはコンピュータシステムやネットワークのセキュリティにとって重大な脅威であり、検出の動作と機能を分析するための高度な技術を必要とする。
具体的には,cfg ベースのマルウェア検出に適用された異なる ml アルゴリズムと同様に,これまで使用されてきた cfg 機能の種類を包括的に概観する。
我々は、これらのアプローチの課題と限界を詳細に分析するとともに、オープンな問題に対処する潜在的な解決策を提案し、この分野の研究の今後の方向性を約束する。 Malware is a significant threat to the security of computer systems and networks which requires sophisticated techniques to analyze the behavior and functionality for detection. Traditional signature-based malware detection methods have become ineffective in detecting new and unknown malware due to their rapid evolution. One of the most promising techniques that can overcome the limitations of signature-based detection is to use control flow graphs (CFGs). CFGs leverage the structural information of a program to represent the possible paths of execution as a graph, where nodes represent instructions and edges represent control flow dependencies. Machine learning (ML) algorithms are being used to extract these features from CFGs and classify them as malicious or benign. In this survey, we aim to review some state-of-the-art methods for malware detection through CFGs using ML, focusing on the different ways of extracting, representing, and classifying. Specifically, we present a comprehensive overview of different types of CFG features that have been used as well as different ML algorithms that have been applied to CFG-based malware detection. We provide an in-depth analysis of the challenges and limitations of these approaches, as well as suggest potential solutions to address some open problems and promising future directions for research in this field. | 翻訳日:2023-05-17 17:10:19 公開日:2023-05-15 |
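The graph representation described in this abstract (nodes are instructions, edges are control flow dependencies) can be illustrated with a minimal Python sketch. The three-opcode toy instruction set (`jmp`, `jz`, `ret`), the tuple encoding of a program, and the chosen features are all hypothetical and serve only as an illustration; the surveyed methods work on real disassembly and use much richer features.

```python
def build_cfg(program):
    """Return the CFG edge set of a toy program.

    program is a list of (opcode, target) tuples, where target is the index
    of the jump destination or None. The instruction set is hypothetical.
    """
    edges = set()
    last = len(program) - 1
    for index, (opcode, target) in enumerate(program):
        if opcode == "jmp":
            # Unconditional jump: control only continues at the target.
            edges.add((index, target))
        elif opcode == "jz":
            # Conditional jump: control continues at the target or falls through.
            edges.add((index, target))
            if index < last:
                edges.add((index, index + 1))
        elif opcode != "ret" and index < last:
            # Ordinary instruction: falls through to the next one.
            edges.add((index, index + 1))
    return edges


def cfg_features(program):
    """Compute a few simple structural features a classifier could consume."""
    edges = build_cfg(program)
    out_degree = {}
    for source, _ in edges:
        out_degree[source] = out_degree.get(source, 0) + 1
    return {
        "nodes": len(program),
        "edges": len(edges),
        "branch_nodes": sum(1 for degree in out_degree.values() if degree > 1),
    }


toy_program = [("cmp", None), ("jz", 4), ("add", None), ("jmp", 0), ("ret", None)]
print(cfg_features(toy_program))
```

A feature dictionary of this kind would then be turned into a vector and passed to whichever ML classifier is being evaluated.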
# 脳腫瘍分離(BraTS)チャレンジ2023: 塗布による健康な脳組織の局所的合成 The Brain Tumor Segmentation (BraTS) Challenge 2023: Local Synthesis of Healthy Brain Tissue via Inpainting ( http://arxiv.org/abs/2305.08992v1 ) ライセンス: Link先を確認 | Florian Kofler, Felix Meissen, Felix Steinbauer, Robert Graf, Eva Oswald, Ezequiel de da Rosa, Hongwei Bran Li, Ujjwal Baid, Florian Hoelzl, Oezguen Turgut, Izabela Horvath, Diana Waldmannstetter, Christina Bukas, Maruf Adewole, Syed Muhammad Anwar, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Ahmed W Moawad, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, Chunhao Wang, Xinyang Liu, Zhifan Jiang, Ariana Familiar, Gian-Marco Conte, Elaine Johanson, Zeke Meier, Christos Davatzikos, John Freymann, Justin Kirby, Michel Bilello, Hassan M Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Rivka R Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc-André Weber, Abhishek Mahajan, Suyash Mohan, John Mongan, Christopher Hess, Soonmee Cha, Javier Villanueva-Meyer, Errol Colak, Priscila Crivellaro, Andras Jakab, Jake Albrecht, Udunna Anazodo, Mariam Aboian, Juan Eugenio Iglesias, Koen Van Leemput, Spyridon Bakas, Daniel Rueckert, Benedikt Wiestler, Ivan Ezhov, Marie Piraud, Bjoern Menze | (参考訳) 脳MR画像の自動解析のための無数のアルゴリズムが臨床医の意思決定を支援するために利用可能である。
このジレンマを解決するために,BraTS 2023の塗装課題を紹介する。
このチャレンジは、カナダのバンクーバーで開催されたMICCAI 2023カンファレンスで開催されるBraTS 2023チャレンジの一部として組織されている。 A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with a scan that is already pathological. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantees for images featuring lesions. Examples include but are not limited to algorithms for brain anatomy parcellation, tissue segmentation, and brain extraction. To solve this dilemma, we introduce the BraTS 2023 inpainting challenge. Here, the participants' task is to explore inpainting techniques to synthesize healthy brain scans from lesioned ones. The following manuscript contains the task formulation, dataset, and submission procedure. Later it will be updated to summarize the findings of the challenge. The challenge is organized as part of the BraTS 2023 challenge hosted at the MICCAI 2023 conference in Vancouver, Canada. | 翻訳日:2023-05-17 17:09:59 公開日:2023-05-15 |
# 脳復号処理への変換学習のためのfMRIデータのペア配列に基づく自己教師付き事前学習 Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks ( http://arxiv.org/abs/2305.09057v1 ) ライセンス: Link先を確認 | Sean Paulsen, Michael Casey | (参考訳) 本研究では,機能的磁気共鳴イメージング(fMRI)データに基づくトランスフォーマーのための自己教師付き事前学習フレームワークを提案する。
本研究は,fMRIデータを用いた事前学習と伝達学習のためのトランスフォーマーアーキテクチャに関する文献の増大に寄与し,fMRIデータに基づく事前学習とマルチタスク事前学習の概念実証の役割を果たしている。 In this work we introduce a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data. | 翻訳日:2023-05-17 17:04:42 公開日:2023-05-15 |
# 井戸制御型貯留層シミュレーションのための物理インフォーメーション畳み込みリカレントサーロゲートモデル Physics-informed Convolutional Recurrent Surrogate Model for Reservoir Simulation with Well Controls ( http://arxiv.org/abs/2305.09056v1 ) ライセンス: Link先を確認 | Jungang Chen, Eduardo Gildin and John E. Killough (Texas A&M University) | (参考訳) 本稿では,物理インフォームド畳み込みリカレントニューラルネットワーク(PICRNN)を用いた流体流動モデリングのための新しい代理モデルを提案する。
このモデルは畳み込み型long-short term memory (convlstm) を用いて、多孔質流れにおける状態進化ダイナミクスの時空間依存性を捉える。
将来の well/system 制御に基づく貯留層動力学予測におけるモデルの有効性を示す3つの数値ケースについて検討した。
提案モデルにより, 地下流動の効率的かつ正確な予測が可能となり, 貯水池工学における最適制御設計への応用が期待できる。 This paper presents a novel surrogate model for modeling subsurface fluid flow with well controls using a physics-informed convolutional recurrent neural network (PICRNN). The model uses a convolutional long short-term memory (ConvLSTM) to capture the spatiotemporal dependencies of the state evolution dynamics in the porous flow. The ConvLSTM is linked to the state space equations, enabling the incorporation of a discrete-time sequence of well control. The model requires an initial state condition and a sequence of well controls as inputs, and predicts the state variables of the system, such as pressure, as output. By minimizing the residuals of reservoir flow state-space equations, the network is trained without the need for labeled data. The model is designed to serve as a surrogate model for predicting future reservoir states based on the initial reservoir state and input engineering controls. Boundary conditions are enforced into the state-space equations so no additional loss term is needed. Three numerical cases are studied, demonstrating the model's effectiveness in predicting reservoir dynamics based on future well/system controls. The proposed model provides a new approach for efficient and accurate prediction of subsurface fluid flow, with potential applications in optimal control design for reservoir engineering. | 翻訳日:2023-05-17 17:04:27 公開日:2023-05-15 |
# 再構成可能な量子インターネットサービスプロバイダ Reconfigurable Quantum Internet Service Provider ( http://arxiv.org/abs/2305.09048v1 ) ライセンス: Link先を確認 | Zhaohui Yang, Chaohan Cui | (参考訳) 近年の工学量子システムの発展により、スケーラブルな局所領域量子ネットワークの実現が可能になった。
アリゾナ大学(UA)のCenter for Quantum Networks(CQN)のファイバベースの量子ネットワークテストベッド上に構築され、Platform-as-a-Service(PaaS)アーキテクチャに基づいた統合QISPプロトタイプを開発し、古典的な制御ソフトウェアをオープンソースQISPフレームワークとして抽象化、モジュール化する。
我々の実験はQISPの堅牢性を示し、将来の量子ネットワークのためのアーキテクチャとプロトコルの設計と検証の基礎を築いた。 With the recent developments in engineering quantum systems, the realization of scalable local-area quantum networks has become viable. However, the design and implementation of a quantum network is a holistic task that is way beyond the scope of an abstract design problem. As such, a testbed on which multiple disciplines can verify the design and implementation across a full networking stack has become a necessary infrastructure for the future development of quantum networks. In this work, we demonstrate the concept of quantum internet service provider (QISP), in analogy to the conventional ISP that allows for the sharing of classical information between the network nodes. The QISP is significant for the next-generation quantum networks as it coordinates the production, management, control, and sharing of quantum information across the end-users of a quantum network. We construct a reconfigurable QISP comprising both the quantum hardware and classical control software. Building on the fiber-based quantum-network testbed of the Center for Quantum Networks (CQN) at the University of Arizona (UA), we develop an integrated QISP prototype based on a Platform-as-a-Service (PaaS) architecture, whose classical control software is abstracted and modularized as an open-source QISP framework. To verify and characterize the QISP's performance, we demonstrate multi-channel entanglement distribution and routing among multiple quantum-network nodes with a time-energy entangled-photon source. We further perform field tests of concurrent services for multiple users across the quantum-network testbed. Our experiment demonstrates the robust capabilities of the QISP, laying the foundation for the design and verification of architectures and protocols for future quantum networks. | 翻訳日:2023-05-17 17:04:07 公開日:2023-05-15 |
# 確率シンプレックス上の凸最適化 Convex optimization over a probability simplex ( http://arxiv.org/abs/2305.09046v1 ) ライセンス: Link先を確認 | James Chok and Geoffrey M. Vasil | (参考訳) 確率単純度 $\{w\in\mathbb{R}^n\ |\ \sum_i w_i=1\ \textrm{and}\ w_i\geq0\}$ 上の凸問題を最適化する新しい反復スキームCauchy-Simplexを提案する。
Cauchy-Simplex の各イテレーションは単純な操作で構成され、高次元問題に適している。
最後に,本アルゴリズムをオンライン学習問題に適用し,(1)専門家のアドバイスによる予測と(2)ユニバーサルポートフォリオによる平均後悔の収束を証明した。 We propose a new iteration scheme, the Cauchy-Simplex, to optimize convex problems over the probability simplex $\{w\in\mathbb{R}^n\ |\ \sum_i w_i=1\ \textrm{and}\ w_i\geq0\}$. Other works have taken steps to enforce positivity or unit normalization automatically but never simultaneously within a unified setting. This paper presents a natural framework for manifestly requiring the probability condition. Specifically, we map the simplex to the positive quadrant of a unit sphere, envisage gradient descent in latent variables, and map the result back in a way that only depends on the simplex variable. Moreover, proving rigorous convergence results in this formulation leads inherently to tools from information theory (e.g. cross entropy and KL divergence). Each iteration of the Cauchy-Simplex consists of simple operations, making it well-suited for high-dimensional problems. We prove that it has a convergence rate of ${O}(1/T)$ for convex functions, and numerical experiments of projection onto convex hulls show faster convergence than similar algorithms. Finally, we apply our algorithm to online learning problems and prove the convergence of the average regret for (1) Prediction with expert advice and (2) Universal Portfolios. | 翻訳日:2023-05-17 17:03:40 公開日:2023-05-15 |
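To make the idea of enforcing positivity and unit normalization at the same time concrete, here is a minimal Python sketch of a multiplicative descent step that never leaves the probability simplex. The update rule, the step-size cap, and all names below are illustrative assumptions; the abstract does not spell out the Cauchy-Simplex iteration, and the paper's scheme may differ in its details.

```python
def simplex_step(w, grad, learning_rate=0.5, safety=0.9):
    """One descent step that keeps w positive and summing to one.

    Assumed update (illustrative, not quoted from the paper):
        w_i <- w_i * (1 - eta * (grad_i - <w, grad>))
    The centred gradient has zero w-weighted mean, so the sum of w is
    preserved; capping eta keeps every multiplicative factor positive.
    """
    mean_grad = sum(wi * gi for wi, gi in zip(w, grad))
    centred = [gi - mean_grad for gi in grad]
    largest = max(centred)
    if largest <= 0.0:
        # Every centred entry is zero (up to sign): w is already stationary.
        return list(w)
    eta = min(learning_rate, safety / largest)
    return [wi * (1.0 - eta * ci) for wi, ci in zip(w, centred)]


# Usage: minimise 0.5 * ||w - target||^2 over the simplex.
target = [0.7, 0.2, 0.1]
weights = [1.0 / 3.0] * 3
for _ in range(2000):
    gradient = [wi - ti for wi, ti in zip(weights, target)]
    weights = simplex_step(weights, gradient)
print(weights, sum(weights))
```

Because the step is multiplicative, a coordinate that starts at exactly zero stays at zero, so an interior starting point such as the uniform distribution is the natural choice for this sketch.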
# 大規模データに対するスケーラブルかつロバストなテンソルリング分解 Scalable and Robust Tensor Ring Decomposition for Large-scale Data ( http://arxiv.org/abs/2305.09044v1 ) ライセンス: Link先を確認 | Yicong He and George K. Atia | (参考訳) テンソルリング(TR)分解は高次テンソルの表現性能に優れており,近年注目されている。
まず, 難解な項目を適応的に満たし, 分解過程中に異常点を識別できる新しい自動重み付き急降下法を開発した。
実験の結果,提案手法が既存のtr分解法よりも,異常値の存在下で優れており,既存の頑健なテンソル補完アルゴリズムよりもかなり高速に動作することがわかった。 Tensor ring (TR) decomposition has recently received increased attention due to its superior expressive performance for high-order tensors. However, the applicability of traditional TR decomposition algorithms to real-world applications is hindered by prevalent large data sizes, missing entries, and corruption with outliers. In this work, we propose a scalable and robust TR decomposition algorithm capable of handling large-scale tensor data with missing entries and gross corruptions. We first develop a novel auto-weighted steepest descent method that can adaptively fill the missing entries and identify the outliers during the decomposition process. Further, taking advantage of the tensor ring model, we develop a novel fast Gram matrix computation (FGMC) approach and a randomized subtensor sketching (RStS) strategy which yield significant reduction in storage and computational complexity. Experimental results demonstrate that the proposed method outperforms existing TR decomposition methods in the presence of outliers, and runs significantly faster than existing robust tensor completion algorithms. | 翻訳日:2023-05-17 17:03:11 公開日:2023-05-15 |
# 階層型無線ネットワークにおける適応フェデレーションプルーニング Adaptive Federated Pruning in Hierarchical Wireless Networks ( http://arxiv.org/abs/2305.09042v1 ) ライセンス: Link先を確認 | Xiaonan Liu and Shiqiang Wang and Yansha Deng and Arumugam Nallanathan | (参考訳) Federated Learning(FL)は、サーバがプライベートデータセットにアクセスすることなく、複数のデバイスによって更新されたモデルを集約する、有望なプライバシ保護分散学習フレームワークである。
最適化問題を解き、KKT(Karush Kuhn Tucker)条件を用いることで、プルーニング比と無線リソース割り当ての閉形式解が導出される。
シミュレーションの結果,提案したHFLとモデルプルーニングを併用したHFLは,モデルプルーニングを使用せず,通信コストを約50%削減できることがわかった。 Federated Learning (FL) is a promising privacy-preserving distributed learning framework where a server aggregates models updated by multiple devices without accessing their private datasets. Hierarchical FL (HFL), as a device-edge-cloud aggregation hierarchy, can enjoy both the cloud server's access to more datasets and the edge servers' efficient communications with devices. However, the learning latency increases with the HFL network scale due to the increasing number of edge servers and devices with limited local computation capability and communication bandwidth. To address this issue, in this paper, we introduce model pruning for HFL in wireless networks to reduce the neural network scale. We present the convergence analysis of an upper bound on the l2 norm of gradients for HFL with model pruning, analyze the computation and communication latency of the proposed model pruning scheme, and formulate an optimization problem to maximize the convergence rate under a given latency threshold by jointly optimizing the pruning ratio and wireless resource allocation. By decoupling the optimization problem and using Karush Kuhn Tucker (KKT) conditions, closed-form solutions of pruning ratio and wireless resource allocation are derived. Simulation results show that our proposed HFL with model pruning achieves similar learning accuracy compared with the HFL without model pruning and reduces communication cost by about 50 percent. | 翻訳日:2023-05-17 17:02:53 公開日:2023-05-15 |
# トレーサグラフィにおける強化学習の意義 What Matters in Reinforcement Learning for Tractography ( http://arxiv.org/abs/2305.09041v1 ) ライセンス: Link先を確認 | Antoine Théberge, Christian Desrosiers, Maxime Descoteaux, Pierre-Marc Jodoin | (参考訳) 近年,手作業による基準流路の整備を行なわずに白質の構造を再構築するためのトラクトグラフィー法と訓練薬の学習のために深部強化学習(RL)が提案されている。
トラクトログラフィのための強化学習を探求したいユーザや研究者のために、オープンソースのコードベース、トレーニングされたモデル、データセットもリリースしています。 Recently, deep reinforcement learning (RL) has been proposed to learn the tractography procedure and train agents to reconstruct the structure of the white matter without manually curated reference streamlines. While the performances reported were competitive, the proposed framework is complex, and little is still known about the role and impact of its multiple parts. In this work, we thoroughly explore the different components of the proposed framework, such as the choice of the RL algorithm, seeding strategy, the input signal and reward function, and shed light on their impact. Approximately 7,400 models were trained for this work, totalling nearly 41,000 hours of GPU time. Our goal is to guide researchers eager to explore the possibilities of deep RL for tractography by exposing what works and what does not work with this category of approach. As such, we ultimately propose a series of recommendations concerning the choice of RL algorithm, the input to the agents, the reward function and more to help future work using reinforcement learning for tractography. We also release the open source codebase, trained models, and datasets for users and researchers wanting to explore reinforcement learning for tractography. | 翻訳日:2023-05-17 17:02:27 公開日:2023-05-15 |
# 動的学習システムにおけるアルゴリズム検閲 Algorithmic Censoring in Dynamic Learning Systems ( http://arxiv.org/abs/2305.09035v1 ) ライセンス: Link先を確認 | Jennifer Chien, Margaret Roberts, Berk Ustun | (参考訳) 選択的ラベリングを受ける動的学習システムは検閲、すなわち1つ以上の点の部分群に割り当てられた持続的負の予測を示す。
検閲やランダム化探索に対する保護措置も検討しています - どちらも、守られないポイントのラベルを確実に収集するものです。
以上の結果から,検閲の無防備な害を浮き彫りにし,様々なデータ生成プロセスにおける緩和戦略の有効性を実証した。 Dynamic learning systems subject to selective labeling exhibit censoring, i.e. persistent negative predictions assigned to one or more subgroups of points. In applications like consumer finance, this results in groups of applicants that are persistently denied and thus never enter into the training data. In this work, we formalize censoring, demonstrate how it can arise, and highlight difficulties in detection. We consider safeguards against censoring - recourse and randomized-exploration - both of which ensure we collect labels for points that would otherwise go unobserved. The resulting techniques allow examples from censored groups to enter into the training data and correct the model. Our results highlight the otherwise unmeasured harms of censoring and demonstrate the effectiveness of mitigation strategies across a range of data generating processes. | 翻訳日:2023-05-17 17:02:06 公開日:2023-05-15 |
# AI in the Loop -- 自動医療画像分割パイプライン監視のためのフォールドパフォーマンスの分離機能 AI in the Loop -- Functionalizing Fold Performance Disagreement to Monitor Automated Medical Image Segmentation Pipelines ( http://arxiv.org/abs/2305.09031v1 ) ライセンス: Link先を確認 | Harrison C. Gottlich, Panagiotis Korfiatis, Adriana V. Gregory, Timothy L. Kline | (参考訳) 機械学習のワークフローを臨床実践に安全に実装し、モデルのトレーニング中に難しいケースを特定するためには、パフォーマンス予測を自動でフラグする手法が不可欠である。
クロスフォールドなサブモデルの不一致と人間のオブザーバー値の比較は、モデルの認識の不確実性 - 関連するトレーニングデータ不足による知識不足 - を、臨床で採用するための重要な機能として近似する効率的な方法である。 Methods for automatically flagging poorly performing predictions are essential for safely implementing machine learning workflows into clinical practice and for identifying difficult cases during model training. We present a readily adoptable method using sub-models trained on different dataset folds, where their disagreement serves as a surrogate for model confidence. Thresholds informed by human interobserver values were used to determine whether a final ensemble model prediction would require manual review. In two different datasets (abdominal CT and MR predicting kidney tumors), our framework effectively identified low performing automated segmentations. Flagging images with a minimum Interfold test Dice score below human interobserver variability maximized the number of flagged images while ensuring maximum ensemble test Dice. When our internally trained model was applied to an external publicly available dataset (KiTS21), flagged images included smaller tumors than those observed in our internally trained dataset, demonstrating the method's robustness in flagging poorly performing out-of-distribution input data. Comparing interfold sub-model disagreement against human interobserver values is an efficient way to approximate a model's epistemic uncertainty - its lack of knowledge due to insufficient relevant training data - a key functionality for adopting these applications in clinical practice. | 翻訳日:2023-05-17 17:01:55 公開日:2023-05-15 |
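The flagging rule in this abstract (send a case to manual review when the minimum interfold Dice falls below a human interobserver threshold) might be sketched as follows. Computing the Dice score pairwise between sub-model masks, the flat 0/1 mask encoding, and the threshold value of 0.8 are assumptions made for illustration; the abstract does not give these details.

```python
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice overlap of two binary masks given as equal-length 0/1 sequences."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def needs_review(fold_masks, threshold):
    """Flag a case when the worst pairwise agreement between folds is too low.

    fold_masks holds one predicted mask per fold sub-model (at least two).
    Returns (flag, minimum pairwise Dice).
    """
    min_dice = min(dice(a, b) for a, b in combinations(fold_masks, 2))
    return min_dice < threshold, min_dice


# Hypothetical threshold standing in for a measured interobserver Dice.
INTEROBSERVER_DICE = 0.8

consistent = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
inconsistent = [[1, 1, 0, 0], [0, 1, 1, 0], [1, 1, 0, 0]]
print(needs_review(consistent, INTEROBSERVER_DICE))
print(needs_review(inconsistent, INTEROBSERVER_DICE))
```

In practice the masks would be 3D segmentation volumes, and the threshold would come from measured agreement between human annotators on the same task.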
# SKIの高速化 - 非対称カーネルによるToeplitzニューラルネットワークの高速化 SKI to go Faster: Accelerating Toeplitz Neural Networks via Asymmetric Kernels ( http://arxiv.org/abs/2305.09028v1 ) ライセンス: Link先を確認 | Alexander Moreno, Jonathan Mei, Luke Walters | (参考訳) Toeplitz Neural Networks (TNN) (Qin et al. 2023) は、印象的な結果を持つ最近のシーケンスモデルである。
これらは O(n log n) 計算複雑性と O(n) 相対位置エンコーダ (RPE) 多層パーセプトロン (MLP) と崩壊バイアス呼び出しを必要とする。
1) 学習した核は,主対角線付近にスパイクな振る舞いを示す。
2) RPE MLP は遅い。
低階成分に対しては、線形補間により RPE MLP を置換し、O(n) の複雑性に対して非対称な構造化カーネル補間 (SKI) (Wilson et al. 2015) を用いる。
因果モデルでは、"高速"因果マスク (Katharopoulos et al. 2020) はSKIの利点を否定する。
これは O(n log n) の複雑性を維持するが、絶対的なスピードアップを達成する。
我々は,最小限のスコア劣化を伴って,ロングレンジアリーナ(Tay et al. 2020)の速度状態を設定した。 Toeplitz Neural Networks (TNNs) (Qin et al. 2023) are a recent sequence model with impressive results. They require O(n log n) computational complexity and O(n) relative positional encoder (RPE) multi-layer perceptron (MLP) and decay bias calls. We aim to reduce both. We first note that the RPE is a non-SPD (symmetric positive definite) kernel and the Toeplitz matrices are pseudo-Gram matrices. Further 1) the learned kernels display spiky behavior near the main diagonals with otherwise smooth behavior; 2) the RPE MLP is slow. For bidirectional models, this motivates a sparse plus low-rank Toeplitz matrix decomposition. For the sparse component's action, we do a small 1D convolution. For the low rank component, we replace the RPE MLP with linear interpolation and use asymmetric Structured Kernel Interpolation (SKI) (Wilson et al. 2015) for O(n) complexity: we provide rigorous error analysis. For causal models, "fast" causal masking (Katharopoulos et al. 2020) negates SKI's benefits. Working in the frequency domain, we avoid an explicit decay bias. To enforce causality, we represent the kernel via the real part of its frequency response using the RPE and compute the imaginary part via a Hilbert transform. This maintains O(n log n) complexity but achieves an absolute speedup. Modeling the frequency response directly is also competitive for bidirectional training, using one fewer FFT. We set a speed state of the art on Long Range Arena (Tay et al. 2020) with minimal score degradation. | 翻訳日:2023-05-17 17:01:28 公開日:2023-05-15 |
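The O(n log n) baseline cost mentioned in this abstract corresponds to the standard way of multiplying a Toeplitz matrix by a vector: embed the matrix in a larger circulant matrix and multiply through FFTs. The pure-Python sketch below illustrates only that baseline operation, not the SKI or frequency-domain methods the paper proposes; the function names and the first-column/first-row encoding are illustrative choices.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two, unnormalised)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def toeplitz_matvec(first_col, first_row, x):
    """Compute T @ x in O(n log n), where T[i][j] depends only on i - j.

    first_col[k] = T[k][0] and first_row[k] = T[0][k], with
    first_col[0] == first_row[0]; all entries are real numbers.
    """
    n = len(x)
    m = 1
    while m < 2 * n - 1:  # circulant size: a power of two, at least 2n - 1
        m *= 2
    circ = [0.0] * m
    for k in range(n):
        circ[k] = first_col[k]        # diagonals on and below the main one
    for k in range(1, n):
        circ[m - k] = first_row[k]    # diagonals above the main one
    padded = list(x) + [0.0] * (m - n)
    spectrum = [a * b for a, b in zip(fft(circ), fft(padded))]
    result = fft(spectrum, inverse=True)
    return [value.real / m for value in result[:n]]


def toeplitz_matvec_dense(first_col, first_row, x):
    """Reference O(n^2) product used to check the FFT version."""
    n = len(x)
    return [
        sum((first_col[i - j] if i >= j else first_row[j - i]) * x[j] for j in range(n))
        for i in range(n)
    ]


col = [1.0, 2.0, 3.0, 4.0, 5.0]
row = [1.0, -1.0, -2.0, -3.0, -4.0]
vec = [0.5, -1.5, 2.0, 0.0, 1.0]
print(toeplitz_matvec(col, row, vec))
print(toeplitz_matvec_dense(col, row, vec))
```

The two printed vectors should agree up to floating-point rounding; the dense version is there only as a check.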
# スキンディープ:コンピュータビジョンベンチマークデータセットのためのスキントーンアノテーションにおける主観性の検討 Skin Deep: Investigating Subjectivity in Skin Tone Annotations for Computer Vision Benchmark Datasets ( http://arxiv.org/abs/2305.09072v1 ) ライセンス: Link先を確認 | Teanna Barrett, Quan Ze Chen, Amy X. Zhang | (参考訳) 人間の画像を分析するコンピュータビジョンシステムの人種差をよく観察するために、研究者たちは、公正さ評価のための人種メタデータよりも客観的なアノテーションとして肌の色に目を向けた。
我々は,皮膚トーンを用いた評価手順の設計,解析,文書化において,より大きな反射性を求める。 To investigate the well-observed racial disparities in computer vision systems that analyze images of humans, researchers have turned to skin tone as a more objective annotation than race metadata for fairness performance evaluations. However, the current state of skin tone annotation procedures is highly varied. For instance, researchers use a range of untested scales and skin tone categories, have unclear annotation procedures, and provide inadequate analyses of uncertainty. In addition, little attention is paid to the positionality of the humans involved in the annotation process--both designers and annotators alike--and the historical and sociological context of skin tone in the United States. Our work is the first to investigate the skin tone annotation process as a sociotechnical project. We surveyed recent skin tone annotation procedures and conducted annotation experiments to examine how subjective understandings of skin tone are embedded in skin tone annotation procedures. Our systematic literature review revealed the uninterrogated association between skin tone and race and the limited effort to analyze annotator uncertainty in current procedures for skin tone annotation in computer vision evaluation. Our experiments demonstrated that design decisions in the annotation procedure such as the order in which the skin tone scale is presented or additional context in the image (i.e., presence of a face) significantly affected the resulting inter-annotator agreement and individual uncertainty of skin tone annotations. We call for greater reflexivity in the design, analysis, and documentation of procedures for evaluation using skin tone. | 翻訳日:2023-05-17 16:53:58 公開日:2023-05-15 |
# FiMReSt:多変量規則スキュートカーネルの有限混合 -非対称散乱非ガウス核を持つ多クラスタデータに対するフレキシブル確率モデル FiMReSt: Finite Mixture of Multivariate Regulated Skew-t Kernels -- A Flexible Probabilistic Model for Multi-Clustered Data with Asymmetrically-Scattered Non-Gaussian Kernels ( http://arxiv.org/abs/2305.09071v1 ) ライセンス: Link先を確認 | Sarmad Mehrdad, S. Farokh Atashzar | (参考訳) 近年,データクラスタの歪度と統計的自由度(S-DoF)を考慮に入れたフレキシブルな確率論的モデリング手法としてスキュー・ト混合モデルを導入し,モデリングの一般化性の向上と重尾と歪性への堅牢性を実現している。
(c) S-DoF の収束 Recently skew-t mixture models have been introduced as a flexible probabilistic modeling technique taking into account both skewness in data clusters and the statistical degree of freedom (S-DoF) to improve modeling generalizability, and robustness to heavy tails and skewness. In this paper, we show that the state-of-the-art skew-t mixture models fundamentally suffer from a hidden phenomenon named here as "S-DoF explosion," which results in local minima in the shapes of normal kernels during the non-convex iterative process of expectation maximization. For the first time, this paper provides insights into the instability of the S-DoF, which can result in the divergence of the kernels from the mixture of t-distribution, losing generalizability and power for modeling the outliers. Thus, in this paper, we propose a regularized iterative optimization process to train the mixture model, enhancing the generalizability and resiliency of the technique. The resulting mixture model is named Finite Mixture of Multivariate Regulated Skew-t (FiMReSt) Kernels, which stabilizes the S-DoF profile during optimization process of learning. To validate the performance, we have conducted a comprehensive experiment on several real-world datasets and a synthetic dataset. The results highlight (a) superior performance of the FiMReSt, (b) generalizability in the presence of outliers, and (c) convergence of S-DoF. | 翻訳日:2023-05-17 16:53:30 公開日:2023-05-15 |
# 報酬機能を進化させるためのオフライン時間学習学習フレームワーク An Offline Time-aware Apprenticeship Learning Framework for Evolving Reward Functions ( http://arxiv.org/abs/2305.09070v1 ) ライセンス: Link先を確認 | Xi Yang, Ge Gao, Min Chi | (参考訳) Apprenticeship Learning(AL)は、専門家のデモンストレーションを観察し、模倣することによって効果的な意思決定ポリシーを誘導するプロセスである。
実験の結果,テーマは競争状態のベースラインを大きく上回ることがわかった。 Apprenticeship learning (AL) is a process of inducing effective decision-making policies via observing and imitating experts' demonstrations. Most existing AL approaches, however, are not designed to cope with the evolving reward functions commonly found in human-centric tasks such as healthcare, where offline learning is required. In this paper, we propose an offline Time-aware Hierarchical EM Energy-based Sub-trajectory (THEMES) AL framework to tackle the evolving reward functions in such tasks. The effectiveness of THEMES is evaluated via a challenging task -- sepsis treatment. The experimental results demonstrate that THEMES can significantly outperform competitive state-of-the-art baselines. | 翻訳日:2023-05-17 16:53:06 公開日:2023-05-15 |
# SGP-TOD: Schema-Guided LLM Prompting によるタスクボットの構築 SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting ( http://arxiv.org/abs/2305.09067v1 ) ライセンス: Link先を確認 | Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng | (参考訳) エンド・ツー・エンドのタスクボットの構築と、最小限の人的努力による新機能の統合は、ダイアログ研究における長年の課題である。
近年の大規模言語モデル (LLM) は、様々な下流タスクにおける会話のエンゲージメントと命令の順守において、例外的な熟練度を示している。
Multiwoz, RADDLE, STARデータセットによる実験結果から, SGP-TODはタスク固有のデータを持たず, 最先端(SOTA)ゼロショット性能を示し, 数発のアプローチを大幅に上回ることがわかった。
コードとデータを公開しています。 Building end-to-end task bots and maintaining their integration with new functionalities using minimal human effort is a long-standing challenge in dialog research. Recently large language models (LLMs) have demonstrated exceptional proficiency in conversational engagement and adherence to instructions across various downstream tasks. In this work, we introduce SGP-TOD, Schema-Guided Prompting for building Task-Oriented Dialog systems effortlessly based on LLMs. Utilizing the symbolic knowledge -- task schema, we instruct fixed LLMs to generate appropriate responses on novel tasks, circumventing the need for training data. Specifically, SGP-TOD comprises three components: an LLM for engaging with users, a DST Prompter to aid the LLM with dialog state tracking, which is then used to retrieve database items, and a Policy Prompter to elicit proper responses adhering to the provided dialog policy. Experimental results on Multiwoz, RADDLE and STAR datasets show that our training-free strategy SGP-TOD, without any task-specific data, yields state-of-the-art (SOTA) zero-shot performance, greatly surpassing the few-shot approaches. In a domain-extension setting, SGP-TOD aptly adapts to new functionalities by merely adding supplementary schema rules. We make our code and data publicly available. | 翻訳日:2023-05-17 16:52:58 公開日:2023-05-15 |
# 人間によるAIのメンタルモデルを捉える:項目応答理論のアプローチ Capturing Humans' Mental Models of AI: An Item Response Theory Approach ( http://arxiv.org/abs/2305.09064v1 ) ライセンス: Link先を確認 | Markelle Kelly, Aakriti Kumar, Padhraic Smyth, Mark Steyvers | (参考訳) 人間がAIチームメイトをどのように知覚するかの理解を改善することは、人間とAIチームの一般的な理解にとって重要な基礎となります。
これらの知見が人間とAIの相互作用に与える影響について考察した。 Improving our understanding of how humans perceive AI teammates is an important foundation for our general understanding of human-AI teams. Extending relevant work from cognitive science, we propose a framework based on item response theory for modeling these perceptions. We apply this framework to real-world experiments, in which each participant works alongside another person or an AI agent in a question-answering setting, repeatedly assessing their teammate's performance. Using this experimental data, we demonstrate the use of our framework for testing research questions about people's perceptions of both AI agents and other people. We contrast mental models of AI teammates with those of human teammates as we characterize the dimensionality of these mental models, their development over time, and the influence of the participants' own self-perception. Our results indicate that people expect AI agents' performance to be significantly better on average than the performance of other humans, with less variation across different types of problems. We conclude with a discussion of the implications of these findings for human-AI interaction. | 翻訳日:2023-05-17 16:52:33 公開日:2023-05-15 |
# 境界KRnetとその密度推定・近似への応用 Bounded KRnet and its applications to density estimation and approximation ( http://arxiv.org/abs/2305.09063v1 ) ライセンス: Link先を確認 | Li Zeng, Xiaoliang Wan, Tao Zhou | (参考訳) 本稿では,B-KRnetと呼ばれる非可逆写像を有界領域上で開発し,データに対する密度推定/近似や,Fokker-Planck方程式やKeller-Segel方程式などのPDEの解に適用する。
B-KRnet と KRnet の主な違いは、B-KRnet がハイパーキューブ上で定義されるのに対し、KRnet は全空間上で定義されることである。
KRnet と B-KRnet を結合することにより、ある次元が有界で他の次元が非有界な高次元領域上の深部生成モデルを定義できる。
B-KRnetの有効性を示すために,様々な数値実験を行った。 In this paper, we develop an invertible mapping, called B-KRnet, on a bounded domain and apply it to density estimation/approximation for data or the solutions of PDEs such as the Fokker-Planck equation and the Keller-Segel equation. Similar to KRnet, the structure of B-KRnet adapts the triangular form of the Knothe-Rosenblatt rearrangement into a normalizing flow model. The main difference between B-KRnet and KRnet is that B-KRnet is defined on a hypercube while KRnet is defined on the whole space, in other words, we introduce a new mechanism in B-KRnet to maintain the exact invertibility. Using B-KRnet as a transport map, we obtain an explicit probability density function (PDF) model that corresponds to the pushforward of a prior (uniform) distribution on the hypercube. To approximate PDFs defined on a bounded computational domain, B-KRnet is more effective than KRnet. By coupling KRnet and B-KRnet, we can also define a deep generative model on a high-dimensional domain where some dimensions are bounded and other dimensions are unbounded. A typical case is the solution of the stationary kinetic Fokker-Planck equation, which is a PDF of position and momentum. Based on B-KRnet, we develop an adaptive learning approach to approximate partial differential equations whose solutions are PDFs or can be regarded as a PDF. In addition, we apply B-KRnet to density estimation when only data are available. A variety of numerical experiments is presented to demonstrate the effectiveness of B-KRnet. | 翻訳日:2023-05-17 16:52:15 公開日:2023-05-15 |
# SuSana Distanciaが必要なのは、距離に基づく2つの新しい損失関数による距離学習におけるクラス分離可能性の強化 SuSana Distancia is all you need: Enforcing class separability in metric learning via two novel distance-based loss functions for few-shot image classification ( http://arxiv.org/abs/2305.09062v1 ) ライセンス: Link先を確認 | Mauricio Mendez-Ruiz, Jorge Gonzalez-Zapata, Ivan Reyes-Amezcua, Daniel Flores-Araiza, Francisco Lopez-Tiro, Andres Mendez-Vazquez, and Gilberto Ochoa-Ruiz | (参考訳) 少数ショット学習は、いくつかのラベル付きデータサンプルだけで新しい概念を学ぶことを目的とした、困難な研究分野である。
最初の損失関数はプロト三重項損失(proto-triplet loss)である。
実験では,Caltech CUB, Dogs, Carsといった他のドメインに対して,最先端技術と比較して競合的な一般化能力を実証した。 Few-shot learning is a challenging area of research that aims to learn new concepts with only a few labeled samples of data. Recent works based on metric-learning approaches leverage the meta-learning approach, which is encompassed by episodic tasks that make use of a support (training) set and a query (test) set with the objective of learning a similarity comparison metric between those sets. Due to the lack of data, the learning process of the embedding network becomes an important part of the few-shot task. Previous works have addressed this problem using metric learning approaches, but the properties of the underlying latent space and the separability of the different classes on it were not entirely enforced. In this work, we propose two different loss functions which consider the importance of the embedding vectors by looking at the intra-class and inter-class distance between the few data. The first loss function is the Proto-Triplet Loss, which is based on the original triplet loss with the modifications needed to better work on few-shot scenarios. The second loss function, which we dub ICNN loss, is based on an inter- and intra-class nearest neighbors score, which helps us to assess the quality of embeddings obtained from the trained network. Our results, obtained from an extensive experimental setup, show a significant improvement in accuracy on the miniImageNet benchmark compared to other metric-based few-shot learning methods by a margin of 2%, demonstrating the capability of these loss functions to allow the network to generalize better to previously unseen classes. In our experiments, we demonstrate competitive generalization capabilities to other domains, such as the Caltech CUB, Dogs and Cars datasets compared with the state of the art. | 翻訳日:2023-05-17 16:51:48 公開日:2023-05-15 |
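The abstract does not spell out the Proto-Triplet Loss, so the following is only a rough sketch of the two ingredients it names: the standard triplet loss, and class prototypes computed as mean support embeddings (as in prototypical networks). Pairing them this way, the Euclidean distance, the margin value, and all function names are assumptions made for illustration, not the authors' definition.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def prototype(embeddings):
    """Class prototype: the per-dimension mean of the support embeddings."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings for a 2-way, 2-shot episode.
support_a = [[0.0, 0.0], [0.0, 2.0]]   # class A support set
support_b = [[4.0, 1.0], [6.0, 1.0]]   # class B support set
proto_a = prototype(support_a)          # [0.0, 1.0]
proto_b = prototype(support_b)          # [5.0, 1.0]

# A class-A query that lies closer to the class-B prototype is penalized.
hard_query = [3.0, 1.0]
easy_query = [1.0, 1.0]
hard_loss = triplet_loss(hard_query, proto_a, proto_b)  # 3 - 2 + 1 = 2.0
easy_loss = triplet_loss(easy_query, proto_a, proto_b)  # max(0, 1 - 4 + 1) = 0.0
```

In this toy episode the loss is zero once a query sits closer to its own prototype than to the other one by at least the margin, which is the separability property the abstract says the losses are meant to enforce.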
# Koopman Message Passing を用いた非線形ネットワークダイナミクスのための線形埋め込み学習 Learning Linear Embeddings for Non-Linear Network Dynamics with Koopman Message Passing ( http://arxiv.org/abs/2305.09060v1 ) ライセンス: Link先を確認 | King Fai Yeh, Paris Flood, William Redman, and Pietro Liò | (参考訳) 近年、クープマン作用素理論は非線形力学系の線形表現を開発するための強力なツールとなっている。
また、ニューラルネットワークアーキテクチャの非線形トレーニングダイナミクスにもアプローチを適用し、古典的なオプティマイザによってトレーニングされたネットワークに匹敵する性能でネットワークパラメータを生成する線形表現を得る。 Recently, Koopman operator theory has become a powerful tool for developing linear representations of non-linear dynamical systems. However, existing data-driven applications of Koopman operator theory, including both traditional and deep learning approaches, perform poorly on non-linear network dynamics problems as they do not address the underlying geometric structure. In this paper we present a novel approach based on Koopman operator theory and message passing networks that finds a linear representation for the dynamical system which is globally valid at any time step. The linearisations found by our method produce predictions on a suite of network dynamics problems that are several orders of magnitude better than current state-of-the-art techniques. We also apply our approach to the highly non-linear training dynamics of neural network architectures, and obtain linear representations which can generate network parameters with comparable performance to networks trained by classical optimisers. | 翻訳日:2023-05-17 16:51:11 公開日:2023-05-15 |
# 「デジタルポンド : 家庭と企業のための新しい形態のお金」に対する反応 Response to "The digital pound: a new form of money for households and businesses" ( http://arxiv.org/abs/2305.09059v1 ) ライセンス: Link先を確認 | Geoffrey Goodell | (参考訳) この文書には、イングランド銀行とHM財務省が発行した諮問論文「The digital pound: a new form of money for households and businesses?」への回答が含まれており、2020年の「Central Bank Digital Currency: opportunities, challenges and design」、2021年の「New forms of digital money」を含むシリーズの最新諮問論文である。
このコンサルテーション・ペーパー(Consultation Paper)は、イングランド銀行がイギリスで小売用に採用した中央銀行デジタル通貨(CBDC)に関する論文である。
本書の第3部では、協議質問について直接取り上げる。 This document includes a response to a Consultation Paper published by the Bank of England and HM Treasury, "The digital pound: a new form of money for households and businesses?", the latest Consultation Paper in a series that includes "Central Bank Digital Currency: opportunities, challenges and design" in 2020 and "New forms of digital money" in 2021. This Consultation Paper is about the adoption of central bank digital currency (CBDC) for retail use in the United Kingdom by the Bank of England. We shall address the consultation questions directly in the third section of this document. | 翻訳日:2023-05-17 16:50:54 公開日:2023-05-15 |
# MLaaSにおけるプライベートトレーニングセット検査 Private Training Set Inspection in MLaaS ( http://arxiv.org/abs/2305.09058v1 ) ライセンス: Link先を確認 | Mingxue Xu, Tongtong Xu, Po-Yu Chen | (参考訳) マシンラーニング・アズ・ア・サービス(MLaaS)は、MLモデルの使用を目指すが、トレーニングデータ、計算リソース、あるいはMLの専門知識が欠如している顧客のための、一般的なクラウドベースのソリューションである。
実験の結果,本ソリューションは,メンバシップインスペクションの精度が最大 0.87 であり,多様性と公平性分布を検査する信頼性が 99.3% に達することがわかった。 Machine Learning as a Service (MLaaS) is a popular cloud-based solution for customers who aim to use an ML model but lack training data, computation resources, or expertise in ML. In this case, the training datasets are typically a private possession of the ML or data companies and are inaccessible to the customers, but the customers still need an approach to confirm that the training datasets meet their expectations and fulfil regulatory measures like fairness. However, no existing work addresses these customer concerns. This work is the first attempt to solve this problem, taking data origin as an entry point. We first define origin membership measurement and, based on this, we then define diversity and fairness metrics to address customers' concerns. We then propose a strategy to estimate the values of these two metrics in the inaccessible training dataset, combining shadow training techniques from membership inference and an efficient featurization scheme in multiple instance learning. The evaluation includes an application to text review polarity classification based on the BERT language model. Experimental results show that our solution can achieve up to 0.87 accuracy for membership inspection and up to 99.3% confidence in inspecting diversity and fairness distribution. | 翻訳日:2023-05-17 16:50:43 公開日:2023-05-15 |
# グラフニューラル埋め込みを用いたアクティブセマンティック定位 Active Semantic Localization with Graph Neural Embedding ( http://arxiv.org/abs/2305.06141v3 ) ライセンス: Link先を確認 | Mitsuki Yoshida, Kanji Tanaka, Ryogo Yamamoto, and Daiki Iwata | (参考訳) セマンティックローカライゼーション(セマンティックローカライゼーション)、すなわち、セマンティックイメージのモダリティを備えたロボットの自己ローカライゼーションは、ポイントゴールナビゲーション、オブジェクトゴールナビゲーション、ビジョン言語ナビゲーションといった近年出現するAIアプリケーションにおいて重要である。
本研究では, 軽量で完全cpuベースの, ドメイン適応型セマンティックローカライズフレームワークであるgraph neural localizerについて検討する。このアプローチは, (1) 局地的特徴とグローバル特徴の視点的, 外観的不変性を組み合わせたシーングラフ, (2) グラフデータの直接学習/認識を可能にするgraph neural network (非ベクトルデータ) という,最近の2つの技術から着想を得たものである。
フォトリアリスティック・ハビタットシミュレータを用いて、自己教師あり学習と教師なしドメイン適応の2つのシナリオの実験を行い、提案手法の有効性を検証した。 Semantic localization, i.e., robot self-localization with semantic image modality, is critical in recently emerging embodied AI applications such as point-goal navigation, object-goal navigation and vision language navigation. However, most existing works on semantic localization focus on passive vision tasks without viewpoint planning, or rely on additional rich modalities (e.g., depth measurements). Thus, the problem is largely unsolved. In this work, we explore a lightweight, entirely CPU-based, domain-adaptive semantic localization framework, called graph neural localizer. Our approach is inspired by two recently emerging technologies: (1) Scene graph, which combines the viewpoint- and appearance-invariance of local and global features; (2) Graph neural network, which enables direct learning/recognition of graph data (i.e., non-vector data). Specifically, a graph convolutional neural network is first trained as a scene graph classifier for passive vision, and then its knowledge is transferred to a reinforcement-learning planner for active vision. Experiments on two scenarios, self-supervised learning and unsupervised domain adaptation, using a photo-realistic Habitat simulator validate the effectiveness of the proposed method. | 翻訳日:2023-05-17 11:00:16 公開日:2023-05-15 |
# 3つの超伝導gmon量子ビットの最大絡み合いw状態の最適合成 Optimal preparation of the maximally entangled W state of three superconducting gmon qubits ( http://arxiv.org/abs/1909.09289v2 ) ライセンス: Link先を確認 | Dalton Jones and Armin Rahmani | (参考訳) 超伝導gmon量子ビットは、高度にチューニング可能な量子コンピューティングデバイスを可能にする。
ポントリャーギンの最小原理への接続を用いて、断熱進化を短くするこれらの「bang-bang」プロトコルのパターンを完全に特徴づける。
プロトコルは非常に堅牢で、高性能な3量子ビット量子ゲートの開発を促進する。 Superconducting gmon qubits allow for highly tuneable quantum computing devices. Optimally controlled evolution of these systems is of considerable interest. We determine the optimal dynamical protocols for the generation of the maximally entangled W state of three qubits from an easily prepared initial product state. These solutions are found by simulated annealing. Using the connection to Pontryagin's minimum principle, we fully characterize the patterns of these "bang-bang" protocols, which shortcut the adiabatic evolution. The protocols are remarkably robust, facilitating the development of high-performance three-qubit quantum gates. | 翻訳日:2023-05-17 02:16:30 公開日:2023-05-15 |
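For readers unfamiliar with the target state, here is a small sketch of the three-qubit W state, (|001> + |010> + |100>)/sqrt(3), and of the fidelity that a state-preparation protocol would try to drive to 1. The abstract does not say which product state is used as the starting point; |001> below is a hypothetical choice, and the helper names are illustrative only.

```python
import math


def w_state(n_qubits=3):
    """Amplitudes of the n-qubit W state: an equal superposition of all
    basis states with exactly one qubit in |1>."""
    amplitudes = [0.0] * (2 ** n_qubits)
    for k in range(n_qubits):
        amplitudes[1 << k] = 1.0 / math.sqrt(n_qubits)
    return amplitudes


def fidelity(psi, phi):
    """Fidelity |<psi|phi>|^2 between two pure states given as amplitude lists."""
    overlap = sum(complex(a).conjugate() * b for a, b in zip(psi, phi))
    return abs(overlap) ** 2


target = w_state(3)

# Hypothetical initial product state |001> (index 1 in the computational basis).
initial = [0.0] * 8
initial[1] = 1.0

norm = sum(abs(a) ** 2 for a in target)        # should be 1 up to rounding
start_fidelity = fidelity(initial, target)     # 1/3 for this choice of start
```

With this choice the starting fidelity is 1/3, so the control protocol has to generate the remaining overlap by redistributing the single excitation coherently across all three qubits.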
# 重み付きアグリゲーションとモーメント・アクセラレーションを用いたAdaGradの統一解析 A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration ( http://arxiv.org/abs/1808.03408v4 ) ライセンス: Link先を確認 | Li Shen, Congliang Chen, Fangyu Zou, Zequn Jie, Ju Sun and Wei Liu | (参考訳) 適応学習率と運動量法をSGDに統合すると、AdaGrad, RMSProp, Adam, AccAdaGrad などの適応確率的アルゴリズムが効率的に高速化される。
このギャップを埋めるために, (1) 重球運動量とネステロフ加速度勾配運動量の両方をカバーする統一運動量スキームを取り入れ, (2) AdaGrad, AccAdaGrad, Adam, RMSProp の学習率を統一化できる新しい重み付き適応学習率を採用している,という特徴を持つ, AdaUSM とよばれる weighted AdaGrad with unified momentum を提案する。
さらに、AdaUSM において多項式的に成長する重みを取ると、非凸確率環境における$\mathcal{O}(\log(T)/\sqrt{T})$収束率を得る。
また,adam と rmsprop の適応学習速度は, 指数関数的に増大する adausm に対応するため, adam と rmsprop を理解するための新しい視点を提供する。
最後に、様々なディープラーニングモデルとデータセットに関するAdaUSMとSGDの比較実験、AdaGrad、AdaEMA、Adam、AMSGradの比較を行った。 Integrating adaptive learning rate and momentum techniques into SGD leads to a large class of efficiently accelerated adaptive stochastic algorithms, such as AdaGrad, RMSProp, Adam, AccAdaGrad, etc. In spite of their effectiveness in practice, there is still a large gap in their convergence theory, especially in the difficult non-convex stochastic setting. To fill this gap, we propose weighted AdaGrad with unified momentum, dubbed AdaUSM, which has the main characteristics that (1) it incorporates a unified momentum scheme which covers both the heavy ball momentum and the Nesterov accelerated gradient momentum; (2) it adopts a novel weighted adaptive learning rate that can unify the learning rates of AdaGrad, AccAdaGrad, Adam, and RMSProp. Moreover, when we take polynomially growing weights in AdaUSM, we obtain its $\mathcal{O}(\log(T)/\sqrt{T})$ convergence rate in the non-convex stochastic setting. We also show that the adaptive learning rates of Adam and RMSProp correspond to taking exponentially growing weights in AdaUSM, thereby providing a new perspective for understanding Adam and RMSProp. Lastly, comparative experiments of AdaUSM against SGD with momentum, AdaGrad, AdaEMA, Adam, and AMSGrad on various deep learning models and datasets are also carried out. | 翻訳日:2023-05-17 02:16:22 publication date:2023-05-15 |
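The claim that Adam-style learning rates correspond to exponentially growing weights can be checked numerically in a simplified setting. The sketch below assumes the weighted accumulator is a normalized weighted average of squared gradients, which is a modelling assumption here rather than AdaUSM's exact update, and it compares that average under weights a_i = beta2^(-i) with Adam's bias-corrected second-moment estimate. The gradient values and function names are made up for illustration.

```python
def weighted_second_moment(grads, weights):
    """Normalized weighted average of squared gradients (assumed form)."""
    numerator = sum(a * g * g for a, g in zip(weights, grads))
    return numerator / sum(weights)


def adam_second_moment(grads, beta2):
    """Adam's exponential moving average of squared gradients,
    with the usual bias correction 1 / (1 - beta2**t)."""
    v = 0.0
    for g in grads:
        v = beta2 * v + (1.0 - beta2) * g * g
    return v / (1.0 - beta2 ** len(grads))


grads = [0.5, -1.2, 0.3, 2.0, -0.7]   # hypothetical scalar gradients
beta2 = 0.9

# Exponentially growing weights a_i = beta2**(-i), i = 1..t.
exp_weights = [beta2 ** (-(i + 1)) for i in range(len(grads))]
# Uniform weights recover the plain running mean of squared gradients.
uniform_weights = [1.0] * len(grads)

v_exp = weighted_second_moment(grads, exp_weights)
v_adam = adam_second_moment(grads, beta2)
v_uniform = weighted_second_moment(grads, uniform_weights)
```

The two quantities `v_exp` and `v_adam` agree because multiplying numerator and denominator by beta2^t turns the weighted average into a geometric sum whose normalizer is (1 - beta2^t)/(1 - beta2). Uniform weights instead give a running mean, the AdaGrad-like end of the same family.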
# 回帰のための負相関学習を用いたハイブリッドアンサンブル法 A hybrid ensemble method with negative correlation learning for regression ( http://arxiv.org/abs/2104.02317v5 ) ライセンス: Link先を確認 | Yun Bai, Ganglin Tian, Yanfei Kang, Suling Jia | (参考訳) アンサンブルの必須分野であるハイブリッドアンサンブルは回帰分野で繁栄し、多様性の重要性を実証する研究が行われている。
結論として、本研究の価値は使いやすさと有効性にあるため、ハイブリッドアンサンブルは多様性と正確性を受け入れることができる。 Hybrid ensemble, an essential branch of ensembles, has flourished in the regression field, with studies confirming diversity's importance. However, previous ensembles consider diversity in the sub-model training stage, with limited improvement compared to single models. In contrast, this study automatically selects and weights sub-models from a heterogeneous model pool. It solves an optimization problem using an interior-point filtering linear-search algorithm. The objective function innovatively incorporates negative correlation learning as a penalty term, with which a diverse model subset can be selected. The best sub-models from each model class are selected to build the NCL ensemble, whose performance is better than the simple average and other state-of-the-art weighting methods. It is also possible to improve the NCL ensemble with a regularization term in the objective function. In practice, it is difficult to determine the optimal sub-model for a dataset a priori because of model uncertainty. Regardless, our method achieves accuracy comparable to that of the potential optimal sub-models. In conclusion, the value of this study lies in its ease of use and effectiveness, allowing the hybrid ensemble to embrace diversity and accuracy. | 翻訳日:2023-05-17 01:51:07 公開日:2023-05-15 |
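Why a diversity penalty can help a weighted regression ensemble is easiest to see through the classical ambiguity decomposition: for convex weights, the squared error of the ensemble equals the weighted average error of its members minus their weighted spread around the ensemble prediction. The sketch below verifies that identity on made-up numbers. It is a related illustration only, not the objective function or the interior-point solver used in the paper, and the names are hypothetical.

```python
def ambiguity_decomposition(predictions, weights, target):
    """Return (ensemble_error, average_member_error, diversity) for one sample.

    Assumes the weights are non-negative and sum to 1.
    """
    ensemble = sum(w * f for w, f in zip(weights, predictions))
    ensemble_error = (ensemble - target) ** 2
    average_member_error = sum(
        w * (f - target) ** 2 for w, f in zip(weights, predictions)
    )
    diversity = sum(w * (f - ensemble) ** 2 for w, f in zip(weights, predictions))
    return ensemble_error, average_member_error, diversity


# Hypothetical predictions of three sub-models for a single target value.
predictions = [2.0, 3.5, 1.0]
weights = [0.5, 0.3, 0.2]
target = 2.5

ens_err, avg_err, div = ambiguity_decomposition(predictions, weights, target)
# Identity: ens_err == avg_err - div (up to floating-point rounding).
```

Since the diversity term is subtracted, two candidate subsets with the same average member error are not equally good: the one whose members disagree more yields the lower ensemble error, which is the intuition behind rewarding negative correlation.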
# 引数マイニングのためのマルチタスク注意残差ネットワーク Multi-Task Attentive Residual Networks for Argument Mining ( http://arxiv.org/abs/2102.12227v2 ) ライセンス: Link先を確認 | Andrea Galassi, Marco Lippi, Paolo Torroni | (参考訳) 複数の引数マイニングタスクにおける残差ネットワークとニューラルアテンションの利用について検討する。
以上の結果から,本手法は高度な計算フットプリントやコーパス固有の設計を持つ最先端アーキテクチャに対する強力な競合であり,汎用性,性能精度,モデルサイズ削減の両立を図っている。 We explore the use of residual networks and neural attention for multiple argument mining tasks. We propose a residual architecture that exploits attention, multi-task learning, and makes use of ensemble, without any assumption on document or argument structure. We present an extensive experimental evaluation on five different corpora of user-generated comments, scientific publications, and persuasive essays. Our results show that our approach is a strong competitor against state-of-the-art architectures with a higher computational footprint or corpus-specific design, representing an interesting compromise between generality, performance accuracy and reduced model size. | 翻訳日:2023-05-17 01:50:27 公開日:2023-05-15 |
# 有効複素数値ベクトルポテンシャルを持つアハロノフ・ボーム効果 Aharonov-Bohm effect with an effective complex-valued vector potential ( http://arxiv.org/abs/2101.11914v2 ) ライセンス: Link先を確認 | Ismael L. Paiva, Yakir Aharonov, Jeff Tollaksen, Mordecai Waegell | (参考訳) 量子電荷と磁場の動的源との相互作用は、アハロノフ・ボームのシナリオで考慮される。
さらに、これらの結果が対応原理にどう影響するかを議論し、古典システムの研究に関係のある複素ベクトルポテンシャルを作る。 The interaction between a quantum charge and a dynamic source of a magnetic field is considered in the Aharonov-Bohm scenario. It is shown that, in weak interactions with a post-selection of the source, the effective vector potential is, generally, complex-valued. This leads to new experimental protocols to detect the Aharonov-Bohm phase before the source is fully encircled. While this does not necessarily change the nonlocal status of the Aharonov-Bohm effect, it brings new insights into it. Moreover, we discuss how these results might have consequences for the correspondence principle, making complex vector potentials relevant to the study of classical systems. | 翻訳日:2023-05-17 01:50:16 公開日:2023-05-15 |
# 低ランク非巡回グラフと因果構造学習について On Low Rank Directed Acyclic Graphs and Causal Structure Learning ( http://arxiv.org/abs/2006.05691v2 ) ライセンス: Link先を確認 | Zhuangyan Fang, Shengyu Zhu, Jiji Zhang, Yue Liu, Zhitang Chen, Yangbo He | (参考訳) 近年のいくつかの進歩にもかかわらず、有向非巡回グラフ(DAG)で表される学習因果構造は、学習すべきグラフがスパースでない場合、高次元設定において難しい課題である。
実験では, 各種データモデル, 特に比較的大規模で高密度なグラフに対する低階適応の有効性を実証した。
さらに、バリデーション手順では、グラフが低いランクに制限されない場合でも、適応性は優れた、または同等の性能を維持する。 Despite several advances in recent years, learning causal structures represented by directed acyclic graphs (DAGs) remains a challenging task in high dimensional settings when the graphs to be learned are not sparse. In this paper, we propose to exploit a low rank assumption regarding the (weighted) adjacency matrix of a DAG causal model to help address this problem. We utilize existing low rank techniques to adapt causal structure learning methods to take advantage of this assumption and establish several useful results relating interpretable graphical conditions to the low rank assumption. Specifically, we show that the maximum rank is highly related to hubs, suggesting that scale-free networks, which are frequently encountered in practice, tend to be low rank. Our experiments demonstrate the utility of the low rank adaptations for a variety of data models, especially with relatively large and dense graphs. Moreover, with a validation procedure, the adaptations maintain a superior or comparable performance even when graphs are not restricted to be low rank. | 翻訳日:2023-05-17 01:49:44 公開日:2023-05-15 |
# Common Fateによる教師なしオブジェクト学習 Unsupervised Object Learning via Common Fate ( http://arxiv.org/abs/2110.06562v2 ) ライセンス: Link先を確認 | Matthias Tangemann, Steffen Schneider, Julius von Kügelgen, Francesco Locatello, Peter Gehler, Thomas Brox, Matthias Kümmerer, Matthias Bethge, Bernhard Schölkopf | (参考訳) ビデオから生成オブジェクトモデルを学習することは、長年の課題であり、因果的シーンモデリングに必要である。
提案手法は,入力ビデオに含まれるオクルージョンを超えて一般化された生成モデルを学習し,トレーニングセットにないオブジェクト数や密度を許容することにより,トレーニング配信外の可視シーンをサンプリングするモジュール方式でシーンを表現可能であることを示す。 Learning generative object models from unlabelled videos is a long standing problem and required for causal scene modeling. We decompose this problem into three easier subtasks, and provide candidate solutions for each of them. Inspired by the Common Fate Principle of Gestalt Psychology, we first extract (noisy) masks of moving objects via unsupervised motion segmentation. Second, generative models are trained on the masks of the background and the moving objects, respectively. Third, background and foreground models are combined in a conditional "dead leaves" scene model to sample novel scene configurations where occlusions and depth layering arise naturally. To evaluate the individual stages, we introduce the Fishbowl dataset positioned between complex real-world scenes and common object-centric benchmarks of simplistic objects. We show that our approach allows learning generative models that generalize beyond the occlusions present in the input videos, and represent scenes in a modular fashion that allows sampling plausible scenes outside the training distribution by permitting, for instance, object numbers or densities not observed in the training set. | 翻訳日:2023-05-17 01:42:11 公開日:2023-05-15 |
# 線形光学と光検出は、近最適不明瞭なコヒーレント状態の識別を達成する Linear optics and photodetection achieve near-optimal unambiguous coherent state discrimination ( http://arxiv.org/abs/2109.00008v4 ) ライセンス: Link先を確認 | Jasminder S. Sidhu, Michael S. Bullock, Saikat Guha, and Cosmo Lupo | (参考訳) 理想的なレーザー光の量子記述である量子電磁場のコヒーレント状態は、光通信の情報キャリアとして素候補である。
以上の結果から,現在利用可能な光学部品は,複数の多モードコヒーレント状態のほぼ最適不明瞭な識別を実現するのに十分であることが示唆された。 Coherent states of the quantum electromagnetic field, the quantum description of ideal laser light, are prime candidates as information carriers for optical communications. A large body of literature exists on their quantum-limited estimation and discrimination. However, very little is known about the practical realizations of receivers for unambiguous state discrimination (USD) of coherent states. Here we fill this gap and outline a theory of USD with receivers that are allowed to employ: passive multimode linear optics, phase-space displacements, auxiliary vacuum modes, and on-off photon detection. Our results indicate that, in some regimes, these currently-available optical components are typically sufficient to achieve near-optimal unambiguous discrimination of multiple, multimode coherent states. | 翻訳日:2023-05-17 01:41:12 公開日:2023-05-15 |
# 量子位相空間における連続的メジャー化 Continuous majorization in quantum phase space ( http://arxiv.org/abs/2108.09167v2 ) ライセンス: Link先を確認 | Zacharie Van Herstraeten, Michael G. Jabbour and Nicolas J. Cerf | (参考訳) 量子位相空間における主化理論の役割を考察する。
位相空間におけるエントロピーの不確実性関係の文脈において、この予想のいくつかの意味を議論することで結論付ける。 We explore the role of majorization theory in quantum phase space. To this purpose, we restrict ourselves to quantum states with positive Wigner functions and show that the continuous version of majorization theory provides an elegant and very natural approach to exploring the information-theoretic properties of Wigner functions in phase space. After identifying all Gaussian pure states as equivalent in the precise sense of continuous majorization, which can be understood in light of Hudson's theorem, we conjecture a fundamental majorization relation: any positive Wigner function is majorized by the Wigner function of a Gaussian pure state (especially, the bosonic vacuum state or ground state of the harmonic oscillator). As a consequence, any Schur-concave function of the Wigner function is lower bounded by the value it takes for the vacuum state. This implies in turn that the Wigner entropy is lower bounded by its value for the vacuum state, while the converse is notably not true. Our main result is then to prove this fundamental majorization relation for a relevant subset of Wigner-positive quantum states which are mixtures of the three lowest eigenstates of the harmonic oscillator. Beyond that, the conjecture is also supported by numerical evidence. We conclude by discussing some implications of this conjecture in the context of entropic uncertainty relations in phase space. | 翻訳日:2023-05-17 01:40:59 公開日:2023-05-15 |
# 機会が訪れるときの貿易:地域意識と反復的リファインメントラベリングによる物価変動予測 Trade When Opportunity Comes: Price Movement Forecasting via Locality-Aware Attention and Iterative Refinement Labeling ( http://arxiv.org/abs/2107.11972v3 ) ライセンス: Link先を確認 | Liang Zeng, Lei Wang, Hui Niu, Ruchen Zhang, Ling Wang, Jian Li | (参考訳) 価格変動予測は、現在の市場状況やその他の関連情報に基づいて、金融資産の将来の動向を予測することを目的としている。
そこで本稿では,LA-Attention (Locality-Aware Attention) と Iterative Refinement Labeling (RA-Labeling) の2つの主要コンポーネントからなる価格変動予測フレームワーク LARA を提案する。
1) la-attentionはラベル情報に応じて、潜在的に有益なサンプルを自動的に抽出する。
さらに, LA-Attentionは, メトリクス学習技術を用いて, タスク固有距離測定を楽しみ, 潜在的に有益なサンプルに効果的に注意を分散させる。
大規模なアブレーション研究と実験により、LARAは確かにより信頼できる取引機会を捉えていることが示された。 Price movement forecasting aims at predicting the future trends of financial assets based on the current market conditions and other relevant information. Recently, machine learning (ML) methods have become increasingly popular and achieved promising results for price movement forecasting in both academia and industry. Most existing ML solutions formulate the forecasting problem as a classification (to predict the direction) or a regression (to predict the return) problem over the entire set of training data. However, due to the extremely low signal-to-noise ratio and stochastic nature of financial data, good trading opportunities are extremely scarce. As a result, without careful selection of potentially profitable samples, such ML methods are prone to capture the patterns of noises instead of real signals. To address this issue, we propose a novel price movement forecasting framework named LARA consisting of two main components: Locality-Aware Attention (LA-Attention) and Iterative Refinement Labeling (RA-Labeling). (1) LA-Attention automatically extracts the potentially profitable samples by attending to label information. Moreover, equipped with metric learning techniques, LA-Attention enjoys task-specific distance metrics and effectively distributes attention to potentially profitable samples. (2) RA-Labeling further iteratively refines the noisy labels of potentially profitable samples, and combines the learned predictors robust to the unseen and noisy samples. In a set of experiments on three real-world financial markets: stocks, cryptocurrencies, and ETFs, LARA significantly outperforms several machine learning based methods on the Qlib quantitative investment platform. Extensive ablation studies and experiments also demonstrate that LARA indeed captures more reliable trading opportunities. | 翻訳日:2023-05-17 01:40:35 公開日:2023-05-15 |
# ユニエンコーダ:世代対話システムのための高速かつ正確な応答選択パラダイム Uni-Encoder: A Fast and Accurate Response Selection Paradigm for Generation-Based Dialogue Systems ( http://arxiv.org/abs/2106.01263v5 ) ライセンス: Link先を確認 | Chiyu Song, Hongliang He, Haofei Yu, Pengfei Fang, Leyang Cui and Zhenzhong Lan | (参考訳) サンプル・アンド・ランクは現代世代の対話システムにとって重要なデコード戦略である。
例えば、Ubuntu V2データセットにおいて、約4倍の推論速度でR10@1を2.9%改善している。 Sample-and-rank is a key decoding strategy for modern generation-based dialogue systems. It helps achieve diverse and high-quality responses by selecting an answer from a small pool of generated candidates. The current state-of-the-art ranking methods mainly use an encoding paradigm called Cross-Encoder, which separately encodes each context-candidate pair and ranks the candidates according to their fitness scores. However, Cross-Encoder repeatedly encodes the same lengthy context for each candidate, resulting in high computational costs. Poly-Encoder addresses the above problems by reducing the interaction between context and candidates, but at the price of a performance drop. In this work, we develop a new paradigm called Uni-Encoder, that keeps the full attention over each pair as in Cross-Encoder while only encoding the context once, as in Poly-Encoder. Uni-Encoder encodes all the candidates with the context in one forward pass. We use the same positional embedding for all candidates to ensure they are treated equally and design a new attention mechanism to avoid confusion. Our Uni-Encoder can simulate other ranking paradigms using different attention and response concatenation methods. Extensive experiments show that our proposed paradigm achieves new state-of-the-art results on four benchmark datasets with high computational efficiency. For instance, it improves R10@1 by 2.9% with an approximately 4X faster inference speed on the Ubuntu V2 dataset. | 翻訳日:2023-05-17 01:38:47 公開日:2023-05-15 |
# 高精度・高速量子計算のための変分命令セットを用いた量子コンパイル Quantum compiling with a variational instruction set for accurate and fast quantum computing ( http://arxiv.org/abs/2203.15574v4 ) ライセンス: Link先を確認 | Ying Lu, Peng-Fei Zhou, Shao-Ming Fei, Shi-Ju Ran | (参考訳) 量子命令セット(qis)は、量子ハードウェアの量子ビットを制御することで物理的に実現可能な量子ゲートとして定義される。
高い柔軟性と効率性を持つ一般的なコンパイルアプローチとして、量子ビットは異なる量子回路で定義でき、異なる相互作用を持つ量子ハードウェアに適応することができる。 The quantum instruction set (QIS) is defined as the quantum gates that are physically realizable by controlling the qubits in quantum hardware. Compiling quantum circuits into the product of the gates in a properly defined QIS is a fundamental step in quantum computing. We here propose the quantum variational instruction set (QuVIS) formed by flexibly designed multi-qubit gates for higher speed and accuracy of quantum computing. The controlling of qubits for realizing the gates in a QuVIS is variationally achieved using the fine-grained time optimization algorithm. Significant reductions in both the error accumulation and time cost are demonstrated in realizing the swaps of multiple qubits and quantum Fourier transformations, compared with the compiling by a standard QIS such as the quantum microinstruction set (QuMIS, formed by several one- and two-qubit gates including one-qubit rotations and controlled-NOT gates). With the same requirement on quantum hardware, the time cost for QuVIS is reduced to less than one half of that for QuMIS. Simultaneously, the error is suppressed algebraically as the depth of the compiled circuit is reduced. As a general compiling approach with high flexibility and efficiency, QuVIS can be defined for different quantum circuits and be adapted to the quantum hardware with different interactions. | 翻訳日:2023-05-17 01:33:27 公開日:2023-05-15 |
# テキスト認識のための自己教師型インシシト・グリフアテンション Self-supervised Implicit Glyph Attention for Text Recognition ( http://arxiv.org/abs/2203.03382v4 ) ライセンス: Link先を確認 | Tongkun Guan, Chaochen Gu, Jingzheng Tu, Xue Yang, Qi Feng, Yudi Zhao, Xiaokang Yang, Wei Shen | (参考訳) 注意機構は、文字レベルの表現を抽出する能力のため、シーンテキスト認識(STR)メソッドにおける事実上の標準(de facto)モジュールとなっている。
上記の問題に対処するため,我々はSTRのための新しい注意機構であるself-supervised implicit glyph attention (SIGA) を提案する。
実験の結果,SIGA は従来の注意機構に基づく STR 手法よりも,公開されているコンテキストベンチマークと我々が提供するコンテキストレスベンチマークにおいて,注意の正しさと最終認識性能の両面で,一貫してかつ大幅に優れた性能を示した。 The attention mechanism has become the \emph{de facto} module in scene text recognition (STR) methods, due to its capability of extracting character-level representations. These methods can be summarized into implicit attention based and supervised attention based, depending on how the attention is computed, i.e., implicit attention and supervised attention are learned from sequence-level text annotations and character-level bounding box annotations, respectively. Implicit attention, as it may extract coarse or even incorrect spatial regions as character attention, is prone to suffering from an alignment-drifted issue. Supervised attention can alleviate the above issue, but it is character category-specific, which requires extra laborious character-level bounding box annotations and would be memory-intensive when handling languages with larger character categories. To address the aforementioned issues, we propose a novel attention mechanism for STR, self-supervised implicit glyph attention (SIGA). SIGA delineates the glyph structures of text images by jointly self-supervised text segmentation and implicit attention alignment, which serve as the supervision to improve attention correctness without extra character-level annotations. Experimental results demonstrate that SIGA performs consistently and significantly better than previous attention-based STR methods, in terms of both attention correctness and final recognition performance on publicly available context benchmarks and our contributed contextless benchmarks. | 翻訳日:2023-05-17 01:33:04 公開日:2023-05-15 |
# 暗号通貨の評価 - 説明可能なAIアプローチ Cryptocurrency Valuation: An Explainable AI Approach ( http://arxiv.org/abs/2201.12893v4 ) ライセンス: Link先を確認 | Yulin Liu and Luyao Zhang | (参考訳) 現在、暗号通貨資産の基礎に関する説得力のあるプロキシは存在しない。
第1に、私たちの市場対ファンダメンタル比率は、アドホックなものではなく、古典的な貨幣理論とBitcoin会計のユニークなUTXOモデルに基づくものであり、第2に、実証的証拠がこの比率の「安く買って高く売る」という含意を裏付けており、最後に、金融研究では例外的なことに、将来の研究のためにPython Package Indexを介してオープンソースソフトウェアとしてトレーディングアルゴリズムを配布する。 Currently, there are no convincing proxies for the fundamentals of cryptocurrency assets. We propose a new market-to-fundamental ratio, the price-to-utility (PU) ratio, utilizing unique blockchain accounting methods. We then proxy various existing fundamental-to-market ratios by Bitcoin historical data and find they have little predictive power for short-term bitcoin returns. However, the PU ratio predicts long-term bitcoin returns more effectively than alternative methods. Furthermore, we verify the explainability of the PU ratio using machine learning. Finally, we present an automated trading strategy advised by the PU ratio that outperforms the conventional buy-and-hold and market-timing strategies. Our research contributes to explainable AI in finance from three facets: First, our market-to-fundamental ratio is based on classic monetary theory and the unique UTXO model of Bitcoin accounting rather than being ad hoc; Second, the empirical evidence testifies to the buy-low and sell-high implications of the ratio; Finally, we distribute the trading algorithms as open-source software via Python Package Index for future research, which is exceptional in finance research. | 翻訳日:2023-05-17 01:31:57 公開日:2023-05-15 |
# 2つの時間スケール更新ルールを持つ生成逆数ネットワークのトレーニングのための臨界バッチサイズの存在と推定 Existence and Estimation of Critical Batch Size for Training Generative Adversarial Networks with Two Time-Scale Update Rule ( http://arxiv.org/abs/2201.11989v5 ) ライセンス: Link先を確認 | Naoki Sato and Hideaki Iiduka | (参考訳) 従来,2つの時間スケール更新規則(TTUR)は,異なる学習率,あるいは異なる減衰率などの異なる学習速度を用いて,理論上,実際に生成的敵ネットワーク(GAN)を訓練するのに有用であった。
さらに, 学習速度だけでなく, バッチサイズも, TTURを用いたGANの訓練において重要であり, どちらも訓練に必要なステップ数に影響を与える。
さらに, 評価された臨界バッチサイズは, 理論結果から推定したサイズに近いことがわかった。 Previous results have shown that a two time-scale update rule (TTUR) using different learning rates, such as different constant rates or different decaying rates, is useful for training generative adversarial networks (GANs) in theory and in practice. Moreover, not only the learning rate but also the batch size is important for training GANs with TTURs and they both affect the number of steps needed for training. This paper studies the relationship between batch size and the number of steps needed for training GANs with TTURs based on constant learning rates. We theoretically show that, for a TTUR with constant learning rates, the number of steps needed to find stationary points of the loss functions of both the discriminator and generator decreases as the batch size increases and that there exists a critical batch size minimizing the stochastic first-order oracle (SFO) complexity. Then, we use the Fréchet inception distance (FID) as the performance measure for training and provide numerical results indicating that the number of steps needed to achieve a low FID score decreases as the batch size increases and that the SFO complexity increases once the batch size exceeds the measured critical batch size. Moreover, we show that measured critical batch sizes are close to the sizes estimated from our theoretical results. | 翻訳日:2023-05-17 01:31:37 公開日:2023-05-15 |
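The trade-off described in the abstract above can be illustrated with a toy calculation. This is a minimal sketch, not the paper's analysis: it assumes a hypothetical step-count model K(b) = A·b / (eps2·b − B), chosen only because it decreases as the batch size b grows, and it takes SFO complexity to be the number of steps times the batch size. The constants `A`, `B`, `eps2` and all function names are illustrative assumptions.

```python
def steps_needed(b, A=1000.0, B=4.0, eps2=0.5):
    """Hypothetical number of steps K(b) needed to reach a target precision.

    Assumed toy model (not taken from the paper): K(b) = A*b / (eps2*b - B),
    valid only for b > B/eps2. It decreases monotonically in b.
    """
    if eps2 * b <= B:
        raise ValueError("batch size too small for this toy model")
    return A * b / (eps2 * b - B)


def sfo_complexity(b, **kw):
    """Stochastic first-order oracle complexity: steps times batch size."""
    return steps_needed(b, **kw) * b


def critical_batch_size(candidates, **kw):
    """Return the candidate batch size with the smallest SFO complexity."""
    return min(candidates, key=lambda b: sfo_complexity(b, **kw))


candidates = range(9, 257)  # the toy model needs b > B/eps2 = 8
b_star = critical_batch_size(candidates)
print(b_star)
```

Under this assumed model the minimizer is b = 2B/eps2, so larger batches keep reducing the step count while the total number of gradient evaluations starts to grow again past that point.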
# フェデレート$\mathcal{X}$-armedバンディット Federated X-Armed Bandit ( http://arxiv.org/abs/2205.15268v3 ) ライセンス: Link先を確認 | Wenjie Li, Qifan Song, Jean Honorio, Guang Lin | (参考訳) この研究は、異なるクライアントが同じドメインで定義された異種な局所目的関数に直面し、大域的最適解を協調的に求める必要がある、フェデレートされた$\mathcal{X}$-armed banditの最初のフレームワークを確立する。
合成関数と実データセットの実験結果は、様々な集中型および連合型ベースラインアルゴリズムに対する \texttt{Fed-PNE} の利点を検証する。 This work establishes the first framework of federated $\mathcal{X}$-armed bandit, where different clients face heterogeneous local objective functions defined on the same domain and are required to collaboratively figure out the global optimum. We propose the first federated algorithm for such problems, named \texttt{Fed-PNE}. By utilizing the topological structure of the global objective inside the hierarchical partitioning and the weak smoothness property, our algorithm achieves sublinear cumulative regret with respect to both the number of clients and the evaluation budget. Meanwhile, it only requires logarithmic communications between the central server and clients, protecting the client privacy. Experimental results on synthetic functions and real datasets validate the advantages of \texttt{Fed-PNE} over various centralized and federated baseline algorithms. | 翻訳日:2023-05-17 01:21:50 公開日:2023-05-15 |
# コンテキスト・アウトルッカーCOOLとその質問応答および他の自然言語処理タスクへの応用 COOL, a Context Outlooker, and its Application to Question Answering and other Natural Language Processing Tasks ( http://arxiv.org/abs/2204.09593v2 ) ライセンス: Link先を確認 | Fangyi Zhu, See-Kiong Ng, Stéphane Bressan | (参考訳) Vision Outlookerは、局所的注意の一形式であるoutlook attentionを追加することで、自己注意機構を実装するVision Transformerの性能を向上させる。
本稿では,自然言語処理のためのoutlook attentionメカニズムを提案する。
提案手法は,既存の最先端手法との競合性能を実現する。 Vision outlooker improves the performance of vision transformers, which implements a self-attention mechanism by adding an outlook attention, a form of local attention. In natural language processing, as has been the case in computer vision and other domains, transformer-based models constitute the state-of-the-art for most processing tasks. In this domain, too, many authors have argued and demonstrated the importance of local context. We present an outlook attention mechanism, COOL, for natural language processing. COOL, added on top of the self-attention layers of a transformer-based model, encodes local syntactic context considering word proximity and more pair-wise constraints than dynamic convolution used by existing approaches. A comparative empirical performance evaluation of an implementation of COOL with different transformer-based models confirms the opportunity for improvement over a baseline using the original models alone for various natural language processing tasks, including question answering. The proposed approach achieves competitive performance with existing state-of-the-art methods on some tasks. | 翻訳日:2023-05-17 01:20:26 公開日:2023-05-15 |
# 一般のハミルトニアンによる量子計測に対する確率的アプローチ Stochastic approach for quantum metrology with generic Hamiltonians ( http://arxiv.org/abs/2204.01055v2 ) ライセンス: Link先を確認 | Le Bin Ho | (参考訳) 近年, 乗法パラメータを持つハミルトニアンに対する変分量子計測が提案され, 推定精度は変分回路で最適化できる。
我々の研究は、量子回路アルゴリズムを用いた一般のハミルトニアンによる量子計測の研究に光を当てている。 Recently, variational quantum metrology was proposed for Hamiltonians with multiplicative parameters, wherein the estimation precision can be optimized via variational circuits. However, systems with generic Hamiltonians still lack these variational schemes. This work introduces a quantum-circuit-based approach for studying quantum metrology with generic Hamiltonians. We present a time-dependent stochastic parameter-shift rule for the derivatives of evolved quantum states, whereby the quantum Fisher information can be obtained. The scheme can be executed in universal quantum computers under the family of parameterized gates. In magnetic field estimations, we demonstrate the consistency between the results obtained from the stochastic parameter-shift rule and the exact results, while the results obtained from a standard parameter-shift rule slightly deviate from the exact ones. Our work sheds light on studying quantum metrology with generic Hamiltonians using quantum circuit algorithms. | 翻訳日:2023-05-17 01:19:56 公開日:2023-05-15 |
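For orientation, the standard (non-stochastic) parameter-shift rule that the abstract above uses as its point of comparison can be checked numerically on a single qubit. This sketch assumes a rotation RX(θ) = exp(−iθX/2) applied to |0⟩ followed by a Z measurement, for which ⟨Z⟩ = cos θ. It does not implement the paper's time-dependent stochastic rule, and the function names are illustrative.

```python
import math


def expectation_z(theta):
    """<Z> after applying RX(theta) = exp(-i*theta*X/2) to |0>, which equals cos(theta)."""
    return math.cos(theta)


def parameter_shift_gradient(f, theta):
    """Standard two-term parameter-shift rule for a gate generated by a Pauli operator:
    df/dtheta = [f(theta + pi/2) - f(theta - pi/2)] / 2.
    """
    shift = math.pi / 2
    return (f(theta + shift) - f(theta - shift)) / 2.0


theta = 0.7
grad = parameter_shift_gradient(expectation_z, theta)
exact = -math.sin(theta)  # analytic derivative of cos(theta)
print(abs(grad - exact) < 1e-12)
```

The rule is exact here because the parameter enters through a single Pauli-generated rotation; the abstract's point is that this simple form no longer matches the exact result for generic Hamiltonians.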
# プッシュフォワード生成モデルの潜在空間幾何学の展開 Unveiling the Latent Space Geometry of Push-Forward Generative Models ( http://arxiv.org/abs/2207.10541v3 ) ライセンス: Link先を確認 | Thibaut Issenhuth, Ugo Tanielian, Jérémie Mary, David Picard | (参考訳) 多くの深層生成モデルは、GAN(Generative Adversarial Networks)や変分オートエンコーダ(VAE)のように、連続な生成器によるガウス測度のプッシュフォワードとして定義される。
さらに,潜在空間における単体的(simplicial)クラスタ構造を強制し,GANの性能を向上するトランケーション手法を提案する。 Many deep generative models are defined as a push-forward of a Gaussian measure by a continuous generator, such as Generative Adversarial Networks (GANs) or Variational Auto-Encoders (VAEs). This work explores the latent space of such deep generative models. A key issue with these models is their tendency to output samples outside of the support of the target distribution when learning disconnected distributions. We investigate the relationship between the performance of these models and the geometry of their latent space. Building on recent developments in geometric measure theory, we prove a sufficient condition for optimality in the case where the dimension of the latent space is larger than the number of modes. Through experiments on GANs, we demonstrate the validity of our theoretical results and gain new insights into the latent space geometry of these models. Additionally, we propose a truncation method that enforces a simplicial cluster structure in the latent space and improves the performance of GANs. | 翻訳日:2023-05-17 01:13:56 公開日:2023-05-15 |
# 自動音声キャプションと言語に基づく音声検索 Automated Audio Captioning and Language-Based Audio Retrieval ( http://arxiv.org/abs/2207.04156v2 ) ライセンス: Link先を確認 | Clive Gomes, Hyejin Park, Patrick Kollman, Yi Song, Iffanice Houndayi, Ankit Shah | (参考訳) 本プロジェクトは,(1)自動音声キャプションと(2)言語に基づく音声検索の2つのサブタスクを有するDCASE 2022コンペティション(タスク6)に参加した。
モデルは, BLEU1, BLEU2, BLEU3, ROUGEL, METEOR, CIDEr, SPICE, SPIDErの音声キャプション, R1, R5, R10, mARP10で評価した。
Automated Audio Captioningの最終的なアーキテクチャはベースラインのパフォーマンスに近いが、Language-based Audio Retrievalのモデルはそれを上回っている。 This project involved participation in the DCASE 2022 Competition (Task 6) which had two subtasks: (1) Automated Audio Captioning and (2) Language-Based Audio Retrieval. The first subtask involved the generation of a textual description for audio samples, while the goal of the second was to find audio samples within a fixed dataset that match a given description. For both subtasks, the Clotho dataset was used. The models were evaluated on BLEU1, BLEU2, BLEU3, ROUGEL, METEOR, CIDEr, SPICE, and SPIDEr scores for audio captioning and R1, R5, R10 and mARP10 scores for audio retrieval. We have conducted a handful of experiments that modify the baseline models for these tasks. Our final architecture for Automated Audio Captioning is close to the baseline performance, while our model for Language-Based Audio Retrieval has surpassed its counterpart. | 翻訳日:2023-05-17 01:13:40 公開日:2023-05-15 |
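The retrieval scores named in the abstract above (R1, R5, R10) are recall-at-k values. A minimal sketch of how such a score can be computed, assuming each text query has exactly one correct audio clip and that a system returns a ranked list of clip identifiers. The data and names below are made up for illustration and are not taken from the DCASE evaluation code.

```python
def recall_at_k(ranked_lists, targets, k):
    """Fraction of queries whose correct item appears in the top-k results.

    ranked_lists: one ranked list of candidate ids per query
    targets: the single correct id for each query (an assumption of this sketch)
    """
    if len(ranked_lists) != len(targets):
        raise ValueError("need one target per query")
    hits = sum(1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k])
    return hits / len(targets)


# Hypothetical rankings for three caption queries.
rankings = [
    ["a3", "a1", "a7"],
    ["a2", "a9", "a4"],
    ["a5", "a6", "a8"],
]
correct = ["a1", "a2", "a8"]

print(recall_at_k(rankings, correct, 1))  # only the second query has its clip at rank 1
print(recall_at_k(rankings, correct, 3))  # every correct clip is within the top 3
```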
# 量子回路における遅延チョイス量子消去器の相補性関係 Complementarity relations of a delayed-choice quantum eraser in a quantum circuit ( http://arxiv.org/abs/2207.03946v3 ) ライセンス: Link先を確認 | Dah-Wei Chiou, Hsiu-Chuan Hsu | (参考訳) 本稿では,対をなす2つのクォントン間のエンタングルメントの度合いを調整可能にするという拡張を加えたうえで,二部エンタングルメントにより遅延チョイス量子消去器をエミュレートする量子回路を提案する。
次に、IBM Quantumプラットフォームが提供する量子コンピュータの実験を行い、理論的予測を検証する。
また, 遅延ゲートを用いて経路(which-way)情報の測定を遅らせ, 真の「遅延チョイス」方式で測定できることを確認した。 We propose a quantum circuit that emulates a delayed-choice quantum eraser via bipartite entanglement with the extension that the degree of entanglement between the two paired quantons is adjustable. This provides a broader setting to test complementarity relations between interference visibility and which-way distinguishability in the scenario where the which-way information is obtained through entanglement without direct contact with the quantum state for interference. The visibility-distinguishability relations are investigated from three perspectives that differ in how the which-way information is taken into consideration. These complementarity relations can be understood in terms of entropic uncertainty relations in the information-theoretic framework and the triality relation that incorporates single-particle and bipartite properties. We then perform experiments on the quantum computers provided by the IBM Quantum platform to verify the theoretical predictions. We also apply the delay gate to delay the measurement of the which-way information to affirm that the measurement can be made truly in the "delayed-choice" manner. | 翻訳日:2023-05-17 01:13:22 公開日:2023-05-15 |
# ログデータにおける異常検出のためのディープラーニング:調査 Deep Learning for Anomaly Detection in Log Data: A Survey ( http://arxiv.org/abs/2207.03820v2 ) ライセンス: Link先を確認 | Max Landauer, Sebastian Onder, Florian Skopik, Markus Wurzenberger | (参考訳) 自動ログファイル解析は、システム障害などの関連するインシデントを早期に検出する。
この調査は既存のアプローチを定量的に比較するものではなく、異なるモデルアーキテクチャの関連する側面を読者が理解できるようにすることを目的としている。 Automatic log file analysis enables early detection of relevant incidents such as system failures. In particular, self-learning anomaly detection techniques capture patterns in log data and subsequently report unexpected log event occurrences to system operators without the need to provide or manually model anomalous scenarios in advance. Recently, an increasing number of approaches leveraging deep learning neural networks for this purpose have been presented. These approaches have demonstrated superior detection performance in comparison to conventional machine learning techniques and simultaneously resolve issues with unstable data formats. However, there exist many different architectures for deep learning and it is non-trivial to encode raw and unstructured log data to be analyzed by neural networks. We therefore carry out a systematic literature review that provides an overview of deployed models, data pre-processing mechanisms, anomaly detection techniques, and evaluations. The survey does not quantitatively compare existing approaches but instead aims to help readers understand relevant aspects of different model architectures and emphasizes open issues for future work. | 翻訳日:2023-05-17 01:13:04 公開日:2023-05-15 |
# 最適かつロバストなカテゴリーレベル知覚:2次元および3次元意味的キーポイントによる物体のポーズと形状推定 Optimal and Robust Category-level Perception: Object Pose and Shape Estimation from 2D and 3D Semantic Keypoints ( http://arxiv.org/abs/2206.12498v2 ) ライセンス: Link先を確認 | Jingnan Shi, Heng Yang, Luca Carlone | (参考訳) カテゴリーレベルの知覚問題を考えると、与えられたカテゴリーのオブジェクト(例えば車)を2dまたは3dのセンサーデータで認識し、クラス内の変化にかかわらずオブジェクトの3dポーズと形状を再構築する必要がある(例えば、異なるカーモデルが異なる形状を持つ)。
PACE3D* と PACE2D* は,それぞれ 3D と 2D のキーポイントを用いたポーズと形状推定のための,初の最適性保証付きソルバである。
この目標に向けて、我々は、測定値の互換性をモデル化するために互換性ハイパーグラフを使用する、外れ値除去のための一般的なグラフ理論フレームワークであるROBINを提案する。
コードをhttps://github.com/MIT-SPARK/PACEでリリースします。 We consider a category-level perception problem, where one is given 2D or 3D sensor data picturing an object of a given category (e.g., a car), and has to reconstruct the 3D pose and shape of the object despite intra-class variability (i.e., different car models have different shapes). We consider an active shape model, where -- for an object category -- we are given a library of potential CAD models describing objects in that category, and we adopt a standard formulation where pose and shape are estimated from 2D or 3D keypoints via non-convex optimization. Our first contribution is to develop PACE3D* and PACE2D*, the first certifiably optimal solvers for pose and shape estimation using 3D and 2D keypoints, respectively. Both solvers rely on the design of tight (i.e., exact) semidefinite relaxations. Our second contribution is to develop outlier-robust versions of both solvers, named PACE3D# and PACE2D#. Towards this goal, we propose ROBIN, a general graph-theoretic framework to prune outliers, which uses compatibility hypergraphs to model measurements' compatibility. We show that in category-level perception problems these hypergraphs can be built from the winding orders of the keypoints (in 2D) or their convex hulls (in 3D), and many outliers can be filtered out via maximum hyperclique computation. The last contribution is an extensive experimental evaluation. Besides providing an ablation study on simulated datasets and on the PASCAL3D+ dataset, we combine our solver with a deep keypoint detector, and show that PACE3D# improves over the state of the art in vehicle pose estimation in the ApolloScape datasets, and its runtime is compatible with practical applications. We release our code at https://github.com/MIT-SPARK/PACE. | 翻訳日:2023-05-17 01:11:00 公開日:2023-05-15 |
# テンプレートに基づく時間適応による動的文脈化単語埋め込みの学習 Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation ( http://arxiv.org/abs/2208.10734v2 ) ライセンス: Link先を確認 | Xiaohang Tang, Yi Zhou, Danushka Bollegala | (参考訳) 動的文脈化単語埋め込み(DCWE)は、単語の時間的意味変化を表す。
2つの異なるタイムスタンプ $T_1$ と $T_2$ でそれぞれ取られたコーパスの2つのスナップショット $C_1$ と $C_2$ が与えられたとき、まず次の2種類の用語を選択する教師なしの方法を提案する。
(a) $C_1$ と $C_2$ の両方に関連する \emph{pivot} 用語と、(b) 各スナップショットにおいて特定のpivot用語と結び付く \emph{anchor} 用語である。
複数の実験により, 提案手法はテスト文の難易度を$C_2$で低減し, 現状よりも優れていた。 Dynamic contextualised word embeddings (DCWEs) represent the temporal semantic variations of words. We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates. Given two snapshots $C_1$ and $C_2$ of a corpus taken respectively at two distinct timestamps $T_1$ and $T_2$, we first propose an unsupervised method to select (a) \emph{pivot} terms related to both $C_1$ and $C_2$, and (b) \emph{anchor} terms that are associated with a specific pivot term in each individual snapshot. We then generate prompts by filling manually compiled templates using the extracted pivot and anchor terms. Moreover, we propose an automatic method to learn time-sensitive templates from $C_1$ and $C_2$, without requiring any human supervision. Next, we use the generated prompts to adapt a pretrained MLM to $T_2$ by fine-tuning using those prompts. Multiple experiments show that our proposed method reduces the perplexity of test sentences in $C_2$, outperforming the current state-of-the-art. | 翻訳日:2023-05-17 01:03:40 公開日:2023-05-15 |
# 新規タスクのオンラインワンショット学習のための多様な知識ソースの統合 Integrating Diverse Knowledge Sources for Online One-shot Learning of Novel Tasks ( http://arxiv.org/abs/2208.09554v3 ) ライセンス: Link先を確認 | James R. Kirk, Robert E. Wray, Peter Lindes, John E. Laird | (参考訳) 自律エージェントは、さまざまな潜在的なタスク知識ソースを描画することができるが、現在のアプローチは、常に1つまたは2つだけに焦点を当てている。
Soar認知アーキテクチャで開発されたエージェントは、環境とのインタラクション、タスクの実行と探索に関する知識、人間の自然言語による指示、大規模言語モデル(GPT-3)から取得した応答といった、ドメインとタスクの知識源を使用する。
その結果、エージェントが様々な知識ソースをオンラインに統合することで、一発のタスク学習全体が改善され、迅速かつ信頼性の高いタスク学習に必要な人的フィードバックが削減されることがわかった。 Autonomous agents are able to draw on a wide variety of potential sources of task knowledge; however current approaches invariably focus on only one or two. Here we investigate the challenges and impact of exploiting diverse knowledge sources to learn online, in one-shot, new tasks for a simulated office mobile robot. The resulting agent, developed in the Soar cognitive architecture, uses the following sources of domain and task knowledge: interaction with the environment, task execution and search knowledge, human natural language instruction, and responses retrieved from a large language model (GPT-3). We explore the distinct contributions of these knowledge sources and evaluate the performance of different combinations in terms of learning correct task knowledge and human workload. Results show that an agent's online integration of diverse knowledge sources improves one-shot task learning overall, reducing human feedback needed for rapid and reliable task learning. | 翻訳日:2023-05-17 01:03:16 公開日:2023-05-15 |
# DCGANを用いた糖尿病網膜症画像の品質と多様性の評価 Evaluating the Quality and Diversity of DCGAN-based Generatively Synthesized Diabetic Retinopathy Imagery ( http://arxiv.org/abs/2208.05593v2 ) ライセンス: Link先を確認 | Cristina-Madalina Dragan, Muhammad Muneeb Saad, Mubashir Husain Rehmani, and Ruairi O'Reilly | (参考訳) 公開されている糖尿病網膜症(DR)データセットは不均衡であり、DRを持つ画像の数が限られている。
この不均衡に対処するには、GAN(Generative Adversarial Networks)を使用して、データセットを合成画像で拡張する。
合成画像の品質と多様性を評価するために、マルチスケール構造類似度指数(MS-SSIM)、コサイン距離(CD)、Fréchet Inception Distance(FID)などの評価指標を用いる。
本研究は, 深層畳み込みGAN (DCGAN) が生成する合成増殖性DR画像に適用する評価指標の実験的評価に寄与する。
さらに、F1とAUCスコアが示すように、畳み込みニューラルネットワーク(CNN)とEfficientNet分類器の強化データセットに対する優れた性能は、不均衡データセットを増大させる合成画像の有効性を示す。 Publicly available diabetic retinopathy (DR) datasets are imbalanced, containing limited numbers of images with DR. This imbalance contributes to overfitting when training machine learning classifiers. The impact of this imbalance is exacerbated as the severity of the DR stage increases, affecting the classifiers' diagnostic capacity. The imbalance can be addressed using Generative Adversarial Networks (GANs) to augment the datasets with synthetic images. Generating synthetic images is advantageous if high-quality and diversified images are produced. To evaluate the quality and diversity of synthetic images, several evaluation metrics, such as Multi-Scale Structural Similarity Index (MS-SSIM), Cosine Distance (CD), and Fréchet Inception Distance (FID) are used. Understanding the effectiveness of each metric in evaluating the quality and diversity of GAN-based synthetic images is critical to select images for augmentation. To date, there has been limited analysis of the appropriateness of these metrics in the context of biomedical imagery. This work contributes an empirical assessment of these evaluation metrics as applied to synthetic Proliferative DR imagery generated by a Deep Convolutional GAN (DCGAN). Furthermore, the metrics' capacity to indicate the quality and diversity of synthetic images, and their correlation with classifier performance, is examined. This enables a quantitative selection of synthetic imagery and an informed augmentation strategy. Results indicate that FID is suitable for evaluating the quality, while MS-SSIM and CD are suitable for evaluating the diversity of synthetic imagery. Furthermore, the superior performance of Convolutional Neural Network (CNN) and EfficientNet classifiers, as indicated by the F1 and AUC scores, for the augmented datasets demonstrates the efficacy of synthetic imagery to augment the imbalanced dataset. | 翻訳日:2023-05-17 01:02:02 公開日:2023-05-15 |
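Of the metrics listed in the abstract above, cosine distance is the simplest to state. A minimal sketch, assuming the images have already been mapped to feature vectors. The abstract does not say which features are used, so the inputs here are placeholders, and the averaging helper is an illustrative assumption rather than the authors' procedure.

```python
import math


def cosine_distance(u, v):
    """Cosine distance = 1 - cosine similarity between two feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(vectors):
    """Average cosine distance over all unordered pairs of vectors."""
    pairs = [(i, j) for i in range(len(vectors)) for j in range(i + 1, len(vectors))]
    return sum(cosine_distance(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)
```

A larger mean pairwise distance over a set of synthetic images would suggest a more varied set, which is the sense in which the abstract treats CD as a diversity measure.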
# 散逸性反磁性における運動エネルギーと磁気モーメントの分配 Partition of kinetic energy and magnetic moment in dissipative diamagnetism ( http://arxiv.org/abs/2208.00161v3 ) ライセンス: Link先を確認 | Jasleen Kaur, Aritra Ghosh, Malay Bandyopadhyay | (参考訳) 本稿では,2次元における散逸性サイクロトロン運動に起因する散逸性反磁性を,エネルギー等分配則の量子版に照らして解析する。
より伝統的なギブズアプローチで得られたものとの比較研究を行い、完全な一致を得る。 In this paper, we analyze dissipative diamagnetism, arising due to dissipative cyclotron motion in two dimensions, in the light of the quantum counterpart of energy equipartition theorem. We consider a charged quantum particle moving in a harmonic well, in the presence of a uniform magnetic field, and coupled to a quantum heat bath which is taken to be composed of an infinite number of independent quantum oscillators. The quantum counterpart of energy equipartition theorem tells us that it is possible to express the mean kinetic energy of the dissipative oscillator as a two-fold average, where the first averaging is performed over the Gibbs canonical state of the heat bath while the second one is governed by a probability distribution function $P_k(\omega)$. We analyze this result further, and also demonstrate its consistency in the weak-coupling limit. Following this, we compute the equilibrium magnetic moment of the system, and reveal an interesting connection with the quantum counterpart of energy equipartition theorem. The expressions for kinetic energy and magnetic moment are reformulated in the context of superstatistics, i.e. the superposition of two statistics. A comparative study of the present results with those obtained from the more traditional Gibbs approach is performed and a perfect agreement is obtained. | 翻訳日:2023-05-17 01:01:01 公開日:2023-05-15 |
# 機械翻訳評価のためのラウンドトリップ翻訳の再考 Rethinking Round-Trip Translation for Machine Translation Evaluation ( http://arxiv.org/abs/2209.07351v3 ) ライセンス: Link先を確認 | Terry Yue Zhuo, Qiongkai Xu, Xuanli He, Trevor Cohn | (参考訳) 低リソース言語翻訳の自動評価は並列コーパスの欠如に悩まされる。
しかし, 統計的機械翻訳(SMT)の時代において, 前向き翻訳とラウンドトリップ翻訳による評価スコアの曖昧な相関が観察された。
一 対応する前方翻訳スコアを予測すること
二 最新の品質推定モデルの性能を向上させること、及び
三 クロスシステム検証により、共有タスク(shared task)における敵対的な参加システムを識別すること。 Automatic evaluation on low-resource language translation suffers from a deficiency of parallel corpora. Round-trip translation could serve as a clever and straightforward technique to alleviate the requirement of the parallel evaluation corpus. However, there was an observation of obscure correlations between the evaluation scores by forward and round-trip translations in the era of statistical machine translation (SMT). In this paper, we report the surprising finding that round-trip translation can be used for automatic evaluation without the references. Firstly, our revisiting of round-trip translation in SMT evaluation unveils that the long-standing misunderstanding is essentially caused by the copying mechanism. After removing the copying mechanism in SMT, round-trip translation scores can appropriately reflect the forward translation performance. Then, we demonstrate the rectification is overdue as round-trip translation could benefit multiple machine translation evaluation tasks. To be more specific, round-trip translation could be used i) to predict corresponding forward translation scores; ii) to improve the performance of the recently advanced quality estimation model; and iii) to identify adversarial competitors in shared tasks via cross-system verification. | 翻訳日:2023-05-17 00:53:36 公開日:2023-05-15 |
# オフポリシー強化学習における再利用バイアスについて On the Reuse Bias in Off-Policy Reinforcement Learning ( http://arxiv.org/abs/2209.07074v2 ) ライセンス: Link先を確認 | Chengyang Ying, Zhongkai Hao, Xinning Zhou, Hang Su, Dong Yan, Jun Zhu | (参考訳) 重要度サンプリング (IS) はオフポリシー評価において一般的な手法であり、サンプル効率を高めるためにリプレイバッファ内の軌道のリターンを再重み付けする。
本稿では,ISの再利用バイアス(Reuse Bias)という新しい概念 -- 評価と最適化のためにリプレイバッファを再利用することで生じるオフポリシー評価のバイアス -- にも不安定性が関係していることを明らかにする。
これらの分析に基づいて, 再利用バイアスの悪影響を緩和する実用的なアルゴリズムとともに, 新たなバイアス正規化重要度サンプリング(BIRIS)フレームワークを提案する。
実験の結果,本手法はMuJoCoにおける一連の連続制御タスクのサンプル効率を大幅に向上できることがわかった。 Importance sampling (IS) is a popular technique in off-policy evaluation, which re-weights the return of trajectories in the replay buffer to boost sample efficiency. However, training with IS can be unstable and previous attempts to address this issue mainly focus on analyzing the variance of IS. In this paper, we reveal that the instability is also related to a new notion of Reuse Bias of IS -- the bias in off-policy evaluation caused by the reuse of the replay buffer for evaluation and optimization. We theoretically show that the off-policy evaluation and optimization of the current policy with the data from the replay buffer result in an overestimation of the objective, which may cause an erroneous gradient update and degenerate the performance. We further provide a high-probability upper bound of the Reuse Bias, and show that controlling one term of the upper bound can control the Reuse Bias by introducing the concept of stability for off-policy algorithms. Based on these analyses, we finally present a novel Bias-Regularized Importance Sampling (BIRIS) framework along with practical algorithms, which can alleviate the negative impact of the Reuse Bias. Experimental results show that our BIRIS-based methods can significantly improve the sample efficiency on a series of continuous control tasks in MuJoCo. | 翻訳日:2023-05-17 00:53:22 公開日:2023-05-15 |
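The re-weighting that the abstract above refers to can be sketched as ordinary trajectory-level importance sampling. This is a textbook estimator under assumed inputs (per-step action probabilities under the target and behavior policies, plus a return for each trajectory). It does not implement BIRIS or the Reuse Bias bound, and the buffer contents are hypothetical.

```python
def importance_weight(target_probs, behavior_probs):
    """Product of per-step probability ratios pi(a|s) / mu(a|s) along one trajectory."""
    weight = 1.0
    for p_target, p_behavior in zip(target_probs, behavior_probs):
        if p_behavior <= 0.0:
            raise ValueError("behavior policy must give positive probability to logged actions")
        weight *= p_target / p_behavior
    return weight


def is_estimate(trajectories):
    """Ordinary importance-sampling estimate of the target policy's expected return.

    trajectories: list of (target_probs, behavior_probs, trajectory_return) tuples
    """
    total = sum(importance_weight(tp, bp) * ret for tp, bp, ret in trajectories)
    return total / len(trajectories)


# Hypothetical replay buffer with two logged trajectories of two steps each.
buffer = [
    ([0.5, 0.5], [0.25, 0.5], 1.0),  # importance weight 2.0
    ([0.2, 0.5], [0.4, 0.5], 3.0),   # importance weight 0.5
]
print(is_estimate(buffer))  # (2.0 * 1.0 + 0.5 * 3.0) / 2 = 1.75
```

The estimator is unbiased when the evaluated policy is fixed independently of the buffer. The abstract's concern is the case where the same buffer is also used to optimize that policy.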
# オープン量子システムとしてのDQC1 DQC1 as an Open Quantum System ( http://arxiv.org/abs/2209.03947v2 ) ライセンス: Link先を確認 | Jake Xuereb, Steve Campbell, John Goold, André Xuereb | (参考訳) DQC1複雑性クラス、すなわち1量子ビットモデルのパワーは、オープン量子システムとして検討される。
応用として, DQC1トレース推定アルゴリズムの平衡と非平衡熱力学について検討する。
異なる計算入力、すなわち、推定されるトレースは、量子ビットのレジスタ全体にわたって異なるエネルギー交換を生じさせ、論理量子ビットの温度が経験した変動の大きさとアルゴリズムの品質に影響することを示す。 The DQC1 complexity class, or power of one qubit model, is examined as an open quantum system. We study the dynamics of a register of qubits carrying out a DQC1 algorithm and show that, for any algorithm in the complexity class, the evolution of the logical qubit can be described as an open quantum system undergoing a dynamics which is unital. Unital quantum channels respect the Tasaki-Crooks fluctuation theorem and we demonstrate how this is captured by the thermodynamics of the logical qubit. As an application, we investigate the equilibrium and non-equilibrium thermodynamics of the DQC1 trace estimation algorithm. We show that different computational inputs, i.e. different traces being estimated, lead to different energetic exchanges across the register of qubits and that the temperature of the logical qubit impacts the magnitude of fluctuations experienced and quality of the algorithm. | 翻訳日:2023-05-17 00:53:00 公開日:2023-05-15 |
# 正定値計量をもつ非エルミート系に対する艤装ヒルベルト空間アプローチ Rigged Hilbert Space Approach for Non-Hermite Systems with Positive Definite Metric ( http://arxiv.org/abs/2209.01598v4 ) ライセンス: Link先を確認 | Shousuke Ohmori and Junichi Takahashi | (参考訳) 正定値計量を持つ非エルミート量子系に対して、艤装ヒルベルト空間(rigged Hilbert space)に基づくディラックのブラケット形式を検討する。
応用例として、あるパリティ時間対称量子系に対する艤装ヒルベルト空間による取り扱いの具体的記述を示す。 We investigate Dirac's bra-ket formalism based on a rigged Hilbert space for a non-Hermite quantum system with a positive-definite metric. First, the rigged Hilbert space, characterized by positive-definite metric, is established. With the aid of the nuclear spectral theorem for the obtained rigged Hilbert space, spectral expansions are shown for the bra-kets by the generalized eigenvectors of a quasi-Hermite operator. The spectral expansions are utilized to endow the complete bi-orthogonal system and the transformation theory between the Hermite and non-Hermite systems. As an example of application, we show a specific description of our rigged Hilbert space treatment for some parity-time symmetrical quantum systems. | 翻訳日:2023-05-17 00:52:33 公開日:2023-05-15 |
# 退化パラメトリック発振器の閾値における放射統計 Radiation statistics of a degenerate parametric oscillator at threshold ( http://arxiv.org/abs/2208.14886v3 ) ライセンス: Link先を確認 | Fabian Hassler, Steven Kim, Lisa Arndt | (参考訳) 駆動強度の関数として、縮退パラメトリック発振器は、自発振動が発生する不安定性を示す。
さらに,最初の3つのキュムラントのある比が系の微視的な詳細に依存しないことを予測し,その結果を実験プラットフォームに結び付ける。 As a function of the driving strength, a degenerate parametric oscillator exhibits an instability at which spontaneous oscillations occur. Close to threshold, both the nonlinearity and the fluctuations are vital to the accurate description of the dynamics. We study the statistics of the radiation that is emitted by the degenerate parametric oscillator at threshold. For a weak nonlinearity, we can employ a quasiclassical description. We identify a universal Liouvillian that captures the relevant long-time dynamics for large photon-numbers. We find that the cumulants obey a universal power-law scaling as a function of the nonlinearity. The Fano factor shows a maximum close to, but not coinciding with, the threshold. Moreover, we predict a certain ratio of the first three cumulants to be independent of the microscopic details of the system and connect the results to experimental platforms. | 翻訳日:2023-05-17 00:52:22 公開日:2023-05-15 |
# StoryTrans: 談話表現とコンテンツエンハンスを備えた非並列ストーリーオーサリング StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing ( http://arxiv.org/abs/2208.13423v2 ) ライセンス: Link先を確認 | Xuekai Zhu, Jian Guan, Minlie Huang, Juan Liu | (参考訳) 非並列テキストスタイル転送は自然言語生成において重要なタスクである。
大規模な実験により,本モデルはスタイル転送とコンテンツ保存の全体的な性能において,強いベースラインを上回ります。 Non-parallel text style transfer is an important task in natural language generation. However, previous studies concentrate on the token or sentence level, such as sentence sentiment and formality transfer, but neglect long style transfer at the discourse level. Long texts usually involve more complicated author linguistic preferences such as discourse structures than sentences. In this paper, we formulate the task of non-parallel story author-style transfer, which requires transferring an input story into a specified author style while maintaining source semantics. To tackle this problem, we propose a generation model, named StoryTrans, which leverages discourse representations to capture source content information and transfer them to target styles with learnable style embeddings. We use an additional training objective to disentangle stylistic features from the learned discourse representation to prevent the model from degenerating to an auto-encoder. Moreover, to enhance content preservation, we design a mask-and-fill framework to explicitly fuse style-specific keywords of source texts into generation. Furthermore, we constructed new datasets for this task in Chinese and English, respectively. Extensive experiments show that our model outperforms strong baselines in overall performance of style transfer and content preservation. | 翻訳日:2023-05-17 00:52:12 公開日:2023-05-15 |
# 第2量子化における周期固体の量子計算 Quantum Computation for Periodic Solids in Second Quantization ( http://arxiv.org/abs/2210.02403v2 ) ライセンス: Link先を確認 | Aleksei V. Ivanov, Christoph Sünderhauf, Nicole Holzmann, Tom Ellaby, Rachel N. Kerber, Glenn Jones, Joan Camps | (参考訳) 本研究では,誤差補正量子コンピュータ上での周期固体の基底状態エネルギー計算のための量子アルゴリズムを提案する。
このアルゴリズムは第2量子化におけるスパース量子化アプローチに基づいており、Bloch と Wannier 基底集合のために開発された。
(i)ハミルトニアンの l$_1$ のノルムはかなり低い。
200〜900個のスピン軌道で近似したハミルトニアンの基底状態エネルギー推定には、物理誤差率0.1%に対して約$10^{10}$〜$10^{12}$個のTゲートと最大$3\cdot10^8$個の物理量子ビットが必要である。 In this work, we present a quantum algorithm for ground-state energy calculations of periodic solids on error-corrected quantum computers. The algorithm is based on the sparse qubitization approach in second quantization and developed for Bloch and Wannier basis sets. We show that Wannier functions require fewer computational resources than Bloch functions because: (i) the L$_1$ norm of the Hamiltonian is considerably lower and (ii) the translational symmetry of Wannier functions can be exploited in order to reduce the amount of classical data that must be loaded into the quantum computer. The resource requirements of the quantum algorithm are estimated for periodic solids such as NiO and PdO. These transition metal oxides are industrially relevant for their catalytic properties. We find that ground-state energy estimation of Hamiltonians approximated using 200--900 spin orbitals requires ca. $10^{10}$--$10^{12}$ T gates and up to $3\cdot10^8$ physical qubits for a physical error rate of $0.1\%$. | 翻訳日:2023-05-17 00:43:50 公開日:2023-05-15 |
# FusionRetro:再合成計画のためのインコンテキスト反応による分子表現融合 FusionRetro: Molecule Representation Fusion via In-context Reactions for Retrosynthetic Planning ( http://arxiv.org/abs/2209.15315v3 ) ライセンス: Link先を確認 | Songtao Liu, Zhengkai Tu, Minkai Xu, Zuobai Zhang, Lu Lin, Rex Ying, Jian Tang, Peilin Zhao, Dinghao Wu | (参考訳) 再合成計画(Retrosynthetic Planning)は、材料からターゲット分子への完全な多段階合成経路を考案することを目的としている。
提案手法は, 逆合成計画にコンテキスト内反応を利用した最初の試みである。
総合実験により, 経路上のコンテキスト情報を融合することにより, 特に長い合成経路において, ベースライン上での逆合成計画の性能が著しく向上することを示した。
コードはhttps://github.com/SongtaoLiu0823/FusionRetroで公開されている。 Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule. Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms, taking only the product as the input to predict the reactants for each planning step and ignoring valuable context information along the synthetic route. In this work, we propose a novel framework that utilizes context information for improved retrosynthetic planning. We view synthetic routes as reaction graphs and propose to incorporate context through three principled steps: encode molecules into embeddings, aggregate information over routes, and readout to predict reactants. Our approach is the first attempt to utilize in-context reactions for retrosynthetic planning. The entire framework can be efficiently optimized in an end-to-end fashion and produce more practical and accurate predictions. Comprehensive experiments demonstrate that by fusing in the context information over routes, our model significantly improves the performance of retrosynthetic planning over baselines that are not context-aware, especially for long synthetic routes. Code is available at https://github.com/SongtaoLiu0823/FusionRetro. | 翻訳日:2023-05-17 00:43:18 公開日:2023-05-15 |
# モジュールアーキテクチャのための量子LDPC符号 Quantum LDPC Codes for Modular Architectures ( http://arxiv.org/abs/2209.14329v3 ) ライセンス: Link先を確認 | Armands Strikis, Lucas Berent | (参考訳) 量子コンピュータの規模を拡大するために、モジュラリティは多くの量子コンピューティング技術において中心的な役割を果たす。
最後に、モジュール間の接続のツイストを可能にする接続制約を緩和し、より良いパラメータを持つコードを構築する方法を示す。 In efforts to scale the size of quantum computers, modularity plays a central role across most quantum computing technologies. In the light of fault tolerance, this necessitates designing quantum error-correcting codes that are compatible with the connectivity arising from the architectural layouts. In this paper, we aim to bridge this gap by giving a novel way to view and construct quantum LDPC codes tailored for modular architectures. We demonstrate that if the intra- and inter-modular qubit connectivity can be viewed as corresponding to some classical or quantum LDPC codes, then their hypergraph product code fully respects the architectural connectivity constraints. Finally, we show that relaxed connectivity constraints that allow twists of connections between modules pave a way to construct codes with better parameters. | 翻訳日:2023-05-17 00:42:40 公開日:2023-05-15 |
# 多言語ニューラルマシン翻訳のためのスイッチトバックトランスレーションによる多言語合意の双方向改訂 Revamping Multilingual Agreement Bidirectionally via Switched Back-translation for Multilingual Neural Machine Translation ( http://arxiv.org/abs/2209.13940v3 ) ライセンス: Link先を確認 | Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Wai Lam | (参考訳) マルチリンガル・コンセンサス(MA)がマルチリンガル・ニューラル・マシン翻訳(MNMT)の重要性を示しているにもかかわらず、この分野の現在の手法には2つの欠点がある。
我々は,事前学習されたMNMTモデルの微調整のための新しい普遍的多言語合意フレームワークである Bidirectional Multilingual Agreement via Switched Back-translation (BMA-SBT) を提案する。
一 翻訳目標を用いて他のソース言語で書かれた合成テキストを作成するスイッチングBTと呼ばれる新しい方法を用いて、上記の並列データの必要性を免除し、
実験によると、BMA-SBTはTED Talks、News、Europarlの3つのベンチマークでMNMTのタスクの強いベースラインを明らかに改善している。
詳細な分析から,BMA-SBTは従来のBT法に付加的な改善をもたらすことが示された。 Despite the fact that multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT), current methodologies in the field have two shortcomings: (i) they require parallel data between multiple language pairs, which is not always realistic, and (ii) they optimize the agreement in an ambiguous direction, which hampers the translation performance. We present Bidirectional Multilingual Agreement via Switched Back-translation (BMA-SBT), a novel and universal multilingual agreement framework for fine-tuning pre-trained MNMT models, which (i) removes the need for the aforementioned parallel data by using a novel method called switched BT that creates synthetic text written in another source language using the translation target and (ii) optimizes the agreement bidirectionally with the Kullback-Leibler Divergence loss. Experiments indicate that BMA-SBT clearly improves the strong baselines on the task of MNMT with three benchmarks: TED Talks, News, and Europarl. In-depth analyses indicate that BMA-SBT brings additive improvements to the conventional BT method. | 翻訳日:2023-05-17 00:42:27 公開日:2023-05-15 |
# 改良型サブシーズン予測のための適応バイアス補正 Adaptive Bias Correction for Improved Subseasonal Forecasting ( http://arxiv.org/abs/2209.10666v3 ) ライセンス: Link先を確認 | Soukayna Mouatadid, Paulo Orenstein, Genevieve Flaspohler, Judah Cohen, Miruna Oprescu, Ernest Fraenkel, Lester Mackey | (参考訳) 気温と降水量を2~6週間予測する季節的予測は、効果的な水配分、山火事管理、干ばつや洪水の緩和に不可欠だ。
これらの性能改善を実践的なワークフローと組み合わせ、ABCのスキル向上を説明し、特定の気候条件に基づいて高度な機会の窓を特定する。 Subseasonal forecasting -- predicting temperature and precipitation 2 to 6 weeks ahead -- is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skills remain poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. Here, to counter these errors, we introduce an adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. We show that, when applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60-90% (over baseline skills of 0.18-0.25) and precipitation forecasting skill by 40-69% (over baseline skills of 0.11-0.15) in the contiguous U.S. We couple these performance improvements with a practical workflow to explain ABC skill gains and identify higher-skill windows of opportunity based on specific climate conditions. | 翻訳日:2023-05-17 00:41:43 公開日:2023-05-15 |
# スピン探査とエネルギー流体力学への爆発障害 Exploiting disorder to probe spin and energy hydrodynamics ( http://arxiv.org/abs/2209.09322v2 ) ライセンス: Link先を確認 | Pai Peng, Bingtian Ye, Norman Y. Yao, Paola Cappellaro | (参考訳) 大規模量子プラットフォームにおける顕著な課題は、強力な相互作用を同時に達成し、最も興味深い振る舞いと、それらを探索できるローカルアドレッシングをもたらすことである。
本手法は, 固体スピンアンサンブルに存在する内在性障害を利用して相関関数の非局所成分を脱相する。
フロッケ工学による相互作用ハミルトニアンのチューニングにより, 弾道流体力学と拡散流体力学のクロスオーバーについて検討した。
興味深いことに、系が相互作用可能かつ(ほぼ)可積分である場合、拡散スピン輸送と弾道エネルギー輸送の共存を観測する。 An outstanding challenge in large-scale quantum platforms is to simultaneously achieve strong interactions, which give rise to the most interesting behaviors, and local addressing, which can probe them. In the context of correlated phases, local addressing enables one to directly probe the nature of the system's order. Meanwhile, for out-of-equilibrium dynamics, such addressing allows the study of quantum information spreading and operator growth. Here, we introduce a novel technique that enables the measurement of local correlation functions, down to single-site resolution, despite access to only global controls. Our approach leverages the intrinsic disorder present in a solid-state spin ensemble to dephase the nonlocal components of the correlation function. Utilizing this toolset, we measure both the spin and energy transport in nuclear spin chains. By tuning the interaction Hamiltonian via Floquet engineering, we investigate the cross-over between ballistic and diffusive hydrodynamics. Interestingly, when the system is both interacting and (nearly-)integrable, we observe the coexistence of diffusive spin transport with ballistic energy transport. | 翻訳日:2023-05-17 00:41:19 公開日:2023-05-15 |
# 深層学習法による角度分解光電子分光の格子構造除去 Removing grid structure in angle-resolved photoemission spectra via deep learning method ( http://arxiv.org/abs/2210.11200v2 ) ライセンス: Link先を確認 | Junde Liu, Dongchen Huang, Yi-feng Yang, and Tian Qian | (参考訳) 分光データは、しばしば望ましくない外因性信号を含む。
他の外因性シグナルを排除し、スペクトルの自己相関のみに基づくスペクトル品質を高めるため、全ての分光測定に拡張される可能性がある。 Spectroscopic data may often contain unwanted extrinsic signals. For example, in ARPES experiments, a wire mesh is typically placed in front of the CCD to block stray photo-electrons, but it can cause a grid-like structure in the spectra during quick measurement mode. In the past, this structure was often removed using the mathematical Fourier filtering method by erasing the periodic structure. However, this method may lead to information loss and vacancies in the spectra because the grid structure is not strictly linearly superimposed. Here, we propose a deep learning method to effectively overcome this problem. Our method takes advantage of the self-correlation information within the spectra themselves and can greatly optimize the quality of the spectra while removing the grid structure and noise simultaneously. It has the potential to be extended to all spectroscopic measurements to eliminate other extrinsic signals and enhance the spectral quality based solely on the self-correlation of the spectra. | 翻訳日:2023-05-17 00:35:26 公開日:2023-05-15 |
# ノイズの多い木データ構造と量子応用 Noisy Tree Data Structures and Quantum Applications ( http://arxiv.org/abs/2210.11197v2 ) ライセンス: Link先を確認 | Kamil Khadiev, Nikita Savelyev, Mansur Ziatdinov and Denis Melnikov | (参考訳) 本稿では,歩行木と呼ばれるノイズの多いデータ構造を構築する手法を提案する。
赤黒木(Self-Balanced Binary Search Treeの実装)とセグメントツリーに適用する。
同時に、古典的なものよりも効果的である。 The paper presents a technique for constructing noisy data structures called a walking tree. We apply it to a Red-Black tree (an implementation of a Self-Balanced Binary Search Tree) and a segment tree. We obtain the same complexity of the main operations for these data structures as in the case without noise (asymptotically). We present several applications of the data structures for quantum algorithms. Finally, we suggest a new quantum solution for the string sorting problem and show the lower bound. The upper and lower bounds are the same up to a log factor. At the same time, it is more effective than classical counterparts. | 翻訳日:2023-05-17 00:35:12 公開日:2023-05-15 |
# 学習自由深層学習法による分光データデノイズ化 Spectroscopic data de-noising via training-set-free deep learning method ( http://arxiv.org/abs/2210.10494v2 ) ライセンス: Link先を確認 | Dongchen Huang, Junde Liu, Tian Qian, and Yi-feng Yang | (参考訳) 脱ノイズはスペクトルのポストプロセッシングにおいて重要な役割を果たす。
さらに,本手法はトレーニングセットの特定の特性に制限されないため,高品質な多次元トレーニングデータを取得することが困難な他の分野やアプリケーションシナリオにも拡張できる可能性がある。 De-noising plays a crucial role in the post-processing of spectra. Machine learning-based methods show good performance in extracting intrinsic information from noisy data, but often require a high-quality training set that is typically inaccessible in real experimental measurements. Here, using spectra in angle-resolved photoemission spectroscopy (ARPES) as an example, we develop a de-noising method for extracting intrinsic spectral information without the need for a training set. This is possible as our method leverages the self-correlation information of the spectra themselves. It preserves the intrinsic energy band features and thus facilitates further analysis and processing. Moreover, since our method is not limited by specific properties of the training set compared to previous ones, it may well be extended to other fields and application scenarios where obtaining high-quality multidimensional training data is challenging. | 翻訳日:2023-05-17 00:35:05 公開日:2023-05-15 |
# 多粒度不確かさ正規化によるテキストフィードバックによる合成画像検索 Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization ( http://arxiv.org/abs/2211.07394v4 ) ライセンス: Link先を確認 | Yiyang Chen, Zhedong Zheng, Wei Ji, Leigang Qu, Tat-Seng Chua | (参考訳) テキストフィードバックによる合成画像検索について検討した。
2) 不確実性モデリングに基づいて,変動範囲に応じて一致目標を適応させる不確実性正規化を導入する。
3つの公開データセット(FashionIQ, Fashion200k, Shoes)では,提案手法は強いベースラインに対してそれぞれ +4.03%, +3.38%, +2.40% の Recall@50 精度向上を達成している。 We investigate composed image retrieval with text feedback. Users gradually look for the target of interest by moving from coarse to fine-grained feedback. However, existing methods merely focus on the latter, i.e., fine-grained search, by harnessing positive and negative pairs during training. This pair-based paradigm only considers the one-to-one distance between a pair of specific points, which is not aligned with the one-to-many coarse-grained retrieval process and compromises the recall rate. In an attempt to fill this gap, we introduce a unified learning approach to simultaneously modeling the coarse- and fine-grained retrieval by considering the multi-grained uncertainty. The key idea underpinning the proposed method is to integrate fine- and coarse-grained retrieval as matching data points with small and large fluctuations, respectively. Specifically, our method contains two modules: uncertainty modeling and uncertainty regularization. (1) The uncertainty modeling simulates the multi-grained queries by introducing identically distributed fluctuations in the feature space. (2) Based on the uncertainty modeling, we further introduce uncertainty regularization to adapt the matching objective according to the fluctuation range. Compared with existing methods, the proposed strategy explicitly prevents the model from pushing away potential candidates in the early stage and thus improves the recall rate. On the three public datasets, i.e., FashionIQ, Fashion200k, and Shoes, the proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline, respectively. | 翻訳日:2023-05-17 00:26:09 公開日:2023-05-15 |
# マクロスピン系における単一マグノンの量子制御 Quantum control of a single magnon in a macroscopic spin system ( http://arxiv.org/abs/2211.06644v2 ) ライセンス: Link先を確認 | Da Xu, Xu-Ke Gu, He-Kang Li, Yuan-Chao Weng, Yi-Pu Wang, Jie Li, H. Wang, Shi-Yao Zhu, J. Q. You | (参考訳) 古典的でない量子状態は、古典的なものとは異なる量子系の重要な特徴である。
Autler-Townes効果を介して量子ビット周波数を in situ でチューニングすることにより、単一マグノンと真空の重畳状態を含む古典的でない量子状態を生成するために、この単一のマグノンを操作する。
我々の実験は、マクロスピン系における非古典的量子状態の決定論的生成を初めて報告し、量子工学におけるその有望な応用を探求する方法を提供する。 Non-classical quantum states are the pivotal features of a quantum system that differs from its classical counterpart. However, the generation and coherent control of quantum states in a macroscopic spin system remain an outstanding challenge. Here we experimentally demonstrate the quantum control of a single magnon in a macroscopic spin system (i.e., 1 mm-diameter yttrium-iron-garnet sphere) coupled to a superconducting qubit via a microwave cavity. By tuning the qubit frequency in situ via the Autler-Townes effect, we manipulate this single magnon to generate its non-classical quantum states, including the single-magnon state and the superposition state of a single magnon and vacuum. Moreover, we confirm the deterministic generation of these non-classical states by Wigner tomography. Our experiment offers the first reported deterministic generation of the non-classical quantum states in a macroscopic spin system and paves a way to explore its promising applications in quantum engineering. | 翻訳日:2023-05-17 00:25:41 公開日:2023-05-15 |
# 自律船の時空間リカレント強化学習 Spatial-temporal recurrent reinforcement learning for autonomous ships ( http://arxiv.org/abs/2211.01004v2 ) ライセンス: Link先を確認 | Martin Waltz and Ostap Okhrin | (参考訳) 本稿では,自律船の操縦に使用できる深層$Q$-networksのための時空間リカレントニューラルネットワークアーキテクチャを提案する。
さらに, エージェントによる異なる状況の簡易評価を可能にするため, 最先端の衝突リスク指標を提案する。
最終的な方針は、'around the clock'問題と呼ばれる、新たに作られた18のマルチシップシナリオを含む、一般的な今津問題(1987年)のカスタムセットで検証される。
さらに、新しいアーキテクチャはマルチエージェントシナリオにデプロイされた場合の堅牢性を示し、アクタークリティカルなフレームワークを含む他の深層強化学習アルゴリズムと互換性がある。 This paper proposes a spatial-temporal recurrent neural network architecture for deep $Q$-networks that can be used to steer an autonomous ship. The network design makes it possible to handle an arbitrary number of surrounding target ships while offering robustness to partial observability. Furthermore, a state-of-the-art collision risk metric is proposed to enable an easier assessment of different situations by the agent. The COLREG rules of maritime traffic are explicitly considered in the design of the reward function. The final policy is validated on a custom set of newly created single-ship encounters called `Around the Clock' problems and the commonly used Imazu (1987) problems, which include 18 multi-ship scenarios. Performance comparisons with artificial potential field and velocity obstacle methods demonstrate the potential of the proposed approach for maritime path planning. Furthermore, the new architecture exhibits robustness when it is deployed in multi-agent scenarios and it is compatible with other deep reinforcement learning algorithms, including actor-critic frameworks. | 翻訳日:2023-05-17 00:24:41 公開日:2023-05-15 |
# 深層強化学習を用いた適応型大近所探索のオンライン制御 Online Control of Adaptive Large Neighborhood Search using Deep Reinforcement Learning ( http://arxiv.org/abs/2211.00759v2 ) ライセンス: Link先を確認 | Robbert Reijnen, Yingqian Zhang, Hoong Chuin Lau, Zaharah Bukhsh | (参考訳) Adaptive Large Neighborhood Search (ALNS)アルゴリズムは複雑な組合せ最適化問題(COP)の解法においてかなりの成功を収めている。
この制限に対処するために、ヒューリスティックスを選択し、パラメータを調整し、検索プロセス中の受け入れ基準を制御できるDeep Reinforcement Learning (DRL)アプローチを提案する。
我々のアプローチの実装は公開される予定だ。 The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving complex combinatorial optimization problems (COPs). ALNS selects various heuristics adaptively during the search process, leveraging their strengths to find good solutions for optimization problems. However, the effectiveness of ALNS depends on the proper configuration of its selection and acceptance parameters. To address this limitation, we propose a Deep Reinforcement Learning (DRL) approach that selects heuristics, adjusts parameters, and controls the acceptance criteria during the search process. The proposed method aims to learn, based on the state of the search, how to configure the next iteration of the ALNS to obtain good solutions to the underlying optimization problem. We evaluate the proposed method on a time-dependent orienteering problem with stochastic weights and time windows, used in an IJCAI competition. The results show that our approach outperforms vanilla ALNS and ALNS tuned with Bayesian Optimization. In addition, it obtained better solutions than two state-of-the-art DRL approaches, which are the winning methods of the competition, with much fewer observations required for training. The implementation of our approach will be made publicly available. | 翻訳日:2023-05-17 00:24:26 公開日:2023-05-15 |
# ハイブリッドスパムメール検出のための遅発型マルチモーダル融合モデル A Late Multi-Modal Fusion Model for Detecting Hybrid Spam E-mail ( http://arxiv.org/abs/2210.14616v4 ) ライセンス: Link先を確認 | Zhibo Zhang, Ernesto Damiani, Hussam Al Hamadi, Chan Yeob Yeun, Fatma Taher | (参考訳) 近年、スパマーは、画像とテキストの両方を組み合わせたハイブリッドスパムメールを導入して、その意図を難読化しようとしている。
畳み込みニューラルネットワーク(CNN)と Continuous Bag of Words を用いて,ハイブリッドスパムのイメージ部分とテキスト部分からそれぞれ特徴を抽出し,生成した特徴をsigmoid層と,ランダムフォレスト(RF),決定木(DT),ナイーブベイズ(NB),サポートベクターマシン(SVM)などの機械学習に基づく分類器に供給して,電子メールがハムかスパムかを判定した。 In recent years, spammers have been trying to obfuscate their intents by introducing hybrid spam e-mail combining both image and text parts, which is more challenging to detect in comparison to e-mails containing text or image only. The motivation behind this research is to design an effective approach filtering out hybrid spam e-mails to avoid situations where traditional text-based or image-based only filters fail to detect hybrid spam e-mails. To the best of our knowledge, only a few studies have been conducted with the goal of detecting hybrid spam e-mails. Ordinarily, Optical Character Recognition (OCR) technology is used to eliminate the image parts of spam by transforming images into text. However, although OCR scanning is a very successful technique in processing text-and-image hybrid spam, it is not an effective solution for dealing with huge quantities due to the CPU power required and the execution time it takes to scan e-mail files. Moreover, the OCR techniques are not always reliable in the transformation processes. To address such problems, we propose new late multi-modal fusion training frameworks for a text-and-image hybrid spam e-mail filtering system compared to the classical early fusion detection frameworks based on the OCR method. Convolutional Neural Network (CNN) and Continuous Bag of Words were implemented to extract features from image and text parts of hybrid spam respectively, whereas generated features were fed to a sigmoid layer and Machine Learning based classifiers including Random Forest (RF), Decision Tree (DT), Naive Bayes (NB) and Support Vector Machine (SVM) to classify the e-mail as ham or spam. | 翻訳日:2023-05-17 00:24:07 公開日:2023-05-15 |
# リアルタイム車載LiDAR知覚のための点雲のディープラーニング表現の解析 Analyzing Deep Learning Representations of Point Clouds for Real-Time In-Vehicle LiDAR Perception ( http://arxiv.org/abs/2210.14612v3 ) ライセンス: Link先を確認 | Marc Uecker and Tobias Fleck and Marcel Pflugfelder and J. Marius Zöllner | (参考訳) LiDARセンサーは、車両の周囲の正確な高解像度の3D表現を提供するため、現代の自動運転車の不可欠な部分である。
最後に、ニューラルポイントクラウド処理手法の今後の発展に関する洞察とガイダンスを提供する。 LiDAR sensors are an integral part of modern autonomous vehicles as they provide an accurate, high-resolution 3D representation of the vehicle's surroundings. However, it is computationally difficult to make use of the ever-increasing amounts of data from multiple high-resolution LiDAR sensors. As frame-rates, point cloud sizes and sensor resolutions increase, real-time processing of these point clouds must still extract semantics from this increasingly precise picture of the vehicle's environment. One deciding factor of the run-time performance and accuracy of deep neural networks operating on these point clouds is the underlying data representation and the way it is computed. In this work, we examine the relationship between the computational representations used in neural networks and their performance characteristics. To this end, we propose a novel computational taxonomy of LiDAR point cloud representations used in modern deep neural networks for 3D point cloud processing. Using this taxonomy, we perform a structured analysis of different families of approaches. Thereby, we uncover common advantages and limitations in terms of computational efficiency, memory requirements, and representational capacity as measured by semantic segmentation performance. Finally, we provide some insights and guidance for future developments in neural point cloud processing methods. | 翻訳日:2023-05-17 00:23:30 公開日:2023-05-15 |
# 拡散モデルディープフェイクの検出に向けて Towards the Detection of Diffusion Model Deepfakes ( http://arxiv.org/abs/2210.14571v3 ) ライセンス: Link先を確認 | Jonas Ricker, Simon Damm, Thorsten Holz, Asja Fischer | (参考訳) 拡散モデル(dms)は画像合成において有望な方法として最近登場した。
しかし, DM生成画像の検出にはほとんど注意が払われていないため, 社会に悪影響を及ぼすおそれがある。
本稿では,この課題を2つの異なる角度から解決する。第1に,様々なdm上でgans(generative adversarial networks)が生成する画像に対して非常に効果的である最先端検出器の性能を評価する。
本研究がDM生成画像の有効検出に関するさらなる研究の基盤と出発点となると確信している。 Diffusion models (DMs) have recently emerged as a promising method in image synthesis. However, to date, only little attention has been paid to the detection of DM-generated images, which is critical to prevent adverse impacts on our society. In this work, we address this pressing challenge from two different angles: First, we evaluate the performance of state-of-the-art detectors, which are very effective against images generated by generative adversarial networks (GANs), on a variety of DMs. Second, we analyze DM-generated images in the frequency domain and study different factors that influence the spectral properties of these images. Most importantly, we demonstrate that GANs and DMs produce images with different characteristics, which requires adaptation of existing classifiers to ensure reliable detection. We are convinced that this work provides the foundation and starting point for further research on effective detection of DM-generated images. | 翻訳日:2023-05-17 00:23:02 公開日:2023-05-15 |
# 線形クラスタ状態からのGHZ状態の抽出 Extracting GHZ states from linear cluster states ( http://arxiv.org/abs/2211.16758v3 ) ライセンス: Link先を確認 | Jarn de Jong, Frederik Hahn, Nikolay Tcholtchev, Manfred Hauswirth, and Anna Pappa | (参考訳) 量子情報処理アーキテクチャは通常、最寄りの絡み合いの生成しかできない。
我々は、GHZ状態を共有するノードの集合のサイズに対して、厳密な上限である$\lfloor (n+3)/2 \rfloor$を証明し、局所クリフォードユニタリー、局所パウリ測定、古典的通信を用いて、$n$ qubitsの線形クラスタ状態から得ることができる。
最後に、これらの変換をIBMQ Montreal量子デバイス上で、最大$n=19$ qubitsの線形クラスタ状態に対して示す。 Quantum information processing architectures typically only allow for nearest-neighbour entanglement creation. In many cases, this prevents the direct generation of GHZ states, which are commonly used for many communication and computation tasks. Here, we show how to obtain GHZ states between nodes in a network that are connected in a straight line, naturally allowing them to initially share linear cluster states. We prove a strict upper bound of $\lfloor (n+3)/2 \rfloor$ on the size of the set of nodes sharing a GHZ state that can be obtained from a linear cluster state of $n$ qubits, using local Clifford unitaries, local Pauli measurements, and classical communication. Furthermore, we completely characterize all selections of nodes below this threshold that can share a GHZ state obtained within this setting. Finally, we demonstrate these transformations on the IBMQ Montreal quantum device for linear cluster states of up to $n=19$ qubits. | 翻訳日:2023-05-17 00:16:22 公開日:2023-05-15 |
# ロボットシステムの学習と制御のためのリー群強制変分積分器ネットワーク Lie Group Forced Variational Integrator Networks for Learning and Control of Robot Systems ( http://arxiv.org/abs/2211.16006v4 ) ライセンス: Link先を確認 | Valentin Duruisseaux, Thai Duong, Melvin Leok, Nikolay Atanasov | (参考訳) 物理法則の事前知識と力学系の構造特性をディープラーニングアーキテクチャの設計に組み込むことは、計算効率と一般化能力を向上させるための強力な技術であることが証明されている。
さらに、学習した離散時間ダイナミクスは、計算にスケーラブルな離散時間(最適)制御戦略で利用することができる。 Incorporating prior knowledge of physics laws and structural properties of dynamical systems into the design of deep learning architectures has proven to be a powerful technique for improving their computational efficiency and generalization capacity. Learning accurate models of robot dynamics is critical for safe and stable control. Autonomous mobile robots, including wheeled, aerial, and underwater vehicles, can be modeled as controlled Lagrangian or Hamiltonian rigid-body systems evolving on matrix Lie groups. In this paper, we introduce a new structure-preserving deep learning architecture, the Lie group Forced Variational Integrator Network (LieFVIN), capable of learning controlled Lagrangian or Hamiltonian dynamics on Lie groups, either from position-velocity or position-only data. By design, LieFVINs preserve both the Lie group structure on which the dynamics evolve and the symplectic structure underlying the Hamiltonian or Lagrangian systems of interest. The proposed architecture learns surrogate discrete-time flow maps allowing accurate and fast prediction without numerical-integrator, neural-ODE, or adjoint techniques, which are needed for vector fields. Furthermore, the learnt discrete-time dynamics can be utilized with computationally scalable discrete-time (optimal) control strategies. | 翻訳日:2023-05-17 00:15:45 公開日:2023-05-15 |
# 強結合光機械系における量子場ゆらぎのスペクトル解析 Spectral Analysis of Quantum Field Fluctuations in a Strongly Coupled Optomechanical System ( http://arxiv.org/abs/2211.14168v2 ) ライセンス: Link先を確認 | A. Ranfagni, F. Marino and F. Marin | (参考訳) 強固でコヒーレントな量子光学結合系におけるレビトダイナミックス実験により、発振器が広帯域量子スペクトル分析器として働くことを実証する。
さらに, 2次元力学系では, 真空揺らぎによって生じる量子バックアクションは, 全体感受性の破壊的干渉により, 狭いスペクトル領域において強く抑制される。 With a levitodynamics experiment in the strong and coherent quantum optomechanical coupling regime, we demonstrate that the oscillator acts as a broadband quantum spectrum analyzer. The asymmetry between positive and negative frequency branches in the displacement spectrum traces out the spectral features of the quantum fluctuations in the cavity field, which are thus explored over a wide spectral range. Moreover, in our two-dimensional mechanical system the quantum back-action, generated by such vacuum fluctuations, is strongly suppressed in a narrow spectral region due to a destructive interference in the overall susceptibility. | 翻訳日:2023-05-17 00:15:23 公開日:2023-05-15 |
# SLLEN: Semantic-aware Low-light Image Enhancement Network ( http://arxiv.org/abs/2211.11571v2 ) License: see linked page | Mingye Ju, Chuheng Chen, Charles A. Guo, Jinshan Pan, Jinhui Tang, and Dacheng Tao | (Reference translation) How to effectively explore semantic features is vital for low-light image enhancement (LLE).
To this end, we develop a simple and effective semantic-aware LLE network (SLLEN) that combines an LLE main-network (LLEmN) with an SS auxiliary-network (SSaN).
SSaN is designed to play the SS role, providing the HSF and IEF.
Comparisons between the proposed SLLEN and other state-of-the-art techniques show SLLEN's superiority in LLE quality. How to effectively explore semantic features is vital for low-light image enhancement (LLE). Existing methods usually utilize semantic features drawn only from the output produced by a high-level semantic segmentation (SS) network. However, if the output is not accurately estimated, it affects the high-level semantic feature (HSF) extraction, which in turn interferes with LLE. To this end, we develop a simple and effective semantic-aware LLE network (SLLEN) composed of an LLE main-network (LLEmN) and an SS auxiliary-network (SSaN). In SLLEN, LLEmN integrates the random intermediate embedding feature (IEF), i.e., the information extracted from the intermediate layer of SSaN, together with the HSF into a unified framework for better LLE. SSaN is designed to act in the SS role to provide the HSF and IEF. Moreover, thanks to a shared encoder between LLEmN and SSaN, we further propose an alternating training mechanism to facilitate the collaboration between them. Unlike currently available approaches, the proposed SLLEN is able to fully leverage the semantic information, e.g., IEF, HSF, and SS dataset, to assist LLE, thereby leading to a more promising enhancement performance. Comparisons between the proposed SLLEN and other state-of-the-art techniques demonstrate the superiority of SLLEN with respect to LLE quality over all the comparable alternatives. | Translated: 2023-05-17 00:14:53 Published: 2023-05-15 |
# Environmental Sensor Placement with Convolutional Gaussian Neural Processes ( http://arxiv.org/abs/2211.10381v5 ) License: see linked page | Tom R. Andersson, Wessel P. Bruinsma, Stratis Markou, James Requeima, Alejandro Coca-Castro, Anna Vaughan, Anna-Louise Ellis, Matthew A. Lazzara, Dani Jones, J. Scott Hosking, Richard E. Turner | (Reference translation) Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change.
Gaussian process (GP) models are widely used for this purpose, but they struggle to capture complex non-stationary behaviour and to scale to large datasets.
This paper addresses these issues using a convolutional Gaussian neural process (ConvGNP).
We contrast our approach with physics-based sensor placement methods and propose future steps towards an operational sensor placement recommendation system.
Our work could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality. Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change. However, it is challenging to place sensors in a way that maximises the informativeness of their measurements, particularly in remote regions like Antarctica. Probabilistic machine learning models can suggest informative sensor placements by finding sites that maximally reduce prediction uncertainty. Gaussian process (GP) models are widely used for this purpose, but they struggle with capturing complex non-stationary behaviour and scaling to large datasets. This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues. A ConvGNP uses neural networks to parameterise a joint Gaussian distribution at arbitrary target locations, enabling flexibility and scalability. Using simulated surface air temperature anomaly over Antarctica as training data, the ConvGNP learns spatial and seasonal non-stationarities, outperforming a non-stationary GP baseline. In a simulated sensor placement experiment, the ConvGNP better predicts the performance boost obtained from new observations than GP baselines, leading to more informative sensor placements. We contrast our approach with physics-based sensor placement methods and propose future steps towards an operational sensor placement recommendation system. Our work could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality. | Translated: 2023-05-17 00:14:16 Published: 2023-05-15 |
# Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models ( http://arxiv.org/abs/2212.04831v2 ) License: see linked page | Huajian Fang and Timo Gerkmann | (Reference translation) Single-channel deep speech enhancement approaches often estimate a single multiplicative mask to extract clean speech without a measure of its accuracy.
Experimental results on different datasets show that the proposed algorithm effectively captures predictive uncertainty, and that combining powerful statistical models with deep learning also delivers superior speech enhancement performance. Single-channel deep speech enhancement approaches often estimate a single multiplicative mask to extract clean speech without a measure of its accuracy. Instead, in this work, we propose to quantify the uncertainty associated with clean speech estimates in neural network-based speech enhancement. Predictive uncertainty is typically categorized into aleatoric uncertainty and epistemic uncertainty. The former accounts for the inherent uncertainty in data and the latter corresponds to the model uncertainty. Aiming for robust clean speech estimation and efficient predictive uncertainty quantification, we propose to integrate statistical complex Gaussian mixture models (CGMMs) into a deep speech enhancement framework. More specifically, we model the dependency between input and output stochastically by means of a conditional probability density and train a neural network to map the noisy input to the full posterior distribution of clean speech, modeled as a mixture of multiple complex Gaussian components. Experimental results on different datasets show that the proposed algorithm effectively captures predictive uncertainty and that combining powerful statistical models and deep learning also delivers a superior speech enhancement performance. | Translated: 2023-05-17 00:06:46 Published: 2023-05-15 |
# MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis ( http://arxiv.org/abs/2212.04495v2 ) License: see linked page | Rishabh Dabral and Muhammad Hamza Mughal and Vladislav Golyanik and Christian Theobalt | (Reference translation) Conventional methods for human motion synthesis are either deterministic or struggle with the trade-off between motion diversity and motion quality.
We also present ways to introduce well-known kinematic losses for motion plausibility within the motion diffusion framework through a scheduled weighting strategy.
We urge the reader to watch our supplementary video and visit https://vcai.mpi-inf.mpg.de/projects/MoFusion. Conventional methods for human motion synthesis are either deterministic or struggle with the trade-off between motion diversity and motion quality. In response to these limitations, we introduce MoFusion, i.e., a new denoising-diffusion-based framework for high-quality conditional human motion synthesis that can generate long, temporally plausible, and semantically accurate motions based on a range of conditioning contexts (such as music and text). We also present ways to introduce well-known kinematic losses for motion plausibility within the motion diffusion framework through our scheduled weighting strategy. The learned latent space can be used for several interactive motion editing applications -- like inbetweening, seed conditioning, and text-based editing -- thus, providing crucial abilities for virtual character animation and robotics. Through comprehensive quantitative evaluations and a perceptual user study, we demonstrate the effectiveness of MoFusion compared to the state of the art on established benchmarks in the literature. We urge the reader to watch our supplementary video and visit https://vcai.mpi-inf.mpg.de/projects/MoFusion. | Translated: 2023-05-17 00:06:26 Published: 2023-05-15 |
# Statistical mechanics of continual learning: variational principle and mean-field potential ( http://arxiv.org/abs/2212.02846v3 ) License: see linked page | Chan Li and Zhenye Huang and Wenxuan Zou and Haiping Huang | (Reference translation) An obstacle to artificial general intelligence is set by the continual learning of multiple tasks of different nature.
A variational Bayesian learning setting is thus proposed, in which the neural network is trained in a field space rather than in the discrete-weight space where gradients are ill-defined, and in which weight uncertainty is naturally incorporated and modulates the synaptic resources among tasks.
Our proposed principled framework also connects to elastic weight consolidation and to neuroscience-inspired metaplasticity, providing a theory-grounded method for real-world multi-task learning with deep networks. An obstacle to artificial general intelligence is set by the continual learning of multiple tasks of different nature. Recently, various heuristic tricks, both from machine learning and from neuroscience angles, were proposed, but they lack a unified theoretical grounding. Here, we focus on the continual learning in single-layered and multi-layered neural networks of binary weights. A variational Bayesian learning setting is thus proposed, where the neural network is trained in a field-space, rather than the gradient-ill-defined discrete-weight space, and furthermore, the weight uncertainty is naturally incorporated, and modulates the synaptic resources among tasks. From a physics perspective, we translate the variational continual learning into the Franz-Parisi thermodynamic potential framework, where the previous task knowledge acts as a prior and a reference as well. We thus interpret the continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be analytically studied with mean-field order parameters, whose predictions coincide with the numerical experiments using stochastic gradient descent methods. Based on the variational principle and Gaussian field approximation of internal preactivations in hidden layers, we also derive the learning algorithm considering weight uncertainty, which outperforms the current metaplasticity algorithm in continually learning multiple tasks. Our proposed principled frameworks also connect to elastic weight consolidation, and neuroscience inspired metaplasticity, providing a theory-grounded method for the real-world multi-task learning with deep networks. | Translated: 2023-05-17 00:05:34 Published: 2023-05-15 |
# Curriculum Learning for Relative Overgeneralization ( http://arxiv.org/abs/2212.02733v2 ) License: see linked page | Lin Shi and Bei Peng | (Reference translation) In multi-agent reinforcement learning (MARL), many popular methods such as VDN and QMIX are susceptible to a critical multi-agent pathology known as relative overgeneralization (RO), which arises when the utility of the optimal joint action falls below that of a sub-optimal joint action in cooperative tasks.
However, our experimental results show that they can still fail to solve cooperative tasks that exhibit strong RO.
We demonstrate that, when applied to QMIX, CURO overcomes the severe RO problem and significantly improves performance, yielding state-of-the-art results in a variety of cooperative multi-agent tasks, including the StarCraft II micromanagement benchmarks. In multi-agent reinforcement learning (MARL), many popular methods, such as VDN and QMIX, are susceptible to a critical multi-agent pathology known as relative overgeneralization (RO), which arises when the optimal joint action's utility falls below that of a sub-optimal joint action in cooperative tasks. RO can cause the agents to get stuck in local optima or fail to solve cooperative tasks that require significant coordination between agents within a given timestep. Recent value-based MARL algorithms such as QPLEX and WQMIX can overcome RO to some extent. However, our experimental results show that they can still fail to solve cooperative tasks that exhibit strong RO. In this work, we propose a novel approach called curriculum learning for relative overgeneralization (CURO) to better overcome RO. To solve a target task that exhibits strong RO, in CURO, we first fine-tune the reward function of the target task to generate source tasks that are tailored to the current ability of the learning agent and train the agent on these source tasks first. Then, to effectively transfer the knowledge acquired in one task to the next, we use a transfer learning method that combines value function transfer with buffer transfer, which enables more efficient exploration in the target task. We demonstrate that, when applied to QMIX, CURO overcomes the severe RO problem and significantly improves performance, yielding state-of-the-art results in a variety of cooperative multi-agent tasks, including the challenging StarCraft II micromanagement benchmarks. | Translated: 2023-05-17 00:04:40 Published: 2023-05-15 |
# Logic and Commonsense-Guided Temporal Knowledge Graph Completion ( http://arxiv.org/abs/2211.16865v2 ) License: see linked page | Guanglin Niu, Bo Li | (Reference translation) A temporal knowledge graph (TKG) stores events derived from data involving time.
Furthermore, we can accurately evaluate the plausibility of events via auxiliary commonsense knowledge.
The source code and datasets of this paper are available at https://github.com/ngl567/LCGE. A temporal knowledge graph (TKG) stores the events derived from the data involving time. Predicting events is extremely challenging due to the time-sensitive property of events. Besides, the previous TKG completion (TKGC) approaches cannot represent both the timeliness and the causality properties of events, simultaneously. To address these challenges, we propose a Logic and Commonsense-Guided Embedding model (LCGE) to jointly learn the time-sensitive representation involving timeliness and causality of events, together with the time-independent representation of events from the perspective of commonsense. Specifically, we design a temporal rule learning algorithm to construct a rule-guided predicate embedding regularization strategy for learning the causality among events. Furthermore, we could accurately evaluate the plausibility of events via auxiliary commonsense knowledge. The experimental results of TKGC task illustrate the significant performance improvements of our model compared with the existing approaches. More interestingly, our model is able to provide the explainability of the predicted results in the view of causal inference. The source code and datasets of this paper are available at https://github.com/ngl567/LCGE. | Translated: 2023-05-17 00:03:19 Published: 2023-05-15 |
# On Noisy Evaluation in Federated Hyperparameter Tuning ( http://arxiv.org/abs/2212.08930v4 ) License: see linked page | Kevin Kuo, Pratiksha Thaker, Mikhail Khodak, John Nguyen, Daniel Jiang, Ameet Talwalkar, Virginia Smith | (Reference translation) Hyperparameter tuning is critical to the success of federated learning applications.
Our work establishes general challenges, baselines, and best practices for future work in federated hyperparameter tuning. Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the effect of noisy evaluation in federated hyperparameter tuning. We first identify and rigorously explore key sources of noise, including client subsampling, data and systems heterogeneity, and data privacy. Surprisingly, our results indicate that even small amounts of noise can significantly impact tuning methods, reducing the performance of state-of-the-art approaches to that of naive baselines. To address noisy evaluation in such scenarios, we propose a simple and effective approach that leverages public proxy data to boost the evaluation signal. Our work establishes general challenges, baselines, and best practices for future work in federated hyperparameter tuning. | Translated: 2023-05-16 23:56:38 Published: 2023-05-15 |
# Causes and Cures for Interference in Multilingual Translation ( http://arxiv.org/abs/2212.07530v2 ) License: see linked page | Uri Shaham and Maha Elbayad and Vedanuj Goswami and Omer Levy and Shruti Bhosale | (Reference translation) Multilingual machine translation models can benefit from synergy between different language pairs, but also suffer from interference.
We also show that tuning the sampling temperature to control the proportion of each language pair in the data is key to effectively balancing the amount of interference between low- and high-resource language pairs. Multilingual machine translation models can benefit from synergy between different language pairs, but also suffer from interference. While there is a growing number of sophisticated methods that aim to eliminate interference, our understanding of interference as a phenomenon is still limited. This work identifies the main factors that contribute to interference in multilingual machine translation. Through systematic experimentation, we find that interference (or synergy) is primarily determined by model size, data size, and the proportion of each language pair within the total dataset. We observe that substantial interference occurs mainly when the model is very small with respect to the available training data, and that using standard transformer configurations with less than one billion parameters largely alleviates interference and promotes synergy. Moreover, we show that tuning the sampling temperature to control the proportion of each language pair in the data is key to balancing the amount of interference between low and high resource language pairs effectively, and can lead to superior performance overall. | Translated: 2023-05-16 23:55:48 Published: 2023-05-15 |
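The abstract above does not spell out how the sampling temperature enters the data mix. A common formulation in multilingual training, assumed here, samples language pair i with probability proportional to (n_i / N)^(1/T), where n_i is the size of that pair's corpus and N is the total. The function name and the corpus sizes below are hypothetical and only illustrate the effect of T.

```python
def sampling_probs(pair_sizes, temperature):
    """Return the probability of sampling each language pair.

    p_i is proportional to (n_i / N) ** (1 / T): T = 1 keeps the natural
    data proportions, larger T flattens them towards uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    total = sum(pair_sizes.values())
    weights = {pair: (n / total) ** (1.0 / temperature)
               for pair, n in pair_sizes.items()}
    norm = sum(weights.values())
    return {pair: w / norm for pair, w in weights.items()}


# Hypothetical corpus sizes (sentence pairs), for illustration only.
sizes = {"en-fr": 9_000_000, "en-ha": 1_000_000}
for t in (1.0, 5.0):
    print(t, sampling_probs(sizes, t))
```

Raising T gives the low-resource pair a larger share of each batch, which is the knob the abstract describes tuning.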
# AI Ethics on Blockchain: Topic Analysis on Twitter Data for Blockchain Security ( http://arxiv.org/abs/2212.06951v3 ) License: see linked page | Yihang Fu, Zesen Zhuang, Luyao Zhang | (Reference translation) Blockchain has empowered computer systems to be more secure using a distributed network.
Miners are able to reorder transactions to generate profits, the so-called miner extractable value (MEV).
Our results show that the tweets discussed profound topics of ethical concern, including security, equity, emotional sentiments, and the desire for solutions to MEV.
Our study contributes to the literature at the interface of blockchain security, MEV solutions, and AI ethics. Blockchain has empowered computer systems to be more secure using a distributed network. However, the current blockchain design suffers from fairness issues in transaction ordering. Miners are able to reorder transactions to generate profits, the so-called miner extractable value (MEV). Existing research recognizes MEV as a severe security issue and proposes potential solutions, including the prominent Flashbots. However, previous studies have mostly analyzed blockchain data, which might not capture the impacts of MEV in a much broader AI society. Thus, in this research, we applied natural language processing (NLP) methods to comprehensively analyze topics in tweets on MEV. We collected more than 20000 tweets with MEV and Flashbots hashtags and analyzed their topics. Our results show that the tweets discussed profound topics of ethical concern, including security, equity, emotional sentiments, and the desire for solutions to MEV. We also identify the co-movements of MEV activities on blockchain and social media platforms. Our study contributes to the literature at the interface of blockchain security, MEV solutions, and AI ethics. | Translated: 2023-05-16 23:55:13 Published: 2023-05-15 |
# Measuring a Priori Voting Power -- Taking Delegations Seriously ( http://arxiv.org/abs/2301.02462v4 ) License: see linked page | Rachael Colley, Théo Delemazure, Hugo Gilbert | (Reference translation) We introduce new power indices to measure the a priori voting power of voters in liquid democracy elections.
We show that computing the criticality of a voter is #P-hard even when voting weights are polynomially bounded.
We highlight their theoretical properties and provide numerical results showing how restricting the possible delegations can alter voters' voting power. We introduce new power indices to measure the a priori voting power of voters in liquid democracy elections where an underlying network restricts delegations. We argue that our power indices are natural extensions of the standard Penrose-Banzhaf index in simple voting games. We show that computing the criticality of a voter is #P-hard even when voting weights are polynomially-bounded in the size of the instance. However, for specific settings, such as when the underlying network is a bipartite or complete graph, recursive formulas can compute these indices for weighted voting games in pseudo-polynomial time. We highlight their theoretical properties and provide numerical results to illustrate how restricting the possible delegations can alter voters' voting power. | Translated: 2023-05-16 23:46:55 Published: 2023-05-15 |
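For orientation, the standard Penrose-Banzhaf index that the abstract above takes as its starting point can be computed by brute force for a small weighted voting game. The sketch below covers only that classical index, with no delegations and no underlying network, and it enumerates all coalitions, so it is exponential rather than the pseudo-polynomial recursion the abstract mentions. The function name and the example weights are illustrative assumptions.

```python
from itertools import combinations


def banzhaf_indices(weights, quota):
    """Penrose-Banzhaf index of each voter in a weighted voting game.

    A voter is critical for a coalition of the other voters if the coalition
    loses without them and wins once they join. The index is the fraction of
    the 2**(n-1) coalitions of other voters for which this happens.
    """
    n = len(weights)
    indices = []
    for i in range(n):
        others = [w for j, w in enumerate(weights) if j != i]
        critical = 0
        for size in range(n):  # coalition sizes 0 .. n-1
            for coalition in combinations(others, size):
                total = sum(coalition)
                if total < quota <= total + weights[i]:
                    critical += 1
        indices.append(critical / 2 ** (n - 1))
    return indices


print(banzhaf_indices([4, 2, 1], quota=5))  # [0.75, 0.25, 0.25]
```

In this toy game the voter with weight 4 is critical in three of the four coalitions of the other voters, while each smaller voter is critical in only one.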
# Measuring out quasi-local integrals of motion from entanglement ( http://arxiv.org/abs/2301.01787v2 ) License: see linked page | B. Lu, C. Bertoni, S. J. Thomson, J. Eisert | (Reference translation) Quasi-local integrals of motion are a key concept underpinning the modern understanding of many-body localisation, an intriguing phenomenon in which interactions and disorder come together.
We demonstrate that this entanglement gives rise to a well-defined length scale that can be measured in experiments. Quasi-local integrals of motion are a key concept underpinning the modern understanding of many-body localisation, an intriguing phenomenon in which interactions and disorder come together. Despite the existence of several numerical ways to compute them - and astoundingly in the light of the observation that much of the phenomenology of many properties can be derived from them - it is not obvious how to directly measure aspects of them in real quantum simulations; in fact, the smoking gun of their experimental observation is arguably still missing. In this work, we propose a way to extract the real-space properties of such quasi-local integrals of motion based on a spatially-resolved entanglement probe able to distinguish Anderson from many-body localisation from non-equilibrium dynamics. We complement these findings with a new rigorous entanglement bound and compute the relevant quantities using tensor networks. We demonstrate that the entanglement gives rise to a well-defined length scale that can be measured in experiments. | Translated: 2023-05-16 23:46:15 Published: 2023-05-15 |
# Parallel Context Windows for Large Language Models ( http://arxiv.org/abs/2212.10947v2 ) License: see linked page | Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham | (Reference translation) When applied to processing long text, Large Language Models (LLMs) are limited by their context window.
We therefore present Parallel Context Windows (PCW), a method that alleviates the context window restriction for off-the-shelf LLMs.
Our results highlight Parallel Context Windows as a promising method for applying off-the-shelf LLMs in a range of settings that require long text sequences.
We make our code publicly available at https://github.com/ai21labs/parallel-context-windows. When applied for processing long text, Large Language Models (LLMs) are limited by their context window. Existing efforts to address this limitation involve training specialized architectures, and cannot be easily applied to off-the-shelf LLMs. We present Parallel Context Windows (PCW), a method that alleviates the context window restriction for any off-the-shelf LLM without further training. The key to the approach is to carve a long context into chunks ("windows"), restrict the attention mechanism to apply only within each window, and re-use the positional embeddings across the windows. Our main results test the PCW approach on in-context learning with models that range in size between 750 million and 178 billion parameters, and show substantial improvements for tasks with diverse input and output spaces. We show additional benefits in other settings where long context windows may be beneficial: multi-hop questions and retrieval-augmented question answering with multiple retrieved documents. Our results highlight Parallel Context Windows as a promising method for applying off-the-shelf LLMs in a range of settings that require long text sequences. We make our code publicly available at https://github.com/ai21labs/parallel-context-windows. | Translated: 2023-05-16 23:45:14 Published: 2023-05-15 |
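The two ingredients named in the abstract above, attention restricted to each window and positional embeddings re-used across windows, can be sketched as a position-id list and a boolean attention mask. This is a minimal illustration for the context tokens only; the causal (decoder-style) masking and the function name are assumptions, and the sketch is not taken from the authors' repository.

```python
def pcw_layout(window_lengths):
    """Position ids and attention mask for context tokens split into windows."""
    position_ids, window_ids = [], []
    for w, length in enumerate(window_lengths):
        position_ids.extend(range(length))  # positions restart in every window
        window_ids.extend([w] * length)
    n = len(window_ids)
    # mask[i][j] is True when token i may attend to token j:
    # same window only, and (assuming a causal decoder) j must not follow i.
    mask = [[window_ids[i] == window_ids[j] and j <= i for j in range(n)]
            for i in range(n)]
    return position_ids, mask


positions, mask = pcw_layout([3, 2])
print(positions)  # [0, 1, 2, 0, 1]
```

Because positions restart in each window, no position id exceeds the longest window, which is how a model with a fixed positional range can take in more text than a single window holds.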
# Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers ( http://arxiv.org/abs/2212.10559v3 ) License: see linked page | Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei | (Reference translation) Large pretrained language models have shown surprising in-context learning (ICL) ability.
The code is available at https://aka.ms/icl. Large pretrained language models have shown surprising in-context learning (ICL) ability. With a few demonstration input-label pairs, they can predict the label for an unseen input without parameter updates. Despite the great success in performance, its working mechanism still remains an open question. In this paper, we explain language models as meta-optimizers and understand in-context learning as implicit finetuning. Theoretically, we find that Transformer attention has a dual form of gradient descent. On top of it, we understand ICL as follows: GPT first produces meta-gradients according to the demonstration examples, and then these meta-gradients are applied to the original GPT to build an ICL model. We comprehensively compare the behaviors of in-context learning and explicit finetuning on real tasks to provide empirical evidence that supports our understanding. Experimental results show that in-context learning behaves similarly to explicit finetuning from multiple perspectives. Inspired by the dual form between Transformer attention and gradient descent, we design a momentum-based attention by analogy with gradient descent with momentum. The improved performance over vanilla attention further supports our understanding from another perspective, and more importantly, shows the potential to utilize our understanding for future model design. The code is available at https://aka.ms/icl. | Translated: 2023-05-16 23:44:29 Published: 2023-05-15 |
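The "dual form" claimed in the abstract above can be checked numerically in its simplest setting: a linear layer whose weights have received a sum of outer-product updates gives the same output as the original weights plus unnormalised linear attention over the training examples. The sketch assumes softmax-free linear attention, and all numbers are hypothetical toy values; it illustrates the identity and does not reproduce the paper's experiments.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy values: initial weights, training inputs x_i, error signals e_i.
W0 = [[1.0, 0.0], [0.0, 1.0]]
inputs = [[1.0, 2.0], [0.5, -1.0]]
errors = [[0.1, -0.2], [0.3, 0.4]]
query = [2.0, 1.0]

# View 1: accumulate the update W = W0 + sum_i outer(e_i, x_i), then apply W.
W = [row[:] for row in W0]
for e, x in zip(errors, inputs):
    for r in range(2):
        for c in range(2):
            W[r][c] += e[r] * x[c]
updated_output = matvec(W, query)

# View 2: keep W0 fixed and add unnormalised linear attention over the
# training examples, with keys x_i, values e_i and scores x_i . query.
attention_output = matvec(W0, query)
for e, x in zip(errors, inputs):
    score = dot(x, query)
    attention_output = [o + score * ev for o, ev in zip(attention_output, e)]

print(updated_output, attention_output)
```

Both views produce the same vector up to floating-point rounding, since (sum_i e_i x_i^T) q equals sum_i e_i (x_i . q).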
# Defending Against Misinformation Attacks in Open-Domain Question Answering ( http://arxiv.org/abs/2212.10002v2 ) License: see linked page | Orion Weller, Aleem Khan, Nathaniel Weir, Dawn Lawrie, Benjamin Van Durme | (Reference translation) Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems.
We integrate these new passages into the model through the design of a novel confidence method that compares the predicted answer to its appearance in the retrieved contexts (which we call Confidence from Answer Redundancy, CAR).
Together, these methods give a simple but effective way to defend against poisoning attacks, providing gains of nearly 20% exact match across varying levels of data poisoning and knowledge conflicts. Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems. However, little to no work has proposed methods to defend against these attacks. To do so, we rely on the intuition that redundant information often exists in large corpora. To find it, we introduce a method that uses query augmentation to search for a diverse set of passages that could answer the original question but are less likely to have been poisoned. We integrate these new passages into the model through the design of a novel confidence method, comparing the predicted answer to its appearance in the retrieved contexts (what we call Confidence from Answer Redundancy, i.e. CAR). Together these methods allow for a simple but effective way to defend against poisoning attacks that provides gains of nearly 20% exact match across varying levels of data poisoning/knowledge conflicts. | Translated: 2023-05-16 23:44:08 Published: 2023-05-15 |
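The abstract above describes the confidence signal only as comparing the predicted answer with its appearance in the retrieved contexts. One simple way to read that, assumed here, is to count how many retrieved passages contain the answer string and to trust the prediction when the count reaches a threshold. The function names, the substring matching, and the `min_support` threshold are hypothetical choices, not the authors' implementation.

```python
def answer_redundancy(answer, passages):
    """Number of retrieved passages that contain the predicted answer."""
    needle = answer.strip().lower()
    return sum(needle in passage.lower() for passage in passages)


def is_confident(answer, passages, min_support=2):
    # min_support is a hypothetical threshold, not a value from the paper.
    return answer_redundancy(answer, passages) >= min_support


passages = [
    "Marie Curie won the Nobel Prize in Physics in 1903.",
    "The 1903 physics prize went to Becquerel and the Curies.",
    "Some sources claim the prize was awarded in 1911.",
]
print(answer_redundancy("1903", passages), is_confident("1903", passages))  # 2 True
```

A poisoned passage that pushes a different answer would find little support among independently retrieved passages, which is the intuition about redundancy the abstract relies on.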
# Implementing a Hybrid Quantum-Classical Neural Network by Utilizing a Variational Quantum Circuit for Detection of Dementia ( http://arxiv.org/abs/2301.12505v2 ) License: see linked page | Ryan Kim | (Reference translation) Magnetic resonance imaging (MRI) is a common technique to scan brains for strokes, tumors, and other abnormalities that cause forms of dementia.
Quantum computing applications This proposed neural network architecture uses a fully connected (FC) layer, which reduces the number of features used to obtain an expectation value from a variational quantum circuit (VQC).
The VQC created in this study uses a layer of Hadamard gates, Rotation-Y gates parameterized by tanh(intensity) * (pi/2) of a pixel, controlled-not (CNOT) gates, and measurement operators to obtain the expectation values.
Furthermore, the proposed architecture is generally flexible and can be used for transfer-learning tasks, saving time and resources. Magnetic resonance imaging (MRI) is a common technique to scan brains for strokes, tumors, and other abnormalities that cause forms of dementia. However, correctly diagnosing forms of dementia from MRIs is difficult, as nearly 1 in 3 patients with Alzheimer's were misdiagnosed in 2019, an issue neural networks can rectify. Quantum computing applications This proposed novel neural network architecture uses a fully-connected (FC) layer, which reduces the number of features to obtain an expectation value by implementing a variational quantum circuit (VQC). The VQC created in this study utilizes a layer of Hadamard gates, Rotation-Y gates that are parameterized by tanh(intensity) * (pi/2) of a pixel, controlled-not (CNOT) gates, and measurement operators to obtain the expected values. This study found that the proposed hybrid quantum-classical convolutional neural network (QCCNN) provided 97.5% and 95.1% testing and validation accuracies, respectively, which was considerably higher than the classical neural network (CNN) testing and validation accuracies of 91.5% and 89.2%. Additionally, using a testing set of 100 normal and 100 dementia MRI images, the QCCNN detected normal and demented images correctly 95% and 98% of the time, compared to the CNN accuracies of 89% and 91%. With hospitals like Massachusetts General Hospital beginning to adopt machine learning applications for biomedical image detection, this proposed architecture would improve accuracies and potentially save more lives. Furthermore, the proposed architecture is generally flexible, and can be used for transfer-learning tasks, saving time and resources. | Translated: 2023-05-16 23:38:45 Published: 2023-05-15 |
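The encoding step described in the abstract above, a Hadamard gate followed by a Rotation-Y gate with angle tanh(intensity) * (pi/2), can be simulated for a single qubit with plain arithmetic. The sketch below is a deliberate simplification: it uses one qubit, omits the CNOT entangling gates, and assumes a Pauli-Z measurement, none of which is confirmed by the abstract. The function name is hypothetical.

```python
import math


def encode_and_measure(intensity):
    """<Z> of one qubit after H and RY(theta), theta = tanh(intensity) * pi / 2."""
    theta = math.tanh(intensity) * math.pi / 2
    # Amplitudes after the Hadamard gate applied to |0>.
    a0 = a1 = 1 / math.sqrt(2)
    # RY(theta) = [[cos(t/2), -sin(t/2)], [sin(t/2), cos(t/2)]]
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    b0 = c * a0 - s * a1
    b1 = s * a0 + c * a1
    return b0 ** 2 - b1 ** 2  # probability of |0> minus probability of |1>


for pixel in (0.0, 0.5, 2.0):
    print(pixel, encode_and_measure(pixel))
```

For this one-qubit circuit the expectation value reduces to -sin(theta), so the tanh scaling keeps theta within [-pi/2, pi/2] and maps every pixel intensity to a distinct value.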
# Context-specific kernel-based hidden Markov model for time series analysis ( http://arxiv.org/abs/2301.09870v2 ) License: see linked page | Carlos Puerto-Santana, Concha Bielza, Pedro Larrañaga, Gustav Eje Henter | (Reference translation) Traditional hidden Markov models are a useful tool for understanding and modeling stochastic dynamic data; for non-Gaussian data, models such as mixture-of-Gaussians hidden Markov models are used.
From the results, the benefits in likelihood and classification accuracy of the proposed model are quantified and analyzed. Traditional hidden Markov models have been a useful tool to understand and model stochastic dynamic data; in the case of non-Gaussian data, models such as mixture of Gaussian hidden Markov models can be used. However, these suffer from the computation of precision matrices and have a lot of unnecessary parameters. As a consequence, such models often perform better when it is assumed that all variables are independent, a hypothesis that may be unrealistic. Hidden Markov models based on kernel density estimation are also capable of modeling non-Gaussian data, but they assume independence between variables. In this article, we introduce a new hidden Markov model based on kernel density estimation, which is capable of capturing kernel dependencies using context-specific Bayesian networks. The proposed model is described, together with a learning algorithm based on the expectation-maximization algorithm. Additionally, the model is compared to related HMMs on synthetic and real data. From the results, the benefits in likelihood and classification accuracy from the proposed model are quantified and analyzed. | Translated: 2023-05-16 23:37:49 Published: 2023-05-15 |
# Scalable Quantum Error Correction for Surface Codes using FPGA ( http://arxiv.org/abs/2301.08419v2 ) License: see linked page | Namitha Liyanage, Yue Wu, Alexander Deters and Lin Zhong | (Reference translation) A fault-tolerant quantum computer must decode and correct errors faster than they appear.
The Union-Find (UF) decoder is promising, with an average time complexity slightly higher than $O(d^3)$.
We are able to implement $d$ up to 21 with a Xilinx VCU129 FPGA, for which the average decoding time is 11.5 ns per measurement round under phenomenological noise of 0.1%, significantly faster than any existing decoder implementation.
Since the decoding time per measurement round of Helios decreases with $d$, Helios can decode a surface code of arbitrarily large $d$ without a growing backlog. A fault-tolerant quantum computer must decode and correct errors faster than they appear. The faster errors can be corrected, the more time the computer can do useful work. The Union-Find (UF) decoder is promising with an average time complexity slightly higher than $O(d^3)$. We report a distributed version of the UF decoder that exploits parallel computing resources for further speedup. Using an FPGA-based implementation, we empirically show that this distributed UF decoder has a sublinear average time complexity with regard to $d$, given $O(d^3)$ parallel computing resources. The decoding time per measurement round decreases as $d$ increases, a first for a quantum error decoder. The implementation employs a scalable architecture called Helios that organizes parallel computing resources into a hybrid tree-grid structure. We are able to implement $d$ up to 21 with a Xilinx VCU129 FPGA, for which an average decoding time is 11.5 ns per measurement round under phenomenological noise of 0.1%, significantly faster than any existing decoder implementation. Since the decoding time per measurement round of Helios decreases with $d$, Helios can decode a surface code of arbitrarily large $d$ without a growing backlog. | Translated: 2023-05-16 23:37:32 Published: 2023-05-15 |
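The Union-Find decoder mentioned in the abstract above is named after the disjoint-set data structure it uses to track and merge clusters. The sketch below shows only that generic data structure, with path halving and union by size. It is not the decoder itself, and it says nothing about the distributed Helios architecture or its FPGA implementation.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Path halving: point every other node on the path at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets containing a and b and return the new root."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger one
        self.size[ra] += self.size[rb]
        return ra


uf = UnionFind(5)
uf.union(0, 1)
uf.union(3, 4)
print(uf.find(0) == uf.find(1), uf.find(1) == uf.find(3))  # True False
```

With both optimisations each operation runs in near-constant amortised time, which is what makes the structure attractive for the cluster merging a decoder must do on every measurement round.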
# Masked Autoencoding Does Not Help Natural Language Supervision at Scale ( http://arxiv.org/abs/2301.07836v4 ) License: see linked page | Floris Weers, Vaishaal Shankar, Angelos Katharopoulos, Yinfei Yang, Tom Gunter | (Reference translation) Self-supervision and natural language supervision have emerged as two exciting ways to train general-purpose image encoders that excel at a variety of downstream tasks.
Our work provides much-needed clarity on the effectiveness (or lack thereof) of self-supervision for large-scale image-text training. Self supervision and natural language supervision have emerged as two exciting ways to train general purpose image encoders which excel at a variety of downstream tasks. Recent works such as M3AE and SLIP have suggested that these approaches can be effectively combined, but most notably their results use small pre-training datasets (<50M samples) and don't effectively reflect the large-scale regime (>100M examples) that is commonly used for these approaches. Here we investigate whether a similar approach can be effective when trained with a much larger amount of data. We find that a combination of two state of the art approaches: masked auto-encoders, MAE and contrastive language image pre-training, CLIP provides a benefit over CLIP when trained on a corpus of 11.3M image-text pairs, but little to no benefit (as evaluated on a suite of common vision tasks) over CLIP when trained on a large corpus of 1.4B images. Our work provides some much needed clarity into the effectiveness (or lack thereof) of self supervision for large-scale image-text training. | Translated: 2023-05-16 23:36:16 Published: 2023-05-15 |
# おもちゃモデルにおける創発的微分同相不変性 Emergent diffeomorphism invariance in toy models ( http://arxiv.org/abs/2301.04448v2 ) ライセンス: Link先を確認 | Hrvoje Nikolic | (参考訳) 半古典的および量子重力の概念上の困難は、古典的な一般相対性理論の微分同相不変性から生じる。
これらの困難に光を当てる動機付けとして, 1次元微分同相不変性,すなわち時間再パラメータ化不変性がエネルギー保存から古典レベルに出現する玩具モデルの研究を行った。
それでもこれらの問題は、不変性が古典的なレベルでのみ現れることを考慮すると容易に解決できるが、量子化する必要がある基本理論は微分同相不変ではない。 Conceptual difficulties in semiclassical and quantum gravity arise from diffeomorphism invariance of classical general relativity. With a motivation to shed some light on these difficulties, we study a class of toy models for which one-dimensional diffeomorphism invariance, namely time-reparametrization invariance, emerges at the classical level from energy conservation. An attempt to quantize the models while taking the invariance seriously leads to toy versions of the problem of time in quantum gravity, of the cosmological constant problem, and of the black hole firewall problem. Nevertheless, all these problems are easily resolved by taking into account that the invariance emerges only at the classical level, while the fundamental theory that needs to be quantized is not diffeomorphism invariant. | 翻訳日:2023-05-16 23:35:21 公開日:2023-05-15 |
# ガウス過程状態を持つ効率的なabイニティオ電子構造の枠組み A framework for efficient ab initio electronic structure with Gaussian Process States ( http://arxiv.org/abs/2302.01099v3 ) ライセンス: Link先を確認 | Yannic Rath and George H. Booth | (参考訳) 本稿では、量子多体状態の表現を現代機械学習にインスパイアされた現実的なフェルミオン系の効率的なシミュレーションのための一般的なフレームワークについて述べる。
機械学習における系統的改良可能なカーネルモデルにインスパイアされた最近導入されたansatzである「Gaussian Process State」の適用により、計算フォック空間の表現を定義するための異なる選択について論じる。
我々は、最大64個の電子を持つ系に対して、三次元水素中のモット転移の単純化されたモデルを含む競争精度を示すことができ、構成サンプルの適度な数であっても、同様のアプローチよりも大幅に改善されていることを示す。 We present a general framework for the efficient simulation of realistic fermionic systems with modern machine learning inspired representations of quantum many-body states, towards a universal tool for ab initio electronic structure. These machine learning inspired ansatzes have recently come to the fore in both a (first quantized) continuum and discrete Fock space representations, where however the inherent scaling of the latter approach for realistic interactions has so far limited practical applications. With application to the 'Gaussian Process State', a recently introduced ansatz inspired by systematically improvable kernel models in machine learning, we discuss different choices to define the representation of the computational Fock space. We show how local representations are particularly suited for stochastic sampling of expectation values, while also indicating a route to overcome the discrepancy in the scaling compared to continuum formulated models. We are able to show competitive accuracy for systems with up to 64 electrons, including a simplified (yet fully ab initio) model of the Mott transition in three-dimensional hydrogen, indicating a significant improvement over similar approaches, even for moderate numbers of configurational samples. | 翻訳日:2023-05-16 23:26:51 公開日:2023-05-15 |
# 予後関連因子の組織学的および臨床組織型Glioblastomaパターンの検出 Detecting Histologic & Clinical Glioblastoma Patterns of Prognostic Relevance ( http://arxiv.org/abs/2302.00669v2 ) ライセンス: Link先を確認 | Bhakti Baheti, Sunny Rai, Shubham Innani, Garv Mehdiratta, Sharath Chandra Guntuku, MacLean P. Nasrallah, Spyridon Bakas | (参考訳) グリオブラスト腫は中枢神経系で最も一般的で攻撃的な悪性成人腫瘍であり、グリム予後と異種形態および分子プロファイルがある。
XGBoost と SHapley Additive exPlanations (SHAP) を用いて、関連する臨床患者データの予後のさらなる妥当性を単独および統合的に評価する。
短いOSと長いOSに関連する腫瘍の形態と臨床パターンを特定することで、臨床神経病理学者は治療チームにさらなる関連する予後情報を提供し、グリオブラスト腫の理解と治療のための生物学的研究の道筋を示唆することができる。 Glioblastoma is the most common and aggressive malignant adult tumor of the central nervous system, with a grim prognosis and heterogeneous morphologic and molecular profiles. Since the current standard-of-care treatment was adopted 18 years ago, no substantial prognostic improvement has been observed. Accurate prediction of patient overall survival (OS) from histopathology whole slide images (WSI) integrated with clinical data using advanced computational methods could optimize clinical decision-making and patient management. Here, we focus on identifying prognostically relevant glioblastoma characteristics from H&E-stained WSI and clinical data relating to OS. The exact approach for WSI capitalizes on the comprehensive curation of apparent artifactual content and an interpretability mechanism via a weakly supervised attention-based multiple-instance learning algorithm that further utilizes clustering to constrain the search space. The automatically placed patterns of high diagnostic value classify each WSI as representative of short- or long-term survivors. Further assessment of the prognostic relevance of the associated clinical patient data is performed both in isolation and in an integrated manner, using XGBoost and SHapley Additive exPlanations (SHAP). Identifying tumor morphological and clinical patterns associated with short and long OS will enable the clinical neuropathologist to provide additional relevant prognostic information to the treating team and suggest avenues of biological investigation for understanding and potentially treating glioblastoma. | 翻訳日:2023-05-16 23:26:17 公開日:2023-05-15 |
# バイアスドプロンプトによる視覚言語モデルのデバイアス Debiasing Vision-Language Models via Biased Prompts ( http://arxiv.org/abs/2302.00070v2 ) ライセンス: Link先を確認 | Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka | (参考訳) 機械学習モデルは、トレーニングデータセットからバイアスを継承することが示されている。
提案するクローズドフォームソリューションにより,大規模パイプラインへの統合が容易になり,実験結果から,新たなデータやトレーニングを必要とせずに,識別的および生成的視覚言語モデルの社会的バイアスと散発的相関を効果的に低減できることが示された。 Machine learning models have been shown to inherit biases from their training datasets. This can be particularly problematic for vision-language foundation models trained on uncurated datasets scraped from the internet. The biases can be amplified and propagated to downstream applications like zero-shot classifiers and text-to-image generative models. In this study, we propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding. In particular, we show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models. The proposed closed-form solution enables easy integration into large-scale pipelines, and empirical results demonstrate that our approach effectively reduces social bias and spurious correlation in both discriminative and generative vision-language models without the need for additional data or training. | 翻訳日:2023-05-16 23:25:36 公開日:2023-05-15 |
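The abstract above describes debiasing by projecting biased directions out of the text embedding. As a rough illustration of that idea only, the sketch below applies a plain orthogonal projection to a toy vector; the calibrated projection matrix mentioned in the abstract is not reproduced, and the function name `project_out` and the example vectors are assumptions made for this sketch.

```python
def project_out(embedding, direction):
    """Remove the component of `embedding` that lies along `direction`."""
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0.0:
        raise ValueError("direction must be non-zero")
    # Scalar coefficient of the projection of embedding onto direction.
    coeff = sum(e * d for e, d in zip(embedding, direction)) / norm_sq
    return [e - coeff * d for e, d in zip(embedding, direction)]


# Toy example: the second coordinate plays the role of a "biased" direction.
text_embedding = [3.0, 4.0, 1.0]
bias_direction = [0.0, 2.0, 0.0]
debiased = project_out(text_embedding, bias_direction)
```

After the projection, the result is orthogonal to `bias_direction` while the other coordinates are untouched, which is what makes the operation closed-form and free of any retraining.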
# Zero3D:Semantic-Driven Multi-Category 3D Shape Generation Zero3D: Semantic-Driven Multi-Category 3D Shape Generation ( http://arxiv.org/abs/2301.13591v3 ) ライセンス: Link先を確認 | Bo Han, Yitong Fu, Yixuan Shen | (参考訳) 意味駆動型3d形状生成は、テキストに基づく3dオブジェクトの生成を目的としている。
1) 大規模ペアデータ不足の問題を緩和するために, 事前学習したCLIPモデルに基づいてテキスト, 2次元画像, 3次元形状をブリッジし,
2) マルチカテゴリの3次元形状特徴を得るため,CLIP埋め込みに条件付き3次元形状ベクトルを生成する条件フローモデルを適用した。
3) マルチカテゴリ3次元形状を生成するために, 多カテゴリ形状ベクトルに条件付き隠れ層拡散モデルを用い, トレーニング時間とメモリ消費を大幅に削減する。 Semantic-driven 3D shape generation aims to generate 3D objects conditioned on text. Previous works face problems with single-category generation, low-frequency 3D details, and the need for a large amount of paired data for training. To tackle these challenges, we propose a multi-category conditional diffusion model. Specifically, 1) to alleviate the lack of large-scale paired data, we bridge text, 2D images and 3D shapes based on the pre-trained CLIP model; 2) to obtain the multi-category 3D shape feature, we apply a conditional flow model to generate a 3D shape vector conditioned on the CLIP embedding; and 3) to generate multi-category 3D shapes, we employ a hidden-layer diffusion model conditioned on the multi-category shape vector, which greatly reduces the training time and memory consumption. | 翻訳日:2023-05-16 23:25:22 公開日:2023-05-15 |
# 任意決定は個人差分訓練の隠れたコストである Arbitrary Decisions are a Hidden Cost of Differentially Private Training ( http://arxiv.org/abs/2302.14517v2 ) ライセンス: Link先を確認 | Bogdan Kulynych, Hsiang Hsu, Carmela Troncoso, Flavio P. Calmon | (参考訳) プライバシ保存機械学習で使用されるメカニズムは、モデルトレーニング中に差分プライバシー(DP)を保証することを目的としていることが多い。
我々は,個人レベルのアプリケーションに適用する前に,dp補償アルゴリズムの予測多重性を監査するべきであると結論づけた。 Mechanisms used in privacy-preserving machine learning often aim to guarantee differential privacy (DP) during model training. Practical DP-ensuring training methods use randomization when fitting model parameters to privacy-sensitive data (e.g., adding Gaussian noise to clipped gradients). We demonstrate that such randomization incurs predictive multiplicity: for a given input example, the output predicted by equally-private models depends on the randomness used in training. Thus, for a given input, the predicted output can vary drastically if a model is re-trained, even if the same training dataset is used. The predictive-multiplicity cost of DP training has not been studied, and is currently neither audited for nor communicated to model designers and stakeholders. We derive a bound on the number of re-trainings required to estimate predictive multiplicity reliably. We analyze--both theoretically and through extensive experiments--the predictive-multiplicity cost of three DP-ensuring algorithms: output perturbation, objective perturbation, and DP-SGD. We demonstrate that the degree of predictive multiplicity rises as the level of privacy increases, and is unevenly distributed across individuals and demographic groups in the data. Because randomness used to ensure DP during training explains predictions for some examples, our results highlight a fundamental challenge to the justifiability of decisions supported by differentially private models in high-stakes settings. We conclude that practitioners should audit the predictive multiplicity of their DP-ensuring algorithms before deploying them in applications of individual-level consequence. | 翻訳日:2023-05-16 23:19:44 公開日:2023-05-15 |
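The abstract above mentions adding Gaussian noise to clipped gradients as the typical source of randomness in DP training. The sketch below shows the usual shape of such a step on toy gradients, under stated assumptions: the function names, the `noise_multiplier` parameter, and the example values are illustrative, and no privacy accounting is performed.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale grad down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm  # noise scale tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(clipped) for x in noisy]


# Same data, different seeds: the update differs from run to run.
grads = [[3.0, 4.0], [0.0, 0.5]]
update_a = noisy_mean_gradient(grads, 1.0, 1.0, random.Random(0))
update_b = noisy_mean_gradient(grads, 1.0, 1.0, random.Random(1))
```

Because the injected noise depends on the seed, two trainings on identical data follow different parameter trajectories, which is the mechanism behind the predictive multiplicity the abstract analyzes.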
# BrainCLIP:遺伝性自然視刺激復号のための脳と視覚言語表現Via CLIP BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding ( http://arxiv.org/abs/2302.12971v3 ) ライセンス: Link先を確認 | Yulong Liu, Yongqiang Ma, Wei Zhou, Guibo Zhu, Nanning Zheng | (参考訳) ペアサンプルの欠如と機能的MRI(fMRI)信号の低信号対雑音比のため、知覚された自然画像の再構成や、fMRIデータからそれらの意味的内容の復号は難しい作業である。
BrainCLIPはまた、高い意味的忠実度で視覚刺激を再構築し、高レベルな意味的特徴の観点から、fMRIベースの自然画像再構成のための新しい最先端技術を確立することができる。 Due to the lack of paired samples and the low signal-to-noise ratio of functional MRI (fMRI) signals, reconstructing perceived natural images or decoding their semantic contents from fMRI data are challenging tasks. In this work, we propose, for the first time, a task-agnostic fMRI-based brain decoding model, BrainCLIP, which leverages CLIP's cross-modal generalization ability to bridge the modality gap between brain activity, image, and text. Our experiments demonstrate that CLIP can act as a pivot for generic brain decoding tasks, including zero-shot visual categories decoding, fMRI-image/text matching, and fMRI-to-image generation. Specifically, BrainCLIP aims to train a mapping network that transforms fMRI patterns into a well-aligned CLIP embedding space by combining visual and textual supervision. Our experiments show that this combination can boost the decoding model's performance on certain tasks like fMRI-text matching and fMRI-to-image generation. On the zero-shot visual category decoding task, BrainCLIP achieves significantly better performance than BraVL, a recently proposed multi-modal method specifically designed for this task. BrainCLIP can also reconstruct visual stimuli with high semantic fidelity and establishes a new state-of-the-art for fMRI-based natural image reconstruction in terms of high-level semantic features. | 翻訳日:2023-05-16 23:19:16 公開日:2023-05-15 |
# プロトタイプ画像分類における正当性チェックとパッチ可視化の改善 Sanity checks and improvements for patch visualisation in prototype-based image classification ( http://arxiv.org/abs/2302.08508v2 ) ライセンス: Link先を確認 | Romain Xu-Darme (LSL, MRIM), Georges Quénot (MRIM), Zakaria Chihani (LSL), Marie-Christine Rousset (SLIDE) | (参考訳) 本研究では,プロトタイプをベースとした視覚分類モデルであるProtoPNetとProtoTreeを用いて,ビジュアル化手法の詳細な分析を行う。
2つのきめ細かいデータセット(CUB-200-2011とStanford Cars)を用いて、これらの手法が画像内の関心領域を正しく識別せず、従ってモデル動作を反映しないことを示す。
次に,削除基準を用いて,Smoothgrads や PRP などの塩分濃度法がより忠実な画像パッチを提供することを示す。
また,いくつかのデータセット(例えば CUB-200-2011)で提供されるオブジェクトのセグメンテーションに基づく新しい関連度尺度を提案し,ProtoPNet と ProtoTree が生成した不正確なパッチの可視化により,より忠実な手法を用いることでバイアスを軽減できることを示す。
最後に,同じ可視化方法を共有する他のプロトタイプモデルに対する知見の意義について考察する。 In this work, we perform an in-depth analysis of the visualisation methods implemented in two popular self-explaining models for visual classification based on prototypes - ProtoPNet and ProtoTree. Using two fine-grained datasets (CUB-200-2011 and Stanford Cars), we first show that such methods do not correctly identify the regions of interest inside of the images, and therefore do not reflect the model behaviour. Secondly, using a deletion metric, we demonstrate quantitatively that saliency methods such as Smoothgrads or PRP provide more faithful image patches. We also propose a new relevance metric based on the segmentation of the object provided in some datasets (e.g. CUB-200-2011) and show that the imprecise patch visualisations generated by ProtoPNet and ProtoTree can create a false sense of bias that can be mitigated by the use of more faithful methods. Finally, we discuss the implications of our findings for other prototype-based models sharing the same visualisation method. | 翻訳日:2023-05-16 23:17:35 公開日:2023-05-15 |
# 不確実性推定法とその医用画像への応用 A Review of Uncertainty Estimation and its Application in Medical Imaging ( http://arxiv.org/abs/2302.08119v2 ) ライセンス: Link先を確認 | Ke Zou and Zhihao Chen and Xuedong Yuan and Xiaojing Shen and Meng Wang and Huazhu Fu | (参考訳) 病気の早期スクリーニングのための医療におけるAIシステムの利用は、非常に臨床的に重要である。
さらに, 医用画像に不確実性推定を組み込んだ深層学習モデルの最近の進歩を概観する。
このレビューがコミュニティにさらなる関心を喚起し、医学画像における不確実性推定モデルの適用に関する最新の参照を研究者に提供することを期待している。 The use of AI systems in healthcare for the early screening of diseases is of great clinical importance. Deep learning has shown great promise in medical imaging, but the reliability and trustworthiness of AI systems limit their deployment in real clinical scenes, where patient safety is at stake. Uncertainty estimation plays a pivotal role in producing a confidence evaluation along with the prediction of the deep model. This is particularly important in medical imaging, where the uncertainty in the model's predictions can be used to identify areas of concern or to provide additional information to the clinician. In this paper, we review the various types of uncertainty in deep learning, including aleatoric uncertainty and epistemic uncertainty. We further discuss how they can be estimated in medical imaging. More importantly, we review recent advances in deep learning models that incorporate uncertainty estimation in medical imaging. Finally, we discuss the challenges and future directions in uncertainty estimation in deep learning for medical imaging. We hope this review will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of uncertainty estimation models in medical imaging. | 翻訳日:2023-05-16 23:17:16 公開日:2023-05-15 |
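The review summarized above distinguishes aleatoric from epistemic uncertainty. One common way to separate the two, shown here as a sketch and not necessarily the formulation the review adopts, uses an ensemble of class-probability predictions: total uncertainty is the entropy of the averaged prediction, the aleatoric part is the average entropy of the members, and the epistemic part is the difference. The function names are assumptions for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty into (total, aleatoric, epistemic)."""
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[c] for m in member_probs) / n for c in range(k)]
    total = entropy(mean)                                    # entropy of the mean prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n    # mean of member entropies
    epistemic = total - aleatoric                            # disagreement between members
    return total, aleatoric, epistemic


# Members agree on an ambiguous case: all uncertainty is aleatoric.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
# Members are confident but contradict each other: all uncertainty is epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

In an imaging setting the same decomposition can be applied per pixel or per image, so that regions with high epistemic uncertainty are flagged for a clinician's attention.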
# 圧電材料のためのスパースヒステリシスモデルの発見 Discovery of sparse hysteresis models for piezoelectric materials ( http://arxiv.org/abs/2302.05313v5 ) ライセンス: Link先を確認 | Abhishek Chandra, Bram Daniels, Mitrofan Curti, Koen Tiels, Elena A. Lomonova and Daniel M. Tartakovsky | (参考訳) 本稿では,近年の機械学習,特にスパース回帰技術を活用した圧電材料におけるヒステリシスのモデル化手法を提案する。
本研究は, ヒステリシスの原因となる力学系を逐次しきい値付き最小二乗法を用いてモデル化し, シミュレーションと実験の両方の圧電材料データに対するヒステリシスを正確に予測する簡潔なモデルを構築した。
ソースコードはhttps://github.com/chandratue/SmartHysteresisで入手できる。 This article presents an approach for modelling hysteresis in piezoelectric materials, that leverages recent advancements in machine learning, particularly in sparse-regression techniques. While sparse regression has previously been used to model various scientific and engineering phenomena, its application to nonlinear hysteresis modelling in piezoelectric materials has yet to be explored. The study employs the least-squares algorithm with a sequential threshold to model the dynamic system responsible for hysteresis, resulting in a concise model that accurately predicts hysteresis for both simulated and experimental piezoelectric material data. Several numerical experiments are performed, including learning butterfly-shaped hysteresis and modelling real-world hysteresis data for a piezoelectric actuator. The presented approach is compared to traditional regression-based and neural network methods, demonstrating its efficiency and robustness. Source code is available at https://github.com/chandratue/SmartHysteresis | 翻訳日:2023-05-16 23:16:59 公開日:2023-05-15 |
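The abstract above relies on least squares with a sequential threshold to obtain a sparse model. The sketch below shows that general procedure on a toy polynomial library using only the standard library; it is not the authors' code from the linked repository, and the function names, the candidate library, and the threshold value are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares(theta, y, active):
    """Fit y with the library columns in `active` via the normal equations."""
    gram = [[sum(row[i] * row[j] for row in theta) for j in active] for i in active]
    rhs = [sum(row[i] * yi for row, yi in zip(theta, y)) for i in active]
    return solve_linear(gram, rhs)


def stlsq(theta, y, threshold, max_iter=10):
    """Sequentially thresholded least squares: fit, prune small terms, refit."""
    n_terms = len(theta[0])
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(max_iter):
        if not active:
            break
        solution = least_squares(theta, y, active)
        coef = [0.0] * n_terms
        for index, value in zip(active, solution):
            coef[index] = value
        kept = [i for i in active if abs(coef[i]) >= threshold]
        if kept == active:
            break  # no term was pruned, so the sparse fit has converged
        for i in active:
            if i not in kept:
                coef[i] = 0.0
        active = kept
    return coef


# Toy data generated by y = 2x - 0.5x^3, with candidate terms [1, x, x^2, x^3].
xs = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
library = [[1.0, x, x ** 2, x ** 3] for x in xs]
targets = [2.0 * x - 0.5 * x ** 3 for x in xs]
coefficients = stlsq(library, targets, threshold=0.1)
```

On this noise-free toy data the constant and quadratic terms are pruned and only the two generating terms survive; for real hysteresis data the library and threshold would have to be chosen with care.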
# 拡散モデルを用いたシンボリック音楽の生成 Generating symbolic music using diffusion models ( http://arxiv.org/abs/2303.08385v2 ) ライセンス: Link先を確認 | Lilac Atassi | (参考訳) Denoising Diffusion Probabilistic Modelは単純だが非常に強力な生成モデルとして登場した。
このコードはコミュニティによるメソッドの使用と開発を促進するために公開されています。 Denoising Diffusion Probabilistic models have emerged as simple yet very powerful generative models. Unlike other generative models, diffusion models do not suffer from mode collapse or require a discriminator to generate high-quality samples. In this paper, a diffusion model that uses a binomial prior distribution to generate piano rolls is proposed. The paper also proposes an efficient method to train the model and generate samples. The generated music has coherence at time scales up to the length of the training piano roll segments. The paper demonstrates how this model is conditioned on the input and can be used to harmonize a given melody, complete an incomplete piano roll, or generate a variation of a given piece. The code is publicly shared to encourage the use and development of the method by the community. | 翻訳日:2023-05-16 23:08:59 公開日:2023-05-15 |
# 原子と極性分子間の電荷-双極子相互作用によるライドバーグ封鎖の観測 Observation of Rydberg blockade due to the charge-dipole interaction between an atom and a polar molecule ( http://arxiv.org/abs/2303.06126v2 ) ライセンス: Link先を確認 | Alexander Guttridge, Daniel K. Ruttley, Archie C. Baldock, Rosario González-Férez, H. R. Sadeghpour, C. S. Adams and Simon L. Cornish | (参考訳) 我々は、単一Rb原子と単一RbCs分子との電荷-双極子相互作用により、ライドバーグの閉じ込めを示す。
電荷-双極子相互作用は、原子-分子分離が$310(40)$ nmに設定されると、Rb(52s) Rydberg状態への遷移を遮断する。
以上の結果から,Rydberg原子を用いて個別に捕捉された分子間で量子情報が伝達されるハイブリッドプラットフォームが期待できる。 We demonstrate Rydberg blockade due to the charge-dipole interaction between a single Rb atom and a single RbCs molecule confined in optical tweezers. The molecule is formed by magnetoassociation of a Rb+Cs atom pair and subsequently transferred to the rovibrational ground state with an efficiency of 91(1)%. Species-specific tweezers are used to control the separation between the atom and molecule. The charge-dipole interaction causes blockade of the transition to the Rb(52s) Rydberg state when the atom-molecule separation is set to $310(40)$ nm. The observed excitation dynamics are in good agreement with simulations using calculated interaction potentials. Our results open up the prospect of a hybrid platform where quantum information is transferred between individually trapped molecules using Rydberg atoms. | 翻訳日:2023-05-16 23:07:37 公開日:2023-05-15 |
# 時間非依存変動密度関数計算によるダイヤモンド中の荷電窒素空孔中心の電子励起 Electronic excitations of the charged nitrogen-vacancy center in diamond obtained using time-independent variational density functional calculations ( http://arxiv.org/abs/2303.03838v2 ) ライセンス: Link先を確認 | Aleksei V. Ivanov, Yorick L. A. Schmerwitz, Gianluca Levi, Hannes Jónsson | (参考訳) 量子応用における固体中の点欠陥の光スピン初期化機構の解明には、関連する励起電子状態の正確な記述が必要である。
以前の報告とは対照的に、局所的および半局所的な密度汎関数の使用は、低次の三重項状態と一重項状態、すなわち${}^{3}A_2 < {}^{1}E < {}^{1}A_1 < {}^{3}E$の正しい順序を与える。
ここで用いられるアプローチは、例えば量子技術に関連するシステムにおける点欠陥の電子的励起を研究するための有望なツールである。 Elucidation of the mechanism for optical spin initialization of point defects in solids in the context of quantum applications requires an accurate description of the excited electronic states involved. While variational density functional calculations have been successful in describing the ground state of a great variety of systems, doubts have been expressed in the literature regarding the ability of such calculations to describe electronic excitations of point defects. A direct orbital optimization method is used here to perform time-independent, variational density functional calculations of a prototypical defect, the negatively charged nitrogen-vacancy center in diamond. The calculations include up to 511 atoms subject to periodic boundary conditions and the excited state calculations require similar computational effort as ground state calculations. Contrary to some previous reports, the use of local and semilocal density functionals gives the correct ordering of the low-lying triplet and singlet states, namely ${}^{3}A_2 < {}^{1}E < {}^{1}A_1 < {}^{3}E$. Furthermore, the more advanced meta generalized gradient approximation functionals give results that are in remarkably good agreement with high-level, many-body calculations as well as available experimental estimates, even for the excited singlet state which is often referred to as having multireference character. The lowering of the energy in the triplet excited state as the atom coordinates are optimized in accordance with analytical forces is also close to the experimental estimate and the resulting zero-phonon line triplet excitation energy is underestimated by only 0.15 eV. The approach used here is found to be a promising tool for studying electronic excitations of point defects in, for example, systems relevant for quantum technologies. | 翻訳日:2023-05-16 23:06:22 公開日:2023-05-15 |
# 対話生成のための階層的行動探索型深層RL Deep RL with Hierarchical Action Exploration for Dialogue Generation ( http://arxiv.org/abs/2303.13465v3 ) ライセンス: Link先を確認 | Itsugun Cho, Ryota Takahashi, Yusaku Yanase, Hiroaki Saito | (参考訳) 伝統的に、自然言語のアクション空間が広大なため、アクションサンプリングによるグリーディポリシーの改善と対話生成に近似動的プログラミングが用いられている。
提案手法は, きめ細かい階層に基づくアクションを抽出し, より少ないポリシー反復で最適な動作を実現する。
さらなるテストにより,本アルゴリズムは説明可能性と制御性の両方を示し,より高い報酬を期待できる応答を生成する。 Traditionally, approximate dynamic programming is employed in dialogue generation with greedy policy improvement through action sampling, as the natural language action space is vast. However, this practice is inefficient for reinforcement learning (RL) due to the sparsity of eligible responses with high action values, which leads to weak improvement sustained by random sampling. This paper presents theoretical analysis and experiments that reveal the performance of the dialogue policy is positively correlated with the sampling size. To overcome this limitation, we introduce a novel dual-granularity Q-function that explores the most promising response category to intervene in the sampling process. Our approach extracts actions based on a grained hierarchy, thereby achieving the optimum with fewer policy iterations. Additionally, we use offline RL and learn from multiple reward functions designed to capture emotional nuances in human interactions. Empirical studies demonstrate that our algorithm outperforms baselines across automatic metrics and human evaluations. Further testing reveals that our algorithm exhibits both explainability and controllability and generates responses with higher expected rewards. | 翻訳日:2023-05-16 23:00:04 公開日:2023-05-15 |
# 微分可能論理の論理:DLの一様意味論に向けて Logic of Differentiable Logics: Towards a Uniform Semantics of DL ( http://arxiv.org/abs/2303.10650v2 ) ライセンス: Link先を確認 | Natalia Ślusarz, Ekaterina Komendantskaya, Matthew L. Daggitt, Robert Stewart, Kathrin Stark | (参考訳) 近年、論理仕様を満たすためにニューラルネットワークをトレーニングする方法として微分論理(DL)が提案されている。
我々はLDLを用いて、既存のDLの理論的特性を確立し、ニューラルネットワーク検証における実証的研究を行う。 Differentiable logics (DL) have recently been proposed as a method of training neural networks to satisfy logical specifications. A DL consists of a syntax in which specifications are stated and an interpretation function that translates expressions in the syntax into loss functions. These loss functions can then be used during training with standard gradient descent algorithms. The variety of existing DLs and the differing levels of formality with which they are treated makes a systematic comparative study of their properties and implementations difficult. This paper remedies this problem by suggesting a meta-language for defining DLs that we call the Logic of Differentiable Logics, or LDL. Syntactically, it generalises the syntax of existing DLs to FOL, and for the first time introduces the formalism for reasoning about vectors and learners. Semantically, it introduces a general interpretation function that can be instantiated to define loss functions arising from different existing DLs. We use LDL to establish several theoretical properties of existing DLs, and to conduct their empirical study in neural network verification. | 翻訳日:2023-05-16 22:59:18 公開日:2023-05-15 |
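The abstract above describes a differentiable logic as a syntax plus an interpretation function that turns specifications into loss functions. To make that concrete, the sketch below uses one simple interpretation in the style of DL2 as I recall it (comparison as a hinge on the violation, conjunction as a sum, disjunction as a product). It is an assumed example of an existing DL, not the LDL meta-language itself, and the function names are hypothetical.

```python
def leq(a, b):
    """Loss for the constraint a <= b: zero when satisfied, linear in the violation."""
    return max(a - b, 0.0)


def conj(*losses):
    """Conjunction: every constraint must hold, so violations add up."""
    return sum(losses)


def disj(*losses):
    """Disjunction: one satisfied branch (loss 0) makes the product zero."""
    result = 1.0
    for loss in losses:
        result *= loss
    return result


# Specification "0.2 <= 0.5 and 0.9 <= 0.5": only the second conjunct is violated.
loss_and = conj(leq(0.2, 0.5), leq(0.9, 0.5))
# Specification "0.9 <= 0.5 or 0.1 <= 0.5": the second disjunct holds.
loss_or = disj(leq(0.9, 0.5), leq(0.1, 0.5))
```

A loss of zero means the specification is satisfied, and a positive loss gives gradient descent a direction in which the violation shrinks; other DLs make different choices for the same connectives, which is the variety the paper sets out to unify.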
# 新しいベンチマーク: 平均教師付き学習と下流ドメイン適応のためのブレンダー付き合成データの有用性について A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation ( http://arxiv.org/abs/2303.09165v3 ) ライセンス: Link先を確認 | Hui Tang and Kui Jia | (参考訳) コンピュータビジョンにおけるディープラーニングは、大規模ラベル付きトレーニングデータの価格で大きな成功を収めた。
さらに, 合成データと実データとの伝達性を比較するため, シミュレーションから現実への適応を下流タスクとして用いることにより, 合成データの事前学習が実テスト結果の向上にも寄与することを示す。
コードとデータセットはhttps://github.com/huitangtang/On_the_Utility_of_Synthetic_Dataで入手できる。 Deep learning in computer vision has achieved great success at the price of large-scale labeled training data. However, exhaustive data annotation is impracticable for each task of all domains of interest, due to high labor costs and unguaranteed labeling accuracy. Besides, the uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist. All these nuisances may hinder the verification of typical theories and exposure to new findings. To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization. In this work, we push forward along this line with an extensive study of bare supervised learning and downstream domain adaptation. Specifically, under the well-controlled, IID data setting enabled by 3D rendering, we systematically verify typical, important learning insights, e.g., shortcut learning, and discover new laws of various data regimes and network architectures in generalization. We further investigate the effect of image formation factors on generalization, e.g., object scale, material texture, illumination, camera viewpoint, and background in a 3D scene. Moreover, we use simulation-to-reality adaptation as a downstream task for comparing the transferability between synthetic and real data when used for pre-training, which suggests that synthetic-data pre-training is also a promising way to improve real test results. Lastly, to promote future research, we develop a new large-scale synthetic-to-real benchmark for image classification, termed S2RDA, which provides more significant challenges for transfer from simulation to reality. The code and datasets are available at https://github.com/huitangtang/On_the_Utility_of_Synthetic_Data. | 翻訳日:2023-05-16 22:57:34 公開日:2023-05-15 |
# インタラクティブプロンプトによる効率的なマルチモーダル融合 Efficient Multimodal Fusion via Interactive Prompting ( http://arxiv.org/abs/2304.06306v2 ) ライセンス: Link先を確認 | Yaowei Li, Ruijie Quan, Linchao Zhu, Yi Yang | (参考訳) 大規模事前学習は、コンピュータビジョンや自然言語処理のような一助的な分野を新しい時代にもたらした。
また, インモーダル変換器の深層層のみにプロンプトベクトルを追加することを提案することで, トレーニングメモリ使用量を大幅に削減できることも注目に値する。
実験の結果,提案手法はトレーニング可能なパラメータが3%未満で,最大66%のメモリ使用量の削減が可能な他のマルチモーダルファインタニング手法と同等の性能を達成できた。 Large-scale pre-training has brought unimodal fields such as computer vision and natural language processing to a new era. Following this trend, the size of multi-modal learning models constantly increases, leading to an urgent need to reduce the massive computational cost of finetuning these models for downstream tasks. In this paper, we propose an efficient and flexible multimodal fusion method, namely PMF, tailored for fusing unimodally pre-trained transformers. Specifically, we first present a modular multimodal fusion framework that exhibits high flexibility and facilitates mutual interactions among different modalities. In addition, we disentangle vanilla prompts into three types in order to learn different optimizing objectives for multimodal learning. It is also worth noting that we propose to add prompt vectors only on the deep layers of the unimodal transformers, thus significantly reducing the training memory usage. Experiment results show that our proposed method achieves comparable performance to several other multimodal finetuning methods with less than 3% trainable parameters and up to 66% saving of training memory usage. | 翻訳日:2023-05-16 22:50:21 公開日:2023-05-15 |
# エッカートと湯川ポテンシャルのクラスを持つクライン・ゴルドン方程式の任意の$\ell$-状態解とその非相対論的熱的性質 Arbitrary $\ell$-state solutions of the Klein-Gordon equation with the Eckart plus a class of Yukawa potential and its non-relativistic thermal properties ( http://arxiv.org/abs/2304.00406v2 ) ライセンス: Link先を確認 | Mehmet Demirci and Ramazan Sever | (参考訳) 我々は, パラメトリックニキフォロフ-ウバロフ法を用いて, クライン・ゴードン方程式とエッカートと湯川ポテンシャルのクラスを組み合わせた境界状態解を報告する。
さらに, ポテンシャルモデルに対する非相対論的熱力学量(分配関数, 平均エネルギー, 自由エネルギー, 比熱, エントロピー)を計算し, いくつかの二原子分子について検討した。
エネルギー固有値は、パラメータ $\delta$ とともに、量子数 $n_r$ と $\ell$ に関して敏感である。
その結果、エネルギー固有値はより小さい量子数 $\ell$ またはより小さいパラメータ $\delta$ でより有界であることが示される。 We report bound state solutions of the Klein-Gordon equation with a novel combined potential, the Eckart plus a class of Yukawa potential, by means of the parametric Nikiforov-Uvarov method. To deal with the centrifugal and Coulombic terms, we apply the Greene-Aldrich approximation scheme. We present arbitrary $\ell$-state energy eigenvalues and the corresponding normalized wave functions of the system in closed form. We discuss various special cases of the considered potential that are useful for other physical systems and show that these are consistent with previous reports in the literature. Moreover, we calculate the non-relativistic thermodynamic quantities (partition function, mean energy, free energy, specific heat and entropy) for the potential model in question, and investigate them for a few diatomic molecules. We find that the energy eigenvalues are sensitive to the quantum numbers $n_r$ and $\ell$ as well as the parameter $\delta$. Our results show that the energy eigenvalues are more strongly bound at either smaller quantum number $\ell$ or smaller parameter $\delta$. | 翻訳日:2023-05-16 22:48:56 公開日:2023-05-15 |
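The thermodynamic quantities listed in the abstract above all follow from the partition function. As a reminder, and assuming the standard canonical-ensemble relations (the paper's own closed-form expressions for this potential are not given here), with $E_n$ the non-relativistic bound-state energies and the sum running over the bound states:

```latex
\begin{aligned}
Z(\beta) &= \sum_{n} e^{-\beta E_{n}}, \qquad \beta = \frac{1}{k_{B} T}, \\
U(\beta) &= -\frac{\partial \ln Z}{\partial \beta}, \\
F(\beta) &= -\frac{1}{\beta} \ln Z, \\
C(\beta) &= k_{B}\, \beta^{2}\, \frac{\partial^{2} \ln Z}{\partial \beta^{2}}, \\
S(\beta) &= k_{B} \ln Z + k_{B}\, \beta\, U(\beta).
\end{aligned}
```

Here $U$ is the mean energy, $F$ the free energy, $C$ the specific heat and $S$ the entropy; the last line is simply $S = (U - F)/T$ rewritten in terms of $\beta$.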
# リダイレクトウォーキングによるフル没入型マルチユーザーバーチャルリアリティの予測コンテキスト認識 Predictive Context-Awareness for Full-Immersive Multiuser Virtual Reality with Redirected Walking ( http://arxiv.org/abs/2303.17907v3 ) ライセンス: Link先を確認 | Filip Lemic, Jakob Struye, Thomas Van Onsem, Jeroen Famaey, Xavier Costa Perez | (参考訳) 仮想現実(VR)技術の進歩は、没入性の向上、マルチユーザバーチャルエクスペリエンス(VE)のサポート、ユーザがリダイレクトウォーキング(RDW)を通じて専用のVRセットアップに制限されたまま、VE内で自由に移動できるようにすることに焦点を当てている。
一 RDWによるマルチユーザーVR設定における横動きの予測及び
二 方位運動予測器の訓練のための合成頭部回転データセットの作成
実験の結果,Long Short-Term Memory (LSTM) ネットワークは側方運動の予測に有望な精度を発揮でき,VEsによる文脈認識はこの精度をさらに向上させることがわかった。
さらに, 配向データ生成のためのTimeGANに基づく手法により, 実験により得られたデータと密に一致した合成サンプルを作成できることを示す。 The advancement of Virtual Reality (VR) technology is focused on improving its immersiveness, supporting multiuser Virtual Experiences (VEs), and enabling the users to move freely within their VEs while still being confined within specialized VR setups through Redirected Walking (RDW). To meet their extreme data-rate and latency requirements, future VR systems will require supporting wireless networking infrastructures operating in millimeter Wave (mmWave) frequencies that leverage highly directional communication in both transmission and reception through beamforming and beamsteering. We propose the use of predictive context-awareness to optimize transmitter and receiver-side beamforming and beamsteering. By predicting users' short-term lateral movements in multiuser VR setups with Redirected Walking (RDW), transmitter-side beamforming and beamsteering can be optimized through Line-of-Sight (LoS) "tracking" in the users' directions. At the same time, predictions of short-term orientational movements can be utilized for receiver-side beamforming for coverage flexibility enhancements. We target two open problems in predicting these two context information instances: i) predicting lateral movements in multiuser VR settings with RDW, and ii) generating synthetic head rotation datasets for training orientational movements predictors. Our experimental results demonstrate that Long Short-Term Memory (LSTM) networks feature promising accuracy in predicting lateral movements, and context-awareness stemming from VEs further enhances this accuracy. Additionally, we show that a TimeGAN-based approach for orientational data generation can create synthetic samples that closely match experimentally obtained ones. | 翻訳日:2023-05-16 22:48:35 公開日:2023-05-15 |
# コンピュータビジョンにおける双曲幾何学:畳み込みニューラルネットワークの新しいフレームワーク Hyperbolic Geometry in Computer Vision: A Novel Framework for Convolutional Neural Networks ( http://arxiv.org/abs/2303.15919v2 ) ライセンス: Link先を確認 | Ahmad Bdeir and Kristian Schwethelm and Niels Landwehr | (参考訳) 実世界のビジュアルデータは、双曲空間において効果的に表現できる固有の階層構造を示す。
私たちのコードはhttps://github.com/kschwethelm/HyperbolicCVで公開されています。 Real-world visual data exhibit intrinsic hierarchical structures that can be represented effectively in hyperbolic spaces. Hyperbolic neural networks (HNNs) are a promising approach for learning feature representations in such spaces. However, current HNNs in computer vision rely on Euclidean backbones and only project features to the hyperbolic space in the task heads, limiting their ability to fully leverage the benefits of hyperbolic geometry. To address this, we present HCNN, the first fully hyperbolic convolutional neural network (CNN) designed for computer vision tasks. Based on the Lorentz model, we generalize fundamental components of CNNs and propose novel formulations of the convolutional layer, batch normalization, and multinomial logistic regression. Experimentation on standard vision tasks demonstrates the superiority of our HCNN framework and the Lorentz model in both hybrid and fully hyperbolic settings. Overall, we believe our contributions provide a foundation for developing more powerful HNNs that can better represent complex structures found in image data. Our code is publicly available at https://github.com/kschwethelm/HyperbolicCV. | 翻訳日:2023-05-16 22:47:41 公開日:2023-05-15 |
# SVD-DIP : DIPによるCT再建におけるオーバーフィッティングの克服 SVD-DIP: Overcoming the Overfitting Problem in DIP-based CT Reconstruction ( http://arxiv.org/abs/2303.15748v3 ) ライセンス: Link先を確認 | Marco Nittscher, Michael Lameter, Riccardo Barbano, Johannes Leuschner, Bangti Jin, Peter Maass | (参考訳) Deep image prior (DIP) は、画像再構成のためのよく確立された教師なしのディープラーニング手法である。
このときの DIP の最適化は、左特異ベクトルと右特異ベクトルを固定しながら、特異値の微調整のみからなる。
ノイズへのオーバーフィットを克服することにより,DIP最適化の安定性が大幅に向上した。 The deep image prior (DIP) is a well-established unsupervised deep learning method for image reconstruction; yet it is far from being flawless. The DIP overfits to noise if not early stopped, or optimized via a regularized objective. We build on the regularized fine-tuning of a pretrained DIP, by adopting a novel strategy that restricts the learning to the adaptation of singular values. The proposed SVD-DIP uses ad hoc convolutional layers whose pretrained parameters are decomposed via the singular value decomposition. Optimizing the DIP then solely consists in the fine-tuning of the singular values, while keeping the left and right singular vectors fixed. We thoroughly validate the proposed method on real-measured $\mu$CT data of a lotus root as well as two medical datasets (LoDoPaB and Mayo). We report significantly improved stability of the DIP optimization, by overcoming the overfitting to noise. | 翻訳日:2023-05-16 22:47:24 公開日:2023-05-15 |
# 重要ノードのブリッジネス同定によるスキップグラムに基づくノード埋め込みのポストホック説明の生成 Generating Post-hoc Explanations for Skip-gram-based Node Embeddings by Identifying Important Nodes with Bridgeness ( http://arxiv.org/abs/2304.12036v3 ) ライセンス: Link先を確認 | Hogun Park and Jennifer Neville | (参考訳) ネットワーク内のノード表現学習は、ネットワーク固有の特性と構造を保持しながら、連続ベクトル空間内の関係情報を符号化する重要な機械学習技術である。
しかし, 埋込法や理論研究が欠如していることから, 埋込法に関するポストホックな説明は難しい問題である。
さらに, 学習グラフ埋め込みベクトルに関するトップq大域的説明をより効率的に行うために, graph-wgd と呼ぶ新しい勾配に基づく説明法を提案する。
実験により, Graph-wGD を用いたスコアによるノードのランク付けは, 真のブリッジネススコアと高い相関性を示した。
また, Graph-wGD が選択したトップqノードレベルの説明は,5つの実世界のグラフを用いて,近年の代替案で選択されたノードと比較して,より重要度が高く,乱れ時にクラスラベルの予測値が大きく変化する。 Node representation learning in a network is an important machine learning technique for encoding relational information in a continuous vector space while preserving the inherent properties and structures of the network. Recently, unsupervised node embedding methods such as DeepWalk, LINE, struc2vec, PTE, UserItem2vec, and RWJBG have emerged from the Skip-gram model and perform better performance in several downstream tasks such as node classification and link prediction than the existing relational models. However, providing post-hoc explanations of Skip-gram-based embeddings remains a challenging problem because of the lack of explanation methods and theoretical studies applicable for embeddings. In this paper, we first show that global explanations to the Skip-gram-based embeddings can be found by computing bridgeness under a spectral cluster-aware local perturbation. Moreover, a novel gradient-based explanation method, which we call GRAPH-wGD, is proposed that allows the top-q global explanations about learned graph embedding vectors more efficiently. Experiments show that the ranking of nodes by scores using GRAPH-wGD is highly correlated with true bridgeness scores. We also observe that the top-q node-level explanations selected by GRAPH-wGD have higher importance scores and produce more changes in class label prediction when perturbed, compared with the nodes selected by recent alternatives, using five real-world graphs. | 翻訳日:2023-05-16 21:05:14 公開日:2023-05-15 |
# 確率的論理推論を用いた逐次レコメンデーション Sequential Recommendation with Probabilistic Logical Reasoning ( http://arxiv.org/abs/2304.11383v2 ) ライセンス: Link先を確認 | Huanhuan Yuan, Pengpeng Zhao, Xuefeng Xian and Guanfeng Liu and Victor S. Sheng and Lei Zhao | (参考訳) 深層学習と記号学習は、逐次勧告(SR)においてよく用いられる方法である。
最後に、様々なシーケンシャルレコメンデーションモデルに対する実験により、SR-PLRの有効性を示す。 Deep learning and symbolic learning are two frequently employed methods in Sequential Recommendation (SR). Recent neural-symbolic SR models demonstrate their potential to enable SR to be equipped with concurrent perception and cognition capacities. However, neural-symbolic SR remains a challenging problem due to open issues like representing users and items in logical reasoning. In this paper, we combine the Deep Neural Network (DNN) SR models with logical reasoning and propose a general framework named Sequential Recommendation with Probabilistic Logical Reasoning (short for SR-PLR). This framework allows SR-PLR to benefit from both similarity matching and logical reasoning by disentangling feature embedding and logic embedding in the DNN and probabilistic logic network. To better capture the uncertainty and evolution of user tastes, SR-PLR embeds users and items with a probabilistic method and conducts probabilistic logical reasoning on users' interaction patterns. Then the feature and logic representations learned from the DNN and logic network are concatenated to make the prediction. Finally, experiments on various sequential recommendation models demonstrate the effectiveness of the SR-PLR. | 翻訳日:2023-05-16 21:04:45 公開日:2023-05-15 |
# Metropolisアルゴリズムは、局所最適解にどの程度うまく対処できるのか? How Well Does the Metropolis Algorithm Cope With Local Optima? ( http://arxiv.org/abs/2304.10848v2 ) ライセンス: Link先を確認 | Benjamin Doerr, Taha El Ghazi El Houssaini, Amirhossein Rajabi, and Carsten Witt | (参考訳) メトロポリスアルゴリズム (MA) は古典的な確率的局所探索ヒューリスティックである。
私たちの研究はまた、MAにグローバル変異演算子を装備することを提案しています。 The Metropolis algorithm (MA) is a classic stochastic local search heuristic. It avoids getting stuck in local optima by occasionally accepting inferior solutions. To better and in a rigorous manner understand this ability, we conduct a mathematical runtime analysis of the MA on the CLIFF benchmark. Apart from one local optimum, cliff functions are monotonically increasing towards the global optimum. Consequently, to optimize a cliff function, the MA only once needs to accept an inferior solution. Despite seemingly being an ideal benchmark for the MA to profit from its main working principle, our mathematical runtime analysis shows that this hope does not come true. Even with the optimal temperature (the only parameter of the MA), the MA optimizes most cliff functions less efficiently than simple elitist evolutionary algorithms (EAs), which can only leave the local optimum by generating a superior solution possibly far away. This result suggests that our understanding of why the MA is often very successful in practice is not yet complete. Our work also suggests to equip the MA with global mutation operators, an idea supported by our preliminary experiments. | 翻訳日:2023-05-16 21:04:27 公開日:2023-05-15 |
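The acceptance rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the cliff definition used here (OneMax up to `n - d` ones, then a drop of `d - 0.5`) is the form commonly used in the runtime-analysis literature and is an assumption, as are the function names `cliff` and `metropolis`, the single-bit-flip proposal, and the `exp(delta / temperature)` acceptance probability.

```python
import math
import random


def cliff(x, d):
    """Assumed cliff benchmark: OneMax with a drop of d - 0.5 past n - d ones."""
    ones = sum(x)
    n = len(x)
    return ones if ones <= n - d else ones - d + 0.5


def metropolis(n, d, temperature, max_steps, rng):
    """Metropolis search on bit strings of length n (maximisation).

    Proposes one uniformly random bit flip per step. Improvements and
    ties are always accepted; a worsening of size delta < 0 is accepted
    with probability exp(delta / temperature).
    Returns the final bit string and the number of steps used.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = cliff(x, d)
    for step in range(max_steps):
        if sum(x) == n:  # all-ones string is the global optimum
            return x, step
        i = rng.randrange(n)
        x[i] ^= 1  # propose flipping bit i
        fy = cliff(x, d)
        if fy >= fx or rng.random() < math.exp((fy - fx) / temperature):
            fx = fy  # accept the proposal
        else:
            x[i] ^= 1  # reject: undo the flip
    return x, max_steps
```

With this definition the search has to accept exactly one worsening move, at the cliff edge, to reach the all-ones optimum, which is the situation the abstract analyses.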
# Tetra-NeRF:Tetrahedraを用いたニューラルラジアンスフィールドの表現 Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra ( http://arxiv.org/abs/2304.09987v2 ) ライセンス: Link先を確認 | Jonas Kulhanek and Torsten Sattler | (参考訳) ニューラル・ラジアンス・フィールド(NeRF)は、新しいビュー合成と3次元再構成の問題に対して、非常に最近かつ非常にポピュラーなアプローチである。
提案手法は, 3次元幾何処理, 三角形ベースのレンダリング, 現代のニューラル放射場の概念をエレガントに組み合わせる。
点ベース表現と比較して,本手法は性能が向上する。 Neural Radiance Fields (NeRFs) are a very recent and very popular approach for the problems of novel view synthesis and 3D reconstruction. A popular scene representation used by NeRFs is to combine a uniform, voxel-based subdivision of the scene with an MLP. Based on the observation that a (sparse) point cloud of the scene is often available, this paper proposes to use an adaptive representation based on tetrahedra obtained by the Delaunay triangulation instead of the uniform subdivision or point-based representations. We show that such a representation enables efficient training and leads to state-of-the-art results. Our approach elegantly combines concepts from 3D geometry processing, triangle-based rendering, and modern neural radiance fields. Compared to voxel-based representations, ours provides more detail around parts of the scene likely to be close to the surface. Compared to point-based representations, our approach achieves better performance. | 翻訳日:2023-05-16 21:04:08 公開日:2023-05-15 |
# 高周波トレーディング予測のための最適出力長短期記憶セル Optimum Output Long Short-Term Memory Cell for High-Frequency Trading Forecasting ( http://arxiv.org/abs/2304.09840v3 ) ライセンス: Link先を確認 | Adamantios Ntakaris, Moncef Gabbouj, Juho Kanniainen | (参考訳) 高頻度取引は、正確な株価予測のために情報遅延のない高速データ処理を必要とする。
これらの時間不規則性を考慮したよく文書化されテストされた手法は、long short-term memory neural networkと呼ばれるリカレントニューラルネットワークの一種である。
本改訂したセルは,2つの高液量米国株と2つの低液量北欧株で試験されたリミットオーダーブック中価格予測などのオンライン高頻度トレーディング予測タスクにおいて,他のリカレントニューラルネットワークと比較して低い予測誤差を達成している。 High-frequency trading requires fast data processing without information lags for precise stock price forecasting. This high-paced stock price forecasting is usually based on vectors that need to be treated as sequential and time-independent signals due to the time irregularities that are inherent in high-frequency trading. A well-documented and tested method that considers these time-irregularities is a type of recurrent neural network, named long short-term memory neural network. This type of neural network is formed based on cells that perform sequential and stale calculations via gates and states without knowing whether their order, within the cell, is optimal. In this paper, we propose a revised and real-time adjusted long short-term memory cell that selects the best gate or state as its final output. Our cell is running under a shallow topology, has a minimal look-back period, and is trained online. This revised cell achieves lower forecasting error compared to other recurrent neural networks for online high-frequency trading forecasting tasks such as the limit order book mid-price prediction as it has been tested on two high-liquid US and two less-liquid Nordic stocks. | 翻訳日:2023-05-16 21:03:37 公開日:2023-05-15 |
# ChatPLUG: オープンドメイン生成対話システム ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human ( http://arxiv.org/abs/2304.07849v3 ) ライセンス: Link先を確認 | Junfeng Tian, Hehong Chen, Guohai Xu, Ming Yan, Xing Gao, Jianhai Zhang, Chenliang Li, Jiayi Liu, Wenshen Xu, Haiyang Xu, Qi Qian, Wei Wang, Qinghao Ye, Jiejing Zhang, Ji Zhang, Fei Huang, Jingren Zhou | (参考訳) 本稿では,デジタルヒューマンアプリケーションのための中国のオープンドメイン対話システムChatPLUGについて述べる。
自動評価と人間評価の両方において, ChatPLUG は最先端の中国語対話システムよりも優れており,様々なテキスト理解と生成タスクにおいて,強力なマルチタスク一般化を示す。
さらに、高速な推論でスマートスピーカーやインスタントメッセージアプリケーションのような現実世界のアプリケーションに ChatPLUG をデプロイします。
私たちのモデルとコードは、ModelScope (https://modelscope.cn/models/damo/ChatPLUG-3.7B) と GitHub (https://github.com/X-PLUG/ChatPLUG) で公開されます。 In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format. Different from other open-domain dialogue models that focus on large-scale pre-training and scaling up model size or dialogue corpus, we aim to build a powerful and practical dialogue system for digital human with diverse skills and good multi-task generalization by internet-augmented instruction tuning. To this end, we first conduct large-scale pre-training on both common document corpus and dialogue data with curriculum learning, so as to inject various world knowledge and dialogue abilities into ChatPLUG. Then, we collect a wide range of dialogue tasks spanning diverse features of knowledge, personality, multi-turn memory, and empathy, on which we further instruction tune ChatPLUG via unified natural language instruction templates. External knowledge from an internet search is also used during instruction finetuning for alleviating the problem of knowledge hallucinations. We show that ChatPLUG outperforms state-of-the-art Chinese dialogue systems on both automatic and human evaluation, and demonstrates strong multi-task generalization on a variety of text understanding and generation tasks. In addition, we deploy ChatPLUG to real-world applications such as Smart Speaker and Instant Message applications with fast inference. Our models and code will be made publicly available on ModelScope: https://modelscope.cn/models/damo/ChatPLUG-3.7B and Github: https://github.com/X-PLUG/ChatPLUG . | 翻訳日:2023-05-16 21:02:21 公開日:2023-05-15 |
# 不確定な距離表現のための経験的ブレグマン分岐の学習 Learning Empirical Bregman Divergence for Uncertain Distance Representation ( http://arxiv.org/abs/2304.07689v3 ) ライセンス: Link先を確認 | Zhiyuan Li, Ziru Liu, Anna Zou, Anca L. Ralescu | (参考訳) ディープメトリック学習技術は、ディープネットワークを用いたサンプルの埋め込みを学習することで、様々な教師ありおよび教師なしの学習タスクの視覚的表現に使われている。
bregman divergenceは様々な距離メトリクスの測定を一般化し、ディープメトリック学習の多くの分野に出現する。
さらに,本手法が一般的な5つの公開データセットにおいて,他のSOTA深層メトリック学習手法と比較して,特にパターン認識問題に対して効果的に動作することを実験的に示す。 Deep metric learning techniques have been used for visual representation in various supervised and unsupervised learning tasks through learning embeddings of samples with deep networks. However, classic approaches, which employ a fixed distance metric as a similarity function between two embeddings, may lead to suboptimal performance for capturing the complex data distribution. The Bregman divergence generalizes measures of various distance metrics and arises throughout many fields of deep metric learning. In this paper, we first show how deep metric learning loss can arise from the Bregman divergence. We then introduce a novel method for learning empirical Bregman divergence directly from data based on parameterizing the convex function underlying the Bregman divergence with a deep learning setting. We further experimentally show that our approach performs effectively on five popular public datasets compared to other SOTA deep metric learning methods, particularly for pattern recognition problems. | 翻訳日:2023-05-16 21:01:45 公開日:2023-05-15 |
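For readers unfamiliar with the Bregman divergence mentioned in this abstract, the textbook definition is D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for a differentiable convex function phi. The Python sketch below evaluates that definition for a hand-written phi; it does not reproduce the paper's learned, network-parameterized convex function, and the helper names (`bregman_divergence`, `sq_norm`, `sq_norm_grad`) are hypothetical.

```python
def bregman_divergence(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad_phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


def sq_norm(v):
    """phi(v) = ||v||^2, a simple convex generator."""
    return sum(vi * vi for vi in v)


def sq_norm_grad(v):
    """Gradient of ||v||^2, i.e. 2v."""
    return [2.0 * vi for vi in v]


# With phi = ||.||^2 the divergence reduces to the squared Euclidean distance.
d = bregman_divergence(sq_norm, sq_norm_grad, [1.0, 2.0], [3.0, 0.0])
```

Choosing a different convex generator gives a different divergence, which is the sense in which a fixed distance metric is a special case of this family.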
# National Vulnerability Databaseにおけるソフトウェア脆弱性のテキスト記述からの知識グラフの構築 Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database ( http://arxiv.org/abs/2305.00382v2 ) ライセンス: Link先を確認 | Anders Mølmen Høst and Pierre Lison and Leon Moonen | (参考訳) 知識グラフは、脆弱性評価や脅威分析など、いくつかのサイバーセキュリティタスクを約束している。
本研究では,NVD(National Vulnerability Database)の情報から脆弱性知識グラフを構築するための新しい手法を提案する。
本手法は,サイバーセキュリティに使用される知識グラフの欠落したエンティティの修正に有効であることを示す。 Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of neural models, heuristic rules, and knowledge graph embeddings. We demonstrate how our method helps to fix missing entities in knowledge graphs used for cybersecurity and evaluate the performance. | 翻訳日:2023-05-16 20:54:16 公開日:2023-05-15 |
# 宇宙から何か分離する? Segment anything, from space? ( http://arxiv.org/abs/2304.13000v2 ) ライセンス: Link先を確認 | Simiao Ren, Francesco Luzi, Saad Lahrichi, Kaleb Kassaw, Leslie M. Collins, Kyle Bradbury, Jordan M. Malof | (参考訳) 近年,視覚タスク用に開発された最初の基礎モデルが開発され,SAM (Segment Anything Model) と呼ばれる。
これは作業用紙であり、追加の分析と結果が完了すると更新される。 Recently, the first foundation model developed specifically for vision tasks was developed, termed the "Segment Anything Model" (SAM). SAM can segment objects in input imagery based upon cheap input prompts, such as one (or more) points, a bounding box, or a mask. The authors examined the zero-shot image segmentation accuracy of SAM on a large number of vision benchmark tasks and found that SAM usually achieved recognition accuracy similar to, or sometimes exceeding, vision models that had been trained on the target tasks. The impressive generalization of SAM for segmentation has major implications for vision researchers working on natural imagery. In this work, we examine whether SAM's impressive performance extends to overhead imagery problems, and help guide the community's response to its development. We examine SAM's performance on a set of diverse and widely-studied benchmark tasks. We find that SAM does often generalize well to overhead imagery, although it fails in some cases due to the unique characteristics of overhead imagery and the target objects. We report on these unique systematic failure cases for remote sensing imagery that may comprise useful future research for the community. Note that this is a working paper, and it will be updated as additional analysis and results are completed. | 翻訳日:2023-05-16 20:52:38 公開日:2023-05-15 |
# 大規模マルチタスク中国語理解の測定 Measuring Massive Multitask Chinese Understanding ( http://arxiv.org/abs/2304.12986v2 ) ライセンス: Link先を確認 | Hui Zeng | (参考訳) 大規模な中国語モデルの開発は盛んであるが、それに対応する能力評価が不足している。
複数の分野にわたる知識の幅と深さを包括的に評価することにより、このテストはモデルの欠点をより正確に識別することができる。 The development of large-scale Chinese language models is flourishing, yet there is a lack of corresponding capability assessments. Therefore, we propose a test to measure the multitask accuracy of large Chinese language models. This test encompasses four major domains, including medicine, law, psychology, and education, with 15 subtasks in medicine and 8 subtasks in education. We found that the best-performing models in the zero-shot setting outperformed the worst-performing models by nearly 18.6 percentage points on average. Across the four major domains, the highest average zero-shot accuracy of all models is 0.512. In the subdomains, only the GPT-3.5-turbo model achieved a zero-shot accuracy of 0.693 in clinical medicine, which was the highest accuracy among all models across all subtasks. All models performed poorly in the legal domain, with the highest zero-shot accuracy reaching only 0.239. By comprehensively evaluating the breadth and depth of knowledge across multiple disciplines, this test can more accurately identify the shortcomings of the models. | 翻訳日:2023-05-16 20:52:18 公開日:2023-05-15 |
# 教育のための汎用人工知能(AGI) Artificial General Intelligence (AGI) for Education ( http://arxiv.org/abs/2304.12479v2 ) ライセンス: Link先を確認 | Ehsan Latif, Gengchen Mai, Matthew Nyaaba, Xuansheng Wu, Ninghao Liu, Guoyu Lu, Sheng Li, Tianming Liu, and Xiaoming Zhai | (参考訳) 汎用人工知能 (AGI) は, GPT-4 や ChatGPT といった大規模言語モデルやチャットボットの出現により, 将来の技術としてグローバルに認識されるようになった。
AGIの開発は、研究と応用活動を進めるために、教育者とAIエンジニアの学際的なコラボレーションを必要とする。 Artificial general intelligence (AGI) has gained global recognition as a future technology due to the emergence of breakthrough large language models and chatbots such as GPT-4 and ChatGPT, respectively. AGI aims to replicate human intelligence through computer systems, which is one of the critical technologies having the potential to revolutionize the field of education. Compared to conventional AI models, typically designed for a limited range of tasks, demand significant amounts of domain-specific data for training and may not always consider intricate interpersonal dynamics in education. AGI, driven by the recent large pre-trained models, represents a significant leap in the capability of machines to perform tasks that require human-level intelligence, such as reasoning, problem-solving, decision-making, and even understanding human emotions and social interactions. This work reviews AGI's key concepts, capabilities, scope, and potential within future education, including setting educational goals, designing pedagogy and curriculum, and performing assessments. We also provide rich discussions over various ethical issues in education faced by AGI and how AGI will affect human educators. The development of AGI necessitates interdisciplinary collaborations between educators and AI engineers to advance research and application efforts. | 翻訳日:2023-05-16 20:52:02 公開日:2023-05-15 |
# LMsの基盤:言語モデルによる比喩的言語解釈における身体性の効果の検討 LMs stand their Ground: Investigating the Effect of Embodiment in Figurative Language Interpretation by Language Models ( http://arxiv.org/abs/2305.03445v2 ) ライセンス: Link先を確認 | Philipp Wicke | (参考訳) 比喩的言語は、その解釈が慣習的な語順や意味から逸脱した言葉の使用に基づいているため、言語モデルにとって課題である。
しかし, 言語モデルによる比喩的言語解釈の文脈において, 身体性と具体性や獲得年齢といった特徴との関係は研究されていない。
この分析は、他の特徴(単語の長さや具体性など)との多重共線性を排除し、より大きな言語モデルが比喩的言語理解を促進する程度まで身体化された概念を概念化するという最初の証拠を提供する。 Figurative language is a challenge for language models since its interpretation is based on the use of words in a way that deviates from their conventional order and meaning. Yet, humans can easily understand and interpret metaphors, similes or idioms as they can be derived from embodied metaphors. Language is a proxy for embodiment and if a metaphor is conventional and lexicalised, it becomes easier for a system without a body to make sense of embodied concepts. Yet, the intricate relation between embodiment and features such as concreteness or age of acquisition has not been studied in the context of figurative language interpretation concerning language models. Hence, the presented study shows how larger language models perform better at interpreting metaphoric sentences when the action of the metaphorical sentence is more embodied. The analysis rules out multicollinearity with other features (e.g. word length or concreteness) and provides initial evidence that larger language models conceptualise embodied concepts to a degree that facilitates figurative language understanding. | 翻訳日:2023-05-16 20:45:50 公開日:2023-05-15 |
# HiPool: グラフニューラルネットワークによる長いドキュメントのモデリング HiPool: Modeling Long Documents Using Graph Neural Networks ( http://arxiv.org/abs/2305.03319v2 ) ライセンス: Link先を確認 | Irene Li, Aosong Feng, Dragomir Radev, Rex Ying | (参考訳) 自然言語処理(NLP)における長いシーケンスのエンコーディングは難しい問題である。
提案手法は,特に長いシーケンスにおいて,性能とスケーラビリティを向上した階層的逐次モデルより優れていることを示す。 Encoding long sequences in Natural Language Processing (NLP) is a challenging problem. Though recent pretraining language models achieve satisfying performances in many NLP tasks, they are still restricted by a pre-defined maximum length, making them challenging to be extended to longer sequences. So some recent works utilize hierarchies to model long sequences. However, most of them apply sequential models for upper hierarchies, suffering from long dependency issues. In this paper, we alleviate these issues through a graph-based method. We first chunk the sequence with a fixed length to model the sentence-level information. We then leverage graphs to model intra- and cross-sentence correlations with a new attention mechanism. Additionally, due to limited standard benchmarks for long document classification (LDC), we propose a new challenging benchmark, totaling six datasets with up to 53k samples and 4034 average tokens' length. Evaluation shows our model surpasses competitive baselines by 2.6% in F1 score, and 4.8% on the longest sequence dataset. Our method is shown to outperform hierarchical sequential models with better performance and scalability, especially for longer sequences. | 翻訳日:2023-05-16 20:45:32 公開日:2023-05-15 |
# 人間中心信頼フレームワーク--HCIの視点から Human-centered trust framework: An HCI perspective ( http://arxiv.org/abs/2305.03306v2 ) ライセンス: Link先を確認 | Sonia Sousa, Jose Cravino, Paulo Martins, David Lamas | (参考訳) この研究の理論的根拠は、現在の人工知能(AI)のユーザ信頼談話に基づいている。
この記事は、提案されたソリューションに対するユーザの信頼の意図と行動を測定するために使用できる、いくつかのユーザーリサーチツールを提供することで終わる。 The rationale of this work is based on the current user trust discourse of Artificial Intelligence (AI). We aim to produce novel HCI approaches that use trust as a facilitator for the uptake (or appropriation) of current technologies. We propose a framework (HCTFrame) to guide non-experts to unlock the full potential of user trust in AI design. Results derived from a data triangulation of findings from three literature reviews demystify some misconceptions of user trust in computer science and AI discourse, and three case studies are conducted to assess the effectiveness of a psychometric scale in mapping potential users' trust breakdowns and concerns. This work primarily contributes to the fight against the tendency to design technical-centered vulnerable interactions, which can eventually lead to additional real and perceived breaches of trust. The proposed framework can be used to guide system designers on how to map and define user trust and the socioethical and organisational needs and characteristics of AI system design. It can also guide AI system designers on how to develop a prototype and operationalise a solution that meets user trust requirements. The article ends by providing some user research tools that can be employed to measure users' trust intentions and behaviours towards a proposed solution. | 翻訳日:2023-05-16 20:45:13 公開日:2023-05-15 |
# 因果世界モデルによる説明可能な強化学習 Explainable Reinforcement Learning via a Causal World Model ( http://arxiv.org/abs/2305.02749v2 ) ライセンス: Link先を確認 | Zhongwei Yu, Jingqing Ruan, Dengpeng Xing | (参考訳) 強化学習(RL)のための説明を生成することは、行動が未来に長期的な影響をもたらす可能性があるため困難である。
その結果,我々の因果モデルが説明可能性と学習の橋渡しとなることを示した。 Generating explanations for reinforcement learning (RL) is challenging as actions may produce long-term effects on the future. In this paper, we develop a novel framework for explainable RL by learning a causal world model without prior knowledge of the causal structure of the environment. The model captures the influence of actions, allowing us to interpret the long-term effects of actions through causal chains, which present how actions influence environmental variables and finally lead to rewards. Different from most explanatory models which suffer from low accuracy, our model remains accurate while improving explainability, making it applicable in model-based learning. As a result, we demonstrate that our causal model can serve as the bridge between explainability and learning. | 翻訳日:2023-05-16 20:44:34 公開日:2023-05-15 |
# CryCeleb:幼児のCry音に基づく話者検証データセット CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds ( http://arxiv.org/abs/2305.00969v3 ) ライセンス: Link先を確認 | David Budaghyan, Arsenii Gorin, Cem Subakan, Charles C. Onu | (参考訳) 本稿では,乳幼児の叫び声をラベル付けしたUbenwa CryCelebデータセットと,乳幼児の泣き声に基づく公的な話者検証課題であるCryCeleb 2023タスクについて述べる。
乳児の泣き声解析研究を促進するため,786人の新生児から6時間以上手作業で泣き声を分割した。 This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries, and the accompanying CryCeleb 2023 task - a public speaker verification challenge based on infant cry sounds. We release for academic usage more than 6 hours of manually segmented cry sounds from 786 newborns to encourage research in infant cry analysis. | 翻訳日:2023-05-16 20:43:44 公開日:2023-05-15 |
# VCSUM:多用途な中国語会議要約データセット VCSUM: A Versatile Chinese Meeting Summarization Dataset ( http://arxiv.org/abs/2305.05280v2 ) ライセンス: Link先を確認 | Han Wu, Mingjie Zhan, Haochen Tan, Zhaohui Hou, Ding Liang, and Linqi Song | (参考訳) ニュースやチャットの要約と比較して,会議要約の発達は限られたデータによって著しく減速する。
このように、データセットは、セグメンテーションベースの要約、多粒度要約、検索-then-generate summarizationなど、様々な要約タスクやメソッドに適応することができる。
データセットとコードはhttps://github.com/hahahawu/VCSumで公開される。 Compared to news and chat summarization, the development of meeting summarization is hugely decelerated by the limited data. To this end, we introduce a versatile Chinese meeting summarization dataset, dubbed VCSum, consisting of 239 real-life meetings, with a total duration of over 230 hours. We claim our dataset is versatile because we provide the annotations of topic segmentation, headlines, segmentation summaries, overall meeting summaries, and salient sentences for each meeting transcript. As such, the dataset can adapt to various summarization tasks or methods, including segmentation-based summarization, multi-granularity summarization and retrieval-then-generate summarization. Our analysis confirms the effectiveness and robustness of VCSum. We also provide a set of benchmark models regarding different downstream summarization tasks on VCSum to facilitate further research. The dataset and code will be released at https://github.com/hahahawu/VCSum. | 翻訳日:2023-05-16 20:35:47 公開日:2023-05-15 |
# MultiTACRED:TAC関係抽出データセットの多言語版 MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset ( http://arxiv.org/abs/2305.04582v2 ) ライセンス: Link先を確認 | Leonhard Hennig, Philippe Thomas, Sebastian Möller | (参考訳) 関係抽出(RE)は、多言語設定への拡張が、TACRED(Zhang et al., 2017)のような大規模な英語データセットに匹敵するリソースの不足によって妨げられている情報抽出の基本的なタスクである。
しかし, MTシステムや, 代名詞ドロップ, 複合化, インフレクションなどの言語的特徴により, データセットの品質やREモデルの性能が低下しているため, 様々な翻訳やアノテーションの予測誤差も観察できる。 Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. We analyze translation and annotation projection quality, identify error categories, and experimentally evaluate fine-tuned pretrained mono- and multilingual language models in common transfer learning scenarios. Our analyses show that machine translation is a viable strategy to transfer RE instances, with native speakers judging more than 83% of the translated instances to be linguistically and semantically acceptable. We find monolingual RE model performance to be comparable to the English original for many of the target languages, and that multilingual models trained on a combination of English and target language data can outperform their monolingual counterparts. However, we also observe a variety of translation and annotation projection errors, both due to the MT systems and linguistic features of the target languages, such as pronoun-dropping, compounding and inflection, that degrade dataset quality and RE model performance. | 翻訳日:2023-05-16 20:34:13 公開日:2023-05-15 |
# クロストークに基づくパラメータ化量子回路近似 Crosstalk-Based Parameterized Quantum Circuit Approximation ( http://arxiv.org/abs/2305.04172v2 ) ライセンス: Link先を確認 | Mohannad Ibrahim, Nicholas T. Bronn, Gregory T. Byrd | (参考訳) 本稿では,ハードウェアの主な特性であるクロストーク動作を主近似ドライバとして使用する変分量子アルゴリズム(vqas)に対するアンサッツ近似手法を提案する。
クロストーク適応スケジューリングを利用することで,回路レベルの近似・最適化を ansatz に適用することができる。
我々は、アプリケーションがクロストークに対する応答が異なることを考慮し、この近似戦略は、表現力があり、トレーニング可能で、特定のワークロードに適したクロストーク緩和レベルを持つアンサーゼを作成するために使用できると信じている。 In this paper, we propose an ansatz approximation approach for variational quantum algorithms (VQAs) that uses one of the hardware's main attributes, its crosstalk behavior, as its main approximation driver. By utilizing crosstalk-adaptive scheduling, we are able to apply a circuit-level approximation/optimization to our ansatz. Our design procedure involves first characterizing the hardware's crosstalk and then approximating the circuit by a desired level of crosstalk mitigation, all while effectively reducing its duration and gate counts. We demonstrate the effect of crosstalk mitigation on expressibility, trainability, and entanglement: key components that drive the utility of parameterized circuits. We tested our approach on real quantum hardware against a base configuration, and our results showed superior performance for the circuit-level optimized ansatz over a base ansatz for two quantum chemistry benchmarks. We take into consideration that applications vary in their response to crosstalk, and we believe that this approximation strategy can be used to create ansatze that are expressive, trainable, and with crosstalk mitigation levels tailored for specific workloads. | 翻訳日:2023-05-16 20:33:12 公開日:2023-05-15 |
# 駆動散逸多体系におけるRydbergクラスターからのエルゴディディティ破壊 Ergodicity breaking from Rydberg clusters in a driven-dissipative many-body system ( http://arxiv.org/abs/2305.07032v2 ) ライセンス: Link先を確認 | Dong-Sheng Ding and Zhengyang Bai and Zong-Kai Liu and Bao-Sen Shi and Guang-Can Guo and Weibin Li and C. Stuart. Adams | (参考訳) 散逸がコヒーレントカップリングと分散二体相互作用から生じる量子コヒーレンスを必然的に損なうとき、量子多体系のエルゴディク性破れの傾向を調べることは困難である。
ここでは, 駆動散逸Rydberg原子気体における, エルゴード的なダイナミクスからエルゴード性の破れたダイナミクスへの遷移の実験的証拠を報告する。
報告された結果は、リミットサイクルのようなエルゴーディティの破れダイナミクスを探究し、非平衡相転移のベンチマークを可能にする有望な候補であることを示した。 It is challenging to probe ergodicity breaking trends of a quantum many-body system when dissipation inevitably damages quantum coherence originated from coherent coupling and dispersive two-body interactions. Rydberg atoms provide a test bed to detect emergent exotic many-body phases and non-ergodic dynamics where the strong Rydberg atom interaction competes with and overtakes dissipative effects even at room temperature. Here we report experimental evidence of a transition from ergodic towards ergodic breaking dynamics in driven-dissipative Rydberg atomic gases. The broken ergodicity is featured by the long-time phase oscillation, which is attributed from the formation of Rydberg excitation clusters in limit cycle phases. The broken symmetry in the limit cycle is a direct manifestation of many-body interactions, which is verified by tuning atomic densities in our experiment. The reported result reveals that Rydberg many-body systems are a promising candidate to probe ergodicity breaking dynamics, such as limit cycles, and enable the benchmark of non-equilibrium phase transition. | 翻訳日:2023-05-16 20:26:31 公開日:2023-05-15 |
# 火災伝播の不確かさ推定のためのニューラルエミュレータ A Neural Emulator for Uncertainty Estimation of Fire Propagation ( http://arxiv.org/abs/2305.06139v2 ) ライセンス: Link先を確認 | Andrew Bolt, Conrad Sanderson, Joel Janek Dabrowski, Carolyn Huston, Petra Kuhnert | (参考訳) 野火の伝播は、風速や方向といった環境条件の小さな変化が観測される行動に大きな変化をもたらす非常に確率的な過程である。
エミュレートされた火災のアンサンブルを介して確率マップを生成するための関連するニューラルネットワーク(エミュレータ)と比較して、提案手法は、ほぼ1桁高速で競合するジャカード類似度スコアを生成する。 Wildfire propagation is a highly stochastic process where small changes in environmental conditions (such as wind speed and direction) can lead to large changes in observed behaviour. A traditional approach to quantify uncertainty in fire-front progression is to generate probability maps via ensembles of simulations. However, use of ensembles is typically computationally expensive, which can limit the scope of uncertainty analysis. To address this, we explore the use of a spatio-temporal neural-based modelling approach to directly estimate the likelihood of fire propagation given uncertainty in input parameters. The uncertainty is represented by deliberately perturbing the input weather forecast during model training. The computational load is concentrated in the model training process, which allows larger probability spaces to be explored during deployment. Empirical evaluations indicate that the proposed model achieves comparable fire boundaries to those produced by the traditional SPARK simulation platform, with an overall Jaccard index (similarity score) of 67.4% on a set of 35 simulated fires. When compared to a related neural model (emulator) which was employed to generate probability maps via ensembles of emulated fires, the proposed approach produces competitive Jaccard similarity scores while being approximately an order of magnitude faster. | 翻訳日:2023-05-16 20:25:10 公開日:2023-05-15 |
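As a rough illustration of the similarity score quoted in this abstract, the Jaccard index of two fire footprints is the size of their intersection divided by the size of their union. The sketch below assumes each footprint is a binary grid of burnt cells; the function name `jaccard_index` and the toy 3x3 masks are hypothetical and are not taken from the paper or from SPARK.

```python
def jaccard_index(predicted, reference):
    """Jaccard index |A ∩ B| / |A ∪ B| of two equally sized binary grids."""
    intersection = 0
    union = 0
    for row_p, row_r in zip(predicted, reference):
        for p, r in zip(row_p, row_r):
            intersection += 1 if (p and r) else 0
            union += 1 if (p or r) else 0
    # Assumption: two empty footprints are treated as identical.
    return 1.0 if union == 0 else intersection / union


# Toy burn masks (hypothetical data, 1 = burnt cell).
emulated = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
simulated = [[1, 1, 0], [1, 0, 0], [1, 0, 0]]
score = jaccard_index(emulated, simulated)
```

Here three cells are burnt in both masks and five cells are burnt in at least one, so `score` is 3/5. A per-fire score of this kind could then be averaged over a test set, although how the paper aggregates its 35 fires is not stated in the abstract.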
# 確率的テクスチャフィルタリング Stochastic Texture Filtering ( http://arxiv.org/abs/2305.05810v2 ) ライセンス: Link先を確認 | Marcos Fajardo, Bartlomiej Wronski, Marco Salvi, Matt Pharr | (参考訳) 2次元テクスチャマップと3次元ボクセルアレイは、描画されたシーンの表面やボリュームにリッチなディテールを加えるために広く使われており、フィルターされたテクスチャルックアップは高品質な画像を生成するのに不可欠である。
さらに、この誤差は時空間デノイングまたは適度なピクセルサンプリングレートによってうまく処理される。 2D texture maps and 3D voxel arrays are widely used to add rich detail to the surfaces and volumes of rendered scenes, and filtered texture lookups are integral to producing high-quality imagery. We show that filtering textures after evaluating lighting, rather than before BSDF evaluation as is current practice, gives a more accurate solution to the rendering equation. These benefits are not merely theoretical, but are apparent in common cases. We further show that stochastically sampling texture filters is crucial for enabling this approach, which has not been possible previously except in limited cases. Stochastic texture filtering offers additional benefits, including efficient implementation of high-quality texture filters and efficient filtering of textures stored in compressed and sparse data structures, including neural representations. We demonstrate applications in both real-time and offline rendering and show that the additional stochastic error is minimal. Furthermore, this error is handled well by either spatiotemporal denoising or moderate pixel sampling rates. | 翻訳日:2023-05-16 20:24:35 公開日:2023-05-15 |
# 医用報告書要約と医用対話生成における階層プルーニングを用いたパラメータ効率の良い微調整 Parameter-Efficient Fine-Tuning with Layer Pruning on Medical Report Summarization and Medical Dialogue Generation ( http://arxiv.org/abs/2305.08285v1 ) ライセンス: Link先を確認 | Yunqi Zhu and Xuebing Yang and Yuanyuan Wu and Wensheng Zhang | (参考訳) 言語モデルのサイズが大きくなると、パラメータ効率の良い微調整(例えば、Adapter、LoRA、即時チューニング)において、事前訓練されたモデルを凍結する研究の関心が高まり、複数の下流タスクに対して小さな訓練可能なパラメータを注入する。
元のモデルのパラメータの0.6%をチューニングし、Transformer層の30%以上をプルーニングすることで、本フレームワークは自由記述のsequence-to-sequenceタスクにおける生成品質を92%以上維持しつつ、学習フェーズを100%高速化し、GPUメモリ使用量を50%削減できる。 The increasing size of language models raises great research interests in parameter-efficient fine-tuning (e.g. Adapter, LoRA and prompt tuning) that freezes the pre-trained model, and injects small-scale trainable parameters for multiple downstream tasks. To further enhance the efficiency of fine-tuning, we propose a framework that integrates LoRA and structured layer pruning. In addition, based on MIMIC-IV-Note, we create two deidentified medical report summarization datasets. Further, we validate the integrated framework on the proposed two datasets and two medical dialogue datasets. By tuning 0.6% parameters of the original model and pruning over 30% Transformer-layers, the framework can speed up 100% of the training phase and reduce 50% of GPU memory usage, while preserving over 92% generation qualities on free-text sequence-to-sequence tasks. | 翻訳日:2023-05-16 16:39:43 公開日:2023-05-15 |
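The low-rank update behind LoRA, which this abstract combines with layer pruning, can be sketched as follows. In the standard LoRA formulation a frozen weight matrix W is augmented with a trainable product B·A scaled by alpha/r, so only A and B are tuned. This is a minimal pure-Python sketch of that forward pass under those assumptions; the helper names and toy matrices are hypothetical, and the paper's pruning scheme is not reproduced here.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, alpha):
    """Compute W x + (alpha / r) * B (A x), with W frozen and A, B trainable.

    Shapes: W is d_out x d_in, A is r x d_in, B is d_out x r.
    """
    r = len(A)  # the rank is the number of rows of A
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]


# Toy example (hypothetical values): identity base weights, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
x = [1.0, 2.0]

# With B initialised to zero the adapted layer equals the frozen layer.
out_init = lora_forward(x, W, A, [[0.0], [0.0]], alpha=2.0)

# After (hypothetical) training B is non-zero and shifts the output.
out_tuned = lora_forward(x, W, A, [[1.0], [2.0]], alpha=2.0)
```

In this toy case the adapter adds r * (d_in + d_out) = 4 trainable values next to the 4 frozen entries of W; the saving only becomes significant when r is much smaller than the layer dimensions.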
# 事前データから言語モデル、下流タスクへ:不公平なNLPモデルによる政治的バイアスの軌跡を追跡する From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models ( http://arxiv.org/abs/2305.08283v1 ) ライセンス: Link先を確認 | Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov | (参考訳) 大規模言語モデル(LM)は、ニュース、ディスカッションフォーラム、書籍、オンライン百科事典といった様々なデータソースで事前訓練されている。
我々は,nlp研究の意義を議論し,不公平さを緩和するための今後の方向性を提案する。 Large language models (LMs) are pretrained on diverse data sources: news, discussion forums, books, online encyclopedias. A significant portion of this data includes facts and opinions which, on one hand, celebrate democracy and diversity of ideas, and on the other hand are inherently socially biased. Our work develops new methods to (1) measure media biases in LMs trained on such corpora, along the social and economic axes, and (2) measure the fairness of downstream NLP models trained on top of politically biased LMs. We focus on hate speech and misinformation detection, aiming to empirically quantify the effects of political (social, economic) biases in pretraining data on the fairness of high-stakes social-oriented tasks. Our findings reveal that pretrained LMs do have political leanings which reinforce the polarization present in pretraining corpora, propagating social biases into hate speech predictions and media biases into misinformation detectors. We discuss the implications of our findings for NLP research and propose future directions to mitigate the unfairness. | 翻訳日:2023-05-16 16:39:23 公開日:2023-05-15 |
# 天候対応ラベルシフト攻撃によるロバストな一般化 t-RAIN: Robust generalization under weather-aliasing label shift attacks ( http://arxiv.org/abs/2305.08302v1 ) ライセンス: Link先を確認 | Aboli Marathe, Sanjana Prabhu | (参考訳) 古典的な教師付き学習設定では、分類器はバランスの取れたラベル分布の仮定に適合し、同時に顕著な結果が得られる。
このマッピングにより、モデルのテスト精度は、シフトなし、霧、雪、砂塵の各シフトにおいてそれぞれ2.1、4.4、1.9、2.7%向上する。
本稿では,82.69 AP (雪) と62.31 AP (霧) が最適である実地および合成気象領域の歩行者検出結果について述べる。 In the classical supervised learning settings, classifiers are fit with the assumption of balanced label distributions and produce remarkable results on the same. In the real world, however, these assumptions often bend and in turn adversely impact model performance. Identifying bad learners in skewed target distributions is even more challenging. Thus achieving model robustness under these "label shift" settings is an important task in autonomous perception. In this paper, we analyze the impact of label shift on the task of multi-weather classification for autonomous vehicles. We use this information as a prior to better assess pedestrian detection in adverse weather. We model the classification performance as an indicator of robustness under 4 label shift scenarios and study the behavior of multiple classes of models. We propose t-RAIN a similarity mapping technique for synthetic data augmentation using large scale generative models and evaluate the performance on DAWN dataset. This mapping boosts model test accuracy by 2.1, 4.4, 1.9, 2.7 % in no-shift, fog, snow, dust shifts respectively. We present state-of-the-art pedestrian detection results on real and synthetic weather domains with best performing 82.69 AP (snow) and 62.31 AP (fog) respectively. | 翻訳日:2023-05-16 16:29:39 公開日:2023-05-15 |
# 「異常なし」:対照的な知識注入による医療報告の曖昧さ解消 "Nothing Abnormal": Disambiguating Medical Reports via Contrastive Knowledge Infusion ( http://arxiv.org/abs/2305.08300v1 ) ライセンス: Link先を確認 | Zexue He, An Yan, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu | (参考訳) 医療報告の共有は患者中心のケアに不可欠である。
しかし、異なるオーディエンスは医療報告書を書いたり読んだりする際に異なる目的を持っている。例えば、医療専門家は病理学をより気にかけるが、患者は診断にもっと関心を持っている("is there any abnormality?
私たちのコードと注釈付きデータは、将来の研究を促進するためにリリースされます。 Sharing medical reports is essential for patient-centered care. A recent line of work has focused on automatically generating reports with NLP methods. However, different audiences have different purposes when writing/reading medical reports -- for example, healthcare professionals care more about pathology, whereas patients are more concerned with the diagnosis ("Is there any abnormality?"). The expectation gap results in a common situation where patients find their medical reports to be ambiguous and therefore unsure about the next steps. In this work, we explore the audience expectation gap in healthcare and summarize common ambiguities that lead patients to be confused about their diagnosis into three categories: medical jargon, contradictory findings, and misleading grammatical errors. Based on our analysis, we define a disambiguation rewriting task to regenerate an input to be unambiguous while preserving information about the original content. We further propose a rewriting algorithm based on contrastive pretraining and perturbation-based rewriting. In addition, we create two datasets, OpenI-Annotated based on chest reports and VA-Annotated based on general medical reports, with available binary labels for ambiguity and abnormality presence annotated by radiology specialists. Experimental results on these datasets show that our proposed algorithm effectively rewrites input sentences in a less ambiguous way with high content fidelity. Our code and annotated data are released to facilitate future research. | 翻訳日:2023-05-16 16:29:19 公開日:2023-05-15 |
# 言語モデルのコンテキスト内学習を改善するシンボルチューニング Symbol tuning improves in-context learning in language models ( http://arxiv.org/abs/2305.08298v1 ) ライセンス: Link先を確認 | Jerry Wei and Le Hou and Andrew Lampinen and Xiangning Chen and Da Huang and Yi Tay and Xinyun Chen and Yifeng Lu and Denny Zhou and Tengyu Ma and Quoc V. Le | (参考訳) 我々は、自然言語ラベル(例えば「ポジティブ/ネガティブ感情」)を任意の記号(例えば「foo/bar」)に置き換えた文脈内の入力・ラベルペアで言語モデルを微調整する、シンボルチューニングを提案する。
第2に、シンボルチューニングモデルはアルゴリズム推論タスクにおいてはるかに強力であり、リスト関数ベンチマークでは最大18.2%、simple turing conceptsベンチマークでは最大15.3%のパフォーマンスが向上している。
最後に、シンボルチューニングされたモデルは、文脈内で提示された反転ラベルに従う能力が大幅に向上しており、これは文脈内情報を用いて事前の意味的知識を上書きする能力が高いことを意味する。 We present symbol tuning - finetuning language models on in-context input-label pairs where natural language labels (e.g., "positive/negative sentiment") are replaced with arbitrary symbols (e.g., "foo/bar"). Symbol tuning leverages the intuition that when a model cannot use instructions or natural language labels to figure out a task, it must instead do so by learning the input-label mappings. We experiment with symbol tuning across Flan-PaLM models up to 540B parameters and observe benefits across various settings. First, symbol tuning boosts performance on unseen in-context learning tasks and is much more robust to underspecified prompts, such as those without instructions or without natural language labels. Second, symbol-tuned models are much stronger at algorithmic reasoning tasks, with up to 18.2% better performance on the List Functions benchmark and up to 15.3% better performance on the Simple Turing Concepts benchmark. Finally, symbol-tuned models show large improvements in following flipped-labels presented in-context, meaning that they are more capable of using in-context information to override prior semantic knowledge. | 翻訳日:2023-05-16 16:28:55 公開日:2023-05-15 |
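The data transformation described in this abstract, replacing natural language labels with arbitrary symbols in the in-context pairs, can be sketched as below. This is only an illustration of the idea: the function names, the prompt layout, and the toy sentiment examples are assumptions and do not reflect the authors' actual data pipeline.

```python
import random


def symbolize(examples, symbols, seed=0):
    """Map each natural-language label to one arbitrary symbol, consistently."""
    rng = random.Random(seed)
    labels = sorted({label for _, label in examples})
    # Sample without replacement so that distinct labels get distinct symbols.
    chosen = rng.sample(symbols, len(labels))
    mapping = dict(zip(labels, chosen))
    remapped = [(text, mapping[label]) for text, label in examples]
    return remapped, mapping


def build_prompt(pairs, query):
    """Format in-context pairs with no instruction and no natural-language label."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in pairs]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


# Hypothetical sentiment examples.
examples = [
    ("I loved this film", "positive"),
    ("A dull, lifeless plot", "negative"),
    ("Great acting throughout", "positive"),
]
pairs, mapping = symbolize(examples, ["foo", "bar", "baz"])
prompt = build_prompt(pairs, "The ending was wonderful")
```

Because the resulting prompt carries neither an instruction nor a meaningful label, the only way to answer the final query is to infer the input-label mapping from the in-context pairs, which is the intuition the abstract states.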
# 野生の顔面メッシュをアニメーション化・再ターゲティングするための神経顔装置 Neural Face Rigging for Animating and Retargeting Facial Meshes in the Wild ( http://arxiv.org/abs/2305.08296v1 ) ライセンス: Link先を確認 | Dafei Qin, Jun Saito, Noam Aigerman, Thibault Groueix, Taku Komura | (参考訳) 本稿では,野生の人間の顔の3dモデルの自動配置と再ターゲティングのためのエンドツーエンドのディープラーニング手法を提案する。
NFR(Neural Face Rigging)と呼ばれる我々のアプローチには3つの重要な特性がある。
(i) nfrの表現空間は、芸術的制御のための人間の解釈可能な編集パラメータを維持する。
様々な実験を通じて、nfrは、アーティストが制御し、編集可能なパラメータを提供しながら、既存のデータセット全体にわたって、リアルで正確な顔変形を自動生成する能力を示す。 We propose an end-to-end deep-learning approach for automatic rigging and retargeting of 3D models of human faces in the wild. Our approach, called Neural Face Rigging (NFR), holds three key properties: (i) NFR's expression space maintains human-interpretable editing parameters for artistic controls; (ii) NFR is readily applicable to arbitrary facial meshes with different connectivity and expressions; (iii) NFR can encode and produce fine-grained details of complex expressions performed by arbitrary subjects. To the best of our knowledge, NFR is the first approach to provide realistic and controllable deformations of in-the-wild facial meshes, without the manual creation of blendshapes or correspondence. We design a deformation autoencoder and train it through a multi-dataset training scheme, which benefits from the unique advantages of two data sources: a linear 3DMM with interpretable control parameters as in FACS, and 4D captures of real faces with fine-grained details. Through various experiments, we show NFR's ability to automatically produce realistic and accurate facial deformations across a wide range of existing datasets as well as noisy facial scans in-the-wild, while providing artist-controlled, editable parameters. | 翻訳日:2023-05-16 16:28:35 公開日:2023-05-15 |
# CLCIFAR: 注釈付き補完ラベルを用いたCIFAR-Derivedベンチマークデータセット CLCIFAR: CIFAR-Derived Benchmark Datasets with Human Annotated Complementary Labels ( http://arxiv.org/abs/2305.08295v1 ) ライセンス: Link先を確認 | Hsiu-Hsuan Wang, Wei-I Lin, Hsuan-Tien Lin | (参考訳) 弱教師付き学習パラダイムとして、補足ラベル学習(cll)は、インスタンスが属さないクラスである補足ラベルのみから多クラス分類を学習することを目的としている。
データセットは次のリンクからアクセスできる: https://github.com/ntucllab/complementary_cifar As a weakly-supervised learning paradigm, complementary label learning (CLL) aims to learn a multi-class classifier from only complementary labels, classes to which an instance does not belong. Although various studies have addressed how to learn from CLL, those methods typically rely on some distributional assumptions on the complementary labels, and are benchmarked only on some synthetic datasets. It remains unclear how the noise or bias arising from the human annotation process would affect those CLL algorithms. To fill the gap, we design a protocol to collect complementary labels annotated by humans. Two datasets, CLCIFAR10 and CLCIFAR20, based on CIFAR10 and CIFAR100, respectively, are collected. We analyzed the empirical transition matrices of the collected datasets, and observed that they are noisy and biased. We then performed extensive benchmark experiments on the collected datasets with various CLL algorithms to validate whether the existing algorithms can learn from the real-world complementary datasets. The dataset can be accessed with the following link: https://github.com/ntucllab/complementary_cifar. | 翻訳日:2023-05-16 16:28:04 公開日:2023-05-15 |
# ランドマークと外観を優先したアイデンティティ保存型会話顔生成 Identity-Preserving Talking Face Generation with Landmark and Appearance Priors ( http://arxiv.org/abs/2305.08293v1 ) ライセンス: Link先を確認 | Weizhi Zhong, Chaowei Fang, Yinqi Cai, Pengxu Wei, Gangming Zhao, Liang Lin, Guanbin Li | (参考訳) 音声から会話の顔ビデオを生成することは、多くの研究の関心を集めている。
大規模な実験により,本手法は既存の対面生成法よりも現実的で,リップシンクで,アイデンティティを保った動画を作成できることが示された。 Generating talking face videos from audio attracts lots of research interest. A few person-specific methods can generate vivid videos but require the target speaker's videos for training or fine-tuning. Existing person-generic methods have difficulty in generating realistic and lip-synced videos while preserving identity information. To tackle this problem, we propose a two-stage framework consisting of audio-to-landmark generation and landmark-to-video rendering procedures. First, we devise a novel Transformer-based landmark generator to infer lip and jaw landmarks from the audio. Prior landmark characteristics of the speaker's face are employed to make the generated landmarks coincide with the facial outline of the speaker. Then, a video rendering model is built to translate the generated landmarks into face images. During this stage, prior appearance information is extracted from the lower-half occluded target face and static reference images, which helps generate realistic and identity-preserving visual content. For effectively exploring the prior information of static reference images, we align static reference images with the target face's pose and expression based on motion fields. Moreover, auditory features are reused to guarantee that the generated face images are well synchronized with the audio. Extensive experiments demonstrate that our method can produce more realistic, lip-synced, and identity-preserving videos than existing person-generic talking face generation methods. | 翻訳日:2023-05-16 16:27:46 公開日:2023-05-15 |
# 大規模言語モデルに導かれるTree-of-Thought Large Language Model Guided Tree-of-Thought ( http://arxiv.org/abs/2305.08291v1 ) ライセンス: Link先を確認 | Jieyi Long | (参考訳) 本稿では,自己回帰型大規模言語モデル(LLM)の問題解決能力を向上させるための新しいアプローチであるTree-of-Thought(ToT)フレームワークを紹介する。
ToT をソフトウェアシステムとして実装するために,プロンプトエージェント,チェッカーモジュール,メモリモジュール,ToT コントローラなどの追加モジュールを LLM に追加する。
提案手法の有効性を検証するため,ToTを用いたSudoku Puzzleの解法を実装した。
ToTベースのSudokuソルバの実装はGitHub( https://github.com/jieyilong/tree-of-thought-puzzle-solver )で利用可能である。 In this paper, we introduce the Tree-of-Thought (ToT) framework, a novel approach aimed at improving the problem-solving capabilities of auto-regressive large language models (LLMs). The ToT technique is inspired by the human mind's approach for solving complex reasoning tasks through trial and error. In this process, the human mind explores the solution space through a tree-like thought process, allowing for backtracking when necessary. To implement ToT as a software system, we augment an LLM with additional modules including a prompter agent, a checker module, a memory module, and a ToT controller. In order to solve a given problem, these modules engage in a multi-round conversation with the LLM. The memory module records the conversation and state history of the problem solving process, which allows the system to backtrack to the previous steps of the thought-process and explore other directions from there. To verify the effectiveness of the proposed technique, we implemented a ToT-based solver for the Sudoku Puzzle. Experimental results show that the ToT framework can significantly increase the success rate of Sudoku puzzle solving. Our implementation of the ToT-based Sudoku solver is available on GitHub: https://github.com/jieyilong/tree-of-thought-puzzle-solver. | 翻訳日:2023-05-16 16:27:24 公開日:2023-05-15 |
# SWAN: テキスト会話システム監査のためのジェネリックフレームワーク SWAN: A Generic Framework for Auditing Textual Conversational Systems ( http://arxiv.org/abs/2305.08290v1 ) ライセンス: Link先を確認 | Tetsuya Sakai | (参考訳) 本稿では,会話セッションのサンプルを入力として,テキスト対話システムの監査を行うためのシンプルで汎用的なフレームワークを提案する。
このフレームワークは、会話セッションから抽出されたナゲットシーケンスに基づいてSWAN(Schematized Weighted Average Nugget)スコアを算出する。
この論文はICTIR 2023の基調講演(2023年7月23日発表予定)の準備中に書かれた。 We present a simple and generic framework for auditing a given textual conversational system, given some samples of its conversation sessions as its input. The framework computes a SWAN (Schematised Weighted Average Nugget) score based on nugget sequences extracted from the conversation sessions. Following the approaches of S-measure and U-measure, SWAN utilises nugget positions within the conversations to weight the nuggets based on a user model. We also present a schema of twenty (+1) criteria that may be worth incorporating in the SWAN framework. In our future work, we plan to devise conversation sampling methods that are suitable for the various criteria, construct seed user turns for comparing multiple systems, and validate specific instances of SWAN for the purpose of preventing negative impacts of conversational systems on users and society. This paper was written while preparing for the ICTIR 2023 keynote (to be given on July 23, 2023). | 翻訳日:2023-05-16 16:27:04 公開日:2023-05-15 |
# 劣化雑音下でのマルチパラメータ推定のための変分量子メトロジー Variational quantum metrology for multiparameter estimation under dephasing noise ( http://arxiv.org/abs/2305.08289v1 ) ライセンス: Link先を確認 | Trung Kien Le and Hung Q. Nguyen and Le Bin Ho | (参考訳) 本稿では,量子力学の精度を高めるために,ハイブリッド量子古典変分法を提案する。
実際、全てのパラメータを同時に推定し、標準の量子限界を超える能力を示し、メトロロジー応用のための強力なツールである。 We present a hybrid quantum-classical variational scheme to enhance precision in quantum metrology. In the scheme, both the initial state and the measurement basis in the quantum part are parameterized and optimized via the classical part. It enables the maximization of information gained about the measured quantity. We discuss specific applications to 3D magnetic field sensing under several dephasing noise modes. Indeed, we demonstrate its ability to simultaneously estimate all parameters and surpass the standard quantum limit, making it a powerful tool for metrological applications. | 翻訳日:2023-05-16 16:26:49 公開日:2023-05-15 |
# Train/Test重複排除を備えたJavaメソッドの言語モデル A Language Model of Java Methods with Train/Test Deduplication ( http://arxiv.org/abs/2305.08286v1 ) ライセンス: Link先を確認 | Chia-Yi Su, Aakash Bansal, Vijayanta Jain, Sepideh Ghanavati, Collin Mcmillan | (参考訳) このツールデモンストレーションでは、Javaソースコードの言語モデルのための研究ツールキットを紹介する。
私たちはすべてのツールとデータをオープンソースとし、Hugging FaceとGitHubで公開している。 This tool demonstration presents a research toolkit for a language model of Java source code. The target audience includes researchers studying problems at the granularity level of subroutines, statements, or variables in Java. In contrast to many existing language models, we prioritize features for researchers including an open and easily-searchable training set, a held out test set with different levels of deduplication from the training set, infrastructure for deduplicating new examples, and an implementation platform suitable for execution on equipment accessible to a relatively modest budget. Our model is a GPT2-like architecture with 350m parameters. Our training set includes 52m Java methods (9b tokens) and 13m StackOverflow threads (10.5b tokens). To improve accessibility of research to more members of the community, we limit local resource requirements to GPUs with 16GB video memory. We provide a test set of held out Java methods that include descriptive comments, including the entire Java projects for those methods. We also provide deduplication tools using precomputed hash tables at various similarity thresholds to help researchers ensure that their own test examples are not in the training set. We make all our tools and data open source and available via Hugging Face and GitHub. | 翻訳日:2023-05-16 16:26:43 公開日:2023-05-15 |
# ディープラーニングを用いたスクリーントーンアウェアマンガ超解像 Screentone-Aware Manga Super-Resolution Using DeepLearning ( http://arxiv.org/abs/2305.08325v1 ) ライセンス: Link先を確認 | Chih-Yuan Yao, Husan-Ting Chou, Yu-Sheng Lin, Kuo-wei Chen | (参考訳) マンガは世界中で広く愛されている娯楽であり、ハンドヘルドデバイスの普及に伴い、紙から電子スクリーンへと変化してきた。
本稿では,まず深層学習アルゴリズムを用いてマンガ内の異なるスクリーントーンの領域と線を分類し,次に各ブロックの分類結果に応じた超解像モデルで品質を向上させ,最後にそれらを結合することで,画像の解像度を改善しつつ,マンガのスクリーントーンと線の意味を維持した画像を得る。 Manga, as a widely beloved form of entertainment around the world, have shifted from paper to electronic screens with the proliferation of handheld devices. However, as the demand for image quality increases with screen development, high-quality images can hinder transmission and affect the viewing experience. Traditional vectorization methods require a significant amount of manual parameter adjustment to process screentone. Using deep learning, lines and screentone can be automatically extracted and image resolution can be enhanced. Super-resolution can convert low-resolution images to high-resolution images while maintaining low transmission rates and providing high-quality results. However, traditional Super Resolution methods for improving manga resolution do not consider the meaning of screentone density, resulting in changes to screentone density and loss of meaning. In this paper, we aim to address this issue by first classifying the regions and lines of different screentone in the manga using a deep learning algorithm, then using corresponding super-resolution models for quality enhancement based on the different classifications of each block, and finally combining them to obtain images that maintain the meaning of screentone and lines in the manga while improving image resolution. | 翻訳日:2023-05-16 16:21:09 公開日:2023-05-15 |
# C-Eval: ファンデーションモデルのためのマルチレベル中国語評価スイート C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models ( http://arxiv.org/abs/2305.08322v1 ) ライセンス: Link先を確認 | Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Fanchao Qi, Yao Fu, Maosong Sun, Junxian He | (参考訳) 新しいNLPベンチマークは、大規模言語モデル(LLM)の急速な開発に合わせて緊急に必要である。
C-EvalにはC-Eval Hardが付属する。これは、解くのに高度な推論能力を必要とする、C-Eval内の非常に難しい科目のサブセットである。
c-evalはファンデーションモデルの重要な強みと欠点を分析し、中国ユーザーの開発と成長を促進するのに役立つと予測している。 New NLP benchmarks are urgently needed to align with the rapid development of large language models (LLMs). We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context. C-Eval comprises multiple-choice questions across four difficulty levels: middle school, high school, college, and professional. The questions span 52 diverse disciplines, ranging from humanities to science and engineering. C-Eval is accompanied by C-Eval Hard, a subset of very challenging subjects in C-Eval that requires advanced reasoning abilities to solve. We conduct a comprehensive evaluation of the most advanced LLMs on C-Eval, including both English- and Chinese-oriented models. Results indicate that only GPT-4 could achieve an average accuracy of over 60%, suggesting that there is still significant room for improvement for current LLMs. We anticipate C-Eval will help analyze important strengths and shortcomings of foundation models, and foster their development and growth for Chinese users. | 翻訳日:2023-05-16 16:20:46 公開日:2023-05-15 |
# 量子超越性の最も単純なモデルに向けて:ボックストラップにおける原子ボソンサンプリング Towards the simplest model of quantum supremacy: Atomic boson sampling in a box trap ( http://arxiv.org/abs/2305.08320v1 ) ライセンス: Link先を確認 | V. V. Kocharovsky, Vl. V. Kocharovsky, W. D. Shannon, S. V. Tarasov | (参考訳) ボックストラップに閉じ込められたボース・アインシュタイン凝縮(BEC)ガスの非凝縮成分からの相互作用する原子のボソンサンプリングを,量子多体系の計算論的な#P困難性を研究するための新しいプラットフォームとして記述する。
これに基づき,BECガスを用いた実験において,古典計算に対するボソンサンプリングの量子超越性をいかに顕在化させるかを論じる。 We describe boson sampling of interacting atoms from the noncondensed fraction of Bose-Einstein-condensed (BEC) gas confined in a box trap as a new platform for studying computational #P-hardness of quantum many-body systems. In this case the theory becomes really simple and transparent. We calculate analytically the characteristic function and statistics of the excited-state atom occupations in the Bogoliubov approximation by means of the newly found hafnian master theorem and show their analogy to those of the Gaussian boson sampling of noninteracting photons in a linear interferometer. Importantly, due to interatomic interactions, the squeezing and interference of atom excited states, both of which are necessary for the computational #P-hardness of boson sampling, are self-generated in the gas and require neither sophisticated external sources of bosons in squeezed states nor controlled couplers, beam splitters and phase shifters needed for boson sampling in optical interferometers. On this basis, we discuss how to get manifestations of quantum supremacy of boson sampling over classical computing in the experiments with BEC gas. | 翻訳日:2023-05-16 16:20:29 公開日:2023-05-15 |
# 有限トレース上の合成の戦略について On Strategies in Synthesis Over Finite Traces ( http://arxiv.org/abs/2305.08319v1 ) ライセンス: Link先を確認 | Suguman Bansal and Yong Li and Lucas Martinelli Tabajara and Moshe Y. Vardi and Andrew Wells | (参考訳) 有限トレース上の線形時相論理(LTLf)からのリアクティブ合成における革新は、LTLf合成ツールが生成する戦略の正しさを検証する能力によって増幅される。
これが、LTLfモデル検査に対する我々の取り組みの動機である。
LTLf合成によって生成される戦略は、停止性(terminating)トランスデューサまたは非停止性(non-terminating)トランスデューサを用いて表現でき、その実行はそれぞれ有限だが非有界な長さ、または無限の長さを持つ。
中心となる結果は、非停止性トランスデューサのLTLfモデル検査が、停止性トランスデューサの場合よりも指数的に困難であるということである。
これらの問題はそれぞれEXPSPACE完全とPSPACE完全である。
これは、私たちの知る限り、LTLf合成において一方のトランスデューサを他方より優先して用いるべきことを示す最初の証拠である。 The innovations in reactive synthesis from Linear Temporal Logics over finite traces (LTLf) will be amplified by the ability to verify the correctness of the strategies generated by LTLf synthesis tools. This motivates our work on LTLf model checking. LTLf model checking, however, is not straightforward. The strategies generated by LTLf synthesis may be represented using terminating transducers or non-terminating transducers where executions are of finite-but-unbounded length or infinite length, respectively. For synthesis, there is no evidence that one type of transducer is better than the other since they both demonstrate the same complexity and similar algorithms. In this work, we show that for model checking, the two types of transducers are fundamentally different. Our central result is that LTLf model checking of non-terminating transducers is exponentially harder than that of terminating transducers. We show that the problems are EXPSPACE-complete and PSPACE-complete, respectively. Hence, considering the feasibility of verification, LTLf synthesis tools should synthesize terminating transducers. This is, to the best of our knowledge, the first evidence to use one transducer over the other in LTLf synthesis. | 翻訳日:2023-05-16 16:20:06 公開日:2023-05-15 |
# 自動車再配置のためのCMSGクロスメディアセマンティックグラフ特徴マッチングアルゴリズム CMSG Cross-Media Semantic-Graph Feature Matching Algorithm for Autonomous Vehicle Relocalization ( http://arxiv.org/abs/2305.08318v1 ) ライセンス: Link先を確認 | Shuhang Tan, Hengyu Liu, Zhiling Wang | (参考訳) 再局在化は地図に基づく局在化アルゴリズムの基礎である。
その結果,CMSGはNVIDIA 1080 Ti GPU上で25FPSの速度で現行の単一センサ方式と比較して,同等あるいはそれ以上の精度で動作可能であることがわかった。 Relocalization is the basis of map-based localization algorithms. Camera and LiDAR map-based methods are pervasive since their robustness under different scenarios. Generally, mapping and localization using the same sensor have better accuracy since matching features between the same type of data is easier. However, due to the camera's lack of 3D information and the high cost of LiDAR, cross-media methods are developing, which combined live image data and Lidar map. Although matching features between different media is challenging, we believe cross-media is the tendency for AV relocalization since its low cost and accuracy can be comparable to the same-sensor-based methods. In this paper, we propose CMSG, a novel cross-media algorithm for AV relocalization tasks. Semantic features are utilized for better interpretation the correlation between point clouds and image features. What's more, abstracted semantic graph nodes are introduced, and a graph network architecture is integrated to better extract the similarity of semantic features. Validation experiments are conducted on the KITTI odometry dataset. Our results show that CMSG can have comparable or even better accuracy compared to current single-sensor-based methods at a speed of 25 FPS on NVIDIA 1080 Ti GPU. | 翻訳日:2023-05-16 16:19:42 公開日:2023-05-15 |
# semignn-ppi:効率良く汎用的なタンパク質-タンパク質相互作用予測のための自己センシングマルチグラフニューラルネットワーク SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network for Efficient and Generalizable Protein-Protein Interaction Prediction ( http://arxiv.org/abs/2305.08316v1 ) ライセンス: Link先を確認 | Ziyuan Zhao, Peisheng Qian, Xulei Yang, Zeng Zeng, Cuntai Guan, Wai Leong Tam, Xiaoli Li | (参考訳) タンパク質とタンパク質の相互作用(PPI)は様々な生物学的プロセスにおいて重要であり、その研究は薬物開発や疾患の診断に重要な意味を持つ。
我々はさらに、GNNとMean Teacherと結婚し、ラベルなしグラフ構造化PPIデータを自己アンサンブルグラフ学習に効果的に活用する。
異なる評価設定の異なるスケールのPPIデータセットに対する大規模な実験は、SemiGNN-PPIが最先端のPPI予測手法よりも優れていることを示している。 Protein-protein interactions (PPIs) are crucial in various biological processes and their study has significant implications for drug development and disease diagnosis. Existing deep learning methods suffer from significant performance degradation under complex real-world scenarios due to various factors, e.g., label scarcity and domain shift. In this paper, we propose a self-ensembling multigraph neural network (SemiGNN-PPI) that can effectively predict PPIs while being both efficient and generalizable. In SemiGNN-PPI, we not only model the protein correlations but explore the label dependencies by constructing and processing multiple graphs from the perspectives of both features and labels in the graph learning process. We further marry GNN with Mean Teacher to effectively leverage unlabeled graph-structured PPI data for self-ensemble graph learning. We also design multiple graph consistency constraints to align the student and teacher graphs in the feature embedding space, enabling the student model to better learn from the teacher model by incorporating more relationships. Extensive experiments on PPI datasets of different scales with different evaluation settings demonstrate that SemiGNN-PPI outperforms state-of-the-art PPI prediction methods, particularly in challenging scenarios such as training with limited annotations and testing on unseen data. | 翻訳日:2023-05-16 16:19:22 公開日:2023-05-15 |
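The Mean Teacher component mentioned in the abstract above can be illustrated with a minimal sketch. In the standard Mean Teacher scheme the teacher's weights are an exponential moving average (EMA) of the student's weights. The function name `ema_update`, the flat-list weight representation, and the decay value are assumptions for illustration only; this is not the SemiGNN-PPI implementation, which additionally uses graph consistency constraints not shown here.

```python
from typing import List


def ema_update(teacher: List[float], student: List[float], decay: float = 0.99) -> List[float]:
    """Return new teacher weights as an EMA of the student weights.

    Hypothetical helper: weights are modelled as flat lists of floats.
    new_teacher = decay * teacher + (1 - decay) * student
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of weights")
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


if __name__ == "__main__":
    teacher = [1.0, 0.0]
    student = [0.0, 1.0]
    # After each student training step, the teacher drifts slowly toward the student.
    for _ in range(3):
        teacher = ema_update(teacher, student, decay=0.9)
    print(teacher)
```

A decay close to 1 makes the teacher change slowly, which is what lets it act as a smoothed ensemble of recent student states.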
# 開量子系におけるLiouville-Majoranaモードの散逸 Dissipation induced Liouville-Majorana modes in open quantum system ( http://arxiv.org/abs/2305.08311v1 ) ライセンス: Link先を確認 | Xing-Shuo Xu, Xiang-Fa Zhou, Guang-Can Guo, and Zheng-Wei Zhou | (参考訳) オープンシステムでは、トポロジカルエッジ状態はコヒーレンスを失い、トポロジカル量子計算や量子メモリでは利用できない。
この研究は、量子ジャンプによって引き起こされるオープンシステムの新しい安定な位相状態を探す新しい道を開く。 In open systems, topological edge states quickly lose coherence and cannot be used in topological quantum computation and quantum memory. Here we show that for dissipative quantum spin (or fermionic) systems, topologically non-Hermitian Liouville-Majorana edge modes (LMEMs) can survive in the extended Liouville-Fock space, which is beyond the scope of topological modes defined in usual Hermitian system. By vectorizing the Lindblad equation of the system using the third quantization, we prove that it reduces to a series of non-Hermitian Kitaev chains in the extended Liouville-Fock space, and topologically LMEMs are protected due to its internal symmetry. Furthermore, we provide an explicit method for detecting these modes and prove that the purity of the density matrix characterizes the long-range correlation of LMEMs. The work opens new avenues of searching for novel stable topological states in open systems induced by quantum jumps. | 翻訳日:2023-05-16 16:18:54 公開日:2023-05-15 |
# fusion blossom: qec用の高速mwpmデコーダ Fusion Blossom: Fast MWPM Decoders for QEC ( http://arxiv.org/abs/2305.08307v1 ) ライセンス: Link先を確認 | Yue Wu and Lin Zhong | (参考訳) Minimum-Weight Perfect Matching (MWPM) デコーダは量子エラー訂正(QEC)デコーダで広く使われている。
私たちはparity blossomと呼ばれる高速なmwpmデコーダを設計し、実装しました。
さらに,parity blossom の並列版である fusion blossom の設計と実装を行った。
実際の回路レベルのノイズが0.1%になると、Fusion Blossomは毎秒100万回の計測ラウンドをコード距離33までデコードできる。
fusion blossomは、測定ラウンドに関係なく、コード距離21で0.7msデコーディングレイテンシに達するストリームデコーディングモードもサポートする。 The Minimum-Weight Perfect Matching (MWPM) decoder is widely used in Quantum Error Correction (QEC) decoding. Despite its high accuracy, existing implementations of the MWPM decoder cannot catch up with quantum hardware, e.g., 1 million measurements per second for superconducting qubits. They suffer from a backlog of measurements that grows exponentially and as a result, cannot realize the power of quantum computation. We design and implement a fast MWPM decoder, called Parity Blossom, which reaches a time complexity almost proportional to the number of defect measurements. We further design and implement a parallel version of Parity Blossom called Fusion Blossom. Given a practical circuit-level noise of 0.1%, Fusion Blossom can decode a million measurement rounds per second up to a code distance of 33. Fusion Blossom also supports stream decoding mode that reaches a 0.7 ms decoding latency at code distance 21 regardless of the measurement rounds. | 翻訳日:2023-05-16 16:18:34 公開日:2023-05-15 |
# マイクロ波-光量子界面のための色中心を持つダイヤモンド光学キャビティの設計 Design of a diamond optomechanical cavity with a color center for microwave-to-optical quantum interfaces ( http://arxiv.org/abs/2305.08306v1 ) ライセンス: Link先を確認 | Byunggi Kim, Hodaka Kurokawa, Hideo Kosaka, and Masahiro Nomura | (参考訳) マイクロ波と光子間の量子伝達は、リモート量子コンピューティング量子ビット間の量子通信において重要な役割を果たす。
このオプティメカルキャビティは、集光器キャビティ領域付近に窒化アルミニウム(AlN)パッド圧電カプラを内蔵し、超小径のメカニカルモードおよび光学モードボリュームは、それぞれ$\sim 1.5 \times 10^{-4}\,(\Lambda_p)^3$および$\sim 0.2\,(\lambda/n)^3$を保持する。
我々の量子変換方式は、様々な距離の量子ネットワークに信頼できるプラットフォームを提供します。 Quantum transduction between microwave and optical photons holds a key role in quantum communications among remote quantum computing qubits. Although the quantum transduction schemes generating communication photons have been successfully demonstrated by using optomechanical interfaces, the low conversion efficiency remains an obstacle to the implementation of a quantum network consisting of multiple qubits. Here, we present an efficient quantum transduction scheme using a one-dimensional (1D) diamond optomechanical cavity tuned at a color-center emission without the optomechanical coupling. The optomechanical cavity incorporates a thin aluminum nitride (AlN) pad piezoelectric coupler near the concentrator cavity region, while keeping the ultrasmall mechanical and optical mode-volumes of $\sim 1.5 \times 10^{-4}\,(\Lambda_p)^3$ and $\sim 0.2\,(\lambda/n)^3$, respectively. Energy level of a coherent color-center electron is manipulated with a strong mechanical mode-color center coupling rate up to 16.4 MHz. In our system, we theoretically predict that the conversion efficiency from a single microwave photon into an optical photon can reach 15%. Our quantum transduction scheme will offer a reliable platform for quantum networks in the various range of distances. | 翻訳日:2023-05-16 16:18:19 公開日:2023-05-15 |
# 次世代トランシーバの深部展開 Deep-Unfolding for Next-Generation Transceivers ( http://arxiv.org/abs/2305.08303v1 ) ライセンス: Link先を確認 | Qiyu Hu, Yunlong Cai, Guangyi Zhang, Guanding Yu, Geoffrey Ye Li | (参考訳) 超高データレート、極端に高い信頼性、低レイテンシといった将来のワイヤレスネットワークのパフォーマンス要件は、次世代マルチインプット多重出力(mimo)トランスシーバの定義に関する世界的な研究を刺激している。
さらに,今後の研究におけるオープンな課題が強調されている。 The stringent performance requirements of future wireless networks, such as ultra-high data rates, extremely high reliability and low latency, are spurring worldwide studies on defining the next-generation multiple-input multiple-output (MIMO) transceivers. For the design of advanced transceivers in wireless communications, optimization approaches often leading to iterative algorithms have achieved great success for MIMO transceivers. However, these algorithms generally require a large number of iterations to converge, which entails considerable computational complexity and often requires fine-tuning of various parameters. With the development of deep learning, approximating the iterative algorithms with deep neural networks (DNNs) can significantly reduce the computational time. However, DNNs typically lead to black-box solvers, which requires amounts of data and extensive training time. To further overcome these challenges, deep-unfolding has emerged which incorporates the benefits of both deep learning and iterative algorithms, by unfolding the iterative algorithm into a layer-wise structure analogous to DNNs. In this article, we first go through the framework of deep-unfolding for transceiver design with matrix parameters and its recent advancements. Then, some endeavors in applying deep-unfolding approaches in next-generation advanced transceiver design are presented. Moreover, some open issues for future research are highlighted. | 翻訳日:2023-05-16 16:17:54 公開日:2023-05-15 |
# 多人数対話読解のための参照型二重チャネル注意ネットワーク Coreference-aware Double-channel Attention Network for Multi-party Dialogue Reading Comprehension ( http://arxiv.org/abs/2305.08348v1 ) ライセンス: Link先を確認 | Yanling Li, Bowei Zou, Yifan Fan, Mengxing Dong, Yu Hong | (参考訳) MDRC(Multi-party Dialogue Reading Comprehension)に挑戦する。
提案手法は細調整したBERT および ELECTRA ベースラインと比較して, 両コーパスの大幅な改善が得られた。
最大パフォーマンスゲインは約2.5% F1スコアである。
MDRCモデルは、ほとんどの場合、最先端のモデルよりも優れています。 We tackle Multi-party Dialogue Reading Comprehension (abbr., MDRC). MDRC stands for an extractive reading comprehension task grounded on a batch of dialogues among multiple interlocutors. It is challenging due to the requirement of understanding cross-utterance contexts and relationships in a multi-turn multi-party conversation. Previous studies have made great efforts on the utterance profiling of a single interlocutor and graph-based interaction modeling. The corresponding solutions contribute to the answer-oriented reasoning on a series of well-organized and thread-aware conversational contexts. However, the current MDRC models still suffer from two bottlenecks. On the one hand, a pronoun like "it" most probably produces multi-skip reasoning throughout the utterances of different interlocutors. On the other hand, an MDRC encoder is potentially puzzled by fuzzy features, i.e., the mixture of inner linguistic features in utterances and external interactive features among utterances. To overcome the bottlenecks, we propose a coreference-aware attention modeling method to strengthen the reasoning ability. In addition, we construct a two-channel encoding network. It separately encodes utterance profiles and interactive relationships, so as to relieve the confusion among heterogeneous features. We experiment on the benchmark corpora Molweni and FriendsQA. Experimental results demonstrate that our approach yields substantial improvements on both corpora, compared to the fine-tuned BERT and ELECTRA baselines. The maximum performance gain is about 2.5% F1-score. Besides, our MDRC models outperform the state-of-the-art in most cases. | 翻訳日:2023-05-16 16:11:22 公開日:2023-05-15 |
# kepr: ジェネレーティブ・コモンセンス質問応答における知識の強化と可能性ランキング KEPR: Knowledge Enhancement and Plausibility Ranking for Generative Commonsense Question Answering ( http://arxiv.org/abs/2305.08347v1 ) ライセンス: Link先を確認 | Zhifeng Li and Bowei Zou and Yifan Fan and Yu Hong | (参考訳) gencqa(generative commonsense question answering)は、質問に対して回答のリストを自動的に生成するタスクである。
そこで本稿では,Generate-Then-Rankパイプラインアーキテクチャに基づくKEPR(Knowledge Enhancement and Plausibility Ranking)アプローチを提案する。
具体的には、キーワードのWiktionary Commonsense知識の観点から質問を拡張し、正規化パターンで修正する。
関連する知識の取得にデンスパス検索を用い、回答を生成するために異なるPLM(BART, GPT2, T5)ネットワークを使用する。
実験したモデルの中では、KEPRを用いたT5ベースのGenCQAが最高の性能を示し、主要な標準評価指標であるInc@3で最大60.91%に達した。
ProtoQAの現在のリーダーボードでは、既存のGenCQAモデルよりも優れています。 Generative commonsense question answering (GenCQA) is a task of automatically generating a list of answers given a question. The answer list is required to cover all reasonable answers. This presents the considerable challenges of producing diverse answers and ranking them properly. Incorporating a variety of closely-related background knowledge into the encoding of questions enables the generation of different answers. Meanwhile, learning to distinguish positive answers from negative ones potentially enhances the probabilistic estimation of plausibility, and accordingly, the plausibility-based ranking. Therefore, we propose a Knowledge Enhancement and Plausibility Ranking (KEPR) approach grounded on the Generate-Then-Rank pipeline architecture. Specifically, we expand questions in terms of Wiktionary commonsense knowledge of keywords, and reformulate them with normalized patterns. Dense passage retrieval is utilized for capturing relevant knowledge, and different PLM-based (BART, GPT2 and T5) networks are used for generating answers. On the other hand, we develop an ELECTRA-based answer ranking model, where logistic regression is conducted during training, with the aim of approximating different levels of plausibility in a polar classification scenario. Extensive experiments on the benchmark ProtoQA show that KEPR obtains substantial improvements, compared to the strong baselines. Within the experimental models, the T5-based GenCQA with KEPR obtains the best performance, which is up to 60.91% at the primary canonical metric Inc@3. It outperforms the existing GenCQA models on the current leaderboard of ProtoQA. | 翻訳日:2023-05-16 16:11:03 公開日:2023-05-15 |
# ラベル強化による補足学習におけるラベル共有効率の向上 Enhancing Label Sharing Efficiency in Complementary-Label Learning with Label Augmentation ( http://arxiv.org/abs/2305.08344v1 ) ライセンス: Link先を確認 | Wei-I Lin, Gang Niu, Hsuan-Tien Lin, Masashi Sugiyama | (参考訳) 補足ラベル学習(cll)は、特定のインスタンスが属さないクラスである補足ラベルのみを使用して通常の分類器を訓練する、弱い教師付き学習の一形態である。
実験結果から,従来のCLLモデルよりも相補的ラベル拡張により経験的性能が向上することが確認された。 Complementary-label Learning (CLL) is a form of weakly supervised learning that trains an ordinary classifier using only complementary labels, which are the classes that certain instances do not belong to. While existing CLL studies typically use novel loss functions or training techniques to solve this problem, few studies focus on how complementary labels collectively provide information to train the ordinary classifier. In this paper, we fill the gap by analyzing the implicit sharing of complementary labels on nearby instances during training. Our analysis reveals that the efficiency of implicit label sharing is closely related to the performance of existing CLL models. Based on this analysis, we propose a novel technique that enhances the sharing efficiency via complementary-label augmentation, which explicitly propagates additional complementary labels to each instance. We carefully design the augmentation process to enrich the data with new and accurate complementary labels, which provide CLL models with fresh and valuable information to enhance the sharing efficiency. We then verify our proposed technique by conducting thorough experiments on both synthetic and real-world datasets. Our results confirm that complementary-label augmentation can systematically improve empirical performance over state-of-the-art CLL models. | 翻訳日:2023-05-16 16:10:36 公開日:2023-05-15 |
# データから物理法則を発見する有限表現法 Finite Expression Methods for Discovering Physical Laws from Data ( http://arxiv.org/abs/2305.08342v1 ) ライセンス: Link先を確認 | Zhongyi Jiang and Chunmei Wang and Haizhao Yang | (参考訳) 非線形力学は様々な科学的・工学的な分野において広く見られる現象である。
FEXは時間依存型PDE問題や時間変動係数を持つ非線形力学系を含む様々な問題において,既存の手法(PDE-Net, SINDy, GP, SPL)よりも優れた性能を示した。
さらに、FEXは、低メモリと良好な時間複雑性を維持しながら、シンボル支配方程式を正確に近似する柔軟性と表現力を示した。 Nonlinear dynamics is a pervasive phenomenon observed in various scientific and engineering disciplines. However, uncovering analytical expressions that describe nonlinear dynamics from limited data remains a challenging and essential task. In this paper, we propose a new deep symbolic learning method called the ``finite expression method'' (FEX) to identify the governing equations within the space of functions containing a finite set of analytic expressions, based on observed dynamic data. The core idea is to leverage FEX to generate analytical expressions of the governing equations by learning the derivatives of partial differential equation (PDE) solutions using convolutions. Our numerical results demonstrate that FEX outperforms all existing methods (such as PDE-Net, SINDy, GP, and SPL) in terms of numerical performance across various problems, including time-dependent PDE problems and nonlinear dynamical systems with time-varying coefficients. Furthermore, the results highlight that FEX exhibits flexibility and expressive power in accurately approximating symbolic governing equations, while maintaining low memory and favorable time complexity. | 翻訳日:2023-05-16 16:10:13 公開日:2023-05-15 |
# コーパス言語学におけるLLM補助アノテーションの使用:局所文法解析を事例として Using LLM-assisted Annotation for Corpus Linguistics: A Case Study of Local Grammar Analysis ( http://arxiv.org/abs/2305.08339v1 ) ライセンス: Link先を確認 | Danni Yu, Luyang Li, Hang Su | (参考訳) 大規模言語モデル(LLM)に基づくチャットボットは、言語理解において強力な能力を示している。
その結果, Bing チャットボットはタスクにおける ChatGPT を著しく上回った。
そこで本研究では,llm支援アノテーションがコーパス研究に有望な自動アプローチであることを示す。 Chatbots based on Large Language Models (LLMs) have shown strong capabilities in language understanding. In this study, we explore the potential of LLMs in assisting corpus-based linguistic studies through automatic annotation of texts with specific categories of linguistic information. Specifically, we examined to what extent LLMs understand the functional elements constituting the speech act of apology from a local grammar perspective, by comparing the performance of ChatGPT (powered by GPT-3.5), Bing chatbot (powered by GPT-4), and a human coder in the annotation task. The results demonstrate that Bing chatbot significantly outperformed ChatGPT in the task. Compared to human annotator, the overall performance of Bing chatbot was slightly less satisfactory. However, it already achieved high F1 scores: 99.95% for the tag of APOLOGISING, 91.91% for REASON, 95.35% for APOLOGISER, 89.74% for APOLOGISEE, and 96.47% for INTENSIFIER. Therefore, we propose that LLM-assisted annotation is a promising automated approach for corpus studies. | 翻訳日:2023-05-16 16:09:54 公開日:2023-05-15 |
# ニューラルボルツマンマシン Neural Boltzmann Machines ( http://arxiv.org/abs/2305.08337v1 ) ライセンス: Link先を確認 | Alex H. Lang, Anton D. Loukianov, and Charles K. Fisher | (参考訳) 条件生成モデルは、コンテキスト情報を入力として使用して、新しい想像的出力を生成することができる。
条件付き制限ボルツマンマシン(英: Conditional Restricted Boltzmann Machines, CRBM)は、ノイズの多い離散的または連続的なデータのモデリングに特に適していることが証明された条件付き生成モデルの一種であるが、CRBMの表現力の不足が広範な普及を妨げてきた。
特に,ガウシアン・ベルヌーリ crbms に問題を引き起こした正規分布データを用いて,nbms の有用性を示す。
結果の再現コードは https://github.com/unlearnai/neural-boltzmann-machines で確認できます。 Conditional generative models are capable of using contextual information as input to create new imaginative outputs. Conditional Restricted Boltzmann Machines (CRBMs) are one class of conditional generative models that have proven to be especially adept at modeling noisy discrete or continuous data, but the lack of expressivity in CRBMs have limited their widespread adoption. Here we introduce Neural Boltzmann Machines (NBMs) which generalize CRBMs by converting each of the CRBM parameters to their own neural networks that are allowed to be functions of the conditional inputs. NBMs are highly flexible conditional generative models that can be trained via stochastic gradient descent to approximately maximize the log-likelihood of the data. We demonstrate the utility of NBMs especially with normally distributed data which has historically caused problems for Gaussian-Bernoulli CRBMs. Code to reproduce our results can be found at https://github.com/unlearnai/neural-boltzmann-machines. | 翻訳日:2023-05-16 16:09:34 公開日:2023-05-15 |
# 物理およびニューラルレンダラを用いた半透明物体の逆レンダリング Inverse Rendering of Translucent Objects using Physical and Neural Renderers ( http://arxiv.org/abs/2305.08336v1 ) ライセンス: Link先を確認 | Chenhao Li, Trung Thanh Ngo, Hajime Nagahara | (参考訳) 本研究では,半透明物体の1対の撮像画像のみから,3次元形状,空間的反射率,均質な地下散乱パラメータ,および環境照明を共同で推定する逆レンダリングモデルを提案する。
合成データと実世界のデータセットの質的および定量的結果から,提案モデルの有効性が示された。 In this work, we propose an inverse rendering model that estimates 3D shape, spatially-varying reflectance, homogeneous subsurface scattering parameters, and an environment illumination jointly from only a pair of captured images of a translucent object. In order to solve the ambiguity problem of inverse rendering, we use a physically-based renderer and a neural renderer for scene reconstruction and material editing. Because two renderers are differentiable, we can compute a reconstruction loss to assist parameter estimation. To enhance the supervision of the proposed neural renderer, we also propose an augmented loss. In addition, we use a flash and no-flash image pair as the input. To supervise the training, we constructed a large-scale synthetic dataset of translucent objects, which consists of 117K scenes. Qualitative and quantitative results on both synthetic and real-world datasets demonstrated the effectiveness of the proposed model. | 翻訳日:2023-05-16 16:09:10 公開日:2023-05-15 |
# 対数光円錐、遅い絡み合い成長とスクランブル、量子メモリ Logarithmic light cone, slow entanglement growth and scrambling, and quantum memory ( http://arxiv.org/abs/2305.08334v1 ) ライセンス: Link先を確認 | Yu Zeng, Alioscia Hamma, Yu-Ran Zhang, Qiang Liu, Rengang Li, Heng Fan and Wu-Ming Liu | (参考訳) 有効光円錐はリーブ・ロビンソン境界から非相対論的局所量子系に出現し、ハイゼンベルク像内の2つの時空分離作用素の指数関数的に減衰する可換ノルムとなる。
可能な方法として、llc は多体局所化の現象論的モデルから生じることができる。
量子情報処理の応用として、LLCは長寿命の量子メモリ、マクロコード距離を持つ量子コード、ユニタリ時間進化後の指数的に長い寿命をサポートする。 Effective light cones may emerge in non-relativistic local quantum systems from the Lieb-Robinson bounds, resulting in exponentially decaying commutator norms of two space-time separated operators in the Heisenberg picture. Here, we derive a mechanism for the emergence and consequences of a logarithmic light cone (LLC). As a possible way, the LLC can emerge from a phenomenological model of many-body-localization. We show that the information scrambling is logarithmically slow in the regime of the LLC. We prove that the bipartite entanglement entropy grows logarithmically with time at arbitrary finite space dimensions and for arbitrary initial pure states. As an application in quantum information processing, the LLC supports long-lived quantum memory, a quantum code with macroscopic code distance and an exponentially long lifetime after unitary time evolution. | 翻訳日:2023-05-16 16:08:44 公開日:2023-05-15 |
# FedAds: 垂直的フェデレーション学習によるプライバシー保護型CVR推定ベンチマーク FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning ( http://arxiv.org/abs/2305.08328v1 ) ライセンス: Link先を確認 | Penghui Wei, Hongjian Dou, Shaoguo Liu, Rongjun Tang, Li Liu, Liang Wang, Bo Zheng | (参考訳) コンバージョン率(CVR)推定は、ユーザーが広告をクリックすると変換イベントの確率を予測することを目的としている。
今後fedAdsベンチマークによるvFLおよびCVR推定における研究成果の恩恵を期待する。 Conversion rate (CVR) estimation aims to predict the probability of conversion event after a user has clicked an ad. Typically, online publisher has user browsing interests and click feedbacks, while demand-side advertising platform collects users' post-click behaviors such as dwell time and conversion decisions. To estimate CVR accurately and protect data privacy better, vertical federated learning (vFL) is a natural solution to combine two sides' advantages for training models, without exchanging raw data. Both CVR estimation and applied vFL algorithms have attracted increasing research attentions. However, standardized and systematical evaluations are missing: due to the lack of standardized datasets, existing studies adopt public datasets to simulate a vFL setting via hand-crafted feature partition, which brings challenges to fair comparison. We introduce FedAds, the first benchmark for CVR estimation with vFL, to facilitate standardized and systematical evaluations for vFL algorithms. It contains a large-scale real world dataset collected from Alibaba's advertising platform, as well as systematical evaluations for both effectiveness and privacy aspects of various vFL algorithms. Besides, we also explore to incorporate unaligned data in vFL to improve effectiveness, and develop perturbation operations to protect privacy well. We hope that future research work in vFL and CVR estimation benefits from the FedAds benchmark. | 翻訳日:2023-05-16 16:08:21 公開日:2023-05-15 |
# 教育メタバース環境における学習者中心分析:自然相互作用とテキストマイニングによる価値交換システムの探索 Learner-Centered Analysis in Educational Metaverse Environments: Exploring Value Exchange Systems through Natural Interaction and Text Mining ( http://arxiv.org/abs/2305.08326v1 ) ライセンス: Link先を確認 | Yun-Cheng Tsai | (参考訳) 本稿では,教育4.0と第4次産業革命に応答して,メタバースにおける自己指向学習の潜在的発展について考察する。
その発見はテキストマイニング分析によって支持され、第四次産業革命におけるメタバースの教育形成における役割の理解に寄与している。 This paper explores the potential developments of self-directed learning in the metaverse in response to Education 4.0 and the Fourth Industrial Revolution. It highlights the importance of education keeping up with technological advancements and adopting learner-centered approaches. Additionally, it focuses on exploring value exchange systems through natural interaction, text mining, and analysis. The metaverse concept extends beyond extended reality (XR) technologies, encompassing digital avatars and shared ecological value. The role of educators in exploring new technologies and leveraging text-mining techniques to enhance learning efficiency is emphasized. The metaverse is presented as a platform for value exchange, necessitating meaningful and valuable content to attract users. Integrating virtual and real-world experiences within the metaverse offers practical applications and contributes to its essence. This paper sheds light on the metaverse's potential to create a learner-centered educational environment and adapt to the evolving landscape of Education 4.0. Its findings, supported by text mining analysis, contribute to understanding the metaverse's role in shaping education in the Fourth Industrial Revolution. | 翻訳日:2023-05-16 16:07:47 公開日:2023-05-15 |
# 非エルミート系における絡み合いによる臨界線認識 Recognizing critical lines via entanglement in non-Hermitian systems ( http://arxiv.org/abs/2305.08374v1 ) ライセンス: Link先を確認 | Keshav Das Agarwal, Tanoy Kanti Konar, Leela Ganesh Chandra Lakkaraju, Aditi Sen De | (参考訳) 非エルミート模型は、エルミート模型では観測されない反直観的な現象を示す。
ハミルトニアンの非エルミート的相互作用成分とエルミート的相互作用成分の競合を調べるために、非エルミート的XYスピン鎖とエルミート的Kaplan-Shekhtman-Entin-Aharony (KSEA)相互作用を含む系に焦点を当てた。
このシナリオを超えて、地中状態が横磁場との急激なクエンチ後に進化すると、その第2モーメントによって量子化された二分極エンタングルメントの速度関数と揺らぎの両方が、クエンチしない臨界線を検出することができる。 The non-Hermitian model exhibits counter-intuitive phenomena which are not observed in the Hermitian counterparts. To probe the competition between non-Hermitian and Hermitian interacting components of the Hamiltonian, we focus on a system containing non-Hermitian XY spin chain and Hermitian Kaplan-Shekhtman-Entin-Aharony (KSEA) interactions along with the transverse magnetic field. We show that the non-Hermitian model can be an effective Hamiltonian of a Hermitian XX spin-1/2 with KSEA interaction and a local magnetic field that interacts with local and non-local reservoirs. The analytical expression of the energy spectrum divides the system parameters into two regimes -- in one region, the strength of Hermitian KSEA interactions dominates over the imaginary non-Hermiticity parameter while in the other, the opposite is true. In the former situation, we demonstrate that the nearest-neighbor entanglement and its derivative can identify quantum critical lines with the variation of the magnetic field. In this domain, we determine a surface where the entanglement vanishes, similar to the factorization surface, known in the Hermitian case. On the other hand, when non-Hermiticity parameters dominate, we report the exceptional and critical points where the energy gap vanishes and illustrate that bipartite entanglement is capable of detecting these transitions as well. Going beyond this scenario, when the ground state evolves after a sudden quench with the transverse magnetic field, both rate function and the fluctuation of bipartite entanglement quantified via its second moment can detect critical lines generated without quenching dynamics. | 翻訳日:2023-05-16 16:02:46 公開日:2023-05-15 |
# マルチレベルアライメントを用いたマルチモーダル名前付きエンティティ認識のための新しいフレームワーク A Novel Framework for Multimodal Named Entity Recognition with Multi-level Alignments ( http://arxiv.org/abs/2305.08372v1 ) ライセンス: Link先を確認 | Peipei Liu, Hong Li, Yimo Ren, Jie Liu, Shuaizong Si, Hongsong Zhu, Limin Sun | (参考訳) 名前付きエンティティ認識(NER)を用いたつぶやきからの構造化知識のマイニングは、待機中の推奨や意図といった多くのダウンストリームアプリケーションにとって有益である。
2つのオープンデータセットについて実験を行い,結果と詳細な解析結果から,このモデルの利点を実証した。 Mining structured knowledge from tweets using named entity recognition (NER) can be beneficial for many downstream applications such as recommendation and intention understanding. With tweet posts tending to be multimodal, multimodal named entity recognition (MNER) has attracted more attention. In this paper, we propose a novel approach, which can dynamically align the image and text sequence and achieve multi-level cross-modal learning to augment textual word representation for MNER improvement. To be specific, our framework can be split into three main stages: the first stage focuses on intra-modality representation learning to derive the implicit global and local knowledge of each modality, the second evaluates the relevance between the text and its accompanying image and integrates visual information of different granularities based on the relevance, and the third enforces semantic refinement via iterative cross-modal interactions and co-attention. We conduct experiments on two open datasets, and the results and detailed analysis demonstrate the advantage of our model. | 翻訳日:2023-05-16 16:02:13 公開日:2023-05-15 |
# superdialseg:教師付き対話セグメンテーションのための大規模データセット SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation ( http://arxiv.org/abs/2305.08371v1 ) ライセンス: Link先を確認 | Junfeng Jiang, Chengzhang Dong, Akiko Aizawa, Sadao Kurohashi | (参考訳) 対話セグメンテーションは対話システムにとって重要な課題であり、会話テキストの理解を深める。
私たちの仕事は対話セグメンテーションの分野で重要な一歩だと信じています。 Dialogue segmentation is a crucial task for dialogue systems allowing a better understanding of conversational texts. Despite recent progress in unsupervised dialogue segmentation methods, their performances are limited by the lack of explicit supervised signals for training. Furthermore, the precise definition of segmentation points in conversations still remains as a challenging problem, increasing the difficulty of collecting manual annotations. In this paper, we provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues and release a large-scale supervised dataset called SuperDialseg, containing 9K dialogues based on two prevalent document-grounded dialogue corpora, and also inherit their useful dialogue-related annotations. Moreover, we propose two models to exploit the dialogue characteristics, achieving state-of-the-art performance on SuperDialseg and showing good generalization ability on the out-of-domain datasets. Additionally, we provide a benchmark including 20 models across four categories for the dialogue segmentation task with several proper evaluation metrics. Based on the analysis of the empirical studies, we also provide some insights for the task of dialogue segmentation. We believe our work is an important step forward in the field of dialogue segmentation. | 翻訳日:2023-05-16 16:01:53 公開日:2023-05-15 |
# 高速サブモジュラー関数最大化 Fast Submodular Function Maximization ( http://arxiv.org/abs/2305.08367v1 ) ライセンス: Link先を確認 | Lianke Qin, Zhao Song, Yitan Wang | (参考訳) サブモジュール関数は、文書要約、センサー配置、画像分割など、多くの実世界の応用がある。
現在、オンライン部分モジュラー関数の最大化のための最もよく知られているアルゴリズムは、実行時間$O(n k d^2)$で、$n$は要素の総数、$d$は特徴次元、$k$は選択すべき要素の数である。
我々のアルゴリズムは$\widetilde{O}(nk + kd^2 + nd)$時間しかかからない。 Submodular functions have many real-world applications, such as document summarization, sensor placement, and image segmentation. For all these applications, the key building block is how to compute the maximum value of a submodular function efficiently. We consider both the online and offline versions of the problem: in each iteration, the data set changes incrementally or is not changed, and a user can issue a query to maximize the function on a given subset of the data. The user can be malicious, issuing queries based on previous query results to break the competitive ratio for the online algorithm. Today, the best-known algorithm for online submodular function maximization has a running time of $O(n k d^2)$ where $n$ is the total number of elements, $d$ is the feature dimension and $k$ is the number of elements to be selected. We propose a new method based on a novel search tree data structure. Our algorithm only takes $\widetilde{O}(nk + kd^2 + nd)$ time. | 翻訳日:2023-05-16 16:01:34 公開日:2023-05-15 |
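For context on the abstract above, the following is a minimal sketch of the classic greedy baseline for maximizing a monotone submodular function under a cardinality constraint of $k$ elements. It is not the search-tree method proposed in the paper, and it does not model the feature dimension $d$. The names `greedy_max` and `coverage`, and the set-coverage objective, are assumptions chosen only to make the example self-contained.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Set


def greedy_max(ground_set: Iterable[str],
               f: Callable[[FrozenSet[str]], float],
               k: int) -> List[str]:
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    selected: List[str] = []
    current: FrozenSet[str] = frozenset()
    current_value = f(current)
    remaining = list(ground_set)
    for _ in range(min(k, len(remaining))):
        best_gain, best_item = None, None
        # Naive scan: evaluate the marginal gain of every remaining element.
        for item in remaining:
            gain = f(current | {item}) - current_value
            if best_gain is None or gain > best_gain:
                best_gain, best_item = gain, item
        if best_gain is None or best_gain <= 0:
            break  # no remaining element improves the objective
        selected.append(best_item)
        current = current | {best_item}
        current_value += best_gain
        remaining.remove(best_item)
    return selected


def coverage(sets: Dict[str, Set[int]]) -> Callable[[FrozenSet[str]], float]:
    """Hypothetical objective: number of distinct items covered by the chosen sets."""
    def f(chosen: FrozenSet[str]) -> float:
        covered: Set[int] = set()
        for name in chosen:
            covered |= sets[name]
        return len(covered)
    return f


if __name__ == "__main__":
    sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
    print(greedy_max(sorted(sets), coverage(sets), k=2))
```

The inner scan over all remaining elements in every round is the cost that faster data structures aim to reduce; the sketch keeps it explicit for readability.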
# CLRerNet: LaneIoUによるレーン検出の信頼性向上 CLRerNet: Improving Confidence of Lane Detection with LaneIoU ( http://arxiv.org/abs/2305.08366v1 ) ライセンス: Link先を確認 | Hiroto Honda, Yusuke Uchida | (参考訳) レーンマーカー検出は、自動運転および運転支援システムの重要な構成要素である。
信頼性スコアの質向上を目的とした目標割り当てコストと損失関数に LaneIoU を特徴とする CLRerNet という新しい検出器を開発した。
クロス検証を含む慎重で公平なベンチマークによって、CLRerNetは最先端技術を大きく上回ることを実証した。F1スコアはCULaneで既存手法の80.47%に対して81.43%、CurveLanesで86.10%に対して86.47%であった。 Lane marker detection is a crucial component of autonomous driving and driver assistance systems. Modern deep lane detection methods with row-based lane representation exhibit excellent performance on lane detection benchmarks. Through preliminary oracle experiments, we first disentangle the lane representation components to determine the direction of our approach. We show that correct lane positions are already among the predictions of an existing row-based detector, and the confidence scores that accurately represent intersection-over-union (IoU) with ground truths are the most beneficial. Based on this finding, we propose LaneIoU that better correlates with the metric, by taking the local lane angles into consideration. We develop a novel detector coined CLRerNet featuring LaneIoU for the target assignment cost and loss functions aiming at the improved quality of confidence scores. Through careful and fair benchmarking, including cross-validation, we demonstrate that CLRerNet outperforms the state-of-the-art by a large margin - achieving an F1 score of 81.43% compared with 80.47% of the existing method on CULane, and 86.47% compared with 86.10% on CurveLanes. | 翻訳日:2023-05-16 16:01:14 公開日:2023-05-15 |
# 逆線形混合MDPにおける水平自由強化学習 Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs ( http://arxiv.org/abs/2305.08359v1 ) ライセンス: Link先を確認 | Kaixuan Ji and Qingyue Zhao and Jiafan He and Weitong Zhang and Quanquan Gu | (参考訳) 近年の研究では、総報酬が$1$に制限された場合、エピソード型強化学習(RL)はバンディットより難しくないことが示されており、計画的地平線$H$に多対数的に依存する後悔境界が証明されている。
本稿では,horizon-free policy searchアルゴリズムを提案することで,この疑問に肯定的に答える。
このアルゴリズムは全情報フィードバックの下で$\tilde{O}\big((d+\log (|\mathcal{S}|^2 |\mathcal{A}|))\sqrt{K}\big)$の後悔を達成できることを示す。ここで$d$はMDPの未知の遷移核を線形にパラメータ化する既知の特徴マッピングの次元、$K$はエピソード数、$|\mathcal{S}|$および$|\mathcal{A}|$は状態空間と行動空間の濃度である。
また、このアルゴリズムの近似最適性と、後悔境界における$\log|\mathcal{S}|$と$\log|\mathcal{A}|$の不可避性を正当化するために、困難性の結果と後悔の下界を与える。 Recent studies have shown that episodic reinforcement learning (RL) is no harder than bandits when the total reward is bounded by $1$, and proved regret bounds that have a polylogarithmic dependence on the planning horizon $H$. However, it remains an open question whether such results can be carried over to adversarial RL, where the reward is adversarially chosen at each episode. In this paper, we answer this question affirmatively by proposing the first horizon-free policy search algorithm. To tackle the challenges caused by exploration and adversarially chosen reward, our algorithm employs (1) a variance-uncertainty-aware weighted least square estimator for the transition kernel; and (2) an occupancy measure-based technique for the online search of a \emph{stochastic} policy. We show that our algorithm achieves an $\tilde{O}\big((d+\log (|\mathcal{S}|^2 |\mathcal{A}|))\sqrt{K}\big)$ regret with full-information feedback, where $d$ is the dimension of a known feature mapping linearly parametrizing the unknown transition kernel of the MDP, $K$ is the number of episodes, $|\mathcal{S}|$ and $|\mathcal{A}|$ are the cardinalities of the state and action spaces. We also provide hardness results and regret lower bounds to justify the near optimality of our algorithm and the unavoidability of $\log|\mathcal{S}|$ and $\log|\mathcal{A}|$ in the regret bound. | 翻訳日:2023-05-16 16:00:53 公開日:2023-05-15 |
# 垂直フェデレート学習におけるセキュアトレーニングのための二次関数暗号 Quadratic Functional Encryption for Secure Training in Vertical Federated Learning ( http://arxiv.org/abs/2305.08358v1 ) ライセンス: Link先を確認 | Shuangyi Chen, Anuja Modi, Shweta Agrawal, Ashish Khisti | (参考訳) 垂直連合学習(VFL)は、個々のデータのプライバシ保護を希望する複数のパーティ間でデータが分散されるような環境で、機械学習(ML)モデルの協調トレーニングを可能にする。
本稿では,縦型フェデレート学習のための一般化線形モデルを訓練する際に,二次関数暗号を用いることで,Xuらの手法における情報漏洩の一部を回避できる方法を説明する。 Vertical federated learning (VFL) enables the collaborative training of machine learning (ML) models in settings where the data is distributed amongst multiple parties who wish to protect the privacy of their individual data. Notably, in VFL, the labels are available to a single party and the complete feature set is formed only when data from all parties is combined. Recently, Xu et al. proposed a new framework called FedV for secure gradient computation for VFL using multi-input functional encryption. In this work, we explain how some of the information leakage in Xu et al. can be avoided by using Quadratic functional encryption when training generalized linear models for vertical federated learning. | 翻訳日:2023-05-16 16:00:11 公開日:2023-05-15 |
# デッドラインインスタンスを用いた高速かつ効率的なマッチングアルゴリズム Fast and Efficient Matching Algorithm with Deadline Instances ( http://arxiv.org/abs/2305.08353v1 ) ライセンス: Link先を確認 | Zhao Song, Weixin Wang, Chenbo Yin | (参考訳) オンライン重み付きマッチング問題は、機械学習における基本的な問題である。
次に、2つの最適化アルゴリズム(\textsc{FastGreedy} と \textsc{FastPostponedGreedy})を提示し、アルゴリズムの時間複雑性と正確性に関する理論的証明を提供する。
しかし、 \textsc{FastPostponedGreedy} アルゴリズムでは、各ノードの状態は最初不明である。
$\epsilon \in (0,0.1)$ は各辺の実重みの相対誤差を表す。
元の \textsc{Greedy} と \textsc{PostponedGreedy} の競合比は、それぞれ $\frac{1}{2}$ と $\frac{1}{4}$ である。
これら2つのアルゴリズムに基づいて, \textsc{FastGreedy} と \textsc{FastPostponedGreedy} のアルゴリズムを提案し,その競合比はそれぞれ $\frac{1 - \epsilon}{2}$ と $\frac{1 - \epsilon}{4}$ である。
$\mathbb{R} ^ d$ 上の $n$ 個のノードが与えられると、時間計算量は $O(nd)$ から $\widetilde{O}(\epsilon^{-2} \cdot (n + d))$ に減少する。 The online weighted matching problem is a fundamental problem in machine learning due to its numerous applications. Despite many efforts in this area, existing algorithms are either too slow or do not take $\mathrm{deadline}$ (the longest time a node can be matched) into account. In this paper, we introduce a market model with $\mathrm{deadline}$ first. Next, we present our two optimized algorithms (\textsc{FastGreedy} and \textsc{FastPostponedGreedy}) and offer theoretical proof of the time complexity and correctness of our algorithms. In the \textsc{FastGreedy} algorithm, it is known in advance whether a node is a buyer or a seller, whereas in the \textsc{FastPostponedGreedy} algorithm the status of each node is unknown at first. Then, we generalize a sketching matrix to run the original and our algorithms on both real data sets and synthetic data sets. Let $\epsilon \in (0,0.1)$ denote the relative error of the real weight of each edge. The competitive ratios of the original \textsc{Greedy} and \textsc{PostponedGreedy} are $\frac{1}{2}$ and $\frac{1}{4}$ respectively. Based on these two original algorithms, we propose the \textsc{FastGreedy} and \textsc{FastPostponedGreedy} algorithms, whose competitive ratios are $\frac{1 - \epsilon}{2}$ and $\frac{1 - \epsilon}{4}$ respectively. At the same time, our algorithms run faster than the original two algorithms. Given $n$ nodes in $\mathbb{R} ^ d$, we decrease the time complexity from $O(nd)$ to $\widetilde{O}(\epsilon^{-2} \cdot (n + d))$. | 翻訳日:2023-05-16 15:59:57 公開日:2023-05-15 |
# 基底状態探索のための平均場対向ダイアバティック駆動の構成法 A general method to construct mean field counter diabatic driving for a ground state search ( http://arxiv.org/abs/2305.08352v1 ) ライセンス: Link先を確認 | Hiroshi Hayasaka, Takashi Imoto, Yuichiro Matsuzaki, Shiro Kawabata | (参考訳) カウンターダイアバティック(CD)駆動は、量子アニール(QA)における非断熱遷移を抑制するために多くの注目を集めている。
また, 横磁場を有するスピンガラスモデルの基底状態は, CD駆動のない従来のQAと比較して高い忠実度で得ることができることを明らかにした。
さらに,本手法をD-Wave量子アニーラを用いて実験的に実証し,数値シミュレーションを裏付ける実験結果を得た。 The counter diabatic (CD) driving has attracted much attention for suppressing non-adiabatic transition in quantum annealing (QA). However, it can be intractable to construct the CD driving in the actual experimental setup due to the non-locality of the CD driving Hamiltonian and the necessity of exact diagonalization of the QA Hamiltonian in advance. In this paper, using the mean field (MF) theory, we propose a general method to construct an approximated CD driving term consisting of local operators. We can efficiently construct the MF approximated CD (MFCD) term by solving the MF dynamics of magnetization using a classical computer. As an example, we numerically perform QA with MFCD driving for the spin glass model with transverse magnetic fields. We numerically show that the MF dynamics with MFCD driving is equivalent to the solution of the self-consistent equation in MF theory. Also, we clarify that a ground state of the spin glass model with transverse magnetic field can be obtained with high fidelity compared to the conventional QA without the CD driving. Moreover, we experimentally demonstrate our method by using a D-Wave quantum annealer and obtain the experimental result supporting our numerical simulation. | 翻訳日:2023-05-16 15:59:26 公開日:2023-05-15 |
# 境界エルダー次元を持つモデルベースRLの均一PAC保証 Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension ( http://arxiv.org/abs/2305.08350v1 ) ライセンス: Link先を確認 | Yue Wu and Jiafan He and Quanquan Gu | (参考訳) 近年,一般関数近似を用いた強化学習(RL)が目覚ましい進歩を遂げている。
我々の知る限りでは、これは線形の場合を超えたバンディットとRLの一様PAC保証に関する最初の研究である。 Recently, there has been remarkable progress in reinforcement learning (RL) with general function approximation. However, all these works only provide regret or sample complexity guarantees. It is still an open question if one can achieve stronger performance guarantees, i.e., the uniform probably approximate correctness (Uniform-PAC) guarantee that can imply both a sub-linear regret bound and a polynomial sample complexity for any target learning accuracy. We study this problem by proposing algorithms for both nonlinear bandits and model-based episodic RL using the general function class with a bounded eluder dimension. The key idea of the proposed algorithms is to assign each action to different levels according to its width with respect to the confidence set. The achieved uniform-PAC sample complexity is tight in the sense that it matches the state-of-the-art regret bounds or sample complexity guarantees when reduced to the linear case. To the best of our knowledge, this is the first work for uniform-PAC guarantees on bandit and RL that goes beyond linear cases. | 翻訳日:2023-05-16 15:59:07 公開日:2023-05-15 |
# MaxViT-UNet:医療画像セグメンテーションのためのマルチ軸注意 MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation ( http://arxiv.org/abs/2305.08396v1 ) ライセンス: Link先を確認 | Abdul Rehman, Asifullah Khan | (参考訳) 近年,畳み込みニューラルネットワークは医用画像解析において大きな進歩を遂げている。
ハイブリッドデコーダブロックは,まずトランスポーション・コンボリューション(transpose convolution)によってアップサンプリングされた下位機能とハイブリッド・エンコーダからのスキップ接続機能とを融合し,多軸アテンション機構を用いて融合機能を改良する。
我々のMaxViT-UNetは以前のCNNのみ(UNet)とTransformerのみ(Swin-UNet)の技法を、Dice指標でそれぞれ2.36%と5.31%上回りました。 Convolutional neural networks have made significant strides in medical image analysis in recent years. However, the local nature of the convolution operator inhibits the CNNs from capturing global and long-range interactions. Recently, Transformers have gained popularity in the computer vision community, including medical image segmentation. However, scalability issues of the self-attention mechanism and the lack of CNN-like inductive bias have limited their adoption. In this work, we present MaxViT-UNet, an Encoder-Decoder based hybrid vision transformer for medical image segmentation. The proposed hybrid decoder, also based on MaxViT-block, is designed to harness the power of convolution and self-attention mechanism at each decoding stage with minimal computational burden. The multi-axis self-attention in each decoder stage helps in differentiating between the object and background regions much more efficiently. The hybrid decoder block initially fuses the lower level features upsampled via transpose convolution, with skip-connection features coming from hybrid encoder, then fused features are refined using multi-axis attention mechanism. The proposed decoder block is repeated multiple times to accurately segment the nuclei regions. Experimental results on the MoNuSeg dataset prove the effectiveness of the proposed technique. Our MaxViT-UNet outperformed the previous CNN-only (UNet) and Transformer-only (Swin-UNet) techniques by a large margin of 2.36% and 5.31% on the Dice metric respectively. | 翻訳日:2023-05-16 15:52:23 公開日:2023-05-15 |
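The Dice metric quoted above measures the overlap between a predicted mask $P$ and a ground-truth mask $T$ as $2|P \cap T| / (|P| + |T|)$. The following is a minimal sketch on flat binary masks; the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration and are not taken from the MaxViT-UNet code.

```python
def dice_score(pred, target):
    """Dice coefficient 2|P ∩ T| / (|P| + |T|) for flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]     # hypothetical predicted nucleus pixels
target = [1, 0, 0, 0, 1, 1]   # hypothetical ground-truth pixels
score = dice_score(pred, target)
print(round(score, 4))
```

In this toy case two pixels overlap and each mask has three foreground pixels, so the score is 4/6.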
# 単一測定値に基づくヌル次元証人 Null dimension witness based on single measurements ( http://arxiv.org/abs/2305.08395v1 ) ライセンス: Link先を確認 | Josep Batle, Adam Bednorz | (参考訳) 量子系の次元の線形独立性による等式に基づく零証人を示し、実空間、複素空間、古典空間を識別する。
有限統計による誤りについても論じる。 We present a null witness, based on equality due to linear independence, of the dimension of a quantum system, discriminating real, complex and classical spaces. The witness involves only a single measurement with sufficiently many outcomes and prepared input states. In addition, for intermediate dimensions, the witness bounds saturate for a family of equiangular tight frames including symmetric informationally complete positive operator valued measures. Such a witness requires a minimum of resources, being robust against many practical imperfections. We also discuss errors due to finite statistics. | 翻訳日:2023-05-16 15:51:55 公開日:2023-05-15 |
# 対話における会話分析におけるChatGPTの可能性:実証的研究 Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study ( http://arxiv.org/abs/2305.08391v1 ) ライセンス: Link先を確認 | Yaxin Fan and Feng Jiang | (参考訳) ChatGPTのような大規模言語モデル(LLM)は、翻訳や要約など、多くの従来のNLPタスクにおいて優れた浅い理解を示してきた。
これらのタスクにChatGPTを適応させるために、識別的および生成的パラダイムを提案し、より難しいタスクでChatGPTのパフォーマンスを改善するためのChain of Thought (COT)アプローチを導入する。
これらの知見が,LLM時代の対話談話分析手法を洗練するための基礎となることを願っている。 Large Language Models (LLMs) like ChatGPT have demonstrated strong shallow understanding on many traditional NLP tasks, such as translation and summarization. However, their performance on high-level understanding, such as the dialogue discourse analysis task that requires a higher level of understanding and reasoning, remains less explored. This study investigates ChatGPT's capabilities in three dialogue discourse tasks of varying difficulty: topic segmentation, discourse relation recognition, and discourse parsing. To adapt ChatGPT to these tasks, we propose discriminative and generative paradigms and introduce the Chain of Thought (COT) approach to improve ChatGPT's performance in more difficult tasks. The results show that our generative paradigm allows ChatGPT to achieve performance comparable to state-of-the-art methods in the topic segmentation task but reveals room for improvement in the more complex tasks of discourse relation recognition and discourse parsing. Notably, the COT can significantly enhance ChatGPT's performance with the help of understanding complex structures in more challenging tasks. Through a series of case studies, our in-depth analysis suggests that ChatGPT can be a good annotator in topic segmentation but has difficulties understanding complex rhetorical structures. We hope these findings provide a foundation for future research to refine dialogue discourse analysis approaches in the era of LLMs. | 翻訳日:2023-05-16 15:51:48 公開日:2023-05-15 |
# 好きなように編集する: 多粒度コマンドによるビデオ記述編集 Edit As You Wish: Video Description Editing with Multi-grained Commands ( http://arxiv.org/abs/2305.08389v1 ) ライセンス: Link先を確認 | Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang and Qin Jin | (参考訳) 自然言語によるビデオの自動ナレーションは、インターネット上の大量のビデオの把握と管理を支援する。
1) 制御信号は固定され, 単一粒度制御のみを表現できる。
2) 動的なユーザ要求を満たすために,ビデオ記述をさらに編集することはできない。
人間の書き直しの習慣に触発されて、ユーザコマンドを {operation, position, attribute} triplet として設計し、多粒度の使用要件をカバーし、粗粒度制御(例えば、記述を拡張)やきめ細かい制御(例えば、指定された位置に特定の詳細を追加する)を統一形式で表現できる。
vdeditの評価には,キャプション品質,キャプションコマンド一貫性,キャプションビデオアライメントなど,モデルパフォーマンスの3つの側面を測定するための包括的なメトリクスを採用する。 Automatically narrating a video with natural language can assist people in grasping and managing massive videos on the Internet. From the perspective of video uploaders, they may have varied preferences for writing the desired video description to attract more potential followers, e.g. catching customers' attention for product videos. The Controllable Video Captioning task is therefore proposed to generate a description conditioned on the user demand and video content. However, existing works suffer from two shortcomings: 1) the control signal is fixed and can only express single-grained control; 2) the video description can not be further edited to meet dynamic user demands. In this paper, we propose a novel Video Description Editing (VDEdit) task to automatically revise an existing video description guided by flexible user requests. Inspired by human writing-revision habits, we design the user command as a {operation, position, attribute} triplet to cover multi-grained use requirements, which can express coarse-grained control (e.g. expand the description) as well as fine-grained control (e.g. add specified details in specified position) in a unified format. To facilitate the VDEdit task, we first automatically construct a large-scale benchmark dataset namely VATEX-EDIT in the open domain describing diverse human activities. Considering the real-life application scenario, we further manually collect an e-commerce benchmark dataset called EMMAD-EDIT. We propose a unified framework to convert the {operation, position, attribute} triplet into a textual control sequence to handle multi-grained editing commands. For VDEdit evaluation, we adopt comprehensive metrics to measure three aspects of model performance, including caption quality, caption-command consistency, and caption-video alignment. | 翻訳日:2023-05-16 15:51:23 公開日:2023-05-15 |
# PLIP:人物表現学習のための言語画像事前学習 PLIP: Language-Image Pre-training for Person Representation Learning ( http://arxiv.org/abs/2305.08386v1 ) ライセンス: Link先を確認 | Jialong Zuo, Changqian Yu, Nong Sang, Changxin Gao | (参考訳) 事前学習は、強力な人間表現を学ぶための効果的な技術として出現した。
また、適切なデータセットがないため、SYNTH-PEDESと呼ばれる大規模人物データセットを提示し、Stylish Pedestrian Attributes-union Captioning法を提案し、多様なテキスト記述を合成する。
コード、データセット、重みは https://github.com/Zplusdragon/PLIP でリリースされる。 Pre-training has emerged as an effective technique for learning powerful person representations. Most existing methods have shown that pre-training on pure-vision large-scale datasets like ImageNet and LUPerson has achieved remarkable performance. However, solely relying on visual information, the absence of robust explicit indicators poses a challenge for these methods to learn discriminative person representations. Drawing inspiration from the intrinsic fine-grained attribute indicators of person descriptions, we explore introducing the language modality into person representation learning. To this end, we propose a novel language-image pre-training framework for person representation learning, termed PLIP. To explicitly build fine-grained cross-modal associations, we specifically design three pretext tasks, i.e., semantic-fused image colorization, visual-fused attributes prediction, and vision-language matching. In addition, due to the lack of an appropriate dataset, we present a large-scale person dataset named SYNTH-PEDES, where the Stylish Pedestrian Attributes-union Captioning method is proposed to synthesize diverse textual descriptions. We pre-train PLIP on SYNTH-PEDES and evaluate our model by spanning downstream tasks such as text-based Re-ID, image-based Re-ID, and person attribute recognition. Extensive experiments demonstrate that our model not only significantly improves existing methods on all these tasks, but also shows great ability in the few-shot and domain generalization settings. The code, dataset and weights will be released at https://github.com/Zplusdragon/PLIP | 翻訳日:2023-05-16 15:50:53 公開日:2023-05-15 |
# 政治宣言における感情の現況とイデオロギー的同時性 Incumbent/Opposition Dynamics and Ideological Similitude on Emotions in Political Manifestos ( http://arxiv.org/abs/2305.08383v1 ) ライセンス: Link先を確認 | Takumi Nishi | (参考訳) この研究は、2000年から2019年にかけてのイギリスの保守党と労働党の総選挙マニフェストにおける感情関連言語の分析を含む。
また,イデオロギー的に類似した政党は肯定的な言語を顕著に用いることを示し,感情と党の地位の関係に関する文献に新たな知見を加える。 The study involved the analysis of emotion-associated language in the UK Conservative and Labour party general election manifestos between 2000 and 2019. While previous research has shown a general correlation between ideological positioning and overlap of public policies, there are still conflicting results in matters of sentiments in such manifestos. Using new data, we present how valence level can be swayed by party status within government, with incumbent parties presenting a higher frequency of positive emotion-associated words while negative emotion-associated words are more prevalent in opposition parties. We also demonstrate that parties with ideological similitude use positive language prominently, further adding to the literature on the relationship between sentiments and party status. | 翻訳日:2023-05-16 15:50:23 公開日:2023-05-15 |
# 視覚言語プロンプトに適したモード近似 Mode Approximation Makes Good Vision-Language Prompts ( http://arxiv.org/abs/2305.08381v1 ) ライセンス: Link先を確認 | Haixin Wang, Xinlong Yang, Jianlong Chang, Dian Jin, Jinan Sun, Shikun Zhang, Xiao Luo, Qi Tian | (参考訳) 大規模モデル技術の進歩により、パラメータ効率変換学習(PETL)は人工知能の様々な分野に浸透した。
しかし、2つの重要な問題は未解決のままである: 軽量設計の複雑さをさらに減らす方法と、非常に低いパラメータの下でのモード間のアライメントを強化する方法である。
私たちのコードは、https://github.com/WillDreamer/Aurora で利用可能です。 With the advance of large-scale model technologies, parameter-efficient transfer learning (PETL) has swept across various fields of Artificial Intelligence. Its core idea is to adapt the model to downstream tasks using only a small number of parameters. Recently, some studies have applied these proven techniques to multimodal tasks. However, two critical issues remain unresolved: how to further reduce the complexity with lightweight design and how to boost alignment between modalities with extremely few parameters. In this paper, we propose A graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges. Considering the redundancy in existing architectures, we first utilize the mode approximation to generate a small number of trainable parameters to implement the multi-modal prompt tuning, which explores the low intrinsic dimension with only 0.05% parameters of the pre-trained model. Then, to better narrow the modality gap, we propose the informative context enhancement and gated query transformation modules for settings with extremely few parameters. A thorough evaluation of the Aurora on six cross-modal downstream benchmarks shows that it not only outperforms the state-of-the-art, but even outperforms the full fine-tuning approach. Our code is available at: https://github.com/WillDreamer/Aurora. | 翻訳日:2023-05-16 15:49:54 公開日:2023-05-15 |
# TESS: テキストからテキストへの自己定義型Simplex拡散 TESS: Text-to-Text Self-Conditioned Simplex Diffusion ( http://arxiv.org/abs/2305.08379v1 ) ライセンス: Link先を確認 | Rabeeh Karimi Mahabadi, Jaesung Tae, Hamish Ivison, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan | (参考訳) 拡散モデルは生成のための強力なパラダイムとして登場し、連続的な値の入力を持つ様々な領域で強力なパフォーマンスを得る。
本研究では,完全非自己回帰型のテキスト拡散モデルであるText-to-text Self-conditioned Simplex Diffusion (TESS)を提案する。
要約, テキスト単純化, パラフレーズ生成, 質問生成などの自然言語理解および生成タスクに関する広範な実験を通じて, TESSは最先端の非自己回帰モデルより優れ, 事前訓練された自己回帰型sequence-to-sequenceモデルと競合することを示した。 Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various domains with continuous-valued inputs. Despite the promises of fully non-autoregressive text generation, applying diffusion models to natural language remains challenging due to its discrete nature. In this work, we propose Text-to-text Self-conditioned Simplex Diffusion (TESS), a text diffusion model that is fully non-autoregressive, employs a new form of self-conditioning, and applies the diffusion process on the logit simplex space rather than the typical learned embedding space. Through extensive experiments on natural language understanding and generation tasks including summarization, text simplification, paraphrase generation, and question generation, we demonstrate that TESS outperforms state-of-the-art non-autoregressive models and is competitive with pretrained autoregressive sequence-to-sequence models. | 翻訳日:2023-05-16 15:49:32 公開日:2023-05-15 |
# 大規模言語モデルによるテキスト分類 Text Classification via Large Language Models ( http://arxiv.org/abs/2305.08377v1 ) ライセンス: Link先を確認 | Xiaofei Sun, Xiaoya Li, Jiwei Li, Fei Wu, Shangwei Guo, Tianwei Zhang and Guoyin Wang | (参考訳) GPT-3のような大規模言語モデル(LLM)の顕著な成功にもかかわらず、その性能はテキスト分類のタスクにおいて微調整モデルよりも著しく劣っている。
本稿では, \textbf{C}lue \textbf{A}nd \textbf{R}easoning \textbf{P}rompting (CARP) を提案する。
CARPは、テキスト分類に関わる複雑な言語現象に対処するのに適したプログレッシブ推論戦略を採用する: CARPは、最終決定のために診断推論プロセスが誘導される表面的手がかり(キーワード、トーン、セマンティックリレーション、参照など)を見つけるようLLMに促す。
驚くべきことに、CARPは広く使われている5つのテキスト分類ベンチマークのうち4つで新たなSOTA性能を達成した。SST-2で97.39 (+1.24)、AGNewsで96.40 (+0.72)、R8で98.78 (+0.25)、R52で96.95 (+0.6) であり、MRではSOTAに匹敵する性能 (92.39 vs. 93.3) を示した。
具体的には、クラス毎に16の例を使用して、CARPはクラス毎に1,024の例を持つ教師付きモデルに匹敵するパフォーマンスを達成する。 Despite the remarkable success of large-scale Language Models (LLMs) such as GPT-3, they still significantly underperform fine-tuned models in the task of text classification. This is due to (1) the lack of reasoning ability in addressing complex linguistic phenomena (e.g., intensification, contrast, irony, etc.); (2) the limited number of tokens allowed in in-context learning. In this paper, we introduce \textbf{C}lue \textbf{A}nd \textbf{R}easoning \textbf{P}rompting (CARP). CARP adopts a progressive reasoning strategy tailored to addressing the complex linguistic phenomena involved in text classification: CARP first prompts LLMs to find superficial clues (e.g., keywords, tones, semantic relations, references, etc.), based on which a diagnostic reasoning process is induced for final decisions. To further address the limited-token issue, CARP uses a fine-tuned model on the supervised dataset for $k$NN demonstration search in the in-context learning, allowing the model to take advantage of both the LLM's generalization ability and the task-specific evidence provided by the full labeled dataset. Remarkably, CARP yields new SOTA performance on 4 out of 5 widely-used text-classification benchmarks, 97.39 (+1.24) on SST-2, 96.40 (+0.72) on AGNews, 98.78 (+0.25) on R8 and 96.95 (+0.6) on R52, and a performance comparable to SOTA on MR (92.39 vs. 93.3). More importantly, we find that CARP delivers impressive abilities on low-resource and domain-adaptation setups. Specifically, using 16 examples per class, CARP achieves performance comparable to supervised models with 1,024 examples per class. | 翻訳日:2023-05-16 15:49:14 公開日:2023-05-15 |
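The $k$NN demonstration search mentioned above can be pictured as ranking labeled training examples by embedding similarity to the query and placing the top $k$ in the prompt. The sketch below is a rough illustration under stated assumptions: the vectors are hand-written stand-ins for embeddings that would come from the fine-tuned model, cosine similarity is an assumed choice of metric, and `knn_demonstrations` is a hypothetical helper name. The abstract does not specify CARP's actual retrieval details.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_demonstrations(query_vec, pool, k):
    """Return the k labeled examples whose vectors are most similar to the query."""
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return ranked[:k]


# Toy labeled pool; "vec" values are made-up placeholders, not real embeddings.
pool = [
    {"text": "a moving, well-acted drama", "label": "positive", "vec": [0.9, 0.1, 0.0]},
    {"text": "dull and far too long", "label": "negative", "vec": [0.1, 0.9, 0.0]},
    {"text": "a charming surprise", "label": "positive", "vec": [0.8, 0.2, 0.1]},
]
query = [1.0, 0.0, 0.0]
demos = knn_demonstrations(query, pool, k=2)
for demo in demos:
    print(demo["label"], "-", demo["text"])
```

Retrieving demonstrations per query keeps the prompt within the token limit while still drawing on the full labeled set, which is the motivation the abstract gives for this step.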
# 部分移動モーメント、主マイナーおよび絡み検出 Partial Transpose Moments, Principal Minors and Entanglement Detection ( http://arxiv.org/abs/2305.08376v1 ) ライセンス: Link先を確認 | Mazhar Ali | (参考訳) 近年,局所ランダム化測定により密度行列の部分転置モーメントが得られることが示されている [Elben A., {\it et al.} Phys. Rev. Lett. {\bf 125}, 200501 (2020)]。
その結果,密度行列の部分転置モーメントに基づく2つの一般的な絡み合い検出法が提案された [Yu X-D., {\it et al.} Phys. Rev. Lett. {\bf 127}, 060504 (2021)]。
さらに, 3部量子ビット系におけるPTモーメントの概念を拡張し, PTモーメントは, ホワイトノイズを混合した$GHZ$および$W$の状態に対して, NPTの範囲全体の検出しかできないことを示した。 Recently, it has been shown that locally randomized measurements can be employed to get partial transpose moments of a density matrix [Elben A., {\it et al.} Phys. Rev. Lett. {\bf 125}, 200501 (2020)]. Consequently, two general entanglement detection methods were proposed based on partial transpose moments of a density matrix [Yu X-D., {\it et al.} Phys. Rev. Lett. {\bf 127}, 060504 (2021)]. In this context, a natural question arises as to how partial transpose moments are related to entanglement and to the well-known idea of principal minors. In this work, we analytically demonstrate that for qubit-qubit quantum systems, partial transpose moments can be expressed as simple functions of principal minors. We expect this relation to exist for every bipartite quantum system. In addition, we have extended the idea of PT-moments for tripartite qubit systems and have shown that PT-moments can only detect the whole range of being NPT for $GHZ$ and $W$ states mixed with white noise. | 翻訳日:2023-05-16 15:48:38 公開日:2023-05-15 |
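To make the partial transpose (PT) moments concrete: for a two-qubit density matrix $\rho$, the $k$-th PT moment is $p_k = \mathrm{Tr}[(\rho^{T_B})^k]$, and the $p_3$-PPT condition from the cited Elben et al. work states that every PPT state satisfies $p_3 \ge p_2^2$, so a violation signals entanglement. The sketch below evaluates these moments directly from a known real-valued matrix in plain Python; it does not reproduce the randomized-measurement protocol or the principal-minor expressions derived in the paper, and the helper names and the test states are assumptions chosen for illustration.

```python
def partial_transpose_b(rho):
    """Partial transpose of a 4x4 two-qubit matrix over the second qubit.

    Basis index convention (assumed here): i = 2*a + b for |a b>.
    """
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                for b2 in range(2):
                    out[2 * a + b2][2 * a2 + b] = rho[2 * a + b][2 * a2 + b2]
    return out


def matmul(x, y):
    """Plain square-matrix product."""
    n = len(x)
    return [[sum(x[i][m] * y[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def pt_moment(rho, k):
    """k-th PT moment p_k = Tr[(rho^{T_B})^k] for integer k >= 1."""
    rho_t = partial_transpose_b(rho)
    power = rho_t
    for _ in range(k - 1):
        power = matmul(power, rho_t)
    return sum(power[i][i] for i in range(len(power)))


def werner(p):
    """Mixture p * |Phi+><Phi+| + (1 - p) * I/4 of a Bell state with white noise."""
    bell = [[0.5, 0.0, 0.0, 0.5], [0.0] * 4, [0.0] * 4, [0.5, 0.0, 0.0, 0.5]]
    return [[p * bell[i][j] + (1 - p) * (0.25 if i == j else 0.0) for j in range(4)]
            for i in range(4)]


bell_state = werner(1.0)
p2 = pt_moment(bell_state, 2)
p3 = pt_moment(bell_state, 3)
print(p2, p3, p3 < p2 ** 2)  # a violation of p3 >= p2^2 flags entanglement
```

For the pure Bell state the partially transposed matrix has eigenvalues 1/2, 1/2, 1/2 and -1/2, which gives $p_2 = 1$ and $p_3 = 1/4$, so the condition is violated; the maximally mixed state `werner(0.0)` satisfies it with equality.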
# 非退化光パラメトリック増幅器による2つの光猫状態の同時合成 Simultaneous preparation of two optical cat states based on a nondegenerate optical parametric amplifier ( http://arxiv.org/abs/2305.08426v1 ) ライセンス: Link先を確認 | Dongmei Han, Na Wang, Meihong Wang, and Xiaolong Su | (参考訳) コヒーレント状態の重ね合わせとして知られる光猫状態は、量子計算と量子メートルロジーに広く応用されている。
提案する結果は,フォールトトレラント量子計算に応用可能な4成分のcat状態を生成するための一歩となる。 The optical cat state, known as the superposition of coherent states, has broad applications in quantum computation and quantum metrology. Increasing the number of optical cat states is crucial to implement complex quantum information tasks based on them. Here, we prepare two optical cat states simultaneously based on a nondegenerate optical parametric amplifier. By subtracting one photon from each of two squeezed vacuum states, two odd cat states with orthogonal superposition direction in phase space are prepared simultaneously, which have similar fidelity of 60% and amplitude of 1.2. Compared with the traditional method to generate two odd optical cat states based on two degenerate optical parametric amplifiers, only one nondegenerate optical parametric amplifier is applied in our experiment, which saves half of the quantum resource of nonlinear cavities. The presented results make a step toward preparing the four-component cat state, which has potential applications in fault-tolerant quantum computation. | 翻訳日:2023-05-16 15:43:03 公開日:2023-05-15 |
# FeatFSDA:ビデオによる活動認識のための領域適応に向けて FeatFSDA: Towards Few-shot Domain Adaptation for Video-based Activity Recognition ( http://arxiv.org/abs/2305.08420v1 ) ライセンス: Link先を確認 | Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg | (参考訳) 領域適応は活動認識に不可欠であり、時空間的アーキテクチャは時間的次元から生じるパラメータの増加によって過度に適合するリスクがある。
UCF101, HMDB51, EPIC-KITCHEN, Sims4Action, Toyota Smart Homeの5つのデータセットを用いてFSDA-ARベンチマークを構築した。
ビデオベースのアクティビティ認識のための少数ショットドメイン適応の今後の研究を促進するため、ベンチマークとコードをhttps://github.com/KPeng9510/FeatFSDAで公開します。 Domain adaptation is essential for activity recognition, as common spatiotemporal architectures risk overfitting due to increased parameters arising from the temporal dimension. Unsupervised domain adaptation methods have been extensively studied, yet, they require large-scale unlabeled data from the target domain. In this work, we address few-shot domain adaptation for video-based activity recognition (FSDA-AR), which leverages a very small amount of labeled target videos to achieve effective adaptation. This setting is attractive and promising for applications, as it requires recording and labeling only a few, or even a single example per class in the target domain, which often includes activities that are rare yet crucial to recognize. We construct FSDA-AR benchmarks using five established datasets: UCF101, HMDB51, EPIC-KITCHEN, Sims4Action, and Toyota Smart Home. Our results demonstrate that FSDA-AR performs comparably to unsupervised domain adaptation with significantly fewer (yet labeled) target examples. We further propose a novel approach, FeatFSDA, to better leverage the few labeled target domain samples as knowledge guidance. FeatFSDA incorporates a latent space semantic adjacency loss, a domain prototypical similarity loss, and a graph-attentive-network-based edge dropout technique. Our approach achieves state-of-the-art performance on all datasets within our FSDA-AR benchmark. To encourage future research of few-shot domain adaptation for video-based activity recognition, we will release our benchmarks and code at https://github.com/KPeng9510/FeatFSDA. | 翻訳日:2023-05-16 15:42:44 公開日:2023-05-15 |
# ビデオ軌道解析のためのオンラインシーケンスクラスタリングアルゴリズム Online Sequence Clustering Algorithm for Video Trajectory Analysis ( http://arxiv.org/abs/2305.08418v1 ) ライセンス: Link先を確認 | Aximu Yuemaier, Xiaogang Chen, Xingyu Qian, Longfei Liang, Shunfeng Li, Zhitang Song | (参考訳) ターゲット追跡と軌道モデリングは監視ビデオ解析において重要な応用であり、道路安全とコミュニティセキュリティの分野で大きな注目を集めている。
このスキームは、多くの算術演算を回避しつつ、リアルタイムなオンライン学習とモーションモデルの処理を持ち、フロントエンドのインテリジェントな知覚のアプリケーションシナリオと一致している。 Target tracking and trajectory modeling have important applications in surveillance video analysis and have received great attention in the fields of road safety and community security. In this work, we propose a lightweight real-time video analysis scheme that uses a model learned from motion patterns to monitor the behavior of objects, which can be used for applications such as real-time representation and prediction. The proposed sequence clustering algorithm, which operates on discrete sequences, gives the system a continuous online learning ability. The intrinsic repeatability of the target object trajectory is used to automatically construct the behavioral model in the three processes of feature extraction, cluster learning, and model application. In addition to the discretization of trajectory features and simple model applications, this paper focuses on online clustering algorithms and their incremental learning processes. Finally, the feasibility of the algorithm is verified by learning trajectory models from real surveillance video, and the characteristics and performance of the clustering algorithm are discussed in the analysis. The scheme learns and processes motion models online in real time while avoiding a large number of arithmetic operations, which suits the application scenarios of front-end intelligent perception. | 翻訳日:2023-05-16 15:42:14 公開日:2023-05-15 |
# Marsellus: 2-to-8b DNNアクセラレーションと30%ブースト適応ボディバイアスを備えた異種RISC-V AI-IoTエンドノードSoC Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing ( http://arxiv.org/abs/2305.08415v1 ) ライセンス: Link先を確認 | Francesco Conti, Gianna Paulin, Davide Rossi, Alfio Di Mauro, Georg Rutishauser, Gianmarco Ottavi, Manuel Eggimann, Hayate Okuhara, Luca Benini | (参考訳) 拡張現実、パーソナライズされたヘルスケア、ナノロボティクスのためのAI-IoT(System-on-a-Chip)システム・オン・チップ(SoC)の進化は、計算集約的だが強力な量子化されたDeep Neural Network(DNN)推論や、高精度浮動小数点を必要とする信号処理と制御など、幅広い操作条件において、数十mWのパワーエンベロープ内で多くの多様なタスクを実行する必要がある。
我々はglobalfoundries 22nm fdxで作製したai-iotエンドノードのための全デジタルヘテロジニアスsocであるmarsellusを提案する。
1 RISC-Vデジタル信号処理(DSP)16コアの汎用クラスタで、4ビットと2ビットの算術拡張(XpulpNN)を利用して、MAC&LOAD操作と浮動小数点演算を併用した多様なワークロードを実行する。
2) DNNにおける3x3と1x1(ポイントワイド)の畳み込みを加速する2-8ビット再構成可能なバイナリエンジン(RBE)
3)Adaptive Body Biasing(ABB)ジェネレータとハードウェア制御ループに接続されたオンチップ監視(OCM)ブロックのセットにより、トランジスタ閾値電圧のオンザフライ適応が可能となる。
Marsellusは2ビットの精度演算で最大180 Gop/s、3.32 Top/s/W、ハードウェアアクセラレーションされたDNN層で最大637 Gop/s、12.4 Top/s/Wを達成する。 Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) Systems-on-a-Chip (SoCs) for augmented reality, personalized healthcare, and nano-robotics need to run many diverse tasks within a power envelope of a few tens of mW over a wide range of operating conditions: compute-intensive but strongly quantized Deep Neural Network (DNN) inference, as well as signal processing and control requiring high-precision floating-point. We present Marsellus, an all-digital heterogeneous SoC for AI-IoT end-nodes fabricated in GlobalFoundries 22nm FDX that combines 1) a general-purpose cluster of 16 RISC-V Digital Signal Processing (DSP) cores attuned for the execution of a diverse range of workloads exploiting 4-bit and 2-bit arithmetic extensions (XpulpNN), combined with fused MAC&LOAD operations and floating-point support; 2) a 2-to-8-bit Reconfigurable Binary Engine (RBE) to accelerate 3x3 and 1x1 (pointwise) convolutions in DNNs; 3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Biasing (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages. Marsellus achieves up to 180 Gop/s or 3.32 Top/s/W on 2-bit precision arithmetic in software, and up to 637 Gop/s or 12.4 Top/s/W on hardware-accelerated DNN layers. | 翻訳日:2023-05-16 15:41:57 公開日:2023-05-15 |
# 今日のNLUにおける超人的パフォーマンスの意味は? What's the Meaning of Superhuman Performance in Today's NLU? ( http://arxiv.org/abs/2305.08414v1 ) ライセンス: Link先を確認 | Simone Tedeschi, Johan Bos, Thierry Declerck, Jan Hajic, Daniel Hershcovich, Eduard H. Hovy, Alexander Koller, Simon Krek, Steven Schockaert, Rico Sennrich, Ekaterina Shutova, Roberto Navigli | (参考訳) 過去5年間、自然言語処理(NLP)において、より大きな事前学習言語モデル(PLM)の開発や、SuperGLUEやSQuADといったベンチマークを導入して、言語理解、推論、理解の能力を測定することに注力してきた。
これらのベンチマークは人間とPLMの比較に重大な制約があることを示し、より公平で透明なベンチマークの推奨を提供する。 In the last five years, there has been a significant focus in Natural Language Processing (NLP) on developing larger Pretrained Language Models (PLMs) and introducing benchmarks such as SuperGLUE and SQuAD to measure their abilities in language understanding, reasoning, and reading comprehension. These PLMs have achieved impressive results on these benchmarks, even surpassing human performance in some cases. This has led to claims of superhuman capabilities and the provocative idea that certain tasks have been solved. In this position paper, we take a critical look at these claims and ask whether PLMs truly have superhuman abilities and what the current benchmarks are really evaluating. We show that these benchmarks have serious limitations affecting the comparison between humans and PLMs and provide recommendations for fairer and more transparent benchmarks. | 翻訳日:2023-05-16 15:41:20 公開日:2023-05-15 |
# 地球観測を前進させる人工知能の展望 Artificial intelligence to advance Earth observation: a perspective ( http://arxiv.org/abs/2305.08413v1 ) ライセンス: Link先を確認 | Devis Tuia, Konrad Schindler, Begüm Demir, Gustau Camps-Valls, Xiao Xiang Zhu, Mrinalini Kochupillai, Sašo Džeroski, Jan N. van Rijn, Holger H. Hoos, Fabio Del Frate, Mihai Datcu, Jorge-Arnulfo Quiané-Ruiz, Volker Markl, Bertrand Le Saux, Rochelle Schneider | (参考訳) 地球観測(EO)は、陸と海洋の過程を監視し、作業中の力学を研究し、地球の脈波を観測する主要な手段である。
具体的には その影響を
三 高度な処理及び計算
(viii)EOにおけるML技術の大量活用に関連する倫理的・社会的問題に関する議論。 Earth observation (EO) is a prime instrument for monitoring land and ocean processes, studying the dynamics at work, and taking the pulse of our planet. This article gives a bird's eye view of the essential scientific tools and approaches informing and supporting the transition from raw EO data to usable EO-based information. The promises, as well as the current challenges of these developments, are highlighted under dedicated sections. Specifically, we cover the impact of (i) Computer vision; (ii) Machine learning; (iii) Advanced processing and computing; (iv) Knowledge-based AI; (v) Explainable AI and causal inference; (vi) Physics-aware models; (vii) User-centric approaches; and (viii) the much-needed discussion of ethical and societal issues related to the massive use of ML technologies in EO. | 翻訳日:2023-05-16 15:41:06 公開日:2023-05-15 |
# SB-VQA: ビデオ強化のためのスタックベースのビデオ品質評価フレームワーク SB-VQA: A Stack-Based Video Quality Assessment Framework for Video Enhancement ( http://arxiv.org/abs/2305.08408v1 ) ライセンス: Link先を確認 | Ding-Jiun Huang, Yu-Ting Kao, Tieh-Hung Chuang, Ya-Chun Tsai, Jing-Kai Lou, Shuen-Huei Guan | (参考訳) 近年,ビデオ品質評価(VQA)手法が開発され,高性能化が図られている。
実験では,既存のvqaアルゴリズムをpgcビデオに適用できることを実証し,pgcビデオのvqa性能を遊びのプロットを考慮して改善できることを見出し,映像意味理解の重要性を強調する。 In recent years, several video quality assessment (VQA) methods have been developed, achieving high performance. However, these methods were not specifically trained for enhanced videos, which limits their ability to predict video quality accurately based on human subjective perception. To address this issue, we propose a stack-based framework for VQA that outperforms existing state-of-the-art methods on VDPVE, a dataset consisting of enhanced videos. In addition to proposing the VQA framework for enhanced videos, we also investigate its application on professionally generated content (PGC). To address copyright issues with premium content, we create the PGCVQ dataset, which consists of videos from YouTube. We evaluate our proposed approach and state-of-the-art methods on PGCVQ, and provide new insights on the results. Our experiments demonstrate that existing VQA algorithms can be applied to PGC videos, and we find that VQA performance for PGC videos can be improved by considering the plot of a play, which highlights the importance of video semantic understanding. | 翻訳日:2023-05-16 15:40:55 公開日:2023-05-15 |
# 深層畳み込みネットワークにおけるインダクティブバイアスの理論解析 Theoretical Analysis of Inductive Biases in Deep Convolutional Networks ( http://arxiv.org/abs/2305.08404v1 ) ライセンス: Link先を確認 | Zihao Wang, Lei Wu | (参考訳) 本稿では,畳み込みニューラルネットワーク(CNN)における帰納バイアスについて検討する。
我々は、$d$ が入力次元である普遍性を達成するには、$\mathcal{o}(\log d)$ の深さが十分であることを証明する。
また, CNNを用いたスパース関数の学習には$\tilde{\mathcal{O}}(\log^2d)$サンプルが必要であることも証明した。
これらの証明可能な分離は2つのバイアスの違いを定量化し、背後にある主要な観察は、重みの共有と局所性が学習プロセスの異なる対称性を損なうことである。 In this paper, we study the inductive biases in convolutional neural networks (CNNs), which are believed to be vital drivers behind CNNs' exceptional performance on vision-like tasks. We first analyze the universality of CNNs, i.e., the ability to approximate continuous functions. We prove that a depth of $\mathcal{O}(\log d)$ is sufficient for achieving universality, where $d$ is the input dimension. This is a significant improvement over existing results that required a depth of $\Omega(d)$. We also prove that learning sparse functions with CNNs needs only $\tilde{\mathcal{O}}(\log^2d)$ samples, indicating that deep CNNs can efficiently capture long-range sparse correlations. Note that all these are achieved through a novel combination of increased network depth and the utilization of multichanneling and downsampling. Lastly, we study the inductive biases of weight sharing and locality through the lens of symmetry. To separate the two biases, we introduce locally-connected networks (LCNs), which can be viewed as CNNs without weight sharing. Specifically, we compare the performance of CNNs, LCNs, and fully-connected networks (FCNs) on a simple regression task. We prove that LCNs require $\Omega(d)$ samples while CNNs need only $\tilde{\mathcal{O}}(\log^2d)$ samples, which highlights the crucial role of weight sharing. We also prove that FCNs require $\Omega(d^2)$ samples while LCNs need only $\tilde{\mathcal{O}}(d)$ samples, demonstrating the importance of locality. These provable separations quantify the difference between the two biases, and the key observation behind them is that weight sharing and locality break different symmetries in the learning process. | 翻訳日:2023-05-16 15:40:38 公開日:2023-05-15 |
# 量子コヒーレンス支援動的相転移 Quantum coherence assisted dynamical phase transition ( http://arxiv.org/abs/2305.08400v1 ) ライセンス: Link先を確認 | Bao-Ming Xu | (参考訳) 量子コヒーレンス(quantum coherence)は、量子多体系のダイナミクスを理解する上で、間違いなく基本的な役割を果たすだろう。
また, 漁獲零点が虚軸近傍に密着している必要があるため, 虚軸を切断する漁獲零点がdqptを生成するには不十分であることがわかった。
この研究は、量子臨界現象と量子コヒーレンスとの基本的な関係に新しい光を放つ。 Quantum coherence will undoubtedly play a fundamental role in understanding the dynamics of quantum many-body systems, so revealing its genuine contribution is of great importance. In this paper, we specialize our discussions to the one-dimensional transverse field quantum Ising model initialized in the coherent Gibbs state, and investigate the effects of quantum coherence on the dynamical quantum phase transition (DQPT). After a quench of the transverse field strength, the effects of quantum coherence are studied via Fisher zeros and the rate function of the Loschmidt echo. We find that quantum coherence not only recovers DQPTs destroyed by thermal fluctuations, but also generates entirely new DQPTs that are independent of the equilibrium quantum critical point. We also find that Fisher zeros cutting the imaginary axis are not sufficient to generate a DQPT: the Fisher zeros must also be bound tightly enough to the neighborhood of the imaginary axis. This indicates that DQPTs are rooted in quantum fluctuations. This work sheds new light on the fundamental connection between quantum critical phenomena and quantum coherence. | 翻訳日:2023-05-16 15:40:02 公開日:2023-05-15 |
# 最適バイアス境界に基づく大域的量子温度測定 Global quantum thermometry based on the optimal biased bound ( http://arxiv.org/abs/2305.08397v1 ) ライセンス: Link先を確認 | Shoukang Chang, Wei Ye, Xuan Rao, Huan Zhang, Liqing Huang, Mengmeng Luo, Yuetao Chen, Qiang Ma, and Shaoyan Gao | (参考訳) 熱測定は,自然科学の発展過程において重要な基本パラメータ推定問題である。
このため、地球規模での温度測定精度の2つの基礎的境界を導出し、非相互作用スピン1/2ガスと一般的なNレベル熱平衡量子プローブの2つの特定の応用により熱測定性能を示す。 Thermometry is a fundamental parameter estimation problem which is crucial in the development process of natural sciences. One way to solve this problem is the extensively used local thermometry theory, which uses the classical and quantum Cramér-Rao bounds as benchmarks of thermometry precision. However, such a theory can only be used to reduce temperature fluctuations around a known temperature value and can hardly tackle precision thermometry over a wide temperature range. For this reason, we derive two basic bounds on thermometry precision in the global setting and illustrate their performance with two specific applications: a noninteracting spin-1/2 gas and a general N-level thermal-equilibrium quantum probe. | 翻訳日:2023-05-16 15:39:43 公開日:2023-05-15 |
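For context, the local benchmark that the abstract contrasts with the global setting can be written out. What follows is the standard textbook form of the Cramér-Rao bound for a probe in a Gibbs state, not a result taken from the paper; the symbols $\nu$ (number of independent measurements), $\mathcal{F}(T)$ (quantum Fisher information), $H$ (probe Hamiltonian) and $C(T)$ (heat capacity) are notation assumed here.

$$
\Delta T \;\ge\; \frac{1}{\sqrt{\nu\,\mathcal{F}(T)}},
\qquad
\mathcal{F}(T) \;=\; \frac{\langle H^2\rangle - \langle H\rangle^2}{k_B^2\,T^4} \;=\; \frac{C(T)}{k_B\,T^2}
$$

Because $\mathcal{F}(T)$ is evaluated at a single temperature, the bound is only informative when $T$ is already known to lie close to that value, which is the limitation that appears to motivate bounds for the global setting.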
# 文書理解データセットと評価(DUDE) Document Understanding Dataset and Evaluation (DUDE) ( http://arxiv.org/abs/2305.08455v1 ) ライセンス: Link先を確認 | Jordy Landeghem, Rubén Tito, Łukasz Borchmann, Michał Pietruszka, Paweł Józiak, Rafał Powalski, Dawid Jurkiewicz, Mickaël Coustaty, Bertrand Ackaert, Ernest Valveny, Matthew Blaschko, Sien Moens, Tomasz Stanisławek | (参考訳) 私たちはDocAIコミュニティに、現在の方法論を再評価し、より実用的なベンチマークを作成するという課題を受け入れるよう呼びかけています。
Document Understanding Dataset and Evaluation (DUDE) は、視覚的にリッチなドキュメント(VRD)の理解において、中断した研究の進捗を改善しようとしている。
最後に、docaiで言語、画像、レイアウトをモデル化するより効率的な方法を見つけることの重要性を説明している。 We call on the Document AI (DocAI) community to reevaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins, and dates. Moreover, we are pushing the boundaries of current methods by creating multi-task and multi-domain evaluation setups that more accurately simulate real-world situations where powerful generalization and adaptation under low-resource settings are desired. DUDE aims to set a new standard as a more practical, long-standing benchmark for the community, and we hope that it will lead to future extensions and contributions that address real-world challenges. Finally, our work illustrates the importance of finding more efficient ways to model language, images, and layout in DocAI. | 翻訳日:2023-05-16 15:32:08 公開日:2023-05-15 |
# 線形光絡み合わせ発生のための隠蔽回路の誤差 Errors in heralded circuits for linear optical entanglement generation ( http://arxiv.org/abs/2305.08452v1 ) ライセンス: Link先を確認 | Reece D. Shaw, Alex E. Jones, Patrick Yard, Anthony Laing | (参考訳) エンタングル状態のヘラルド生成は多くのフォトニック量子技術を支える。
非計算リーク(英: non-computational leakage、例えば、デュアルレール符号化された量子ビットを占有する複数の光子)は、標準的な状態トモグラフィーでは捉えられない誤差であり、計算部分空間に残る光子をポストセレクトする。
これらのツールを用いて、様々なベル状態生成回路を分析し、5つの光子離散フーリエ変換(dft)ベル状態生成スキーム[phys rev. lett. 126 23054 (2021)]が、近理想光子に対して最も頑健であることを示す。
フォトニックエンタングリングゲートのキャラクタリゼーションにより, 現在の断層撮影法を用いて, 漏洩誤差が連結ゲートのモジュラーキャラクタリゼーションを阻害することを示す。
我々の研究は、フォールトトレラントフォトニック量子コンピューティングアーキテクチャで対処しなければならない真のノイズモデルを明らかにするための必要なステップである。 The heralded generation of entangled states underpins many photonic quantum technologies. As quantum error correction thresholds are determined by underlying physical noise mechanisms, a detailed and faithful characterization of resource states is required. Non-computational leakage, e.g. more than one photon occupying a dual-rail encoded qubit, is an error not captured by standard forms of state tomography, which postselect on photons remaining in the computational subspace. Here we use the continuous-variable (CV) formalism and first-quantized state representation to develop a simulation framework that reconstructs photonic quantum states in the presence of partial distinguishability and resulting non-computational leakage errors. Using these tools, we analyze a variety of Bell state generation circuits and find that the five-photon discrete Fourier transform (DFT) Bell state generation scheme [Phys. Rev. Lett. 126 23054 (2021)] is most robust to such errors for near-ideal photons. Through characterization of a photonic entangling gate, we demonstrate how leakage errors prevent a modular characterization of concatenated gates using current tomographic procedures. Our work is a necessary step in revealing the true noise models that must be addressed in fault-tolerant photonic quantum computing architectures. | 翻訳日:2023-05-16 15:31:50 公開日:2023-05-15 |
# マルチエージェントパス探索における追跡の進歩 Tracking Progress in Multi-Agent Path Finding ( http://arxiv.org/abs/2305.08446v1 ) ライセンス: Link先を確認 | Bojie Shen, Zhe Chen, Muhammad Aamir Cheema, Daniel D. Harabor and Peter J. Stuckey | (参考訳) マルチエージェントパス探索(mapf)は、多くの新興産業アプリケーションにとって重要なコア問題である。
本研究の目的は,新しい研究者の参入障壁を低くし,MAPFの研究をさらに促進することにある。 Multi-Agent Path Finding (MAPF) is an important core problem for many new and emerging industrial applications. Many works appear on this topic each year, and a large number of substantial advancements and performance improvements have been reported. Yet measuring overall progress in MAPF is difficult: there are many potential competitors, and the computational burden for comprehensive experimentation is prohibitively large. Moreover, detailed data from past experimentation is usually unavailable. In this work, we introduce a set of methodological and visualisation tools which can help the community establish clear indicators for state-of-the-art MAPF performance and which can facilitate large-scale comparisons between MAPF solvers. Our objectives are to lower the barrier of entry for new researchers and to further promote the study of MAPF, since progress in the area and the main challenges are made much clearer. | 翻訳日:2023-05-16 15:31:23 公開日:2023-05-15 |
# ハイブリッド量子システムにおける量子干渉誘起マグノン遮断とアンチバンチング Quantum interference induced magnon blockade and antibunching in a hybrid quantum system ( http://arxiv.org/abs/2305.08444v1 ) ライセンス: Link先を確認 | Pooja Kumari Gupta, Sampreet Kalita and Amarendra K. Sarma | (参考訳) 本研究では、弱い相互作用を持つハイブリッド強磁性体-超伝導系における量子干渉支援マグノン遮断とマグノンアンチバンチングの現象を研究する。
本研究は,単一マグノン発生装置の構築において重要な役割を果たす手法を提案する。 In this work, we study the phenomena of quantum-interference-assisted magnon blockade and magnon antibunching in a weakly interacting hybrid ferromagnet-superconductor system. The magnon excitations in two yttrium iron garnet spheres are indirectly coupled to a superconducting qubit through microwave cavity modes of two mutually perpendicular cavities. We find that when one of the magnon modes is driven by a weak optical field, the destructive interference between more than two distinct transition pathways restricts simultaneous excitation of two magnons. We analyze the magnon correlations in the driven magnon mode for the case of zero detunings as well as finite detunings of the magnon modes and the qubit. We show that the magnon antibunching can be tuned by changing the magnon-qubit coupling strength ratio and the driving detuning. Our work proposes a possible scheme that could play a significant role in the construction of single-magnon generating devices. | 翻訳日:2023-05-16 15:31:10 公開日:2023-05-15 |
# 結合量子オットーエンジンの最大出力 Maximum Power of Coupled-Qubit Otto Engines ( http://arxiv.org/abs/2305.08440v1 ) ライセンス: Link先を確認 | Jingyi Gao and Naomichi Hatano | (参考訳) 我々は,結合量子ビット量子オットーマシンと2つの熱浴と2つのワークストレージからなる外部環境とからなる内部システム間の作業と熱伝達に基づく,単一量子ビット量子オットーマシンの一般化である結合量子オットーマシンの4つのスキームを提唱した。
第2に、結合キュービットエンジンと単一キュービットエンジンを、同一のエネルギーレベルの変化に基づいて最大電力を達成する観点から比較し、2つのキュービット間のカップリングによりより大きなパワーが得られるが、最大電力におけるシステム効率は、単一キュービットシステムの効率とカーゾン=アルボーン効率よりも低いことを見出した。 We put forward four schemes for a coupled-qubit quantum Otto machine, a generalization of the single-qubit quantum Otto machine, based on work and heat transfer between an internal system consisting of a coupled pair of qubits and an external environment consisting of two heat baths and two work storages. The four schemes are defined by the positions at which the heat baths are attached, which play a key role in the power of the coupled-qubit engine. First, for the single-qubit heat engine, we find a maximum-power relation and show that its efficiency at maximum power equals the Otto efficiency, which is greater than the Curzon-Ahlborn efficiency. Second, we compare the coupled-qubit engines to the single-qubit one from the point of view of achieving the maximum power based on the same energy-level change for work production, and find that the coupling between the two qubits can lead to greater power, but the system efficiency at maximum power is lower than both the single-qubit system's efficiency and the Curzon-Ahlborn efficiency. | 翻訳日:2023-05-16 15:30:57 公開日:2023-05-15 |
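The efficiencies compared in this abstract have standard closed forms, which may help in reading the claim. The expressions below are the usual textbook definitions, not formulas quoted from the paper; $\omega_h$ and $\omega_c$ (the qubit level spacings during contact with the hot and cold baths) and $T_h$, $T_c$ (the bath temperatures) are notation assumed here.

$$
\eta_{\mathrm{Otto}} = 1 - \frac{\omega_c}{\omega_h},
\qquad
\eta_{\mathrm{CA}} = 1 - \sqrt{\frac{T_c}{T_h}},
\qquad
\eta_{\mathrm{Carnot}} = 1 - \frac{T_c}{T_h}
$$

$$
\eta_{\mathrm{CA}} < \eta_{\mathrm{Otto}} < \eta_{\mathrm{Carnot}}
\;\iff\;
\frac{T_c}{T_h} < \frac{\omega_c}{\omega_h} < \sqrt{\frac{T_c}{T_h}}
$$

Under these definitions, an Otto efficiency above the Curzon-Ahlborn value is compatible with the Carnot limit whenever the gap ratio falls inside the window on the right, which is never empty because $T_c/T_h < 1$.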
# 一般ロバスト性に対する逆画像の周波数スペクトルの爆発 Exploiting Frequency Spectrum of Adversarial Images for General Robustness ( http://arxiv.org/abs/2305.08439v1 ) ライセンス: Link先を確認 | Chun Yang Tan, Kazuhiko Kawamoto, Hiroshi Kera | (参考訳) 近年、画像摂動に対する畳み込みニューラルネットワーク(CNN)の脆弱性に対する懸念が高まっている。
本稿では, 相成分に着目した逆行訓練が, クリーン, 逆行, 一般的な汚職精度のモデル性能を著しく向上することを示す。
そこで本研究では,クリーン画像とadversarial画像の振幅スペクトルを交換し,adversarial amplitudeとadversarial phase imageの2つの新しいトレーニング画像を生成する周波数ベースデータ拡張法であるadversarial amplitude swapを提案する。
広範にわたる実験により,我々はCNNが様々な種類の摂動に対して全般的に堅牢性を得ることができ,その結果,あらゆる種類の共通汚職に対して均一な性能が得られることを示した。 In recent years, there has been growing concern over the vulnerability of convolutional neural networks (CNNs) to image perturbations. However, achieving general robustness against different types of perturbations remains challenging, because enhancing robustness to some perturbations (e.g., adversarial perturbations) may degrade robustness to others (e.g., common corruptions). In this paper, we demonstrate that adversarial training with an emphasis on phase components significantly improves model performance on clean, adversarial, and common corruption accuracies. We propose a frequency-based data augmentation method, Adversarial Amplitude Swap, that swaps the amplitude spectrum between clean and adversarial images to generate two novel training images: adversarial amplitude and adversarial phase images. These images act as substitutes for adversarial images and can be used in various adversarial training setups. Through extensive experiments, we demonstrate that our method enables CNNs to gain general robustness against different types of perturbations and yields uniform performance against all types of common corruptions. | 翻訳日:2023-05-16 15:30:36 publicly:2023-05-15 |
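The amplitude swap described in this abstract can be illustrated with a small sketch. This is one plausible reading of the abstract, not the authors' implementation: it assumes that the "adversarial amplitude" image pairs the adversarial amplitude spectrum with the clean phase, and that the "adversarial phase" image pairs the clean amplitude with the adversarial phase. The names `dft2`, `idft2`, `recombine`, `amplitude_swap`, `CLEAN` and `ADV` are hypothetical, the `ADV` image is a hand-made stand-in for the output of a real attack, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def dft2(img):
    """Naive 2-D DFT of a small real image given as a list of rows."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(
                img[y][x] * cmath.exp(-2j * math.pi * (u * y / h + v * x / w))
                for y in range(h)
                for x in range(w)
            )
            for v in range(w)
        ]
        for u in range(h)
    ]


def idft2(spec):
    """Inverse of dft2; keeps only the real part of the reconstruction."""
    h, w = len(spec), len(spec[0])
    return [
        [
            sum(
                spec[u][v] * cmath.exp(2j * math.pi * (u * y / h + v * x / w))
                for u in range(h)
                for v in range(w)
            ).real / (h * w)
            for x in range(w)
        ]
        for y in range(h)
    ]


def recombine(amp_src, phase_src):
    """Spectrum with the amplitude of amp_src and the phase of phase_src."""
    return [
        [cmath.rect(abs(a), cmath.phase(p)) for a, p in zip(row_a, row_p)]
        for row_a, row_p in zip(amp_src, phase_src)
    ]


def amplitude_swap(clean, adv):
    """Return (adversarial-amplitude image, adversarial-phase image).

    Assumed reading: the first output uses the adversarial amplitude with the
    clean phase, the second uses the clean amplitude with the adversarial phase.
    Clipping to a valid pixel range is omitted.
    """
    f_clean, f_adv = dft2(clean), dft2(adv)
    adv_amplitude_img = idft2(recombine(f_adv, f_clean))
    adv_phase_img = idft2(recombine(f_clean, f_adv))
    return adv_amplitude_img, adv_phase_img


# Toy 3x3 "images"; ADV is a hypothetical perturbed copy of CLEAN.
CLEAN = [[1, 2, 4], [3, 5, 9], [6, 8, 7]]
ADV = [[2, 2, 4], [3, 4, 9], [6, 8, 8]]
```

Calling `amplitude_swap(CLEAN, ADV)` returns the two augmented images. In a training setup of the kind the abstract describes they would stand in for the adversarial example itself; for real images an FFT routine would replace the naive transform.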
# 深部熱化の非局在性 Nonlocality of Deep Thermalization ( http://arxiv.org/abs/2305.08437v1 ) ライセンス: Link先を確認 | Harshank Shrotriya, Wen Wei Ho | (参考訳) 本研究では, 深部熱処理におけるトポロジーの役割, 最大エントロピー, 均一な測定状態分布への局所サブシステムの緩和, および局所的に相補的なサブシステムの観察について検討した。
具体的には,「最大カオス」ダイナミクスを示す (1+1)d 系のクラスに着目し,そのような普遍波動関数分布の形成速度が系の境界条件にどのように依存するかを検討する。
深部熱化は周期的・開境界条件のいずれかの存在下で指数関数的に高速に達成されるが, 前者の方が後者に比べて2倍の速さで進行する。
この発見は深部熱化の非局所的な性質を強調し、この現象の基礎となる物理が標準量子化のそれを超えることを明らかに示しており、これはサブシステムと補体の絡み合いの純的蓄積に依存している。 We study the role of topology in governing deep thermalization, the relaxation of a local subsystem towards a maximally-entropic, uniform distribution of post-measurement states, upon observing the complementary subsystem in a local basis. Concretely, we focus on a class of (1+1)d systems exhibiting `maximally-chaotic' dynamics, and consider how the rate of formation of such a universal wavefunction distribution depends on boundary conditions of the system. We find that deep thermalization is achieved exponentially quickly in the presence of either periodic or open boundary conditions; however, the rate at which this occurs is twice as fast for the former as for the latter. These results are attained analytically using the calculus of integration over unitary groups, and supported by extensive numerical simulations. Our findings highlight the nonlocal nature of deep thermalization, and clearly illustrate that the physics underlying this phenomenon goes beyond that of standard quantum thermalization, which only depends on the net build-up of entanglement between a subsystem and its complement. | 翻訳日:2023-05-16 15:30:19 公開日:2023-05-15 |
# EMBRACE: ブースティング RACE の評価と修正 EMBRACE: Evaluation and Modifications for Boosting RACE ( http://arxiv.org/abs/2305.08433v1 ) ライセンス: Link先を確認 | Mariia Zyrianova, Dmytro Kalpakchi, Johan Boye | (参考訳) 機械読影理解モデルの訓練と評価には,実世界の読影理解タスクを代表する高品質なデータセットを扱うことが重要である。
構築上, RACEは上記の品質要件を満たすべきであり, 本記事の目的は, それらが本当に満足しているかどうかを確認することである。
また,mcq応答と生成モデルの評価において必ずしも望ましいものではないテキストの特定の部分に対して,代替語のベースの位置分布が偏っていることを実証した。 When training and evaluating machine reading comprehension models, it is very important to work with high-quality datasets that are also representative of real-world reading comprehension tasks. This requirement includes, for instance, having questions that are based on texts of different genres and require generating inferences or reflecting on the reading material. In this article we turn our attention to RACE, a dataset of English texts and corresponding multiple-choice questions (MCQs). Each MCQ consists of a question and four alternatives (of which one is the correct answer). RACE was constructed by Chinese teachers of English for human reading comprehension and is widely used as training material for machine reading comprehension models. By construction, RACE should satisfy the aforementioned quality requirements and the purpose of this article is to check whether they are indeed satisfied. We provide a detailed analysis of the test set of RACE for high-school students (1045 texts and 3498 corresponding MCQs) including (1) an evaluation of the difficulty of each MCQ and (2) annotations for the relevant pieces of the texts (called "bases") that are used to justify the plausibility of each alternative. A considerable number of MCQs appear not to fulfill basic requirements for this type of reading comprehension tasks, so we additionally identify the high-quality subset of the evaluated RACE corpus. We also demonstrate that the distribution of the positions of the bases for the alternatives is biased towards certain parts of texts, which is not necessarily desirable when evaluating MCQ answering and generation models. | 翻訳日:2023-05-16 15:29:58 公開日:2023-05-15 |
# quanta iff 離散性 Quanta Iff Discreteness ( http://arxiv.org/abs/2305.08431v1 ) ライセンス: Link先を確認 | Marcello Poletti | (参考訳) ここでは、量子力学の基礎に関する短い哲学的考察を紹介する。
さらに、「関係解釈の解釈」が支持され、論理的な不確定性の問題と組み合わせることで、qmの明らかな非論理性が論理の領域内に置かれ、通常のパラドックスに効果的に対応できる有望なアプローチが生み出される。 A brief philosophical inquiry into the foundations of quantum mechanics is presented here. In particular, the direct relationship between granularity, discontinuity, and the presence of quantum effects will be argued. Furthermore, an "interpretation of relational interpretation" will be supported, which, in combination with the problem of logical undecidability, produces a promising approach that places the apparent illogicality of QM within the realm of logic and effectively addresses its usual paradoxes. | 翻訳日:2023-05-16 15:29:34 公開日:2023-05-15 |
# 米国の裁判所意見の法的抽出的要約 Legal Extractive Summarization of U.S. Court Opinions ( http://arxiv.org/abs/2305.08428v1 ) ライセンス: Link先を確認 | Emmanuel Bauer, Dominik Stammbach, Nianlong Gu, Elliott Ash | (参考訳) 本稿では,米国裁判所の430k意見のデータセットに注釈を付した,法的抽出要約の課題について述べる。
これは、法を民主化し、アメリカ合衆国裁判所の意見を一般大衆に公開するための進歩を表している。 This paper tackles the task of legal extractive summarization using a dataset of 430K U.S. court opinions with key passages annotated. According to automated summary quality metrics, the reinforcement-learning-based MemSum model is best and even out-performs transformer-based models. In turn, expert human evaluation shows that MemSum summaries effectively capture the key points of lengthy court opinions. Motivated by these results, we open-source our models to the general public. This represents progress towards democratizing law and making U.S. court opinions more accessible to the general public. | 翻訳日:2023-05-16 15:29:25 公開日:2023-05-15 |
# 1335言語における概念化の言語間比較 A Crosslingual Investigation of Conceptualization in 1335 Languages ( http://arxiv.org/abs/2305.08475v1 ) ライセンス: Link先を確認 | Yihong Liu, Haotian Ye, Leonie Weissweiler, Philipp Wicke, Renhao Pei, Robert Zangenfeind, Hinrich Schütze | (参考訳) 例えば、英語とは対照的に、スワヒリ語は『belly』と『womb』の1つの概念を持っている。
1) 概念の言語間安定性を言語間の1-1対応度として定義し, 具体性が安定性を予測することを示す。
2) 83概念に対する概念化パターンを用いて各言語を表現し, それらの表現について類似度尺度を定義する。
6つの言語ファミリーのうち4つでは、54%から87%の精度で概念的類似性に基づいて、言語を正しい家族に割り当てることができます。 Languages differ in how they divide up the world into concepts and words; e.g., in contrast to English, Swahili has a single concept for 'belly' and 'womb'. We investigate these differences in conceptualization across 1,335 languages by aligning concepts in a parallel corpus. To this end, we propose Conceptualizer, a method that creates a bipartite directed alignment graph between source language concepts and sets of target language strings. In a detailed linguistic analysis across all languages for one concept ('bird') and an evaluation on gold standard data for 32 Swadesh concepts, we show that Conceptualizer has good alignment accuracy. We demonstrate the potential of research on conceptualization in NLP with two experiments. (1) We define crosslingual stability of a concept as the degree to which it has 1-1 correspondences across languages, and show that concreteness predicts stability. (2) We represent each language by its conceptualization pattern for 83 concepts, and define a similarity measure on these representations. The resulting measure for the conceptual similarity of two languages is complementary to standard genealogical, typological, and surface similarity measures. For four out of six language families, we can assign languages to their correct family based on conceptual similarity with accuracy between 54% and 87%. | 翻訳日:2023-05-16 15:23:27 公開日:2023-05-15 |
# ディープモードアライメントと自己教師付きマルチタスク学習を用いたマルチモーダル感情分析における共有およびプライベート情報学習 Shared and Private Information Learning in Multimodal Sentiment Analysis with Deep Modal Alignment and Self-supervised Multi-Task Learning ( http://arxiv.org/abs/2305.08473v1 ) ライセンス: Link先を確認 | Songning Lai, Xifeng Hu, Yulong Li, Zhaoxia Ren, Zhi Liu and Danmin Miao | (参考訳) マルチモーダル感情分析タスクのための効果的な表現学習法の設計は重要な研究方向である。
当社のアプローチは,3つの公開データセットの指標のほとんどにおいて,最先端の手法よりも優れています。 Designing an effective representation learning method for multimodal sentiment analysis tasks is a crucial research direction. The challenge lies in learning both shared and private information in a complete modal representation, which is difficult with uniform multimodal labels and a raw feature fusion approach. In this work, we propose a deep modal shared information learning module based on the covariance matrix to capture the shared information between modalities. Additionally, we use a label generation module based on a self-supervised learning strategy to capture the private information of the modalities. Our module is plug-and-play in multimodal tasks, and by changing the parameterization, it can adjust the information exchange relationship between the modes and learn the private or shared information between the specified modes. We also employ a multi-task learning strategy to help the model focus its attention on the modal differentiation training data. We provide a detailed formulation derivation and feasibility proof for the design of the deep modal shared information learning module. We conduct extensive experiments on three common multimodal sentiment analysis baseline datasets, and the experimental results validate the reliability of our model. Furthermore, we explore more combinatorial techniques for the use of the module. Our approach outperforms current state-of-the-art methods on most of the metrics of the three public datasets. | 翻訳日:2023-05-16 15:23:07 公開日:2023-05-15 |
# Beyond Gaussian Quantum Channels: A model case ( http://arxiv.org/abs/2305.08467v1 ) License: see link | Daniel Speed, Wenyang Lyu and Roman Schubert | (Reference translation) Gaussian quantum channels are well understood and have many applications in quantum information theory and quantum optics.
Finally, we apply these results to study the evolution of the von Neumann entropy of a state. Gaussian quantum channels are well understood and have many applications, e.g., in Quantum Information Theory and in Quantum Optics. For more general quantum channels one can in general use semiclassical approximations or perturbation theory, but it is not easy to judge the accuracy of such methods. We study a relatively simple model case, where the quantum channel is generated by a Lindblad equation where one of the Lindblad operators is a multiple of the internal Hamiltonian, and therefore the channel is not Gaussian. For this model we can compute the characteristic function of the action of the channel on a Gaussian state explicitly and we can as well derive a representation of the propagator in an integral form. This allows us to compare the exact results with semiclassical approximations and perturbation theory and evaluate their accuracy. We finally apply these results to the study of the evolution of the von Neumann entropy of a state. | Translated: 2023-05-16 15:22:46 Published: 2023-05-15 |
# Nearly Optimal VC-Dimension and Pseudo-Dimension Bounds for Deep Neural Network Derivatives ( http://arxiv.org/abs/2305.08466v1 ) License: see link | Yahong Yang, Haizhao Yang, Yang Xiang | (Reference translation) This paper addresses the problem of nearly optimal Vapnik-Chervonenkis dimension (VC-dimension) and pseudo-dimension estimation for the derivatives of deep neural networks (DNNs).
Two important applications of these estimates are: 1) establishing a nearly tight approximation result for DNNs in Sobolev spaces;
2) characterizing the generalization error of machine learning methods whose loss functions involve function derivatives.
This theoretical investigation fills the gap in learning error estimates for a wide range of physics-informed machine learning models and applications, including generative models, solving partial differential equations, operator learning, network compression, distillation, and regularization. This paper addresses the problem of nearly optimal Vapnik-Chervonenkis dimension (VC-dimension) and pseudo-dimension estimations of the derivative functions of deep neural networks (DNNs). Two important applications of these estimations include: 1) Establishing a nearly tight approximation result of DNNs in the Sobolev space; 2) Characterizing the generalization error of machine learning methods with loss functions involving function derivatives. This theoretical investigation fills the gap of learning error estimations for a wide range of physics-informed machine learning models and applications including generative models, solving partial differential equations, operator learning, network compression, distillation, regularization, etc. | Translated: 2023-05-16 15:22:32 Published: 2023-05-15 |
# Braiding-based quantum control of a Majorana qubit built from quantum dots ( http://arxiv.org/abs/2305.08464v1 ) License: see link | Péter Boross and András Pályi | (Reference translation) Topology-related ideas might lead to noise-resilient quantum computing.
Our protocol incorporates non-protected control, braiding-based protected control, and readout of the Majorana qubit.
We provide quantitative guidelines for suppressing both diabatic errors and disorder-induced qubit dephasing, such that a fidelity plateau is observed as the hallmark of the topological quantum gate.
Our simulations predict realistic features that are expected to appear in future braiding experiments with Majorana zero modes and other topological qubit architectures. Topology-related ideas might lead to noise-resilient quantum computing. For example, it is expected that the slow spatial exchange ('braiding') of Majorana zero modes in superconductors yields quantum gates that are robust against disorder. Here, we report our numerical experiments, which describe the dynamics of a Majorana qubit built from quantum dots controlled by time-dependent gate voltages. Our protocol incorporates non-protected control, braiding-based protected control, and readout, of the Majorana qubit. We use the Kitaev chain model for the simulations, and focus on the case when the main source of errors is quasistatic charge noise affecting the hybridization energy splitting of the Majorana modes. We provide quantitative guidelines to suppress both diabatic errors and disorder-induced qubit dephasing, such that a fidelity plateau is observed as the hallmark of the topological quantum gate. Our simulations predict realistic features that are expected to be seen in future braiding experiments with Majorana zero modes and other topological qubit architectures. | Translated: 2023-05-16 15:22:19 Published: 2023-05-15 |
# Convergence Analysis of Mean Shift ( http://arxiv.org/abs/2305.08463v1 ) License: see link | Ryoya Yamasaki, Toshiyuki Tanaka | (Reference translation) The mean shift (MS) algorithm seeks a mode of the kernel density estimate (KDE).
Our findings, which extend existing results covering analytic kernels and the Epanechnikov kernel, are significant in that they cover the biweight kernel, which is optimal among non-negative kernels in terms of asymptotic statistical efficiency for KDE-based mode estimation. The mean shift (MS) algorithm seeks a mode of the kernel density estimate (KDE). This study presents a convergence guarantee of the mode estimate sequence generated by the MS algorithm and an evaluation of the convergence rate, under fairly mild conditions, with the help of the argument concerning the Łojasiewicz inequality. Our findings, which extend existing ones covering analytic kernels and the Epanechnikov kernel, are significant in that they cover the biweight kernel that is optimal among non-negative kernels in terms of the asymptotic statistical efficiency for the KDE-based mode estimation. | Translated: 2023-05-16 15:22:02 Published: 2023-05-15 |
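The mean shift iteration summarized in the entry above can be sketched in a few lines. This is a minimal one-dimensional illustration and not the authors' code: the function names, the sample data, and the bandwidth `h` are assumptions made for this example. It uses the standard weighted-mean update; for the biweight kernel, whose profile is $k(s) = (1-s)^2$, the weights are proportional to $-k'(u^2)$, that is, to $1-u^2$ on $|u| \le 1$.

```python
def biweight_weight(u):
    """Mean shift weight for the biweight kernel K(u) ~ (1 - u^2)^2 on |u| <= 1.

    The weight is proportional to -k'(u^2) for the profile k(s) = (1 - s)^2,
    i.e. proportional to 1 - u^2 inside the support and 0 outside.
    """
    return max(0.0, 1.0 - u * u)


def mean_shift_mode(data, start, h, tol=1e-12, max_iter=1000):
    """Iterate the mean shift update from `start` and return the mode estimate."""
    x = float(start)
    for _ in range(max_iter):
        weights = [biweight_weight((x - xi) / h) for xi in data]
        total = sum(weights)
        if total == 0.0:
            # No sample lies within the kernel support, so the update is undefined.
            return x
        # Move to the weighted mean of the samples near the current estimate.
        x_new = sum(w * xi for w, xi in zip(weights, data)) / total
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Hypothetical sample with two clusters, one around 0 and one around 5.
samples = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.9, 5.0, 5.1]
mode_left = mean_shift_mode(samples, start=0.5, h=1.0)
mode_right = mean_shift_mode(samples, start=4.6, h=1.0)
print(mode_left, mode_right)
```

Each starting point moves to the mode of the cluster whose samples fall inside its kernel window, which is why the two runs end near different values. A start with no samples within distance `h` is returned unchanged, a design choice of this sketch rather than part of the algorithm's definition.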
# Not All Pixels Are Equal: Learning Pixel Hardness for Semantic Segmentation ( http://arxiv.org/abs/2305.08462v1 ) License: see link | Xin Xiao, Daiguo Zhou, Jiagao Hu, Yi Hu, Yongchao Xu | (Reference translation) Semantic segmentation has made great progress in recent years.
The source code is available at https://github.com/Menoly-xin/Hardness-Level-Learning . Semantic segmentation has recently witnessed great progress. Despite the impressive overall results, the segmentation performance in some hard areas (e.g., small objects or thin parts) is still not promising. A straightforward solution is hard sample mining, which is widely used in object detection. Yet, most existing hard pixel mining strategies for semantic segmentation often rely on pixel's loss value, which tends to decrease during training. Intuitively, the pixel hardness for segmentation mainly depends on image structure and is expected to be stable. In this paper, we propose to learn pixel hardness for semantic segmentation, leveraging hardness information contained in global and historical loss values. More precisely, we add a gradient-independent branch for learning a hardness level (HL) map by maximizing hardness-weighted segmentation loss, which is minimized for the segmentation head. This encourages large hardness values in difficult areas, leading to appropriate and stable HL map. Despite its simplicity, the proposed method can be applied to most segmentation methods with no and marginal extra cost during inference and training, respectively. Without bells and whistles, the proposed method achieves consistent/significant improvement (1.37% mIoU on average) over most popular semantic segmentation methods on Cityscapes dataset, and demonstrates good generalization ability across domains. The source codes are available at https://github.com/Menoly-xin/Hardness-Level-Learning . | Translated: 2023-05-16 15:21:51 Published: 2023-05-15 |
# Quantum reliability ( http://arxiv.org/abs/2305.08461v1 ) License: see link | L.X.Cui, Y-M.Du, and C.P.Sun | (Reference translation) This study investigates the reliability of functioning systems that depend on quantum coherence.
This effect is particularly relevant for quantum complexes with multiple interacting subsystems that require precise operation. The present study investigates the reliability of functioning systems that depend on quantum coherence. In contrast to the conventional notion of reliability in industry and technology, which is evaluated using probabilistic measurements of binary logical variables, quantum reliability is grounded in the quantum probability amplitude, or wave function, due to the interference between different system trajectories. A system of quantum storage with a fault-tolerance structure is presented to illustrate the definition and calculation of quantum reliability. Our findings reveal that quantum coherence alters the relationship between a system's reliability and that of its subsystems, compared to classical cases. This effect is particularly relevant for quantum complexes with multiple interacting subsystems that require a precise operation. | Translated: 2023-05-16 15:21:21 Published: 2023-05-15 |
# Introduction to dynamical mean-field theory of generic random neural networks ( http://arxiv.org/abs/2305.08459v1 ) License: see link | Wenxuan Zou and Haiping Huang | (Reference translation) Dynamical mean-field theory is a powerful physics tool used to analyze the typical behavior of neural networks.
The numerical implementation for solving the integro-differential mean-field equations is also detailed, with an illustration that explores the fluctuation-dissipation theorem. Dynamical mean-field theory is a powerful physics tool used to analyze the typical behavior of neural networks, where neurons can be recurrently connected, or multiple layers of neurons can be stacked. However, it is not easy for beginners to access the essence of this tool and the underlying physics. Here, we give a pedagogical introduction of this method in a particular example of generic random neural networks, where neurons are randomly and fully connected by correlated synapses and therefore the network exhibits rich emergent collective dynamics. We also review related past and recent important works applying this tool. In addition, a physically transparent and alternative method, namely the dynamical cavity method, is also introduced to derive exactly the same results. The numerical implementation of solving the integro-differential mean-field equations is also detailed, with an illustration of exploring the fluctuation dissipation theorem. | Translated: 2023-05-16 15:21:09 Published: 2023-05-15 |
# MolHF: A Hierarchical Normalizing Flow for Molecular Graph Generation ( http://arxiv.org/abs/2305.08457v1 ) License: see link | Yiheng Zhu, Zhenqiu Ouyang, Ben Liao, Jialu Wu, Yixuan Wu, Chang-Yu Hsieh, Tingjun Hou, Jian Wu | (Reference translation) Molecular de novo design is an important task in scientific fields, aiming to design novel molecular structures with desired property profiles.
The code and models are available at https://github.com/violet-sto/MolHF. Molecular de novo design is a critical yet challenging task in scientific fields, aiming to design novel molecular structures with desired property profiles. Significant progress has been made by resorting to generative models for graphs. However, limited attention is paid to hierarchical generative models, which can exploit the inherent hierarchical structure (with rich semantic information) of the molecular graphs and generate complex molecules of larger size that we shall demonstrate to be difficult for most existing models. The primary challenge to hierarchical generation is the non-differentiable issue caused by the generation of intermediate discrete coarsened graph structures. To sidestep this issue, we cast the tricky hierarchical generation problem over discrete spaces as the reverse process of hierarchical representation learning and propose MolHF, a new hierarchical flow-based model that generates molecular graphs in a coarse-to-fine manner. Specifically, MolHF first generates bonds through a multi-scale architecture, then generates atoms based on the coarsened graph structure at each scale. We demonstrate that MolHF achieves state-of-the-art performance in random generation and property optimization, implying its high capacity to model data distribution. Furthermore, MolHF is the first flow-based model that can be applied to model larger molecules (polymer) with more than 100 heavy atoms. The code and models are available at https://github.com/violet-sto/MolHF. | Translated: 2023-05-16 15:20:52 Published: 2023-05-15 |
# A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization ( http://arxiv.org/abs/2305.08503v1 ) License: see link | Chenhui Shen, Liying Cheng, Yang You, Lidong Bing | (Reference translation) Pre-trained language models (PLMs) have achieved impressive results in abstractive single-document summarization (SDS).
However, such benefits do not readily extend to multi-document summarization (MDS), where the interactions among documents are more complex.
Extensive experiments show that the proposed method achieves consistent improvements on these datasets, outperforming the previous best models, and even achieves better or competitive results compared with some models that use additional MDS pre-training or more model parameters. Pre-trained language models (PLMs) have accomplished impressive achievements in abstractive single-document summarization (SDS). However, such benefits may not be readily extended to multi-document summarization (MDS), where the interactions among documents are more complex. Previous works either design new architectures or new pre-training objectives for MDS, or apply PLMs to MDS without considering the complex document interactions. While the former does not make full use of previous pre-training efforts and may not generalize well across multiple domains, the latter cannot fully attend to the intricate relationships unique to MDS tasks. In this paper, we enforce hierarchy on both the encoder and decoder and seek to make better use of a PLM to facilitate multi-document interactions for the MDS task. We test our design on 10 MDS datasets across a wide range of domains. Extensive experiments show that our proposed method can achieve consistent improvements on all these datasets, outperforming the previous best models, and even achieving better or competitive results as compared to some models with additional MDS pre-training or larger model parameters. | Translated: 2023-05-16 15:13:07 Published: 2023-05-15 |
# MeeQA: Natural Questions in Meeting Transcripts ( http://arxiv.org/abs/2305.08502v1 ) License: see link | Reut Apel, Tom Braude, Amir Kantor, Eyal Kolman | (Reference translation) We present MeeQA, a dataset for natural-language question answering over meeting transcripts.
To improve baseline model performance on this type of question, we also propose a novel loss function, Flat Hierarchical Loss, designed to enhance performance on questions with no answer in the text.
Our experiments demonstrate the advantage of our approach over standard QA models. We present MeeQA, a dataset for natural-language question answering over meeting transcripts. It includes real questions asked during meetings by its participants. The dataset contains 48K question-answer pairs, extracted from 422 meeting transcripts, spanning multiple domains. Questions in transcripts pose a special challenge as they are not always clear, and considerable context may be required in order to provide an answer. Further, many questions asked during meetings are left unanswered. To improve baseline model performance on this type of questions, we also propose a novel loss function, Flat Hierarchical Loss, designed to enhance performance over questions with no answer in the text. Our experiments demonstrate the advantage of using our approach over standard QA models. | Translated: 2023-05-16 15:12:46 Published: 2023-05-15 |
# Label Smoothing is Robustification against Model Misspecification ( http://arxiv.org/abs/2305.08501v1 ) License: see link | Ryoya Yamasaki, Toshiyuki Tanaka | (Reference translation) Label smoothing (LS) adopts smoothed targets in classification tasks.
For example, in binary classification, instead of the one-hot target $(1,0)^\top$ used in conventional logistic regression (LR), LR with LS (LSLR) uses the smoothed target $(1-\frac{\alpha}{2},\frac{\alpha}{2})^\top$ with a smoothing level $\alpha\in(0,1)$.
The understanding of the properties of LS gained from these comparisons allows us to propose MLSLR as an improvement over LSLR. Label smoothing (LS) adopts smoothed targets in classification tasks. For example, in binary classification, instead of the one-hot target $(1,0)^\top$ used in conventional logistic regression (LR), LR with LS (LSLR) uses the smoothed target $(1-\frac{\alpha}{2},\frac{\alpha}{2})^\top$ with a smoothing level $\alpha\in(0,1)$, which causes squeezing of values of the logit. Apart from the common regularization-based interpretation of LS that leads to an inconsistent probability estimator, we regard LSLR as modifying the loss function and consistent estimator for probability estimation. In order to study the significance of each of these two modifications by LSLR, we introduce a modified LSLR (MLSLR) that uses the same loss function as LSLR and the same consistent estimator as LR, while not squeezing the logits. For the loss function modification, we theoretically show that MLSLR with a larger smoothing level has lower efficiency with correctly-specified models, while it exhibits higher robustness against model misspecification than LR. Also, for the modification of the probability estimator, an experimental comparison between LSLR and MLSLR showed that this modification and squeezing of the logits in LSLR have negative effects on the probability estimation and classification performance. The understanding of the properties of LS provided by these comparisons allows us to propose MLSLR as an improvement over LSLR. | Translated: 2023-05-16 15:12:36 Published: 2023-05-15 |
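The smoothed target in the entry above, and the logit squeezing it causes, can be illustrated with a short sketch. This is an assumption-laden example rather than the authors' implementation: the function names are hypothetical, the logit is taken to be the log-odds of the first class, and only plain LSLR is shown (MLSLR is not reproduced here). Cross-entropy against a target $(t, 1-t)$ is minimized when the predicted probability equals $t$, so with smoothing the best logit is the finite value $\log\frac{1-\alpha/2}{\alpha/2}$ instead of growing without bound.

```python
import math


def smoothed_target(label, alpha):
    """Smoothed binary target: (1 - alpha/2, alpha/2) for label 0, mirrored for label 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    hot, cold = 1.0 - alpha / 2.0, alpha / 2.0
    return (hot, cold) if label == 0 else (cold, hot)


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def lslr_loss(logit, label, alpha):
    """Cross-entropy between the smoothed target and the model probabilities.

    `logit` is assumed to be the log-odds of class 0, so sigmoid(logit) = P(class 0).
    """
    p0 = sigmoid(logit)
    t0, t1 = smoothed_target(label, alpha)
    return -(t0 * math.log(p0) + t1 * math.log(1.0 - p0))


def best_logit(alpha):
    """Logit minimizing lslr_loss for label 0: the probability equals the target."""
    return math.log((1.0 - alpha / 2.0) / (alpha / 2.0))


alpha = 0.2
z_star = best_logit(alpha)
print(smoothed_target(0, alpha), z_star)
print(lslr_loss(z_star, 0, alpha), lslr_loss(z_star + 1.0, 0, alpha))
```

Moving the logit away from `best_logit(alpha)` in either direction increases the loss, which is one way to see why smoothed targets keep logits from growing large.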
# Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-intense Argumentation Tasks ( http://arxiv.org/abs/2305.08495v1 ) License: see link | Moritz Plenz, Juri Opitz, Philipp Heinisch, Philipp Cimiano, Anette Frank | (Reference translation) Arguments often do not make explicit how a conclusion follows from its premises.
Finally, we demonstrate the effectiveness of CCKGs in a knowledge-insensitive argument quality rating task, outperforming strong baselines and rivaling a GPT-3 based system. Arguments often do not make explicit how a conclusion follows from its premises. To compensate for this lack, we enrich arguments with structured background knowledge to support knowledge-intense argumentation tasks. We present a new unsupervised method for constructing Contextualized Commonsense Knowledge Graphs (CCKGs) that selects contextually relevant knowledge from large knowledge graphs (KGs) efficiently and at high quality. Our work goes beyond context-insensitive knowledge extraction heuristics by computing semantic similarity between KG triplets and textual arguments. Using these triplet similarities as weights, we extract contextualized knowledge paths that connect a conclusion to its premise, while maximizing similarity to the argument. We combine multiple paths into a CCKG that we optionally prune to reduce noise and raise precision. Intrinsic evaluation of the quality of our graphs shows that our method is effective for (re)constructing human explanation graphs. Manual evaluations in a large-scale knowledge selection setup confirm high recall and precision of implicit CSK in the CCKGs. Finally, we demonstrate the effectiveness of CCKGs in a knowledge-insensitive argument quality rating task, outperforming strong baselines and rivaling a GPT-3 based system. | Translated: 2023-05-16 15:11:59 Published: 2023-05-15 |
# Creative Data Generation: A Review Focusing on Text and Poetry ( http://arxiv.org/abs/2305.08493v1 ) License: see link | Mohamad Elzohbi, Richard Zhao | (Reference translation) Rapid advances in machine learning have led to a surge in automatic data generation, making it increasingly difficult to distinguish natural or human-generated data from machine-generated data.
We aim to shed light on the challenges and opportunities in the field of creative data generation. The rapid advancement in machine learning has led to a surge in automatic data generation, making it increasingly challenging to differentiate between naturally or human-generated data and machine-generated data. Despite these advancements, the generation of creative data remains a challenge. This paper aims to investigate and comprehend the essence of creativity, both in general and within the context of natural language generation. We review various approaches to creative writing devices and tasks, with a specific focus on the generation of poetry. We aim to shed light on the challenges and opportunities in the field of creative data generation. | Translated: 2023-05-16 15:11:37 Published: 2023-05-15 |
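# On the conformance of Android applications with children's data protection regulations and safeguarding guidelines ( http://arxiv.org/abs/2305.08492v1 ) License: see link | Ricardo Lopes and Vinh Thong Ta and Yannis Korkontzelos | (Reference translation) With the rapid development of online technologies and the growing popularity of mobile phones among children, keeping children safe on the internet is essential. Some studies report that online abuse and incidents negatively affect children's mental health and development. In this paper, we examine how Android applications (and their developers) follow regulations on children's data protection, such as the General Data Protection Regulation (GDPR), and children's online safeguarding guidelines. Our investigation finds that the number of non-compliant apps is still large. 子ど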