# Quantifying Engagement with Citations on Wikipedia

Quantifying Engagement with Citations on Wikipedia ( http://arxiv.org/abs/2001.08614v2 )

Wikipedia, the free online encyclopedia that anyone can edit, is one of the most visited sites on the Web and a common source of information for many users. As an encyclopedia, Wikipedia is not a source of original information, but was conceived as a gateway to secondary sources: according to Wikipedia's guidelines, facts must be backed up by reliable sources that reflect the full spectrum of views on the topic. Although citations lie at the very heart of Wikipedia, little is known about how users interact with them. To close this gap, we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers' interaction with citations on Wikipedia. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). Matched observational studies of the factors associated with reference clicking reveal that clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open-access sources, and references about life events (births, deaths, marriages, etc.) are particularly popular. Taken together, our findings open the door to a deeper understanding of Wikipedia's role in a global information economy where reliability is ever less certain, and source attribution ever more vital.
# Density driven correlations in ensemble density functional theory: insights from simple excitations in atoms

Density driven correlations in ensemble density functional theory: insights from simple excitations in atoms ( http://arxiv.org/abs/2001.09429v1 )

Ensemble density functional theory extends the usual Kohn-Sham machinery to quantum state ensembles involving ground and excited states. Recent work by the authors [Phys. Rev. Lett. 119, 243001 (2017); 123, 016401 (2019)] has shown that both the Hartree-exchange and correlation energies can attain unusual features in ensembles. Density-driven (DD) correlations -- which account for the fact that pure-state densities in Kohn-Sham ensembles do not necessarily reproduce those of interacting pure states -- are one such feature. Here we study atoms (specifically $S$--$P$ and $S$--$S$ transitions) and show that the magnitude and behaviour of DD correlations can vary greatly with the orbital angular momentum of the states involved. These estimates are obtained through an approximation for DD correlations built from relevant exact conditions, Kohn-Sham inversion, and plausible assumptions for weakly correlated systems.
# $\mathcal{PT}$-Supersymmetric Square Well and Barrier

$\mathcal{PT}$-Supersymmetric Square Well and Barrier ( http://arxiv.org/abs/2001.09418v1 )

Parity-Time ($\mathcal{PT}$) symmetric potentials are derived for the square well and barrier using non-Hermitian supersymmetric quantum mechanics. These $\mathcal{PT}$-supersymmetric square well and barrier have complex partners. The partners are isospectral with real energies. $\mathcal{PT}$-symmetry is unbroken only for the bound states.
# Searching for polarization in signed graphs: a local spectral approach

Searching for polarization in signed graphs: a local spectral approach ( http://arxiv.org/abs/2001.09410v1 )

Signed graphs have been used to model interactions in social networks, which can be either positive (friendly) or negative (antagonistic). The model has been used to study polarization and other related phenomena in social networks, which can be harmful to the process of democratic deliberation in our society. An interesting and challenging task in this application domain is to detect polarized communities in signed graphs. A number of different methods have been proposed for this task. However, existing approaches aim at finding globally optimal solutions. Instead, in this paper we are interested in finding polarized communities that are related to a small set of seed nodes provided as input. Seed nodes may consist of two sets, which constitute the two sides of a polarized structure. We formulate the problem of finding local polarized communities in signed graphs as a locally-biased eigen-problem. By viewing the eigenvector associated with the smallest eigenvalue of the Laplacian matrix as the solution of a constrained optimization problem, we are able to incorporate the local information as an additional constraint. In addition, we show that the locally-biased vector can be used to find communities with an approximation guarantee with respect to a local analogue of the Cheeger constant on signed graphs. By exploiting the sparsity of the input graph, an indicator vector for the polarized communities can be found in time linear in the graph size. Our experiments on real-world networks validate the proposed algorithm and demonstrate its usefulness in finding local structures in this semi-supervised manner.
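
To make the idea of a seed-biased spectral vector concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes the signed Laplacian $L = \bar{D} - A$ (with $\bar{D}$ the diagonal matrix of absolute degrees) and replaces the constrained eigen-problem with a simple regularized solve $(L - \gamma I)x = s$, where $s$ is a seed vector and `gamma` is a hypothetical parameter chosen below the smallest eigenvalue. The toy graph, the function names, and the dense solver are all illustrative assumptions; a dense solve does not reproduce the linear running time claimed in the abstract.

```python
import numpy as np

def signed_laplacian(n, edges):
    """Build L = D_abs - A for an undirected signed graph.

    edges is a list of (u, v, sign) with sign in {+1, -1}.
    """
    A = np.zeros((n, n))
    for u, v, sign in edges:
        A[u, v] = A[v, u] = sign
    D_abs = np.diag(np.abs(A).sum(axis=1))
    return D_abs - A

def local_polarized_split(L, seeds_a, seeds_b, gamma=-0.1):
    """Return a +1/-1 label per node from a seed-biased linear solve.

    gamma must lie below the smallest eigenvalue of L so that
    L - gamma * I is positive definite (an assumption of this sketch).
    """
    n = L.shape[0]
    s = np.zeros(n)
    s[list(seeds_a)] = 1.0   # seeds for one side of the polarized structure
    s[list(seeds_b)] = -1.0  # seeds for the opposing side
    x = np.linalg.solve(L - gamma * np.eye(n), s)
    return np.sign(x)

# Toy graph: two friendly triangles joined by antagonistic edges.
edges = [
    (0, 1, +1), (1, 2, +1), (0, 2, +1),
    (3, 4, +1), (4, 5, +1), (3, 5, +1),
    (2, 3, -1), (0, 5, -1),
]
L = signed_laplacian(6, edges)
labels = local_polarized_split(L, seeds_a=[0], seeds_b=[3])
print(labels)
```

On this balanced toy graph the signs of the solution separate nodes 0 to 2 from nodes 3 to 5, which is the kind of two-sided structure the seed sets are meant to recover.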
# Probability representation of photon states and tomography

Probability representation of photon states and tomography ( http://arxiv.org/abs/2001.10361v1 )

We give a review of the tomographic probability representation of quantum mechanics. We present the formalism of quantum states and quantum observables using the formalism of standard probability distributions and classical-like random variables. We study the coherent and number states of photons in the probability representation and obtain the evolution equation and energy spectra in the form of equations for probability distributions.
# Constructing mutually unbiased bases from unextendible maximally entangled bases

Constructing mutually unbiased bases from unextendible maximally entangled bases ( http://arxiv.org/abs/2001.09515v1 )

We study mutually unbiased bases (MUBs) in which all the bases are unextendible maximally entangled ones. We first present a necessary and sufficient condition of constructing a pair of MUBs in $C^2 \otimes C^4$. Based on this condition, an analytical and necessary condition for constructing MUBs is given. Moreover we illustrate our approach by some detailed examples in $C^2 \otimes C^4$. The results are generalized to $C^2 \otimes C^d$ $(d\geq 3)$ and a concrete example in $C^2 \otimes C^8$ is given.
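
As a reminder of what "mutually unbiased" means, the sketch below checks the defining property $|\langle a_i | b_j \rangle|^2 = 1/d$ for two orthonormal bases stored as matrix columns. It does not construct unextendible maximally entangled bases; the computational and Fourier bases in dimension 8 (the dimension of $C^2 \otimes C^4$) are used only as a standard illustrative pair, and the function names are hypothetical.

```python
import numpy as np

def is_mutually_unbiased(B1, B2, tol=1e-9):
    """Check |<a_i|b_j>|^2 == 1/d for all columns a_i of B1 and b_j of B2."""
    d = B1.shape[0]
    overlaps = np.abs(B1.conj().T @ B2) ** 2
    return bool(np.allclose(overlaps, 1.0 / d, atol=tol))

def fourier_basis(d):
    """Columns are the discrete Fourier basis vectors of C^d."""
    idx = np.arange(d)
    return np.exp(2j * np.pi * np.outer(idx, idx) / d) / np.sqrt(d)

computational = np.eye(8, dtype=complex)
fourier = fourier_basis(8)

print(is_mutually_unbiased(computational, fourier))        # True
print(is_mutually_unbiased(computational, computational))  # False
```

The same check could be applied to any candidate pair of bases in $C^2 \otimes C^d$ once their vectors are written as columns of a $2d \times 2d$ matrix.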
# Matter-wave interference originates mass-angle correlation in fusion-fission

Matter-wave interference originates mass-angle correlation in fusion-fission ( http://arxiv.org/abs/2001.09511v1 )

Mass-angle correlation of fission fragments has been understood as a manifestation of quasifission. We show that this is not so: the effect can originate from correlation between fusion-fission amplitudes with different total spins, signifying matter-wave interference in compound nucleus processes. This resolves the well-known puzzle of the mass-angle correlation in the complete fusion sub-barrier reaction $^{16}$O+$^{238}$U. Our finding is important for more reliable predictions of production cross sections for superheavy elements. Matter-wave interference also produces a quantum-classical transition to the time-orientation localization of the coherently rotating dinucleus in quasifission.
# Secondary Use of Electronic Health Record: Opportunities and Challenges

Secondary Use of Electronic Health Record: Opportunities and Challenges ( http://arxiv.org/abs/2001.09479v1 )

In the present technological era, healthcare providers generate huge amounts of clinical data on a daily basis. The generated clinical data are stored digitally in the form of Electronic Health Records (EHR), which serve as a central data repository for hospitals. Data contained in EHR are used not only for patients' primary care but also for various secondary purposes such as clinical research, automated disease surveillance, and clinical audits for quality enhancement. Using EHR data for secondary purposes without consent, or in some cases even with consent, creates privacy issues for individuals. Secondly, EHR data are also made accessible to various stakeholders, including different government agencies at various geographical sites, through wired or wireless networks. Sharing EHR across multiple agencies makes it vulnerable to cyber attacks and also makes it difficult to implement strict privacy laws, as in some cases data are shared with an organization that is governed by a specific regional law. The privacy of individuals can be severely affected when sensitive private information contained in their EHR is leaked or exposed to the public. A data leak can cause financial losses, and individuals may face social boycott if their medical condition is exposed in public. To protect patients' personal data from such threats, there exist different privacy regulations such as GDPR, HIPAA, and MHR. However, continually evolving state-of-the-art techniques in machine learning, data analytics, and hacking are making it even more difficult to completely protect an individual's (patient's) privacy. In this article, we systematically examine various secondary uses of EHR with the aim of highlighting how these secondary uses affect patients' privacy. Secondly, we critically analyze GDPR and highlight possible areas of improvement, considering the escalating use of technology and the different secondary uses of EHR.
# Information Credibility in the Social Web: Contexts, Approaches, and Open Issues

Information Credibility in the Social Web: Contexts, Approaches, and Open Issues ( http://arxiv.org/abs/2001.09473v1 )

In the Social Web scenario, large amounts of User-Generated Content (UGC) are diffused through social media, often with almost no form of traditional trusted intermediaries. Therefore, the risk of running into misinformation is not negligible. For this reason, assessing and mining the credibility of online information nowadays constitutes a fundamental research issue. Credibility, also referred to as believability, is a quality perceived by individuals, who are not always able to discern, with their own cognitive capacities, genuine information from fake information. Hence, in recent years, several approaches have been proposed to automatically assess credibility in social media. Many of them are based on data-driven models, i.e., they employ machine learning techniques to identify misinformation, but model-driven approaches have recently been emerging as well, along with graph-based approaches focusing on credibility propagation and knowledge-based ones exploiting Semantic Web technologies. Three of the main contexts in which the assessment of information credibility has been investigated concern: (i) the detection of opinion spam in review sites, (ii) the detection of fake news in microblogging, and (iii) the credibility assessment of online health-related information. In this article, the main issues connected to the evaluation of information credibility in the Social Web, which are shared by the above-mentioned contexts, are discussed. A concise survey of the approaches and methodologies that have been proposed in recent years to address these issues is also presented.
# Block the blocker: Studying the effects of Anti Ad-blocking

Block the blocker: Studying the effects of Anti Ad-blocking ( http://arxiv.org/abs/2001.09434v1 )

Advertisements generate a huge share of revenue for websites and online businesses. Ad-blocking and tracker-blocking programs have gained momentum in the last few years, amid massive debates on privacy concerns and on improving the online user experience. The Acceptable Ads programme and anti ad-blockers are the primary elements that have emerged in recent years to combat ad-blockers. In this paper, we discuss at length the data collection for top websites in the world, Germany, the DACH region, and the news category. We generate feature-based A/B testing metrics, evaluate classifiers on them, and analyse the results. Our paper also discusses how anti ad-blockers affect economic, legal, and ethical usage in Germany, along with the recent changes in GDPR, while taking a look at the Acceptable Ads programme and whitelisting.
# Causal discrete field theory for quantum gravity

Causal discrete field theory for quantum gravity ( http://arxiv.org/abs/2001.10819v1 )

The proposed theory of causally structured discrete fields studies integer values on directed edges of a self-similar graph with a propagation rule, which we define as a set of valid combinations of integer values and edge directions around any vertex of the graph. There is a countably infinite number of variants of the theory for a given self-similar graph, depending on the choice of propagation rules; some of these models can generate uncountably infinite sets of patterns. This theory takes minimal assumptions of causality, discreteness, locality, and determinism, as well as fundamental symmetries of isotropy, CPT invariance, and charge conservation. It combines elements of cellular automata, causal sets, loop quantum gravity, and causal dynamical triangulations to become an excellent candidate to describe quantum gravity at the Planck scale. In addition to the self-consistent generation of spacetime and metrics to describe gravity and an expanding closed Universe, the theory allows for the many-worlds interpretation of quantum mechanics. We also demonstrate how to arrive at unitary evolution in Hilbert space for a stationary Universe with deterministic propagation.
# A Suggestive Way of Deriving the Quantum Probability Rule

A Suggestive Way of Deriving the Quantum Probability Rule ( http://arxiv.org/abs/2001.10364v1 )

The familiar "modulus squared" form of all quantum mechanical probabilities is derived from an assumption of equal a priori probabilities concerning the final states available.
# Point-of-Care Diabetic Retinopathy Diagnosis: A Standalone Mobile Application Approach

Point-of-Care Diabetic Retinopathy Diagnosis: A Standalone Mobile Application Approach ( http://arxiv.org/abs/2002.04066v1 )

Although deep learning research and applications have grown rapidly over the past decade, deep learning has shown limitations in healthcare applications and in its reach to people in remote areas. One of the challenges of incorporating deep learning in medical data classification or prediction is the shortage of annotated training data in the healthcare industry. Privacy issues around medical data sharing and limited patient population sizes are among the reasons for the insufficiency of training data in healthcare. Methods to exploit deep learning applications in healthcare have been proposed and implemented in this dissertation. Traditional diagnosis of diabetic retinopathy requires trained ophthalmologists and expensive imaging equipment to reach healthcare centres in order to provide facilities for the treatment of preventable blindness. Diabetic people residing in remote areas with a shortage of healthcare services and ophthalmologists usually fail to get periodic diagnosis of diabetic retinopathy, and thereby face the risk of vision loss or impairment. Deep learning and mobile application development have been integrated in this dissertation to provide an easy-to-use, point-of-care, smartphone-based diagnosis of diabetic retinopathy. To address the shortage of healthcare centres and trained ophthalmologists, the standalone diagnostic service was built so that it can be operated by a non-expert without an internet connection. This approach could be transferred to other areas of medical image classification.
# Deep Learning-based Image Compression with Trellis Coded Quantization

Deep Learning-based Image Compression with Trellis Coded Quantization ( http://arxiv.org/abs/2001.09417v1 )

Many recent works attempt to develop image compression models based on deep learning architectures, where the uniform scalar quantizer (SQ) is commonly applied to the feature maps between the encoder and decoder. In this paper, we propose to incorporate a trellis coded quantizer (TCQ) into a deep learning based image compression framework. A soft-to-hard strategy is applied to allow for back propagation during training. We develop a simple image compression model that consists of three subnetworks (encoder, decoder and entropy estimation), and optimize all of the components in an end-to-end manner. We experiment on two high-resolution image datasets, and the results on both show that our model can achieve superior performance at low bit rates. We also compare TCQ and SQ on our proposed baseline model and demonstrate the advantage of TCQ.
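
The soft-to-hard idea mentioned above can be illustrated with a plain scalar quantizer. The sketch below is an assumption-laden simplification: it ignores the trellis structure of TCQ entirely and only shows how a softmax over negative squared distances to the reconstruction levels gives a differentiable surrogate that approaches hard nearest-level quantization as the temperature parameter `sigma` (a hypothetical name) grows.

```python
import numpy as np

def hard_quantize(z, centers):
    """Map each value in z to its nearest reconstruction level."""
    idx = np.argmin((z[:, None] - centers[None, :]) ** 2, axis=1)
    return centers[idx]

def soft_quantize(z, centers, sigma):
    """Differentiable surrogate: softmax-weighted average of the levels.

    Larger sigma concentrates the weights on the nearest level.
    """
    logits = -sigma * (z[:, None] - centers[None, :]) ** 2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ centers

z = np.array([0.1, 0.8, -1.2])
centers = np.array([-1.0, 0.0, 1.0])

print(soft_quantize(z, centers, sigma=1.0))     # smooth, between levels
print(soft_quantize(z, centers, sigma=1000.0))  # close to the hard output
print(hard_quantize(z, centers))
```

In a training framework, one common arrangement is to use the hard output in the forward pass and the gradient of the soft output in the backward pass; whether the paper does exactly this is not stated in the abstract.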
# Brain Metastasis Segmentation Network Trained with Robustness to Annotations with Multiple False Negatives

Brain Metastasis Segmentation Network Trained with Robustness to Annotations with Multiple False Negatives ( http://arxiv.org/abs/2001.09501v1 )

Deep learning has proven to be an essential tool for medical image analysis. However, the need for accurately labeled input data, often requiring time- and labor-intensive annotation by experts, is a major limitation to the use of deep learning. One solution to this challenge is to allow for use of coarse or noisy labels, which could permit more efficient and scalable labeling of images. In this work, we develop a lopsided loss function based on entropy regularization that assumes the existence of a nontrivial false negative rate in the target annotations. Starting with a carefully annotated brain metastasis lesion dataset, we simulate data with false negatives by (1) randomly censoring the annotated lesions and (2) systematically censoring the smallest lesions. The latter better models true physician error because smaller lesions are harder to notice than the larger ones. Even with a simulated false negative rate as high as 50%, applying our loss function to randomly censored data preserves maximum sensitivity at 97% of the baseline with uncensored training data, compared to just 10% for a standard loss function. For the size-based censorship, performance is restored from 17% with the current standard to 88% with our lopsided bootstrap loss. Our work will enable more efficient scaling of the image labeling process, in parallel with other approaches on creating more efficient user interfaces and tools for annotation.
# Using Simulated Data to Generate Images of Climate Change

Using Simulated Data to Generate Images of Climate Change ( http://arxiv.org/abs/2001.09531v1 )

Generative adversarial networks (GANs) used in domain adaptation tasks have the ability to generate images that are both realistic and personalized, transforming an input image while maintaining its identifiable characteristics. However, they often require a large quantity of training data to produce high-quality images in a robust way, which limits their usability in cases when access to data is limited. In our paper, we explore the potential of using images from a simulated 3D environment to improve a domain adaptation task carried out by the MUNIT architecture, aiming to use the resulting images to raise awareness of the potential future impacts of climate change.
# Analyzing the Noise Robustness of Deep Neural Networks

Analyzing the Noise Robustness of Deep Neural Networks ( http://arxiv.org/abs/2001.09395v1 )

Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method to explain the misclassification of adversarial examples.
# EEG fingerprinting: subject specific signature based on the aperiodic component of power spectrum

EEG fingerprinting: subject specific signature based on the aperiodic component of power spectrum ( http://arxiv.org/abs/2001.09424v1 )

During the last few years, there has been growing interest in the effects induced by individual variability on activation patterns and brain connectivity. The practical implications of individual variability are of basic relevance for both group-level and subject-level studies. The electroencephalogram (EEG) still represents one of the most widely used recording techniques for investigating a wide range of brain-related features. In this work, we aim to estimate the effect of individual variability on a set of very simple and easily interpretable features extracted from the EEG power spectra. In particular, in an identification scenario, we investigated how accurately the aperiodic (1/f background) component of the EEG power spectra can identify subjects from a large EEG dataset. The results of this study show that the aperiodic component of the EEG signal is characterized by strong subject-specific properties, and that this feature is consistent across different experimental conditions (eyes-open and eyes-closed) and outperforms the canonically defined frequency bands. These findings suggest that the simple features (slope and offset) extracted from the aperiodic component of the EEG signal are sensitive to individual traits and may help to characterize and make inferences at the single-subject level.
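
The two features named above, slope and offset, can be read off a straight-line fit in log-log space if the aperiodic component is assumed to follow $P(f) = 10^{b} / f^{\chi}$. The sketch below makes that assumption and fits synthetic data with `numpy.polyfit`; it is a simplification, since it does not separate oscillatory peaks from the background as dedicated spectral parameterization tools do, and the function and variable names are illustrative.

```python
import numpy as np

def fit_aperiodic(freqs, power):
    """Fit log10(P) = offset - exponent * log10(f) by least squares.

    Returns (offset, exponent); the spectral slope is -exponent.
    """
    slope, intercept = np.polyfit(np.log10(freqs), np.log10(power), deg=1)
    return intercept, -slope

# Synthetic, noise-free 1/f-like spectrum from 1 to 40 Hz.
freqs = np.arange(1.0, 41.0)
true_offset, true_exponent = 1.5, 1.2
power = 10.0 ** true_offset / freqs ** true_exponent

offset, exponent = fit_aperiodic(freqs, power)
print(offset, exponent)
```

In an identification setting, the pair (offset, exponent) estimated per channel could serve as the feature vector that is compared across recordings of the same subject.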
# A Visual Analytics Framework for Reviewing Streaming Performance Data

A Visual Analytics Framework for Reviewing Streaming Performance Data ( http://arxiv.org/abs/2001.09399v1 )

Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large streaming data is challenging because the rate at which data arrive and the limited time available to comprehend them make it difficult for analysts to sufficiently examine the data without missing important changes or patterns. To support streaming data analysis, we introduce a visual analytics framework comprising three modules: data management, analysis, and interactive visualization. The data management module collects various computing and communication performance metrics from the monitored system using streaming data processing techniques and feeds the data to the other two modules. The analysis module automatically identifies important changes and patterns at the required latency. In particular, we introduce a set of online and progressive analysis methods that not only control the computational costs but also help analysts better follow the critical aspects of the analysis results. Finally, the interactive visualization module provides the analysts with a coherent view of the changes and patterns in the continuously captured performance data. Through a multi-faceted case study on performance analysis of parallel discrete-event simulation, we demonstrate the effectiveness of our framework for identifying bottlenecks and locating outliers.
# An Overview of Two Age Synthesis and Estimation Techniques

An Overview of Two Age Synthesis and Estimation Techniques ( http://arxiv.org/abs/2002.03750v1 )

Age estimation is a technique for predicting human ages from digital facial images: it analyzes a person's face image and estimates his/her age in years. Nowadays, intelligent age estimation and age synthesis have become particularly prevalent research topics in computer vision and face verification systems. Age synthesis is defined as rendering a facial image aesthetically with rejuvenating and natural aging effects on the person's face. Age estimation is defined as labeling a facial image automatically with the age group (year range) or the exact age (year) of the person's face. In this case study, we review the existing models, popular techniques, system performances, and technical challenges related to facial image-based age synthesis and estimation. The main goal of this review is to provide an accessible overview and promising future directions through systematic discussion.
# Scene Text Recognition With Finer Grid Rectification

Scene Text Recognition With Finer Grid Rectification ( http://arxiv.org/abs/2001.09389v1 )

Scene Text Recognition is a challenging problem because of irregular styles and various distortions. This paper proposes an end-to-end trainable model consisting of a finer rectification module and a bidirectional attentional recognition network (Firbarn). The rectification module adopts a finer grid to rectify the distorted input image, and the bidirectional decoder contains only one decoding layer instead of two separate ones. Firbarn can be trained in a weakly supervised way, requiring only the scene text images and the corresponding word labels. With the flexible rectification and the novel bidirectional decoder, the results of extensive evaluation on the standard benchmarks show that Firbarn outperforms previous works, especially on irregular datasets.
# Curriculum Audiovisual Learning

Curriculum Audiovisual Learning ( http://arxiv.org/abs/2001.09414v1 )

Associating a sound with its producer in a complex audiovisual scene is a challenging task, especially when annotated training data are lacking. In this paper, we present a flexible audiovisual model that introduces a soft-clustering module as the audio and visual content detector, and regards the pervasive property of audiovisual concurrency as the latent supervision for inferring the correlation among detected contents. To ease the difficulty of audiovisual learning, we propose a novel curriculum learning strategy that trains the model from simple to complex scenes. We show that such an ordered learning procedure gives the model the benefits of easy training and fast convergence. Meanwhile, our audiovisual model can also provide effective unimodal representation and cross-modal alignment performance. We further deploy the well-trained model in practical audiovisual sound localization and separation tasks. We show that our localization model significantly outperforms existing methods, and that, building on it, we achieve comparable performance in sound separation without relying on external visual supervision. Our video demo can be found at https://youtu.be/kuClfGG0cFU.
# Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos

Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos ( http://arxiv.org/abs/2001.09518v1 )

Unsupervised landmark learning is the task of learning semantic keypoint-like representations without the use of expensive input keypoint-level annotations. A popular approach is to factorize an image into a pose and appearance data stream, then to reconstruct the image from the factorized components. The pose representation should capture a set of consistent and tightly localized landmarks in order to facilitate reconstruction of the input image. Ultimately, we wish for our learned landmarks to focus on the foreground object of interest. However, the reconstruction task of the entire image forces the model to allocate landmarks to model the background. This work explores the effects of factorizing the reconstruction task into separate foreground and background reconstructions, conditioning only the foreground reconstruction on the unsupervised landmarks. Our experiments demonstrate that the proposed factorization results in landmarks that are focused on the foreground object of interest. Furthermore, the rendered background quality is also improved, as the background rendering pipeline no longer requires the ill-suited landmarks to model its pose and appearance. We demonstrate this improvement in the context of the video-prediction task.
# An Effective Automatic Image Annotation Model Via Attention Model and Data Equilibrium

An Effective Automatic Image Annotation Model Via Attention Model and Data Equilibrium ( http://arxiv.org/abs/2001.10590v1 )

Nowadays, a huge number of images are available. However, retrieving a required image for an ordinary user is a challenging task in computer vision systems. During the past two decades, much research has been carried out to improve the performance of automatic image annotation, traditionally with a focus on content-based image retrieval. However, recent research demonstrates that there is a semantic gap between content-based image retrieval and image semantics understandable by humans. As a result, existing research in this area has sought to bridge the semantic gap between low-level image features and high-level semantics. The conventional method of bridging the semantic gap is automatic image annotation (AIA), which extracts semantic features using machine learning techniques. In this paper, we propose a novel AIA model based on a deep learning feature extraction method. The proposed model has three phases, including a feature extractor, a tag generator, and an image annotator. First, the proposed model automatically extracts the high- and low-level features based on the dual-tree continuous wavelet transform (DT-CWT), singular value decomposition, distribution of color tone, and a deep neural network. Moreover, the tag generator balances the dictionary of the annotated keywords using a new log-entropy auto-encoder (LEAE) and then describes these keywords by word embedding. Finally, the annotator works based on a long short-term memory (LSTM) network in order to obtain the degree of importance of specific features of the image. The experiments conducted on two benchmark datasets confirm the superiority of the proposed model over previous models in terms of performance criteria.
# An Automated Approach for the Discovery of Interoperability

An Automated Approach for the Discovery of Interoperability ( http://arxiv.org/abs/2001.10585v1 )

In this article, we present an automated approach that would test for and discover the interoperability of CAD systems based on the approximately-invariant shape properties of their models. We further show that exchanging models in standard format does not guarantee the preservation of shape properties. Our analysis is based on utilizing queries in deriving the shape properties and constructing the proxy models of the given CAD models [1]. We generate template files to accommodate the information necessary for the property computations and proxy model constructions, and implement an interoperability discovery program called DTest to execute the interoperability testing. We posit that our method could be extended to interoperability testing on CAD-to-CAE and/or CAD-to-CAM interactions by modifying the set of property checks and providing the additional requirements that may emerge in CAE or CAM applications.
# Reproducibility Challenge NeurIPS 2019 Report on "Competitive Gradient Descent"

Reproducibility Challenge NeurIPS 2019 Report on "Competitive Gradient Descent" ( http://arxiv.org/abs/2001.10820v1 )

This is a report for the reproducibility challenge of NeurIPS 2019 on the paper Competitive Gradient Descent (Schafer et al., 2019). The paper introduces a novel algorithm for the numerical computation of Nash equilibria of competitive two-player games. It avoids the oscillatory and divergent behaviours seen in alternating gradient descent. The purpose of this report is to critically examine the reproducibility of the work by Schafer et al. (2019) within the framework of the NeurIPS 2019 Reproducibility Challenge. The experiments replicated in this report confirm the results of the original study. Moreover, this project offers a Python (PyTorch-based) implementation of the proposed CGD algorithm, which can be found at the following public git repository: (https://github.com/GopiKishan14/Reproducibility_Challenge_NeurIPS_2019)
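
The contrast between competitive gradient descent and plain gradient play can be seen on the scalar bilinear game $f(x, y) = xy$, where $x$ minimizes and $y$ maximizes. The sketch below is not taken from the report's repository. It assumes the zero-sum CGD update has the form $\Delta x = -\eta\,(1 + \eta^2 D_{xy}^2)^{-1}(\nabla_x f + \eta D_{xy} \nabla_y f)$ and $\Delta y = \eta\,(1 + \eta^2 D_{xy}^2)^{-1}(\nabla_y f - \eta D_{xy} \nabla_x f)$, which for this game reduces to the closed form in `cgd_step`; the step size and iteration count are arbitrary choices.

```python
import numpy as np

def simultaneous_step(x, y, eta):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y."""
    return x - eta * y, y + eta * x

def alternating_step(x, y, eta):
    """Alternating updates: y reacts to the already-updated x."""
    x_new = x - eta * y
    return x_new, y + eta * x_new

def cgd_step(x, y, eta):
    """Assumed zero-sum CGD update specialised to f(x, y) = x * y.

    Here grad_x f = y, grad_y f = x, and the mixed derivative is 1.
    """
    scale = 1.0 / (1.0 + eta ** 2)
    dx = -eta * scale * (y + eta * x)
    dy = eta * scale * (x - eta * y)
    return x + dx, y + dy

def distance_after(step, x0=1.0, y0=1.0, eta=0.2, n_steps=100):
    """Distance from the equilibrium (0, 0) after n_steps iterations."""
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = step(x, y, eta)
    return float(np.hypot(x, y))

print("simultaneous:", distance_after(simultaneous_step))  # grows
print("alternating: ", distance_after(alternating_step))   # stays bounded, no decay
print("CGD:         ", distance_after(cgd_step))           # shrinks toward 0
```

Under these assumptions each CGD step contracts the distance to the equilibrium by a factor of $1/\sqrt{1 + \eta^2}$, while the simultaneous update expands it by $\sqrt{1 + \eta^2}$ and the alternating update keeps orbiting at roughly constant radius.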
# AI-Powered GUI Attack and Its Defensive Methods

AI-Powered GUI Attack and Its Defensive Methods ( http://arxiv.org/abs/2001.09388v1 )

Since the first Graphical User Interface (GUI) prototype was invented in the 1970s, GUI systems have been deployed on various personal computer systems and server platforms. Recently, with the development of artificial intelligence (AI) technology, malware powered by AI is emerging as a potential threat to GUI systems. This type of AI-based cybersecurity attack, targeting GUI systems, is explored in this paper. The work is twofold: (1) malware is designed to attack an existing GUI system by using AI-based object recognition techniques; (2) defensive methods are explored, using adversarial examples and other techniques, to alleviate the threats from the intelligent GUI attack. The results show that a generic GUI attack can be implemented and performed in a simple way based on current AI techniques, and that its countermeasures, although temporary, are so far effective in mitigating the threats of GUI attacks.
# グラディエント型対向攻撃の不確かさに対するアンサンブルノイズシミュレーション

Ensemble Noise Simulation to Handle Uncertainty about Gradient-based Adversarial Attacks ( http://arxiv.org/abs/2001.09486v1 )

Gradient-based adversarial attacks on neural networks can be crafted in a variety of ways by varying either how the attack algorithm relies on the gradient, the network architecture used for crafting the attack, or both. Most recent work has focused on defending classifiers in a case where there is no uncertainty about the attacker's behavior (i.e., the attacker is expected to generate a specific attack using a specific network architecture). However, if the attacker is not guaranteed to behave in a certain way, the literature lacks methods for devising a strategic defense. We fill this gap by simulating the attacker's noisy perturbation using a variety of attack algorithms based on gradients of various classifiers. We perform our analysis using a pre-processing Denoising Autoencoder (DAE) defense that is trained with the simulated noise. We demonstrate significant improvements in post-attack accuracy, using our proposed ensemble-trained defense, compared to a situation where no effort is made to handle uncertainty.
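
The idea of simulating attacker noise from several classifiers can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's setup: the classifiers are tiny logistic models with hand-picked weights, the only attack is the fast gradient sign method (FGSM) at a few step sizes, and the function names are hypothetical. The resulting (noisy, clean) pairs are the kind of data a denoising autoencoder could be trained on.

```python
import math


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_noise(weights, bias, x, label, epsilon):
    """FGSM perturbation against a logistic classifier p = sigmoid(w.x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p = 1.0 / (1.0 + math.exp(-z))
    # The cross-entropy gradient with respect to input i is (p - label) * w_i.
    return [epsilon * sign((p - label) * w) for w in weights]


def simulate_noisy_pairs(classifiers, epsilons, x, label):
    """Return (noisy_input, clean_input) pairs, one per classifier and step size."""
    pairs = []
    for weights, bias in classifiers:
        for epsilon in epsilons:
            noise = fgsm_noise(weights, bias, x, label, epsilon)
            noisy = [xi + ni for xi, ni in zip(x, noise)]
            pairs.append((noisy, list(x)))
    return pairs


# Hypothetical ensemble: three linear classifiers given as (weights, bias).
classifiers = [([1.0, -2.0], 0.0), ([-0.5, 0.5], 0.1), ([2.0, 1.0], -0.3)]
epsilons = [0.05, 0.1]
clean = [0.2, 0.4]
pairs = simulate_noisy_pairs(classifiers, epsilons, clean, label=1)
```

Varying both the classifier and the step size is one simple way to cover the uncertainty about which gradient and which attack strength an attacker would use.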
# 画像計測による確率的物体モデル学習のための進行成長型アンビエントGAN

Progressively-Growing AmbientGANs For Learning Stochastic Object Models From Imaging Measurements ( http://arxiv.org/abs/2001.09523v1 )

The objective optimization of medical imaging systems requires full characterization of all sources of randomness in the measured data, which includes the variability within the ensemble of objects to-be-imaged. This can be accomplished by establishing a stochastic object model (SOM) that describes the variability in the class of objects to-be-imaged. Generative adversarial networks (GANs) can be potentially useful to establish SOMs because they hold great promise to learn generative models that describe the variability within an ensemble of training data. However, because medical imaging systems record imaging measurements that are noisy and indirect representations of object properties, GANs cannot be directly applied to establish stochastic models of objects to-be-imaged. To address this issue, an augmented GAN architecture named AmbientGAN was developed to establish SOMs from noisy and indirect measurement data. However, because the adversarial training can be unstable, the applicability of the AmbientGAN can be potentially limited. In this work, we propose a novel training strategy---Progressive Growing of AmbientGANs (ProAGAN)---to stabilize the training of AmbientGANs for establishing SOMs from noisy and indirect imaging measurements. An idealized magnetic resonance (MR) imaging system and clinical MR brain images are considered. The proposed methodology is evaluated by comparing signal detection performance computed by use of ProAGAN-generated synthetic images and images that depict the true object properties.
# カスケード畳み込みおよび逆向きディープネットワークを用いた腹部マルチオルガンセグメンテーション

Abdominal multi-organ segmentation with cascaded convolutional and adversarial deep networks ( http://arxiv.org/abs/2001.09521v1 )

Objective: Abdominal anatomy segmentation is crucial for numerous applications, from computer-assisted diagnosis to image-guided surgery. In this context, we address fully-automated multi-organ segmentation from abdominal CT and MR images using deep learning. Methods: The proposed model extends standard conditional generative adversarial networks. In addition to the discriminator, which forces the model to create realistic organ delineations, it embeds cascaded partially pre-trained convolutional encoder-decoders as the generator. Encoder fine-tuning from a large amount of non-medical images alleviates data scarcity limitations. The network is trained end-to-end to benefit from simultaneous multi-level segmentation refinements using auto-context. Results: Employed for healthy liver, kidneys and spleen segmentation, our pipeline provides promising results by outperforming state-of-the-art encoder-decoder schemes. Applied to the Combined Healthy Abdominal Organ Segmentation (CHAOS) challenge organized in conjunction with the IEEE International Symposium on Biomedical Imaging 2019, it gave us the first rank in three competition categories: liver CT, liver MR and multi-organ MR segmentation. Conclusion: Combining cascaded convolutional and adversarial networks strengthens the ability of deep learning pipelines to automatically delineate multiple abdominal organs, with good generalization capability. Significance: The comprehensive evaluation provided suggests that better guidance could be achieved to help clinicians in abdominal image interpretation and clinical decision making.
# 深層強化学習による感情と知識に基づくアルゴリズム取引

Sentiment and Knowledge Based Algorithmic Trading with Deep Reinforcement Learning ( http://arxiv.org/abs/2001.09403v1 )

Algorithmic trading, due to its inherent nature, is a difficult problem to tackle; there are too many variables involved in the real world, which makes it almost impossible to have reliable algorithms for automated stock trading. The lack of reliable labelled data that considers the physical and physiological factors dictating the ups and downs of the market has hindered supervised learning attempts at dependable predictions. To learn a good policy for trading, we formulate an approach using reinforcement learning which uses traditional time series stock price data and combines it with news headline sentiments, while leveraging knowledge graphs for exploiting news about implicit relationships.
# TaxoExpan: グラフニューラルネットワークによる自己教師型分類の拡張

TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network ( http://arxiv.org/abs/2001.09522v1 )

Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications. For example, online retailers (e.g., Amazon and eBay) use taxonomies for product recommendation, and web search engines (e.g., Google and Bing) leverage taxonomies to enhance query understanding. Enormous efforts have been made on constructing taxonomies either manually or semi-automatically. However, with the fast-growing volume of web content, existing taxonomies will become outdated and fail to capture emerging knowledge. Therefore, in many applications, dynamic expansions of an existing taxonomy are in great demand. In this paper, we study how to expand an existing taxonomy by adding a set of new concepts. We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data. Using such self-supervision data, TaxoExpan learns a model to predict whether a query concept is the direct hyponym of an anchor concept. We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data. Extensive experiments on three large-scale datasets from different domains demonstrate both the effectiveness and the efficiency of TaxoExpan for taxonomy expansion.
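
The self-supervision step can be illustrated with a small sketch that turns an existing taxonomy into labelled <query concept, anchor concept> pairs. The toy taxonomy, the function name, and the negative-sampling scheme (random non-parent concepts) are assumptions made for illustration; the abstract does not specify how negatives are drawn.

```python
import random


def make_self_supervision_pairs(parent_of, num_negatives=2, seed=0):
    """Build (query, anchor, label) triples from an existing taxonomy.

    parent_of maps each concept to its parent; the root maps to None.
    label is 1 when the anchor is the query's true parent, else 0.
    """
    rng = random.Random(seed)
    concepts = sorted(parent_of)
    pairs = []
    for query in concepts:
        parent = parent_of[query]
        if parent is None:
            continue  # the root has no anchor to predict
        pairs.append((query, parent, 1))
        # Assumed negative sampling: any concept other than the query or its parent.
        candidates = [c for c in concepts if c not in (query, parent)]
        k = min(num_negatives, len(candidates))
        for anchor in rng.sample(candidates, k):
            pairs.append((query, anchor, 0))
    return pairs


toy_taxonomy = {
    "science": None,
    "computer science": "science",
    "machine learning": "computer science",
    "databases": "computer science",
    "physics": "science",
}
pairs = make_self_supervision_pairs(toy_taxonomy)
```

A model trained on such triples learns to score whether a query concept is a direct hyponym of an anchor, which is the prediction task the abstract describes.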
# 制約付き上部信頼強化学習

Constrained Upper Confidence Reinforcement Learning ( http://arxiv.org/abs/2001.09377v1 )

Constrained Markov Decision Processes are a class of stochastic decision problems in which the decision maker must select a policy that satisfies auxiliary cost constraints. This paper extends upper confidence reinforcement learning to settings in which the reward function and the constraints, described by cost functions, are unknown a priori but the transition kernel is known. Such a setting is well-motivated by a number of applications including exploration of unknown, potentially unsafe, environments. We present an algorithm, C-UCRL, and show that, with probability $1-\delta$, it achieves sub-linear regret ($O(T^{\frac{3}{4}}\sqrt{\log(T/\delta)})$) with respect to the reward while satisfying the constraints even while learning. Illustrative examples are provided.
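
To see why this bound counts as sub-linear, divide it by the horizon $T$. Writing $R(T)$ for the cumulative reward regret (a symbol introduced here for illustration, not taken from the abstract) and holding $\delta$ fixed, the average regret per step vanishes:

```latex
\frac{R(T)}{T}
  = O\!\left(\frac{T^{3/4}\sqrt{\log(T/\delta)}}{T}\right)
  = O\!\left(T^{-1/4}\sqrt{\log(T/\delta)}\right)
  \xrightarrow[T\to\infty]{} 0
```

The logarithmic factor grows more slowly than any positive power of $T$, so the $T^{-1/4}$ term dominates.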
# LiteMORT:適応的コンパクト分布に基づくメモリ効率のよい勾配ブースティングツリーシステム

LiteMORT: A memory efficient gradient boosting tree system on adaptive compact distributions ( http://arxiv.org/abs/2001.09419v1 )

Gradient boosted decision trees (GBDT) is the leading algorithm for many commercial and academic data applications. We give a deep analysis of this algorithm, especially the histogram technique, which is a basis for the regularized distribution with compact support. We present three new modifications. 1) A shared-memory technique to reduce memory usage. In many cases, it needs only the data source itself and no extra memory. 2) Implicit merging for the "merge overflow" problem, in which merging several small datasets produces a dataset too large to process. With implicit merging, we need only the original small datasets to train the GBDT model. 3) An adaptive resizing algorithm for histogram bins to improve accuracy. Experiments on two large Kaggle competitions verified our methods: they use much less memory than LightGBM and have higher accuracy. We have implemented these algorithms in an open-source package, LiteMORT. The source code is available at https://github.com/closest-git/LiteMORT
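
For readers unfamiliar with the histogram technique, the sketch below shows the generic idea: feature values are bucketed into a few bins, per-bin gradient sums are accumulated, and split candidates are scored from those sums alone. It uses fixed equal-width bins and a simplified squared-loss gain, both assumptions for illustration. It does not reproduce LiteMORT's shared-memory, implicit-merging, or adaptive-resizing methods, and the function names are hypothetical.

```python
def build_histogram(values, gradients, num_bins, lo, hi):
    """Accumulate gradient sums and sample counts into equal-width bins."""
    width = (hi - lo) / num_bins
    grad_sum = [0.0] * num_bins
    count = [0] * num_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), num_bins - 1)  # clamp v == hi into the last bin
        grad_sum[b] += g
        count[b] += 1
    return grad_sum, count


def best_split(grad_sum, count):
    """Scan bin boundaries and score each with G_L^2 / n_L + G_R^2 / n_R."""
    total_g, total_n = sum(grad_sum), sum(count)
    best_bin, best_score = None, float("-inf")
    left_g, left_n = 0.0, 0
    for b in range(len(count) - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_n = total_n - left_n
        if left_n == 0 or right_n == 0:
            continue  # a split must leave samples on both sides
        right_g = total_g - left_g
        score = left_g ** 2 / left_n + right_g ** 2 / right_n
        if score > best_score:
            best_bin, best_score = b, score
    return best_bin, best_score


values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
grad_sum, count = build_histogram(values, gradients, num_bins=4, lo=0.0, hi=1.0)
split_bin, split_score = best_split(grad_sum, count)
```

Because the scan touches only one entry per bin, its cost depends on the number of bins rather than the number of samples, which is why bin layout matters for both memory and accuracy.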
# 説明可能な人工知能と機械学習:現実に根ざした視点

Explainable Artificial Intelligence and Machine Learning: A reality rooted perspective ( http://arxiv.org/abs/2001.09464v1 )

We are used to the availability of big data generated in nearly all fields of science as a consequence of technological progress. However, the analysis of such data poses vast challenges. One of these relates to the explainability of artificial intelligence (AI) or machine learning methods. Currently, many such methods are non-transparent with respect to their working mechanism and for this reason are called black box models, most notably deep learning methods. However, it has been realized that this constitutes a severe problem for a number of fields, including the health sciences and criminal justice, and arguments have been brought forward in favor of an explainable AI. In this paper, we do not assume the usual perspective presenting explainable AI as it should be, but rather we provide a discussion of what explainable AI can be. The difference is that we do not present wishful thinking but reality-grounded properties in relation to a scientific theory beyond physics.
# 行列値未知の多層ネットワークにおける推論

Inference in Multi-Layer Networks with Matrix-Valued Unknowns ( http://arxiv.org/abs/2001.09396v1 )

We consider the problem of inferring the input and hidden variables of a stochastic multi-layer neural network from an observation of the output. The hidden variables in each layer are represented as matrices. This problem applies to signal recovery via deep generative prior models, multi-task and mixed regression and learning certain classes of two-layer neural networks. A unified approximation algorithm for both MAP and MMSE inference is proposed by extending a recently-developed Multi-Layer Vector Approximate Message Passing (ML-VAMP) algorithm to handle matrix-valued unknowns. It is shown that the performance of the proposed Multi-Layer Matrix VAMP (ML-Mat-VAMP) algorithm can be exactly predicted in a certain random large-system limit, where the dimensions $N\times d$ of the unknown quantities grow as $N\rightarrow\infty$ with $d$ fixed. In the two-layer neural-network learning problem, this scaling corresponds to the case where the number of input features and training samples grow to infinity but the number of hidden nodes stays fixed. The analysis enables a precise prediction of the parameter and test error of the learning.
# 生成逆数ネットワークを用いた理想オブザーバのマルコフ連鎖モンテカルロ近似

Markov-Chain Monte Carlo Approximation of the Ideal Observer using Generative Adversarial Networks ( http://arxiv.org/abs/2001.09526v1 )

The Ideal Observer (IO) performance has been advocated when optimizing medical imaging systems for signal detection tasks. However, analytical computation of the IO test statistic is generally intractable. To approximate the IO test statistic, sampling-based methods that employ Markov-Chain Monte Carlo (MCMC) techniques have been developed. However, current applications of MCMC techniques have been limited to several object models such as a lumpy object model and a binary texture model, and it remains unclear how MCMC methods can be implemented with other more sophisticated object models. Deep learning methods that employ generative adversarial networks (GANs) hold great promise to learn stochastic object models (SOMs) from image data. In this study, we describe a method to approximate the IO by applying MCMC techniques to SOMs learned by use of GANs. The proposed method can be employed with arbitrary object models that can be learned by use of GANs, thereby extending the domain of applicability of MCMC techniques for approximating the IO performance. Both signal-known-exactly (SKE) and signal-known-statistically (SKS) binary signal detection tasks are considered. The IO performance computed by the proposed method is compared to that computed by the conventional MCMC method. The advantages of the proposed method are discussed.
# 話者検証と音声トリガー検出のためのマルチタスク学習

Multi-task Learning for Speaker Verification and Voice Trigger Detection ( http://arxiv.org/abs/2001.10816v1 )

Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic and speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models.
