On Exact Learning of $d$-Monotone Functions
- URL: http://arxiv.org/abs/2502.01265v1
- Date: Mon, 03 Feb 2025 11:37:20 GMT
- Title: On Exact Learning of $d$-Monotone Functions
- Authors: Nader H. Bshouty
- Abstract summary: We study the learnability of the Boolean class of $d$-monotone functions $f:{\cal X}\to\{0,1\}$ from membership and equivalence queries, where $({\cal X},\le)$ is a finite lattice.
- Abstract: In this paper, we study the learnability of the Boolean class of $d$-monotone functions $f:{\cal X}\to\{0,1\}$ from membership and equivalence queries, where $({\cal X},\le)$ is a finite lattice. We show that the class of $d$-monotone functions that are represented in the form $f=F(g_1,g_2,\ldots,g_d)$, where $F$ is any Boolean function $F:\{0,1\}^d\to\{0,1\}$ and $g_1,\ldots,g_d:{\cal X}\to \{0,1\}$ are any monotone functions, is learnable in time $\sigma({\cal X})\cdot (size(f)/d+1)^{d}$ where $\sigma({\cal X})$ is the maximum sum of the number of immediate predecessors in a chain from the largest element to the smallest element in the lattice ${\cal X}$ and $size(f)=size(g_1)+\cdots+size(g_d)$, where $size(g_i)$ is the number of minimal elements in $g_i^{-1}(1)$. For the Boolean function $f:\{0,1\}^n\to\{0,1\}$, the class of $d$-monotone functions that are represented in the form $f=F(g_1,g_2,\ldots,g_d)$, where $F$ is any Boolean function and $g_1,\ldots,g_d$ are any monotone DNF, is learnable in time $O(n^2)\cdot (size(f)/d+1)^{d}$ where $size(f)=size(g_1)+\cdots+size(g_d)$. In particular, this class is learnable in polynomial time when $d$ is constant. Additionally, this class is learnable in polynomial time when $size(g_i)$ is constant for all $i$ and $d=O(\log n)$.
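To make the representation concrete, here is a minimal Python sketch of evaluating $f=F(g_1,\ldots,g_d)$ over the Boolean lattice $\{0,1\}^n$, where each monotone $g_i$ is given by the minimal elements of $g_i^{-1}(1)$ (i.e., its monotone DNF terms). The function names are illustrative, not from the paper, and this shows only the representation and the $size$ measure, not the learning algorithm:

```python
from itertools import product

# A monotone g: {0,1}^n -> {0,1} given by the minimal elements of g^{-1}(1),
# encoded as sets of coordinates that must equal 1 (monotone DNF terms).
# size(g) is the number of such minimal elements.
def make_monotone(min_true_points):
    terms = [frozenset(t) for t in min_true_points]
    def g(x):
        return any(all(x[i] for i in t) for t in terms)
    return g, len(terms)

# The d-monotone representation f = F(g_1, ..., g_d) from the abstract.
def make_d_monotone(F, gs):
    return lambda x: F(*(g(x) for g in gs))

# Example with n = 3, d = 2: g1 = x0 OR x1, g2 = x0 AND x2, F = XOR.
g1, s1 = make_monotone([{0}, {1}])
g2, s2 = make_monotone([{0, 2}])
f = make_d_monotone(lambda a, b: a ^ b, [g1, g2])
size_f = s1 + s2  # size(f) = size(g1) + size(g2) = 3

for x in product([0, 1], repeat=3):
    print(x, f(x))
```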
Related papers
- PREM: Privately Answering Statistical Queries with Relative Error [91.98332694700046]
We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that satisfies a relative error guarantee for statistical queries under $(\varepsilon, \delta)$-differential privacy (DP).
We complement our algorithm with nearly matching lower bounds.
arXiv Detail & Related papers (2025-02-20T18:32:02Z)
- LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions [54.54897832889028]
We show that for any $K$, there is a "universal set" $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$.
We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set during training as well.
arXiv Detail & Related papers (2024-10-07T19:47:13Z)
- Further Investigation on Differential Properties of the Generalized Ness-Helleseth Function [13.67029767623542]
The function defined by $f_u(x)=ux^{d_1}+x^{d_2}$ is called the generalized Ness-Helleseth function over $\mathbb{F}_{p^n}$.
For each $u$ satisfying $\chi(u+1) = \chi(u-1)$, the differential spectrum of $f_u(x)$ is investigated.
arXiv Detail & Related papers (2024-08-30T13:18:23Z)
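As a small illustration of what a differential spectrum is (generic, not the paper's analysis; for simplicity this works over a prime field $\mathbb{F}_p$ rather than $\mathbb{F}_{p^n}$, and the parameters are made up):

```python
from collections import Counter

# For each nonzero a, count solutions x of f(x + a) - f(x) = b over F_p;
# the differential spectrum is the multiset of these solution counts.
# (Pairs (a, b) with zero solutions are omitted in this sketch.)
def diff_spectrum(f, p):
    spectrum = Counter()
    for a in range(1, p):
        row = Counter((f(x + a) - f(x)) % p for x in range(p))
        spectrum.update(row.values())
    return dict(spectrum)

# Toy instance of f_u(x) = u*x^d1 + x^d2 with made-up parameters.
p, u, d1, d2 = 7, 3, 2, 5
f = lambda x: (u * pow(x, d1, p) + pow(x, d2, p)) % p
print(diff_spectrum(f, p))  # {solution_count: number_of_(a,b)_pairs}
```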
- The Approximate Degree of DNF and CNF Formulas [95.94432031144716]
For every $\delta>0$, we construct CNF and DNF formulas of polynomial size with approximate degree $\Omega(n^{1-\delta})$, essentially matching the trivial upper bound of $n$.
We show that for every $\delta>0$, these models require $\Omega(n^{1-\delta})$, $\Omega((n/4^kk^2)^{1-\delta})$, and $\Omega((n/4^kk^2)^{1-\delta})$, respectively.
arXiv Detail & Related papers (2022-09-04T10:01:39Z)
- Self-adjusting Population Sizes for the $(1, \lambda)$-EA on Monotone Functions [7.111443975103329]
We study the $(1,\lambda)$-EA with mutation rate $c/n$ for $c\le 1$, where the population size is adaptively controlled with the $(1:s+1)$-success rule.
We show that this setup with $c=1$ is efficient on OneMax for $s<1$, but inefficient if $s \ge 18$.
arXiv Detail & Related papers (2022-04-01T15:46:12Z)
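For readers unfamiliar with the $(1:s+1)$-success rule, here is a minimal Python sketch of a $(1,\lambda)$-EA with a self-adjusting population size on OneMax. The update factor $F$, the cap $\lambda \ge 1$, and the stopping rule are illustrative choices, not taken from the paper:

```python
import random

def onemax(x):
    return sum(x)

def one_comma_lambda_ea(n=100, c=1.0, s=0.5, F=1.5, max_gens=100_000):
    parent = [random.randint(0, 1) for _ in range(n)]
    lam = 1.0
    for gen in range(max_gens):
        # Each offspring flips every bit independently with probability c/n.
        offspring = [[bit ^ (random.random() < c / n) for bit in parent]
                     for _ in range(max(1, round(lam)))]
        best = max(offspring, key=onemax)
        success = onemax(best) > onemax(parent)
        parent = best  # comma selection: the parent is always replaced
        # (1:s+1)-success rule: shrink lambda on success, grow it on failure,
        # so one success every s+1 generations keeps lambda roughly constant.
        lam = max(1.0, lam / F if success else lam * F ** (1 / s))
        if onemax(parent) == n:
            return gen  # generations until the optimum was found
    return None
```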
- Learning low-degree functions from a logarithmic number of random queries [77.34726150561087]
We prove that for any integer $n\in\mathbb{N}$, $d\in\{1,\ldots,n\}$, and any $\varepsilon,\delta\in(0,1)$, a bounded function $f:\{-1,1\}^n\to[-1,1]$ of degree at most $d$ can be learned from a number of random queries that is logarithmic in $n$.
arXiv Detail & Related papers (2021-09-21T13:19:04Z)
- Inapproximability of Minimizing a Pair of DNFs or Binary Decision Trees Defining a Partial Boolean Function [13.713089043895959]
We consider partial Boolean functions defined by a pair of Boolean functions $f, g \colon \{0,1\}^J \to \{0,1\}$ such that $f \cdot g = 0$ and such that $f$ and $g$ are defined by disjunctive normal forms or binary decision trees.
arXiv Detail & Related papers (2021-02-09T08:46:50Z)
- On the Modularity of Hypernetworks [103.1147622394852]
We show that for a structured target function, the overall number of trainable parameters in a hypernetwork is smaller by orders of magnitude than the number of trainable parameters of a standard neural network and an embedding method.
arXiv Detail & Related papers (2020-02-23T22:51:52Z)
- A closer look at the approximation capabilities of neural networks [6.09170287691728]
A feedforward neural network with one hidden layer is able to approximate any continuous function $f$ to any given approximation threshold $\varepsilon$.
We show that this uniform approximation property still holds even under seemingly strong conditions imposed on the weights.
arXiv Detail & Related papers (2020-02-16T04:58:43Z)
- On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$ upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both *compatible* with oracles, and achieves complexity bounds of $\tilde{O}(n^2+n\sqrt{L/\mu})\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
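Since the entry above centers on SVRG, here is a minimal sketch of the standard SVRG loop for a finite sum $f(w)=\frac{1}{n}\sum_i f_i(w)$. This is the textbook variance-reduction scheme, not the paper's index-free oracle adaptation; all names and parameters are illustrative:

```python
import numpy as np

def svrg(grad_i, w0, n, lr=0.1, epochs=20, inner_steps=None):
    """grad_i(w, i) returns the gradient of the i-th component f_i at w."""
    inner_steps = inner_steps or n
    w = w0.copy()
    for _ in range(epochs):
        snapshot = w.copy()
        # Full gradient at the snapshot, recomputed once per epoch.
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n)], axis=0)
        for _ in range(inner_steps):
            i = np.random.randint(n)
            # Variance-reduced stochastic gradient: unbiased, with variance
            # shrinking as w approaches the snapshot.
            g = grad_i(w, i) - grad_i(snapshot, i) + full_grad
            w -= lr * g
    return w

# Toy least-squares usage: f_i(w) = 0.5 * (a_i @ w - b_i)^2.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
w = svrg(lambda w, i: (A[i] @ w - b[i]) * A[i], np.zeros(5), n=50, lr=0.01)
```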
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.