ZKBoost: Zero-Knowledge Verifiable Training for XGBoost
- URL: http://arxiv.org/abs/2602.04113v2
- Date: Mon, 09 Feb 2026 16:27:45 GMT
- Title: ZKBoost: Zero-Knowledge Verifiable Training for XGBoost
- Authors: Nikolas Melissaris, Jiayi Xu, Antigoni Polychroniadou, Akira Takahashi, Chenkai Weng
- Abstract summary: We present ZKBoost, the first zero-knowledge proof of training (zkPoT) protocol for XGBoost. Our fixed-point implementation matches standard XGBoost accuracy within 1% while enabling practical zkPoT on real-world datasets.
- Score: 11.66112429566394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient boosted decision trees, particularly XGBoost, are among the most effective methods for tabular data. As deployment in sensitive settings increases, cryptographic guarantees of model integrity become essential. We present ZKBoost, the first zero-knowledge proof of training (zkPoT) protocol for XGBoost, enabling model owners to prove correct training on a committed dataset without revealing data or parameters. We make three key contributions: (1) a fixed-point XGBoost implementation compatible with arithmetic circuits, enabling instantiation of efficient zkPoT, (2) a generic template of zkPoT for XGBoost, which can be instantiated with any general-purpose ZKP backend, and (3) a vector oblivious linear evaluation (VOLE)-based instantiation that resolves challenges in proving nonlinear fixed-point operations. Our fixed-point implementation matches standard XGBoost accuracy within 1% while enabling practical zkPoT on real-world datasets.
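To make the fixed-point idea in the abstract concrete, the following is a minimal sketch, not the ZKBoost implementation. It assumes a 16-bit fractional scale and uses the standard XGBoost optimal leaf weight w = -G / (H + lambda). The names `SCALE_BITS`, `to_fixed`, `fx_div`, `leaf_weight`, and `division_relation_holds` are hypothetical. The last helper shows the kind of integer quotient-and-remainder relation that is commonly checked in place of a division inside an arithmetic circuit; whether ZKBoost uses this exact relation is an assumption.

```python
# Illustrative fixed-point arithmetic for a gradient-boosting leaf weight.
# Assumption: values are integers scaled by 2**SCALE_BITS (not taken from the paper).
SCALE_BITS = 16
SCALE = 1 << SCALE_BITS


def to_fixed(x: float) -> int:
    """Encode a real number as a scaled integer."""
    return round(x * SCALE)


def from_fixed(a: int) -> float:
    """Decode a scaled integer back to a float (for inspection only)."""
    return a / SCALE


def fx_div(a: int, b: int) -> int:
    """Fixed-point division a / b, rounding toward negative infinity."""
    return (a << SCALE_BITS) // b


def leaf_weight(grads: list[int], hessians: list[int], lam: int) -> int:
    """Standard XGBoost leaf weight w = -G / (H + lambda), in fixed point."""
    g = sum(grads)      # sums of fixed-point values need no rescaling
    h = sum(hessians)
    return fx_div(-g, h + lam)


def division_relation_holds(a: int, b: int, q: int) -> bool:
    """Check a * SCALE == q * b + r with 0 <= r < b (b > 0).

    A circuit cannot divide directly, so a prover would typically supply
    q and r as witnesses and the verifier checks this integer relation.
    """
    r = (a << SCALE_BITS) - q * b
    return 0 <= r < b


grads = [to_fixed(0.5), to_fixed(-0.25), to_fixed(0.75)]
hessians = [to_fixed(0.25)] * 3
lam = to_fixed(1.0)
w = leaf_weight(grads, hessians, lam)
print(from_fixed(w))  # close to -1 / 1.75
```

The decoded weight differs from the floating-point value by less than one unit of the fixed-point scale, which is the kind of rounding gap that a claim such as "accuracy within 1%" has to absorb.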
Related papers
- CVeDRL: An Efficient Code Verifier via Difficulty-aware Reinforcement Learning [57.24524263804788]
Code verifiers play a critical role in post-verification for LLM-based code generation. Existing supervised fine-tuning methods suffer from data scarcity, high failure rates, and poor inference efficiency. We show that naive RL with only functionality rewards fails to generate effective unit tests for difficult branches and samples.
arXiv Detail & Related papers (2026-01-30T10:33:29Z) - Detect Anything via Next Point Prediction [51.55967987350882]
Rex-Omni is a 3B-scale MLLM that achieves state-of-the-art object perception performance. On benchmarks like COCO and LVIS, Rex-Omni attains performance comparable to or exceeding regression-based models.
arXiv Detail & Related papers (2025-10-14T17:59:54Z) - Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning [88.48555005545694]
We propose a Controllable Pseudo-label Generation (CPG) framework to expand the labeled dataset with reliable pseudo-labels from the unlabeled dataset. CPG operates through a controllable self-reinforcing optimization cycle. CPG achieves consistent improvements, surpassing state-of-the-art methods by up to 15.97% in accuracy.
arXiv Detail & Related papers (2025-10-05T01:52:19Z) - OGBoost: A Python Package for Ordinal Gradient Boosting [0.0]
We introduce OGBoost, a scikit-learn-compatible Python package for ordinal regression using gradient boosting. The package is available on PyPI and can be installed via "pip install ogboost".
arXiv Detail & Related papers (2025-02-19T06:06:12Z) - Degree-Conscious Spiking Graph for Cross-Domain Adaptation [51.58506501415558]
Spiking Graph Networks (SGNs) have demonstrated significant potential in graph classification. We introduce a novel framework named Degree-Conscious Spiking Graph for Cross-Domain Adaptation (DeSGraDA). DeSGraDA enhances generalization across domains with three key components.
arXiv Detail & Related papers (2024-10-09T13:45:54Z) - PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning [57.66155242473784]
Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks.
Our proposed model, named PRBoost, achieves this goal via iterative prompt-based rule discovery and model boosting.
Experiments on four tasks show PRBoost outperforms state-of-the-art WSL baselines by up to 7.1%.
arXiv Detail & Related papers (2022-03-18T04:23:20Z) - Explainable AI Integrated Feature Selection for Landslide Susceptibility Mapping using TreeSHAP [0.0]
Early prediction of landslide susceptibility using a data-driven approach is a pressing need.
We employed state-of-the-art machine learning algorithms including XGBoost, LR, KNN, SVM, and AdaBoost for landslide susceptibility prediction.
An optimized version of XGBoost with a 40% feature reduction outperformed all other classifiers on popular evaluation metrics.
arXiv Detail & Related papers (2022-01-10T09:17:21Z) - KGBoost: A Classification-based Knowledge Base Completion Method with Negative Sampling [29.14178162494542]
KGBoost is a new method to train a powerful classifier for missing link prediction.
We conduct experiments on multiple benchmark datasets, and demonstrate that KGBoost outperforms state-of-the-art methods across most datasets.
Compared with models trained by end-to-end optimization, KGBoost works well in the low-dimensional setting, which allows a smaller model size.
arXiv Detail & Related papers (2021-12-17T06:19:37Z) - Tabular Data: Deep Learning is Not All You Need [0.0]
A key element of AutoML systems is setting the types of models that will be used for each type of task.
For classification and regression problems with tabular data, the use of tree ensemble models (like XGBoost) is usually recommended.
arXiv Detail & Related papers (2021-06-06T21:22:39Z) - Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z) - Survival regression with accelerated failure time model in XGBoost [1.5469452301122177]
Survival regression is used to estimate the relation between time-to-event and feature variables.
XGBoost implements loss functions for learning accelerated failure time models.
arXiv Detail & Related papers (2020-06-08T20:34:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.