Reinforced Pedestrian Attribute Recognition with Group Optimization
  Reward
- URL: http://arxiv.org/abs/2205.14042v1
- Date: Sat, 21 May 2022 03:38:03 GMT
- Title: Reinforced Pedestrian Attribute Recognition with Group Optimization
  Reward
- Authors: Zhong Ji, Zhenfei Hu, Yaodong Wang, Shengjia Li
- Abstract summary: Two key challenges in Pedestrian Attribute Recognition (PAR) are the complex alignment relations between images and attributes, and imbalanced data distribution.
This paper addresses PAR as a decision-making task via a reinforcement learning framework.
An agent, trained with the Deep Q-learning algorithm, is employed to recognize each group of attributes.
- Score: 15.630702608104421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Pedestrian Attribute Recognition (PAR) is a challenging task in intelligent
video surveillance. Two key challenges in PAR include complex alignment
relations between images and attributes, and imbalanced data distribution.
Existing approaches usually formulate PAR as a recognition task. Different from
them, this paper addresses it as a decision-making task via a reinforcement
learning framework. Specifically, PAR is formulated as a Markov decision
process (MDP) by designing ingenious states, action space, reward function and
state transition. To alleviate the inter-attribute imbalance problem, we apply
an Attribute Grouping Strategy (AGS) by dividing all attributes into subgroups
according to their region and category information. Then we employ an agent to
recognize each group of attributes, which is trained with Deep Q-learning
algorithm. We also propose a Group Optimization Reward (GOR) function to
alleviate the intra-attribute imbalance problem. Experimental results on the
three benchmark datasets of PETA, RAP and PA100K illustrate the effectiveness
and competitiveness of the proposed approach and demonstrate that the
application of reinforcement learning to PAR is a valuable research direction.
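
The abstract names three ingredients: attribute grouping, a Deep Q-learning agent per group, and a reward designed to counter label imbalance. The sketch below illustrates those ideas only in generic form. The group names, the `q_learning_target` helper, and the `imbalance_weighted_reward` function are hypothetical and are not taken from the paper; in particular, the reward shown is an assumed stand-in and not the paper's GOR definition, which the abstract does not specify.

```python
from typing import Dict, List

# Hypothetical attribute groups by body region (illustrative names only;
# the paper's actual grouping is not given in the abstract).
ATTRIBUTE_GROUPS: Dict[str, List[str]] = {
    "head": ["hat", "glasses"],
    "upper_body": ["long_sleeve", "backpack"],
    "lower_body": ["trousers", "skirt"],
}


def q_learning_target(
    reward: float,
    next_q_values: List[float],
    gamma: float = 0.9,
    terminal: bool = False,
) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


def imbalance_weighted_reward(
    correct: bool, label: int, positive_ratio: float
) -> float:
    """Assumed reward shape (NOT the paper's GOR): decisions on the rarer
    label value receive a reward of larger magnitude."""
    weight = (1.0 - positive_ratio) if label == 1 else positive_ratio
    return weight if correct else -weight


if __name__ == "__main__":
    # One agent per group would use a target like this during training.
    for group, attributes in ATTRIBUTE_GROUPS.items():
        r = imbalance_weighted_reward(correct=True, label=1, positive_ratio=0.25)
        target = q_learning_target(r, next_q_values=[0.5, 2.0], gamma=0.5)
        print(group, attributes, target)
```

Weighting the reward by label rarity is one common way to keep an agent from favoring the majority label; whether the paper does this in the same way cannot be determined from the abstract alone.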
 
      
Related papers
- Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
 Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains.
Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning.
Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering.
We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
 arXiv  Detail & Related papers  (2025-06-11T12:03:52Z)
- Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
 Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance.
We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
 arXiv  Detail & Related papers  (2025-04-02T17:40:47Z)
- Multi-Attribute Steering of Language Models via Targeted Intervention [56.93583799109029]
 Inference-time intervention (ITI) has emerged as a promising method for steering large language model (LLM) behavior in a particular direction.
We introduce Multi-Attribute Targeted Steering (MAT-Steer), a novel steering framework designed for selective token-level intervention across multiple attributes.
 arXiv  Detail & Related papers  (2025-02-18T02:27:23Z)
- Hybrid Discriminative Attribute-Object Embedding Network for Compositional Zero-Shot Learning [83.10178754323955]
 Hybrid Discriminative Attribute-Object Embedding (HDA-OE) network is proposed to solve the problem of complex interactions between attributes and object visual representations.
To increase the variability of training data, HDA-OE introduces an attribute-driven data synthesis (ADDS) module.
To further improve the discriminative ability of the model, HDA-OE introduces the subclass-driven discriminative embedding (SDDE) module.
The proposed model has been evaluated on three benchmark datasets, and the results verify its effectiveness and reliability.
 arXiv  Detail & Related papers  (2024-11-28T09:50:25Z)
- A Solution to Co-occurrence Bias: Attributes Disentanglement via Mutual
  Information Minimization for Pedestrian Attribute Recognition [10.821982414387525]
 We show that current methods can struggle to generalize such fitted attribute interdependencies to scenes or identities outside the dataset distribution.
To make models robust in realistic scenes, we propose attributes-disentangled feature learning, which ensures that recognizing one attribute does not rely on inferring the existence of others.
 arXiv  Detail & Related papers  (2023-07-28T01:34:55Z)
- PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute
  Recognition [23.814762073093153]
 We propose a pure transformer-based multi-task PAR network named PARFormer, which includes four modules.
In the feature extraction module, we build a strong baseline for feature extraction, which achieves competitive results on several PAR benchmarks.
In the viewpoint perception module, we explore the impact of viewpoints on pedestrian attributes, and propose a multi-view contrastive loss.
In the attribute recognition module, we alleviate the negative-positive imbalance problem to generate the attribute predictions.
 arXiv  Detail & Related papers  (2023-04-14T16:27:56Z)
- A Reinforcement Learning-assisted Genetic Programming Algorithm for Team
  Formation Problem Considering Person-Job Matching [70.28786574064694]
 A reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions.
The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams.
 arXiv  Detail & Related papers  (2023-04-08T14:32:12Z)
- Fairness via Adversarial Attribute Neighbourhood Robust Learning [49.93775302674591]
 We propose a principled Robust Adversarial Attribute Neighbourhood (RAAN) loss to debias the classification head.
 arXiv  Detail & Related papers  (2022-10-12T23:39:28Z)
- UPAR: Unified Pedestrian Attribute Recognition and Person Retrieval [4.6193503399184275]
 We present UPAR, the Unified Person Attribute Recognition dataset.
It is based on four well-known person attribute recognition datasets: PA100k, PETA, RAPv2, and Market1501.
We unify those datasets by providing 3.3M additional annotations to harmonize 40 important binary attributes over 12 attribute categories.
 arXiv  Detail & Related papers  (2022-09-06T14:20:56Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein
  Distance [83.53855889592734]
 We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
 Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
 arXiv  Detail & Related papers  (2021-06-30T08:44:19Z)
- Deep Template Matching for Pedestrian Attribute Recognition with the
  Auxiliary Supervision of Attribute-wise Keypoints [33.35677385823819]
 Pedestrian Attribute Recognition (PAR) has aroused extensive attention due to its important role in video surveillance scenarios.
Recent works design complicated modules, e.g., attention mechanisms and body-part proposals, to localize the regions corresponding to each attribute.
These works show that precisely localizing attribute-specific regions helps improve performance.
However, these part-information-based methods are still not accurate, and they increase model complexity.
 arXiv  Detail & Related papers  (2020-11-13T07:52:26Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
 We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
 arXiv  Detail & Related papers  (2020-06-10T20:20:10Z)
- Cross-modality Person re-identification with Shared-Specific Feature
  Transfer [112.60513494602337]
 Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis.
We propose a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics.
 arXiv  Detail & Related papers  (2020-02-28T00:18:45Z)
- Deep Multi-task Multi-label CNN for Effective Facial Attribute
  Classification [53.58763562421771]
 We propose a novel deep multi-task multi-label CNN, termed DMM-CNN, for effective Facial Attribute Classification (FAC).
Specifically, DMM-CNN jointly optimizes two closely related tasks (i.e., facial landmark detection and FAC) to improve the performance of FAC by taking advantage of multi-task learning.
Two different network architectures are respectively designed to extract features for two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training.
 arXiv  Detail & Related papers  (2020-02-10T12:34:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     