Hehai Lin

Hehai Lin is an incoming PhD student at The Hong Kong University of Science and Technology (Guangzhou). Before that, he received his Bachelor's degree from the School of Artificial Intelligence at Sun Yat-sen University, where he was supervised by Prof. Zhenhui Peng and Prof. Xiaobin Chang. He also works closely with Prof. Wenya Wang at the College of Computing and Data Science, Nanyang Technological University. His current research interests include multimodal learning and reasoning, especially the self-evolution ability of large vision-language models (LVLMs).


Education
  • Sun Yat-sen University

    B.S. in Artificial Intelligence, Sep. 2020 - Jun. 2024

Honors & Awards
  • JinDao Scholarship 2024
  • Third-class Scholarship of Sun Yat-sen University 2023
  • Honorable Mention in the ECV2023 Competition 2023
  • Fourth Place in the Second Intelligent Network Competition 2022
  • Second-class Scholarship of Sun Yat-sen University 2022
  • First-class Scholarship of Sun Yat-sen University 2021
Experience
  • Nanyang Technological University

    Research Assistant (supervised by Prof. Wenya Wang), Jun. 2024 - Nov. 2024

Service
  • Reviewer of COLING 2025 and WWW 2025
Selected Publications († corresponding author, * equal contribution)
Multi-view Analysis for Modality Bias in Multimodal Misinformation Benchmarks

Hehai Lin, Hui Liu, Shilei Cao, Haoliang Li, Wenya Wang

Under review 2024

Numerous multimodal misinformation benchmarks exhibit bias toward specific modalities, allowing detectors to make predictions based solely on one modality. Training detectors on such datasets can significantly degrade performance in real-world applications. While previous research has quantified modality bias at the dataset level or manually identified spurious correlations between modalities and labels, these approaches lack meaningful insights at the sample level and struggle to scale to the vast amount of online information. In this paper, we investigate how to automatically recognize modality bias at the sample level. Specifically, we introduce three views, namely modality benefit, modality flow, and modality causal effect, to quantify each sample's modality contribution based on different theories. To verify their effectiveness and discover patterns of bias, we conduct a human evaluation on two benchmarks, Fakeddit and MMFakeBench, and compare the performance of each view and their ensemble, multi-view analysis. The experimental results indicate that multi-view analysis yields the highest performance and aligns with human judgment on most samples. We further discuss the sensitivity and consistency of each view.
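
A minimal, hypothetical sketch of the ensembling step described in the abstract: each view (modality benefit, modality flow, modality causal effect) is assumed to emit a per-sample judgment of which modality dominates, and the multi-view label is taken by majority vote. The function name, label set, and tie-breaking rule are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from typing import Dict

def ensemble_views(view_labels: Dict[str, str]) -> str:
    """Combine per-view judgments for one sample into a multi-view label.

    view_labels maps each view ("benefit", "flow", "causal_effect")
    to the modality it judges dominant for this sample.
    """
    counts = Counter(view_labels.values())
    label, count = counts.most_common(1)[0]
    # Fall back to "balanced" when the three views all disagree.
    return label if count >= 2 else "balanced"

# Example: two of three views flag a text-only shortcut for this sample.
sample_views = {"benefit": "text", "flow": "text", "causal_effect": "image"}
print(ensemble_views(sample_views))  # -> "text"
```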

Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks

Jiayi He*, Hehai Lin*, Qingyun Wang, Yi Fung, Heng Ji

Under review 2024

While Vision-Language Models (VLMs) have shown remarkable abilities in visual and language reasoning tasks, they invariably generate flawed responses. Self-correction, which instructs models to refine their outputs, presents a promising solution to this issue. Previous studies have mainly concentrated on Large Language Models (LLMs), while the self-correction abilities of VLMs, particularly concerning both visual and linguistic information, remain largely unexamined. This study investigates the self-correction capabilities of VLMs during both the inference and fine-tuning stages. We introduce a Self-Correction Learning (SCL) approach that enables VLMs to learn from their self-generated self-correction data through Direct Preference Optimization (DPO) without relying on external feedback, facilitating self-improvement. Specifically, we collect preferred and disfavored samples based on the correctness of the initial and refined responses, which are obtained through two-turn self-correction with VLMs during the inference stage. Experimental results demonstrate that although VLMs struggle to self-correct effectively during iterative inference without additional fine-tuning and external feedback, they can enhance their performance and avoid previous mistakes through preference fine-tuning when their self-generated self-correction data are categorized into preferred and disfavored samples. This study emphasizes that self-correction is not merely a refinement process; rather, it should enhance the reasoning abilities of models through additional training, enabling them to generate high-quality responses directly without further refinement.
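
The preference-pair construction sketched below is an illustrative reading of the abstract, not the paper's implementation: after a two-turn self-correction rollout, a sample becomes a DPO training pair only when exactly one of the two responses is correct. All names (PreferencePair, build_pair, and the field names) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # preferred response (the correct one)
    rejected: str  # disfavored response (the incorrect one)

def build_pair(prompt: str,
               initial: str, initial_correct: bool,
               refined: str, refined_correct: bool) -> Optional[PreferencePair]:
    """Keep a pair only when exactly one of the two turns is correct."""
    if initial_correct and not refined_correct:
        return PreferencePair(prompt, chosen=initial, rejected=refined)
    if refined_correct and not initial_correct:
        return PreferencePair(prompt, chosen=refined, rejected=initial)
    return None  # both correct or both wrong: no preference signal

# The surviving pairs would then feed DPO-style preference fine-tuning
# of the VLM, without any external feedback model.
```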
