AI publications

publications

Red teaming LLMs: an end-to-end safety overview

A survey covering attack methods, evaluation, metrics and tools for identifying and mitigating GenAI application vulnerabilities.

RainbowPO: unified preference optimization

A unified framework for preference optimization, aimed at better aligning AI with human values.

Enhancing LLM security with chain-of-thought fine-tuning

Fine-tuning and aligning chain-of-thought responses in LLMs for safer conversational AI.

Zero-shot tabular prediction via adversarial transformer

Introducing APT, an adversarially pre-trained transformer achieving SOTA on small tabular tasks.

Preference tuning with human feedback: a survey

A survey of recent advancements in aligning deep generative models with human preferences across language, speech and vision.

Visagreement: visualizing explanations (dis)agreement

A visualization tool for tabular data, exploring where and why local feature importance explanations agree or disagree.

An automatic method to estimate correctness of RAG

This method predicts the correctness of retrieval-augmented generation by analyzing uncertainty.

LIP: graph node classification via label influence

A novel GNN-based model for multi-label node classification that propagates label influences on graphs.

MetaMetrics: calibrating metrics for generation tasks

MetaMetrics aligns evaluation metrics with human preferences for better AI assessment.