A survey covering attack methods, evaluation, metrics and tools for identifying and mitigating GenAI application vulnerabilities.
A new framework that enhances preference optimization to better align AI with human values.
Fine-tuning and aligning chain-of-thought responses in LLMs for safer conversational AI.
Introducing APT, an adversarially pre-trained transformer that achieves state-of-the-art results on small tabular tasks.
A survey of recent advancements in aligning deep generative models with human preferences across language, speech and vision.
A visualization tool for tabular data, exploring where and why local feature importance explanations agree or disagree.
A method that predicts the correctness of retrieval-augmented generation by analyzing uncertainty.
A novel GNN-based model for multi-label node classification that propagates label influences on graphs.
MetaMetrics aligns evaluation metrics with human preferences for better AI assessment.