The rise of large language models has brought significant advances in natural language processing. However, these models can also generate content that is hallucinatory, toxic, or otherwise harmful. In response to these issues, the task of regulating large language models focuses on developing methods to detect and mitigate such undesirable outputs.
This shared task includes two tracks:
● Track 1: Multimodal Hallucination Detection for Multimodal Large Language Models: Develop methods to identify and flag hallucinatory outputs that do not align with reality or with the given input context when dealing with multimodal prompts (text, images, video, etc.). This track involves creating detection algorithms that can discern between accurate and hallucinated responses across different modalities, thereby ensuring the reliability of the model's outputs (an illustrative baseline sketch is given after this list).
● Track 2: Detoxifying Large Language Models: Design and implement strategies to prevent large language models from generating toxic, biased, or harmful content. This track focuses on developing filters, fine-tuning techniques, or other mechanisms to recognize and suppress malicious responses before they reach the user. The goal is to preserve the utility and fluency of the model while ensuring that the content it produces adheres to community guidelines and ethical standards (a second illustrative sketch follows the list).
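As a purely illustrative starting point for Track 1, the sketch below flags a response as potentially hallucinated when too few of its content words are supported by a textual rendering of the input (e.g., a caption or a transcript of the image/video). The token-overlap heuristic, the content_tokens / grounding_score / flag_hallucination names, and the 0.5 threshold are assumptions made for illustration only; they are not part of the official task definition, and participants are expected to build much stronger multimodal detectors.

```python
# Minimal, illustrative Track 1 baseline: flag a response whose content words
# are largely unsupported by a textual rendering of the input context.
# The heuristic and the 0.5 threshold are illustrative assumptions only.

import re
from typing import Set


def content_tokens(text: str) -> Set[str]:
    """Lowercase word tokens, ignoring very short function-like words."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if len(t) > 3}


def grounding_score(response: str, context: str) -> float:
    """Fraction of the response's content tokens that also occur in the
    reference context (e.g., a caption or an OCR/ASR transcript)."""
    resp, ctx = content_tokens(response), content_tokens(context)
    if not resp:
        return 1.0
    return len(resp & ctx) / len(resp)


def flag_hallucination(response: str, context: str, threshold: float = 0.5) -> bool:
    """Flag the response as potentially hallucinated when too few of its
    content tokens are supported by the given context."""
    return grounding_score(response, context) < threshold


if __name__ == "__main__":
    context = "A photo of a brown dog running on a sandy beach."
    response = "The image shows a black cat sleeping on a sofa."
    print(flag_hallucination(response, context))  # True: response is unsupported
```

Stronger submissions would of course replace the overlap score with learned grounding or entailment models that consume the visual input directly.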
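Similarly, for Track 2, the sketch below shows the general shape of an output-side filter: score a candidate response for toxicity and return a refusal when the score exceeds a threshold. The blocklist scorer, the toxicity_score / filter_response names, the 0.1 threshold, and the refusal text are illustrative assumptions only; a real submission would use a trained toxicity classifier, fine-tuning, or decoding-time methods aligned with community guidelines.

```python
# Minimal, illustrative Track 2 sketch: an output-side toxicity filter.
# The blocklist scorer, threshold, and refusal text are placeholders only.

BLOCKLIST = {"insult", "slur", "threat"}  # toy lexicon, assumption only


def toxicity_score(text: str) -> float:
    """Hypothetical scorer: the fraction of tokens that hit the blocklist.
    Participants would substitute a learned toxicity classifier here."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t.strip(".,!?") in BLOCKLIST for t in tokens) / len(tokens)


def filter_response(candidate: str, threshold: float = 0.1) -> str:
    """Suppress a candidate response whose estimated toxicity exceeds the
    threshold, returning a safe refusal instead of the original text."""
    if toxicity_score(candidate) > threshold:
        return "I'm sorry, but I can't help with that."
    return candidate


if __name__ == "__main__":
    print(filter_response("Here is a helpful and polite answer."))   # passed through
    print(filter_response("That is an insult and a threat."))        # refused
```

Filtering is only one of the strategies named in the track description; fine-tuning or controlled decoding would intervene before or during generation rather than after it.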
More information will be available shortly.
If you're intrigued by our challenge, please fill out the Registration Form (Word File) and send it to the registration email below.
Registration Email: [email protected]
- 2024/03/25: announcement of shared tasks and call for participation
- 2024/03/25: registration open
- 2024/04/15: release of detailed task guidelines & training data
- 2024/05/25: registration deadline
- 2024/06/11: release of test data
- 2024/06/20: participants' results submission deadline
- 2024/06/30: evaluation results release and call for system reports and conference papers