leondz/lm_risk_cards

Language Model Risk Cards: Starter Set

A starter set of Language Model Risk Cards for assessing a language model use case. To use these:

  • Choose which use case, model, and interface are to be assessed
  • Select which of these risk cards are relevant to that use case
  • Recruit people to do the assessment
  • For each risk card:
    • Decide how you will probe the model, and for how long
    • Try to provoke the described behaviour from the language model, using your own prompts
    • Record all inputs and outputs
  • Compile an assessment report
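
As a rough illustration of the "record all inputs and outputs" and "compile an assessment report" steps, here is a minimal Python sketch. It is not part of the risk cards or the paper: the names `ProbeRecord`, `Assessment`, `probe`, and `report` are hypothetical, `generate` stands in for whatever model interface is being assessed, and `judge` stands in for the human assessor's decision about whether the described behaviour appeared.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProbeRecord:
    """One prompt sent to the model and the output it returned (hypothetical schema)."""
    risk_card: str
    assessor: str
    prompt: str
    output: str
    behaviour_observed: bool  # the assessor's judgement, not an automated score


@dataclass
class Assessment:
    """Collects probe records for one use case, model, and interface."""
    use_case: str
    model_name: str
    interface: str
    records: List[ProbeRecord] = field(default_factory=list)

    def probe(
        self,
        risk_card: str,
        assessor: str,
        prompt: str,
        generate: Callable[[str], str],
        judge: Callable[[str, str], bool],
    ) -> ProbeRecord:
        """Send one prompt, then store the input, the output, and the judgement."""
        output = generate(prompt)
        record = ProbeRecord(risk_card, assessor, prompt, output, judge(prompt, output))
        self.records.append(record)
        return record

    def report(self) -> Dict[str, Dict[str, int]]:
        """Count probes and observed behaviours per risk card."""
        summary: Dict[str, Dict[str, int]] = {}
        for record in self.records:
            entry = summary.setdefault(
                record.risk_card, {"probes": 0, "behaviour_observed": 0}
            )
            entry["probes"] += 1
            entry["behaviour_observed"] += int(record.behaviour_observed)
        return summary
```

In practice `generate` would wrap the real model and interface chosen in the first step, and the full list of `records` (not only the counts) would go into the assessment report.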

Full details are given in the paper "Assessing Language Model Deployment with Risk Cards" (2023) by Leon Derczynski, Hannah Rose Kirk, Vidhisha Balachandran, Sachin Kumar, Yulia Tsvetkov, M. R. Leiser, and Saif Mohammad.