Note: This is meant to be done after the first submission.
A promising extension to InterroLang (probably even warranting its own paper) would be to replace the BERT-type models with a single general-purpose LLM (e.g. LLaMA, or the GPT-Neo parser that is already in place) that performs all of the tasks reasonably well. This would be a more modern approach: BERT-style models are slowly becoming outdated, and LLMs can now run locally on consumer hardware. It would, however, require several changes, which I document below:
- Write instructions for the various tasks (see the prompt sketch after this list), e.g.
  `"Please predict one of the following labels: <label_1> … <label_n> Prediction:"`
- Overhaul the entire feature importance operation category (nlpattribute, globaltopk) using Inseq (see the Inseq sketch below). ⚠️ How matrices of feature attributions would be verbalized into a response has yet to be determined.
- Pre-compute predictions and explanations with the new LLM.
- Get rid of many of the smaller language models, e.g. the GPT-2 for CFE generation and the SBERT for semantic similarity.
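To make the instruction-writing and pre-computation points concrete, here is a minimal sketch of prompt-based label prediction. The model name, prompt wording, and answer parsing are assumptions, not settled choices:

```python
# Minimal sketch: prompt-based label prediction with a single instruct LLM.
# Model name, prompt wording, and answer parsing are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "togethercomputer/RedPajama-INCITE-Instruct-3B-v1"  # assumption: any instruct LLM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def predict_label(text: str, labels: list[str]) -> str:
    """Ask the LLM to pick one of the dataset's labels for `text`."""
    prompt = (
        f"{text}\n"
        f"Please predict one of the following labels: {' '.join(labels)}\n"
        "Prediction:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    # Decode only the newly generated tokens, then match them against the label set.
    completion = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return next((l for l in labels if l.lower() in completion.lower()), labels[0])

# Pre-computing predictions for a dataset is then a plain loop whose output we cache:
# predictions = {i: predict_label(ex["text"], LABELS) for i, ex in enumerate(dataset)}
```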
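For the Inseq-based overhaul, attribution over a generative model could look roughly like this (`gpt2` and `integrated_gradients` are stand-ins for whichever model and attribution method we settle on):

```python
# Minimal Inseq sketch: attribute a generation step of a causal LM.
# "gpt2" and "integrated_gradients" are stand-ins, not final choices.
import inseq

model = inseq.load_model("gpt2", "integrated_gradients")
out = model.attribute(
    input_texts="The movie was surprisingly good.\nPrediction:",
    generation_args={"max_new_tokens": 3},
)
out.show()  # token-by-token attribution matrix; verbalizing this is the open question
```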
Ideally, we would end up with one model (for the entire framework) that assesses itself. It would take care of (a dispatch sketch follows this list):
- Parsing / Intent recognition
- Prediction of downstream tasks
- Feature attribution (nlpattribute, globaltopk)
- Perturbations (CFE, adversarial, augment)
- Semantic similarity
- Rationalization
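One way to picture this single-model setup is a thin dispatch layer in which every operation is just a prompt template over the same LLM. Everything below (template names and wording) is hypothetical:

```python
# Hypothetical dispatch layer: every InterroLang operation becomes a prompt
# template over one shared LLM. Template names and wording are made up.
PROMPTS = {
    "predict":     "{text}\nPlease predict one of the following labels: {labels}\nPrediction:",
    "rationalize": "{text}\nThe predicted label is {label}. Explain the prediction in one sentence:",
    "cfe":         "{text}\nMinimally edit this text so that its label flips to {target}:",
    "similarity":  "Text A: {a}\nText B: {b}\nAre these two texts semantically similar? Answer yes or no:",
}

def run_operation(llm, operation: str, **slots) -> str:
    """Route any operation through the shared LLM (`llm` is any text-in/text-out callable)."""
    return llm(PROMPTS[operation].format(**slots))
```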
The only two parts of the pipeline that would remain rule-based are the dialogue state tracking (custom inputs, clarification questions, previous filters) and the response generation (currently template-based).
Resources
- llama.cpp (efficient execution of up to 7B models on CPUs; see the sketch below)
- RedPajama-INCITE-Instruct-3B (Hugging Face) – maybe better for rationalization?
- RedPajama-INCITE-Chat-3B – maybe better for response generation?
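As an illustration of the llama.cpp route, the llama-cpp-python bindings would expose it to our Python stack; a minimal sketch, where the quantized checkpoint path is a placeholder:

```python
# Minimal llama-cpp-python sketch for running prompts locally on CPU.
# The quantized checkpoint path below is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-7b.Q4_K_M.gguf")
result = llm(
    "Please predict one of the following labels: positive negative\nPrediction:",
    max_tokens=5,
    stop=["\n"],
)
print(result["choices"][0]["text"].strip())
```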