A Kotlin multiplatform library for anything GBNF.
Features:
- DSL for writing GBNF grammars in Kotlin
- Interpreter that loads existing GBNF grammars into the same representation the DSL produces
- Parsing of GBNF grammars
- Functions for converting parsed trees to a custom tree format
- Functions for searching in and filtering trees
Originally, GBNF Kotlin was meant to be just a DSL for writing GBNF grammars in Kotlin, for use with KotLLMs.
Parsing was added next, initially to define a grammar for function calling in KotLLMs, and was later expanded with utility functions.
Since GBNF is effectively a superset of BNF, it can be used to describe languages. In fact, you can use GBNF to describe GBNF itself, as seen in gbnf_for_gbnf.gbnf.
Because of this, it became possible to write a parser for GBNF inside GBNF Kotlin. I wrote one, and with it added support for parsing GBNF text into the DSL object, so existing grammars are also valid and usable.
You might be wondering what this library can be used for, since the feature list can sound confusing. It is essentially a library for using GBNF, plus a helper library for creating GBNF grammars programmatically.
Usages:
- Writing an LLM grammar, and reading the result back. (As seen in KotLLMs)
- Writing a parser for a computer language (not exclusively programming languages), for parsing from text to an abstract syntax tree. (As seen in the GBNFInterpreter)
Example: "I'm having a great day!"
package example
import com.mylosoftworks.com.mylosoftworks.gbnfkotlin.GBNF
fun main() {
    val example = GBNF {
        literal("I'm having a ")
        oneOf {
            literal("terrible")
            literal("bad")
            literal("okay")
            literal("good")
            literal("great")
        }
        literal(" day!")
    }
    // Print the resulting GBNF code
    println(example.compile())
}
Result:
root ::= "I'm having a " ("terrible" | "bad" | "okay" | "good" | "great") " day!"
Or with the new operator-based syntax:
package example
import com.mylosoftworks.com.mylosoftworks.gbnfkotlin.GBNF
fun main() {
    val example = GBNF {
        +"I'm having a "
        oneOf {
            +"terrible"
            +"bad"
            +"okay"
            +"good"
            +"great"
        }
        +" day!"
    }
    // Print the resulting GBNF code
    println(example.compile())
}
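For readers curious how an operator form like `+"text"` can coexist with a named form like `literal("text")`, here is a minimal, standalone sketch of the general Kotlin technique (a member extension `operator fun String.unaryPlus()` inside a builder class). The `MiniGrammar` class and `miniGrammar` function are hypothetical names invented for this illustration; this is not GBNF Kotlin's actual implementation, and it skips details such as escaping quotes inside literals.

```kotlin
// Hypothetical stand-in for a grammar builder; not GBNF Kotlin's real classes.
class MiniGrammar(private val separator: String = " ") {
    private val parts = mutableListOf<String>()

    // Named form: append a quoted literal (no escaping, for brevity).
    fun literal(text: String) {
        parts += "\"" + text + "\""
    }

    // Operator form: +"text" simply forwards to literal("text").
    operator fun String.unaryPlus() = literal(this)

    // Alternatives: render the nested block with " | " between its parts.
    fun oneOf(block: MiniGrammar.() -> Unit) {
        parts += "(" + MiniGrammar(" | ").apply(block).render() + ")"
    }

    fun render(): String = parts.joinToString(separator)
}

// Hypothetical entry point that renders a single root rule.
fun miniGrammar(block: MiniGrammar.() -> Unit): String =
    "root ::= " + MiniGrammar().apply(block).render()

fun main() {
    val named = miniGrammar {
        literal("I'm having a ")
        oneOf {
            literal("good")
            literal("great")
        }
        literal(" day!")
    }
    val withOperators = miniGrammar {
        +"I'm having a "
        oneOf {
            +"good"
            +"great"
        }
        +" day!"
    }
    println(named)                  // root ::= "I'm having a " ("good" | "great") " day!"
    println(named == withOperators) // true
}
```

Because the operator is declared inside the builder class, it is only available within the builder lambda and does not leak onto every `String` in the program.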
GBNF Kotlin supports all features described in the llama.cpp grammars documentation except for comments.
Terminal rules are rules that describe the allowed characters directly.
Literal strings can be defined like this:
literal("Content")
// "Content"
// Using operator
+"Content"
Characters and ranges can be defined like this:
// In order to allow the letters a-z
range("a-z")
// [a-z]
// Using operator
-"a-z"
// In order to allow everything except "y"
range("y", true)
// [^y]
// Using operator
-!"y"
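GBNF character classes use the familiar bracket notation from regular expressions, so one quick way to get a feel for what `[a-z]` or `[^y]` accepts is to try the same class with Kotlin's standard `Regex`. This is only an analogy built on the standard library, not part of GBNF Kotlin, and the two helper functions are hypothetical names chosen for this sketch.

```kotlin
// Hypothetical helpers: check a single-character string against a bracket class.
fun isLowercaseLetter(s: String): Boolean = Regex("[a-z]").matches(s) // same notation as range("a-z")
fun isAnythingButY(s: String): Boolean = Regex("[^y]").matches(s)     // same notation as range("y", true)

fun main() {
    println(isLowercaseLetter("q")) // true
    println(isLowercaseLetter("Q")) // false
    println(isAnythingButY("x"))    // true
    println(isAnythingButY("y"))    // false
}
```

Note that a negated class accepts every other character, including uppercase letters, digits, and whitespace.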
Entities (predefined nonterminal rules) can be defined like this:
val entity = entity("entity") { // Name is optional
    literal("This is an entity.")
}
// To then use this entity as a nonterminal rule, simply invoke it:
entity()
// Result when compile() is called:
// root ::= entity
// entity ::= "This is an entity."
// Using operator
val entity = "entity" {
    +"This is an entity."
}
Groups are used to group rules together. They are like entities, but not reusable.
group {
    literal("This is a literal.")
    literal("This is another literal.")
}
// ("This is a literal." "This is another literal.")
Alternatives (one of) are used to provide multiple options, of which one is matched.
oneOf {
    literal("This is a literal.")
    literal("This is another literal.")
}
// ("This is a literal." | "This is another literal.")
Repeat is used to mark a group as repeating a certain number of times.
repeat(5) { // Repeat 5 times exactly
    literal("literal")
}
// "literal"{5}
// Using operator
5 {
    +"literal"
}
repeat(max=5) { // Repeat between 0 and 5 times
    literal("literal")
}
// "literal"{0,5}
// Using operator
(-5) {
    +"literal"
}
repeat(1, 5) { // Repeat between 1 and 5 times
    literal("literal")
}
// "literal"{1,5}
// Using operator
(1..5) {
    +"literal"
}
repeat(5, null) { // Repeat at least 5 times
    literal("literal")
}
// "literal"{5,}
// Using operator
(5..Inf) {
    +"literal"
}
For some types of repeat, there are alternative functions.
optional {
    literal("literal")
}
// "literal"?
// Using operator
(-1) {
    +"literal"
}
oneOrMore {
    literal("literal")
}
// "literal"+
// Using operator
(1..Inf) {
    +"literal"
}
anyCount {
    literal("literal")
}
// "literal"*
// Using operator
(0..Inf) {
    +"literal"
}
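The repeat forms above all boil down to a minimum and an optional maximum count, with `?`, `+` and `*` acting as shorthand for `{0,1}`, `{1,}` and `{0,}`. The standalone sketch below summarizes that mapping. `repetitionSuffix` is a hypothetical helper written for this illustration, not a function from GBNF Kotlin, and which spelling the library itself emits for a given pair of bounds is not something this sketch claims to reproduce.

```kotlin
// Hypothetical helper, not part of GBNF Kotlin: renders the GBNF suffix for a repetition.
// max == null means "no upper bound".
fun repetitionSuffix(min: Int, max: Int?): String = when {
    min == 0 && max == 1 -> "?"     // optional
    min == 0 && max == null -> "*"  // any count
    min == 1 && max == null -> "+"  // one or more
    max == null -> "{$min,}"        // at least min
    min == max -> "{$min}"          // exactly min
    else -> "{$min,$max}"           // between min and max
}

fun main() {
    println(repetitionSuffix(0, 1))    // ?
    println(repetitionSuffix(1, null)) // +
    println(repetitionSuffix(0, null)) // *
    println(repetitionSuffix(5, 5))    // {5}
    println(repetitionSuffix(0, 5))    // {0,5}
    println(repetitionSuffix(5, null)) // {5,}
}
```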