Markdown renderer support #12

Open
vithalreddy opened this issue May 20, 2023 · 20 comments · May be fixed by #278 or #571
Labels
enhancement New feature or request

Comments

@vithalreddy

It would be interesting to add a built-in Markdown output preview.

For now I'm using it with glow (https://github.com/charmbracelet/glow) to do the same.

[Image]

@simonw
Owner

simonw commented Jun 14, 2023

I just tried an experiment using Rich and a new --rich option:

Annoyingly I don't think I can support --stream at the same time, so you have to wait for the output to finish.

Here's the result:

[Screenshot: CleanShot 2023-06-14 at 07 56 45@2x]

Should I add Rich as a dependency just for this feature? Maybe... especially if I might use Rich to add more features in the future.

simonw added the enhancement label on Jun 14, 2023
@simonw
Owner

simonw commented Jun 14, 2023

Here's the code for that prototype:

diff --git a/llm/cli.py b/llm/cli.py
index 2ed4d8b..1a2c92c 100644
--- a/llm/cli.py
+++ b/llm/cli.py
@@ -4,6 +4,8 @@ import datetime
 import json
 import openai
 import os
+from rich.console import Console
+from rich.markdown import Markdown
 import sqlite_utils
 import sys
 import warnings
@@ -50,7 +52,10 @@ def cli():
     type=int,
 )
 @click.option("--code", is_flag=True, help="System prompt to optimize for code output")
-def chatgpt(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_id):
+@click.option("--rich", "-r", is_flag=True, help="Format Markdown output as rich text")
+def chatgpt(
+    prompt, system, gpt4, model, stream, no_log, code, _continue, chat_id, rich
+):
     "Execute prompt against ChatGPT"
     if prompt is None:
         # Read from stdin instead
@@ -62,6 +67,8 @@ def chatgpt(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_i
         raise click.ClickException("Cannot use --code and --system together")
     if code:
         system = CODE_SYSTEM_PROMPT
+    if rich and stream:
+        raise click.ClickException("Cannot use --rich and --stream together")
     messages = []
     if _continue:
         _continue = -1
@@ -107,7 +114,12 @@ def chatgpt(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_i
             log(no_log, "chatgpt", system, prompt, content, model, chat_id)
             if code:
                 content = unwrap_markdown(content)
-            print(content)
+            if rich:
+                console = Console()
+                markdown = Markdown(content)
+                console.print(markdown)
+            else:
+                print(content)
     except openai.error.OpenAIError as ex:
         raise click.ClickException(str(ex))

@simonw
Owner

simonw commented Jun 14, 2023

I tried adding Rich table support to llm logs and got this:

[Screenshot: CleanShot 2023-06-14 at 08 17 06@2x]

Note that's with the new --truncate option - without that I get:

[Image]

I'm not sure the table is the best way to display this. For comparison, here's what it does with JSON at the moment:

$ llm logs  -t
[
  {
    "rowid": 194,
    "provider": "chatgpt",
    "system": null,
    "prompt": "Python code to reverse a string, with extensive explanation in markdown",
    "response": "## Reverse a String in Python\n\nThere are several ways to reverse a string in Python. In this arti...",
    "model": "gpt-3.5-turbo",
    "timestamp": "2023-06-14 06:54:47.692139",
    "chat_id": null
  },
  {
    "rowid": 193,
    "provider": "chatgpt",
    "system": null,
    "prompt": "Python code to reverse a string, with extensive explanation",
    "response": "# Method 1: Using slicing operators\n# Slicing operators allow us to access substrings of a given ...",
    "model": "gpt-3.5-turbo",
    "timestamp": "2023-06-14 06:53:37.627879",
    "chat_id": null
  },
  {
    "rowid": 192,
    "provider": "chatgpt",
    "system": null,
    "prompt": "Python code to reverse a string, with extensive explanation",
    "response": "Here's a simple Python function that reverses a given string:\n\n```python\ndef reverse_string(s: st...",
    "model": "gpt-3.5-turbo",
    "timestamp": "2023-06-14 06:50:39.285275",
    "chat_id": null
  }
]

@simonw
Owner

simonw commented Jun 14, 2023

Here's that updated prototype:

diff --git a/llm/cli.py b/llm/cli.py
index 2ed4d8b..5b74959 100644
--- a/llm/cli.py
+++ b/llm/cli.py
@@ -4,6 +4,9 @@ import datetime
 import json
 import openai
 import os
+from rich.console import Console
+from rich.markdown import Markdown
+from rich.table import Table
 import sqlite_utils
 import sys
 import warnings
@@ -50,7 +53,10 @@ def cli():
     type=int,
 )
 @click.option("--code", is_flag=True, help="System prompt to optimize for code output")
-def chatgpt(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_id):
+@click.option("--rich", "-r", is_flag=True, help="Format Markdown output as rich text")
+def chatgpt(
+    prompt, system, gpt4, model, stream, no_log, code, _continue, chat_id, rich
+):
     "Execute prompt against ChatGPT"
     if prompt is None:
         # Read from stdin instead
@@ -62,6 +68,8 @@ def chatgpt(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_i
         raise click.ClickException("Cannot use --code and --system together")
     if code:
         system = CODE_SYSTEM_PROMPT
+    if rich and stream:
+        raise click.ClickException("Cannot use --rich and --stream together")
     messages = []
     if _continue:
         _continue = -1
@@ -107,7 +115,12 @@ def chatgpt(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_i
             log(no_log, "chatgpt", system, prompt, content, model, chat_id)
             if code:
                 content = unwrap_markdown(content)
-            print(content)
+            if rich:
+                console = Console()
+                markdown = Markdown(content)
+                console.print(markdown)
+            else:
+                print(content)
     except openai.error.OpenAIError as ex:
         raise click.ClickException(str(ex))
 
@@ -150,7 +163,30 @@ def logs(count, path, truncate):
         for row in rows:
             row["prompt"] = _truncate_string(row["prompt"])
             row["response"] = _truncate_string(row["response"])
-    click.echo(json.dumps(list(rows), indent=2))
+
+    # JSON: click.echo(json.dumps(list(rows), indent=2))
+    table = Table()
+    table.add_column("rowid")
+    table.add_column("provider")
+    table.add_column("system")
+    table.add_column("prompt")
+    table.add_column("response")
+    table.add_column("model")
+    table.add_column("timestamp")
+    table.add_column("chat_id")
+    for row in rows:
+        table.add_row(
+            str(row["rowid"]),
+            row["provider"],
+            row["system"],
+            row["prompt"],
+            row["response"],
+            row["model"],
+            row["timestamp"],
+            row["chat_id"],
+        )
+    console = Console()
+    console.print(table)
 
 
 def _truncate_string(s, max_length=100):

@simonw
Owner

simonw commented Jun 14, 2023

Here's another example using rich.print_json() for the log output:

[Image]

@sderev
Contributor

sderev commented Jun 14, 2023

> I just tried an experiment using Rich and a new --rich option:
>
> Annoyingly I don't think I can support --stream at the same time, so you have to wait for the output to finish.

If this is still a problem, note that I solved this issue in my project ShellGenius. In particular, see the function rich_markdown_callback() in cli.py, and how it's used in the request to ChatGPT. This function helps to deal with the issue you're having by updating a live markdown display with chunks of text received from the API in stream.

Here's the basic structure of that function:

def rich_markdown_callback(chunk: str) -> None:
    """
    Update the live markdown display with the received chunk of text from the API.

    Args:
        chunk (str): A chunk of text received from the API.
    """
    global live_markdown, live_markdown_text
    live_markdown_text += chunk
    live_markdown = Markdown(live_markdown_text)
    live.update(live_markdown)

This function is then used in the request to the ChatGPT API in chatgpt_request() located in gpt_integration.py.

I wanted to expand this project to create something very similar to llm. So let me know if I can be of any help in development :). I would gladly work on it.

@simonw
Owner

simonw commented Jun 14, 2023

Huh! That's really cool - yeah, I think I could support --stream after all, since Rich knows how to clear the console.

Found your code for that here: https://github.com/sderev/shellgenius/blob/v0.1.8/shellgenius/cli.py

Relevant Rich docs: https://rich.readthedocs.io/en/stable/live.html

@simonw
Owner

simonw commented Jun 14, 2023

This is a shame: rich.markdown doesn't support tables yet (which ChatGPT and friends like to include in their markdown output):

I demonstrated that to myself with:

sqlite-utils ~/Dropbox/Development/datasette.io/content.db 'select name, type from sqlite_master limit 10' --fmt github | python -m rich.markdown

@simonw
Owner

simonw commented Jun 14, 2023

Got an animated Live demo working with this code:

diff --git a/llm/cli.py b/llm/cli.py
index 37dd9ed..c6ed001 100644
--- a/llm/cli.py
+++ b/llm/cli.py
@@ -4,6 +4,9 @@ import datetime
 import json
 import openai
 import os
+from rich.console import Console
+from rich.live import Live
+from rich.markdown import Markdown
 import sqlite_utils
 import sys
 import warnings
@@ -86,18 +89,19 @@ def openai_(prompt, system, gpt4, model, stream, no_log, code, _continue, chat_i
     try:
         if stream:
             response = []
-            for chunk in openai.ChatCompletion.create(
-                model=model,
-                messages=messages,
-                stream=True,
-            ):
-                content = chunk["choices"][0].get("delta", {}).get("content")
-                if content is not None:
-                    response.append(content)
-                    print(content, end="")
-                    sys.stdout.flush()
-            print("")
-            log(no_log, "openai", system, prompt, "".join(response), model, chat_id)
+            md = Markdown("")
+            with Live(md) as live:
+                for chunk in openai.ChatCompletion.create(
+                    model=model,
+                    messages=messages,
+                    stream=True,
+                ):
+                    content = chunk["choices"][0].get("delta", {}).get("content")
+                    if content is not None:
+                        response.append(content)
+                        live.update(Markdown("".join(response)))
+                print("")
+                log(no_log, "openai", system, prompt, "".join(response), model, chat_id)
         else:
             response = openai.ChatCompletion.create(
                 model=model,

[Image: rich]

simonw added a commit that referenced this issue Aug 19, 2023
simonw added a commit that referenced this issue Aug 21, 2023
juftin linked a pull request on Sep 14, 2023 that will close this issue
@dzmitry-kankalovich

What I ended up doing is adding this simple bash function to my .zshrc:

function gpt {
  local input="$*"
  llm -m gpt-4-1106-preview -s 'Answer as short and concise as possible' ${input} | glow
}

and then the result is this:
[Screenshot 2023-11-09 at 16 05 25]

Not ideal - streaming of LLM output is not working, so you gotta wait for LLM to finish response generation - but good enough for now.

@gianlucatruda

Update: This can be closed once #571 is merged.
The --rich flag enables markdown rendering via rich and supports streaming in both direct and chat modes.

@jimmybutton

jimmybutton commented Oct 2, 2024

I'm not an expert regarding plugin architecture, but I was wondering if the code that deals with streaming output could be exposed as a plugin hook. basically this part:

# llm/cli.py L278-280, L443-445
for chunk in response:
    print(chunk, end="")
    sys.stdout.flush()

Markdown formatting of the streamed output could then be handled by a plugin, think like llm-rich. This would have the benefit of not needing to include rich as a dependency in the core project.

It could also enable other use cases. For example I could imagine a plugin that allows streaming output directly to files in the current workspace.
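
To make the idea a bit more concrete, here is a rough sketch of what such a hook could look like. Everything in it is hypothetical: `StreamRenderer`, `PlainRenderer` and `stream_response` are names made up for illustration and are not part of llm's plugin API. The default renderer just mirrors the print-and-flush loop quoted above.

```python
import sys
from typing import Iterable, List, Protocol, TextIO


class StreamRenderer(Protocol):
    """Hypothetical interface that a rendering plugin could implement."""

    def feed(self, chunk: str) -> None: ...

    def finish(self) -> None: ...


class PlainRenderer:
    """Default behaviour: write each chunk and flush, like the current loop."""

    def __init__(self, out: TextIO = sys.stdout) -> None:
        self.out = out

    def feed(self, chunk: str) -> None:
        self.out.write(chunk)
        self.out.flush()

    def finish(self) -> None:
        # Terminate the output with a newline once the stream ends.
        self.out.write("\n")
        self.out.flush()


def stream_response(chunks: Iterable[str], renderer: StreamRenderer) -> str:
    """Pass every chunk to the renderer and return the full text for logging."""
    parts: List[str] = []
    for chunk in chunks:
        parts.append(chunk)
        renderer.feed(chunk)
    renderer.finish()
    return "".join(parts)
```

Under this assumption, a Rich-based plugin would implement `feed()` by updating a live Markdown display, and a file-writing plugin would append each chunk to a file, without the core project depending on either.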

@jimmybutton

jimmybutton commented Nov 21, 2024

If anyone is interested, I was able to get this to work outside of llm with a small script using rich (inspired by this comment).

# render_streamed_markdown.py
import sys

from rich.console import Console
from rich.live import Live
from rich.markdown import Markdown

def main():
    console = Console()
    md = ""

    with Live(Markdown(""), console=console, refresh_per_second=10) as live:
        while True:
            chunk = sys.stdin.read(1)
            if not chunk:
                break
            
            md += chunk
            live.update(Markdown(md))

if __name__ == "__main__":
    main()

Run it like this:

llm "showcase a few key features of markdown. keep it short." | python render_streamed_markdown.py

There is one drawback though. It only streams until the terminal height is reached and then displays ellipses. Using vertical_overflow="visible" argument of Live() doesn't work, as it's not possible to live update what has already scrolled out of the terminal, see this issue.

One workaround could be to use regular console.print once a paragraph or logical block of markdown is completed, and use Live only for the last part that is still being streamed.
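
A minimal sketch of that workaround, under some simplifying assumptions: a block is treated as finished at a blank line, unless the blank line is inside a backtick-fenced code block that is still open. `split_completed` and `FENCE` are names invented for this sketch. Finished blocks are printed permanently through `live.console`, which Rich displays above the live region, and only the unfinished tail stays in `Live`.

```python
import sys
from typing import Tuple

FENCE = "`" * 3  # opening/closing marker of a fenced code block


def split_completed(buffer: str) -> Tuple[str, str]:
    """Split buffer into (finished_blocks, unfinished_tail).

    The cut happens after the last blank line that is not inside an
    open fenced code block.
    """
    in_fence = False
    cut = 0  # index just past the last safe blank line
    pos = 0
    for line in buffer.splitlines(keepends=True):
        pos += len(line)
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence and line.strip() == "" and line.endswith("\n"):
            cut = pos
    return buffer[:cut], buffer[cut:]


def main() -> None:
    # Rich is imported here so the helper above stays usable on its own.
    from rich.console import Console
    from rich.live import Live
    from rich.markdown import Markdown

    console = Console()
    tail = ""
    with Live(Markdown(""), console=console, refresh_per_second=10) as live:
        while chunk := sys.stdin.read(1):
            done, tail = split_completed(tail + chunk)
            # Shrink the live region first, then commit the finished part.
            live.update(Markdown(tail))
            if done:
                live.console.print(Markdown(done))
```

To try it, end the file with a call to `main()` and pipe llm output into it as with the script above. Known limitations of this sketch: each finished block is rendered on its own, so a numbered list with blank lines between its items may restart its numbering, and `~~~` fences are not recognised.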

@tchklovski

@jimmybutton thank you for the script! Very useful.
Another approach I've been using (though decidedly not in the terminal) is to pipe the output to a file that Marked 2 (on Mac) is watching:

touch tmp.md; open -ga "Marked 2" tmp.md
llm "showcase a few key features of markdown. keep it short. include the blocks explaining the format and actual examples without blocks" | tee tmp.md

@tchklovski

tchklovski commented Nov 22, 2024

just for fun squeezed your script down to 8 lines:

import sys
from rich.console import Console
from rich.live import Live
from rich.markdown import Markdown

with Live(md := "", console=Console(), refresh_per_second=10) as live:
    while chunk := sys.stdin.read(1):
        live.update(Markdown(md := md + chunk))

@superuser7777

Is it possible to pipe the output of the chat command to rich?
Currently, I am using the -c option for single questions piped with rich.

@Explosion-Scratch

Got it working with this script:

#!/usr/bin/python3

import sys
from rich.console import Console
from rich.live import Live
from rich.markdown import Markdown

def main():
    console = Console()
    md = ""
    with Live(Markdown(""), console=console, refresh_per_second=10) as live:
        while True:
            chunk = sys.stdin.read(1)
            if not chunk:
                break
            md += chunk
            live.update(Markdown(md))

if __name__ == "__main__":
    main()

Just place it somewhere in a PATH dir as an executable file named rendermd, then use it like llm "Write a javascript function to add two numbers" | rendermd

It does it in realtime!

@superuser7777

Yahhh, that's great code to render markdown with streaming.

I'm still an analog person.

first session
llm "$@" --system "$system_prompt" --model gemini-2.0-flash-exp |& rich --markdown --emoji --left --hyperlinks --theme monokai -

next session
llm -c "$@" --system "$system_prompt" --model gemini-2.0-flash-exp |& rich --markdown --emoji --left --hyperlinks --theme monokai -

Come to think of it, the --rich option seems to have disappeared...
I opened this thread with high hopes.

@luebken

luebken commented Jan 21, 2025

As we have the response in the logs.db, we can also render with a separate subsequent command. e.g. with the script from #12 (comment)

alias llmf="sqlite3 <LLM_USER_DIR>/logs.db 'SELECT response FROM responses ORDER BY id DESC limit 1;' | python rendermd.py"
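
The same lookup can be sketched in Python with the standard `sqlite3` module, in case the `sqlite3` command-line tool is not available. The table and column names (`responses`, `id`, `response`) are taken from the alias above. `latest_response` is a made-up helper name, and the path to logs.db is left for the caller to supply.

```python
import sqlite3


def latest_response(db_path: str) -> str:
    """Return the most recent response text, or "" if the table is empty."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT response FROM responses ORDER BY id DESC LIMIT 1"
        ).fetchone()
    finally:
        conn.close()
    # fetchone() returns None when there are no rows.
    return row[0] if row and row[0] is not None else ""
```

Printing the returned string and piping it into the render script gives the same result as the alias.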

@gianlucatruda

Based on @Explosion-Scratch's comment (which in turn seems to draw on previous suggestions in this thread), I implemented richify. It's the same core functionality, but with some tweaks to the styling of the output and uv's script runner mode to automatically install and isolate dependencies -- making it much nicer and simpler to run as a standalone script.

Feedback and contributions are actively welcomed! Thanks for all the helpful discussion in this thread so far.

[Image]

Overall, it feels weird that @simonw still hasn't even acknowledged #278 or #571 that add this functionality directly to llm (including in chat mode), but at least we have our workarounds.

10 participants