This repo contains a very early, pre-alpha proof of concept of a children's story generator. It includes:
- A children's story generator app that generates text and images for stories. This is a set of Python scripts, with the main method in app/story_prompt_v2.py.
- A very basic content management tool that allows the user to create, edit and publish stories (story-viz)
- A very, very basic web app that allows the user to view the stories generated by the app as a slideshow at https://www.storytime.glass/ (react_slideshow). The original use case is an e-picture frame that updates regularly with new stories. We could also build a tablet app for kids to read these stories more interactively.
- Some other random experiments in serving models using https://beam.cloud/ (beam-experiments)
The story generator is a simple app that generates a story using a random sample from options on various aspects of a children's story, such as story structure, tone, etc. It then generates a setting and characters, and then generates a story using the selected and generated text. Each paragraph is then converted to an image prompt, and four images are generated from each prompt. All of this information is uploaded to S3, where it can be further processed by the content management tool.
Going forward we will move the story generator into the content management tool, where we will be able to generate, regenerate and manually edit text and image prompts, regenerate images, and update selections regarding story structure, tone, etc. We will also be able to store characters and settings in a library for use in later stories.
We have a lot of work to do on generating stories with good style, appropriate for specific ages, with variable tone, and with interesting ideas and plots. We also have a lot of work to do on visual continuity, prompt fidelity, and avoiding weird and disturbing image artifacts like extra limbs.
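As a rough illustration of the non-model parts of this flow, the sketch below samples one option per story aspect and assembles a record with one page per paragraph and four image slots per page. The option lists, function names, record fields and key layout are all assumptions made for illustration; they are not taken from app/story_prompt_v2.py, and the text and image generation calls are left out entirely.

```python
import random

# Hypothetical option lists; the real options live in the repo's scripts.
STORY_OPTIONS = {
    "structure": ["hero's journey", "cumulative tale", "problem and solution"],
    "tone": ["gentle", "silly", "adventurous"],
    "age_range": ["3-5", "6-8"],
}

IMAGES_PER_PROMPT = 4  # the app generates four images per image prompt


def sample_story_options(options, rng):
    """Pick one value at random for each story aspect."""
    return {aspect: rng.choice(choices) for aspect, choices in options.items()}


def split_paragraphs(story_text):
    """Split generated story text into non-empty paragraphs."""
    return [p.strip() for p in story_text.split("\n\n") if p.strip()]


def build_story_record(story_id, selections, story_text):
    """Assemble one page per paragraph, each with four image slots."""
    pages = []
    for index, paragraph in enumerate(split_paragraphs(story_text)):
        pages.append({
            "page": index,
            "text": paragraph,
            "image_prompt": None,  # to be filled in by the image-prompt step
            # Hypothetical storage keys, one per generated image.
            "image_keys": [
                f"stories/{story_id}/page_{index}/image_{n}.png"
                for n in range(IMAGES_PER_PROMPT)
            ],
        })
    return {"story_id": story_id, "selections": selections, "pages": pages}
```

Passing a seeded `random.Random` instance as `rng` makes a given set of selections reproducible, which may help when comparing regenerated stories.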
The content management tool allows a human to edit and publish a story. It presently only allows the user to select which of the four generated images to use in the story, and then publish it to a user-specific slideshow that shows each page in the story for 10 minutes (by default). It will soon allow the user to:
- select the story structure, setting, character, style and tone
- regenerate the entire story using new settings
- regenerate paragraphs
- regenerate image prompts
- select image styles
- regenerate images from prompts
Eventually this tool will also allow the user to maintain a library of stories, characters and settings, and to share them with other users. We may also build a recommendation system to suggest other people's stories based on interests, etc., and even generate stories interactively with kids.
By using a slideshow-type app, we aim to avoid over-stimulating kids with visually exciting and overwhelming content. Rather, the story slowly unfolds over the day for the child. We envision the physical form factor to be an e-picture frame that updates regularly with new stories. We could also build an e-reader app that allows the kid to build stories interactively with the AI by choosing what happens page by page through voice and then generating and editing images. We envision this to be a fun activity to do alone or with a parent.
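The slow pacing comes down to simple arithmetic: with each page shown for a fixed interval (10 minutes by default), the page to display follows from the time elapsed since the story started. The sketch below is written in Python for illustration, although the slideshow itself lives in react_slideshow. The function name and the choice to loop back to the first page after the last one are assumptions, not the app's confirmed behavior.

```python
DEFAULT_PAGE_SECONDS = 10 * 60  # 10 minutes per page, the stated default


def current_page_index(elapsed_seconds, page_count,
                       page_seconds=DEFAULT_PAGE_SECONDS):
    """Return the zero-based page to show after `elapsed_seconds`.

    Assumes the slideshow wraps around to the first page once the
    last page has been shown.
    """
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    if page_seconds <= 0:
        raise ValueError("page_seconds must be positive")
    return int(elapsed_seconds // page_seconds) % page_count
```

For example, a five-page story at the default interval takes 50 minutes to show once before it starts over under this wrapping assumption.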
https://beam.cloud is a great serverless GPU platform that makes it easy to create model endpoints that scale to zero, and allows the user to experiment with open models (or their own) without a large local GPU. The beam-experiments directory has several example endpoints that we have used to explore different models.