Post-711

Barrett Burnworth

ā˜ļø What's up?

Need to think of how to mash everything together into some viewable format. I thought an HTML slideshow would be slick, with the music/sounds playing and images auto-scrolling. MP4 makes more sense in my head though and seems easier to tackle for some reason. If MP4: how to mash things together, in order, with the sounds coming in at the right time, the images transitioning, etc., programmatically?
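One possible take on the MP4 route is to let ffmpeg do the mashing: each still image becomes a short clip, clips are crossfaded in order, and each sound is delayed to its cue time before being mixed in. The sketch below only builds the argument list; it does not run anything. `SoundCue` and `build_ffmpeg_cmd` are made-up names for illustration, and it assumes an ffmpeg build that includes the `xfade` filter and images that all share the same dimensions.

```python
from dataclasses import dataclass


@dataclass
class SoundCue:
    path: str       # audio file to mix in (hypothetical file name)
    start_s: float  # when the sound should come in, in seconds


def build_ffmpeg_cmd(images, sounds, out_path="out.mp4", per_image_s=5.0, fade_s=1.0):
    """Build (but do not run) an ffmpeg argument list for a crossfaded slideshow."""
    if not images:
        raise ValueError("need at least one image")
    if fade_s >= per_image_s:
        raise ValueError("fade must be shorter than the time each image is shown")

    cmd = ["ffmpeg", "-y"]
    for img in images:  # each still becomes a short looping video input
        cmd += ["-loop", "1", "-t", str(per_image_s), "-i", img]
    for cue in sounds:
        cmd += ["-i", cue.path]

    filters = []
    video = "0:v"
    for i in range(1, len(images)):
        # Each crossfade starts fade_s before the running output ends.
        offset = i * (per_image_s - fade_s)
        filters.append(
            f"[{video}][{i}:v]xfade=transition=fade:duration={fade_s}:offset={offset}[v{i}]"
        )
        video = f"v{i}"

    audio = None
    for j, cue in enumerate(sounds):
        ms = int(round(cue.start_s * 1000))  # adelay takes milliseconds per channel
        filters.append(f"[{len(images) + j}:a]adelay={ms}|{ms}[a{j}]")
        audio = f"a{j}"
    if len(sounds) > 1:
        inputs = "".join(f"[a{j}]" for j in range(len(sounds)))
        filters.append(f"{inputs}amix=inputs={len(sounds)}:duration=longest[aout]")
        audio = "aout"

    if filters:
        cmd += ["-filter_complex", ";".join(filters)]
    cmd += ["-map", f"[{video}]" if len(images) > 1 else "0:v"]
    if audio:
        cmd += ["-map", f"[{audio}]"]
    # Cap the output at the slideshow length so long sounds do not extend it.
    total_s = len(images) * per_image_s - (len(images) - 1) * fade_s
    cmd += ["-t", str(total_s), "-c:v", "libx264", "-pix_fmt", "yuv420p", out_path]
    return cmd
```

The returned list could be handed to `subprocess.run(cmd, check=True)` once the image and sound files exist. Keeping the timing in plain numbers (seconds per image, fade length, cue times) means the same timeline could drive an HTML slideshow instead.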

Summary and Sound Prompting

Barrett Burnworth

🚢 Shipits

First shot at prompting for summary, and subsequent generated sounds. A friend told me this story the day after I wrote up the initial idea. The one thing I left out is him drawing arrowheads.

Please summarize this and suggest image sequences to create a short 1 minute video. Please suggest image descriptions to send to a generative AI system. Please suggest relevant ambient sounds to be used as background noises. These sounds will also be sent to an AI system to be generated based on your description. Please suggest a music or score to be played during the video. This will also be sent to a generative ai system. The things that will be sent to the system should be described in a way that the system can generate appropriate artifacts. The scene to illustrate is: I am sitting at my desk. Tons of incense is around, I light one. My desk lamp is on. I have a record player behind me. I am not looking at my phone. It is put away.

Sure, here is a summary of the scene you described:

• A person is sitting at a desk in a dimly lit room.

• The person is surrounded by incense smoke.

• The person lights an incense stick.

• The person’s desk lamp is on.

• There is a record player behind the person.

• The person is not looking at their phone, which is put away.

Here are some image sequences that could be used to create a short 1-minute video of this scene:

• A wide shot of the room, showing the person sitting at the desk.

• A close-up of the person’s face, showing them lighting the incense stick.

• A shot of the incense smoke filling the room.

• A shot of the person’s desk, showing the desk lamp and the record player.

• A close-up of the person’s phone, which is put away.

Here are some image descriptions that could be sent to a generative AI system to create the images:

• A dimly lit room with incense smoke filling the air.

• A person sitting at a desk, lighting an incense stick.

• A close-up of the incense smoke, swirling and rising.

• A desk lamp casting a warm glow on the desk.

• A record player in the background, waiting to be played.

• A person’s phone, put away and out of sight.

Here are some ambient sounds that could be used as background noises in the video:

• The sound of incense burning.

• The sound of the person’s breathing.

• The sound of the desk lamp clicking on.

• The sound of the record player spinning.

• The sound of silence.

Here is some music or score that could be played during the video:

• A slow, calming piece of music.

• A piece of music that evokes a sense of peace and tranquility.

• A piece of music that is perfect for relaxing and de-stressing.

I hope this helps!
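Since the response comes back as headed bullet lists, it could be split into sections before each piece is passed along to an image or audio generator. A minimal sketch, assuming the response keeps this exact shape (a line ending in a colon, followed by "•" bullets); `parse_sections` is a hypothetical helper name, not part of any existing tool.

```python
def parse_sections(response: str) -> dict[str, list[str]]:
    """Group "•" bullets under the most recent line that ends with a colon."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in response.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("•"):
            if current is not None:
                sections[current].append(line.lstrip("•").strip())
        elif line.endswith(":"):
            current = line[:-1]
            sections[current] = []
        else:
            # Any other prose line (e.g. a sign-off) closes the current section.
            current = None
    return sections
```

Each list could then be looped over, sending one description at a time to the relevant generator. A different model, or even a different run, may format its answer differently, so asking for a fixed output format in the prompt would make this less brittle.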

Artifact Prompting for Projects

Barrett Burnworth

🚢 Shipits

First shot at prompting for project ‘artifacts’ with bard.google.com. Well, second, but in the first prompt I used the word ‘debris’ and it gave me a list of stuff like sawdust! This would be useful for generating things for projects, where the person is just posting about their project and not really trying to describe it for cinematic effect.

A ‘Cinematic’ Writing Tool

Barrett Burnworth

💡 Ideas

A slight diversion, nigh but a short tangent, for a Spoke feature set marketed toward writers/creatives. The inverse of having AI bots create content is to have the AI bots enhance content. This opens up the path for the evolution of a new way for creatives to develop their vision. Also, why not hitch a ride on the AI hype train while it's still hot? We can get in some RW marketing, Partner marketing, Spoke marketing and enhance the process of creative exploration all at once. It will be optional, of course, for any specific post or project. It will not be a fit for every project.

Spoke AI could make project visions come alive.

Side Note: This could be a strong creative tool for anyone. If we create an amazing tool for creatives, but do not offer ways to keep certain ideas private, they will seek other tools to meet this need. We should respect the privacy of users and let them keep things private. Keep them on the platform for both open and private efforts.

There is a future, a very near future, where writing a journal entry, blog post, or similar will be very close to the concept of directing a movie. A future (I do take this to the eventual extreme) where a written post is immediately turned into an experience complete with generated sights, sounds and eventually sensations like touch, smells and feelings of the scene. All of this could be generated from embedded context. A future where the writer can direct all of this using the pen (or keyboard). This future will be amazing, and part of it is already here. Spoke can feasibly unlock the ability to enhance each post with AI-generated content now.

“A director is one who makes the scenes of a film come to life… manages the artistic aspects… and coordinates with the cast and crew….”

A Cinematic Writing Tool gives writers the ability to be a pseudo-director in near real-time by bringing their vision to life immediately with AI. Sure, things may not align perfectly with what the writer imagined (cue the Texas Instruments AI Brainwave Interpreter Hat here). Sure, things may need to be tweaked. However, the potential for a story to instantly be brought forth to multiple mediums is very compelling. This opens a single story to many different audiences in one pass. Marketing, ideation, evaluation, creative debris: these things are all part of the creative process. Spoke AI can be a second, third or fourth pair of hands to help craft someone’s vision. The reality of most projects is that the end result is often not the same as the vision we have in our heads. We still create. A CWT offers a new way to create. A faster product to market. More potential futures to sift through. More ideas to consider. A new way to breathe life into ideas.

Grand View

We are used to traditional books in our hands and movies on a flat screen with no intelligence behind it. These things are presented to us as-is and do not change. Ultimately, AI-backed entertainment has the potential to disrupt these mediums in ways we cannot imagine. The concept of loading a project into a database, letting AI render the scenes on-the-fly, and giving users the power to manipulate the scenes within the boundaries of the storyline opens up unheard-of freedoms for the viewer. How about changing a character’s appearance, or the environment, at will? How about moving a scene from the desert to a snowy landscape and keeping everything else the same? How about viewing everything from a different character’s point of view? The ability to change a character’s race, language, culture (smells!) on-the-fly is going to enable stories to connect across cultures like never before. These things all seem possible with AI, at some level.

Other Thoughts (messy draft format)

This tool dovetails nicely with Spoke’s current trajectory regarding project blogs. We could use this tool to scope out someone’s idea. This tooling could be used to create marketing materials for a project. It could be used to convert a project to a story format, with images, sounds, etc. The Cinematic Writing Tool would simply imagine a future, or multiple futures, where this is reality. What better ideation or marketing material development tool could one ask for?

Technical (not really) Implementation Ideas

Contextual Input Options:

  • Notes (Hand Written, Photographed, Uploaded & AI Parsed)

  • Spoke Posts (Overall Project Posts)

  • Spoke Context Addons (Contextual addons to each project post)

  • Images (Uploaded, with AI descriptions extracted, e.g. via MiniGPT-4)

  • Audio Recordings (Extract text from speech; audio context from background noise)

  • User Prompts

All of these things could be used in conjunction with each other to create a scene. The user can pick and choose what is included as context. The user can tweak these things over time and regenerate things. The user can store copies of the generated content in a debris folder. The user can export the generated content for their own needs.
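To make the pick-and-choose idea concrete, here is a minimal sketch of how the context sources might be held and combined into one block of text for a generation prompt. Everything in it (`ContextItem`, `build_scene_context`, the source labels) is hypothetical and not an existing Spoke API; it also assumes each source has already been reduced to text (OCR for notes, a caption for images, a transcript for audio).

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str           # e.g. "notes", "post", "addon", "image", "audio", "prompt"
    text: str             # text already extracted from that source
    enabled: bool = True  # the user can toggle each item on or off


def build_scene_context(items):
    """Join only the enabled items, labelled by source, in the order given."""
    return "\n".join(f"[{item.source}] {item.text}" for item in items if item.enabled)


items = [
    ContextItem("post", "I am sitting at my desk. My desk lamp is on."),
    ContextItem("image", "A record player on a shelf.", enabled=False),
    ContextItem("prompt", "Keep the mood calm."),
]
scene_context = build_scene_context(items)
```

Flipping `enabled` and calling `build_scene_context` again is all that regeneration would need on the context side; the earlier output could be copied to the debris folder before it is replaced.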

The pieces are already there. AI can generate sounds based on descriptions. AI can generate images (and video, but currently meh) from descriptions. AI can generate realistic human voice-overs. Figuring out how to mesh the results together may be the tricky part.

Allow the user to choose which contextual attributes are used. Three possible sources of context: the user adds context manually, Spoke pulls context from the post, or Spoke records audio around the user and stores it as context to play back.

Compression is an interesting option. There are examples of LLMs compressing content, essentially creating a unique language, that they can then unpack to get very close to the original content. This would be an interesting way to store contextual data (eyeballs audio content) which could be ‘expanded’ later to generate audio for the scene. I am not sure of a way to extract and describe noises within an audio clip, but this is along the lines of visual feature extraction, so I am sure there is a solution.

Relevant Links:

Text to Audio: https://huggingface.co/spaces/haoheliu/audioldm-text-to-audio-generation

Image to Description: https://minigpt-4.github.io/

Export to HTML / SVG / MP4 / ETC?

Generated Output Ideas

  • YouTube Shorts

  • HTML Slideshows w/music (insert into blog?)

  • Audio/Video montage

  • Voiceover Narratives

  • Sound-enhanced Storylines

  • Project-related ‘debris’

These would be configurable. Each project could have defaults.

Marketing Play

Partners???

  • Jekyll is already integrated into Spoke.

  • MAJR - This is similar to MAJR, although the inverse in that it creates sort of ‘debris’ of a person’s vision.

  • Automattic - Day One could use our API to enhance their journal entries.

  • RedwoodJS

  • AI Hype Bandwagon (Splashy)

  • Spoke Growth

  • Spoke Feature Enhancement

Early Prompt Ideas

Summarize post and suggest image sequence in the style of _____ artist/director … and extract imagery descriptions in the style of ____ artist … and suggest score in the style of ___ and sounds in the style of _____ theme or movie

“One-shot prompting” - use an example prompt to prime the LLM to respond in the desired fashion. E.g., if a project is sent to the generator: “Please suggest typical debris that a ____ type project might have. Use this list of debris for a web application as a template: ‘marketing materials, code snippets, application screenshots, customer reviews & testimonials, customer support, etc.’”
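The one-shot idea above could be wrapped in a small template function so the blank is filled per project. This is only a sketch: `one_shot_debris_prompt` and `WEB_APP_DEBRIS` are names made up here, and the wording is lifted from the example prompt rather than tested against any particular model.

```python
# Example list taken from the web-application template in the post.
WEB_APP_DEBRIS = [
    "marketing materials",
    "code snippets",
    "application screenshots",
    "customer reviews & testimonials",
    "customer support",
]


def one_shot_debris_prompt(project_type, example_type="web application",
                           example_items=WEB_APP_DEBRIS):
    """Fill the blank in the one-shot prompt and attach the example list."""
    example = ", ".join(example_items)
    return (
        f"Please suggest typical debris that a {project_type} type project might have. "
        f"Use this list of debris for a {example_type} as a template: {example}."
    )


prompt = one_shot_debris_prompt("woodworking")
```

Swapping in a different `example_type` and `example_items` would let each project category carry its own priming example, which fits the presets idea.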

Suggest length of montage, styles, etc.

Presets: Prompt presets for predefined patterns for music, score, etc.; Prompts based on user styles & previous themes

Rankings: Create a leaderboard with views/counts/votes/comments etc.

Dead Ends?

What constitutes a scene? How is the length of a scene, the transitions between visuals or auditory changes, etc. determined?
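This is still an open question, but a naive baseline is easy to try: give every scene a minimum time on screen and share the rest of the video in proportion to how long its description is. The sketch below assumes a 60-second budget and uses word count as a stand-in for how much there is to show; both are arbitrary choices made here, and `allocate_durations` is a hypothetical name.

```python
def allocate_durations(descriptions, total_s=60.0, min_s=2.0):
    """Split total_s across scenes: min_s each, the rest weighted by word count."""
    if not descriptions:
        return []
    count = len(descriptions)
    if min_s * count > total_s:
        raise ValueError("too many scenes for the requested total length")
    weights = [max(1, len(text.split())) for text in descriptions]
    spare = total_s - min_s * count
    total_weight = sum(weights)
    return [min_s + spare * weight / total_weight for weight in weights]


durations = allocate_durations([
    "A wide shot of the room, showing the person sitting at the desk.",
    "A close-up of the incense smoke.",
])
```

A better answer would probably come from the content itself (for example, asking the LLM to suggest a duration per scene), with something like this as a fallback.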

Spoke Feature Needs

Draft Versions

Private Versions

Subscriber-only Content

Billing system (to accept payment for generated content)

Better editor - I don’t want to have to edit my drafts in another program, just to copy over to Spoke. I do like other programs, but Spoke can provide a better editing experience.

Future Features

Local version, leveraging APIs or local ML Models, could be kept completely private.

Bookmarks

Gotta have this in there! https://vcai.mpi-inf.mpg.de/projects/DragGAN/

https://www.promptingguide.ai/introduction/tips

Sandbox

Writers are not used to having a tool that can bring their vision to life in near real-time.

On the other side of the coin, brain waves being interpreted by AI could lead to some amazing neural journaling tools!