Background Music

Barrett Burnworth

☝️ What's up?

Based on MusicLM’s descriptions… even though I don’t have access to it. This is the prompt guide I am going with for now.

Prompt - Perplexity - GPT-4

Brainstorm music that would be appropriate for the following scene in a movie

“Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.”

Must: be rich captions such as the following:

“The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.”

or “A fusion of reggaeton and electronic dance music, with a spacey, otherworldly sound. Induces the experience of being lost in space, and the music would be designed to evoke a sense of wonder and awe, while being danceable.”

Must: brainstorm ideas and create a list.

Must: use a numbered list.

Must: only one list.

Should: no more than 2 items.

Should: at least 1 item.

Return it in JSON using the following format: [{ "Soundtrack": 1, "caption": "caption of the audio" }]

Result

Based on your preferences and the scene you described, I have brainstormed two music ideas that could be appropriate for this setting:

A blend of ambient and neoclassical music, featuring soothing piano melodies and soft strings, creating a calm and introspective atmosphere. This music would evoke a sense of tranquility and focus, allowing the listener to fully immerse themselves in their creative process while drawing at their desk. Example: “A peaceful piano and string composition, reminiscent of the works of Ludovico Einaudi or Ólafur Arnalds, that creates a serene and contemplative environment, perfect for drawing and introspection.”

A cinematic and atmospheric soundtrack, inspired by the works of composers like Hans Zimmer or Thomas Newman, that combines lush orchestral arrangements with subtle electronic elements. This music would create a sense of wonder and exploration, transporting the listener to a different world as they draw at their desk. Example: “An immersive and evocative soundtrack that fuses orchestral and electronic elements, reminiscent of the scores from movies like ‘Interstellar’ or ‘The Shawshank Redemption,’ designed to inspire creativity and transport the listener to a realm of imagination and discovery.”

Here is the JSON format you requested:


[
  {
    "Soundtrack": 1,
    "caption": "A peaceful piano and string composition, reminiscent of the works of Ludovico Einaudi or Ólafur Arnalds, that creates a serene and contemplative environment, perfect for drawing and introspection."
  },
  {
    "Soundtrack": 2,
    "caption": "An immersive and evocative soundtrack that fuses orchestral and electronic elements, reminiscent of the scores from movies like 'Interstellar' or 'The Shawshank Redemption,' designed to inspire creativity and transport the listener to a realm of imagination and discovery."
  }
]
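Since the reply above mixes ordinary prose with the JSON that was requested, the list has to be pulled out before anything else can use it. Here is a minimal sketch of one way to do that, using only Python's standard library. The function name extract_json_objects is made up for illustration, and the sketch assumes the reply contains at least one well-formed JSON array of objects.

```python
import json


def extract_json_objects(reply_text):
    """Return the first JSON array of objects found in a model reply.

    The reply may contain ordinary prose before or after the array.
    Hypothetical helper for illustration; not part of any library.
    """
    decoder = json.JSONDecoder()
    start = reply_text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(reply_text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list) and all(isinstance(v, dict) for v in value):
            return value
        # Not a usable array here; try the next opening bracket.
        start = reply_text.find("[", start + 1)
    raise ValueError("no JSON array of objects found in reply")
```

Calling it on the full reply text would give back a list of dictionaries, each with the "Soundtrack" and "caption" keys the prompt asked for, so each caption can be handed to the music generator one at a time.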

Thoughts

Not super happy with the musical choices generated by GPT; they seem too grandiose for the scene. The music generator, however, seemed to do a good job of generating what was asked of it. I can see this being a nice addition. Again, this would be something the user chooses as a background soundtrack for their clip or scene.

Background Noises


Prompt - Perplexity - GPT-4

Brainstorm ambient background noises that would be appropriate for the following scene: “Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.”

SOUNDS MUST BE CHRONOLOGICAL IN ORDER OF WHAT HAPPENS DURING THE SCENE.

SOUNDS SHOULD BE DETAILED AND ACTIONABLE FOR A SOUND GENERATOR TO CREATE SUCH AS THE FOLLOWING FORMAT: “A hammer hitting a metal sheet” OR “A bee buzzing around a quiet room”

Must: brainstorm ideas and create a list.

Must: use a numbered list.

Must: only one list.

Should: no more than 10 items.

Should: at least 3 items.

Return it in JSON using the following format: [{ "Sound": 1, "description": "description of the sound" }]

Copilot

Result:

[
  { "Sound": 1, "description": "Flick of the lamp switch, followed by a soft hum of the lamp" },
  { "Sound": 2, "description": "Record player needle gently touching the vinyl, creating a subtle crackling sound" },
  { "Sound": 3, "description": "Soft piano music playing in the background, creating a calm atmosphere" },
  { "Sound": 4, "description": "Scratch of a matchstick igniting, followed by the sizzle of the incense being lit" },
  { "Sound": 5, "description": "Gentle hiss of the burning incense, accompanied by occasional crackling" },
  { "Sound": 6, "description": "Sound of pencil or pen gliding across paper as the drawing begins" },
  { "Sound": 7, "description": "Occasional rustling of paper as the artist adjusts their drawing" },
  { "Sound": 8, "description": "Soft creaking of the wooden desk or chair as the artist moves" },
  { "Sound": 9, "description": "Distant, muffled sounds of the night outside, such as wind or distant traffic" },
  { "Sound": 10, "description": "Periodic deep breaths or sighs, reflecting the artist’s focus and relaxation" }
]
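The prompt asks for a single numbered list of 3 to 10 sounds, so a parsed reply could be checked against those constraints before any sound is generated. The sketch below is one possible check, not something the app currently does. The function name check_sound_list and its parameters are made up for illustration. It can only confirm that the numbering is sequential, not that the sounds really follow the order of the scene.

```python
def check_sound_list(items, min_items=3, max_items=10):
    """Return a list of problems found in a parsed sound list.

    An empty list means the reply matches the constraints in the prompt.
    Hypothetical helper; the limits default to the ones used in the prompt.
    """
    problems = []
    if not min_items <= len(items) <= max_items:
        problems.append(
            f"expected {min_items}-{max_items} items, got {len(items)}"
        )
    for position, item in enumerate(items, start=1):
        # The "Sound" number should match the item's place in the list.
        if item.get("Sound") != position:
            problems.append(
                f"item {position} is numbered {item.get('Sound')!r}"
            )
        # Every item needs a non-empty description for the generator.
        if not str(item.get("description", "")).strip():
            problems.append(f"item {position} has no description")
    return problems
```

If the returned list is not empty, the app could re-send the prompt or show the problems to the user instead of passing bad input to the sound generator.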

Thoughts

The general AI noise generator needs a lot of work… I’m not sure whether the prompts are bad, the generator is bad, or both. Some sounds are pretty good, but many have a weird hum. I really like the descriptions that GPT gave for the scene.

If this goes further, the user would need to be able to listen to the sounds and pick the ones that work.

Prompting Experiments


Bookmark for later

https://plasticlabs.ai/blog/Theory-of-Mind-is-All-You-Need/

I have the SK (Semantic Kernel) book creator example running. It works, and I like the way it interacts. It is a React frontend that communicates with an API running on Azure Functions. It uses OpenAI: it takes your input and generates a list of book ideas. You can select one of the book ideas, and it then generates chapters. Semantic Kernel uses ‘skills’: your input is sent to a skill, and the skill is then sent to OpenAI as a prompt. I am creating new skills based on the prompts for CWT, like below. The first step is to get the app to accept the input and feed it to my custom prompts. The second step will be to parse the returned data. The third is to send the results to another service to generate the sounds, music, or images.
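The three steps can be sketched roughly as follows. This is plain Python to show the flow, not Semantic Kernel code. The {{$input}} placeholder mirrors the variable used in Semantic Kernel prompt templates. Everything else is an assumption made for illustration: the shortened templates, the names PROMPTS, TEXT_KEYS, render_prompt, and run_pipeline, and the complete and generators callables that stand in for the model call and the downstream services.

```python
import json

# Hypothetical templates, shortened from the prompts in these posts.
# "{{$input}}" marks where the written scene is inserted.
PROMPTS = {
    "music": (
        "Brainstorm music that would be appropriate for the following scene "
        'in a movie: "{{$input}}" Return it in JSON using the following '
        'format: [{ "Soundtrack": 1, "caption": "caption of the audio" }]'
    ),
    "sound": (
        "Brainstorm ambient background noises that would be appropriate for "
        'the following scene: "{{$input}}" Return it in JSON using the '
        'following format: [{ "Sound": 1, "description": "description of the sound" }]'
    ),
    "image": (
        "Brainstorm images that would be appropriate for the following "
        'scene: "{{$input}}" Return it in JSON using the following format: '
        '[{ "Image": 1, "content": "content of the image" }]'
    ),
}

# Which JSON field carries the text to hand to each generator.
TEXT_KEYS = {"music": "caption", "sound": "description", "image": "content"}


def render_prompt(kind, scene):
    """Step 1: insert the scene into the chosen prompt template."""
    return PROMPTS[kind].replace("{{$input}}", scene)


def run_pipeline(kind, scene, complete, generators):
    """Run one scene through prompt, parse, and generate.

    complete:   callable taking a prompt and returning the reply text
                (a stand-in for the model call).
    generators: dict mapping kind to a callable that takes one description
                (a stand-in for the sound, music, or image service).
    """
    prompt = render_prompt(kind, scene)            # step 1: feed the prompt
    items = json.loads(complete(prompt))           # step 2: parse the reply
    text_key = TEXT_KEYS[kind]
    # Step 3: send each description to the matching generator.
    return [generators[kind](item[text_key]) for item in items]
```

Passing the model call and the generators in as plain callables keeps the sketch independent of any one service, so a different generator can be swapped in without touching the prompt or parsing steps. Note that json.loads here assumes the reply is pure JSON with no surrounding prose.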

The prompts were originally in one long post but have been separated into individual posts, because the post was getting long and I did not like that. All prompts are based on a friend’s description of his night a few weeks back:

Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.

Image Generation

Prompt - Dall-E 2

Again, prompting GPT to generate image prompts of the scene.

Brainstorm images that would be appropriate for the following scene: “Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.”

Must: be chronological in order of what happens in the scene

Must: describe a scenario - the event/activity/object being depicted

Must: a style - a description of the artist to mimic, the mood to capture, the artistic technique to emulate

Must: be in the style of images you might find in a comic book, graphic novel, or flipbook

Must: brainstorm ideas and create a list.

Must: use a numbered list.

Must: only one list.

Should: no more than 5 items.

Should: at least 3 items.

Return it in JSON using the following format: [{ "Image": 1, "content": "content of the image" }]


I would prefer Midjourney… I’m not so hot on Stable Diffusion. I haven’t looked at other image generation models recently enough to know of another open model to try.

Result

Image 1

[{"Image": 1, "content": "The scene is depicted from a low angle, capturing the desk with a dimly lit lamp in the foreground. The record player sits on the corner of the desk, with a vinyl spinning on it. The room is enveloped in darkness, visible through the window behind the desk. The character is seen sitting at the desk, engrossed in drawing, with a pencil in hand and various art supplies scattered around. The air is filled with the aroma of burning incense, creating a serene and contemplative atmosphere. The style emulates the work of a skilled comic book artist, with detailed line work, shading, and textures to enhance the mood and capture the character’s focus and artistic expression."},


Image 2

{"Image": 2, "content": "This image focuses on the record player, which occupies the center of the panel. The stylized turntable is shown spinning, emitting sound waves depicted as vibrant, colorful lines. The lamp on the desk casts a warm glow on the surrounding area, creating a cozy ambiance. The dark outside is hinted at through a window in the background, suggesting the passage of time. The character sits nearby, engrossed in their drawing, with a sketchbook and pencils strewn across the desk. The art style mimics a graphic novel, with bold lines, expressive characters, and dynamic panel composition, capturing the character’s immersion in both music and art."},


Image 3

{"Image": 3, "content": "In this image, the focus is on the character at their desk. The room is enveloped in darkness, with only the desk area illuminated by the lamp. The character is shown sitting in a contemplative posture, engrossed in their drawing. The desk is cluttered with art supplies, including brushes, paint tubes, and sketchbooks, emphasizing the character’s dedication to their craft. Incense smoke gracefully rises from a holder nearby, adding a mystical touch to the scene. The art style emulates the work of a renowned comic book artist, with intricate linework, realistic shading, and attention to facial expressions, capturing the character’s intense focus and passion for art."}]


Thoughts

This one will take more effort to make the images form a cohesive scene. I’m not sure this will always be doable, especially if the writer wants different images within one scene. I could require certain context in each image description, but, given the problem just mentioned, that might not be desired depending on the scene. Another thing I have thought about is letting the user somehow manage the context for each generator in a slideout. That seems strange, though, because I am using the written scene to infer context. I’m also not sure whether putting the first image plus the image description into the model will work.

Also, Dall-E’s prompt box cut each image description short by about a sentence and a half. So I need to either find another model that takes longer input or shorten the descriptions.
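One way to handle the second option would be to trim each description to whole sentences that fit the prompt box, rather than letting the box cut it mid-sentence. Here is a minimal sketch of that idea. The function name shorten_to_limit is made up, and max_chars is a placeholder: I have not confirmed the actual limit of the prompt box, so it has to be supplied by the caller.

```python
import re


def shorten_to_limit(text, max_chars):
    """Keep whole sentences from the start of text within max_chars.

    Hypothetical helper; max_chars is whatever limit the target prompt
    box turns out to have.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = ""
    for sentence in sentences:
        candidate = sentence if not kept else kept + " " + sentence
        if len(candidate) > max_chars:
            break
        kept = candidate
    # Fall back to a hard cut if even the first sentence is too long.
    return kept or text.strip()[:max_chars]
```

Because the image descriptions put the scene first and the art style last, trimming from the end would drop the style sentence first. If the style matters more, the prompt could ask GPT to put the style at the start, or simply to write shorter descriptions.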

I used Facebook’s MusicGen HF Space for the music samples, but still used the prompt templates from Google’s MusicLM.

I am slowly becoming less enamored with the CWT workflow… because it seems so piecemeal. However, I am still plugging away to put some things together and really see how it works. One thing it does make me realize is that, as a writer, more context would help set the mood. What the AI generates as descriptions might be totally different from what you as the writer have in mind for the scene. In that light, seeing the output might push the writer to be more descriptive or change their language.

Solved SK-Error


Solved it by adding <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" /> to the <packageSources> section of nuget.config; this line was missing. The solution is from here: https://github.com/dotnet/interactive/issues/1478#issuecomment-906836654
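For anyone hitting the same error, a quick way to see whether a nuget.config already lists the nuget.org source is to parse it and look inside its packageSources element. The sketch below does that with Python's standard library. The function name has_nuget_org_source is made up for illustration, and it assumes the usual layout of a configuration root element containing packageSources.

```python
import xml.etree.ElementTree as ET

NUGET_ORG_URL = "https://api.nuget.org/v3/index.json"


def has_nuget_org_source(config_xml):
    """Return True if the nuget.config text lists the nuget.org v3 feed.

    Hypothetical helper; assumes <configuration> is the root element and
    package sources are <add> entries under <packageSources>.
    """
    root = ET.fromstring(config_xml)
    for entry in root.findall("./packageSources/add"):
        # Compare the feed URL without regard to case.
        if entry.get("value", "").strip().lower() == NUGET_ORG_URL:
            return True
    return False
```

If it returns False for the file's contents, adding the line from the fix above is the thing to try.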

Thanks to the Semantic Kernel Discord for helping look at the issue. We noticed that the error message referenced an FSharp folder, while the cell was using C#. This pushed me to investigate further, and somehow I stumbled upon the referenced GitHub issue, whose added line solved it instantly. ❤️

SK-Errors


This weekend I finally hunted down and downloaded Maeda’s SK-Recipes repo. I installed it, ran into an issue, and submitted a PR to help with that. Now I am trying to get Semantic Kernel C# working on my machine! I am running into issues and have not gotten around the latest one yet. I was going to try GitPod last night, but first had to get my machine set up with Dashlane, which has all of my passwords. I got that sorted, but by that time I was ready to close my eyes.

Error: NU1101


VSCode

error NU1101: Unable to find package Microsoft.SemanticKernel. No packages exist with this id in source(s): C:\Program Files\dotnet\sdk\7.0.302\FSharp\library-packs

This apparently should just work… I’m not sure what the NuGet package issue is.

I installed a NuGet package manager extension and tried to add the SemanticKernel package. This did not help. With it I run into “Cannot find any .csproj or .fsproj file for your project! …” This confuses me, because I was under the impression that the polyglot notebooks would ‘just work.’ Reading through the repo and watching the walkthrough, I found no mention of this issue. Their Discord has one mention of Error NU1101, and John Maeda comments on it, but I never saw a resolution.

Resolution

Still searching. If anyone has tips, please shoot them my way!