☝️ What's up?

Random updates, work in progress, photos, etc.

Background Noises

Barrett Burnworth


Prompt - Perplexity GPT-4

Brainstorm ambient background noises that would be appropriate for the following scene: “Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.”

SOUNDS MUST BE CHRONOLOGICAL IN ORDER OF WHAT HAPPENS DURING THE SCENE.

SOUNDS SHOULD BE DETAILED AND ACTIONABLE FOR A SOUND GENERATOR TO CREATE, SUCH AS THE FOLLOWING FORMAT: “A hammer hitting a metal sheet” OR “A bee buzzing around a quiet room”

Must: brainstorm ideas and create a list.

Must: use a numbered list.

Must: only one list.

Should: no more than 10 items.

Should: at least 3 items.

Return it in JSON using the following format: [{ "Sound": 1, "description": "description of the sound" }]

Copilot

Result:

[
  { "Sound": 1, "description": "Flick of the lamp switch, followed by a soft hum of the lamp" },
  { "Sound": 2, "description": "Record player needle gently touching the vinyl, creating a subtle crackling sound" },
  { "Sound": 3, "description": "Soft piano music playing in the background, creating a calm atmosphere" },
  { "Sound": 4, "description": "Scratch of a matchstick igniting, followed by the sizzle of the incense being lit" },
  { "Sound": 5, "description": "Gentle hiss of the burning incense, accompanied by occasional crackling" },
  { "Sound": 6, "description": "Sound of pencil or pen gliding across paper as the drawing begins" },
  { "Sound": 7, "description": "Occasional rustling of paper as the artist adjusts their drawing" },
  { "Sound": 8, "description": "Soft creaking of the wooden desk or chair as the artist moves" },
  { "Sound": 9, "description": "Distant, muffled sounds of the night outside, such as wind or distant traffic" },
  { "Sound": 10, "description": "Periodic deep breaths or sighs, reflecting the artist's focus and relaxation" }
]

Thoughts

The general AI noise generator needs a lot of work… I'm not sure if the prompts are bad, the generator is bad, or both. Some sounds are pretty good, but many have a weird hum. I really like the descriptions that GPT gave for the scene.

If this goes further, the user would need to be able to listen to the generated sounds and pick and choose the ones that work.
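As a sketch of that step, the returned list could be parsed into items the user can audition one by one. A minimal C# example using System.Text.Json (the sounds.json path is just a placeholder for wherever the model's reply ends up):

using System.Text.Json;

string json = File.ReadAllText("sounds.json"); // placeholder: the model's JSON reply
var sounds = JsonSerializer.Deserialize<List<SoundItem>>(json) ?? new();

// Print each description so the user could generate it, listen, and keep or discard it.
foreach (var s in sounds)
    Console.WriteLine($"{s.Sound}: {s.description}");

// One item of the returned list: "Sound" is the index, "description" is the prompt
// handed to the sound generator.
record SoundItem(int Sound, string description);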

Prompting Experiments

Barrett Burnworth


Bookmark for later

https://plasticlabs.ai/blog/Theory-of-Mind-is-All-You-Need/

I have the SK book creator example running. It works and I like the way it interacts. It is a React frontend that communicates with an API running on Azure Functions. It uses OpenAI: it takes your input and generates a list of book ideas, you can select one of the book ideas, and it then generates chapters. Semantic Kernel uses 'skills': your input is routed to a skill, and the skill's prompt is sent to OpenAI. I am creating new skills based on the prompts for CWT, like the ones below. The first step is to get the app to accept the input and feed it my custom prompts. The second step will be to parse the returned data. Third: send the results to another service to generate the sounds, music, or images.
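To make step one concrete, here is a rough sketch of how one of the CWT prompts could live as a Semantic Kernel semantic skill and be invoked from the app. The folder layout and skill names (CWT, BackgroundSounds) are placeholders of my own, and the C# calls reflect the mid-2023 SK preview API, so treat this as a sketch rather than the exact wiring.

skills/CWT/BackgroundSounds/skprompt.txt (plus a config.json holding the completion settings):

Brainstorm ambient background noises that would be appropriate for the following scene: "{{$input}}"
(…the MUST/SHOULD rules from the Background Noises prompt above…)
Return it in JSON using the following format: [{ "Sound": 1, "description": "description of the sound" }]

And the call the API would make, assuming kernel is an IKernel already configured with an OpenAI completion service:

// Import every semantic function under skills/CWT and run one against the scene text.
var cwt = kernel.ImportSemanticSkillFromDirectory("skills", "CWT");
var scene = "Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.";
var result = await kernel.RunAsync(scene, cwt["BackgroundSounds"]);
Console.WriteLine(result); // step two parses this JSON; step three hands it to the sound/music/image generators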

The prompts were in a long post, but have been separated into individual posts. They were getting long and I did not like that. All prompts are based on this scene of a friend’s description of his night a few weeks back.

Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.

Image Generation

Prompt - Dall-E 2

Again, prompting GPT to generate image prompts of the scene.

Brainstorm images that would be appropriate for the following scene: “Sitting at my desk with the lamp on, it is dark outside, turned my record player on. I have a ton of incense and light one. I sit down and draw at my desk.”

Must: be chronological in order of what happens in the scene

Must: describe a scenario - the event/activity/object being depicted

Must: a style - a description of the artist to mimic, the mood to capture, the artistic technique to emulate

Must: be in the style of images you might find in a comic book, graphic novel, or flipbook

Must: brainstorm ideas and create a list.

Must: use a numbered list.

Must: only one list.

Should: no more than 5 items.

Should: at least 3 items.

Return it in JSON using the following format: [{ "Image": 1, "content": "content of the image" }]


I would prefer Midjourney… I'm not so hot on Stable Diffusion. I haven't looked at other image generation models recently enough to know another open model to try.

Result

Image 1

[{ "Image": 1, "content": "The scene is depicted from a low angle, capturing the desk with a dimly lit lamp in the foreground. The record player sits on the corner of the desk, with a vinyl spinning on it. The room is enveloped in darkness, visible through the window behind the desk. The character is seen sitting at the desk, engrossed in drawing, with a pencil in hand and various art supplies scattered around. The air is filled with the aroma of burning incense, creating a serene and contemplative atmosphere. The style emulates the work of a skilled comic book artist, with detailed line work, shading, and textures to enhance the mood and capture the character's focus and artistic expression." },

image

Image 2

{ "Image": 2, "content": "This image focuses on the record player, which occupies the center of the panel. The stylized turntable is shown spinning, emitting sound waves depicted as vibrant, colorful lines. The lamp on the desk casts a warm glow on the surrounding area, creating a cozy ambiance. The dark outside is hinted at through a window in the background, suggesting the passage of time. The character sits nearby, engrossed in their drawing, with a sketchbook and pencils strewn across the desk. The art style mimics a graphic novel, with bold lines, expressive characters, and dynamic panel composition, capturing the character's immersion in both music and art." },

image

Image 3

{ "Image": 3, "content": "In this image, the focus is on the character at their desk. The room is enveloped in darkness, with only the desk area illuminated by the lamp. The character is shown sitting in a contemplative posture, engrossed in their drawing. The desk is cluttered with art supplies, including brushes, paint tubes, and sketchbooks, emphasizing the character's dedication to their craft. Incense smoke gracefully rises from a holder nearby, adding a mystical touch to the scene. The art style emulates the work of a renowned comic book artist, with intricate linework, realistic shading, and attention to facial expressions, capturing the character's intense focus and passion for art." }]

image

Thoughts

This one will take more effort to make a cohesive scene across images. I'm not sure this will always be doable, especially if the writer wants different images within one scene. I could require certain context in each image description, but, as just mentioned, that might not be desired depending on the scene. Another thing I have thought about is letting the user somehow manage the context for each generator in a slideout, although this seems strange because I am using the written scene to infer context. I'm not sure if putting the first image plus the image description into the model will work.

Also, Dall-E's prompt box cut the image text short by about a sentence and a half on each. So, I either need to find another model that takes longer input or shorten the descriptions.

Used Facebook’s MusicGen HF Space for the music samples. Still used the prompt templates from Google’s MusicLM.

I am slowly becoming less enamored with the CWT workflow… because it seems so piecemeal. However, I am still plugging away to put some things together and really see how it works. One thing it does make me realize is that as a writer, more context would help set the mood. So, seeing what the AI generates as descriptions vs what you as the writer have in mind for the scene might be totally different. In that light, it might make the writer be more descriptive or change their language.

Solved SK-Error

Barrett Burnworth


Solved it by adding <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" /> to nuget.config. This line was missing. Solution from here: https://github.com/dotnet/interactive/issues/1478#issuecomment-906836654
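For context, a minimal nuget.config with that line in place looks roughly like this (assuming an otherwise default config):

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
  </packageSources>
</configuration>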

Thanks to the Semantic Kernel Discord for helping look at the issue. We noticed that the error message was referencing an FSharp folder while the cell was using C#. This helped me investigate further, and I stumbled upon the referenced GitHub issue, which solved it instantly with the added line. ❤️

SK-Errors

Barrett Burnworth


This weekend I finally hunted down and downloaded Maeda's SK-Recipes repo. I installed it, ran into an issue, and submitted a PR to help with that. Now I am trying to get Semantic Kernel C# working on my machine! I am running into issues and have not gotten past the latest one yet. I was going to try GitPod last night, but had to get my machine set up with Dashlane, which has all of my passwords. I got that sorted, but by that time I was ready to close my eyes.

Error: NU1101

image

VSCode

error NU1101: Unable to find package Microsoft.SemanticKernel. No packages exist with this id in source(s): C:\Program Files\dotnet\sdk\7.0.302\FSharp\library-packs

This apparently should just work… I'm not sure what the NuGet package issue is.

I have installed a NuGet package manager extension and tried to add the SemanticKernel package. This did not help; with it I run into "Cannot find any .csproj or .fsproj file for your project! …" This is confusing me, because I was under the impression that the polyglot notebooks would 'just work.' Reading through the repo and watching the walkthrough, there is no mention of this issue. Their Discord has one mention of error NU1101, and John Maeda commented on it, but I never saw a resolution.

Resolution

Still searching. If anyone has tips, please shoot them my way!

"Goal-Oriented AI"

Barrett Burnworth


Exploring Semantic Kernel

image

After using autonomous agents like BabyAGI and Auto-GPT, it feels like these things currently waste way too many cycles trying to refine their output. Planning/coordination, prompting, and chaining are still the most efficient way to get a desired result.

“One more (of many more), I’ve found that providing hand-crafted mini agents as tools works better than simply providing tools in the traditional sense.

As a simple example, every time you search, you want to scrape. So combine them into one chained tool vs each as separate.

This last example takes some load off of the LLM, and handles it with coding logic.

These kind of trade offs can be limiting at times, but also help w cost and speed.

Seems more art than science. Lots of nuance and variables to balance.”

From @yoheinakajima on Twitter, responding to the questions "Where are we with autonomous agents? What can they do? Should we use one?"
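To make the search-then-scrape idea concrete, here is a minimal C# sketch. SearchAsync and ScrapeAsync are hypothetical placeholders for whatever search API and scraper actually get wired up; the point is that the LLM only ever sees the single combined tool, and plain code handles the step that always follows:

using System.Net.Http;
using System.Threading.Tasks;

public static class WebTool
{
    // Hypothetical placeholder: call a search API and return the top result's URL.
    static Task<string> SearchAsync(string query) =>
        Task.FromResult("https://example.com/top-result");

    // Fetch the page contents for that URL.
    static async Task<string> ScrapeAsync(string url)
    {
        using var http = new HttpClient();
        return await http.GetStringAsync(url);
    }

    // The single chained tool exposed to the agent: search, then always scrape.
    public static async Task<string> SearchAndScrapeAsync(string query)
    {
        var url = await SearchAsync(query);
        return await ScrapeAsync(url);
    }
}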

So, for now, I will take this path for the CWT work. However, connecting all of this together is the next task. Semantic Kernel is Microsoft's answer to LangChain. I like the simplicity it brings to connecting things up. I also like John Maeda, so that is another reason I am leaning toward it.

image

According to Maeda, the approach here is “goal-oriented AI.” He described Skills as “the core building blocks in SK” and noted that they can be either simple (“Please summarize this piece of text for me”) or complex (“Please summarize everything I need to know for today and build a plan for how I need to get done what I need to accomplish”).

“Memory increases the capability of a Skill tremendously by allowing you to tie the Skill’s capability to your historical data that can be persisted and accessed at any time,” he said, while Connectors “are customizable resources that enable external data access.” He added that connectors help address one of the primary criticisms of LLMs — that they are pre-trained, and so “essentially frozen in time.” 1

Some resources, links and more Maeda quotes below.

Quotes

“Traditionally, computer science education has been about achieving structured outputs from well-structured syntax,” he replied, “but actually the more flexible mindset of data scientists or even creative artists can be assistive in order to navigate this new world.”

“The software world has been dominated by people who can literally speak machine,” he replied. “So, it’s an interesting turn of events that this new kind of programming is much closer to natural language. If your goal is to produce writing as an output, then there’s certainly room for many language-proficient English teachers to have an impact. That said, to be a productive ‘prompt engineer’ still requires you to have the ability to think like an engineer. There’s a reason why the engineering field emerged as a discipline; it’s always attracted those who love to build machinery. In the future, we can count on prompts that are engineered by developers to have qualities we both need and want — like reliability and efficiency. That won’t change. The difference is that developers will be able to pair up with AIs to create even more reliable and efficient systems than ever before.”

Why the word Kernel in the name?

“It’s a tip of the hat to that all-time enabler of computational productivity, the UNIX kernel,” he replied. “For those of your readers who remember when the UNIX kernel emerged, I think we all were a little confused by commands comprised of two characters ‘ls’, ‘cd’, ‘ps’ etc. But the big ‘a-ha’ was when we piped commands with the ‘|’ symbol and suddenly the light came on. The UNIX kernel’s simplicity as a landmark user experience for developers has been the north star for SK during its evolution. And we definitely don’t feel we have it right yet. That’s why we released it as open source. So that we can learn in the open as a community, and hopefully together build the right user experience for developers who are excited as we are by this new shift from syntax to semantics.” 1

John’s “Kitchen Sink” talk to the Onetug.net group: