Hello, DALL·E 3!

DALL·E 3 is the new beta version of an AI program trained to generate images from text descriptions, and it is now available in ChatGPT. DALL·E has actually been around for a few years, but it is now integrated directly into ChatGPT, and I had a chance to experiment with it yesterday. It is mind-blowing. Being able to write a description of an image you'd like to see, especially something that doesn't exist (yet) in our world, and then have a program generate that image for you is pretty incredible. I have used earlier versions of DALL·E and I've tried out similar tools in Canva, but if this new version is any indication of what's to come with this type of technology, we should definitely start buckling up.

I have a paid ChatGPT account, which is what allows me to use the latest version; DALL·E 3 is only available on the paid plan. I spent some time yesterday creating images for an upcoming keynote address that I am presenting, and I thought it would be interesting to share with you some of those images and the text prompts that created them. For instance, when I typed in "Create an image of an AI robot conducting a school band," this is one of the images DALL·E 3 generated:

When I asked it to create an image of an AI robot teaching music to a group of 7-year-olds, this is one of the images DALL·E 3 generated:

And a final example: when I asked DALL·E 3 to create an image of an AI robot playing the tuba, this is one of the images it generated:

Pretty incredible, huh? DALL·E 3 generates four images for you to choose from; they are all slightly different, and some even use different artistic techniques. For example, using the same text prompt as above, this is one of the alternate images it generated, this time using a watercolor technique:

Extremely impressive, in my opinion. So what is DALL·E 3, how does it work, and how can music educators use it with their students?

DALL·E 3 is a continuation of OpenAI's (the company behind ChatGPT) work on image-generation models, following the previous success of DALL·E and DALL·E 2. It is an extension of technologies like ChatGPT, but instead of natural-language responses, DALL·E 3 generates images. Like its predecessors, DALL·E 3 is based on the GPT (Generative Pre-trained Transformer) architecture. This means it uses a type of deep learning model known as a Transformer, which has proven to be highly effective for both natural language processing and image-generation tasks.

DALL·E 3 is trained on vast amounts of data, including both text and images, which is what allows it to generate images from textual descriptions. The training process involves adjusting the model's weights to minimize the difference between its generated images and the actual images in its training set. This means that when given a description, it can produce an image that matches that description. The model has the ability to render complex scenes, objects, and even abstract concepts.
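The weight-adjustment idea above can be illustrated with a toy one-parameter sketch. This is emphatically not DALL·E 3's actual training code; it just shows, in miniature, what "adjusting the model's weights to minimize the difference" means: compute an error, compute a loss, and nudge the weight downhill.

```python
# Toy analogy (NOT DALL-E 3's real training loop): gradient descent
# adjusts one "weight" to minimize squared error against a target,
# just as image models adjust billions of weights to shrink the gap
# between their generated images and the images in the training set.

def train(target: float, steps: int = 100, lr: float = 0.1) -> float:
    weight = 0.0  # start with an uninformed "model"
    for _ in range(steps):
        output = weight          # our tiny model simply outputs its weight
        error = output - target  # how far off the "generation" is
        gradient = 2 * error     # derivative of the squared-error loss
        weight -= lr * gradient  # nudge the weight to reduce the loss
    return weight

print(round(train(3.0), 4))  # prints 3.0 -- the weight converges to the target
```

Real image models do this at enormous scale, with millions of training examples and far more elaborate losses, but the "measure the error, then adjust the weights" loop is the same basic idea.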

Like all machine learning models, DALL·E 3 is not perfect. It may sometimes produce unexpected or nonsensical outputs. Additionally, since it's trained on vast datasets drawn from the internet, it can inherit, and sometimes amplify, biases present in those datasets. It's worth noting that while DALL·E 3 is a powerful image-generation tool, it doesn't "understand" images or text in the same way humans do. It's essentially finding patterns in the data it was trained on and using those patterns to produce images based on textual prompts.

So how can teachers use DALL·E 3 with their students? Well, my first thought, which is rather obvious, is that students can describe music they listen to in visual terms, using masterworks like Pictures at an Exhibition, and have DALL·E 3 generate images based on the words they used to describe what they were hearing. Instead of having them draw it themselves, let DALL·E 3 take a try. You can either enter individual student text entries or generate one image based on what the whole class comes up with. Better yet, have the students draw alongside DALL·E 3 and then compare and contrast the results. The more descriptive the text students write while listening to this or ANY piece of music, the better DALL·E 3 can give visual life to their thoughts. Pretty cool!
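If you want to combine a whole class's listening descriptions into one prompt, a short helper script can do the gluing for you. This is just a hypothetical sketch (the function name, phrasing, and example descriptions are all my own invention); the resulting string is what you would paste into ChatGPT.

```python
# Hypothetical helper: merge a class's listening descriptions into a
# single DALL-E 3 prompt. Everything here is illustrative, not an
# official tool or API.

def class_prompt(piece: str, descriptions: list[str]) -> str:
    details = "; ".join(descriptions)
    return f"Create an image inspired by '{piece}', depicting: {details}."

print(class_prompt(
    "Pictures at an Exhibition",
    ["a creaking old castle at dusk", "chicks dancing in their shells"],
))
```

Because the model responds to detail, folding every student's phrase into one prompt tends to produce a richer image than any single description would on its own.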

Use DALL·E 3 as inspiration for composition and creativity. Teachers can prompt DALL·E 3 to generate surreal or abstract images based on musical terms or emotions. For instance, they might input "a serene landscape inspired by a piano melody" or "an abstract representation of jazz improvisation." Students can then use these images as inspiration to compose pieces or improvise. By bridging the gap between visual art and music, students can explore the emotional and thematic connections between the two, fostering creativity and a deeper understanding of musical expression.

Album Cover Art! Use DALL·E 3 to let students generate album cover artwork based on their text descriptions of the music and songs on the album. Obviously this works best if the songs on the album are composed by the students themselves. The concept can also be applied to any musical performance presented by your student ensembles; think program artwork. When I typed in "Create artwork for a school holiday concert," this is what DALL·E 3 generated:

If you don't like the images DALL·E 3 produces the first time, you can either click Regenerate, and DALL·E 3 will create four more images for you, or you can add more descriptive text to steer DALL·E 3 toward what you want.

I am simply blown away by DALL·E 3. Users of DALL·E and DALL·E 2 have known all along how powerful these technologies are, and this latest version, with its ChatGPT integration, is awesome. If you can spring for a ChatGPT Plus account ($20/month), I would definitely recommend reaching into your pocket for it, even if only for a month or two. I look forward to seeing where this technology is headed!
