Jonathan Moeller, Pulp Writer

The books of Jonathan Moeller

Adobe Firefly Generative AI

I mentioned last week that I was able to get into the beta for Adobe’s new Firefly image generation tool. I’ve been very critical of generative AI, so I want to give it a fair shake. I also said before that Firefly might address some of my concerns about generative image AI tools basically stealing images en masse from across the Internet for their training data.

So, here are my thoughts after experimenting with Adobe Firefly for a bit. You can see some of the results in the image attached to this post.

First things first – Firefly is a lot more user-friendly than something like Midjourney or Stable Diffusion. Midjourney is basically like the command prompt of image generation – more powerful and more versatile than the GUI version, but not quite as simple to use. The Firefly GUI is pretty friendly. It has a right-hand sidebar with a lot of buttons for adjusting the output of your image generation prompt. You can choose to create a photo, a graphic, or an “artistic” image, and there are numerous dropdown menus allowing you to adjust lighting and layout and so forth, all of which require specific prompts in other image generation programs.

As with other generative programs, you have to adjust the prompt a great deal to get exactly what you want. The image attached to this post shows human faces, and it took a lot of prompting, slightly rearranging the words each time, to get something even remotely close to what I wanted. It is a lot easier to use Firefly to generate things other than human faces, but this is true of anything – any artist or CG artist will tell you that faces are the hardest things to do correctly because the human eye (and subconscious) can instantly spot anything that’s wrong with the face even if the conscious mind can’t quite articulate what’s wrong.

Because of that, I suspect generative AI would be a lot better at generating individual assets than completed scenes. Like, when I make a book cover in Photoshop nowadays, it can have between forty and sixty layers. (People who really know what they’re doing often have a lot more.) If I wanted to use Firefly to make a completed scene suitable for a book cover, it would look terrible. But if I used it to make, say, a sword, and then modified that sword heavily with appropriate adjustment layers in Photoshop, that would look much better.

People familiar with the topic have demonstrated that Midjourney is more powerful than Firefly. Computer scientist Jim Fan did a thread on Twitter where he used the same prompt in both Midjourney and Firefly and compared the results, and Midjourney usually did better.

That said, this inadvertently demonstrated one of the strengths of Firefly. Jim Fan used Deadpool, Pikachu, and Super Mario in his prompts, and Midjourney performed better. However, the reason is that Firefly has been trained on Adobe Stock Photos and public domain stuff, and Deadpool, Pikachu, and Super Mario are heavily trademarked and copyrighted characters owned by Disney and Nintendo, which means Firefly hasn’t been trained on any images of them. Midjourney, by contrast, was trained with a massive data scrape of the Internet, and the legality of that for use in image generation is an open question, since Midjourney is currently getting sued about it. To put it mildly, Disney and Nintendo have lawyers who make the Nazgul look warm and cuddly, and that shows the advantage of Firefly. If you’re a commercial artist and you use an AI-generated image of Deadpool or Super Mario in your client’s project, obviously you are running the risk of getting sued. However, even if you’re not using trademarked characters, Midjourney might have been trained on something that will get you sued.

Where something like Firefly would really shine is replacing stock photos. I’ve spent a lot of time looking for exactly the right stock photo for a certain project, because while you can do a lot in Photoshop, it’s much less work if you can get a stock photo that’s at least somewhat close to what you want. Typing twenty slightly different prompts to get what you want for a specific asset might one day replace scrolling through twenty pages of stock photo thumbnails. That said, it’s still easier to use something like DAZ or Blender to produce assets because you can control the output exactly in a way that you simply cannot with image AI, but it’s not always possible to get what you want in DAZ or Blender.

To sum up, Firefly is easy to use, and careful sourcing of the images in its training data (if Adobe is telling the truth about it) addresses many of the ethical concerns I have about generative image AI.

However, I would still exercise great caution in using AI-generated images for anything. The legality of it all is still very unsettled, and there are several different court cases dealing with it at the moment. So I would wait until it all pans out before using an AI-generated image for anything commercial. Adobe’s approach to Firefly might not be able to generate high-quality images of Deadpool high-fiving Super Mario or something, but it does seem more ethical and less likely to result in nasty lawsuits.

(Note that I am not a lawyer and nothing here is legal advice.)

-JM
