Firefly adds Structure Reference

I’m delighted to see that the longstanding #1 user request for Firefly—namely the ability to upload an image to guide the structure of a generated image—has now arrived:

This nicely complements the extremely popular style-matching capability we enabled back in October. You can check out details of how it works, as well as a look at the UI (below)—plus my first creation made using the new tech ;-).

Magnific style transfer is amazing

It’s amazing to see what two people (?!) are able to do. Check out this video & the linked thread, as well as the tool itself.

I’m gonna have a ball going down this rabbit hole, especially for type:

A lovely Guinness ad from… Jason Momoa?

It’s somehow true!

I think the spirit of maximally inclusive “Irishness” has special resonance for millions of people around the world, like me, who can trace a portion (but not all) of their ancestry to the Emerald Isle. (For me it’s 75%, surname notwithstanding.) I’m reminded of Notre Dame’s “What Would You Fight For?” campaign, which features scientists, engineers, and humanitarians from around the world who conclude with “We are the Fighting Irish.” I dunno—it’s hard to explain, but it really warms my heart—as did the Irish & Chinese Railroad Workers float we saw in SF’s St. Paddy’s parade on Saturday.

Anyway, I found this bit starring & directed by Jason Momoa to be pretty charming. Enjoy:

Irish blessings

Hey gang—I hope you’ve had a safe & festive St. Patrick’s Day. To mark the occasion, I figured I’d reshare a couple of the videos I captured in the old country with my dad back in August.

Here’s Co. Clare’s wild burren (“rocky district,” hence the choice of Chieftains/Stones banger)…

…my dad’s grandparents’ medieval town in Galway…

…and my mom’s mother’s farm in Mayo:

Amazing: Realtime AI rendering of Photoshop

I cannot tell you how deeply I hope that the Photoshop team is paying attention to developments like this…

Celebrating “Subpar Parks”

During our recent road trip to Death Valley, my 15yo son rolled his eyes at nature’s majesty:

This made me chuckle & remember “Subpar Parks,” a visual celebration of the most dismissive reviews of our natural treasures. My wife & I have long decorated our workspaces with these unintentional gems, and I think you’ll dig the Insta feed & book (now complemented by “Subpar Planet”).

Creating the creepy infrared world of Dune

I really enjoyed Dolby’s recent podcast on Greig Fraser and the Cinematography of Dune: Part Two, as well as this deep dive with Denis Villeneuve on how they modified an ARRI Alexa LF IMAX camera to create the Harkonnens’ alienating home world.

I love this idea and I tried, for Giedi Prime, the home world of Harkonnen, there’s less information in the book and it’s a world that is disconnected from nature. It’s a plastic world. So, I thought that it could be interesting if the light, the sunlight could give us some insight on their psyche. What if instead of revealing colors, the sunlight was killing them and creating a very eerie black and white world, that will give us information about how these people perceive reality, about their political system, about how that primitive brutalist culture and it was in the screenplay.

Fun little AI->3D->AR experiments with Vision Pro

I love watching people connect the emerging creative dots, right in front of our eyes:

AI Mortal Kombat

Heh—these are obviously silly but well done, and they speak to the creative importance of being specific—i.e. representing particular famous faces. I sometimes note that a joke about a singer & a football player is one thing, whereas a joke about Taylor Swift & Travis Kelce is a whole other thing, all due to it being specific. Thus, for an AI toolmaker, knowing exactly where to draw the line (e.g. disallowing celebrity likenesses) isn’t always so clear.

So… what am I actually doing at Microsoft?

It’s a great question, and I think it’s really thoughtful that the day before I joined, the company was generous enough to run a Superb Owl—er, Super Bowl—commercial, just to help me explain the mission to my parents. 😀

But seriously, this ad provides a brief peek into the world of how Copilot can already generate beautiful, interesting things based on your needs—and that’s a core part of the mission I’ve come here to tackle.

A few salient screenshots:

Ideogram promises state-of-the-art text generation

Founded by ex-Google Imagen engineers, Ideogram has just launched version 1.0 widely. It’s said to offer new levels of fidelity in the traditionally challenging domain of type rendering:

Historically, AI-generated text within images has been inaccurate. Ideogram 1.0 addresses this with reliable text rendering capabilities, making it possible to effortlessly create personalized messages, memes, posters, T-shirt designs, birthday cards, logos and more. Our systematic evaluation shows that Ideogram 1.0 is the state-of-the-art in the accuracy of rendered text, reducing error rates by almost 2x compared to existing models.

Holy cow, I work at Microsoft!

Most folks’ first thought: Wait, whaaaaat?!

Second thought: Actually… that makes sense!

So, it’s true: After nearly three great years back at Adobe, I’ve moved to just the third place I’ve worked since the Clinton Administration: Microsoft!

I’ve signed on with a great group of folks to bring generative imaging magic to as many people as possible, leveraging the power of DALL•E, ChatGPT, Copilot, and other emerging tech to help make fun, beautiful, meaningful things. And yes, they have a very good sense of humor about Clippy, so go ahead and get those jokes out now. :->


It really is a small world: The beautiful new campus (see below) is just two blocks from my old Google office (where I reported to the same VP who’s now in charge of my new group), which itself is just down the road from the original Adobe HQ; see map. (Maybe I should get out more!)


And it’s a small world in a much more meaningful sense: I remain in a very rare & fortunate spot, getting to help guide brilliant engineers’ efforts in service of human creativity, all during what feels like one of the most significant inflection points in decades. I’m filled with gratitude, curiosity, and a strong sense of responsibility to make the most of this moment.

Thank you to my amazing Adobe colleagues for your hard & inspiring work, and especially for the chance to build Firefly over the last year. It’s just getting started, and there’s so much we can do together.

Thank you to my new team for opening this door for us. And thank you to the friends & colleagues reading these words. I’ll continue to rely on your thoughtful, passionate perspectives as we navigate these opportunities together.

Let’s do this!

Fun papercraft-styled video

My friend Nathan Shipley has been deeply exploring AnimateDiff for the last several months, and he’s just collaborated with the always entertaining Karen X. Cheng to make this little papercraft-styled video:

Happy birthday to Photoshop, Lightroom, and Camera Raw!

I’m a day late saying it here, but happy birthday to three technologies that changed my life (all our lives, maybe), and to which I’ll be forever grateful to have gotten to contribute. As Jeff Schewe noted:

Happy Birthday Digital Imaging…aka Photoshop, Camera Raw & Lightroom. Photoshop shipped February 19th, 1990. Camera Raw shipped February 19th, 2003 and Lightroom shipped February 19th, 2007. Coincidence? Hum, I wonder…but ya never know when Thomas Knoll is involved…

Check out Jeff’s excellent overview, written for Photoshop’s 30th, as well as his demo of PS 1.0 (which “cost a paltry $895 and could run on home computers like the Macintosh IIfx for under $10,000”—i.e. ~$2,000 & $24,000 today!).
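
The “today” figures are simple multiplications. Here is a minimal sketch, assuming one flat 1990-to-today multiplier of 2.4. That multiplier is inferred only from the $10,000 → $24,000 pair above, not taken from official CPI data, and the function name is purely illustrative:

```python
# Hypothetical multiplier inferred from the post's own figures
# ($10,000 then, roughly $24,000 today); not an official CPI value.
ASSUMED_MULTIPLIER = 24_000 / 10_000


def in_todays_dollars(price_1990: float, multiplier: float = ASSUMED_MULTIPLIER) -> float:
    """Scale a 1990 price by a flat, assumed inflation multiplier."""
    return price_1990 * multiplier


photoshop_1_0 = in_todays_dollars(895)   # list price of Photoshop 1.0
mac_iifx = in_todays_dollars(10_000)     # rough Macintosh IIfx price ceiling

print(f"Photoshop 1.0: ~${photoshop_1_0:,.0f}")
print(f"Macintosh IIfx: ~${mac_iifx:,.0f}")
```

Under that assumption the software lands a bit above $2,100, which squares with the rounded “~$2,000” above.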

“Neither Artificial nor Intelligent: Artists Working with Algorithms”

Just in case you’ll be around San Jose this Friday, check out this panel discussion featuring our old Photoshop designer Julie Meridian & other artists discussing their relationship with AI:

Panel discussion: Friday, February 23rd 7pm–9pm. Free admission

Featuring Artists: Julie Meridian, James Morgan, and Steve Cooley
Moderator: Cherri Lakey

KALEID Gallery is proud to host this panel with three talented artists who are using various AI tools in their artistic practice while navigating all the ethical and creative dilemmas that arise with it. With all the controversy around AI collaborative / generated art, we’re looking forward to hearing from these avant-garde artists that are exploring the possibilities of a positive outcome for artists and creatives in this as-of-yet undefined new territory.

“Boximator” enables guided image->video

Check out this research from ByteDance, the makers of TikTok (where it could well be deployed), which competes with tools like Runway’s Motion Brush:

Check out Sora, OpenAI’s eye-popping video model

Hot on the heels of Lumiere from Google…

…here comes Sora from OpenAI:

My only question: How did they not call it SORR•E? :-p

But seriously, as always…

OpenAI, Meta, & Microsoft promote AI transparency

Good progress across the board:

  • OpenAI is adding new watermarks to DALL-E 3
    • “The company says watermarks from C2PA will appear in images generated on the ChatGPT website and the API for the DALL-E 3 model. Mobile users will get the watermarks by February 12th. They’ll include both an invisible metadata component and a visible CR symbol, which will appear in the top left corner of each image.”
  • Meta Will Label AI Images Across Facebook, Instagram, & Threads
    • “Meta will employ various techniques to differentiate AI-generated images from other images. These include visible markers, invisible watermarks, and metadata embedded in the image files… Additionally, Meta is implementing new policies requiring users to disclose when media is generated by artificial intelligence, with consequences for failing to comply.”
  • Building trust with content credentials in Microsoft Designer
    • “When you create a design in Designer you can also decide if you’d like to include basic, trustworthy facts about the origin of the design or the digital content you’ve used in the design with the file.”

Firefly image creation & Lightroom come to Apple Vision Pro

Not having a spare $3500 burning a hole in my pocket, I’ve yet to take this for a spin myself, but I’m happy to see it. Per the Verge:

The interface of the Firefly visionOS app should be familiar to anyone who’s already used the web-based version of the tool — users just need to enter a text description within the prompt box at the bottom and hit “generate.” This will then spit out four different images that can be dragged out of the main app window and placed around the home like virtual posters or prints. […]

Meanwhile, we also now have a better look at the native Adobe Lightroom photo editing app that was mentioned back when the Apple Vision Pro was announced last June. The visionOS Lightroom experience is similar to that of the iPad version, with a cleaner, simplified interface that should be easier to navigate with hand gestures than the more feature-laden desktop software.

Shhh, No One Cares

Heh—this fun little animation makes me think back to how I considered changing my three-word Google bio from “Teaching Google Photoshop” (i.e. getting robots to see & create like humans, making beautiful things based on your life & interests) to “Wow! Nobody Cares.” :-p Here’s to less of that in 2024.

Check out my chat with Wharton

I had a chance to sit down for an interesting & wide-ranging chat with folks from the Wharton Tech Club:

Tune into the latest episode of the Wharton Tech Toks podcast! Leon Zhang and Stephanie Kim chat with John Nack, Principal Product Manager at Adobe with 20+ years of PM experience across Adobe and Google, about GenAI for creators, AI ethics, and more. He also reflects on his career journey. This episode is great if you’re recruiting for tech, PM, or Adobe.

Listen now on Apple Podcasts or Spotify.

As always I’d love to know what you think.

Making today’s AI interfaces “look completely absurd”

Time is a flat circle…

Daring Fireball’s Mac 40th anniversary post contained a couple of quotes that made me think about the current state of interaction with AI tools, particularly around imaging. First, there’s this line from Steven Levy’s review of the original Mac:

[W]hat you might expect to see is some sort of opaque code, called a “prompt,” consisting of phosphorescent green or white letters on a murky background.

Think about how revolutionarily different & better (DOS-head haters’ gripes notwithstanding) this was.

What you see with Macintosh is the Finder. On a pleasant, light background, little pictures called “icons” appear, representing choices available to you.

And then there’s this kicker:

“When you show Mac to an absolute novice,” says Chris Espinosa, the twenty-two-year-old head of publications for the Mac team, “he assumes that’s the way all computers work. That’s our highest achievement. We’ve made almost every computer that’s ever been made look completely absurd.”

I don’t know quite what will make today’s prompt-heavy approach to generation feel equivalently quaint, but think how far we’ve come in less than two years since DALL•E’s public debut—from swapping long, arcane codes to having more conversational, iterative creation flows (esp. via ChatGPT) and creating through direct, realtime UIs like those offered via Krea & Leonardo. Throw in a dash of spatial computing, perhaps via “glasses that look like glasses,” and who knows where we’ll be!

But it sure as heck won’t mainly be knowing “some sort of opaque code, called a ‘prompt.’”

The first great Vision Pro demo I’ve seen

F1 racing lover John LePore (whose VFX work you’ve seen in Iron Man 2 and myriad other productions over the years) has created the first demo for Apple Vision Pro that makes me say, “Okay, dang, that looks truly useful & compelling.” Check out his quick demo & behind-the-scenes narration:

My panel discussion at the AI User Conference

Thanks to Jackson Beaman & crew for putting together a great event yesterday in SF. I joined him, KD Deshpande (founder of Simplified), and Sofiia Shvets (founder of Let’s Enhance & Claid.ai) for a 20-minute panel discussion (which starts at 3:32:03 or so, in case the embedded version doesn’t jump you to the proper spot) about creating production-ready imagery using AI. Enjoy, and please let me know if you have any comments or questions!
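
If the embed doesn’t jump to the right spot, the 3:32:03 mark can be turned into a plain seconds offset by hand. A minimal sketch follows. The helper name is hypothetical, and whether a given video host accepts a seconds offset in its URL is an assumption to check against that host:

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert "H:MM:SS" or "MM:SS" into a total number of seconds."""
    total = 0
    for part in timestamp.split(":"):
        # Each colon-separated field is worth 60x the field to its right.
        total = total * 60 + int(part)
    return total


panel_start = timestamp_to_seconds("3:32:03")
print(panel_start)
```

The same helper handles shorter stamps such as "1:05", since it simply folds the fields left to right.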

The Founding Fathers talk AI art

Well, not exactly—but T-Paine’s words about how we value things still resonate today:

We humans are fairly good at pricing effort (notably in dollars paid per hour worked), but we struggle much more with pricing value. Cue the possibly apocryphal story about Picasso asking $10,000 for a drawing that he sketched in a matter of seconds but that drew on an ability which had taken him a lifetime to develop.

A couple of related thoughts:

  • My artist friend is a former Olympic athlete who talks about how people bond through shared struggle, particularly in athletics. For him, someone using AI-powered tools is similar to a guy showing up at the gym with a forklift, using it to move a bunch of weight, and then wanting to bond afterwards with the actual weightlifters.
  • I see ostensible thought leaders crowing about the importance of “taste,” but I wonder how they think that taste is or will be developed in the absence of effort.
  • As was said of—and by?—Steve Jobs, “The journey is the reward.”

[Via Louis DeScioli]

After Effects + Midjourney + Runway = Harry Potter magic

It’s bonkers what one person can now create—bonkers!

I edited out ziplines to make a Harry Potter flying video, added something special at the end
by u/moviemaker887 in AfterEffects

I took a video of a guy zip lining in full Harry Potter costume and edited out the zip lines to make it look like he was flying. I mainly used Content Aware Fill and the free Redgiant/Maxon script 3D Plane Stamp to achieve this.

For the surprise bit at the end, I used Midjourney and Runway’s Motion Brush to generate and animate the clothing.

Trapcode Particular was used for the rain in the final shot.

I also did a full sky replacement in each shot and used assets from ProductionCrate for the lighting and magic wand blast.

[Via Victoria Nece]

Krea upgrades its realtime generation

I had the pleasure of hanging out with these crazy-fast-moving guys last week, and I remain amazed at their shipping velocity. Check out the latest updates to their realtime canvas:

Check out how trailblazing artist Martin Nebelong is putting it to use:

Google introduces Lumiere for video generation & editing

Man, not a day goes by without the arrival of some new & mind-blowing magic—not a day!

We introduce Lumiere — a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion — a pivotal challenge in video synthesis. To this end, we introduce a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model. This is in contrast to existing video models which synthesize distant keyframes followed by temporal super-resolution — an approach that inherently makes global temporal consistency difficult to achieve. […]

We demonstrate state-of-the-art text-to-video generation results, and show that our design easily facilitates a wide range of content creation tasks and video editing applications, including image-to-video, video inpainting, and stylized generation.

Content credentials are coming to DALL•E

From its first launch, Adobe Firefly has included support for content credentials, providing more transparency around the origin of generated images, and I’m very pleased to see OpenAI moving in the same direction:

Early this year, we will implement the Coalition for Content Provenance and Authenticity’s digital credentials—an approach that encodes details about the content’s provenance using cryptography—for images generated by DALL·E 3. 

We are also experimenting with a provenance classifier, a new tool for detecting images generated by DALL·E. Our internal testing has shown promising early results, even where images have been subject to common types of modifications. We plan to soon make it available to our first group of testers—including journalists, platforms, and researchers—for feedback.

Adobe Announces Inaugural Film & TV Fund, Committing $6 Million to Support Underrepresented Creators

In her 12+ years in Adobe’s video group, my wife Margot worked to bring more women into the world of editing & filmmaking, participating in efforts supporting all kinds of filmmakers across a diverse range of ages, genders, types of subject matter, experience levels, and backgrounds. I’m delighted to see such efforts continuing & growing:

Adobe and the Adobe Foundation will partner with a cohort of global organizations that are committed to empowering underrepresented communities, including Easterseals, Gold House, Latinx House, Sundance Institute and Yuvaa, funding fellowships and apprenticeships that offer direct, hands-on industry access. The grants will also enable organizations to directly support filmmakers in their communities with funding for short and feature films.

The first fellowship is a collaboration with the NAACP, designed to increase representation in post-production. The NAACP Editing Fellowship is a 14-week program focused on education and training, career growth and workplace experience and will include access to Adobe Creative Cloud to further set up emerging creators with the necessary tools. Applications open on Jan. 18, with four fellows selected to participate in the program starting in May.

Premiere Pro ups its audio game

“If you want to make a movie look good, make it sound good.” That’s the spirit in which Adobe is introducing a wide range of enhancements to audio handling in Premiere Pro:

According to the team, the audio workflow changes now available in the beta include:

  • Interactive Fade Handles: Now you can simply click and drag from the edge of a clip to create a variety of custom audio fades in the timeline or drag across two clips to create a crossfade. These visual fades provide more precision and control over audio transitions while making it easy to see where they are applied across your sequence.
  • AI-powered Audio Category Tagging: When you drag clips into the sequence, they’ll automatically be identified and labeled with new icons for dialogue, music, sound effects, or ambience. A single click on the icon provides access to the most relevant tools for that audio type in the Essential Sound panel — such as Loudness Matching or Auto Ducking.
  • Redesigned FX Clip Badges: An updated badge makes it easier for you to see which clips have effects added to them. New effects can be added by right clicking the badge, and a single click opens the Effect Control panel for even more adjustment without changing the workspace or searching for the panel.
  • Modern, Intelligent Waveforms and Clips: Waveforms now dynamically resize when you change the track height and improved clip colors make it easier for you to see and work with audio on the timeline.

Tutorial: Firefly + Character Animator

Helping discover Dave Werner & bring him into Adobe remains one of my favorite accomplishments at the company. He continues to do great work in designing characters as well as the tools that can bring them to life. Watch how he combines Firefly with Adobe Character Animator to create & animate a stylish tiger:

Adobe Firefly’s text to image feature lets you generate imaginative characters and assets with AI. But what if you want to turn them into animated characters with performance capture and control over elements like arm movements, pupils, talking, and more? In this tutorial, we’ll walk through the process of taking a static Adobe Firefly character and turning it into an animated puppet using Adobe Photoshop or Illustrator plus Character Animator.

“How Adobe is managing the AI copyright dilemma, with general counsel Dana Rao”

Honestly, if you asked, “Hey, wanna spend an hour+ listening to current and former intellectual property attorneys talking about EU antitrust regulation, ethical data sourcing, and digital provenance,” I might say, “Ehmm, I’m good!”—but Nilay Patel & Dana Rao make it work.

I found the conversation surprisingly engrossing & fast-moving, and I was really happy to hear Dana (with whom I’ve gotten to work some regarding AI ethics) share thoughtful insights into how the company forms its perspectives & works to put its values into practice. I think you’ll enjoy it—perhaps more than you’d expect!

Deeply chill photography

(Cue Metallica’s Trapped Under Ice!)

Russell Brown & some of my old Photoshop teammates recently ventured into -40º (!!) weather in Canada, pushing themselves & their gear to the limits to witness & capture the Northern Lights:

Perhaps on future trips they can team up with these folks:

To film an ice hockey match from this new angle of action, Axis Communications used a discrete modular camera — commonly seen in ATM machines, onboard vehicles, and other small spaces where a tiny camera needs to fit — and froze it inside the ice.

Check out the results:

Behind—and under—the scenes:

Adobe’s hiring a prototyper to explore generative AI

We’re only just beginning to discover the experiential possibilities around generative creation, so I’m excited to see this rare gig open up:

You will build new and innovative user interactions and interfaces geared towards our customers unique needs, test and refine those interfaces in collaboration with academic research, user researchers, designers, artists and product teams.

Check out the listing for the full details.