Warning: plot spoilers ahead for Avengers: Infinity War.
Despite the title of Avengers: Infinity War, the lead character isn’t Iron Man, Captain America, or any of the other Avengers who protect the world from evil. From beginning to end, Joe and Anthony Russo’s blockbuster is about the eight-foot intergalactic big bad Thanos, played by Josh Brolin. Thanos is every bit as dominating and destructive as fans of the series could hope, but he isn’t the one-dimensional villain he so easily could have been.
His maniacal plan to restore balance to the universe by arbitrarily killing half its inhabitants aside, Thanos is sometimes a strangely sympathetic character. He’s capable of feeling compassion for a young Gamora, even after he murders her mother and half her planet. He’s legitimately hurt when she turns her back on him, and he feels true regret when he sacrifices her to achieve his goals.
While Brolin’s performance drives the character, none of those moments would have been possible without the special effects wizardry of Digital Domain. The visual effects company, founded by James Cameron and the late Stan Winston 25 years ago, was tasked with bringing Thanos to life, and on the biggest stage possible. I jumped on the phone with visual effects supervisor Kelly Port to talk about the scope of Infinity War, the company’s complex motion-capture system, and how machine learning played a crucial role in turning Josh Brolin into a mad Titan.
This interview has been edited for clarity and brevity.
Several different visual effects houses worked on various aspects of Infinity War at once. How does something this large get broken up logistically?
We were reflecting back on Titanic days, the late ’90s, when a big effects film would be considered around 300 shots, whereas this film has over 3,000 visual effects shots. Pretty much every single shot in the film, to some extent, has been touched by visual effects, whether it’s a simple blue screen, a cosmetic enhancement, or a wire removal, which would be on the simpler end of things, all the way to a full CG shot that’s got multiple characters, huge crowds and environments, and what have you. So it runs the spectrum of complexity. But over time, the definition of a big visual effects film has changed. Obviously, this was about as big as it gets.
On the Marvel side of things, you have Dan DeLeeuw and Jennifer Underdahl, the visual effects supervisor and the visual effects producer for Marvel. They break down the script, and then eventually they have previs to work with, so they have a better sense of what’s needed in a particular shot because it’s cut together in sequence. And then eventually that makes its way into a shooting plan when we get to live-action shooting. At that point, they make a decision about how they want to distribute the work, and I think in general, they didn’t want to put all their eggs in one basket. I think there are probably at least seven major visual effects studios, including Digital Domain, involved with the work.
What areas did you and Digital Domain focus on?
We ended up doing Thanos primarily, and Weta also did Thanos, but just for one sequence, on Titan. So we ended up being responsible for a lot of the really heavy, depressing scenes, as you can imagine, having seen the film. It was often a running joke: “Hey, give us some Baby Groot or something, because we’re so depressed over here.” We were engaged about three or four months before the live-action shooting, along with Weta, to start doing some tests for Thanos, and that consisted of having Josh Brolin meet with the Russo brothers and discuss the character casually, and then he would go in and out of character as they were discussing it.
He was playing it very casually and normally, just throwing ideas around. But it was really a way to test the system and make sure the [effects design] was sound, because there’s so much pressure riding on this. Thanos, as a character, has well over 40 minutes of screen time. He is the main character of the film, and if he wasn’t successful, the film would not be as successful. I think it really was critical that Thanos not only looked photorealistic as much as technically and artistically possible, but even more important, it was conveyed to us from Marvel and from Dan that Josh Brolin’s performance in all its subtleties really needed to come through as much as possible as well.
When we presented that test early on, literally the first day of shooting, it was an interesting presentation. Because we had Josh Brolin in there, we had [Marvel Studios president] Kevin Feige and all the Marvel executives in there, we had the Russo brothers. And me and my digital effects supervisor hiding in the corner, trying not to make too much of our presence known, and just waiting as Josh Brolin was checking this out. Everyone else was excited about it, but it was really important to get Josh’s feedback on it as well, and he really loved it. I think what was really great about his reaction was his sense that he did not, as an actor, need to overplay the acting. When we did this test, he was doing it as a pretty casual conversation. He wasn’t trying to push it over the top for technical reasons, like, “Maybe if I push it over the top, it won’t get filtered as much, and what I’m trying to do will come through.”
But what he saw in the test was that even in a very casual performance, a lot of subtle facial details were coming through. So moving forward, he was able to just play it as subtle as he wanted to, or intense, or whatever. It was all not getting filtered out [in the motion-capture process]. I think there was a huge sigh of relief in the room, that in fact, we could pull this off, that this was gonna work. And obviously since that time — since the shooting, and then a year and a half, two years later, when we finally finished the film — a lot of that was improved greatly. We were really happy with the results and worked very hard to make that happen technically and visually.
Thanos has been around for decades in the comics and has already appeared in the movies. How did his design change this time around?
We certainly started with the historical design, then we got an update as a digital sculpt from the Marvel Art Department. Then we took it from there, where we would slightly tweak the proportions of the eyes, and the relationship of the eyes to the mouth, for example, slightly toward Brolin. There’s a balance to be had between Brolin and the historical Thanos character, for sure, and we tried to find that balance, but we did introduce more of Brolin I think than in previous Thanoses in the films. I think, all in all, that helps in the overall performance matching, and getting a little closer to feeling what Brolin was doing with the character as well. Then, of course, the details of the costume and the textural detail of the skin, these are all things that took months, and maybe even close to a year, of going back and forth between Marvel and our teams.
What was the motion-capture process like on this film?
What’s really nice about this [project] is that when we do motion-capture, historically, you typically have a “capture volume,” an empty space with a bunch of cameras surrounding you. There aren’t really many set pieces or anything. In this case, they built the cameras that capture the motion of the body in and around the set pieces built by [production designer] Charlie Wood and his team. And that enabled the actors to actually be on set interacting with each other, and all the while, you have as many characters as you want. I think we had 10-plus characters on a scene sometimes.
Josh would be wearing a motion-capture bodysuit, with tracking markers and things you’ve seen before. Then he’d also be wearing a helmet, with two vertically arrayed HD cameras running at 60 frames per second, and then he’d have the tracking dots on his face. Once the [movie] edit is put together, we would get time code for the body motion-capture, we’d get time code for the facial capture, and then we’d get, of course, the associated imagery that goes with that: the set, live action, clean plates, reference passes, all sorts of things that will help us ultimately do a better job later.
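To make the time-code matching concrete, here is a minimal sketch of how an edit frame might be mapped to a helmet-camera frame. Only the 60 frames-per-second helmet rate comes from the interview. The 24 fps edit rate, the non-drop `HH:MM:SS:FF` time-code format, the shared time-code origin, and both function names are assumptions made for illustration, not a description of Digital Domain's tools.

```python
def timecode_to_frames(timecode, fps):
    """Convert a non-drop 'HH:MM:SS:FF' time code to a frame count at `fps`."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not 0 <= frames < fps:
        raise ValueError(f"frame field {frames} is out of range for {fps} fps")
    return (hours * 3600 + minutes * 60 + seconds) * fps + frames


def face_frame_for_edit_frame(edit_timecode, edit_fps=24, face_fps=60):
    """Return the helmet-camera frame nearest to a given edit time code.

    Assumes (hypothetically) that both streams count from the same
    time-code origin and that the edit runs at `edit_fps`.
    """
    edit_frames = timecode_to_frames(edit_timecode, edit_fps)
    return round(edit_frames * face_fps / edit_fps)


if __name__ == "__main__":
    # One and a half seconds into the edit.
    print(face_frame_for_edit_frame("00:00:01:12"))
```

Under these assumptions, the same lookup could be repeated for the body-capture stream at its own frame rate, so that face, body, and plate all refer to the same instant.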
Then we process the facial tracking. That’s done in a two-step process, where we have this proprietary tool we call Masquerade, which takes the relatively low-resolution facial geometry from the helmet cam. Then we have another step, where we work with Disney Research in a technology called Medusa, which gets really high-resolution imagery of face shapes of Brolin, which essentially get put into a reference library. And it’s not just “shape A” and “shape B.” It’s about 50 different shapes that we capture, but it’s also the transition from those shapes. So, what happens to the muscles, to the skin, to the face, to the bone, when you transition from shapes A to B, as well. So that’s all in this massive database. Then it all goes through a machine-learning algorithm that we’ve developed that says, “This is a low-resolution face, but we want the high-resolution face that essentially is equivalent.” [At that point, the system builds the equivalent face, using the Medusa imagery as the reference point.]
Then it runs through this automated process and gives you a result, and then we look at that result and say, “Is this spot-on?” And if it’s not spot-on, we give it a little correction, feed it back into the system. It now knows — it now has been trained, essentially — to know that this is a better result. As you do this hundreds of times, it learns from this process and effectively gets better over time. Over the course of the production, we were able to make fewer and fewer corrections to it.
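The low-resolution-to-high-resolution step and its correction loop can be illustrated with a toy example. The interview does not describe the actual algorithm, so this sketch swaps in a deliberately simple technique: inverse-distance-weighted nearest-neighbor lookup over a reference library, with artist corrections stored as new examples. The class name `FaceUpsampler`, its methods, and the tiny vectors are all hypothetical.

```python
import math


class FaceUpsampler:
    """Toy example-based mapper from low-res marker vectors to high-res shapes."""

    def __init__(self, k=2):
        self.k = k
        self.examples = []  # list of (low_res_vector, high_res_vector) pairs

    def add_example(self, low, high):
        """Store one reference pair, e.g. from a high-resolution scan session."""
        self.examples.append((list(low), list(high)))

    def predict(self, low):
        """Blend the k closest stored high-res shapes, weighted by 1/distance."""
        if not self.examples:
            raise ValueError("no reference examples stored")
        nearest = sorted(
            ((math.dist(low, ex_low), ex_high) for ex_low, ex_high in self.examples),
            key=lambda pair: pair[0],
        )[: self.k]
        if nearest[0][0] == 0.0:
            # Exact match in the library: return that shape unchanged.
            return list(nearest[0][1])
        weights = [1.0 / distance for distance, _ in nearest]
        total = sum(weights)
        size = len(nearest[0][1])
        return [
            sum(w * high[i] for w, (_, high) in zip(weights, nearest)) / total
            for i in range(size)
        ]

    def correct(self, low, corrected_high):
        """Feed an artist-corrected result back in as a new reference pair."""
        self.add_example(low, corrected_high)


if __name__ == "__main__":
    model = FaceUpsampler(k=2)
    model.add_example([0.0, 0.0], [0.0, 0.0, 0.0])
    model.add_example([1.0, 0.0], [1.0, 2.0, 3.0])
    print(model.predict([0.5, 0.0]))         # automated first pass
    model.correct([0.5, 0.0], [0.6, 1.0, 1.5])
    print(model.predict([0.5, 0.0]))         # now reflects the correction
```

The point of the sketch is the loop, not the model: each correction becomes training data, so similar inputs later in production need less manual adjustment.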
The next step is going from high-res actor [model] to high-res character — in this case, Thanos. And we effectively have the same process there, as well. We do an automated transfer. We look at the result; you look at Brolin side by side with Thanos. Is he conveying the same emotional expression? Is he conveying the same emotion with his face? And we make a subjective call. “Yeah, he is, this looks great.” Or, “You know what, there’s something that’s not quite right.” Maybe it’s an element of surprise, or maybe his brow needs to be raised a little more. So we just make a quick little tweak to that, feed it back into the system, and it learns from that. It’s training data, so the next time a similar expression comes up, it gets a little more accurate. That ultimately ends up going to the animation department, which combines that with the body capture.
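The actor-to-character transfer with its "quick little tweak" can be sketched in the same spirit. This example assumes, purely for illustration, that an expression is a set of named weights between 0 and 1 and that a correction is remembered as a per-shape offset. The interview does not say how the real system represents expressions or learns from tweaks, and the names `Retargeter`, `transfer`, `tweak`, and `brow_raise` are hypothetical.

```python
class Retargeter:
    """Toy actor-to-character expression transfer with learned offsets."""

    def __init__(self, shape_names):
        # One correction offset per expression shape, initially zero.
        self.offsets = {name: 0.0 for name in shape_names}

    def transfer(self, actor_weights):
        """Automated pass: copy actor weights, apply offsets, clamp to [0, 1]."""
        return {
            name: min(1.0, max(0.0, weight + self.offsets[name]))
            for name, weight in actor_weights.items()
        }

    def tweak(self, name, actor_weights, desired, rate=0.5):
        """Record a manual fix by moving the offset part way toward it."""
        current = self.transfer(actor_weights)[name]
        self.offsets[name] += rate * (desired - current)


if __name__ == "__main__":
    retargeter = Retargeter(["brow_raise", "jaw_open"])
    brolin = {"brow_raise": 0.4, "jaw_open": 0.2}
    print(retargeter.transfer(brolin))            # first automated result
    retargeter.tweak("brow_raise", brolin, 0.6)   # "brow needs to be raised"
    print(retargeter.transfer(brolin))            # brow now sits higher
```

A single global offset per shape is far cruder than a trained model, but it shows the idea that each subjective fix nudges future automated transfers.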
How long has Digital Domain been using machine learning?
Not that long. We’ve used it in a few projects now, and we introduced a paper at SIGGRAPH a while back. But this is really the first project where it’s been used full-on, I would say, so it’s relatively new technology.
A common weakness with CG characters is their eyes. In Infinity War, however, there are a lot of moments where Thanos’ eyes really drive home an emotional moment. What was the secret to your approach there?
The eyes are, as they say, the windows to the soul, and such a critical part of a facial expression and what emotion you’re conveying. But ironically, that’s the least fidelity we get out of that pipeline, because you can’t put tracking markers on people’s eyes. So in the Medusa scans, we get a really nice sense of the eye topology — meaning what’s happening to the eyelids, and when the eyelids are closed, or being squinted tightly, how the skin folds, and all that.
We take that information, make a production-model shape that represents our Thanos model, and then animation does a ton to make sure that these shapes are tied in. So there’s a visual reference of what’s happening with Brolin, and we have to match that. But having said all that, what we really focus on at Digital Domain is the eyes, and there’s a ton of work, and a huge amount of detail around that. Not only the geometry of the eyelids and the surrounding skin around the eyes, but the eyeball itself, the conjunctiva and these very thin translucent layers. There are multiple layers of tissue that are semi-translucent that go over the eye itself, and these are all things we have built into the system, so once they’re there, they’re there. But it’s a lot of work getting it to that point. A lot of research on eyes and how they move, how they look, and the biology of an eye as well. These all contribute to the overall end realism.
Were there any other particular problems that came up with this character that you needed to solve?
The photorealism of the character. That part of it is a whole host of different departments needing to work in concert. It’s not just one thing. If the lighting is not good, it won’t look photoreal. Or if the compositing is not good. It can have all those, but if the animation’s not good because he’s moving around in a weird way, it’s gonna bump people. So everybody has to work, and it all has to come together at the same time.