3DAvatarGAN: Bridging Domains for Personalized Editable Avatars (CVPR 2023)
Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin
Figure 1. Editable 3D avatars. We present 3DAvatarGAN, a 3D GAN able to produce and edit personalized 3D avatars from a single photograph (real or generated). Our method distills information from a 2D-GAN trained on artistic 2D datasets such as Caricatures, Pixar toons, Cartoons, and Comics, and requires no camera annotations.
is feasible with datasets containing objects with highly consistent geometry, enabling a 3D-GAN to learn a distribution of shapes and textures. In contrast, artistically stylized datasets [25, 65] have arbitrary exaggerations of both geometry and texture; for example, the nose, cheeks, and eyes can be arbitrarily drawn, depending on the style of the artist as well as on the features of the subject, see Fig. 1. Training a 3D-GAN on such data becomes problematic due to the challenge of learning such an arbitrary distribution of geometry and texture. In our experiments (Sec. 5.1), 3D-GANs [10] generate flat geometry and essentially become 2D-GANs. A natural question arises: can a 3D-GAN synthesize consistent novel views of images belonging to artistically stylized domains, such as the ones in Fig. 1?

In this work, we propose a domain-adaptation framework that allows us to answer the question positively. Specifically, we fine-tune a pre-trained 3D-GAN using a 2D-GAN trained on a target domain. Despite being well explored for 2D-GANs [25, 65], existing domain adaptation techniques are not directly applicable to 3D-GANs, due to the nature of 3D data and the characteristics of 3D generators.

The geometry and texture of stylized 2D datasets can be arbitrarily exaggerated depending on the context, artist, and production requirements. Due to this, no reliable way to estimate camera parameters for each image exists, whether using an off-the-shelf pose detector [72] or a manual labeling effort. To enable the training of 3D-GANs on such challenging datasets, we propose three contributions. (1) An optimization-based method to align distributions of camera parameters between domains. (2) Texture, depth, and geometry regularizations to avoid degenerate, flat solutions and ensure high visual quality; furthermore, we redesign the discriminator training to make it compatible with our task. We then propose (3) a Thin Plate Spline (TPS) 3D deformation module operating on a tri-plane representation to allow for certain large and sometimes extreme geometric deformations, which are so typical in artistic domains.

The proposed adaptation framework enables the training of 3D-GANs on complex and challenging artistic data. The previous success of domain adaptation in 2D-GANs unleashed a number of exciting applications in the content creation area [25, 65]. Given a single image, such methods first find a latent code corresponding to it using GAN inversion, followed by latent editing producing the desired effect in the image space. Compared to 2D-GANs, the latent space of 3D-GANs is more entangled, making it more challenging to link the latent spaces between domains and rendering the existing inversion and editing techniques not directly applicable. Hence, we take a step further and explore the use of our approach for 3D artistic avatar generation and editing. Our final contribution to enable such applications is (4) a new inversion method for coupled 3D-GANs.

In summary, the proposed domain-adaptation framework allows us to train 3D-GANs on challenging artistic datasets with exaggerated geometry and texture. We call our method 3DAvatarGAN as it, for the first time, offers generation, editing, and animation of personalized, stylized, artistic avatars obtained from a single image. Our results (see Sec. 5.2) show the high-quality 3D avatars possible with our method compared to naive fine-tuning.

2. Related Work

GANs and Semantic Image Editing. Generative Adversarial Networks (GANs) [19, 47] are one popular type of generative model, especially for smaller high-quality datasets such as FFHQ [32], AFHQ [14], and LSUN objects [67]. For these datasets, StyleGAN [28, 30, 32] can be considered the current state-of-the-art GAN [27, 28, 30, 32, 33]. The disentangled latent space learned by StyleGAN has been shown to exhibit semantic properties conducive to semantic image editing [1, 3, 16, 22, 36, 44, 51, 56, 62]. CLIP [46]-based image editing [2, 17, 44] and domain transfer [15, 70] are another set of works enabled by StyleGAN.

GAN Inversion. Algorithms to project existing images into a GAN latent space are a prerequisite for GAN-based image editing. There are mainly two types of methods to enable such a projection: optimization-based methods [1, 13, 57, 71] and encoder-based methods [5, 7, 48, 58, 69]. On top of both streams of methods, the generator weights can be further modified after obtaining initial inversion results [49].

Learning 3D-GANs with 2D Data. Previously, some approaches attempted to extract 3D structure from pre-trained 2D-GANs [42, 52]. Recently, inspired by Neural Radiance Fields (NeRF) [9, 37, 43, 68], novel GAN architectures have been proposed to combine implicit or explicit 3D representations with neural rendering techniques [11, 12, 20, 39-41, 50, 53, 55, 63, 64]. In our work, we build on EG3D [11], which has current state-of-the-art results for human faces trained on the FFHQ dataset.

Avatars and GANs. To generate new results in an artistic domain (e.g., anime or cartoons), a promising technique is to fine-tune an existing GAN pre-trained on photographs, e.g., [45, 54, 60]. Data augmentation and freezing the lower layers of the discriminator are useful tools when fine-tuning a 2D-GAN [28, 38]. One branch of methods [18, 44, 70] investigates domain adaptation when only a few examples or only text descriptions are available, while others focus on matching the distribution of artistic datasets with diverse shapes and styles; our work also falls in this domain. Among previous efforts, StyleCariGAN [25] proposes invertible modules in the generator to train and generate caricatures from real images. DualStyleGAN [65] learns two mapping networks in StyleGAN to control the style and structure of the new domain. Some works are trained on 3D data or require heavy labeling/engineering [21, 26, 66] and use 3D morphable models to map 2D images of
caricatures to 3D models. However, such models fail to model the hair, teeth, neck, and clothes, and suffer in texture quality. In this work, we are the first to tackle the problem of domain adaptation of 3D-GANs and to produce fully controllable 3D avatars. We employ 2D-to-3D domain adaptation and distillation and make use of synthetic 2D data from StyleCariGAN [25] and DualStyleGAN [65].

Figure 2. Comparison with naive fine-tuning. Comparison of generated 3D avatars with a naively fine-tuned generator Gbase (left sub-figures) versus our generator Gt (right sub-figures). The corresponding sub-figures show comparisons in terms of texture quality (top two rows) and geometry (bottom two rows). See Sec. 5.1 for details.

3. Domain Adaptation for 3D-GANs

The goal of domain adaptation for 3D-GANs is to adapt (both texture and geometry) to a particular style defined by a 2D dataset (Caricature, Anime, Pixar toons, Comic, and Cartoons [24, 25, 65] in our case). In contrast to 2D-StyleGAN-based fine-tuning methods that are conceptually simpler [29, 45], fine-tuning a 3D-GAN on 2D data introduces challenges in addition to domain differences, especially in maintaining the texture quality while preserving the geometry. Moreover, for these datasets, there is no explicit shape and camera information. We define the domain adaptation task as follows: given a prior 3D-GAN, i.e., EG3D (Gs) of the source domain (Ts), we aim to produce a 3D Avatar GAN (Gt) of the target domain (Tt) while maintaining the semantic, style, and geometric properties of Gs, and at the same time preserving the identity of the subject between the domains (Ts ↔ Tt). Refer to Fig. 4 in the supplementary for the pipeline figure. We represent G2D as a teacher 2D-GAN used for knowledge distillation, fine-tuned on the above datasets. Note that as Tt is not assumed to contain camera parameter annotations, the training scheme must suppress artifacts such as low-quality texture under different views and flat geometry (see Fig. 2). In the following, we discuss the details of our method.

3.1. How to align the cameras?

Selecting appropriate ranges for camera parameters is of paramount importance for high-fidelity geometry and texture detail. Typically, such parameters are empirically estimated, directly computed from the dataset using an off-the-shelf pose detector [10], or learned during training [8]. In the domains we aim to bridge, such as caricatures for which a 3D model may not even exist, directly estimating the camera distribution is problematic and, hence, is not assumed by our method. Instead, we find it essential to ensure that the camera parameter distribution is consistent across the source and target domains. For the target domain, we use StyleGAN2 trained on FFHQ, fine-tuned on artistic datasets [25, 65]. Assuming that the intrinsic parameters of all the cameras are the same, we aim to match the distribution of extrinsic camera parameters of Gs and G2D and train our final Gt using it (see the illustration in Fig. 2 of the supplementary materials). To this end, we define an optimization-based method to match the sought distributions. The first step is to identify a canonical pose image in G2D, where the yaw, pitch, and roll parameters are zero. According to Karras et al. [31], the image corresponding to the mean latent code satisfies this property. Let θ, φ be the camera Euler angles in a spherical coordinate system, r, c be the radius of the sphere and the camera look-at point, and M be a function that converts these parameters into the camera-to-world matrix. Let Is(w, θ, φ, c, r) = Gs(w, M(θ, φ, c, r)) and I2D(w) = G2D(w) represent an arbitrary image generated by Gs and G2D, respectively, given the w code variable. Let kd be the face key-points detected by the detector Kd [72]; then

(c′, r′) := arg min_(c,r) Lkd(Is(w′avg, 0, 0, c, r), I2D(wavg)),  (1)

where Lkd(I1, I2) = ∥kd(I1) − kd(I2)∥1, and wavg and w′avg are the mean w latent codes of G2D and Gs, respectively. In our results, r′ is determined to be 2.7 and c′ is approximately [0.0, 0.05, 0.17]. The next step is to determine a safe range of the θ and φ parameters. Following prior works, StyleFlow [3] and FreeStyleGAN [35] (see Fig. 5 of the paper), we set these parameters as θ′ ∈ [−0.45, 0.45] and φ′ ∈ [−0.35, 0.35] in radians.

3.2. What loss functions and regularizers to use?

Next, although the camera systems are aligned, the given dataset may not stem from a consistent 3D model, e.g., in the case of caricatures or cartoons. This entices the generator Gt to converge to an easier degenerate solution with flat geometry. Hence, to benefit from the geometric prior of Gs, another important step is to design the loss functions and regularizers for a selected set of parameters to update in Gt. Next, we discuss these design choices:
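Before detailing these choices, the camera-alignment step of Sec. 3.1, Eq. (1), can be summarized in a short, hedged sketch. The generators Gs and G2D, the keypoint detector kd, and the camera-matrix builder M are assumed interfaces here, not the released 3DAvatarGAN implementation.

```python
# Hedged sketch of the camera-alignment optimization in Eq. (1).
import torch

def align_cameras(G_s, G_2d, M, kd, w_avg_s, w_avg_2d, steps=500, lr=1e-2):
    """Optimize the look-at point c and radius r so that face keypoints of the
    canonical-pose render of G_s match those of the mean-latent image of G_2d."""
    target = kd(G_2d(w_avg_2d)).detach()          # kd(I_2D(w_avg))
    c = torch.zeros(3, requires_grad=True)        # camera look-at point
    r = torch.tensor(2.0, requires_grad=True)     # sphere radius (initial guess)
    opt = torch.optim.Adam([c, r], lr=lr)
    for _ in range(steps):
        cam = M(theta=0.0, phi=0.0, c=c, r=r)     # canonical yaw/pitch/roll = 0
        pred = kd(G_s(w_avg_s, cam))              # kd(I_s(w'_avg, 0, 0, c, r))
        loss = (pred - target).abs().sum()        # L1 keypoint loss L_kd
        opt.zero_grad()
        loss.backward()
        opt.step()
    return c.detach(), r.detach()
```

In the paper's setting, this optimization ends up around r′ ≈ 2.7 and c′ ≈ [0.0, 0.05, 0.17] (Sec. 3.1).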
Figure 3. Domain adaptation. Domain adaptation results of images from source domain Ts (top row in each sub-figure) to target domain
Tt . Rows two to five show corresponding 3D avatar results from different viewpoints.
Loss Functions. To ensure texture quality and diversity, we resort to the adversarial loss used to fine-tune GANs as our main loss function. We use the standard non-saturating loss to train the generator and discriminator networks used in EG3D [11]. We also perform lazy density regularization to ensure consistency of the density values in the final fine-tuned model Gt.

Texture Regularization. Since the texture can be entangled with the geometry information, determining which layers to update is important. To make use of the fine-style information encoded in later layers, it is essential to update the tRGB layer parameters (outputting tri-plane features) before the neural rendering stage. tRGB are convolutional layers that transform feature maps to 3 channels at each resolution (96 channels in the tri-planes). Moreover, since the network has to adapt to the color distribution of Tt, it is essential to update the decoder (MLP layers) of the neural rendering pipeline as well. Given the EG3D architecture, we also update the super-resolution layer parameters to ensure the coherency between the low-resolution and high-resolution outputs seen by the discriminator D.

Geometry Regularization. In order to allow the network to learn the structure distribution of Tt and at the same time ensure properties of the W and S latent spaces are preserved, we update the earlier layers with regularization. This also encourages the latent spaces of Ts and Tt to be easily linked. Essentially, we update the deviation parameter ∆s from the s activations of the S space [62]. The s activations are predicted by A(w), where A is the learned affine function in EG3D. The s activations scale the kernels of a particular layer. In order to preserve the identity as well as the geometry, such that the optimization of ∆s does not deviate too far away from the original domain Ts, we introduce a regularizer given by

R(∆s) := ∥∆s∥1.  (2)

Note that we apply the R(∆s) regularization in a lazy manner, i.e., with density regularization. Interestingly, after training, we can interpolate between the s and s + ∆s parameters to interpolate between the geometries of samples in Ts and Tt (see Fig. 5).

Depth Regularization. Next, we observe that even though the above design choice produces better geometry for Tt, some samples from Gt can still lead to flatter geometry, and it is hard to detect these cases. We found that the problem is related to the relative depth of the background to the foreground. To circumvent this problem, we use an additional regularization where we encourage the average background depth of Gt to be similar to Gs. Let Sb be a face background segmentation network [34]. We first compute the average background depth of the samples given by Gs. This average depth is given by

ad := (1/M) Σ_{n=1}^{M} (1/Nn) ∥Dn ⊙ Sb(In)∥²F.  (3)
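As a concrete illustration of Eq. (3), here is a hedged sketch of computing the background-depth statistic ad over generator samples; the sampling function and the background segmenter Sb are assumed interfaces, not the paper's released code. The resulting statistic is reused by the depth regularizer defined next.

```python
# Hedged sketch of the background-depth statistic a_d in Eq. (3).
import torch

@torch.no_grad()
def average_background_depth(sample_image_and_depth, S_b, num_samples=64):
    """a_d = (1/M) * sum_n (1/N_n) * ||D_n (Hadamard) S_b(I_n)||_F^2 over M samples."""
    total = 0.0
    for _ in range(num_samples):
        image, depth = sample_image_and_depth()   # I_n, D_n rendered by G_s
        mask = S_b(image)                         # S_b(I_n): 1 on background pixels
        n_bg = mask.sum().clamp(min=1)            # N_n, number of background pixels
        total = total + (depth * mask).pow(2).sum() / n_bg
    return total / num_samples                    # a_d, compared against G_t samples in Eq. (4)
```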
Here, Dn is the depth map of the image In sampled from Gs, ⊙ represents the Hadamard product, M is the number of sampled images, and Nn is the number of background pixels in In. Finally, the regularization is defined as:

R(D) := ∥ad · J − (Dt ⊙ Sb(It))∥F,  (4)

where Dt is the depth map of the image It sampled from Gt and J is the matrix of ones having the same spatial dimensions as Dt.

Figure 4. 3D avatars from real images. Projection of real images on the 3D avatar generators.

3.3. What discriminator to use?

Given that the data in Ts and Tt is not paired and Tt is not assumed to contain camera parameter annotations, the choice of the discriminator (D) used for this task is also a critical design choice. Essentially, we use the unconditional version of the dual discriminator proposed in EG3D, and hence, we do not condition the discriminator on the camera information. As a result, during training, Gt generates images with arbitrary pose using M(θ′, φ′, c′, r′), and the discriminator discriminates these images against arbitrary images from Tt. We train the discriminator from scratch and, in order to adapt Ts → Tt, we use the StyleGAN-ADA [28] training scheme and R1 regularization.

3.4. How to incorporate larger geometric deformations between domains?

While the regularizers are used to limit the geometric changes when adapting from Ts to Tt, modeling large geometric deformations, e.g., in the caricature dataset, is another challenge. One choice to edit the geometry is to use the properties of the tri-plane features learned by EG3D. We start out by analyzing these three planes in Gs. We observe that the frontal plane encodes most of the information required to render the final image. To quantify this, we sample images and depth maps from Gs and swap the front and the other planes between two random images. Then we compare the difference in RGB values of the images and the Chamfer distance of the depth maps. When swapping the frontal tri-planes, the final images are completely swapped, and the Chamfer distance changes by 80-90%, matching the swapped image's depth map. In the case of the other two planes, the RGB image is not much affected and the Chamfer distance of the depth maps changes by only 20-30% in most cases.

Given this analysis, we focus on manipulating the 2D front-plane features to learn additional deformations or exaggerations. We learn a TPS (Thin Plate Spline) [61] network on top of the front plane. Our TPS network is conditioned both on the front-plane features as well as the W space to enable multiple transformations. The architecture of the module is similar to the standard StyleGAN2 layer, with an MLP appended at the end to predict the control points that transform the features. Hence, as a byproduct, we also enable 3D-geometry editing guided by the learned latent space. We train this module separately after Gt has been trained. We find that joint training is unstable due to exploding gradients arising from the large domain gap between Ts and Tt in the initial stages. Formally, we define this transformation as:

T(w, f) := ∆c,  (5)

where w is the latent code, f is the front plane, and c are the control points.

Let cI be the initial control points producing an identity transformation, (c1, c2) be the control points corresponding to front planes (f1, f2) sampled using W codes (w1, w2), respectively, and (c′1, c′2) be the points with (w1, w2) swapped in the TPS module. To regularize and encourage the module to learn different deformations, we have

R(T1) := α Σ_{n=1}^{2} ∥cI − cn∥1 − β ∥c1 − c2∥1 − σ ∥c′1 − c′2∥1.  (6)

We use the initial control point regularization to prevent large deviations in the control points, which would otherwise explode. Additionally, to learn extreme exaggerations in Tt and, 'in expectation', conform to the target distribution of the dataset, we add an additional loss term. Let S(I) be the soft-argmax output of the face segmentation network [34] given an image I; assuming that S generalizes to caricatures, then

R(T2) := ∥S(Gt(w)) − S(It)∥1.  (7)
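To make Eqs. (6)-(7) concrete, here is a minimal, hedged sketch of how such control-point regularizers could be computed. The TPS module T, the segmentation network S, the generator Gt, and the weights alpha, beta, sigma are assumed interfaces and illustrative values, not the paper's released implementation.

```python
# Hedged sketch of the TPS control-point regularizers in Eqs. (6)-(7).
import torch

def tps_regularizers(T, S, G_t, c_identity, w1, w2, f1, f2, target_image,
                     alpha=1.0, beta=0.1, sigma=0.1):
    c1 = c_identity + T(w1, f1)            # control points for (w1, f1)
    c2 = c_identity + T(w2, f2)            # control points for (w2, f2)
    c1_swap = c_identity + T(w2, f1)       # w codes swapped inside the module
    c2_swap = c_identity + T(w1, f2)
    # Eq. (6): stay close to the identity transform, while pushing different and
    # swapped latent codes toward different deformations.
    r_t1 = (alpha * ((c_identity - c1).abs().sum() + (c_identity - c2).abs().sum())
            - beta * (c1 - c2).abs().sum()
            - sigma * (c1_swap - c2_swap).abs().sum())
    # Eq. (7): match the face-segmentation layout of the generated avatar to a
    # target-domain example, so exaggerations follow the dataset distribution.
    r_t2 = (S(G_t(w1)) - S(target_image)).abs().sum()
    return r_t1, r_t2
```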
GANSpace [22], StyleSpace [62], etc., and geometric edits using TPS (Sec. 3.4) and ∆s interpolation (Sec. 3.2). To perform video editing, we design an encoder for EG3D based on e4e [58] to encode videos and transfer the edits from Gs to Gt based on the w codes [4, 6, 59]. We leave a more fine-grained approach for video processing as future work.
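A hedged sketch of this w-code-based transfer is shown below; the EG3D-based e4e-style encoder E, the generators, and the edit direction are assumed interfaces, and the specific call signatures are illustrative only.

```python
# Hedged sketch of transferring an edit from the source generator G_s to the
# adapted generator G_t through a shared w code, as done per video frame.
import torch

@torch.no_grad()
def transfer_edit(E, G_s, G_t, frame, camera, edit_direction, strength=1.0):
    w = E(frame)                              # invert the frame into the W space of G_s
    w_edited = w + strength * edit_direction  # apply a semantic edit direction
    source_view = G_s(w_edited, camera)       # edited result in the source domain
    avatar_view = G_t(w_edited, camera)       # same code rendered by the 3D avatar GAN
    return source_view, avatar_view
```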
Figure 6. Deformations using TPS. Geometric edits using our proposed TPS (Thin Plate Spline) module learned on the frontal tri-plane
features. Each sub-figure shows a 3D avatar and three examples of TPS deformations sampled from the learned 3D deformation space.
formation in the earlier layers instead of being camera view-dependent. To quantify this, since pose information may not be available for some domains (e.g., cartoons), we compute the R(T2) scores between corresponding images in the domains Ts (Gs) and Tt (Gt and Gbase). Note that these scores are computed without the TPS module. Our scores are lower in all three metrics, hence validating that our method avoids the degenerate solution and preserves the geometric distribution of the prior. For a discussion of the TPS module and ablations, refer to the supplementary materials.

Identity Preservation. The identity preservation score is another important evaluation to check the quality of latent space linking between Gs and Gt. In Table 3, we compute the attribute loss (BCE loss) between the domains Ts and Tt using the attribute classifiers [24, 25]. Note that our method is able to preserve the identity better across the domains.

Figure 7. Local edits. Local edits performed on the 3D avatars using the S space.

5.2. Qualitative Results

For qualitative results, we show the results of the domain adaptation, as well as the personalized edits (geometric and semantic), performed on the resultant 3D avatars. First, in order to show the quality of domain adaptation, identity preservation, and geometric consistency, in Fig. 3, we show results from Gs and corresponding results from the 3D avatar generator Gt trained on the Caricature, Pixar toon, Cartoon, and Comic domains. Next, in order to show that the method generalizes to real images, we use the method described in
Figure 8. 3D avatar animation. Animation of 3D avatars generated using a driving video encoded in source domain Ts and applied to
samples in target domain Tt . The top row shows the driving video and the subsequent rows show generated animations using a random
Caricature or Pixar toon. The head pose is changed in each frame of the generated animation to show 3D consistency.
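As a final illustration, the attribute-based identity-preservation score discussed in Sec. 5.1 can be sketched as a BCE loss between attribute predictions for corresponding samples of Gs and Gt; the attribute classifier and generator interfaces below are assumptions for illustration, not the paper's evaluation code.

```python
# Hedged sketch of the attribute-consistency (identity-preservation) score.
import torch
import torch.nn.functional as F

@torch.no_grad()
def attribute_consistency(classifier, G_s, G_t, ws, camera):
    losses = []
    for w in ws:
        p_src = torch.sigmoid(classifier(G_s(w, camera)))  # attribute probabilities in T_s
        p_tgt = torch.sigmoid(classifier(G_t(w, camera)))  # attribute probabilities in T_t
        losses.append(F.binary_cross_entropy(p_tgt, p_src))
    return torch.stack(losses).mean()                      # lower = identity better preserved
```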
References

[1] Rameen Abdal, Yipeng Qin, and Peter Wonka. Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4432-4441, Seoul, Korea, 2019. IEEE.
[2] Rameen Abdal, Peihao Zhu, John Femiani, Niloy Mitra, and Peter Wonka. CLIP2StyleGAN: Unsupervised extraction of StyleGAN edit directions. In ACM SIGGRAPH 2022 Conference Proceedings, SIGGRAPH '22, New York, NY, USA, 2022. Association for Computing Machinery.
[3] Rameen Abdal, Peihao Zhu, Niloy J. Mitra, and Peter Wonka. StyleFlow: Attribute-conditioned exploration of StyleGAN-generated images using conditional continuous normalizing flows. ACM Trans. Graph., 40(3), May 2021.
[4] Rameen Abdal, Peihao Zhu, Niloy J. Mitra, and Peter Wonka. Video2StyleGAN: Disentangling local and global variations in a video, 2022.
[5] Yuval Alaluf, Or Patashnik, and Daniel Cohen-Or. ReStyle: A residual-based StyleGAN encoder via iterative refinement. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021.
[6] Yuval Alaluf, Or Patashnik, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, and Daniel Cohen-Or. Third time's the charm? Image and video editing with StyleGAN3. CoRR, abs/2201.13433, 2022.
[7] Yuval Alaluf, Omer Tov, Ron Mokady, Rinon Gal, and Amit H. Bermano. HyperStyle: StyleGAN inversion with hypernetworks for real image editing. CoRR, abs/2111.15666, 2021.
[8] Anonymous. 3D generation on ImageNet. In Open Review, 2023.
[9] Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P. Srinivasan. Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5855-5864, 2021.
[10] Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, and Gordon Wetzstein. Efficient geometry-aware 3D generative adversarial networks. In arXiv, 2021.
[11] Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, and Gordon Wetzstein. Efficient geometry-aware 3D generative adversarial networks, 2021.
[12] Eric R. Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, and Gordon Wetzstein. pi-GAN: Periodic implicit generative adversarial networks for 3D-aware image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5799-5809, 2021.
[13] Yen-Chi Cheng, Chieh Hubert Lin, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, and Ming-Hsuan Yang. InOut: Diverse image outpainting via GAN inversion. In IEEE Conference on Computer Vision and Pattern Recognition, 2022.
[14] Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. StarGAN v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020.
[15] Min Jin Chong and David A. Forsyth. JoJoGAN: One shot face stylization. CoRR, abs/2112.11641, 2021.
[16] Min Jin Chong, Hsin-Ying Lee, and David Forsyth. StyleGAN of all trades: Image manipulation with only pretrained StyleGAN. arXiv preprint arXiv:2111.01619, 2021.
[17] Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik, and Daniel Cohen-Or. StyleGAN-NADA: CLIP-guided domain adaptation of image generators. arXiv preprint arXiv:2108.00946, 2021.
[18] Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik, and Daniel Cohen-Or. StyleGAN-NADA: CLIP-guided domain adaptation of image generators, 2021.
[19] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks, 2014.
[20] Jiatao Gu, Lingjie Liu, Peng Wang, and Christian Theobalt. StyleNeRF: A style-based 3D aware generator for high-resolution image synthesis. In International Conference on Learning Representations, 2022.
[21] Fangzhou Han, Shuquan Ye, Mingming He, Menglei Chai, and Jing Liao. Exemplar-based 3D portrait stylization. IEEE Transactions on Visualization and Computer Graphics, 2021.
[22] Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen, and Sylvain Paris. GANSpace: Discovering interpretable GAN controls. arXiv preprint arXiv:2004.02546, 2020.
[23] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30, 2017.
[24] Jing Huo, Wenbin Li, Yinghuan Shi, Yang Gao, and Hujun Yin. WebCaricature: A benchmark for caricature recognition. In British Machine Vision Conference, 2018.
[25] Wonjong Jang, Gwangjin Ju, Yucheol Jung, Jiaolong Yang, Xin Tong, and Seungyong Lee. StyleCariGAN: Caricature generation via StyleGAN feature map modulation. 40(4), 2021.
[26] Yucheol Jung, Wonjong Jang, Soongjin Kim, Jiaolong Yang, Xin Tong, and Seungyong Lee. Deep deformable 3D caricatures with learned shape control. In Special Interest Group on Computer Graphics and Interactive Techniques Conference Proceedings. ACM, August 2022.
[27] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation, 2017.
[28] Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Training generative adversarial networks with limited data. In Proc. NeurIPS, 2020.
[29] Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Training generative adversarial networks with limited data. arXiv preprint arXiv:2006.06676, 2020.
[30] Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Alias-free generative adversarial networks, 2021.
[31] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4401-4410, 2019.
[32] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(12):4217-4228, Dec. 2021.
[33] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Analyzing and improving the image quality of StyleGAN. In Proc. CVPR, 2020.
[34] Cheng-Han Lee, Ziwei Liu, Lingyun Wu, and Ping Luo. MaskGAN: Towards diverse and interactive facial image manipulation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[35] Thomas Leimkühler and George Drettakis. FreeStyleGAN: Free-view editable portrait rendering with the camera manifold. 40(6), 2021.
[36] Chieh Hubert Lin, Hsin-Ying Lee, Yen-Chi Cheng, Sergey Tulyakov, and Ming-Hsuan Yang. InfinityGAN: Towards infinite-pixel image synthesis. In International Conference on Learning Representations (ICLR), 2022.
[37] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision, pages 405-421. Springer, 2020.
[38] Sangwoo Mo, Minsu Cho, and Jinwoo Shin. Freeze the discriminator: A simple baseline for fine-tuning GANs, 2020.
[39] Michael Niemeyer and Andreas Geiger. CAMPARI: Camera-aware decomposed generative neural radiance fields. In 2021 International Conference on 3D Vision (3DV), pages 951-961. IEEE, 2021.
[40] Michael Niemeyer and Andreas Geiger. GIRAFFE: Representing scenes as compositional generative neural feature fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11453-11464, 2021.
[41] Roy Or-El, Xuan Luo, Mengyi Shan, Eli Shechtman, Jeong Joon Park, and Ira Kemelmacher-Shlizerman. StyleSDF: High-resolution 3D-consistent image and geometry generation. arXiv preprint arXiv:2112.11427, 2021.
[42] Xingang Pan, Bo Dai, Ziwei Liu, Chen Change Loy, and Ping Luo. Do 2D GANs know 3D shape? Unsupervised 3D shape reconstruction from 2D image GANs. arXiv preprint arXiv:2011.00844, 2020.
[43] Keunhong Park, Utkarsh Sinha, Jonathan T. Barron, Sofien Bouaziz, Dan B. Goldman, Steven M. Seitz, and Ricardo Martin-Brualla. Deformable neural radiance fields. arXiv preprint arXiv:2011.12948, 2020.
[44] Or Patashnik, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, and Dani Lischinski. StyleCLIP: Text-driven manipulation of StyleGAN imagery, 2021.
[45] Justin N. M. Pinkney and Doron Adler. Resolution dependent GAN interpolation for controllable image synthesis between domains, 2020.
[46] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. CoRR, abs/2103.00020, 2021.
[47] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks, 2015.
[48] Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, and Daniel Cohen-Or. Encoding in style: A StyleGAN encoder for image-to-image translation. arXiv preprint arXiv:2008.00951, 2020.
[49] Daniel Roich, Ron Mokady, Amit H. Bermano, and Daniel Cohen-Or. Pivotal tuning for latent-based editing of real images. arXiv preprint arXiv:2106.05744, 2021.
[50] Katja Schwarz, Yiyi Liao, Michael Niemeyer, and Andreas Geiger. GRAF: Generative radiance fields for 3D-aware image synthesis. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
[51] Yujun Shen, Ceyuan Yang, Xiaoou Tang, and Bolei Zhou. InterFaceGAN: Interpreting the disentangled face representation learned by GANs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
[52] Yichun Shi, Divyansh Aggarwal, and Anil K. Jain. Lifting 2D StyleGAN for 3D-aware face generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6258-6266, 2021.
[53] Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, and Sergey Tulyakov. 3D generation on ImageNet. In International Conference on Learning Representations (ICLR), 2023.
[54] Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chunpong Lai, Chuanxia Zheng, and Tat-Jen Cham. AgileGAN: Stylizing portraits by inversion-consistent transfer learning. ACM Trans. Graph., 40(4), July 2021.
[55] Jingxiang Sun, Xuan Wang, Yichun Shi, Lizhen Wang, Jue Wang, and Yebin Liu. IDE-3D: Interactive disentangled editing for high-resolution 3D-aware portrait synthesis, 2022.
[56] Ayush Tewari, Mohamed Elgharib, Gaurav Bharaj, Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, and Christian Theobalt. StyleRig: Rigging StyleGAN for 3D control over portrait images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6142-6151, 2020.
[57] Ayush Tewari, Mohamed Elgharib, Mallikarjun B R, Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zöllhofer, and Christian Theobalt. PIE: Portrait image embedding for semantic control. Volume 39, December 2020.
[58] Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, and Daniel Cohen-Or. Designing an encoder for StyleGAN image manipulation. arXiv preprint arXiv:2102.02766, 2021.
[59] Rotem Tzaban, Ron Mokady, Rinon Gal, Amit H. Bermano, and Daniel Cohen-Or. Stitch it in time: GAN-based facial editing of real videos. CoRR, abs/2201.08361, 2022.
[60] Can Wang, Menglei Chai, Mingming He, Dongdong Chen, and Jing Liao. Cross-domain and disentangled face manipulation with 3D guidance. IEEE Transactions on Visualization and Computer Graphics, 2022.
[61] WarBean. tps-stn-pytorch. https://github.com/WarBean/tps_stn_pytorch.
[62] Zongze Wu, Dani Lischinski, and Eli Shechtman. StyleSpace analysis: Disentangled controls for StyleGAN image generation. arXiv preprint arXiv:2011.12799, 2020.
[63] Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou, et al. DisCoScene: Spatially disentangled generative radiance fields for controllable 3D-aware scene synthesis. In IEEE Conference on Computer Vision and Pattern Recognition, 2023.
[64] Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, and Bolei Zhou. 3D-aware image synthesis via learning structural and textural representations. arXiv preprint arXiv:2112.10759, 2021.
[65] Shuai Yang, Liming Jiang, Ziwei Liu, and Chen Change Loy. Pastiche master: Exemplar-based high-resolution portrait style transfer. In CVPR, 2022.
[66] Zipeng Ye, Mengfei Xia, Yanan Sun, Ran Yi, Minjing Yu, Juyong Zhang, Yu-Kun Lai, and Yong-Jin Liu. 3D-CariGAN: An end-to-end solution to 3D caricature generation from normal face photos. IEEE Transactions on Visualization and Computer Graphics, pages 1-1, 2021.
[67] Fisher Yu, Yinda Zhang, Shuran Song, Ari Seff, and Jianxiong Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.
[68] Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. NeRF++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492, 2020.
[69] Jiapeng Zhu, Yujun Shen, Deli Zhao, and Bolei Zhou. In-domain GAN inversion for real image editing. In European Conference on Computer Vision, pages 592-608. Springer, 2020.
[70] Peihao Zhu, Rameen Abdal, John Femiani, and Peter Wonka. Mind the gap: Domain gap control for single shot domain adaptation for generative adversarial networks. In International Conference on Learning Representations, 2022.
[71] Peihao Zhu, Rameen Abdal, Yipeng Qin, John Femiani, and Peter Wonka. Improved StyleGAN embedding: Where are the good latents?, 2020.
[72] zllrunning. face-parsing.PyTorch. https://github.com/zllrunning/face-parsing.PyTorch.