Xuan Gong

Ph.D. student
Department of Computer Science and Engineering, University at Buffalo
Email: xuangong AT buffalo DOT edu

Google Scholar / LinkedIn / CV


I am a final-year Ph.D. student advised by Prof. David Doermann. I received my bachelor's and master's degrees from Beihang University.
I have worked as a Research Intern at Meta Reality Labs, OPPO US Research, and UII America.

Current Research

  • 3D vision: neural radiance fields, human mesh reconstruction, endoscopy scene reconstruction
  • Medical imaging: cancer prognosis, deformable registration, histopathology image synthesis
  • Federated learning: federated ensemble distillation

Selected Publications

    Progressive Multi-view Human Mesh Recovery with Self-Supervision
    Xuan Gong, Liangchen Song, Meng Zheng, Benjamin Planche, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu
    AAAI Conference on Artificial Intelligence (AAAI), 2023  (oral)

    We propose a novel simulation-based training pipeline for multi-view human mesh recovery, which (a) relies on intermediate 2D representations that are more robust to the synthetic-to-real domain gap; (b) leverages learnable calibration and triangulation to adapt to more diverse camera setups; and (c) progressively aggregates multi-view information in a canonical 3D space to remove ambiguities in 2D representations.

    Federated Learning with Privacy-Preserving Ensemble Attention Distillation
    Xuan Gong, Liangchen Song, Rishi Vedula, Abhishek Sharma, Meng Zheng, Benjamin Planche, Arun Innanje, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu
    IEEE Transactions on Medical Imaging (TMI), 2022

    We propose a privacy-preserving FL framework that leverages unlabeled public data for one-way offline knowledge distillation. The central model is learned from local knowledge via ensemble attention distillation. Like existing FL approaches, our technique uses decentralized and heterogeneous local data, but more importantly, it significantly reduces the risk of privacy leakage. Extensive experiments on image classification, segmentation, and reconstruction tasks show that our method achieves very competitive performance with more robust privacy preservation.

    Self-supervised Human Mesh Recovery with Cross-Representation Alignment
    Xuan Gong, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, David Doermann, Ziyan Wu
    European Conference on Computer Vision (ECCV), 2022

    We propose cross-representation alignment utilizing the complementary information from the robust but sparse representation (2D keypoints). Specifically, the alignment errors between the initial mesh estimation and both 2D representations are forwarded into the regressor and dynamically corrected in the following mesh regression. This adaptive cross-representation alignment explicitly learns from the deviations and captures complementary information: robustness from the sparse representation and richness from the dense representation.

    PREF: Predictability Regularized Neural Motion Fields
    Liangchen Song, Xuan Gong, Benjamin Planche, Meng Zheng, David Doermann, Junsong Yuan, Terrence Chen, Ziyan Wu
    European Conference on Computer Vision (ECCV), 2022 (oral)
     [project page]

    We leverage a neural motion field for estimating the motion of all points in a multiview setting. Modeling the motion from a dynamic scene with multiview data is challenging due to the ambiguities in points of similar color and points with time-varying color. We propose to regularize the estimated motion to be predictable. If the motion from previous frames is known, then the motion in the near future should be predictable. Therefore, we introduce a predictability regularization by first conditioning the estimated motion on latent embeddings, then by adopting a predictor network to enforce predictability on the embeddings.

    Preserving Privacy in Federated Learning with Ensemble Cross-Domain Knowledge Distillation
    Xuan Gong, Abhishek Sharma, Srikrishna Karanam, Ziyan Wu, Terrence Chen, David Doermann, Arun Innanje
    AAAI Conference on Artificial Intelligence (AAAI), 2022

    We propose a quantized and noisy ensemble of local predictions from completely trained local models for stronger privacy guarantees without sacrificing accuracy. Based on extensive experiments on classification and segmentation tasks, we show that our method outperforms baseline FL algorithms in both accuracy and data privacy preservation.

    Uncertainty Learning towards Unsupervised Deformable Medical Image Registration
    Xuan Gong, Luckyson Khaidem, Wentao Zhu, Baochang Zhang, David Doermann
    IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022

    We propose a predictive module that learns both the registration and its uncertainty, built on unsupervised learning-based registration (VoxelMorph). Our framework introduces uncertainty prediction based on empirical randomness and registration error. We systematically assess performance on two MRI datasets with different ensemble paradigms.

    Ensemble Attention Distillation for Privacy-Preserving Federated Learning
    Xuan Gong, Abhishek Sharma, Srikrishna Karanam, Ziyan Wu, Terrence Chen, David Doermann, Arun Innanje
    IEEE/CVF International Conference on Computer Vision (ICCV), 2021

    We propose a new distillation-based FL framework that preserves privacy by design while consuming substantially fewer network communication resources than current methods. Our framework engages in inter-node communication using only publicly available and approved datasets, thereby giving explicit privacy control to the user. To distill knowledge among the various local models, our framework involves a novel ensemble distillation algorithm that uses both final predictions and model attention.

    Style Consistent Image Generation for Nuclei Instance Segmentation
    Xuan Gong, Shuyan Chen, Baochang Zhang, David Doermann
    IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021

    We generate style-consistent histopathology images for nuclei instance segmentation. We set up an instance segmentation framework that integrates a generator and discriminator into the segmentation pipeline with adversarial training to generalize nuclei instances and texture patterns. A segmentation net detects and segments both real and synthetic nuclei and provides feedback so that the generator can synthesize images that boost segmentation performance.

    Deformable Gabor Feature Networks for Biomedical Image Classification
    Xuan Gong, Xin Xia, Wentao Zhu, Baochang Zhang, David Doermann, Li'an Zhuo
    IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021

    We revisit Gabor filters and introduce a deformable Gabor convolution (DGConv) to expand deep networks' interpretability and enable complex spatial variations. The features are learned at deformable sampling locations with adaptive Gabor convolutions to improve representativeness and robustness to complex objects. The DGConv replaces standard convolutional layers and is easily trained end-to-end, resulting in a deformable Gabor feature network (DGFN) with few additional parameters and minimal additional training cost.
