
Hi there 👋

I'm Chunhui Zhang, a Ph.D. student in Computer Science at Dartmouth 🌲, working with 🌟 Professor Soroush Vosoughi. I also hold a research-based MSCS degree from Brandeis University, where I was honored with the GSAS Fellowship, and a Bachelor's degree in CS from Northeastern University, where I received the Outstanding Honor Thesis Award.


🔭 Research Focus and Key Contributions

My research focuses on advancing the intrinsic properties of deep learning across diverse modalities, with an emphasis on trustworthiness, scalability, and applicability to real-world challenges. Highlights of my work include:

  • Scaling Multimodal Theory-of-Mind with Weak-to-Strong Bayesian Reasoning
    Preprint | Code
    Authors: Chunhui Zhang, Sean Dae Houlihan, Kwonjoon Lee, Nakul Agarwal, Zhongyu Ouyang, Soroush Vosoughi, Shao-Yuan Lo

  • Pretrained Image-Text Models are Secretly Video Captioners
    Preprint | Code
    Authors: Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi

  • Working Memory Refines Essential Temporal Multimodal Sequences for Audio-Video-Language Modelling
    Preprint | Code
    Authors: Chunhui Zhang*, Xingjian Diao*, Weiyi Wu, Zhongyu Ouyang, Peijun Qing, Ming Cheng, Soroush Vosoughi, Jiang Gui

  • Working Memory Identifies Reasoning Limits in Language Models
    Conference: EMNLP 2024
    Authors: Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi

  • Learning Musical Representations for Music Performance Question Answering
    Conference: Findings of EMNLP 2024
    Authors: Xingjian Diao, Chunhui Zhang, Tingxuan Wu, Ming Cheng, Zhongyu Ouyang, Weiyi Wu, Soroush Vosoughi, Jiang Gui


💼 Internship Experience

Honda Research Institute USA

Research Intern (Jun. 2024 – Sept. 2024)

  • Project: Multimodal LLM Post-Training
  • Developed an LLM-powered reasoner capable of understanding human behaviors in multimodal environments, achieving a 4.6% improvement over state-of-the-art solutions.
  • The paper is under review, and the code has been released.
  • Host: Dr. Shao-Yuan Lo

🌱 Current Focus

I am currently exploring multimodal LLMs (language, vision, and audio), memory mechanisms, and reinforcement learning to push the boundaries of AGI. My recent work includes training recipes for large-scale models that ranked Top-2 on the Papers with Code video captioning leaderboard, demonstrating effective resource-allocation strategies for post-training.


📫 How to Reach Me


💬 Let's Connect

Feel free to reach out if you're interested in collaboration, career advice, or just a friendly chat about research and life!

Popular repositories

  1. Tensor-completion-via-capped-nuclear-norm (MATLAB)

     A new algorithm with a significant speed advantage: we extend the two-dimensional capped nuclear norm to the tensor setting and use it for image and video reconstruction (a minimal sketch of the capped nuclear norm follows this list).

  2. GAME (Python)

  3. emergent-degradation (Python)

  4. working-memory-limits (Python)

  5. Gumbel-darts-master (Python)

     DARTS with Gumbel-Softmax.

  6. mobile-vision-master (Python)

     Scaling up models.
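
For readers unfamiliar with the capped nuclear norm referenced in the first repository above, here is a minimal NumPy sketch of the underlying quantity: the sum of singular values, each capped at a threshold theta. This is only an illustration of the concept under that assumption; it is not the repository's MATLAB implementation, and the function and variable names are hypothetical.

```python
import numpy as np

def capped_nuclear_norm(X: np.ndarray, theta: float) -> float:
    """Sum of the singular values of X, each capped at theta.

    Generic illustration of the (matrix) capped nuclear norm used in
    low-rank completion; not the repository's MATLAB code.
    """
    singular_values = np.linalg.svd(X, compute_uv=False)
    return float(np.minimum(singular_values, theta).sum())

# Toy usage: a rank-1 matrix has a single large singular value,
# so the cap theta bounds its contribution to the norm.
X = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0])
print(capped_nuclear_norm(X, theta=5.0))  # ~5.0 once the cap kicks in
```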