I am a DPhil student in the Department of Computer Science, University of Oxford, and a member of St Hugh's College. I am co-advised by Prof. Andrew Markham and Prof. Niki Trigoni.
Prior to Oxford, I worked as a research scientist at Malong Technologies and as a self-driving engineer at the Baidu Institute of Deep Learning (IDL). I received my B.Eng. from Wuhan University, China.
My research interests span the boundary between theory-driven and practice-driven research. Specifically, I am interested in audio signal processing, filter bank design, audio-visual cross-modal learning, and embodied robotics.
Drop me an email (yuhang.he[at]cs.ox.ac.uk) if you would like to get in touch. I write blog posts as part of my research notes; if you find them helpful, you are welcome to buy me a cup of coffee.
For the full publication list, please refer to Google Scholar or → Full list
SoundDoA: Learn Sound Source Direction of Arrival and Semantics from Sound Raw Waveforms
Yuhang He, Andrew Markham
Interspeech, 2022.
We propose a sound event direction-of-arrival (DoA) estimation framework with a novel filter bank that jointly learns representations of sound event semantics and spatial location.
SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform
Yuhang He, Niki Trigoni, Andrew Markham
International Conference on Machine Learning (ICML), 2021.
We propose a novel framework for detecting polyphonic and moving sound events. We also propose object-based evaluation metrics that assess performance more objectively.
I am always happy to chat with people who are interested in my work. If you would like to chat, check the office hours below, which I keep updated, to book a time slot.