My research focuses on spatial sensing intelligence — enabling intelligent systems to perceive, reconstruct, and reason about the physical world through privacy-preserving, low-cost sensing modalities. I develop sensing and learning frameworks that transform sparse physical signals into structured, human-centric spatial representations, with applications in human-aware buildings, ubiquitous interaction, and non-intrusive health monitoring.
Currently, I am working on the following topics:
Thermal Array–Based Spatial Sensing Intelligence: Developing physics-guided and learning-based frameworks that elevate low-resolution thermal array measurements into 3D spatial representations for human detection, ranging, reconstruction, interaction, and building-scale occupancy analytics, with strong privacy and cost advantages.
Physics-Grounded and Interpretable Sensing Models: Designing sensing-learning co-models that incorporate physical principles, geometric constraints, and biomechanical structure into deep models to improve robustness, interpretability, and cross-scenario generalization.
Multi-modal and Cross-Modal Sensing: Aligning and fusing heterogeneous sensing modalities, including thermal, RF, WiFi, acoustic, and inertial signals, to build unified representation spaces that expose richer structure than any single modality alone, supported by multimodal datasets and representation alignment methods.
Publications
Preprints
Zhang, X., Wang, Y., & Wu, C. (2025). Unlocking Interpretability for RF Sensing: A Complex-Valued White-Box Transformer. https://arxiv.org/abs/2507.21799
@misc{zhang2025unlockinginterpretabilityrfsensing,
title = {Unlocking Interpretability for RF Sensing: A Complex-Valued White-Box Transformer},
author = {Zhang, Xie and Wang, Yina and Wu, Chenshu},
year = {2025},
eprint = {2507.21799},
archiveprefix = {arXiv},
primaryclass = {cs.LG},
url = {https://arxiv.org/abs/2507.21799},
file = {https://arxiv.org/abs/2507.21799},
abbreviated = {Arxiv'25, under review},
thumbnail_path = {/assets/thumbnail_files/rfcrate.jpg}
}
The empirical success of deep learning has spurred its application to the radio-frequency (RF) domain, leading to significant advances in Deep Wireless Sensing (DWS). However, most existing DWS models function as black boxes with limited interpretability, which hampers their generalizability and raises concerns in security-sensitive physical applications. In this work, inspired by the remarkable advances of white-box transformers, we present RF-CRATE, the first mathematically interpretable deep network architecture for RF sensing, grounded in the principles of complex sparse rate reduction. To accommodate the unique RF signals, we conduct non-trivial theoretical derivations that extend the original real-valued white-box transformer to the complex domain. By leveraging the CR-Calculus framework, we successfully construct a fully complex-valued white-box transformer with theoretically derived self-attention and residual multi-layer perceptron modules. Furthermore, to improve the model’s ability to extract discriminative features from limited wireless data, we introduce Subspace Regularization, a novel regularization strategy that enhances feature diversity, resulting in an average performance improvement of 19.98% across multiple sensing tasks. We extensively evaluate RF-CRATE against seven baselines with multiple public and self-collected datasets involving different RF signals. The results show that RF-CRATE achieves performance on par with thoroughly engineered black-box models, while offering full mathematical interpretability. More importantly, by extending CRATE to the complex domain, RF-CRATE yields substantial improvements, achieving an average classification gain of 5.08% and reducing regression error by 10.34% across diverse sensing tasks compared to CRATE. RF-CRATE is fully open-sourced at: https://github.com/rfcrate/RF_CRATE.
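For intuition, here is a minimal, self-contained sketch of how self-attention can operate directly on complex-valued tokens in PyTorch. It is only an illustration of the complex-domain handling, not RF-CRATE's derived sparse-rate-reduction operator: the projections stay complex, and the Hermitian inner-product scores are reduced to real attention weights via their magnitude.

import torch
import torch.nn as nn

class ComplexSelfAttention(nn.Module):
    """Single-head self-attention over complex-valued tokens (illustrative only)."""
    def __init__(self, dim):
        super().__init__()
        scale = dim ** -0.5
        # Complex projection weights; real and imaginary parts are both learnable.
        self.w_qkv = nn.Parameter((torch.randn(3, dim, dim) + 1j * torch.randn(3, dim, dim)) * scale)
        self.w_out = nn.Parameter((torch.randn(dim, dim) + 1j * torch.randn(dim, dim)) * scale)
        self.scale = scale

    def forward(self, x):                                        # x: (batch, tokens, dim), complex
        q, k, v = x @ self.w_qkv[0], x @ self.w_qkv[1], x @ self.w_qkv[2]
        scores = q @ k.conj().transpose(-2, -1) * self.scale     # complex Hermitian scores
        weights = torch.softmax(scores.abs(), dim=-1)            # real-valued attention weights
        return (weights.to(v.dtype) @ v) @ self.w_out            # re-mix the complex values

# Toy usage on a batch of complex "CSI token" sequences.
x = torch.randn(2, 16, 64, dtype=torch.cfloat)
print(ComplexSelfAttention(64)(x).shape)                         # torch.Size([2, 16, 64])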
Conference proceedings
Liu, X., Zhang, X., & Wu, C. (2025). Privacy-Preserving Non-Contact Sleep Monitoring via Multimodal Thermal-Depth Sensing. Proceedings of the 2025 ACM International Workshop on Thermal Sensing and Computing, 1–6.
@inproceedings{liuPrivacyPreservingNonContactSleep2025a,
title = {Privacy-{{Preserving Non-Contact Sleep Monitoring}} via {{Multimodal Thermal-Depth Sensing}}},
booktitle = {Proceedings of the 2025 {{ACM International Workshop}} on {{Thermal Sensing}} and {{Computing}}},
author = {Liu, Xuan and Zhang, Xie and Wu, Chenshu},
year = {2025},
month = nov,
series = {{{HotSense}} '25},
pages = {1--6},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
doi = {10.1145/3737905.3769281},
urldate = {2025-11-09},
isbn = {979-8-4007-1982-0},
abbreviated = {HotSense @ MobiCom'25},
thumbnail_path = {/assets/thumbnail_files/thermalsleep.jpg}
}
Sleep posture and movement reflect sleep quality and health. Polysomnography (PSG), the clinical standard, is costly and impractical for long-term home use due to its facility needs and contact-based sensors. Thermal infrared array (IRA) sensors offer a low-cost, privacy-preserving alternative, but their low resolution, blanket occlusion, and heat residual limit accuracy. SomnoSense, our non-contact, privacy-preserving sleep monitoring system, integrates IRA thermal data with low-cost direct time-of-flight (dToF) sensor depth data to form a 3D body profile. A lightweight fusion module, combining a random forest classifier and a state machine, addresses blanket occlusion and heat residual errors. Deep neural networks then enable posture recognition and motion-level estimation. Evaluated on a dataset of 170 minutes of synchronized IRA and depth recordings from five subjects across six postures and three environments, SomnoSense achieves 98.7% detection accuracy, 96.1% posture recognition accuracy, and a mean absolute error value of 0.718 for standardized motion levels. A one-week, 45-hour case study further demonstrates its long-term monitoring capability, revealing distinct sleep posture and motion patterns. These results highlight SomnoSense’s potential for practical sleep monitoring in real-world home environments.
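As a rough illustration of the fusion stage described above, the sketch below trains a random forest on hypothetical per-frame thermal/depth features and smooths its per-frame predictions with a small state machine, so that short glitches (e.g., heat residuals on the blanket) do not flip the posture estimate. The features, labels, and thresholds are made up for the example and are not SomnoSense's actual configuration.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-frame features fused from the IRA and dToF frames,
# e.g. [mean body temperature, hot-pixel count, median depth, depth spread].
rng = np.random.default_rng(0)
X_train = rng.normal(size=(600, 4))
y_train = rng.integers(0, 3, size=600)          # toy posture labels

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def smooth_with_state_machine(prob_seq, enter_thresh=0.7, hold_frames=5):
    """Switch posture state only after several consecutive confident frames."""
    state, candidate, streak, out = 0, None, 0, []
    for p in prob_seq:
        pred = int(np.argmax(p))
        if pred != state and p[pred] >= enter_thresh:
            streak = streak + 1 if pred == candidate else 1
            candidate = pred
            if streak >= hold_frames:
                state, streak = pred, 0
        else:
            candidate, streak = None, 0
        out.append(state)
    return out

probs = clf.predict_proba(rng.normal(size=(100, 4)))
print(smooth_with_state_machine(probs)[:10])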
Chen, Y., Song, J., Zhang, X., Zhang, J., & Wu, C. (2025). ThermalEye: Fully Passive Eye Blink Detection on Smart Glasses via Low-Cost Thermal Sensing. Proceedings of the 2025 ACM International Workshop on Thermal Sensing and Computing, 34–39.
@inproceedings{chenThermalEyeFullyPassive2025a,
title = {{{ThermalEye}}: {{Fully Passive Eye Blink Detection}} on {{Smart Glasses}} via {{Low-Cost Thermal Sensing}}},
shorttitle = {{{ThermalEye}}},
booktitle = {Proceedings of the 2025 {{ACM International Workshop}} on {{Thermal Sensing}} and {{Computing}}},
author = {Chen, Yuhan and Song, Jingwei and Zhang, Xie and Zhang, Jianqi and Wu, Chenshu},
year = {2025},
month = nov,
series = {{{HotSense}} '25},
pages = {34--39},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
doi = {10.1145/3737905.3769280},
urldate = {2025-11-09},
isbn = {979-8-4007-1982-0},
abbreviated = {HotSense @ MobiCom'25},
thumbnail_path = {/assets/thumbnail_files/thermaleye.jpg}
}
Traditional blink detection systems often rely on visible-light cameras, which are sensitive to illumination and raise privacy concerns. To overcome these limitations, we present ThermalEye, a fully passive eye monitoring system integrated into smart glasses that performs robust blink detection using a low-resolution (12 × 16 pixel) infrared thermal array. Our approach features a framework co-designed with both signal processing techniques and a deep learning model to explicitly address the key challenges of this sensing modality: low signal-to-noise ratio (SNR), spatial heterogeneity, and inter-subject variability. Evaluations show that ThermalEye achieves F1-scores of 0.89 and 0.80 at the frame and event levels, highlighting its promise for fatigue monitoring and dry eye assessment.
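For a sense of the frame-level signal involved, the toy sketch below tracks the mean temperature of an assumed eye region in the 12 × 16 frames, removes slow drift, and flags short transients as blink events. The region indices, thresholds, and frame rate are illustrative assumptions, not ThermalEye's co-designed pipeline.

import numpy as np
from scipy.signal import find_peaks, medfilt

def blink_events(frames, fps=10, eye_rows=slice(4, 9), eye_cols=slice(5, 11)):
    """Toy detector: mean eye-region temperature, drift removal, transient peaks."""
    signal = frames[:, eye_rows, eye_cols].mean(axis=(1, 2))
    baseline = medfilt(signal, kernel_size=2 * fps + 1)      # slow drift (odd window)
    residual = signal - baseline
    peaks, _ = find_peaks(np.abs(residual),
                          height=3 * residual.std(),
                          distance=int(0.2 * fps))           # >= 0.2 s between blinks
    return peaks / fps                                       # event times in seconds

frames = np.random.normal(30.0, 0.1, size=(300, 12, 16)).astype(np.float32)
print(blink_events(frames))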
*Yuan, D., *Zhang, X., Hou, W., Lyu, S., Yu, Y., Yu, L. J.-T., Li, C., & Wu, C. (2025, October). OctoNet: A Large-Scale Multi-Modal Dataset for Human Activity Understanding Grounded in Motion-Captured 3D Pose Labels. The Thirty-Ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
*Equal contribution
@inproceedings{yuanOctoNetLargeScaleMultiModal2025,
title = {{{OctoNet}}: {{A Large-Scale Multi-Modal Dataset}} for {{Human Activity Understanding Grounded}} in {{Motion-Captured 3D Pose Labels}}},
shorttitle = {{{OctoNet}}},
booktitle = {The {{Thirty-ninth Annual Conference}} on {{Neural Information Processing Systems Datasets}} and {{Benchmarks Track}}},
author = {$^*$Yuan, D and $^*$Zhang, X and Hou, Weiying and Lyu, Sheng and Yu, Yuemin and Yu, Luca Jiang-Tao and Li, Chengxiao and Wu, Chenshu},
year = {2025},
month = oct,
urldate = {2025-11-09},
langid = {english},
note = {$^*$Equal contribution},
abbreviated = {NeurIPS'25},
code = {https://aiot-lab.github.io/OctoNet/},
file = {https://openreview.net/pdf?id=z3TftXOizf},
thumbnail_path = {/assets/thumbnail_files/octonet.jpg}
}
We introduce OctoNet, a large-scale, multi-modal, multi-view human activity dataset designed to advance human activity understanding and multi-modal learning. OctoNet comprises 12 heterogeneous modalities (including RGB, depth, thermal cameras, infrared arrays, audio, millimeter-wave radar, Wi-Fi, IMU, and more) recorded from 41 participants under multi-view sensor setups, yielding over 67.72M synchronized frames. The data encompass 62 daily activities spanning structured routines, freestyle behaviors, human-environment interaction, healthcare tasks, etc. Critically, all modalities are annotated by high-fidelity 3D pose labels captured via a professional motion-capture system, allowing precise alignment and rich supervision across sensors and views. OctoNet is one of the most comprehensive datasets of its kind, enabling a wide range of learning tasks such as human activity recognition, 3D pose estimation, multi-modal fusion, cross-modal supervision, and sensor foundation models. Extensive experiments have been conducted to demonstrate the sensing capacity using various baselines. OctoNet offers a unique and unified testbed for developing and benchmarking generalizable, robust models for human-centric perceptual AI.
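One recurring step when working with such multi-rate recordings is synchronizing each modality to the motion-capture pose clock. The sketch below aligns a modality's timestamps to pose timestamps by nearest neighbor and drops pairs whose gap is too large; the sampling rates and tolerance are arbitrary examples, and this is not OctoNet's actual loader API.

import numpy as np

def align_to_pose_clock(pose_ts, modality_ts, max_gap=0.05):
    """For each pose timestamp, pick the nearest modality frame within max_gap seconds."""
    idx = np.searchsorted(modality_ts, pose_ts)
    idx = np.clip(idx, 1, len(modality_ts) - 1)
    left, right = modality_ts[idx - 1], modality_ts[idx]
    nearest = np.where(np.abs(pose_ts - left) <= np.abs(pose_ts - right), idx - 1, idx)
    keep = np.abs(modality_ts[nearest] - pose_ts) <= max_gap
    return nearest[keep], keep

pose_ts = np.arange(0, 10, 1 / 120)          # 120 Hz mocap clock (example rate)
thermal_ts = np.arange(0.01, 10, 1 / 8.7)    # ~8.7 Hz thermal array (example rate)
idx, keep = align_to_pose_clock(pose_ts, thermal_ts)
print(idx[:5], keep.mean())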
Zhang, X., Li, C., & Wu, C. (2025). TAPOR: 3D Hand Pose Reconstruction with Fully Passive Thermal Sensing for around-Device Interactions. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 9.
@inproceedings{Zhangtapor2025,
title = {{{TAPOR}}: {{3D}} Hand Pose Reconstruction with Fully Passive Thermal Sensing for around-Device Interactions},
booktitle = {Proc. {{ACM Interact}}. {{Mob}}. {{Wearable Ubiquitous Technol}}.},
author = {Zhang, Xie and Li, Chengxiao and Wu, Chenshu},
year = {2025},
month = jun,
volume = {9},
doi = {10.1145/3729499},
articleno = {63},
code = {https://github.com/aiot-lab/TAPOR},
demo = {https://www.youtube.com/watch?v=dRiqxPZx4zk},
file = {https://arxiv.org/pdf/2501.17585},
abbreviated = {Ubicomp'25},
thumbnail_path = {/assets/thumbnail_files/tapor.jpg}
}
This paper presents the design and implementation of TAPOR, a privacy-preserving, non-contact, and fully passive sensing system for accurate and robust 3D hand pose reconstruction for around-device interaction using a single low-cost thermal array sensor. Thermal sensing using inexpensive and miniature thermal arrays emerges with an excellent utility-privacy balance, offering an imaging resolution significantly lower than cameras but far superior to RF signals like radar or WiFi. The design of TAPOR, however, is challenging, mainly because the captured temperature maps are low-resolution and textureless. To overcome the challenges, we investigate thermo-depth and thermo-pose properties, proposing a novel physics-inspired neural network that learns effective 3D spatial representations of potential hand poses. We then formulate the 3D pose reconstruction problem as a distinct retrieval task, enabling accurate hand pose determination from the input temperature map. To deploy TAPOR on IoT devices, we introduce an effective heterogeneous knowledge distillation method, reducing computation by 377×. TAPOR is fully implemented and tested in real-world scenarios, showing remarkable performance, supported by four gesture control and finger tracking case studies. We envision TAPOR to be a ubiquitous interface for around-device control and have open-sourced it at https://github.com/aiot-lab/TAPOR.
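The retrieval formulation can be pictured with a toy sketch like the one below: the temperature map is embedded, compared against a bank of pose-anchor embeddings, and the hand pose of the closest anchor is returned. The random embeddings and cosine similarity here are placeholders, not TAPOR's learned physics-inspired representation.

import numpy as np

def retrieve_pose(query_embedding, anchor_embeddings, anchor_poses):
    """Return the 21x3 hand pose whose anchor embedding is closest (cosine) to the query."""
    q = query_embedding / np.linalg.norm(query_embedding)
    a = anchor_embeddings / np.linalg.norm(anchor_embeddings, axis=1, keepdims=True)
    best = int(np.argmax(a @ q))
    return anchor_poses[best], best

rng = np.random.default_rng(1)
anchors = rng.normal(size=(500, 128))        # 500 candidate poses, 128-d embeddings
poses = rng.normal(size=(500, 21, 3))        # 21 hand joints in 3D per candidate
query = rng.normal(size=128)                 # embedding of one temperature map
pose, idx = retrieve_pose(query, anchors, poses)
print(idx, pose.shape)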
Li, C., Zhang, X., & Wu, C. (2025). Facial Expression Recognition with DToF Sensing. ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1–5.
@inproceedings{li2025facial,
title = {Facial Expression Recognition with DToF Sensing},
author = {Li, Chengxiao and Zhang, Xie and Wu, Chenshu},
booktitle = {ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages = {1--5},
year = {2025},
organization = {IEEE},
doi = {10.1109/ICASSP49660.2025.10887978},
url = {https://ieeexplore.ieee.org/abstract/document/10887978/},
abbreviated = {ICASSP'25},
file = {/papers/DToF_face.pdf},
thumbnail_path = {/assets/thumbnail_files/toface.jpg}
}
Facial Expression Recognition (FER) is crucial for understanding human emotions, with applications spanning from mental health assessment to marketing recommendation systems. However, existing camera-based methods raise privacy concerns, while RF-based approaches suffer from limited environmental generalizability and high cost. In this work, we propose ToFace, a FER system leveraging a low-cost ($4.8) Direct Time-of-Flight (DToF) sensor that has been available on commodity smartphones. This sensor provides an extremely low-resolution 8 × 8 depth map and a clear Field of View (FoV), significantly mitigating privacy concerns while avoiding the impact of ambient objects. Despite the benefits, the low-resolution depth map introduces significant challenges for precise expression recognition due to limited facial structure information. We first develop a physical model to extract additional spatial information from the intermediate sensor output, i.e., the transient histograms. We then propose a physics-integrated neural network to reconstruct a facial structure map comprising both depth and orientation for accurate expression recognition. We conduct real-world experiments with 12 users and compare our model with several baselines. The results demonstrate that ToFace achieves the highest recognition accuracy of 75%.
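For context, commodity dToF sensors expose a grid of transient histograms, and a crude per-zone depth can already be read off each histogram's peak bin via d = c * t / 2. The sketch below shows that baseline computation; the 250 ps bin width and toy histograms are assumptions for illustration, not ToFace's physical model.

import numpy as np

C = 3e8                                  # speed of light, m/s
BIN_WIDTH_S = 250e-12                    # assumed histogram bin width (250 ps)

def zone_depths(histograms):
    """Crude depth per 8x8 zone: one-way distance at the histogram's peak bin."""
    peak_bins = histograms.argmax(axis=-1)           # (8, 8)
    return C * peak_bins * BIN_WIDTH_S / 2.0         # metres

# Toy histograms: 8x8 zones, 128 time bins, with a return pulse near bin 16 (~0.6 m).
rng = np.random.default_rng(2)
hists = rng.poisson(2.0, size=(8, 8, 128)).astype(float)
hists[..., 16] += 50
print(zone_depths(hists))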
Zhang, X., & Wu, C. (2024). TADAR: Thermal array-based detection and ranging for privacy-preserving human sensing. Proceedings of the 25th International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing (MOBIHOC ’24), 1–10.
@inproceedings{Zhang2024TADAR,
address = {Athens, Greece},
title = {{TADAR}: {Thermal} array-based detection and ranging for privacy-preserving human sensing},
doi = {10.1145/3641512.3686357},
booktitle = {Proceedings of the 25th International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing ({MOBIHOC} '24)},
publisher = {ACM},
author = {Zhang, Xie and Wu, Chenshu},
year = {2024},
pages = {1--10},
code = {https://github.com/aiot-lab/TADAR},
demo = {https://youtu.be/0hGqzSYlh4o},
file = {https://arxiv.org/pdf/2409.17742},
abbreviated = {MobiHoc'24},
thumbnail_path = {/assets/thumbnail_files/tadar.jpg}
}
Human sensing has gained increasing attention in various applications. Among the available technologies, visual images offer high accuracy, while sensing on the RF spectrum preserves privacy, creating a conflict between imaging resolution and privacy preservation. In this paper, we explore thermal array sensors as an emerging modality that strikes an excellent resolution-privacy balance for ubiquitous sensing. To this end, we present TADAR, the first multi-user Thermal Array-based Detection and Ranging system that estimates the inherently missing range information, extending thermal array outputs from 2D thermal pixels to 3D depths and empowering them as a promising modality for ubiquitous privacy-preserving human sensing. We prototype TADAR using a single commodity thermal array sensor and conduct extensive experiments in different indoor environments. Our results show that TADAR achieves a mean F1 score of 88.8% for multi-user detection and a mean accuracy of 32.0 cm for multi-user ranging, which further improves to 20.1 cm for targets located within 3 m. We conduct two case studies on fall detection and occupancy estimation to showcase the potential applications of TADAR. We hope TADAR will inspire the vast community to explore new directions of thermal array sensing, beyond wireless and acoustic sensing. TADAR is open-sourced on GitHub: https://github.com/aiot-lab/TADAR.
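As a toy counterpart to the detection half of the problem, the sketch below segments warm blobs in a single thermal frame by thresholding against an ambient estimate; the resolution, offsets, and blob-size threshold are invented for the example, and TADAR itself goes much further by also estimating each target's range.

import numpy as np
from scipy import ndimage

def detect_humans(frame, ambient_offset=1.5, min_pixels=4):
    """Group pixels notably warmer than the ambient estimate into candidate person blobs."""
    ambient = np.median(frame)
    mask = frame > ambient + ambient_offset
    labels, n = ndimage.label(mask)
    blobs = [np.argwhere(labels == i + 1) for i in range(n)]
    return [b for b in blobs if len(b) >= min_pixels]

# Toy 24x32 frame (e.g. an MLX90640-like resolution) with two warm regions.
frame = np.full((24, 32), 24.0)
frame[5:9, 6:9] += 8.0
frame[14:19, 20:24] += 7.0
print(len(detect_humans(frame)))         # 2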
Zhang, X., Tang, C., An, Y., & Yin, K. (2021). WiFi-Based Multi-task Sensing. In T. Hara & H. Yamaguchi (Eds.), Mobile and Ubiquitous Systems: Computing, Networking and Services (pp. 169–189). Springer International Publishing.
@inproceedings{zhangWiFiBasedMultitaskSensing2022,
title = {{{WiFi-Based Multi-task Sensing}}},
booktitle = {Mobile and {{Ubiquitous Systems}}: {{Computing}}, {{Networking}} and {{Services}}},
author = {Zhang, Xie and Tang, Chengpei and An, Yasong and Yin, Kang},
editor = {Hara, Takahiro and Yamaguchi, Hirozumi},
year = {2021},
pages = {169--189},
publisher = {Springer International Publishing},
address = {Cham},
doi = {10.1007/978-3-030-94822-1_10},
isbn = {978-3-030-94822-1},
langid = {english},
file = {https://arxiv.org/pdf/2111.14619},
code = {https://github.com/Zhang-xie/Wimuse},
abbreviated = {MobiQuitous'21},
thumbnail_path = {/assets/thumbnail_files/wimuse.jpg}
}
WiFi-based sensing has attracted immense attention over recent years. The rationale is that the signal fluctuations caused by humans carry the information of human behavior, which can be extracted from the channel state information of WiFi. Still, the prior studies mainly focus on single-task sensing (STS), e.g., gesture recognition, indoor localization, and user identification. Since the fluctuations caused by gestures are highly coupled with body features and the user's location, we propose a WiFi-based multi-task sensing model (Wimuse) to perform gesture recognition, indoor localization, and user identification tasks simultaneously. However, these tasks have different difficulty levels (i.e., the imbalance issue) and need task-specific information (i.e., the discrepancy issue). To address these issues, the knowledge distillation technique and task-specific residual adaptors are adopted in Wimuse. We first train an STS model for each task. Then, to solve the imbalance issue, the extracted common feature in Wimuse is encouraged to get close to the counterpart features of the STS models. Further, for each task, a task-specific residual adaptor is applied to extract a task-specific compensation feature, which is fused with the common feature to address the discrepancy issue. We conduct comprehensive experiments on three public datasets, and the evaluation suggests that Wimuse achieves state-of-the-art performance with average accuracies of 85.20%, 98.39%, and 98.725% on the joint task of gesture recognition, indoor localization, and user identification, respectively.
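The shape of the training objective can be sketched as follows: a shared encoder, per-task residual adaptors and heads, and a distillation term that pulls the shared feature toward each frozen single-task teacher's feature. Dimensions, loss weights, and the plain MSE distillation are illustrative choices, not Wimuse's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    """Shared encoder + per-task residual adaptors and classification heads."""
    def __init__(self, in_dim=256, feat_dim=128, n_classes=(6, 5, 10)):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.adaptors = nn.ModuleList([nn.Linear(feat_dim, feat_dim) for _ in n_classes])
        self.heads = nn.ModuleList([nn.Linear(feat_dim, c) for c in n_classes])

    def forward(self, x):
        shared = self.encoder(x)
        feats = [shared + adaptor(shared) for adaptor in self.adaptors]   # residual compensation
        return shared, [head(f) for head, f in zip(self.heads, feats)]

def total_loss(shared, logits, labels, teacher_feats, distill_weight=0.5):
    """Per-task cross-entropy plus distillation of the shared feature toward the teachers."""
    ce = sum(F.cross_entropy(l, y) for l, y in zip(logits, labels))
    kd = sum(F.mse_loss(shared, t) for t in teacher_feats)
    return ce + distill_weight * kd

net = MultiTaskNet()
x = torch.randn(8, 256)
labels = [torch.randint(0, c, (8,)) for c in (6, 5, 10)]
teachers = [torch.randn(8, 128) for _ in range(3)]          # frozen single-task features (toy)
shared, logits = net(x)
print(total_loss(shared, logits, labels, teachers).item())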
Yin, K., Tang, C., Zhang, X., & Yao, H. (2021). Robust Human Activity Recognition System with Wi-Fi Using Handcraft Feature. 2021 IEEE Symposium on Computers and Communications (ISCC), 1–8.
@inproceedings{yinRobustHumanActivity2021,
title = {Robust {{Human Activity Recognition System}} with {{Wi-Fi Using Handcraft Feature}}},
booktitle = {2021 {{IEEE Symposium}} on {{Computers}} and {{Communications}} ({{ISCC}})},
author = {Yin, Kang and Tang, Chengpei and Zhang, Xie and Yao, Hele},
year = {2021},
pages = {1--8},
issn = {2642-7389},
doi = {10.1109/ISCC53001.2021.9631459},
abbreviated = {ISCC'21}
}
WiFi-based human activity recognition (HAR) systems have the drawback of poor adaptability to new domains. Numerous studies have proposed to solve this problem, but these methods have the limitation of needing new-domain data or fine-tuning the model. In this paper, we propose HARW, a cross-domain HAR system using Wi-Fi. Specifically, a novel domain-independent feature extraction algorithm is proposed based on the multiple signal classification (MUSIC) algorithm, which extracts three physical factors (i.e., time of flight, change rate of path length, and angle of arrival) simultaneously to construct the TCA feature. Then, a two-stage model is proposed to recognize activities based on TCA. The experimental results show that HARW can increase the average accuracy by 9% and the best accuracy can reach 60%, without new-domain data or fine-tuning the model, outperforming the method that only uses raw CSI data. In addition, HARW adopts only a pair of Wi-Fi devices.
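For reference, the angle-of-arrival factor can be estimated with the classic MUSIC algorithm on a uniform linear antenna array; the sketch below computes a MUSIC pseudo-spectrum from CSI snapshots. The array geometry, wavelength, and single-path toy example are assumptions for illustration, not HARW's joint extraction of all three TCA factors.

import numpy as np

def music_aoa_spectrum(csi, n_paths=1, wavelength=0.06, spacing=0.03,
                       angles=np.linspace(-90, 90, 181)):
    """MUSIC pseudo-spectrum over candidate AoAs; csi is (n_antennas, n_snapshots), complex."""
    n_ant = csi.shape[0]
    R = csi @ csi.conj().T / csi.shape[1]                # spatial covariance
    _, eigvecs = np.linalg.eigh(R)                       # eigenvalues in ascending order
    noise = eigvecs[:, : n_ant - n_paths]                # noise subspace
    k = np.arange(n_ant)[:, None]
    steer = np.exp(-2j * np.pi * spacing / wavelength * k * np.sin(np.deg2rad(angles)))
    denom = np.linalg.norm(noise.conj().T @ steer, axis=0) ** 2
    return angles, 1.0 / denom

# Toy example: one path arriving from ~30 degrees at a 3-antenna half-wavelength array.
rng = np.random.default_rng(3)
a = np.exp(-2j * np.pi * 0.5 * np.arange(3) * np.sin(np.deg2rad(30)))
snapshots = a[:, None] * (rng.normal(size=100) + 1j * rng.normal(size=100))
snapshots += 0.1 * (rng.normal(size=(3, 100)) + 1j * rng.normal(size=(3, 100)))
angles, spectrum = music_aoa_spectrum(snapshots)
print(angles[np.argmax(spectrum)])                       # close to 30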
Journal articles
Zhang, X., Tang, C., Yin, K., & Ni, Q. (2021). WiFi-based Cross-Domain Gesture Recognition via Modified Prototypical Networks. IEEE Internet of Things Journal, 1–1.
@article{zhangWiFibasedCrossDomainGesture2021,
title = {{{WiFi-based Cross-Domain Gesture Recognition}} via {{Modified Prototypical Networks}}},
author = {Zhang, Xie and Tang, Chengpei and Yin, Kang and Ni, Qingqian},
year = {2021},
journal = {IEEE Internet of Things Journal},
pages = {1--1},
doi = {10.1109/JIOT.2021.3114309},
file = {/papers/WiGr_zhang.pdf},
code = {https://github.com/Zhang-xie/WiGr},
abbreviated = {IoTJ'21},
thumbnail_path = {/assets/thumbnail_files/wigrIoT.jpg}
}
Numerous deep learning studies have achieved remarkable advances in WiFi-based human gesture recognition (HGR) using channel state information (CSI). However, since the CSI patterns of the same gesture change across domains (i.e., users, environments, locations, and orientations), recognition accuracy might degrade significantly when applying the trained model to new domains. To overcome this problem, we propose a WiFi-based cross-domain gesture recognition system (WiGr) which has a domain-transferable mapping to construct an embedding space where the representations of samples from the same class are clustered, and those from different classes are separated. The key insight of WiGr is using the similarity between the query sample representation and the class prototypes in the embedding space to perform the gesture classification, which can avoid the influence of the cross-domain CSI patterns change. Meanwhile, we present a dual-path prototypical network (Dual-Path PN) which consists of a deep feature extractor and a dual-path (i.e., Path-A and Path-B substructures) recognizer. The trained feature extractor can extract the gesture-related domain-independent features from CSI, namely, the domain-transferable mapping. In addition, WiGr implements the cross-domain HGR based on only a pair of WiFi devices without retraining in the new domain. We conduct comprehensive experiments on three data sets, one is built by ourselves and the others are public data sets. The evaluation suggests that WiGr achieves 86.8%–92.7% in-domain recognition accuracy and 83.5%–93% cross-domain accuracy under the four-shot condition.
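The prototype-based inference at the heart of this design is easy to sketch: class prototypes are the mean embeddings of the few support samples, and a query is assigned to the nearest prototype, so classification transfers to a new domain as long as the embedding does. The random embeddings below stand in for the output of the trained feature extractor.

import numpy as np

def prototypical_predict(support_emb, support_labels, query_emb):
    """Classify each query by its nearest class prototype (mean support embedding)."""
    classes = np.unique(support_labels)
    prototypes = np.stack([support_emb[support_labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(query_emb[:, None, :] - prototypes[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]

rng = np.random.default_rng(4)
support = rng.normal(size=(24, 64))          # 6 gesture classes x 4 shots, 64-d embeddings
labels = np.repeat(np.arange(6), 4)
queries = rng.normal(size=(10, 64))
print(prototypical_predict(support, labels, queries))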
Hu, P., Tang, C., Yin, K., & Zhang, X. (2021). WiGR: A Practical Wi-Fi-Based Gesture Recognition System with a Lightweight Few-Shot Network. Applied Sciences, 11(8), 3329.
@article{huWiGRPracticalWiFiBased2021,
title = {{{WiGR}}: {{A Practical Wi-Fi-Based Gesture Recognition System}} with a {{Lightweight Few-Shot Network}}},
shorttitle = {{{WiGR}}},
author = {Hu, Pengli and Tang, Chengpei and Yin, Kang and Zhang, Xie},
year = {2021},
journal = {Applied Sciences},
volume = {11},
number = {8},
pages = {3329},
publisher = {Multidisciplinary Digital Publishing Institute},
doi = {10.3390/app11083329},
urldate = {2021-05-26},
copyright = {http://creativecommons.org/licenses/by/3.0/},
langid = {english},
abbreviated = {Appl. Sci.'21},
file = {/papers/WiGr_hu.pdf},
thumbnail_path = {/assets/thumbnail_files/WiGR.jpg}
}
Wi-Fi sensing technology based on deep learning has contributed many breakthroughs in gesture recognition tasks. However, most methods concentrate on single-domain recognition with high computational complexity while rarely investigating cross-domain recognition with lightweight performance, which cannot meet the requirements of high recognition performance and low computational complexity in an actual gesture recognition system. Inspired by few-shot learning methods, we propose WiGR, a Wi-Fi-based gesture recognition system. The key structure of WiGR is a lightweight few-shot learning network that introduces some lightweight blocks to achieve lower computational complexity. Moreover, the network can learn a transferable similarity evaluation ability from the training set and apply the learned knowledge to the new domain to address domain shift problems. In addition, we built a channel state information (CSI)-Domain Adaptation (CSIDA) data set that includes CSI traces with various domain factors (i.e., environment, users, and locations) and conducted extensive experiments on two data sets (CSIDA and SignFi). The evaluation results show that WiGR can reach 87.8–94.8% cross-domain accuracy, and the parameters and the calculations are reduced by more than 50%. Extensive experiments demonstrate that WiGR can achieve excellent recognition performance using only a few samples and is thus a lightweight and practical gesture recognition system compared with state-of-the-art methods.
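As an example of the kind of lightweight block referred to here, a depthwise-separable convolution (a standard parameter-saving substitute for a full 3x3 convolution, not necessarily WiGR's exact block) can be sketched as follows; the CSI tensor layout in the toy input is an arbitrary choice.

import torch
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """Depthwise conv + pointwise conv + BN + ReLU: a common lightweight building block."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(2, 30, 64, 3)                # (batch, subcarriers, time, antennas), toy layout
block = DepthwiseSeparableBlock(30, 64)
print(block(x).shape)                        # torch.Size([2, 64, 64, 3])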
🌱 Reviewer, APSIPA Transactions on Signal and Information Processing;
IEEE Internet of Things Journal (IoTJ);
Proc. ACM Interact. Mob. Wearable Ubiquitous Technologies (IMWUT);
International Conference on Parallel and Distributed Systems (ICPADS’24);
IEEE Transactions on Mobile Computing (TMC);
🌱 Distinguished Reviewer, ACM Transactions on Internet of Things (TIOT);
🌾 COMP7310, Artificial Intelligence of Things, Spring 2026
🌾 COMP3270, Artificial Intelligence, Fall 2024
🌾 COMP3314, Machine Learning, Fall 2023
🌾 CS3230, Operating Systems, Fall 2022
Xie ZHANG
(Now) Ph.D. Computer Science, HKU
(2022) M.S. Pattern Recognition and Intelligent Systems, SYSU
(2019) B.S. Software Engineering, SYSU
News
November 8, 2025
The first ACM International Workshop on Thermal Sensing and Computing (HotSense'25) was successfully held at MobiCom'25.
October 31, 2025
Our paper OctoNet: A Large-Scale Multi-Modal Dataset for Human Activity Understanding Grounded in Motion-Captured 3D Pose Labels has been accepted to NeurIPS'25.
September 30, 2025
Our paper ThermalEye: Fully Passive Eye Blink Detection on Smart Glasses via Low-Cost Thermal Sensing has been accepted to HotSense @ MobiCom'25.