Special Issue on SPIC Deep Image/Video Feature Engineering for Human-Computer Interaction
• Major category: Engineering and Technology - Tier 3
• Subcategory: Engineering, Electrical and Electronic - Tier 3
With desktop PCs and mobile devices now widespread, effective and natural interaction between humans and machines is becoming indispensable. In practice, users tend to interact with computers face-to-face, much as they communicate with their family members, friends, and clients. Users want to communicate in a multimodal manner, i.e., with eye contact, gesture, body language, speech, and facial expressions contributing collaboratively. Human–computer interaction (HCI) focuses on designing the interfaces and technology that effectively link users and computers. HCI designers investigate the ways in which humans interact with computers and, on that basis, employ state-of-the-art technologies that allow humans to interact with computers in a convenient and natural way. As an interdisciplinary field, HCI is related to image/video modeling, face/expression understanding, deep learning, multimodal feature fusion, and realistic 3D rendering. With the advancement of deep learning, ideas in human-human communication are often represented by multiple deep features, e.g., deep poselets, deep gaze behavior, and deep hand motions. Based on this observation, there has been significant growth in deep-learning-based multimodal HCI techniques.
Despite the progress of deep HCI techniques, effectively engineering multiple deep features for HCI is still a challenge. Potential difficulties include: 1) how to seamlessly and collaboratively exploit the heterogeneous deep features in multimodal HCI modeling, 2) how to design deep architectures that optimally encode multiple visual features for HCI applications, 3) how to deeply encode emotional and cognitive features into current HCI systems, and 4) how to intelligently alleviate the negative influence of contaminated or absent visual features among the multiple HCI-based features. Clearly, when HCI meets multimodal deep feature learning, many interesting issues and challenges arise. We expect new technologies, mathematical formulations, datasets, and evaluation benchmarks for multimodal HCI.
This special issue serves as a forum to bring together active researchers from both industry and academia to exchange their opinions and experiences in multimodal HCI techniques. We solicit original contributions of three kinds: 1) presenting state-of-the-art theories, technologies, and novel applications of deep feature learning for HCI; 2) surveying the recent progress in these topics; and 3) releasing benchmark text/image/video datasets for evaluating deep HCI techniques. This special issue targets researchers and practitioners from both industry and academia.
The topics of interest include (but are not limited to):
Signal processing and human-computer interaction (HCI);
Image/video modeling and multimodal interaction;
Deep visual learning architecture for HCI and affective computing;
HCI-based healthcare and assistive technologies;
Advances in human visual communication dynamics;
Human-robot/agent visual multimodal interaction;
Multimodal visual feature fusion for HCI and affective computing;
Deep learning and other machine learning methods for multimodal HCI;
Multimodal dialogue modeling techniques;
Gaze behaviors and analysis in HCI systems;
HCI system components and multimodal media platforms;
Visual behavior modeling in socially interactive contexts;
Virtual/augmented reality and multimodal interaction;
Text, image, and video information fusion and representation in HCI;
Customized content (text, image, video) generation in HCI;
Harvesting user-related visual representations from MHCI systems;
Knowledge-graph-based automatic annotation, conversation, and summarization.