As technology pervades our lives through an increasingly rich ecosystem of digital devices, these devices can capture huge amounts of long-term personal data. A core theme of my research has been to create systems and interfaces that enable people to harness and control that data and its use. This talk will share key insights from a series of case studies from that work and plans to build upon them. The first set of case studies explored how to harness data from wearables, such as smartwatches, for personal informatics interfaces that help us gain insights about ourselves over the long term, for analysis of a large dataset (over 140,000 people), and for Virtual Reality games for exercise. The second set of case studies comes from formal education settings, where personal data interfaces called Open Learner Models (OLMs) can harness learning data. I will share key insights that have emerged for a research agenda: OLMs for life-wide learning; the nature of the different interfaces needed for fast versus slow, considered thinking; communicating uncertainty; scaffolding people to really learn about themselves from their multimodal data; and how these link to urgent challenges of education in an age of AI, fake news and truth decay.
The digital is not only the realm of the virtual, but also generates alternative embodiments. Being digital means the body inhabits a flattened ontology of interacting algorithms, machines, instruments, networks, ecologies and other bodies, organisms and objects. There is no privileging of the human; rather, the body becomes a component in an extended operational system of human metabolism, machine musculature and computational programming. What becomes apparent now is not the body's identity, but rather its connectivity, not its mobility or location but rather its interface. Performing increasingly remotely, presence becomes problematic. Presence is marked by a double absence. Or perhaps absence is marked by a double presence. We are neither all-here-now, nor there all-of-the-time, but partly here sometimes and partly somewhere else at other times. There is an increasing and seamless oscillation between offline and online operation. The increasing speed of oscillation blurs any meaningful distinction between the actual and the virtual. We now navigate from cosmological deep time, to physical nano scales, to virtual non-places. The body inhabits abstract realms of the highly hypothetical and of streaming subjectivity. A hyper-human that becomes less than the sum of its hyper-links and more than its attachments and implants. The body performs beyond the boundaries of its skin and beyond the local space it inhabits. Augmented and amplified, the body is now in excess of its biology with the proliferation of micro-scaled and biocompatible sensors, implants and computational systems. And with nano scaling, there is an urge to internalise technology to create a better surveillance and early-warning system for the body. The body has become a contemporary chimera of meat, metal and code. Subjectively, the body experiences itself as an extruded system, rather than an enclosed structure. The self becomes situated beyond the skin, and the body is emptied. But this emptiness is not an emptiness of lack but rather a radical emptiness through excess. An emptiness from the extrusion and extension of its capabilities, its augmented sensory antennae and its increasingly remote functioning. In this age of body hacking, gene mapping, prosthetic augmentation, organ swapping, face transplants and synthetic skin, what it means to be a body, what it means to be human and what generates aliveness and agency all become problematic. At a time when the individual body is threatened existentially by being fatally infected by biological viruses, the human species is confronted by the more pervasive and invasive ontological risk of infection by its techno-digital artifacts and entities. What it means to be human is perhaps not to remain human at all.
Wearable devices, including smartwatches, are increasingly popular among consumers due to their user-friendly services. However, transmitting sensitive data such as social media messages and payment QR codes via commonly used low-power Bluetooth exposes users to privacy breaches and financial losses. This study introduces TouchHBC, a secure and reliable communication scheme leveraging a smartwatch's built-in electrodes. The system establishes a touch-based human body communication channel utilizing a laptop's leakage current. As the transmitting device, the laptop modulates this current via the CPU. Simultaneously, the smartwatch, equipped with built-in electrodes, captures the current traversing the human body and decodes it. The modulation and decoding processes involve techniques such as amplitude modulation, spectral subtraction, channel estimation, and retransmission mechanisms. TouchHBC thus facilitates communication between laptops and smartwatches. Real-world tests demonstrate that our prototype achieves a throughput of 19.83 bps. Moreover, TouchHBC offers the potential for enhanced interaction, including improved gaming experiences through vibration feedback and secure touch login for smartwatch applications synchronized with a laptop.
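The following is a minimal sketch of the kind of amplitude-modulation-and-threshold-decoding scheme described above, not the TouchHBC implementation itself: the sample rate, bit rate, and noise model are assumptions chosen only to make the example runnable.

```python
# Illustrative sketch (not the TouchHBC implementation): on-off amplitude
# modulation of a current level, as a CPU-load modulator might produce, and a
# simple threshold decoder on the receiver (smartwatch electrode) side.
import numpy as np

FS = 1000          # sample rate in Hz (assumed)
BIT_RATE = 20      # bits per second, roughly in line with the reported throughput
SAMPLES_PER_BIT = FS // BIT_RATE

def modulate(bits):
    """Encode bits as high/low amplitude segments (on-off keying)."""
    return np.concatenate([np.full(SAMPLES_PER_BIT, 1.0 if b else 0.1) for b in bits])

def decode(signal, noise_floor=0.0):
    """Average each bit-length window and threshold to recover the bits."""
    bits = []
    for i in range(0, len(signal) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        window = signal[i:i + SAMPLES_PER_BIT] - noise_floor  # crude spectral-subtraction analogue
        bits.append(1 if window.mean() > 0.5 else 0)
    return bits

payload = [1, 0, 1, 1, 0, 0, 1, 0]
rx = modulate(payload) + np.random.normal(0, 0.05, len(payload) * SAMPLES_PER_BIT)
assert decode(rx) == payload
```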
Generative AI has introduced significant concerns about how people interact with information in society, particularly regarding the potential harm caused by fake news. To address this issue, it is critical to understand how people perceive fake news and how their information literacy can be improved. Our research tackles two key questions: "Can we estimate the level of information literacy from participants' web browsing behavior?" and "What characteristics can be observed when comparing participants with high and low information literacy?". We recruited 20 university students to read and evaluate the veracity of generated news. During this process, participants used web searches to verify the news, with their browsing logs collected using the open-source browser extension TrackThink Camera. The study achieved a 73.8% accuracy in estimating participants' ability to detect fake news based on their web browsing behavior. Notably, we observed that participants with high information literacy had a higher web-search time ratio, performed more searches, and scrolled faster on average.
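Below is a hedged sketch of the classification setup the abstract implies: estimating high versus low information literacy from browsing-behavior features such as search-time ratio, number of searches, and scrolling speed. The feature names, labels, and data are synthetic placeholders, not the study's dataset or model.

```python
# Illustrative sketch (hypothetical features, synthetic data): classifying
# high vs. low information literacy from web-browsing behaviour features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 20  # participants
X = np.column_stack([
    rng.uniform(0.1, 0.9, n),   # web-search time ratio
    rng.integers(1, 30, n),     # number of searches
    rng.uniform(100, 900, n),   # average scrolling speed (px/s)
])
y = (X[:, 0] > np.median(X[:, 0])).astype(int)  # placeholder labels for the sketch

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"estimated accuracy: {scores.mean():.3f}")
```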
Researchers have proposed various approaches to increase motivation for learning. However, in many cases, they are designed to boost motivation but fall short of creating learning habits. Gamification and competition between peers are widely used to urge people to, for example, take more steps in a day. E-learning platforms often use leaderboards to visualize peer progress, but their effectiveness is limited to when users open the platform. In this work, we use notifications to inform peers of each other's progress in real time to increase motivation for learning. In an in-the-wild study with 19 participants, we investigate the impact of peer awareness on learning by comparing real-time notifications (sent immediately after peers' progress) with interval-based notifications (sent at random intervals). Contrary to our initial expectations, our findings reveal that interval-based notifications are more effective in promoting learning activities. We introduce the Peer-Aware Learning System (PALS), a mobile application designed to simulate peer effects, and provide insights from our user study on notification timing strategies for enhancing learning motivation.
Researchers have implemented physiological sensing and feedback technologies to reveal the emotional states of players engaged in VR games; however, these methods have not previously been used in asymmetric multiplayer VR games, in which players do not have equal roles, abilities, or objectives. In the current study, we developed an algorithm capable of inferring arousal states from EEG signals. We also developed a gaming interface that displays a quantitative indication of arousal states, with the aim of reducing asymmetry between players with and without VR headsets in order to foster stronger social connections and enhance the sense of presence. Based on the proposed affective game design, we outline a within-subject study design to compare the effects of visualized arousal states on players with and without VR headsets. Through this study design, we aim to investigate the effects of arousal state indicators on the overall gaming experience.
Cybersickness in virtual reality (VR) significantly disrupts user immersion. Although recent studies have proposed cybersickness prediction models, existing models have considered only the moment of cybersickness onset, limiting their applicability to proactive detection. To address this limitation, we used long-term time series forecasting (LTSF) models based on multimodal sensor data collected from the head-mounted display (HMD). We used a pre-trained large language model (LLM) to effectively learn the salient features (e.g., seasonality) of multimodal sensor data by understanding the nuanced context within the data. The results of our experiment demonstrated that our model achieved comparable performance to the baseline models, with an MAE of 0.971 and an RMSE of 1.696. This indicates the potential for early prediction of cybersickness by employing LLM- and LTSF-based models with multimodal sensor data, suggesting a new direction in model development.
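For readers less familiar with the reported metrics, the sketch below shows how MAE and RMSE would be computed for a forecast of a cybersickness score series, using a naive "repeat last value" baseline on synthetic data; the series, horizon, and baseline are assumptions for illustration only.

```python
# Illustrative sketch: the MAE/RMSE metrics reported above, computed for a
# naive forecast of a synthetic cybersickness-score time series.
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(0, 0.5, 200)) + 5     # simulated sickness score
horizon = 30
history, future = series[:-horizon], series[-horizon:]
naive_forecast = np.full(horizon, history[-1])       # baseline an LTSF model should beat

print(f"MAE:  {mae(future, naive_forecast):.3f}")
print(f"RMSE: {rmse(future, naive_forecast):.3f}")
```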
Earable devices, a subset of wearable technology, are designed to be worn on the ear and used in daily life. These innovative devices enable users to receive voice-based notifications through a minute built-in speaker without requiring any user operation, seamlessly integrating technology into everyday activities. The timing deemed acceptable for receiving voice-based notifications through earable devices varies with the user and the surrounding situation; thus, inappropriate notification timing may reduce usability. However, the safest and most comfortable timing for voice-based notifications on earable devices remains unclear. This study investigates the acceptable timing of voice-based notifications through earable devices. To explore this, we developed a smartphone application, SoNotify, which can send dummy voice-based notifications and collect sensor data from a smartphone and an earable device. Our field studies with eight participants showed that voice-based notifications were highly acceptable during outdoor walking, with an acceptance rate of approximately 86%. However, users tended to refuse notifications in situations in which they needed to concentrate on avoiding collisions with pedestrians, cyclists, or vehicles.
Individuals worldwide are living with diabetes, and under-resourced communities are at greater risk of complications such as diabetic foot ulcers (DFUs). Integrating technology for DFU management is effective, but a digital gap persists, especially for older adults. Older adults are often unfamiliar with application design, which may deter them from incorporating technology into their health self-management. To address this challenge, we conducted cognitive walkthroughs and semi-structured interviews with older adults (N=12) from under-resourced communities. We sought to understand their ability to use an app that required taking images and videos, interacting with the phone's motor/vibration system, and completing a walking test. To explore whether comprehension and perceived utility issues are due to a lapse in older adults' technology understanding or to flawed application design, we conducted a second study with tech-savvy university students (N=10). While the students were able to successfully complete all of the walkthrough tasks, the utility of the app was limited by the diabetes knowledge of both participant groups. We emphasize the importance of community-based research initiatives and of designing mobile health technologies that are useful and usable to those who face the greatest disease burden.
With the advancement of deep learning, numerous research initiatives have emerged focusing on enabling robots to identify and retrieve target objects within complex domestic environments. However, current research lacks effective integration of contextual affordance information in robotic systems. This paper introduces an intelligent grasping system to facilitate object prediction and safe policy learning for home-use robots. In particular, we introduce the Context Recognition Network (CRN) to predict the potential failure likelihood of each action. We develop a grasping system based on DDPG (Deep Deterministic Policy Gradient) as the benchmark. We compare the benchmark's performance with that of the CRN-equipped grasping system. Our results indicate that the CRN-equipped grasping system outperforms DDPG by blocking failure actions and instead choosing an appropriate pose based on the object prediction, retrieving the object with fewer computational resources.
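The sketch below illustrates the general "block failure actions" idea described above: candidate grasp poses whose predicted failure likelihood exceeds a threshold are masked before the policy's preferred action is chosen. The pose names, Q-values, failure probabilities, and threshold are placeholders, not the paper's CRN/DDPG implementation.

```python
# Illustrative sketch of failure-aware action selection (stand-in values only).
import numpy as np

def select_safe_action(candidate_poses, q_values, failure_probs, threshold=0.5):
    """Return the highest-value pose whose predicted failure risk is acceptable."""
    safe = failure_probs < threshold
    if not safe.any():                       # fall back to the least risky pose
        return candidate_poses[int(np.argmin(failure_probs))]
    masked_q = np.where(safe, q_values, -np.inf)
    return candidate_poses[int(np.argmax(masked_q))]

poses = ["top_grasp", "side_grasp", "pinch_grasp"]
q_values = np.array([0.9, 0.7, 0.4])         # policy's action values (placeholder)
failure_probs = np.array([0.8, 0.2, 0.1])    # CRN-style failure likelihoods (placeholder)
print(select_safe_action(poses, q_values, failure_probs))  # -> "side_grasp"
```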
Effective communication between doctors and patients significantly impacts the quality of diagnosis and treatment, especially in emotionally challenging situations such as cancer diagnosis. This study aims to enhance emotion recognition in clinical settings by providing doctors with feedback on patients' emotions using non-contact multimodal biosignals and an LSTM model. The pSCOUTER system was used to non-invasively collect heart rate and respiratory signals from patients during simulated medical consultations for three scenarios: initial cancer diagnosis, recurrence/metastasis diagnosis, and terminal cancer diagnosis. The collected signals were segmented into 5-second intervals, from which 20 features were extracted. These features were used to train both the LSTM and GPT4-omni models. Emotion labels were classified into seven categories based on Russell's circumplex model of affect: Happy, Relaxed, Disgust, Sad, Upset, Fear, and Surprised. As a result, the LSTM model outperformed both the GPT4-omni model and doctors' predictions in terms of F1 score, demonstrating the highest accuracy across all scenarios. This suggests that LSTM networks are highly effective in capturing the temporal dependencies in multimodal physiological data. This study highlights the potential of integrating non-contact biosignal measurement with machine learning techniques to improve emotion understanding in medical settings, ultimately aiming to enhance patient care and satisfaction.
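A minimal sketch of an LSTM classifier mapping 5-second segments of 20 features to the seven emotion categories follows. The hidden size, number of time steps, and overall architecture are assumptions for illustration, not the paper's model.

```python
# Illustrative sketch: LSTM over feature segments -> 7 emotion classes.
import torch
import torch.nn as nn

NUM_FEATURES = 20
NUM_CLASSES = 7   # Happy, Relaxed, Disgust, Sad, Upset, Fear, Surprised

class EmotionLSTM(nn.Module):
    def __init__(self, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(NUM_FEATURES, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, NUM_CLASSES)

    def forward(self, x):                 # x: (batch, time_steps, NUM_FEATURES)
        _, (h_n, _) = self.lstm(x)        # final hidden state summarizes the segment
        return self.head(h_n[-1])         # (batch, NUM_CLASSES) logits

model = EmotionLSTM()
segments = torch.randn(8, 5, NUM_FEATURES)   # e.g., 8 segments, 5 time steps each (assumed shape)
logits = model(segments)
predicted = logits.argmax(dim=1)             # predicted class index per segment
```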
Student-athletes (SAs) face stress from balancing athletic and academic demands; therefore, monitoring their physical and mental stress is crucial to ensure better well-being. Questionnaires and dedicated measurement equipment have been used to assess the mental and physical status of SAs. However, these methods are not scalable and are difficult to use continuously. In this paper, we propose a method for monitoring the physical and mental state of SAs using passive mobile and wearable sensing technology to minimize the burden associated with monitoring. First, we developed a platform for collecting daily, training, and resting behavior data from smartphones and wearable devices. Second, as a preliminary study, we collected behavior and Stress and Recovery States Scale (SRSS) data for four weeks from 19 SAs and analyzed the collected data to understand their unique behavior patterns and living environments. The results demonstrate that wearable devices and smartphones can automatically collect data on "exercise intensity during competitions" and "lifestyle patterns" that reflect the mental and physical states of SAs. Furthermore, this preliminary research establishes a foundation for future efforts using machine learning to predict the physical and mental states of SAs.
This research aims to create an interactive music learning device, NoteBlock, for Blind or Low Vision (BLV) children. Drawing on research into music-learning designs and interaction designs suitable for BLV children, NoteBlock offers a prototype that helps preschool children understand musical notes and stimulates their interest in music. The research also investigates the mental models of BLV children through feasibility testing, primarily enhancing the user experience based on the design concepts of ergonomics and universal design. Overall, NoteBlock proved to provide engaging and effective interaction in music education, and the suggested improvements point toward broader usability.
Venous thromboembolism (VTE) is an under-appreciated vascular disease that has become a growing global healthcare challenge, with increasing morbidity, mortality, and associated healthcare costs. Little work has explored technology-based intervention (TBI) approaches to assist at-risk patients in actively completing exercise therapy for VTE prophylaxis. We present a wearable virtual rehabilitation system that can trace and evaluate a patient's exercise therapy progress while offering an immersive experience that guides and motivates the patient to complete the training spontaneously through multi-sensory feedback.
While traditional methods in art education primarily focus on observation and imitation, they often fail to adequately support the imaginative and creative capacities of children. This paper introduces TangibleNegotiation, a novel child-AI co-creation system that integrates Tangible User Interfaces (TUIs) with Generative AI technologies to enhance imagination cultivation in child art education. Leveraging the capabilities of swarm user interfaces (SUIs) for both visualization and dynamic narrative, this system embeds an LLM-based agent for conversational and motion planning alongside real-time image-to-image generation. TangibleNegotiation offers a comprehensive pipeline with four interactive modalities: Pre-creation Tips, Real-time Conversation, Real-time Artwork Rendering, and Final Artwork Generation, each designed to foster an engaging and interactive learning environment. A pilot study involving semi-structured interviews with four elementary school art teachers suggests that the system effectively enhances children's engagement and stimulates their imagination through dynamic, real-time artistic feedback. The findings highlight the potential of combining SUI with generative AI to make art education more accessible, inclusive, and effective in fostering artistic imagination.
Distance learning is one of the major technology-related challenges in education. Remote learning offers the advantage of allowing anyone to join from anywhere in the world. To make education sustainable, understanding students' concentration during remote study is essential. In this study, we evaluate multi-modal sensors for estimating students' concentration levels during online classes. We collect sensor data such as accelerometer, gyroscope, heart rate, facial orientation, and eye gaze. We conducted experiments with 13 university students in Japan. The results of our study, with an average accuracy of 74.4% for user-dependent cross-validation and 66.3% for user-independent cross-validation, have significant implications for understanding and improving student engagement in online learning environments. Most interestingly, we found that facial orientation is significant for user-dependent classification and eye gaze for user-independent classification.
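To make the distinction between the two evaluation schemes concrete, the sketch below contrasts user-dependent cross-validation (each participant's windows are mixed across folds) with user-independent cross-validation (whole participants held out). Features, labels, and window counts are synthetic placeholders.

```python
# Illustrative sketch: user-dependent vs. user-independent cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, KFold, LeaveOneGroupOut

rng = np.random.default_rng(0)
n_participants, windows_per_participant, n_features = 13, 40, 10
X = rng.normal(size=(n_participants * windows_per_participant, n_features))
y = rng.integers(0, 2, size=len(X))                 # concentration: low/high (placeholder)
groups = np.repeat(np.arange(n_participants), windows_per_participant)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
dep = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0))
indep = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut())
print(f"user-dependent:   {dep.mean():.3f}")
print(f"user-independent: {indep.mean():.3f}")
```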
Reinforcement Learning from Human Feedback (RLHF) is popular in large language models (LLMs), whereas traditional Reinforcement Learning (RL) often falls short. Current autonomous driving methods typically utilize either human feedback in machine learning, including RL, or LLMs. Most feedback guides the car agent's learning process (e.g., controlling the car). RLHF is usually applied in the fine-tuning step, requiring direct human "preferences," which are not commonly used in optimizing autonomous driving models. In this research, we innovatively combine RLHF and LLMs to enhance autonomous driving safety. Training a model with human guidance from scratch is inefficient. Our framework starts with a pre-trained autonomous car agent model and implements multiple human-controlled agents, such as cars and pedestrians, to simulate real-life road environments. The autonomous car model is not directly controlled by humans. We integrate both physical and physiological feedback to fine-tune the model, optimizing this process using LLMs. This multi-agent interactive environment ensures safe, realistic interactions before real-world application. Finally, we will validate our model using data gathered from real-life testbeds located in New Jersey and New York City.
Working memory, essential for temporarily retaining information, is a significant indicator of performance in cognitive tasks, offering valuable insights into mental tasks such as the ability to learn. While prior lab experiments have shown that sleep affects working memory, it remains uncertain whether this is detectable in naturalistic settings. Due to its nature, assessing working memory entails complex processes that rely on short-term memory and information-processing abilities, posing significant challenges to assessors. Identifying alternative factors affecting working memory could allow for the estimation of working memory status via physiological signals. Our study explores whether physiological data obtained during sleep can predict working memory during real-world tasks. Findings from experiments involving 12 university students suggest that average heart rate and heart rate variability during sleep are indicators of working memory in such settings.
Prevention of lifestyle diseases is an important issue, both for the lifelong well-being of each individual and for the burden on society. Although several artificial-intelligence-based chatbots have been proposed to support the prevention of such diseases, they have not yet addressed complex problem settings that combine multiple theories of behavioral change. In such complex settings, in addition to intervention selection, it is important to reduce the number of attributes acquired, lessening the user's burden and preventing loss of motivation. We therefore developed an Entropy-Based Planning algorithm that selects interventions from a small number of questions. To evaluate the algorithm, we conducted a short-term randomized controlled trial (RCT) with a small number of users as an initial validation. Due to certain limitations, we were unable to obtain results indicating sufficient effectiveness. However, we obtained important findings for the future development of behavioral-change support bots, including the need to investigate the relevance of preferences among multiple behavior-change theories and to balance the number of questions against the accuracy of personalization in terms of user impressions. We intend to conduct further research on the basis of these findings, as well as a large-scale RCT after improving the algorithm and experimental settings.
Characterizing the effects of notifications and pop-ups on reading comprehension, eye movements, and reader experience can deepen our understanding of digital reading behaviors. However, notifications are highly disruptive and can significantly impact reading performance: a challenge not easily mitigated even in controlled lab studies. We conducted an experiment (N = 22) to assess the impact of distractions such as notifications and pop-ups on reading comprehension, frustration levels, and readability across 10 documents with varied distractions. The collected data include eye-tracking metrics and survey responses. We observed significant disruptions to reading flow, reduced comprehension, and increased frustration among participants exposed to distractions. Furthermore, we examined the impact of cognitive control training on distraction management and comprehension, revealing improved comprehension in digital reading environments with distractions. Our findings provide quantitative evidence of the need for notification/pop-up management strategies that minimize disruptions and promote optimal reading experiences, with implications for the design of digital reading interfaces.
Emerging mobile health (mHealth) and eHealth technologies offer new opportunities for remote monitoring and interventions for individuals with mental health and neurological disorders. Traditional hospital methods are not suitable for long-term epilepsy seizure monitoring in naturalistic ambulatory environments. A study was conducted using a wearable device, smartphone, and remote data collection platform to explore the feasibility of ambulatory seizure detection. Twenty-seven patients were recruited and monitored over six months in the UK and Germany. Active and passive data were collected, indicating the potential for seizure detection in at-home and ambulatory settings. The completion rate and quality of the collected data demonstrate the feasibility of remote data collection. Initial results suggest that smartphones and wearable devices could significantly improve care for patients with epilepsy by detecting and potentially preempting epileptic seizures.
In the constantly evolving field of mobile technology, ubiquitous computing offers significant opportunities for supportive communication that improves mental health support between peers. We designed and implemented AlarmCare, a mobile application for creating alarms on a peer's smartphone with mentally supportive text and voice messages. We conducted a one-week exploratory study with eighteen participants to understand how mobile alarms were used as a supportive communication channel between peers. The findings indicate that peer-created alarms decreased depressive emotions and fostered positive, socially supportive emotions, including but not limited to a 'sense of being cared for' and a 'sense of togetherness'. Our study reveals the potential of mobile alarms as a viable tool that can be utilized in daily communication to enhance mental health.
Wireless sensing technology represents a significant milestone in the evolution of modern wireless communication toward sensing capabilities. Current commercial WiFi devices are known to perform various non-contact detections, such as respiration monitoring and gesture recognition, by capturing Channel State Information (CSI). However, the inherent unpredictability of signal fluctuations in real-world environments is a challenge for its application. For example, CSI is highly susceptible to environmental interference in real-world settings, which complicates its use in the design and analysis of sensing systems. In addition, the limited availability of compatible commodity devices makes collecting CSI datasets difficult and expensive. The deployment of CSI devices and the reduction of environmental interference therefore present significant challenges in the field of wireless sensing. To mitigate these issues, we employ GPU-based ray-tracing technology combined with a Variational Auto-Encoder (VAE) to simulate CSI variations for analysis. Field experiments have been conducted to evaluate the performance of our simulator.
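For orientation, a minimal VAE of the kind that could model CSI amplitude variations on top of ray-traced channel responses is sketched below. The subcarrier dimension, latent size, network shape, and training data are all assumptions; this is not the authors' simulator.

```python
# Minimal VAE sketch over CSI-like amplitude vectors (placeholder dimensions/data).
import torch
import torch.nn as nn

CSI_DIM = 64          # e.g., number of subcarriers (assumed)
LATENT_DIM = 8

class CsiVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(CSI_DIM, 128), nn.ReLU())
        self.mu = nn.Linear(128, LATENT_DIM)
        self.logvar = nn.Linear(128, LATENT_DIM)
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM, 128), nn.ReLU(), nn.Linear(128, CSI_DIM))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    recon_loss = nn.functional.mse_loss(recon, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

model = CsiVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(32, CSI_DIM)          # placeholder for ray-traced CSI samples
recon, mu, logvar = model(batch)
loss = vae_loss(recon, batch, mu, logvar)
loss.backward()
optimizer.step()
```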
This study explores the integration of a NeuroDesign approach with an Empathy Design Thinking (EDT) curriculum to enhance creativity and empathy among primary school students in China. Utilizing functional near-infrared spectroscopy (fNIRS), the research examines the neuroscientific enhancement of educational practices. Quantitative analysis indicated trends toward improved cognitive performance, although these were not statistically significant. In contrast, qualitative data from classroom observations and interviews demonstrated notable improvements in students' empathy and creative abilities. These findings underscore the potential of NeuroDesign to significantly enrich primary education by cultivating key 21st-century skills, setting a robust groundwork for further exploration in this emergent field.
This paper presents an overview of OpenPack, a comprehensive dataset developed for recognizing packaging work activities, and discusses an activity recognition competition using this dataset. The availability of sensor datasets for recognizing work activities in industrial settings has been limited. This limitation has restricted research and development in industrial application methods based on activity recognition. To accelerate work activity recognition studies, OpenPack provides 53.8 hours of diverse sensor data, including acceleration data, keypoints, depth images, and readings from IoT devices such as handheld barcode scanners. This paper analyzes these data and proposes several research directions based on OpenPack. We organized an activity recognition competition, OpenPack Challenge 2022, based on the OpenPack dataset. This paper also shares lessons learned from organizing the competition with the UbiComp and activity recognition research community. The OpenPack dataset is available at https://open-pack.github.io/.
Privacy is dynamic, sensitive, and contextual, much like our emotions. Previous studies have explored the interplay between privacy and context, privacy and emotion, and emotion and context. However, there remains a significant gap in understanding the interplay of these aspects simultaneously. In this paper, we present a preliminary study investigating the role of emotions in driving individuals' information sharing behaviour, particularly in relation to urban locations and social ties. We adopt a novel methodology that integrates context (location and time), emotion, and personal information sharing behaviour, providing a comprehensive analysis of how contextual emotions affect privacy. The emotions are assessed with both self-reporting and electrodermal activity (EDA). Our findings reveal that self-reported emotions influence personal information-sharing behaviour with distant social groups, while neutral emotions lead individuals to share less precise information with close social circles, a pattern that is potentially detectable with wrist-worn EDA. Our study helps lay the foundation for personalised emotion-aware strategies to mitigate oversharing risks and enhance user privacy in the digital age.
Explainable Artificial Intelligence (XAI) has been widely used to clarify the opaque nature of AI systems. One area where XAI has gained significant attention is Participatory Budgeting (PB). PB mechanisms aim to achieve a proper allocation with respect to both the votes collected based on users' preferences and the budget. An essential criterion for evaluating these mechanisms is their ability to satisfy desired properties known as axioms. However, even though there are complex voting rules that meet some axioms, concerns regarding transparency persist. In this study, we propose an approach to provide explanations in a PB setting by treating axioms as constraints and seeking outcomes that adhere to these constraints. This method enhances system transparency and explainability. Each potential allocation is accepted or rejected based on whether it satisfies the axioms, and the linear nature of the axioms reduces computational complexity. We evaluated our approach with real-world users to assess its effectiveness and helpfulness. Our pilot study shows that users generally find explanations helpful for understanding the system's decisions and perceive the outcomes as fairer. Additionally, users prefer general explanations over counterfactual ones.
Athletes can and do use wearables and personal devices to track multiple aspects of personal information about their performance, but it is currently difficult for them to gain an integrated and useful picture of their long-term information. We designed a questionnaire to investigate whether athletes value and track four important factors (physical health, mental health, nutrition, and sleep) in relation to their performance using wearables and current athlete management systems. Sixteen athletes from various sports completed the questionnaire. Key results show a mismatch between perceived value and actual tracking of nutrition, sleep, and mental health, but consistency for physical health. There also appears to be a divide in the perceived usefulness of athlete management systems. These results point to challenges for collecting and interpreting the data that athletes want and also inform the design of athlete management systems, especially those that integrate both self-reported and wearable data.
Depression and loneliness are significant contributors to poor mental health and can potentially develop into severe mental disorders. With the advent of large language model (LLM) technology, healthcare applications aimed at improving mental health have become increasingly active, and AI-mediated journaling has garnered attention for its potential in mental health management. However, the impact of AI-mediated journaling on depression and loneliness remains underexplored. To address this, we introduce MyListener, an AI-mediated journaling mobile application that provides context-aware diary prompts and replies based on contextual data collected from the smartphone and the Fitbit Luxe smartband. We conducted a two-week study with 11 university students to evaluate user experiences and observed a reduction in depression and loneliness during its use. This paper outlines our research contributions, discusses future work, and plans for a full-scale experiment.
Stroke is a serious condition that can leave parts of the body unable to move. A stroke results when blood flow to a portion of the brain is suddenly cut off. A stroke patient needs to receive treatment right away, because delaying care could seriously harm their health. This research aims to identify the optimal model, with important features, among several machine learning techniques for predicting the risk of stroke from the provided data. A dataset combining publicly available data with 733 collected records is used to train the model. We analyzed the results of four feature selection techniques (information gain, Lasso L1, Fisher score, and Kendall's tau) to reduce dimensionality and ensure a cost-effective and scalable stroke detection system. In addition, 5-fold cross-validation with GridSearchCV hyperparameter tuning is used to mitigate potential overfitting, and we evaluated the model using the Brier score, G-mean, Matthews Correlation Coefficient (MCC), and H-measure. Lasso L1 emerges as the most compelling feature selection technique for this project. With this technique, the best accuracy of 90%, H-measure of 89.9%, and AUC score of 96% are achieved by Gradient Boosting among the eight machine learning algorithms we used.
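The sketch below shows a pipeline in the spirit of the described setup: L1-based feature selection feeding a gradient boosting classifier tuned with 5-fold GridSearchCV. The data are synthetic and the parameter grid, regularization strength, and scoring choice are assumptions, not the study's configuration.

```python
# Illustrative sketch: L1 feature selection + gradient boosting + 5-fold grid search.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=733, n_features=20, n_informative=8, random_state=0)

pipeline = Pipeline([
    ("select", SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.5))),
    ("model", GradientBoostingClassifier(random_state=0)),
])
param_grid = {"model__n_estimators": [100, 200], "model__learning_rate": [0.05, 0.1]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="roc_auc")
search.fit(X, y)
print(search.best_params_, f"AUC={search.best_score_:.3f}")
```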
BPClip is an ultra-low-cost cuffless blood pressure monitor. As a universal smartphone attachment, BPClip leverages the computational imaging power of smartphones to perform oscillometry-based blood pressure measurements. This paper examines different design considerations in BPClip's development. The cost and accuracy of blood pressure measurements are the central design goals. Both requirements are met by the initial prototype, which has a material cost of 0.80 USD and a mean absolute error of 8.72 and 5.49 mmHg for systolic and diastolic blood pressure, respectively. Since a main motivator for developing BPClip is making blood pressure monitoring more accessible, usability is also central to the design. User studies were conducted throughout the design process to inform the most intuitive and accessible design features. In this paper, we demystify the design process to share effective design practices with future developers working towards expanding health monitoring access beyond traditional clinical settings.
This study introduces a multi-modal smart tongue exercise system, designed to enhance the engagement and effectiveness of tongue movement training by integrating sensory stimulation with advanced sensor technologies. The system employs food materials for taste and olfactory stimulation, alongside bone conduction technology to augment tactile and auditory experiences. It incorporates flexible pressure sensors and inertial measurement units for real-time monitoring of tongue force and movements. This integration of comprehensive sensory stimuli and intelligent monitoring not only increases user participation but also allows for the personalized adaptation of training programs, optimizing therapeutic outcomes. The application of this system shows promising potential in improving oral health and managing related disorders.
Smart glasses enabling eyes-only input are growing in popularity. The underlying input technique powering the majority of interactions in this modality is dwell-based selection. However, this approach suffers from the Midas touch: the unintentional triggering of input during exploratory gaze. We suggest that multimodal input based on eye motions performed while the eyes are shut can trigger selections and overcome this problem without requiring additional input devices or modalities. To explore this idea, this paper captures a dataset of labeled closed-eye images corresponding to eye motions in various directions from nine participants. This was achieved by recording images from a binocular eye-tracker while participants performed standard eye-targeting tasks with one eye closed: the image from the open eye serves as ground truth for the closed eye's location. To understand the scope of closed-eye input, we explore this data according to three dimensions: the distance (near/far), width (narrow/broad), and direction (horizontal/vertical) of single captures of eye positions. Our results indicate that horizontal and vertical eye positions can be accurately recovered from static closed-eye images, but only after relatively large angular eye movements.
This study explores how smart collars based on large language models (LLMs) can enable pets to express themselves and interact with their owners through social media. We propose DogChat, a pet-centered prototype system that captures real-time visual and auditory data from the pet's first-person perspective through smart collars and utilizes LLMs to analyze pets' behavior and emotions. The system operates in three phases: Pet Profile Construction, Daily Experience Reconstruction, and Behavior Learning Integration. Through the WeChat platform, pets can actively share their status and alert owners to abnormal events, or respond passively to owner inquiries.
Garment fit is critical in wearable systems and functional clothing, including masks and other personal protective equipment (PPE), yet no methods exist that can accurately quantify, in real time, the fit characteristics of non-translucent or non-form-fitting garments. We propose a technique for remotely measuring the gap between an external wearable device or clothing surface and the underlying, visually obstructed body surface - which we refer to as the garment-body "air gap" - that commonly occurs in positive-ease garments (i.e., garments that are larger in dimension than the underlying body). To do this, we developed a triple-frequency-band remote measurement system based on a millimeter-wave radar, an ultrasound sensor, and an infrared distance sensor, which when used synergistically allow remote measurement of the uniaxial distances to multiple layered surfaces simultaneously. Here this novel, first-generation system is tested and evaluated for the purpose of measuring garment fit.
Screening tests are often used in medicine to assess whether a patient is at a high risk of contracting a disease. Recent literature has proposed prediction algorithms for Anterior Cruciate Ligament (ACL) retears that aim to achieve high accuracy. However, these models fail to reach an adequate sensitivity to function as effective screening tests. In such cases, model sensitivity is sacrificed for heightened specificity. Misclassifying patients who will eventually go on to retear their ACL as low-risk patients prevents them from obtaining necessary therapeutic support and is not appropriate for a clinical setting. In this study, we implement a Decision Tree Classifier as a screening test to evaluate a patient's risk of retearing their ACL six months after surgery, before the patient is released to activity. By incorporating a machine learning-based screening technique, we hope to minimize false negatives and create a tool that can readily be adopted in clinical practice.
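A minimal sketch of a sensitivity-oriented screening classifier of the kind described above follows: a decision tree with class weights that penalize missing a future retear far more than raising a false alarm. The data, class weights, and tree depth are assumptions for illustration, not the study's model or dataset.

```python
# Illustrative sketch: decision tree tuned to prioritize sensitivity (recall
# for the "retear" class), so high-risk patients are rarely labelled low risk.
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=12, weights=[0.85, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Penalize missing a future retear (class 1) much more than a false alarm.
tree = DecisionTreeClassifier(max_depth=3, class_weight={0: 1, 1: 5}, random_state=0)
tree.fit(X_tr, y_tr)

y_pred = tree.predict(X_te)
print("sensitivity:", recall_score(y_te, y_pred))
print("confusion matrix:\n", confusion_matrix(y_te, y_pred))
```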
The integration of Augmented Reality (AR) technology into surgical procedures offers significant potential to enhance clinical outcomes. Despite numerous lab-proven prototypes, deploying these systems in actual clinical settings demands specialized design and rigorous clinical evaluation to meet the high standards of complex medical fields. Our research highlights the intricate requirements emerging from clinical environments, particularly operating theaters. To address these challenges, we introduce ARAS, an operational AR assistance system for open pancreatic surgery. Through a user-centric design methodology, ARAS was iteratively developed and refined, ensuring its practical applicability and effectiveness in real-world surgical settings during clinical trials. ARAS encompasses two different modes designed for preoperative and intraoperative sessions. Preoperative mode enables surgeons to visualize patient data and perform resection on the patient's 3D reconstructed vessels and tumors, helping surgeons during surgical planning. Intraoperative mode allows in-situ and precise visualization of these reconstructed 3D models during the surgical procedure and supports surgeons in decision-making. ARAS also includes a novel interaction design for surgical AR systems. It uses LLMs to enable context-aware, intuitive, and natural communication between surgeons and the AR system. This demo showcases the capabilities of ARAS, emphasizing its potential to transform surgical practices through advanced AR and LLM-supported interactions.
The rising cycling trend in recent years has highlighted the need for enhanced safety, especially with increasing cyclist numbers and changing infrastructure dynamics. Fueled by increasing smartphone usage, distracted riding due to digital devices remains a critical concern. In this work, we propose a system utilizing capacitive sensing for hands- and eyes-free interaction, designed to improve safety while maintaining digital connectivity. Our system employs capacitive shoulder covers that detect shoulder-tap gestures performed with the user's cheek, allowing cyclists to interact with their smartphones without removing their hands from the handlebars or diverting their gaze from the road. A mobile application deployed on Android forms the interface, detecting gestures through a Convolutional Neural Network and linking them with appropriate smartphone features. Potential use cases include phone call management, media player control, navigation confirmation, and broader integration towards cyclist communication and safety in swarm cycling scenarios.
This demo presents a novel end-to-end framework that combines on-device large language models (LLMs) with smartphone sensing technologies to achieve context-aware and personalized services. The framework addresses critical limitations of current personalization solutions based on cloud LLMs, such as privacy concerns, latency and cost, and limited personal information. To achieve this, we propose deploying LLMs on smartphones and combining them with multimodal sensor data through context-aware sensing and customized prompt engineering, ensuring privacy and enhancing personalization performance. A case study involving a university student demonstrated the capability of the framework to provide tailored recommendations. In addition, we show that the framework achieves the best trade-off between on-device and cloud LLMs in terms of privacy, performance, latency, cost, and battery and energy consumption. To the best of our knowledge, this is the first framework to provide on-device LLM personalization with smartphone sensing. Future work will incorporate more diverse sensor data and involve extensive user studies to enhance personalization. Our proposed framework has the potential to substantially improve user experiences across domains including healthcare, productivity, and entertainment.
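The sketch below illustrates the prompt-engineering step described above: assembling multimodal sensor context into a prompt for a locally deployed model. The field names and query are hypothetical, and no specific on-device LLM API is assumed (the inference call is left as a commented placeholder).

```python
# Illustrative sketch: building a context-aware prompt from sensor readings.
from datetime import datetime

def build_context_prompt(sensor_context: dict, user_query: str) -> str:
    """Format sensor readings and a user request into a single prompt string."""
    context_lines = "\n".join(f"- {key}: {value}" for key, value in sensor_context.items())
    return (
        "You are a personal assistant running on the user's phone.\n"
        f"Current time: {datetime.now():%Y-%m-%d %H:%M}\n"
        "Recent sensor context:\n"
        f"{context_lines}\n\n"
        f"User request: {user_query}\n"
        "Give a short, personalized recommendation."
    )

prompt = build_context_prompt(
    {"location": "university library", "activity": "sitting, 2h",
     "ambient_noise": "low", "steps_today": 3200},
    "What should I do for the next hour?",
)
# response = on_device_llm.generate(prompt)   # hypothetical local inference call
print(prompt)
```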
Psychological research has confirmed that positive psychological intervention methods, such as gratitude journaling, hold promising potential for promoting mental wellbeing. However, when applied in daily practice, a lack of personalized guidance often undermines their efficacy and user engagement. This paper proposes a smart system named GratitudeGuider that leverages social network data as a personalized resource to provide a gratitude journaling service for mental wellbeing. Treating social network data as a record of users' real-world lives, GratitudeGuider provides comprehensive stress awareness and intelligent gratitude topic recommendations aligned with users' real-life experiences for friendlier guidance. We implemented GratitudeGuider as a lightweight mobile application and demonstrate its advantages in a real user study.
Smartphones have become essential to people's digital lives, providing a continuous stream of information and connectivity. However, this constant flow often depletes users' limited attentional resources and time, leading to decreased productivity and increased stress levels. This issue underscores the need for tools that empower users to maximize their potential for achieving personal objectives. One effective approach is to identify "time-killing" moments--a specific type of attention surplus--during which users seek to fill perceived free time without a specific purpose. Recent work has utilized screenshots taken every 5 seconds to detect time-killing activities on smartphones. However, this method often fails to capture phone usage between capture intervals. We demonstrate that up to 50% of time-killing instances go undetected using screenshots, leading to substantial gaps in understanding user behavior. To address this limitation, we propose a method called ScreenTK that detects time-killing moments by leveraging continuous screen text monitoring and on-device large language models (LLMs). Screen text contains more comprehensive information than screenshots and allows LLMs to summarize detailed phone usage. To verify our framework, we conducted experiments with six participants, capturing 1,034 records of different time-killing moments. Initial results show that our framework outperforms state-of-the-art solutions by 38% in our case study.
Although VR headsets have gained widespread acceptance, their potential for real-world actuation remains underexplored. We propose a system that allows users to work in a VR space using a robotic arm and a VR headset, focusing on 3D scanning as the task. The robotic arm is equipped with an RGB-D camera to achieve remote 3D scanning. This system enables immersive remote work and enhances the link between real space and VR space by reproducing real objects in the VR space.
As Mixed Reality (MR) becomes increasingly prevalent, the rising interest of students in studying MR design has led to a demand for Mixed Reality Design Education (MRDE). However, current see-through MR platforms like Vision Pro and HoloLens face educational and technological challenges: their high cost can prevent student access, and their content design, development, and authoring processes, usually tailored for professional and commercial use, can require advanced skills and be time-consuming. This discourages students from experimenting with MR designs, thus limiting their creativity and motivation to showcase their projects. To address these challenges, we introduce HoloKit, an open-source MR headset equipped with a suite of authoring toolkits. HoloKit transforms a regular smartphone into an affordable yet high-quality immersive MR experience. Targeted at MR design educators, students, and hobbyists, HoloKit is positioned as the MR equivalent of Arduino. It echoes Arduino's approach to hardware hacking and the DIY culture, offering a cost-effective alternative to more expensive MR systems. Its design philosophy emphasizes affordability, agility, and extensibility, encouraging an open-source community of creators. In this demonstration, we showcase three case studies from integrating HoloKit into real MR design classrooms and research projects. This highlights HoloKit's unique approach and effectiveness for MRDE.
A daily diary can benefit one's mental health and well-being by providing an outlet for emotional expression, self-reflection, and stress reduction. However, the traditional approach to diary writing is hindered by the lack of an intuitive mechanism to deal with emotional experiences via textual descriptions and a gradual decline in writing skills with the adoption of social media and messaging apps. This paper introduces a Music Diary app, which uses ChatGPT to summarize text and generate music that echoes the emotional state of the user. The results of this pilot study revealed that the Music Diary app potentially facilitates the creation of diary entries, while the musical soundtrack promotes intuitive awareness, reflection, and expression as well as one's willingness to share experiences on social platforms. In experiments, the proposed system was shown to make the process of keeping a diary more enjoyable and the novel format makes it possible to expand this personal activity to a wider social realm.
Advanced wearable digital assistants can significantly enhance task performance, reduce user burden, and provide personalized guidance to improve users' abilities. However, developing these assistants presents several challenges. To address this, we introduce TOM (The Other Me), a conceptual architecture and open-source software platform (https://github.com/TOM-Platform) that supports the development of wearable intelligent assistants that are contextually aware of both the user and the environment. Collaboratively developed with researchers and developers, TOM meets their diverse requirements. TOM facilitates the creation of intelligent assistive AR applications for daily activities and supports the recording and analysis of user interactions, integration of new devices, and the provision of assistance for various activities.
This interactivity features a smartphone-based method for measuring blood pressure (BP) using the oscillometric method. To measure BP, it is necessary to measure (1) the pressure applied to the artery and (2) the local blood volume change. This is accomplished by performing an oscillometric measurement at the finger's digital artery, whereby a user presses down on the phone's camera with steadily increasing force. The camera is used to capture the blood volume change using photoplethysmography. We devised a novel method for measuring the force applied by the finger without the use of specialized smartphone hardware, with a technique called Vibrometric Force Estimation (VFE). The fundamental concept of VFE relies on a phenomenon in which a vibrating object is dampened when an external force is applied to it. This phenomenon can be recreated using the phone's own vibration motor, where the resulting damped vibration is measured using the smartphone's Inertial Measurement Unit (IMU). A cross-device reliability study with three smartphones from different manufacturers, with different shapes and prices, showed similar force estimation performance across all smartphone models. In an N=24 validation study of the BP measurement, the smartphone technique achieves an MAE of 8.4 mmHg and 8.5 mmHg for systolic and diastolic BP, respectively, compared to an FDA-approved BP cuff. The mean error (bias) and standard deviation are 0.44±10.2 mmHg and 0.03±10.5 mmHg for systolic and diastolic, respectively. The vision for this technology is not necessarily to replace existing BP monitoring solutions, but rather to introduce a downloadable smartphone software application that could serve as a low-barrier hypertension screening measurement fit for widespread adoption.
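The sketch below illustrates the vibrometric idea in the simplest possible form: estimate the vibration amplitude at the motor's frequency from IMU data, then map the damping of that amplitude (relative to the unloaded case) to an applied force via a calibration curve. The sample rate, motor frequency, calibration slope, and signals are all placeholders, not the paper's VFE implementation.

```python
# Illustrative sketch of vibration-damping-based force estimation (toy values).
import numpy as np

FS = 400            # IMU sample rate in Hz (assumed)
MOTOR_HZ = 170      # vibration motor frequency in Hz (assumed)

def vibration_amplitude(accel: np.ndarray) -> float:
    """Magnitude of the FFT bin closest to the motor frequency."""
    spectrum = np.abs(np.fft.rfft(accel - accel.mean()))
    freqs = np.fft.rfftfreq(len(accel), d=1 / FS)
    return spectrum[np.argmin(np.abs(freqs - MOTOR_HZ))]

def estimate_force(amplitude, free_amplitude, calibration_slope=2.0):
    """Map amplitude damping relative to the unloaded case to force (newtons)."""
    damping = max(0.0, 1.0 - amplitude / free_amplitude)
    return calibration_slope * damping

t = np.arange(0, 1, 1 / FS)
free = np.sin(2 * np.pi * MOTOR_HZ * t)            # unloaded vibration
pressed = 0.6 * np.sin(2 * np.pi * MOTOR_HZ * t)   # vibration damped by finger pressure
force = estimate_force(vibration_amplitude(pressed), vibration_amplitude(free))
print(f"estimated force: {force:.2f} N")
```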
We present FabPad, a new music keyboard made of fabric. We use an ESP32 and embroidered conductive thread to build blocks that behave like a touchpad. Besides pitch, we incorporate velocity into the design of the buttons. We show that FabPad has the potential to be a lightweight, affordable, and accessible fabric instrument for musicians.
While pilot studies help to identify potential interesting research directions, the additional requirements in AR/MR make it challenging to conduct quick and dirty pilot studies efficiently with Optical See-Through Head-Mounted Displays (OST HMDs, OHMDs). To overcome these challenges, including the inability to observe and record in-context user interactions, increased task load, and difficulties with in-context data analysis and discussion, we introduce PilotAR (https://github.com/Synteraction-Lab/PilotAR), a tool designed iteratively to enhance AR/MR pilot studies, allowing live first-person and third-person views, multi-modal annotations, flexible wizarding interfaces, and multi-experimenter support.
We demonstrate the peer-to-peer impact of drones flying in close proximity. Understanding these impacts is crucial for planning efficient drone delivery services. In this regard, we conducted a set of experiments using drones at varying positions in a 3D space under different wind conditions. We collected data on the energy consumption of drones traveling in a skyway segment. We developed a Graphical User Interface (GUI) that plots drone trajectories within a segment. The GUI facilitates analyzing the peer-to-peer influence of drones on their energy consumption. The analysis includes drones' positions, distance of separation, and wind impact.
A smart home control application allows users to interact with the Cyber-Physical System (CPS) in their living space. Conventional applications offer manual control over individual smart devices, disregarding compound effects from using multiple devices. Manual control is not user-friendly, as users must explore smart device settings until achieving their target states. In previous work, we proposed a symbolic regression approach to find settings of smart devices that achieve user-specified goals. We now demonstrate a novel application, SmartBright, an outcome-oriented light control for macOS. SmartBright allows users to set a target brightness and the current time of day; the application finds settings for the window blind and lamp to achieve this goal. We evaluated SmartBright by comparing suggested settings for a bright room against ray-traced brightness throughout the day. Additionally, we applied the user preference "Do not use the lamp, if possible" to minimize lamp use and evaluated the resulting settings. Results show SmartBright suggests settings achieving user goals throughout the day, with average errors of 0.1175 P_l for the unrestricted case and 0.1730 P_l for the restricted case.
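The sketch below conveys the outcome-oriented idea in miniature: search over blind and lamp settings for the combination whose predicted brightness best matches the target, with an optional soft penalty that expresses the "do not use the lamp" preference. The brightness model here is a made-up stand-in, not the learned symbolic-regression model from the work above.

```python
# Illustrative sketch: outcome-oriented setting search with a toy brightness model.
import itertools

def predicted_brightness(blind, lamp, hour):
    """Toy surrogate model: daylight through the blind plus lamp output."""
    daylight = max(0.0, 1.0 - abs(hour - 13) / 9)   # daylight peaks around midday
    return 0.7 * blind * daylight + 0.3 * lamp

def suggest_settings(target, hour, prefer_lamp_off=False):
    candidates = itertools.product([i / 10 for i in range(11)], repeat=2)
    def cost(setting):
        blind, lamp = setting
        error = abs(predicted_brightness(blind, lamp, hour) - target)
        penalty = 0.05 * lamp if prefer_lamp_off else 0.0   # soft "do not use the lamp" preference
        return error + penalty
    return min(candidates, key=cost)

blind, lamp = suggest_settings(target=0.6, hour=10, prefer_lamp_off=True)
print(f"blind={blind:.1f}, lamp={lamp:.1f}")
```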
Running is a widely practiced sport, and technology is a prominent aspect of the experience. Runners often utilize music, tracking applications, and wearable sensors like heart rate monitors. Engaging with these systems while moving presents specific challenges, as the running motion can impede interaction capabilities. Standard devices such as smartphones, smartwatches, and fitness trackers are commonly used, yet their small buttons and touch screens necessitate slowing down for effective interaction. In this work we demonstrate GestureShirt, a shirt that allows for eyes-free interaction while running. The prototype shirt utilizes detachable distance sensors to capture simple gestures. In the demo, visitors can utilize the body's surrounding space to interact with music.
Physical health, mental health, nutrition, and sleep are four important pillars that affect athlete performance. Within these pillars, there are over twenty variables that affect an athlete's development and performance; however, to our knowledge, there is no analytical tool for individual athletes to understand how these interrelated variables quantitatively affect their performance. Little is understood about the challenges of making use of rich collections of diverse tracking data, and current athlete management and recommender systems fall short of providing personalised feedback about the athlete's short- and long-term goals. Analytical methods appropriate for highly heterogeneous and multivariate data, such as within-subject classification methods, need to be used to personalise feedback to athletes. Current research and industry methods lack the translation of both linear and non-linear intra- and inter-individual variations of relationships onto an interactive network visualisation for athletes. This research aims to enable athletes to make decisions that optimise their performance by harnessing their own personal data. Athletes should be able to easily understand and explore their rich, quantified, multivariate self.
In a do-it-yourself (DIY) smart home, users can implement desired features by combining sensors and actuators. Specifically, effective sensing of home situations is crucial for building these features, and several studies have explored the use of cameras as home sensors to capture more specific scenarios. Camera sensors offer the potential to use various elements based on visual information as sensing triggers, presenting possibilities distinct from commonly used IoT sensors. These unique characteristics may necessitate new support for the DIY smart home-building process. In light of this, our doctoral research aims to (1) understand the unique user experience of camera sensors in DIY smart home building, (2) explore the potential and challenges of using cameras as sensors, and (3) propose a design framework for a toolkit that supports the camera sensor-driven DIY smart home building experience. This doctoral colloquium paper discusses our research plans, which synthesize knowledge on toolkit design to support the effective use of camera sensors in building DIY smart home features.
Ubiquitous Computing and Human-Computer Interaction health researchers motivate their work with the promise of improving health outcomes for patients. Although computing devices are producing more health data than ever, turning this data into actionable insights remains a challenge. In this Doctoral Colloquium submission, I argue that to improve health outcomes, medical systems must not only produce health data, but also provide interpretations of that health data personalized to the individual in order to deliver effective, actionable feedback. To illustrate this approach, I introduce two projects: Beacon and ExerciseRx-CP. Beacon is a system that screens for minimal hepatic encephalopathy through a novel critical flicker frequency (CFF) measure. I use Beacon's CFF measure as an illustration of defining ways to interpret new health data by incorporating personal baselines and longitudinal measurements. ExerciseRx-CP is a system that uses real-time feedback of exercise tracking to encourage physical activity among adolescents with cerebral palsy. I use ExerciseRx-CP as a way to show how motion data can be interpreted on multiple levels (e.g., raw sensor signals, subrepetition motion, and aggregated repetition metrics) to deliver different feedback mechanisms. Finally, I describe how an enabling activities framework can help Ubicomp researchers characterize activities necessary for translational research, promote more translational research to be conducted in our community, and ultimately create greater impact as a field.
Noise sensitivity is prevalent in both neurodivergent and neurotypical people, making self-regulation challenging and impacting their quality of life. Little work has been done in the wearable and ubiquitous computing space to support these particular experiences. My research explores how people with noise sensitivity (PWNS) and those around them manage and regulate their own and others' reactions to noise. Leveraging wearable and mobile devices, this work explores the management and regulation behaviors of PWNS and their care networks. It describes the design and deployment process of a novel application that utilizes wearable sensor technology to sense physiological and environmental signals, facilitate awareness and information sharing, and present users with strategies for managing and regulating noise sensitivity experiences. This research strives to contribute an understanding of noise sensitivity experiences, inclusive design practices, and a technological solution to support people with noise sensitivity.
Extended Reality (XR) technologies are rapidly evolving, offering remarkable opportunities for creating immersive and interactive experiences. However, the integration of effective haptic feedback remains a significant challenge. We aim to explore and enhance haptic experience design based on three major themes: wearable hardware, rendering methods, and design systems incorporating algorithms. Through this comprehensive approach, we seek to significantly improve the quality of haptic interactions in XR environments.
The innate human desire to acquire new abilities dates back to our earliest ancestors. Today, humanity lives in an environment deeply integrated with technology, offering opportunities to extend the boundaries of human capabilities. This integration has led to the emergence of cyborg culture, with "tryborgs" (individuals who attempt to become cyborgs through technology and metaphor) gaining attention. Wearable artifacts can confer individual identity or social status and support the transformation into cyborgs. This research examines whether, by removing the shell of the commodity economy, individuals can create new senses by expanding "extra-corporeal organs," thereby extending perception and expanding boundaries to connect more deeply with their environment. It explores the potential for consciousness and bodily functions to transform into expressions that transcend language and are received universally. The study takes human sensory ability as a starting point and investigates the effects when individuals imagine an acquired sense through basic technology combined with wearables. This practical research involves speculative and critical design inquiries into reshaping tangible interactions via technology. The core focus includes understanding self-empowerment and constructing new relational methods within community interactions. By shifting the boundaries of data and humanity around perception, the research aims to generate new connections. The proposal seeks to explore the emotional relationships between humans and technology, and among humans themselves, promoting emergent, poetic, and self-generated narratives through open acts of perception.
Current developments in haptic interfaces predominantly aim to achieve more 'realistic' sensations with a particular emphasis on the sense of touch, focusing on the fingertips for manipulating digital objects in VR/AR experiences. However, these haptic sensations are usually complementary to the visual stimuli due to HCI biases toward the sense of sight. This research aims to develop an approach for creating standalone haptic experiences by transforming pre-recorded human movement into haptic feedback. Using artistic and design practice research with Teslasuit as a principal interface, this research investigates how mocap data and electrical stimulation can be used to develop haptic scores by redirecting movement traces from a moving body to another body. This approach to movement transformation presents a way in which human movement, particularly dance, can be experienced from an audience perspective beyond sight, expanding existing research on using electric stimulation for creative practices.
In recent years, advancements in telepresence technology have made remote communication and collaboration easier. MetaPo is a system that uses spherical displays to seamlessly integrate physical and digital interactions. However, conventional inverse panorama techniques face challenges such as image seams, coverage issues, and overlapping displays. To address these challenges, this study proposes an inverse panorama technique based on coordinate transformation, enhancing visual continuity and user experience. Additionally, evaluation methods necessary to demonstrate the effectiveness of the proposed method in improving the quality of remote communication have been devised. This research provides a novel solution to challenges in spherical display technology and paves the way for more immersive telepresence experiences.
A variety of consumer Augmented Reality (AR) applications have been released on mobile devices and novel immersive headsets over the last five years, creating a breadth of new AR-enabled experiences. However, these applications, particularly those designed for immersive headsets, require users to employ unfamiliar gestural input and adopt novel interaction paradigms. This leap forward intensifies the complexity of help-seeking and onboarding needs for end-users. The recent emergence of artificial intelligence (AI)-powered in-context help tools offers potential alternatives to onboarding and search methods. However, non-technical users struggle with prompt-based interactions with LLMs, whose human-like language capabilities are unique but can also be unreliable. My doctoral research aims to (1) understand how novice users discover gestural interactions and classify the types of interaction challenges they face; (2) investigate the nuances in users' mental models of emerging technologies, such as LLMs and AR; and (3) explore the design of onboarding that enhances gesture discoverability and their application within AR environments.
My proposed thesis topic focuses on two smartphone measurements: blood pressure (BP) and hand grip strength (HGS). These measurements are similar in that both require the user to apply force to the smartphone. The BP measurement is performed at the fingertip and requires the user to apply force onto the digital artery. The HGS measurement requires users to squeeze the smartphone with their hand. HGS is important as a measure of frailty in clinical settings, which is particularly relevant for predicting surgical outcomes and determining surgical readiness. Both metrics rely on a novel technique for measuring applied force on a smartphone coined "Vibrometric Force Estimation." This technique involves vibrating the smartphone and measuring how much the vibration dampens as a result of the applied force. The vision for this work is to expand healthcare access to accurate and clinically relevant measurements, especially for low-resource and high-risk populations.
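As an illustration of the damping idea behind Vibrometric Force Estimation, the sketch below (not the thesis' actual implementation) estimates the vibration amplitude at the actuation frequency from accelerometer samples and maps it to force with a hypothetical linear calibration:

```python
import numpy as np

def vibration_amplitude(accel: np.ndarray, fs: float, vib_freq: float) -> float:
    """Magnitude of the accelerometer spectrum at the vibration frequency."""
    spectrum = np.abs(np.fft.rfft(accel * np.hanning(len(accel))))
    freqs = np.fft.rfftfreq(len(accel), d=1.0 / fs)
    return float(spectrum[np.argmin(np.abs(freqs - vib_freq))])

def fit_force_model(amplitudes, forces):
    """Least-squares line mapping damped amplitude to applied force.

    A per-device calibration of this kind is assumed here; the actual
    mapping used in the thesis may be more sophisticated.
    """
    slope, intercept = np.polyfit(amplitudes, forces, deg=1)
    return lambda a: slope * a + intercept
```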
Ubiquitous computing technologies have been developed to identify and intervene on mental health symptoms, but these technologies have limited uptake in clinical care. Recent work in human-computer interaction (HCI) and digital mental health has demonstrated opportunities to conduct research in practice (i.e. in the context of care) that better identifies opportunities for technologies to meet current care needs. In this work, I propose an instantiation of research in practice, specifically to build sociotechnical infrastructure that enables ubiquitous computing research in the context of mental healthcare. This infrastructure creates opportunities for human-centered design to study actions with ubiquitous technologies in care, as well as data collection to design, build, and evaluate interventions with these technologies in clinical settings. Through this infrastructure, I aim to align computing research with specific clinical needs, and to improve both the uptake of ubiquitous technologies in care and patient outcomes.
The vision of closed-loop intervention systems for behavioral health is growing with the flourishing of mobile sensors and multimodal data. There has been abundant work on identifying symptoms, diagnosis, and progression monitoring. However, there has been limited effort in intervention research, tailoring suitable interventions for closed-loop systems. About a decade ago, researchers began exploring mindless interventions -- subtle interventions to change behavior, cognition, or affect with minimal attention and effort. Despite their success in controlled laboratory settings, few mindless interventions have been deployed in the real world, and none have been integrated into closed-loop systems. Thus, it remains unclear how well these low-effort, low-attention interventions integrate with sensing systems, how their effectiveness varies over time and context, and their overall impact on behavioral health management. This study is the first to deploy mindless interventions in a closed-loop system in real-world settings. We developed a closed-loop intervention for individuals with moderate to severe anxiety, delivering offset heart rate biofeedback when stress symptoms are detected. This paper presents our work-in-progress, detailing the system and study design, and highlighting this research's methodological and empirical contributions.
This paper presents a motor-skill-download system to help beginners learn the basic piano playing techniques such as staccato and legato. This system uses electrical muscle stimulation (EMS) to induce involuntary movement of the user's muscles so that a beginner can experience firsthand the muscle coordination of expert pianists. To download motor skills, we first use electromyography (EMG) sensors to record multiple muscle activities of experts playing pieces using staccato and legato. Then, we analyze the data from each EMG sensor and design EMS parameters to reproduce the coordination of muscles used when playing the techniques. The data are stored in the Human Augmentation Platform. Finally, the user selects the technique they want to learn from the platform, and then EMS is used to activate the user's muscles to train them to play the technique. The developed system facilitates wrist movements with EMS applied to the extensor and flexor muscle groups to help beginners learn to play pieces using staccato and legato.
During spacecraft robotic arm operations, haptic feedback has the potential to display rich streams of relevant data to the user, such as trajectory information and velocity. One promising strategy to encode such information is through tactile stroke illusions, which create the sensation of vibration moving across the skin. This study highlights the design and feasibility testing of a haptic wearable interface worn on the user's forearm in a sleeve form factor that elicits tactile stroke illusions using an array of vibration actuators in a timed sequence. By manipulating the temporal aspects and developing two rendering algorithms to convey end-effector movement information, we demonstrate the feasibility of our haptic sleeve through human testing that measures how accurately participants perceive the conveyed information (trajectory and velocity).
Garment-based soft robotics is an emerging domain that seeks to integrate compliant mechanical actuators into textile/clothing form factors. There exists a significant opportunity to develop and deploy these garment-based technologies to overcome accessibility challenges in everyday clothing, providing a suite of adaptive solutions that can support and assist with the daily challenge of tightening/loosening or fastening/unfastening a garment. In this paper, we present two embodiments of adaptive bras with embedded shape memory materials to provide robotically assisted, customizable fit: one embodiment is designed for individuals with dexterity limitations that are typical in conditions such as arthritis, and the other is designed to be fully usable with one hand. This approach, which we define as 'Soft Ro-bra-tics', represents a novel solution to overcome bra-related accessibility issues.
Virtual Reality (VR) exhibitions are becoming increasingly popular for creating immersive and interactive experiences. One particular use case in research is death simulations in VR, where users can experience or explore the concept of death in a controlled environment. Death is a profound theme that offers opportunities for emotional growth, empathy, and deep reflection. This project presents Passing Electrical Storms, an innovative virtual reality art installation by Shaun Gladwell, designed to evoke the human experience of dying and an out-of-body journey through the universe. The installation combines VR, sensing technologies, and tactile feedback to create a visually stunning and emotionally immersive experience. Participants journey from a simulated cardiac arrest to brain death, moving through inner-body experiences and vast cosmic expanses, provoking contemplation on mortality and the sublime nature of the universe. This work demonstrates the potential of VR to engage users with complex emotional and existential themes, providing valuable insights for future VR design in the field of HCI.
The 'Perpetual Pigments' project originated from a discovery by researchers at Deakin University's Institute for Frontier Materials, who extracted pigments from discarded natural fibre textiles. The Institute for Frontier Materials collaborated with design academics to explore applications of this breakthrough using design thinking and Indigenous Design Charters. Prominent Australian First Nations artists tested the recycled pigments through painting, drawing, and screen printing. The project's success relied on ubiquitous computing and digital design techniques. This initiative, inspired by the circular economy and research through design, merges science, design, and art to address cultural, economic, and environmental challenges. By adopting a circular economy mindset, textile waste can be repurposed, reducing waste and conserving resources. This approach highlights the principles of functionality, sustainability, and storytelling. The exhibition and findings are valuable to design and Human-Computer Interaction researchers and industry representatives interested in addressing textile waste through successful circular economy projects.
Deakin Breathes Data (DBD) rethought and rethreaded radio surveillance on a university campus into a material, fibrous aesthetic. We used university-captured Wi-Fi data to present how shared knowledge threads the modern university experience. Laser-cut silk mirrored how Wi-Fi's photonic wave-particles slice through spaces, mapping knowledge channels in and between buildings as they vary with student presence. Its contribution considers physical and metaphorical fibers, showing how a campus-as-organism breathes data through student knowledge sharing, both consensual and surveilled. We hook the material fibers together in a knit of organizational surveillance, knowledge creation, and campus life, reimagined through tactility, fabric, and laser, while aesthetically referencing the invention of frequency hopping itself, which Wi-Fi depends on.
People with physical disabilities face significant challenges in basic activities like dressing and undressing. The CDC reports that 3.6% of U.S. adults have self-care disabilities impacting dressing or bathing. Adaptive clothing, designed to facilitate dressing, includes features such as zip-off sleeves, Velcro shoes, and side-opening pants, which enhance autonomy and quality of life. Current adaptive shoes primarily use Velcro straps and wide openings with zippers. We seek to expand the adaptive clothing solution space by exploring garment-based soft robotics, such as shape memory alloy (SMA) springs, to offer advanced solutions. In this study, we developed and tested self-tightening and self-fastening belts and shoes using SMA actuators as tightening mechanisms and both hook-and-eye and electromagnetic components as fastening solutions, to improve accessibility and independence for individuals with physical limitations.
Neonatal jaundice is the most common medical condition in newborns, affecting more than 80% of infants. The common forms of phototherapy treatment for jaundice are uncomfortable for the infant and caregiver, disrupt important newborn support practices like holding the baby and early breastfeeding, and lose efficiency in shadowed body areas. Here, we present a phototherapy device in the form of a soft, stretchy, close-fitting full-body garment (a 'onesie'). Our device uses surface-mount stitched e-textile fabrication methods to preserve the softness of the e-textile structure for next-to-skin applications. The prototype successfully produces 450-460 nm blue light at approximately 30 µW/cm2 intensity at the garment surface, the recommended irradiance dose, and by nature of its close, conformable fit, it reaches more of the skin surface than traditional methods.
This paper explores the transformative potential of integrating technology with human sensory and emotional experiences through the "Outward Bound" idea. By utilizing responsive wearable devices, specifically the Sensitive Wing and its iterative developments, the project aims to externalize internal sensory experiences, bridging the gap between personal and external empathy. The fusion of traditional practices like Nüshu and ruffs with modern technology underscores the importance of preserving cultural heritage while embracing technological advancements. This study highlights the complex interplay between gender, culture, and technology, visualizing the otherwise invisible tensions. Ultimately, these innovative approaches invite us to continuously and creatively reimagine our relationships with each other and the world, promising new perspectives on human interaction through technology.
What is it like to be a bat? "EchoVision" is a mixed reality interactive experience that immerses participants in the world of bats through sound visualization and mixed reality technology. With a custom-designed, handheld, bat-shaped, optical see-through mixed reality mask based on the open-source HoloKit project, participants can simulate echolocation, the natural navigation method bats use in the dark. Participants use their voices and interpret the returned echoes. The visual feedback is dynamically adjusted based on the user's voice pitch and tone, as well as the 3D shape of the surrounding environment detected by LiDAR on the mask, offering an interactive and dynamic depiction of how bats perceive their surroundings. A pop-up exhibition in a bat habitat demonstrated that, compared to other mixed reality headsets, this unique mask design can effectively accommodate large audiences in the wild, offering them scientific education with empathetic engagement. This work promotes an ecocentric design perspective and fosters understanding between species, educating and cultivating a deeper appreciation for the unique ways non-human creatures engage with their ecosystems.
We developed bioplastic nail extensions that integrate Near-Field Communication (NFC) chips to facilitate hands-free interaction with the phone. Using bioplastics as a scaffolding material for electronics, we aim to address existing sustainability challenges regarding disposal and recycling while also exploring the affordances of this material, such as customization and material exploration. We outline the low-tech fabrication process for our Bio-e-nails, followed by a demonstration of how to wear and program them. Subsequently, we present three applications utilizing tag-based interactions: medical or emergency contact information, navigation setup, and automated text or Short Message Service (SMS) communication. We designed the fabrication process for designers at large while enabling expression through fashion by embracing the temporary nature of the bioplastic.
We present Thermal Earring, a pioneering smart earring that offers a practical and comfortable solution for wearable health monitoring. Unlike watches and other wearables typically worn loosely on body extremities, an earring's proximity to the head enables unique sensing advantages. This research aims to provide design guidelines for developing smart earrings and jewelry. Our Thermal Earring prototype is designed to resemble traditional earrings in size (11.3 mm in width and 31 mm in length) and weight (335 mg). Thermal Earring consumes only 14.4 µW and enables a battery life of 28 days, making it suitable for real-world use. We demonstrate that its compact and lightweight design allows it to seamlessly integrate into fashionable jewelry.
Thermal attributes in the environment impact well-being, but their inclusion in standard well-being monitoring is challenging due to complex measurement requirements. Industry standards like the Predicted Mean Vote (PMV) index need numerous measures and specialized setups, making large-scale applications impractical. This study investigates predicting thermal perception ratings using only contextual factors. We conducted an ablation study using the Chinese Thermal Comfort Dataset (CTCD) and a Random Forest (RF) classifier to evaluate prediction performance with different contextual feature combinations on five labeling scales. Results showed that omitting measures required for PMV index calculation and relying on contextual features exclusively achieved F1 scores similar to those when including PMV measures. Key predictive factors included daily outdoor temperature and a person's clothing, weight, and age. These findings suggest that leveraging more accessible contextual data to estimate thermal perception ratings is promising, and further research should explore more contextual factors to enhance prediction accuracy and support well-being assessments.
Atrial fibrillation (AF) is characterized by irregular electrical impulses originating in the atria, which can lead to severe complications and even death. Due to the intermittent nature of AF, early and timely monitoring is critical for patients to prevent further exacerbation of the condition. Although ambulatory ECG Holter monitors provide accurate monitoring, the high cost of these devices hinders their wider adoption. Current mobile-based AF detection systems offer a portable solution. However, these systems have various applicability issues, such as being easily affected by environmental factors and requiring significant user effort. To overcome the above limitations, we present AcousAF, a novel AF detection system based on acoustic sensors of smartphones. Particularly, we explore the potential of pulse wave acquisition from the wrist using smartphone speakers and microphones. In addition, we propose a well-designed framework comprised of pulse wave probing, pulse wave extraction, and AF detection to ensure accurate and reliable AF detection. We collect data from 20 participants utilizing our custom data collection application on the smartphone. Extensive experimental results demonstrate the high performance of our system, with 92.8% accuracy, 86.9% precision, 87.4% recall, and 87.1% F1 score.
The proliferation of mobile sensing technologies has enabled the study of various physiological and behavioural phenomena through unobtrusive data collection from smartphone sensors. This approach offers real-time insights into individuals' physical and mental states, creating opportunities for personalised treatment and interventions. However, the potential of analysing the textual content viewed on smartphones to predict affective states remains underexplored. To better understand how the screen text that users are exposed to and interact with can influence their affects, we investigated a subset of data obtained from a digital phenotyping study of Australian university students conducted in 2023. We employed linear regression, zero-shot, and multi-shot prompting using a large language model (LLM) to analyse relationships between screen text and affective states. Our findings indicate that multi-shot prompting substantially outperforms both linear regression and zero-shot prompting, highlighting the importance of context in affect prediction. We discuss the value of incorporating textual and sentiment data for improving affect prediction, providing a basis for future advancements in understanding smartphone use and wellbeing.
Driving under stress negatively affects driving behaviour, increasing the risk of dangerous traffic situations and accidents. To effectively reduce driving stress, it is crucial to gain insights into stress triggers using ubiquitous stress detection methods that facilitate the collection of real-world driving data. We developed a system that localises stressors along a route using heart rate data collected from a smartwatch. Stress sources are identified by detecting and classifying heart rate anomalies, which are then correlated with GPS locations. Our system differentiates between common and individual stressors using a scoring system and maps detected stress indicators on a stressmap, allowing comparison across multiple rides. Crowdsourced deployment enhances the precision of stressor localization and enables their association with specific road characteristics. Our findings indicate that heart rate anomalies reliably predict the locations of stressors, which are consistently observed across multiple rides and different drivers. These stressors are linked to traffic facilities and road features, such as intersections and traffic lights. By employing ubiquitous stress detection methods, we enable the collection of crowdsourced data, providing new insights into real-world driving stress. Our stress detection and visualization system aims to improve route guidance, particularly benefiting stress-prone, stress-sensitive, and frequent drivers.
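The abstract does not specify the anomaly detector, so the minimal sketch below assumes a rolling z-score criterion and nearest-in-time GPS matching to illustrate how heart-rate anomalies could be localised along a route (window size and threshold are hypothetical):

```python
import numpy as np

def detect_hr_anomalies(hr, timestamps, window=60, z_thresh=2.5):
    """Flag heart-rate samples that deviate strongly from a rolling baseline."""
    anomalies = []
    for i in range(len(hr)):
        baseline = hr[max(0, i - window):i + 1]
        mu, sigma = np.mean(baseline), np.std(baseline) + 1e-6
        if (hr[i] - mu) / sigma > z_thresh:
            anomalies.append(timestamps[i])
    return anomalies

def localise_stressors(anomaly_times, gps_times, gps_points):
    """Map each anomaly to the GPS fix closest in time (stressmap candidates)."""
    gps_times = np.asarray(gps_times)
    return [gps_points[int(np.argmin(np.abs(gps_times - t)))] for t in anomaly_times]
```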
Emotions are important indicators of well-being, underscoring the significance of accurate emotion recognition. Yet, the endeavor to recognize emotions in real-world settings, often referred to as "in the wild", presents formidable challenges owing to the subtle nuances of emotional expressions and the inherent variability across individuals. Addressing this challenge, we present a novel approach aimed at enhancing emotion recognition accuracy, particularly in the context of challenging naturalistic video data. Our approach leverages the MediaPipe framework for feature extraction and combines transfer learning with traditional machine learning methods that do not rely on neural networks or deep learning. Additionally, we explored the use of meta-models and an "oracle" experiment to further optimize emotion classification. These methods collectively contribute to a more robust and accurate system for real-time emotion recognition in naturalistic settings, which is an important step towards tracking well-being.
Software sector employees face challenges like managing time zones in distributed teams, long work hours, sedentary lifestyles, and limited time for extracurricular activities or fitness. These challenges affect all gender groups, including women, men, transgender individuals, and others. However, women face additional issues like gender bias, harassment, stereotype threat, and unequal opportunities, which harm their mental well-being, causing anxiety, stress, and depression. This paper investigates the challenges faced by women in the software sector that lead to emotional distress and job exits. We aim to develop a support system to address these issues. We first conducted a poll of 112 women employed in multiple software organizations worldwide to identify common challenges affecting their mental well-being. From this, we identified eight key issues. Next, we surveyed 80 women in various software companies to understand their perceptions of these challenges. The survey includes open-ended questions on the eight challenges identified from the poll. Using this data, we develop SOFTMENT (SOFTware sector MENTal well-being support system), a prototype that employs an emotion detection approach to generate Mental Health Scores (MH-Scores). These scores help prioritize individuals based on mental well-being, enabling psychologists to quickly identify and assist those in urgent need.
Integrating physiological signals such as electroencephalogram (EEG), with other data such as interview audio, may offer valuable multimodal insights into psychological states or neurological disorders. Recent advancements with Large Language Models (LLMs) position them as prospective "health agents" for mental health assessment. However, current research predominantly focuses on single data modalities, presenting an opportunity to advance understanding through multimodal data. Our study aims to advance this approach by investigating multimodal data using LLMs for mental health assessment, specifically through zero-shot and few-shot prompting. Three datasets are adopted for depression and emotion classifications incorporating EEG, facial expressions, and audio (text). The results indicate that multimodal information confers substantial advantages over single modality approaches in mental health assessment. Notably, integrating EEG alongside commonly used LLM modalities such as audio and images demonstrates promising potential. Moreover, our findings reveal that 1-shot learning offers greater benefits compared to zero-shot learning methods.
Chord progressions influence the emotional response a song evokes, suggesting altering a song's chords can help a system control the emotions that a user experiences when listening to sonified data. In this paper, we explore how chord progressions could be used to musically communicate wellness information that invokes specific emotions, and whether a person's musical background affects their perception of emotions in music. We conducted two studies where participants provided structured or open-ended perceptions of emotions in harmonies. Our findings suggest that the emotional impact of a chord progression can be changed by various factors, including the chord transitions, the type of chords, and the chord's mode. Additionally, we found that a person's musical background may influence nuances in their perception of emotions conveyed through chords. This implies that future systems may need to consider individual musical preferences to function effectively. These results will allow for more accurate and effective emotional sonification-based communication with the user.
This study explores integrating Mixed Reality (MR) and Human-Computer Interaction (HCI) technologies to develop an advanced assistive device. The device improves grip stability for individuals suffering from Rheumatoid Arthritis (RA) and Osteoarthritis (OA). It preserves autonomy and improves overall well-being by combining MR with a servo motor locking mechanism to provide enhanced grip assistance. The concept incorporates digital overlays that deliver real-time guidance and feedback to aid physical manipulation and enhance rehabilitation, making it more immersive and tailored to individual needs. It outlines an iterative design approach that merges ergonomic design principles with sophisticated engineering, demonstrating MR's pivotal role in enhancing digital and physical interactions. Initial feedback from users has indicated significant enhancements in task performance and satisfaction, highlighting the device's capacity to promote independence and mitigate user discomfort. This study underscores MR's substantial potential to redefine the functionality of assistive devices, ultimately fostering greater autonomy and improved quality of life for individuals living with RA and OA.
Over the past decades, Electrodermal Activity (EDA)-based stress detection systems have attracted increasing interest due to the significant impact of stress on daily life and the need for continuous, unobtrusive monitoring. The wearable form factor and the unobtrusive nature of the EDA sensor make it highly suitable for this purpose. Many researchers have focused on developing EDA-based stress detection techniques, but few consider the resource footprint of such systems. Maintaining a good trade-off between accuracy and resource footprint through manual optimization is a challenging task. Automation techniques can be very helpful in such scenarios. In this work, we introduce a novel method for the selection of an optimal feature subset, utilizing a deep Reinforcement Learning (RL) network on top of a Neural Architecture Search (NAS) framework to design tiny, customized models for on-device stress detection. The feature subset selection uses a multi-metric reward objective function to balance feature accuracy contributions against computational complexity, identifying the optimal feature subset. Initial analysis indicates that this optimal subset enhances accuracy while keeping the model size under 100 kB and minimally increasing computation.
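As a rough sketch of a multi-metric reward of the kind described, the function below trades accuracy against feature count and model size; the weights, the 100 kB budget, and the exact functional form are assumptions, not the paper's formula:

```python
def reward(accuracy: float, n_features: int, model_size_kb: float,
           max_features: int, size_budget_kb: float = 100.0,
           alpha: float = 0.3, beta: float = 0.3) -> float:
    """Reward = accuracy minus penalties for feature count and model size.

    alpha and beta weight the two cost terms; the RL agent selecting feature
    subsets would seek to maximize this value.
    """
    feature_cost = n_features / max_features          # fraction of features used
    size_cost = model_size_kb / size_budget_kb        # fraction of the size budget
    return accuracy - alpha * feature_cost - beta * size_cost

# e.g. reward(accuracy=0.91, n_features=8, model_size_kb=60.0, max_features=40)
```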
Just-in-Time Adaptive Interventions aim to deliver the right type and amount of support at the right time. This involves determining a user's state of receptivity - the degree to which a user is willing to accept, process, and use the intervention. Although past work has found that users are more receptive to notifications they view as useful, there is no existing research on whether users' intrinsic motivation for the underlying topic of mHealth interventions affects their receptivity. To explore this, we conducted a study with 20 participants over three weeks, where participants interacted with a chatbot-based digital coach to receive interventions about mental health, COVID-19, physical activity, and diet & nutrition. We found that significant differences in mean intrinsic motivation scores across topics were not associated with differences in mean receptivity metrics across topics. However, we discovered positive relationships between intrinsic motivation measures and receptivity for interventions about a topic.
The increased availability of long-term tracking of personal data using wearables has created an interest in research communities to develop feedback modalities that aid the success of wearable, interactive, and mobile systems. Personal data relevant to athletes can be grouped into four pillars: physical health, mental health, nutrition, and sleep. This position paper reviews four existing sport data management systems that track and monitor one or more of the four pillars and discusses their suitability for athletes. These systems are assessed based on their purpose, how well athlete goals (sport-related, short- and long-term) are supported, and locomotion limitations. This paper finds that the feedback existing systems provide to an athlete about their multifaceted physical and mental status needs to be further personalised to the individual's goals and environmental context. Our research aims to take the best of existing systems and enhance their capabilities to merge and visualise data about the four pillars and provide effective feedback that supports athlete goals.
The growing accessibility of wearables for sports encourages people to track their training progress and health information while exercising. The recorded information provides insights into the body and forms the basis for feedback systems. However, requirements for feedback design vary greatly depending on the type of sport, especially for activities involving the whole body, such as rowing. Rowing trains strength and endurance, does not place excessive stress on the leg joints, yet involves coordinating multiple muscles simultaneously, making it a complex sport. In this paper, we look into research on feedback modalities and review the advantages and disadvantages of feedback examples in Human-Computer Interaction, focusing on the rowing context.
In football matches, players perform various actions such as running, walking, passing, trapping, and shooting. Analyzing these actions is crucial for players to assess their performance and strategize effectively. Currently, the standard method for analyzing these actions involves setting up multiple cameras in the stadium to record the match and then analyzing the video footage obtained. However, this method is costly and is mainly accessible to professional teams with significant financial resources, and it is used in specific environments such as training facilities and stadiums. There is a need for a system that can be easily used by amateur football players. This study proposes a method for analyzing the actions of football players using IMUs (Inertial Measurement Units) attached to both legs. The evaluation of the proposed method revealed that it was feasible to detect three actions (passes with the right foot, passes with the left foot, and shots with the dominant foot) with an average F-score of 0.884 across three subjects.
This paper proposes a classifier based on a diffusion model for human activity recognition. To this end, we introduce three architectures: (1) a representation-conditional diffusion transformer, (2) a first classifier, RepcondFormer, and (3) a second classifier, RepcondClassifier. Experimental results show that RepcondFormer performs better on three of the datasets, while RepcondClassifier performs better on one dataset when fine-tuned.
In our daily lives, we often pour liquids into containers, and it can be challenging to accurately gauge the liquid level in opaque, narrow-mouthed containers. Visual inspection is not always reliable and can lead to spillage. Using a liquid-level sensor is one option, but it requires attaching a device to each container, which is complicated. We propose a method that uses sound to estimate the water level inside a container. This method is versatile and does not require installing a device for each container. We conducted evaluation experiments, and the results showed that the estimation accuracy averaged 0.462 for bottle-dependent and 0.308 for bottle-independent estimation models when estimating water levels from 0% to 100% in 10% increments. Additionally, the average estimation accuracy for determining whether the water level is above 90% was 0.744, even when using a bottle-independent estimation model. These results suggest that while there is room for improvement in the estimation accuracy, the proposed method has potential applications in natural environments for overflow detection. In the future, we plan to develop a faucet-mounted device with a function to stop water pouring just before the container fills up.
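The paper trains estimation models rather than applying a closed-form relation, but the physical intuition is that the air column above the liquid shortens, and its resonance rises, as the container fills. A minimal sketch assuming a closed-tube resonance approximation (f ≈ c/4L) and a simple spectral-peak estimate, with the frequency band and bottle geometry as hypothetical inputs:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 °C

def dominant_frequency(audio: np.ndarray, fs: int,
                       fmin: float = 100.0, fmax: float = 4000.0) -> float:
    """Strongest spectral peak of the pouring sound within a plausible band."""
    spectrum = np.abs(np.fft.rfft(audio * np.hanning(len(audio))))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return float(freqs[band][np.argmax(spectrum[band])])

def fill_level(freq: float, bottle_height_m: float) -> float:
    """Closed-tube approximation: resonance f = c / (4 * air_column_length)."""
    air_column = SPEED_OF_SOUND / (4.0 * freq)
    return float(np.clip(1.0 - air_column / bottle_height_m, 0.0, 1.0))
```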
The use of wearable devices and motion sensors has had a significant impact on sports. In this study, we propose a method to improve rowing performance by analyzing 3-axis acceleration data from smartwatches worn by rowers. The goal is to enhance synchronization and timing among multiple rowers on the same boat, which is crucial for competitive rowing. We conducted experiments using a Concept2 rowing ergometer and collected data from rowers wearing Apple Watches. The collected data was processed through filtering, peak detection, and distance calculations (DTW and Euclidean). Our evaluation demonstrated that the proposed method effectively identified discrepancies, providing a quantitative basis for performance improvement. The results emphasize the potential of wearable technology in providing precise, real-time feedback to rowers, ultimately assisting in better coordination and technique refinement.
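The abstract mentions DTW and Euclidean distances between rowers' stroke signals; the sketch below implements the classic DTW recurrence for two 1-D acceleration segments (for example, one stroke per rower after peak-based segmentation). It is an illustrative implementation, not the authors' code:

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic-time-warping distance between two 1-D stroke signals.

    A small distance suggests the two rowers' strokes are well synchronized;
    a large distance flags a timing or technique discrepancy.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])
```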
In this paper, we propose a person-intrinsic scaling method for our novel spatio-temporal encoding-based 2D-to-3D pose lifting technique for monocular fisheye video. While 2D pose estimation has advanced significantly, it lacks depth information and cannot capture out-of-plane movements; 3D pose estimation is required to handle this problem. This paper proposes a framework for 2D-to-3D pose lifting that tackles the challenges of fisheye videos. Our method addresses fisheye deformation through a person-intrinsic scaling approach and leverages a transformer architecture with spatio-temporal self-attention for pose lifting. We introduce a novel per-person scaling method to handle the depth ambiguity inherent in fisheye images. On a real-life care-facility dataset, our framework outperformed MediaPipe, a state-of-the-art method for pose lifting, on four different types of activity poses, with mean RMSE, MAE, and MAPE values of 0.082, 0.047, and 0.046, respectively. This approach shows a new direction for 2D-to-3D pose lifting techniques.
In this paper, we aim to show that by utilizing the temporal and context information in care record data in a well-guided manner, Large Language Models (LLMs) can help us correct inaccurate and incomplete activity logs. Nursing activity logs are essential for documenting patient care but often suffer from incomplete or inaccurate time data due to nurses' workload. This deficiency hinders care analysis, resource allocation, and quality control. While machine learning struggles with the complexities of nursing narratives, LLMs hold promise. This research explores using LLMs to address missing and incorrect timestamps in nursing logs. We propose a temporal-context-aware prompting strategy that guides LLMs to reason about the relationship between patient condition and activity occurrence and duration. In our experiment with five prominent LLMs, Gemini achieved the best accuracy (82%) with a manageable hallucination rate (18%) for the Few-Shot Chain-of-Thought (CoT) prompting method. Our results suggest that by guiding the LLM in the right direction, it is possible to improve care record logs. This approach has the potential to significantly improve the quality of nursing activity data, leading to better patient outcomes.
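The exact prompting strategy is not given in the abstract; the template below is a hypothetical illustration of temporal-context-aware Chain-of-Thought prompting for correcting a single log entry (field names and the JSON output format are assumptions):

```python
PROMPT_TEMPLATE = """You are assisting with nursing activity logs.
Patient condition: {patient_condition}
Recorded activity: {activity}
Logged start time: {start_time}   Logged duration: {duration}
Nearby log entries: {context_entries}

Think step by step about whether the logged start time and duration are
plausible given the patient's condition and the surrounding entries.
If they are implausible or missing, propose corrected values.
Answer as JSON: {{"start_time": ..., "duration_minutes": ..., "reasoning": ...}}"""

def build_prompt(entry: dict, context_entries: list[str]) -> str:
    """Fill the template for one log entry; 'MISSING' marks absent timestamps."""
    return PROMPT_TEMPLATE.format(
        patient_condition=entry["condition"],
        activity=entry["activity"],
        start_time=entry.get("start_time", "MISSING"),
        duration=entry.get("duration", "MISSING"),
        context_entries="; ".join(context_entries),
    )
```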
Human Activity Recognition is a time-series analysis problem. A popular analysis procedure used by the community assumes an optimal window length to design recognition pipelines. However, in the scenario of smart homes, where activities are of varying duration and frequency, the assumption of a constant-sized window does not hold. Additionally, previous works have shown these activities to be made up of building blocks. We focus on identifying these underlying building blocks, or structural constructs, with the use of large language models. Identifying these constructs can be beneficial especially in recognizing short-duration and infrequent activities, which current systems cannot recognize. We also propose the development of an activity recognition procedure that uses these building blocks to model activities, thus helping the downstream task of activity monitoring in smart homes.
In this paper, we improve nurse activity recognition by employing a Large Language Model (LLM) to generate synthetic pose estimation data. Keypoint data extracted using You Only Look Once v7 (YOLOv7) from recorded videos of a single nurse performing endotracheal suctioning (ES) activities is used as a database. We explore the issue of data imbalances that hinder the effectiveness of activity recognition algorithms. To counter this, we utilize LLMs to artificially augment the dataset by generating varied synthetic samples through prompting strategies with different content and context. A Random Forest (RF) classifier is trained on annotations of medical activities and corresponding keypoints. Additionally, we generate synthetic datasets in equal volumes using Random Sampling and Generative Adversarial Networks (GAN) to benchmark against our LLM-based approach. To evaluate, we compare the performance of the baseline data against the different augmentation approaches. The similarity between original and synthetic data is measured using the Kolmogorov-Smirnov (K-S) test. The proposed approach, which prompts the LLM with an explanation of the task and a description of the dataset to generate synthetic data, improved the overall ES classification performance. Our study illustrates the critical role of context and content in prompts for optimizing LLMs for synthetic data generation.
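For the similarity check, the two-sample Kolmogorov-Smirnov test can be applied per keypoint dimension; a minimal sketch using SciPy, where the array shapes are assumptions:

```python
import numpy as np
from scipy.stats import ks_2samp

def keypoint_similarity(real: np.ndarray, synthetic: np.ndarray) -> dict:
    """Per-dimension two-sample K-S test between real and synthetic keypoints.

    real, synthetic: arrays of shape (n_samples, n_keypoint_dims).
    Small K-S statistics (and large p-values) indicate similar distributions.
    """
    results = {}
    for d in range(real.shape[1]):
        stat, p = ks_2samp(real[:, d], synthetic[:, d])
        results[d] = {"ks_stat": float(stat), "p_value": float(p)}
    return results
```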
In this work, we explore the use of a novel neural network architecture, Kolmogorov-Arnold Networks (KANs), as feature extractors for sensor-based (specifically IMU) Human Activity Recognition (HAR). Where conventional networks perform a parameterized weighted sum of the inputs at each node and then feed the result into a statically defined nonlinearity, KANs perform non-linear computations represented by B-splines on the edges leading to each node and then simply sum up the inputs at the node. Instead of learning weights, the system learns the spline parameters. In the original work, such networks have been shown to be able to more efficiently and exactly learn sophisticated real-valued functions, e.g., in regression or the solution of partial differential equations (PDEs). We hypothesize that such an ability is also advantageous for computing low-level features for IMU-based HAR. To this end, we have implemented KAN as the feature extraction architecture for IMU-based human activity recognition tasks, including four architecture variations. We present an initial performance investigation of the KAN feature extractor on four public HAR datasets. It shows that the KAN-based feature extractor outperforms the CNN-based extractors of CNN-MLP architecture-based models on all datasets while being more parameter-efficient.
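To make the edge-wise nonlinearity idea concrete, here is a simplified KAN-style layer in PyTorch in which each edge applies its own learnable 1-D function expressed over a fixed Gaussian basis (standing in for the B-spline parameterization of the original work) and each node sums its incoming edge outputs; this is a sketch of the principle, not the feature extractor evaluated in the paper:

```python
import torch
import torch.nn as nn

class EdgeBasisLayer(nn.Module):
    """Simplified KAN-style layer: one learnable 1-D function per edge,
    parameterized as a weighted sum of fixed Gaussian bumps, summed at each node."""

    def __init__(self, in_dim: int, out_dim: int, n_basis: int = 8,
                 x_range: tuple = (-2.0, 2.0)):
        super().__init__()
        centers = torch.linspace(x_range[0], x_range[1], n_basis)
        self.register_buffer("centers", centers)                 # (n_basis,)
        self.width = (x_range[1] - x_range[0]) / n_basis
        # one coefficient vector per edge: (out_dim, in_dim, n_basis)
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, n_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:          # x: (batch, in_dim)
        # evaluate the basis at every input value: (batch, in_dim, n_basis)
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # per-edge function values, summed over incoming edges at each output node
        return torch.einsum("bin,oin->bo", basis, self.coeffs)
```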
Although object detection technology using cameras offers potential for various applications, it incurs dataset creation costs to train new models where general-purpose models are ineffective, such as in industrial settings. We have previously developed a semi-automated annotation framework that employs optical flow and representation learning techniques to reduce human effort significantly. However, it was likely to cause unintended annotation omissions and mistakes compared to manual annotation. In this study, we propose a composite image generation approach to create omission-free and pattern-rich datasets. The proposed method synthesizes natural-looking images without unannotated targets by placing labeled foreground segments at their original positions on targetless background frames collected with the same fixed-point cameras. Evaluation with video footage in a logistics warehouse confirmed that improved dataset reliability led to higher model performance.
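A minimal sketch of the compositing step, assuming each labeled foreground segment carries its pixel patch, binary mask, original top-left position, and label (the actual pipeline, including any blending and collision checks, is not specified in the abstract):

```python
import numpy as np

def composite(background: np.ndarray, segments: list[dict]):
    """Paste labeled foreground segments onto a targetless background frame.

    Each segment dict holds 'patch' (H, W, 3), a boolean 'mask' (H, W),
    its original 'top_left' corner (y, x), and its 'label'. Returns the
    composite image and bounding-box annotations in (x, y, w, h) form.
    """
    image = background.copy()
    annotations = []
    for seg in segments:
        y, x = seg["top_left"]
        h, w = seg["mask"].shape
        region = image[y:y + h, x:x + w]           # view into the composite
        region[seg["mask"]] = seg["patch"][seg["mask"]]
        annotations.append({"label": seg["label"], "bbox": (x, y, w, h)})
    return image, annotations
```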
Context awareness is key to developing intelligent voice assistants that offer situated support for users performing various daily tasks, like cooking and machine use. Human Activity Recognition (HAR) with various sensors can be a powerful approach to provide assistants with user context, i.e., what they are doing within the task procedure. This workshop paper introduces PrISM (Procedural Interaction from Sensing Module), a framework for developing and evaluating assistants for procedural tasks that utilize the context gained through HAR. The framework consists of several modules: data collection, HAR, postprocessing of HAR outputs with task knowledge represented as a graph structure, and situated interactions such as step reminders and question answering. The framework is developed to be generalizable to different procedural tasks and input sensor sets (e.g., smartwatch, wearable camera, ambient sensor, etc.). Developers and designers can use the framework to register a new task, collect an initial dataset, gauge expected accuracy regarding step tracking and interactions, and deploy the pipeline. We demonstrate example use cases in a cooking task and discuss future work.
Many IMU (inertial measurement unit)-based human activity recognition (HAR) models have been developed. However, these models lack robustness to misalignment of the IMU mounting position. We therefore compared and verified the robustness of existing IMU-based HAR models in terms of classification performance when the mounting position differs between training and estimation, to clarify the characteristics of each method. We used the Meshed IMU Garment HAR Dataset (MIGHAR Dataset) for the evaluation. We compared the decline in recognition performance due to the distance between IMUs using five different models: DeepConvLSTM, ICGNet, iSPLInception, mobileHART, and rTsfNet. All models showed a decrease in recognition accuracy. However, rTsfNet was the most robust, maintaining 55.04%PSIMU (Performance Percentage compared to training/evaluation using the Same IMU) even at the maximum Manhattan distance and showing a Macro F1-score (mf1) of 0.4384 ± 0.0672. Second place went to mobileHART, with 37.56%PSIMU and 0.2801 ± 0.0526 mf1 at the maximum Manhattan distance. DeepConvLSTM and ICGNet were about the same, with 28.34%PSIMU and 29.16%PSIMU at the maximum Manhattan distance and 0.1828 ± 0.0675 and 0.1900 ± 0.0423 mf1, respectively, and iSPLInception achieved 22.75%PSIMU and 0.1712 ± 0.0518 mf1. The conclusion was that the dynamic 3D rotation of sensor data built into rTsfNet ensures robustness against IMU mounting-position errors.
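The %PSIMU figures appear to express cross-position performance as a percentage of the same-IMU performance; under that assumed definition, the metric reduces to:

```python
def psimu(cross_position_score: float, same_imu_score: float) -> float:
    """Performance percentage relative to training and evaluating on the same IMU
    (the assumed definition of the %PSIMU figures quoted in the abstract)."""
    return 100.0 * cross_position_score / same_imu_score
```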
User authentication methods, such as biometric and behavior-based authentication, are important for smart glasses. Smart glasses can collect data containing private information such as videos, audio, and behavioral information, as well as access personal information like emails and social networking services (SNS). This paper proposes a method that uses gaze information for user authentication by displaying an image known only to the user, registered in advance, together with several images generated from the registered image. These images are displayed on the screen of the smart glasses. Gaze information is obtained, and authentication is conducted using an LSTM. Evaluation experiments were carried out by obtaining gaze information from eight subjects. Data from five subjects were obtained using a stationary device, while data from the remaining three subjects were obtained using a glasses-type device. The overall results showed 81.2% and 84.4% authentication success rates using the stationary device and the glasses-type device, respectively.
Face authentication technology is a common form of smartphone user authentication, but it has vulnerabilities that a malicious person can exploit. Various security measures have been implemented to prevent face authentication from being breached using a sleeping registered user, a photo, or a 3D mask. This paper proposes a method for a system that authenticates a registered user only when they perform face authentication of their own volition. The method combines normal face authentication with the acquisition of two pulse waves: one from the front-camera face image and one from the rear-camera finger image. Peaks and valleys are identified from the sampled pulse wave data of the face and finger, and pairs of peaks and valleys from the two pulse waves are formed. If the sum of the average time differences between paired peaks and paired valleys falls within the threshold values, the system determines that the face and finger belong to the same person. The evaluation experiment was conducted under three distinct lighting conditions: a ring light environment, a smartphone light environment, and a natural light environment. The best EER value reached 0.005 using 5-second pulse wave data in the ring light environment.
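A minimal sketch of the peak/valley pairing check, assuming both pulse waves are already extracted and resampled to a common rate; the peak-spacing constraint and decision threshold are hypothetical, not the paper's values:

```python
import numpy as np
from scipy.signal import find_peaks

def peak_valley_times(pulse: np.ndarray, fs: float):
    """Timestamps (s) of peaks and valleys in a sampled pulse wave."""
    peaks, _ = find_peaks(pulse, distance=int(0.4 * fs))    # >= 0.4 s apart (~150 bpm max)
    valleys, _ = find_peaks(-pulse, distance=int(0.4 * fs))
    return peaks / fs, valleys / fs

def same_owner(face_pulse, finger_pulse, fs, threshold_s=0.25):
    """Decide whether face and finger pulse waves belong to the same person
    by summing the mean time offsets of paired peaks and paired valleys."""
    fp, fv = peak_valley_times(face_pulse, fs)
    gp, gv = peak_valley_times(finger_pulse, fs)
    n_p, n_v = min(len(fp), len(gp)), min(len(fv), len(gv))
    peak_diff = np.mean(np.abs(fp[:n_p] - gp[:n_p]))
    valley_diff = np.mean(np.abs(fv[:n_v] - gv[:n_v]))
    return (peak_diff + valley_diff) < threshold_s
```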
In the field of emotion recognition, traditional methods often rely on motion capture technologies to recognize human emotions by analyzing body motion. However, these methods are privacy-intrusive and impractical for everyday use. To address the requirements of privacy and practicality, this paper develops a novel personalized Automatic Emotion Recognition (AER) system utilizing inertial measurement units (IMUs) embedded in common wearable devices. Our approach emphasizes personalization to adapt to cultural and individual variations in emotional expression. To reduce the amount of data that needs to be collected from users, we employ cross-modality transfer approaches. These allow us to generate virtual IMU data from established human motion datasets, such as Motion-X and Mocap, thus enriching our training set without extensive real-world data collection. By integrating this virtual IMU data with real IMU data collected from participants, we have developed a personalized wearable-based AER system that is both less intrusive and more practical for real-world applications.
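Virtual IMU data of this kind is often derived by differentiating mocap trajectories; the simplified sketch below produces world-frame specific force (linear acceleration plus gravity) and deliberately omits the orientation handling a full cross-modality pipeline would need:

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, 9.81])  # world-frame gravity, m/s^2

def virtual_accelerometer(positions: np.ndarray, fs: float) -> np.ndarray:
    """Approximate accelerometer readings from mocap joint positions.

    positions: (T, 3) world-frame trajectory of the virtual sensor location.
    Returns (T-2, 3) specific-force estimates in the world frame; rotating
    them into the sensor frame is left out of this sketch.
    """
    dt = 1.0 / fs
    velocity = np.diff(positions, axis=0) / dt
    acceleration = np.diff(velocity, axis=0) / dt
    return acceleration + GRAVITY
```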
This work presents the solution of the Signal Sleuths team for the 2024 HASCA WEAR challenge. The challenge focuses on detecting 18 workout activities (and the null class) using accelerometer data from 4 wearables, one worn on each limb. Data analysis revealed inconsistencies in wearable orientation within and across participants, leading us to explore novel multi-wearable data augmentation techniques. We investigate three models using a fixed feature set: (i) "raw": using all data as is, (ii) "left-right swapping": augmenting data by swapping left and right limb pairs, and (iii) "upper-lower limb pairing": stacking data by using upper-lower limb pair combinations (2 wearables). Our experiments utilize traditional machine learning with multi-window feature extraction and temporal smoothing. Using 3-fold cross-validation, the raw model achieves a macro F1-score of 90.01%, whereas left-right swapping and upper-lower limb pairing improve the scores to 91.30% and 91.87%, respectively.
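A minimal sketch of the left-right swapping augmentation, assuming windows are (time, channels) arrays with known channel indices for the left and right limb sensors; depending on sensor mounting, a sign flip of the mediolateral axis may also be needed, which is omitted here:

```python
import numpy as np

def swap_left_right(window: np.ndarray, left_idx: list, right_idx: list) -> np.ndarray:
    """Create an augmented window by exchanging the channel blocks of the
    left and right limb wearables (e.g. left arm <-> right arm, left leg <-> right leg).

    window: (time, channels) array; left_idx / right_idx are matching channel lists.
    """
    augmented = window.copy()
    augmented[:, left_idx], augmented[:, right_idx] = (
        window[:, right_idx], window[:, left_idx])
    return augmented
```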
In this report, we describe the technical details of our submission to the WEAR Dataset Challenge 2024. For this competition, we use two approaches to boost the performance of the official WEAR GitHub repository: (1) integration of a Temporal-Informative Adapter (TIA) into the models of the WEAR repository, and (2) data augmentation techniques to enrich the provided test dataset. Our method achieves roughly 4.7% better results on the test set of the WEAR Dataset Challenge 2024 compared to the baseline of the WEAR repository.
The paper summarizes the contributions of participants to the sixth Sussex-Huawei Locomotion-Transportation (SHL) Recognition Challenge organized at the HASCA Workshop of UbiComp/ISWC 2024. The goal of this machine learning/data science challenge is to recognize eight locomotion and transportation activities (Still, Walk, Run, Bike, Bus, Car, Train, Subway) from the motion (accelerometer, gyroscope, magnetometer) sensor data of a smartphone in a way that is user-independent and smartphone-position-independent, as well as robust to missing data during deployment. The training data of a 'train' user is available from smartphones placed at four body positions (Hand, Torso, Bag and Hips). The testing data originates from 'test' users with a smartphone placed at one of three body positions (Torso, Bag or Hips). In addition, the test data has one or multiple sensor modalities randomly missing from each time frame (5 seconds). Such a scenario may occur if a device dynamically turns sensors on and off to save power, or due to limited computational or memory capacity. We introduce the dataset used in the challenge and the protocol of the competition. We present a meta-analysis of the contributions from 7 submissions, their approaches, the software tools used, the computational cost, and the achieved results. Overall, one submission achieved an F1 score between 70% and 80%, two between 60% and 70%, three between 50% and 60%, and one below 50%. Finally, we present a baseline implementation addressing missing sensor modalities.
Human activity recognition (HAR) has developed rapidly in recent years due to its widespread applications in motion analysis, mobile health monitoring, security, and rehabilitation. However, due to missing sensor data, complex application scenarios, and poor model robustness, existing HAR algorithms still cannot meet application requirements. In this context, the Sussex-Huawei Locomotion (SHL) recognition challenge provides a dataset for improving HAR algorithms. In this study, our team (SIAT-BIT) proposes a three-branch convolutional neural network framework for the SHL recognition challenge. First, the data is preprocessed for feature extraction; then, three classifiers are trained in parallel using three cross-entropy loss functions. The experimental results show that the proposed model achieves the best performance with the fewest model parameters. In addition, we further improve performance through post-smoothing. Finally, we achieve an average accuracy of 0.9274 on the validation dataset.
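The following PyTorch sketch shows one plausible reading of a three-branch convolutional network trained with three cross-entropy losses; the layer sizes, kernel widths, and the averaged-softmax inference rule are assumptions rather than the SIAT-BIT architecture.

```python
import torch
import torch.nn as nn

class ThreeBranchCNN(nn.Module):
    def __init__(self, in_channels=9, n_classes=8):
        super().__init__()
        def branch(k):
            return nn.Sequential(
                nn.Conv1d(in_channels, 32, kernel_size=k, padding=k // 2), nn.ReLU(),
                nn.Conv1d(32, 64, kernel_size=k, padding=k // 2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1), nn.Flatten())
        # Three parallel branches with different receptive fields.
        self.branches = nn.ModuleList([branch(k) for k in (3, 7, 15)])
        self.heads = nn.ModuleList([nn.Linear(64, n_classes) for _ in range(3)])

    def forward(self, x):  # x: (batch, channels, time)
        return [head(b(x)) for b, head in zip(self.branches, self.heads)]

def total_loss(logits_list, target, criterion=nn.CrossEntropyLoss()):
    """Sum of the three cross-entropy losses, one per branch classifier."""
    return sum(criterion(logits, target) for logits in logits_list)

def predict(model, x):
    """Average the softmax outputs of the three branches for inference."""
    with torch.no_grad():
        probs = torch.stack([torch.softmax(l, dim=1) for l in model(x)]).mean(0)
    return probs.argmax(dim=1)
```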
In this paper, we describe the human activity recognition method of team TDU_BSA for the Sussex-Huawei Locomotion-Transportation (SHL) recognition challenge 2024. Using ensemble learning, deep learning, XGBoost, and LightGBM algorithms, we obtained high accuracy in estimating the activities even when one of the following sensors was missing: accelerometer, gyroscope, or magnetometer. Using deep learning, eight activities were classified in a stepwise approach. XGBoost and LightGBM were used for activity estimation based on selected features obtained from the sensor data. The calculated F1 score on the SHL validation set was 82.5%.
It is evident that large language models are not only capable of generating text and images but also of facilitating discussions on the latest research in the field of human activity recognition due to their extensive knowledge. The objective of this paper is to observe how novices in machine learning in 2024 can rapidly develop their skills with the assistance of ChatGPT. As an initial introduction, we started with a dialogue-based consultation with ChatGPT and executed a classification task using the Iris dataset on Google Colaboratory. This initial task served to deepen the novices' basic understanding of machine learning. Subsequently, the authors, as part of Team Shonan-Blue, participated in the Sussex-Huawei Locomotion-Transportation (SHL) recognition challenge (hereafter SHL recognition challenge). In this challenge, the authors developed a simple human activity recognition system without expert advice, relying solely on interactions with ChatGPT. The system was constructed by integrating four classical methods: the k-Nearest Neighbor algorithm (kNN), a Neural Network (NN), a Random Forest (RF), and a Support Vector Machine (SVM). The models were selected for their differing approaches, which were anticipated to enhance prediction accuracy when combined. For feature extraction, we utilized only four primary features (maximum, minimum, mean, standard deviation) derived from 5-second sensor data collected by accelerometers, gyroscopes, and magnetometers. The outputs of the four models were then combined in an ensemble to obtain the final prediction results. This yielded an F1 score of 0.936. Although the limited challenge period may not have permitted optimal performance, the observed growth of novices who, with the assistance of ChatGPT and without the use of GPU power, managed to develop a functional system is encouraging for future newcomers to the field of activity recognition.
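The described pipeline (four window statistics per axis fed to a soft-voting ensemble of kNN, NN, RF, and SVM) can be sketched with scikit-learn as follows; the hyperparameters and the `window_features`/`build_ensemble` helpers are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def window_features(window):
    """Max, min, mean and std per sensor axis of one 5-second window (time, channels)."""
    return np.concatenate([window.max(axis=0), window.min(axis=0),
                           window.mean(axis=0), window.std(axis=0)])

def build_ensemble():
    """Soft-voting ensemble of the four classical models named in the abstract."""
    return VotingClassifier(
        estimators=[
            ("knn", KNeighborsClassifier(n_neighbors=5)),
            ("nn", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)),
            ("rf", RandomForestClassifier(n_estimators=200)),
            ("svm", SVC(probability=True)),
        ],
        voting="soft")

# Usage (X_windows: list of (time, channels) arrays, y: labels):
# X = np.stack([window_features(w) for w in X_windows])
# model = make_pipeline(StandardScaler(), build_ensemble()).fit(X, y)
```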
The Sussex-Huawei Locomotion-Transportation (SHL) recognition challenge organized at the HASCA Workshop of UbiComp 2024 presents a large and realistic dataset covering different locomotion and transportation activities. The goal of this human activity challenge is to recognize 8 modes of locomotion and transportation (activities) in a user-independent manner from opportunistically collected motion sensor data. In this paper, our team (We can fly) summarizes our submission to the competition. We propose an interpolation-attention-based Kolmogorov-Arnold network (KAN-IA), a deep learning method for recovering missing variable representations and classifying transportation modes. Our KAN-IA model first extracts individual motion sensor feature representations with three KANs; the missing and normal sensor representations are then fed into the IA module to recover the missing variable representation. Finally, a linear layer outputs the classification predictions. Our post-processing operations include three modules: PCA-based transformation, random masking of sensor data, and ensemble learning. In the experiment, we achieved an average F1 score of 0.55 on the validation dataset.
This paper presents the approach and results of Team Bun-Bo for the Sussex-Huawei Locomotion-Transportation (SHL) recognition challenge focusing on human activity detection. The objective of the challenge is to classify 8 modes of transportation - standing still, walking, running, biking, car, bus, train, and subway - using data collected from Inertial Measurement Units (IMUs). Our approach involved extensive feature extraction from raw and processed kinematic indices, including acceleration, gyroscope, and magnetic data. Specific features included total acceleration across 3 axes and the horizontal plane, angles between different sensor axes, and successive differences of kinematic indices. We applied three ensemble boosting models under varying training and validation scenarios. Our findings highlight the Extreme Gradient Boosting Classifier (XGB), which achieved superior performance with an accuracy of 92% and an F1-score of 93%, outperforming other classifiers by margins ranging from 5% to 15%. This study underscores the effectiveness of advanced feature extraction techniques and ensemble learning models in enhancing the accuracy of transportation mode recognition systems based on IMU data.
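A hedged sketch of the kind of hand-crafted kinematic features mentioned above (total acceleration, horizontal-plane magnitude, an axis angle, and successive differences); the exact feature list and parameters used by Team Bun-Bo are not reproduced here.

```python
import numpy as np

def kinematic_features(acc):
    """acc: (time, 3) accelerometer window. Returns a small feature vector."""
    total = np.linalg.norm(acc, axis=1)                 # 3-axis total acceleration
    horizontal = np.linalg.norm(acc[:, :2], axis=1)     # horizontal-plane magnitude
    # Angle between the acceleration vector and the (assumed) vertical z axis.
    angle_z = np.arccos(np.clip(acc[:, 2] / np.maximum(total, 1e-8), -1.0, 1.0))
    diff = np.diff(total)                               # successive differences
    feats = []
    for series in (total, horizontal, angle_z, diff):
        feats += [series.mean(), series.std(), series.min(), series.max()]
    return np.asarray(feats)
```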
The main objective of this study is to develop an algorithmic pipeline that recognizes human locomotion activities using motion sensor data from smartphones. The pipeline aims to minimize classification errors caused by individual differences and variations in sensor measurement locations. In particular, the dataset provided for the 2024 SHL recognition challenge comprises three sensor modalities, with certain motion sensors randomly missing. To address this challenge, our team, 'HELP', presents an algorithmic pipeline that combines a convolutional neural network architecture with hand-crafted feature engineering to accommodate diverse features from the motion sensor modalities. We also specify the preprocessing schemes used to create the augmented input for training the proposed pipeline. We then conduct experiments comparing its performance with existing machine learning classifiers, verifying its relative superiority.
This work presents the solution of the Signal Sleuths team for the 2024 SHL recognition challenge. The challenge involves detecting transportation modes using shuffled, non-overlapping 5-second windows of phone movement data, with exactly one of the three available modalities (accelerometer, gyroscope, magnetometer) randomly missing. Data analysis indicated a significant distribution shift between train and validation data, necessitating a magnitude and rotation-invariant approach. We utilize traditional machine learning, focusing on robust processing, feature extraction, and rotation-invariant aggregation. An ablation study showed that relying solely on the frequently used signal magnitude vector results in the poorest performance. Conversely, our proposed rotation-invariant aggregation demonstrated substantial improvement over using rotation-aware features, while also reducing the feature vector length. Moreover, z-normalization proved crucial for creating robust spectral features.
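The following sketch illustrates what a rotation-invariant aggregation over a 3-axis window could look like, combining the signal magnitude vector, singular values (which are unchanged by any rotation of the axes), and spectral features of the z-normalized magnitude; this is an assumed construction, not the team's feature set.

```python
import numpy as np

def rotation_invariant_features(window):
    """window: (time, 3) array for one modality (e.g. accelerometer)."""
    mag = np.linalg.norm(window, axis=1)        # signal magnitude vector
    # Singular values of the centered window are invariant to axis rotation.
    sv = np.linalg.svd(window - window.mean(axis=0), compute_uv=False)
    # z-normalise the magnitude before computing spectral features.
    z = (mag - mag.mean()) / (mag.std() + 1e-8)
    spectrum = np.abs(np.fft.rfft(z))
    top_bins = np.sort(spectrum)[-5:]           # 5 strongest spectral components
    return np.concatenate([[mag.mean(), mag.std()], sv, top_bins])
```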
In this study, we provide a detailed analysis of data augmentation techniques applied to inertial sensor data from the WEAR Dataset, aimed at improving Human Activity Recognition (HAR) in the WEAR Challenge. Augmentation techniques such as CutMix, MixUp, jittering, and magnitude scaling were employed to enhance the diversity and robustness of the training dataset. These techniques aim to mitigate the challenges posed by the limited and imbalanced datasets commonly encountered in wearable sensor data. We systematically applied these augmentation methods to a standard inertial dataset and evaluated their impact on the performance of various machine learning models. Our experimental results demonstrate significant improvements (about a 3% increase in F1-score and an 8% increase in recall) in classification accuracy and model generalization, highlighting the efficacy of data augmentation in HAR applications.
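Minimal sketches of the named augmentations adapted to IMU windows (jittering, magnitude scaling, MixUp, and a time-segment CutMix); the parameter defaults and the segment-swap variant of CutMix are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(x, sigma=0.05):
    """Add Gaussian noise to an IMU window x of shape (time, channels)."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def magnitude_scale(x, sigma=0.1):
    """Scale each channel by a random factor close to 1."""
    return x * rng.normal(1.0, sigma, size=(1, x.shape[1]))

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Convex combination of two windows and their one-hot labels."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def cutmix(x1, y1, x2, y2):
    """Replace a random time segment of x1 with the same segment of x2."""
    t = x1.shape[0]
    start = rng.integers(0, t)
    end = min(start + rng.integers(1, t // 2 + 1), t)
    out = x1.copy()
    out[start:end] = x2[start:end]
    lam = 1 - (end - start) / t
    return out, lam * y1 + (1 - lam) * y2
```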
Visual fatigue poses a significant challenge in Human-Information Interaction (HII) in VR. Target redirection, a method that helps mitigate shoulder fatigue, could potentially help reduce visual fatigue by guiding our eyes to move in certain ways. This paper proposes a study using eye movement redirection in VR to reduce visual fatigue from immersive HII tasks. We highlight two primary challenges, speed and range of redirection, that directly impact the user experience and effectiveness. We further introduce two studies aimed at tackling these problems: 1) to find the best parameters for redirection and 2) to evaluate the technique's usability. We hypothesize that this innovative approach will greatly contribute to user-friendly and safe gaze-based HII in VR, enhancing user experience and performance by reducing visual fatigue.
Electroencephalography (EEG) classification is a crucial task in neuroscience, neural engineering, and several commercial applications. Traditional EEG classification models, however, have often overlooked or inadequately leveraged the brain's topological information. Recognizing this shortfall, there has been a burgeoning interest in recent years in harnessing the potential of Graph Neural Networks (GNN) to exploit topological information by modeling features selected from each EEG channel in a graph structure. To further facilitate research in this direction, we introduce GNN4EEG, a versatile and user-friendly toolkit for GNN-based modeling of EEG signals. GNN4EEG comprises three components: (i) a large benchmark constructed with four EEG classification tasks based on EEG data collected from 123 participants; (ii) easy-to-use implementations of various state-of-the-art GNN-based EEG classification models, e.g., DGCNN and RGNN; and (iii) implementations of comprehensive experimental settings and evaluation protocols, e.g., data splitting and cross-validation protocols. GNN4EEG is publicly released at https://github.com/Miracle-2001/GNN4EEG.
This summary outlines my research toward developing intelligent wearable assistants that provide personalized, context-aware computing assistance. Previous work explored information presentation using smart glasses, socially-aware interactions, and applications for learning, communication, and documentation. Current research aims to develop tools for interaction research, including data collection, multimodal evaluation metrics, and a platform for creating context-aware AI assistants. Future goals include extending assistants to physical spaces via telepresence, optimizing learning with generative AI, and investigating collaborative human-AI learning. Ultimately, this research seeks to redefine how humans receive seamless support through proactive, intelligent wearable assistants that comprehend users and environments, augmenting capabilities while reducing reliance on manual labor.
This paper investigates user performance and gaze-hand coordination errors in gaze-pinch interaction across different task complexities involving gaze and hand gestures. We designed gaze-based single target pointing tasks with varying levels of gaze complexity: simple target, target with a visual cue, and target requiring visual search. Hand gesture complexity included simple thumb-index pinch and multi-finger pinching gestures (thumb-index, thumb-middle, thumb-ring). Our findings reveal that gaze-pinch coordination errors predominantly occur due to late triggering, where the gaze shifts away from the target before pinching begins. While hand gesture complexity did not significantly affect error rates, tasks requiring additional visual attention, like target recognition or visual search, increased gaze-pinch error rates. The study also highlights the impact of individual differences on gaze-hand coordination during interactive tasks. These individual differences underscore the need for personalized approaches in designing gaze-hand interaction systems that cater to diverse user profiles. Future work aims to address these challenges with larger participant groups and further explore factors influencing gaze-hand coordination in interactive systems.
Heads-up computing together with AI can enhance in-class learning experiences. In this position paper, we propose the development of a multimodal AI system called DeepVision that integrates Automatic Speech Recognition (ASR), Large Language Models (LLM), Large Vision Models (LVM), Information Retrieval (IR) and Inclusive User Experience Design (IUX) to convert real-time lectures into multiple knowledge representations. These will be visualized on heads-up communication devices such as Augmented Reality (AR) and Mixed Reality (MR) devices. The initiative is a collaboration between Habitat Learn Limited (HLL) and the University of Southampton, leveraging HLL's existing software and extensive data repository to address the challenges of traditional and digital learning environments, especially for students with disabilities or language differences.
Smart glasses, augmented by advances in multimodal Large Language Models (LLMs), are at the forefront of creating ubiquitous Artificial General Intelligence (AGI). This short literature survey reviews the latest developments in integrating LLMs with smart glasses, emphasizing the technological enhancements, the opportunities, and the challenges faced. In addition to covering recent studies, it also proposes future research directions to harness the full potential of this emerging technology.
We examine the trajectory and potential of heads-up computing technologies within the current computing landscape. Despite significant advancements and investments, this technology has yet to achieve widespread market penetration comparable to that of smartphones. Will these technologies remain niche products or do they have the potential to displace traditional mobile devices? To become fundamental in everyday life, we posit that both technological evolution and a deeper understanding of their utility and social integration are crucial.
We propose a subtle one-handed finger interaction technique for heads-up computing, which exploits the proprioceptive advantages of the face as a contact surface, combined with the imperceptible nature of micro-gestures, to improve the social acceptability and privacy of hand-to-face gestures in public environments. Its implementation is based on a finger-worn IMU that collects data generated by finger movement. We verified the feasibility of this interaction technique in three public environments, and the results of the user study showed that it achieved good social acceptability and privacy.
Spatial Computing involves interacting with the physical world through spatial data manipulation, closely linked with Extended Reality (XR), which includes Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR). Large Language Models (LLMs) significantly enhance XR applications by improving user interactions through natural language understanding and content generation. Typical evaluations of these applications focus on user experience (UX) metrics, such as task performance, user satisfaction, and psychological assessments, but often neglect the technical performance of the LLMs themselves. This paper identifies significant gaps in current evaluation practices for LLMs within XR environments, attributing them to the novelty of the field, the complexity of spatial contexts, and the multimodal nature of interactions in XR. To address these gaps, the paper proposes specific metrics tailored to evaluate LLM performance in XR contexts, including spatial contextual awareness, coherence, proactivity, multimodal integration, hallucination, and question-answering accuracy. These proposed metrics aim to complement existing UX evaluations, providing a comprehensive assessment framework that captures both the technical and user-centric aspects of LLM performance in XR applications. The conclusion underscores the necessity for a dual-focused approach that combines technical and UX metrics to ensure effective and user-friendly LLM-integrated XR systems.
Music is a universal component of human culture that influences emotion and mental state and also has a direct impact on the physiological functioning of the body. Heart rate, breathing rate and heart rate variability have been shown to be affected by music listening, although the exact impact of combinations of musical features on these parameters is mostly unexplored. Exploring how these musical features influence physiology enhances our understanding of the potential of music as a tool for regulation and intervention. In this paper, we present EarTune, a system for predicting changes in physiological parameters and the subjective categorisation of the 'feeling' of a song using only the vital signs that can be collected with earables. With an accuracy of 70% for predicting the change in physiology due to music listening and an accuracy of 92% in predicting the user's 'feeling' of the song, EarTune paves the way towards systems that can tailor music suggestions to an individual's current physiological state, contextual state and emotional needs.
This paper addresses the critical task of gait cycle segmentation using short sequences from ear-worn IMUs, a practical and non-invasive approach for home-based monitoring and rehabilitation of patients with impaired motor function. While previous studies have focused on IMUs positioned on the lower limbs, ear-worn IMUs offer a unique advantage in capturing gait dynamics with minimal intrusion. To address the challenges of gait cycle segmentation using short sequences, we introduce the Gait Characteristic Curve Regression and Restoration (GCCRR) method, a novel two-stage approach designed for fine-grained gait phase segmentation. The first stage transforms the segmentation task into a regression task on the Gait Characteristic Curve (GCC), which is a one-dimensional feature sequence incorporating periodic information. The second stage restores the gait cycle using peak detection techniques. Our method employs Bi-LSTM-based deep learning algorithms for regression to ensure reliable segmentation for short gait sequences. Evaluation on the HamlynGait dataset demonstrates that GCCRR achieves over 80% Accuracy, with a Timestamp Error below one sampling interval. Despite its promising results, the performance lags behind methods using more extensive sensor systems, highlighting the need for larger, more diverse datasets. Future work will focus on data augmentation using motion capture systems and improving algorithmic generalizability.
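The restoration stage can be illustrated with a short peak-detection sketch over a predicted Gait Characteristic Curve; the sampling rate, minimum cycle duration, and the assumption of one GCC peak per cycle are illustrative, and the Bi-LSTM regression stage is omitted.

```python
import numpy as np
from scipy.signal import find_peaks

def restore_gait_cycles(gcc_pred, fs=100.0, min_cycle_s=0.6):
    """Return (start, end) sample indices of each gait cycle.

    gcc_pred: predicted Gait Characteristic Curve, one value per sample,
    assumed here to peak once per gait cycle.
    """
    min_dist = int(min_cycle_s * fs)           # upper bound on cadence
    peaks, _ = find_peaks(gcc_pred, distance=min_dist)
    return list(zip(peaks[:-1], peaks[1:]))    # consecutive peaks bound a cycle
```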
Inadequate toothbrushing habits are a leading cause of oral health problems such as tooth decay. Many individuals are uncertain if they are brushing effectively or over-focusing on specific areas. While high-end electric toothbrushes can address these concerns, manual toothbrushes remain widely used due to their simplicity and affordability. In this paper, we introduce BrushBuds, an earphone-based toothbrushing monitoring system aimed at tracking brushing areas, which leverages the ubiquitous presence of earphones to enhance manual toothbrushing. BrushBuds utilizes Inertial Measurement Units (IMUs) in earphones to detect subtle head movements incurred by toothbrushing. By capturing distinct motion patterns specific to brushing for each tooth region, BrushBuds can effectively track the toothbrushing process. Our evaluation demonstrates the feasibility of BrushBuds, showing an average accuracy of 84.3% in identifying six distinct tooth areas. By enhancing manual toothbrushing with IMU sensors in earphones, BrushBuds has the potential to significantly improve oral hygiene practices for a broad range of manual toothbrush users.
The Autonomic Nervous System (ANS) is a crucial component of human physiology, governing vital functions through its sympathetic and parasympathetic branches. The Vagus Nerve, a primary conduit of parasympathetic communication, has profound impacts on heart rate, digestion, and inflammation, with significant clinical implications for conditions like anxiety, depression, epilepsy, and chronic pain. Vagus Nerve Stimulation (VNS), especially noninvasive transcutaneous auricular VNS (taVNS), has shown promise in treating these conditions. However, standardized stimulation parameters remain elusive. This study investigates the effects of taVNS following physiological stress induced by a cold pressor test, using heart rate variability (HRV) metrics derived from ECG signals to quantify ANS perturbations and taVNS effects. Our findings highlight significant modulation of ANS activity, demonstrating the efficacy of taVNS in enhancing parasympathetic activity.
In order for an earable to function, it is necessary for the sensor electrodes within the ear to remain in constant contact with the skin. However, body movements tend to disrupt the extent of contact, leading to noisy signals being captured, which are often difficult to distinguish from a valid EEG signal. It is, therefore, important to identify which channels are capturing EEG when the data is being recorded. In this work, we present an innovative method for channel identification using the manifolds of an EEG signal. Furthermore, we test the ability of these manifolds using a clustering algorithm to classify EEG and non-EEG channels and achieve an accuracy of 87.09% for the classification. The proposed method will help enhance the performance of various applications pertaining to EEG monitoring and processing.
This paper investigates the potential of earables for real-time boxing gesture recognition. While prior research explores earables in sports, there is a gap in applying them to boxing, particularly for defensive manoeuvre recognition. We address this gap by exploring the capability of real-time Inertial Measurement Unit (IMU)-based boxing head gesture recognition using the open-source OpenEarable framework. We employ classical machine learning and dynamic time-warping (DTW) approaches. A dataset across left/right slips, rolls, and pullbacks is collected from a hobbyist boxer. Our results suggest that DTW combined with gesture templates derived from barycenter averaging achieves high gesture recognition accuracy. The implemented algorithm achieves a testing accuracy of 99% on the collected dataset. This performance is further validated in a real-world scenario, where the algorithm maintains an overall accuracy of 96%. Additionally, the system demonstrates robustness to variations in gesture execution speed and intensity.
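A compact sketch of DTW-based template matching for such gestures is shown below; note that a plain per-class template is assumed as a stand-in for barycenter averaging, and the OpenEarable data handling is omitted.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW between two sequences of shape (time, channels)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def classify(sample, templates):
    """templates: dict mapping gesture name -> template sequence. Returns best match."""
    scores = {name: dtw_distance(sample, t) for name, t in templates.items()}
    return min(scores, key=scores.get)
```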
Earphones, due to their deep integration into daily life, have been developed for unobtrusive and ubiquitous health monitoring. However, the underlying algorithms rely heavily on high-quality sensing data, and data collected with universal, one-size-fits-all earplugs can suffer from undesirable noise, such as vibration artifacts, or the earplugs may even fall out, limiting algorithm performance. In this regard, we build a dataset containing RGBD and IMU data captured by a smartphone. To provide a precise and solid ground truth, we employ additional control information from a robotic arm that holds the smartphone and scans ears along a predefined trajectory. With this dataset, we propose a tightly coupled information fusion algorithm for ground-truth ear modeling. Finally, we fabricate the earplugs and conduct an end-to-end evaluation of the wearability of the modeled earplugs in a user study.
Over the past several years, a growing body of literature has proposed systems that use earable-based acoustic sensing to assess cardiac function. These works have offered various explanations of how in-ear cardiac audio is produced. Most claim that the sounds are caused by compressive waves that travel directly from the chest, while others claim that the sounds are caused by the pulse wave producing arterial expansion near the ear canal. Although these explanations are not mutually exclusive, the lack of consensus raises questions about the working principles and possibilities in this growing research area. We present a series of experiments using a multimodal dataset of cardiac signals to test various hypotheses related to the production of heart sounds in the ear canal. Our results suggest that in-ear cardiac audio contains components produced by both compressive waves and pulse waves.
Electroencephalography (EEG) is a non-invasive tool for recording brain activity, typically using scalp electrodes. Ear EEG offers an alternative by leveraging the ear canal's proximity to the brain. This study evaluates the quality of signals from ear EEG devices compared to traditional scalp and intracranial EEG. The Emory Prime Dataset, collected from 2019 to 2021 at Emory University Hospital, includes 1,255 hours of simultaneous ear EEG and clinical EEG (intracranial (n=9) or scalp (n=11)) from 20 patients. Three ear EEG channels were recorded: channel 1 (right ear canal to left concha cymba), channel 2 (left ear canal to left concha cymba), and channel 3 (right canal to left canal). One hour of awake and one hour of sleep data were randomly selected per patient. Artifacts were visually identified, and signal quality was assessed using the autoreject algorithm. Quality scores for ear EEG and scalp EEG were compared for both awake and sleep segments. Signal comparisons included spectral parameter correlations and coherence analysis between ear EEG, scalp EEG, and intracranial EEG. Results showed better ear EEG quality during sleep than wakefulness, due to reduced movement and muscle activity. During sleep, ear EEG quality surpassed neighboring temporal scalp EEG channels and matched non-temporal scalp channels. Contralateral ear EEG resembled contralateral temporal scalp EEG more than unilateral configurations. Coherence between ear EEG and intracranial EEG was higher for electrodes near the brain surface, especially in delta, theta, and alpha frequencies. In conclusion, contralateral ear EEG during sleep provides quality comparable to scalp EEG and closely mirrors intracranial EEG activity, particularly in lower frequency bands associated with global brain processes.
Autism Spectrum Disorder (ASD) is a lifelong neurodevelopmental disorder characterized by restricted and repetitive sensory-motor behaviors and social communication deficits. Although the Human-Computer Interaction (HCI) and broader computing communities have shown rising interest over the past decade in using a multitude of technologies to support autistic individuals, especially autistic children, across diverse aspects of life, these technologies often reflect the normative expectations of a neurotypical society and are very one-sided. Thus, autistic children are commonly required to learn the 'appropriate' norms of social interaction defined by neurotypical society. In this work, instead of 'forcing' autistic children to accommodate technology, we propose a self-advocacy chat companion for autistic children based on a Large Language Model (LLM) as an early intervention tool. We also hope it can serve as a platform that engages neurotypical users (such as parents) in understanding and learning how to interact and communicate with autistic children.
Internet of Things (IoT) technology and data visualisations have significant potential to support autistic children in their self-awareness, self-regulation, learning, and daily activities. IoT applications collect extensive data on behaviours, activities, and environmental and contextual factors. Traditionally, this data is often visualised to create more or less meaningful and actionable insights for the users. This paper examines the current landscape of data visualisations used for autistic children, identifying a significant gap in understanding how autistic children interact with these data visualisations. By highlighting the limitations of existing tools and guidelines, this work lays the groundwork for future experimental research aimed at developing tailored visualisation strategies and enhancing visualisation literacy in autistic children. Insights from this work advance the knowledge of autistic children's specific needs, ultimately contributing to designing more effective and inclusive visual interfaces to enhance their self-awareness, learning, and interventions.
Autism spectrum disorder (ASD) is a neurodevelopmental condition typically characterized by two core features: restricted, repetitive behaviors and impaired social communication. As such, a large portion of technology studies within the Human-Computer Interaction (HCI) community over the past decade has focused on improving social skills for autistic individuals. Meanwhile, autism is primarily perceived as a medical deficit that requires "correction." Technology often requires autistic children to adjust their behaviors and traits to cater to "normative" modes of interaction deemed appropriate for a neurotypical society. The goal of most technology research for autistic individuals is to satisfy this extrinsic purpose while ignoring that autistic children are the actual primary users of the technology. Many studies have evaluated different approaches to teaching autistic children to comprehend and interpret the emotions of neurotypical individuals, yet few works have taken the opposite approach of inviting neurotypical users to read the emotions of autistic children during social interaction. In this work, we propose an emoji-based interactive app that aims to aid autistic children in expressing their emotions while encouraging neurotypical individuals to comprehend the emotions of autistic children, fostering bi-directional social interaction.
As mental health issues for young adults present a pressing public health concern, daily digital mood monitoring for early detection has become an important prospect. An active research area, digital phenotyping, involves collecting and analysing data from personal digital devices such as smartphones (usage and sensors) and wearables to infer behaviours and mental health. Whilst this data is standardly analysed using statistical and machine learning approaches, the emergence of large language models (LLMs) offers a new approach to make sense of smartphone sensing data. Despite their effectiveness across various domains, LLMs remain relatively unexplored in digital mental health, particularly in integrating mobile sensor data. Our study aims to bridge this gap by employing LLMs to predict affect outcomes based on smartphone sensing data from university students. We demonstrate the efficacy of zero-shot and few-shot embedding LLMs in inferring general wellbeing. Our findings reveal that LLMs can make promising predictions of affect measures using solely smartphone sensing data. This research sheds light on the potential of LLMs for affective state prediction, emphasizing the intricate link between smartphone behavioral patterns and affective states. To our knowledge, this is the first work to leverage LLMs for affective state prediction and digital phenotyping tasks.
Understanding how stress evolves over time is crucial for improving stress detection research. This study examines temporal dependencies in self-reported stress data. We analyzed three self-reported stress datasets to explore how past and present stress levels correlate, using the autocorrelation function (ACF). Our findings quantitatively show that temporal dependencies in stress levels vary across participants, and that the degree of these dependencies differs across datasets collected in different contexts. We provide insights on how to account for temporal dependencies in self-reported stress in stress detection models, taking individual and contextual variations into account.
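For reference, the autocorrelation of an evenly spaced self-reported stress series can be computed as in the sketch below; the lag range and the handling of missing reports are simplifying assumptions. A slowly decaying ACF suggests strong temporal dependence (yesterday's stress is informative about today's), while a rapid drop suggests day-to-day independence.

```python
import numpy as np

def autocorrelation(stress, max_lag=14):
    """ACF of a participant's evenly spaced self-reported stress series."""
    x = np.asarray(stress, dtype=float)
    x = x - x.mean()
    var = np.sum(x ** 2) + 1e-12          # guard against a constant series
    acf = [1.0]                           # lag 0
    for lag in range(1, max_lag + 1):
        acf.append(np.sum(x[lag:] * x[:-lag]) / var)
    return np.asarray(acf)
```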
Mobile sensing and interventions have been a growing resource towards tracking and supporting mental health conditions. Most participants in research studies are willing to share and receive various forms of information with the app/researchers owing to external incentives. As we explore translating such work as a university or organization-level design for mental health apps, it is imperative to understand user preferences and openness to share different active/passive sensors and responses to different notifications. At a personal level mobile health features could provide valuable insights to an interested user. Additionally, quantifying the prevalence of such users at an organizational level can drive decisions on inclusive app design by the organization for its stakeholders. Through a survey-driven approach we explore user preferences and characterize personas of different users to promote the design of a mental health app for students in a large-scale university in the US. We find that while most users are generally open to share certain data, their preferences significantly vary by each sensor and that those who share one modality are very likely to share others.
Mobile sensing plays a crucial role in generating digital traces to understand human daily lives. However, studying behaviours like mood or sleep quality in smartphone users requires carefully designed mobile sensing strategies, such as sensor selection and feature construction. This process is time-consuming and burdensome, and requires expertise in multiple domains. Furthermore, the resulting sensing framework lacks generalizability, making it difficult to apply to different scenarios. In this research, we propose an automated mobile sensing strategy for human behaviour understanding. First, we establish a knowledge base and consolidate rules for data collection and effective feature construction. Then, we introduce a multi-granular human behaviour representation and design procedures for leveraging large language models to generate strategies. Our approach is validated through blind comparative studies and usability evaluation. Ultimately, our approach holds the potential to revolutionise the field of mobile sensing and its applications.
Given the prevalence of missing data in longitudinal passive sensing studies, data imputation, a critical preprocessing step, is often overlooked by researchers in favor of other aspects of data analysis, such as building sophisticated models or outcome prediction. In this paper, we seek to direct the attention of the behavioral and mental health-sensing community toward the importance of data imputation in such studies. We evaluate and benchmark off-the-shelf imputation strategies using the open-source GLOBEM platform and datasets. Our results demonstrate that using appropriate imputation strategies can improve performance by up to a 25% increase in AUROC for predicting participants' future depression labels (self-reported PHQ-4) using past sensing data with the same model building and prediction pipeline as the GLOBEM platform, without compromising the inherent underlying structure of the behavioral sensing data post-imputation. Furthermore, we observe that certain imputation strategies significantly improve the separability of predicted depression probabilities on the test data, compared to no or trivial imputation. Lastly, we present a case study of users with changing depression labels and demonstrate that by using these imputation strategies, we are better able to capture and trace within-person transitions of depression compared to trivial or no imputation.
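As an illustration of benchmarking off-the-shelf imputation strategies, the sketch below compares mean, median, and kNN imputation by downstream AUROC with a plain logistic regression; the GLOBEM pipeline and the depression-label setup are not reproduced here, and the split and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def compare_imputers(X, y):
    """Return AUROC for several imputation strategies on held-out data.

    X: feature matrix with NaNs for missing sensing features; y: binary labels.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    imputers = {
        "mean": SimpleImputer(strategy="mean"),
        "median": SimpleImputer(strategy="median"),
        "knn": KNNImputer(n_neighbors=5),
    }
    results = {}
    for name, imputer in imputers.items():
        model = make_pipeline(imputer, StandardScaler(),
                              LogisticRegression(max_iter=1000))
        model.fit(X_tr, y_tr)
        results[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    return results
```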
This article introduces a study to facilitate the early detection of preclinical Alzheimer's disease using consumer-grade technologies. It uses a smartphone app and a hybrid smartwatch to monitor cognition over time by collecting a comprehensive set of technology-reported, patient-reported, and performance-reported outcomes. This project focuses on continuous and passive data collection, highlighting the potential of ubiquitous devices to monitor cognitive changes in naturalistic settings. This article details the tools, study design, and metrics used to provide a scalable early mental health monitoring solution. Preliminary findings suggest the effectiveness of the data collection tools, demonstrating the promise of ubiquitous computing in advancing mental health research and facilitating early intervention strategies. The study has the potential to significantly impact public health by providing a scalable and non-invasive method for early detection of cognitive impairment, thereby enabling timely interventions and better management of Alzheimer's disease. Additionally, it follows an open data policy, ensuring that the collected data is available to other researchers for further analysis and use, promoting transparency and collaboration in the scientific community. This research underscores the need for innovative approaches to develop reliable, scalable solutions for early detection and monitoring of cognitive changes, contributing significantly to the growing field of ubiquitous technologies for mental health.
This protocol paper outlines a two-phased mixed methods study aimed at exploring the potential of smart rings for early detection of peri-menopausal depression. Peri-menopausal depression impacts approximately 40% of people during this life phase, yet few people are aware of this statistic, including clinicians. This lack of awareness has cascading impacts on seeking treatment. The aim of this research is to assess the utility of smart rings in the early detection of peri-menopausal depression. In Phase 1, participants in the average age range of perimenopause (i.e., age 40 to 50) will wear a smart ring for three months. The ring will continuously monitor participants' sleep patterns and body temperature, providing sensor data. Concurrently, participants will maintain weekly journals. Phase 2 involves interviews with participants to illuminate their experiences with the technology. These interviews will explore participant perceptions and any perceived benefits or limitations of using the smart ring. By integrating sensor data with qualitative participant experiences, this study aims to provide a comprehensive understanding of the feasibility and utility of smart rings as a tool for early detection of peri-menopausal depression. Findings from this research could potentially inform future interventions and healthcare practices aimed at improving mental health outcomes during peri-menopause.
Personal informatics has been widely used to support users' mental health management by identifying factors associated with health indicators from mobile data. However, it remains challenging for users to develop data-driven coping strategies for mental health issues, especially when targeting specific situations involving multiple factors. To address this, we suggest an analysis pipeline for investigating counterfactual scenarios using mobile data. We also show how the pipeline generates counterfactuals from an open dataset, illustrating the feasibility of this approach in providing practical guidelines for each unique situation. Moreover, we discuss several considerations for integrating the proposed pipeline into personal informatics systems.
Despite all the developments in AI and ubiquitous computing, the management of mental health and mental disorders still relies mainly on human assessments, such as daily self-reporting and standard psychological questionnaires. Self-reporting, however, in addition to compliance issues, comes with the drawback of being subjective and thus often inaccurate: in a depressive episode, it is hard to recall the manic phase of the preceding weeks. Thus, mainly experienced and self-aware patients can use self-reports effectively. Even though sensor-based objective algorithms to monitor mental disorders have shown promising results over the last 15 years [3], [1], such systems are not established in psychiatric care. We believe it would greatly help psychiatrists and patients to consider objective sensor-based support systems if the drawbacks of self-reporting and the abilities of sensor-based analysis could be visualized. Our work provides a direct comparison of the performance of sensor-based analysis, self-reporting, and psychological diagnosis. It is based on a real-life dataset collected with psychiatric patients, consisting of smartphone sensor data, corresponding daily self-reporting questionnaires, and a ground truth of standardized psychiatric scale tests. As a highlight of this work, in the evaluation we provide evidence that the deferred self-perception of patients concerning their mental states, as doctors reported to us, can be measured in the sensor data.
Hand-tracking technology is a pivotal input method in augmented and virtual reality environments, providing enhanced interaction accuracy through micro-gesture recognition. This allows users to control devices with minimal knuckle movements, ensuring privacy and accessibility for individuals with mobility impairments. Building on the foundation of human capacitance, this paper introduces a novel approach termed human capacitance-based micro gesture (HCMG) recognition. This system employs capacitive sensors integrated within the inner lining of a wrist guard, capable of detecting subtle changes in skin-to-electrode contact caused by finger joint movements. Our approach leverages the inherent properties of human capacitance to facilitate accurate and efficient micro-gesture recognition. HCMG achieves recognition of five common micro gestures with an accuracy of 95.0%, providing a promising solution to address the limitations of existing techniques.
The pandemic has accelerated the adoption of virtual concerts, bridging the physical and virtual worlds for performers and audiences. However, the emotional connection of live concerts remains irreplaceable. To address this issue, we propose DualStage, a real-time emotional visualization system for synchronizing live and virtual performances. The system will visualize the collected data, encompassing various aspects such as the performers' movements, the audience's interactions, and the tempo, rhythm, and intensity of the music. By integrating these elements, the system aims to create a comprehensive and immersive visual experience that enhances the emotional connection between performers and both onsite and online audiences. We conducted an exploratory study with 22 participants to refine the system and enhance cross-reality engagement. The results showed that most participants recognized the feasibility and effectiveness of the system.
Mixed Reality (MR) technology has demonstrated its effectiveness in enhancing education, particularly in task-based training and the visualization of spatial concepts. This study examines the feasibility of using MR to assist EFL (English as a Foreign Language) children in learning English vocabulary and collocations. By analyzing different teaching methodologies, we identify two suitable strategies for MR-based instruction and develop two prototypes: Spelland and VerbVenture. Spelland uses situated learning to help children explore everyday objects and learn vocabulary through context, practicing pronunciation and spelling to earn letters for building new words and creating virtual objects. VerbVenture employs embodied learning to help children understand verb-noun collocations by interacting with virtual scenarios and performing actions, reinforcing word-object-action connections. We conduct a pilot study with EFL children to evaluate these prototypes, providing insights for further refinement and development of MR-based EFL learning applications.
Time perception has been shown to be influenced by various factors, yet integrating these factors into a cohesive experience remains challenging. Our research explores creating a virtual environment where participants engage in short-term activities while experiencing a lengthened subjective sense of time. In a pilot study, we identified items expected to slow down subjective time perception. Next, in the user study, we created virtual environments built around participants' 'liking' and 'selection' of these items to measure their mental state and time perception. The results showed that participants' 'liking' and 'selection' of items in the VE influenced their pleasure and focus. However, these factors did not significantly alter their perception of time. This finding underscores the intricacy involved in designing time-dilating VEs. Our next step will focus on a more nuanced approach to filtering and integrating environmental elements to manipulate time perception precisely.
Visual art appreciation for the visually impaired is not only a personal need, but also a need for equal social development. Our study investigates how tactile sensations can compensate for visual impairments in art appreciation. We assembled a focus group of experts to translate visual art elements into tactile equivalents by applying theories of art knowledge and appreciation. We deconstructed paintings into five primary elements (composition, content, color, light and shadow, and brushstroke) and two secondary elements (style and emotion). These elements were then converted into tactile paintings for a series of Tactile Aesthetic Workshops aimed at visually impaired participants. The results showed that these tactile representations effectively allowed visually impaired individuals to appreciate art, with a success rate of 78.27%. Moving forward, we plan to enhance this approach by integrating olfactory and auditory elements to create a more immersive augmented reality (AR) art appreciation environment for the visually impaired.
Because people spend a large portion of their day sitting while working, commuting, or relaxing, monitoring their sitting posture is crucial for the development of adaptive interventions that respond to the user's pose, state, and behavior, since posture is closely linked to actions, health, attention, and engagement levels. Existing systems for posture estimation primarily use computer vision-based measurements or body-attached sensors; however, they are plagued by challenges such as privacy concerns, occlusion issues, and user discomfort. To address these drawbacks, this study proposes a posture-inference system that uses high-density piezoresistive sensors for joint reconstruction. Tactile pressure data were collected from six individuals, each performing seven different postures 20 times. The proposed system achieved an average L2 distance of 20.2 cm in joint position reconstruction with a posture classification accuracy of 96.3%. Future research will focus on the development of a system capable of providing real-time feedback to help users maintain a correct sitting posture.
Providing convincing proprioceptive cues is essential for immersive virtual reality (VR) navigation. However, this is challenging for seated users with restricted mobility. To address this gap, this study proposes LegSense, a method designed to induce the walking sensation in VR via electrical muscle stimulation (EMS). This method activates the leg muscle senses in sync with the gait cycle without requiring physical motion to enhance users' immersion. We evaluated the efficacy of LegSense through a user study and confirmed its potential in terms of walking sensation, embodiment, and presence compared to other static conditions (baseline and vibro-tactile). Additionally, participant interviews confirmed that LegSense effectively creates a leg movement illusion, suggesting its potential applications in diverse virtual scenarios to enhance VR experiences for seated users.
Active engagement of children with autism spectrum disorder (ASD) in educational and social activities plays a crucial role in enhancing their cognitive, motor, and social development, offering opportunities to improve learning abilities, physical coordination, and social interactions. Indirect methods that leverage sensors and artificial intelligence (AI) have shown potential for improving engagement prediction, but they have been primarily focused on specific fields, leading to limited generalizability of ASD studies. This gap, due to small ASD sample sizes, presents a significant challenge as the ASD population grows each year, highlighting the need for practical and applicable research solutions, especially for general learning. In this work, we conducted expert interviews to explore potential application areas for AI-infused systems that provide three levels of engagement status for children with ASD, ranging from "not engaged and out of control" to "highly engaged." Interviews with special educators revealed five key application areas for AI-driven engagement recognition: social skills training, stereotyped behavior modification, support for leisure activities, effective tutoring, and independent daily living skills. These findings highlight the potential of adaptive AI interventions in improving educational and daily outcomes, advocating for expanded applications for children with ASD.
Dementia, marked by progressive memory decline, threatens patients' emotional well-being and sense of identity. This study explores the use of reminiscence therapy, which leverages past experiences for cognitive and emotional support, addressing challenges such as therapists' lack of personal memorabilia and privacy issues in group sessions. We propose a language model-based interactive system that collects and structures patient memories into context-aware quizzes using named-entity recognition, question generation, and sentiment analysis. A pilot study with 11 dementia patients aged 60-85 evaluated the system's feasibility and effectiveness. Interface usability challenges were noted, highlighting the need for further development to enhance system usability and validate its effectiveness in larger clinical settings.
This study explores the potential of generative AI technology to support song-signing creation and proposes a design for an AI-based song-signing authoring tool. We identified the creative process, strategies, challenges, and requirements for an AI-assisted tool through focus group interviews with two song-signing creators. The findings suggest that generative AI can be applied to automate and streamline various aspects of the song-signing creation process, including music analysis, lyrics translation, and choreography generation. Based on these insights, we propose a design of an AI-based song-signing authoring tool incorporating large language models for sign language translation and recommendation, generative dance AI for sign language visualization, and customizable sign language avatars. The development of such tools has the potential to expand the participation of d/Deaf individuals in cultural and artistic activities and contribute to an inclusive cultural and creative ecosystem.
Self-annotation is important in the affective computing field but has the limitation of relying heavily on human cognition. To address this, we propose a method named PREFerence-based self-Annotation on a low-Budget (PREFAB). This paper shares our progress, focusing on predicting player arousal levels from game trajectories using a RankNet-based preference learning approach. Experimental results demonstrate that our method significantly outperforms existing benchmarks in terms of training performance. In terms of practicality, it captures the tendency of changes in players' arousal better than the traditional approach.
Teleoperation, the remote manual control of robots, is primarily used in high-precision and safety-critical environments such as surgery, space exploration, and deep-sea exploration. Despite being a widely utilized technology, teleoperation relies on human cognitive abilities, leading to significant cognitive load for operators. To address this challenge, we propose a concept of a VR teleoperation haptic system that combines biomechanical simulation and electrical muscle stimulation to provide force feedback in a lightweight, wearable form by mimicking natural force generation without the need for external actuators. Our system is divided into two main components: the physical simulation part, which calculates the joint torques to replicate forces from the manipulator, and the electrical stimulation part, which translates torques into muscle stimulations. Through this integration, we expect our system to bridge the gulf of execution and evaluation, reducing cognitive load and enhancing teleoperation performance. This paper aims to discuss the detailed framework of our system and potential future research directions.
The rise of autonomous vehicles (AVs) has promoted the adoption of in-vehicle virtual reality (VR) for creating immersive experiences. However, these experiences can trigger motion sickness (MS) due to visual-vestibular mismatches. Traditional techniques, such as visual matching and scene manipulation, address MS but often neglect the impact of body posture changes. This study examines the effects of interactive VR tasks on passenger body posture during MS-inducing events, including turns and vertical displacements. Our findings reveal significant variations in user body posture under conditions with event-based interactive VR tasks, resulting in a reduction of MS symptoms. Specifically, participants engaged in interactive VR tasks showed improved posture alignment and body stability. These insights offer practical guidelines for developing adaptive VR content that proactively manages posture to alleviate MS, thereby enhancing passenger comfort in in-vehicle VR applications.
Advanced wearable digital assistants can significantly enhance task performance, reduce user burden, and provide personalized guidance to improve users' abilities. However, developing these assistants presents several challenges. To address this, we introduce TOM (The Other Me), a conceptual architecture and open-source software platform (https://github.com/TOM-Platform) that supports the development of wearable intelligent assistants that are contextually aware of both the user and the environment. Collaboratively developed with researchers and developers, TOM meets their diverse requirements. TOM facilitates the creation of intelligent assistive AR applications for daily activities, supports the recording and analysis of user interactions, and provides assistance for various activities, as demonstrated in our preliminary evaluations.
Head-worn displays (HWDs), like the Vuzix Z100 and North's Focals, are designed to be worn all day as smart eyeglasses while performing everyday tasks. These products aim to display information in our field of view (FOV), enabling information monitoring during daily activities. We conducted a study using the Quest 3, a high-resolution color video pass-through virtual reality (VR) headset, to emulate everyday augmented reality glasses. Users walked on a predefined track while reading a message to explore the optimal display positioning for HWDs while walking. The use of HWDs was found to decrease walking speed, with display positions closer to the nose (between -24° and 0°) yielding better performance. Our results and observations indicate that closer-to-nose positioning reduces cognitive load and excessive head or eye movements, enhancing overall dual-task performance.
Systems for recording nursing care activities using smartphones are becoming widespread in the nursing care field. As a result, it is becoming possible to predict future activities of care recipients based on the caregiving record history stored in the system. Prediction of future activities will enable the development of various caregiving applications, such as preparation support for future caregiving activities, detection of missing entries in caregiving records, and sensor-based real-time activity recognition using the predictions as prior information. However, caregivers are inherently busy and cannot record entries in real time due to other tasks, resulting in many errors in the time stamps of the entries, i.e., shifts in the recorded occurrence time of an activity. When data with many such timing errors are used as training data, the performance of activity prediction methods deteriorates. In this paper, we propose an activity prediction method that is robust against these time errors. The proposed model has a module that corrects time errors and estimates whether or not a certain activity will occur between the current time and one hour later, using the activity records of the past T hours. We evaluated the effectiveness of the proposed method using three datasets.
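For readers unfamiliar with this task framing, the sketch below shows one way the input and label could be constructed: binned activity counts over the past T hours as features, and a binary label for whether the target activity is recorded in the next hour. The jitter term is only a crude stand-in for the time-stamp errors the paper addresses; the authors' correction module is not reproduced here, and all names are hypothetical.

```python
# Illustrative framing of the next-hour activity prediction task (not the authors' model).
import numpy as np

def build_features(events, now, T=6, bin_minutes=30, max_jitter_min=0, rng=None):
    """events: list of (timestamp_minutes, activity_id). Returns binned counts for the past T hours."""
    rng = rng if rng is not None else np.random.default_rng()
    n_bins = T * 60 // bin_minutes
    feats = np.zeros(n_bins)
    for t, _act in events:
        if max_jitter_min:                      # simulate noisy recording time stamps
            t = t + rng.uniform(-max_jitter_min, max_jitter_min)
        offset = now - t
        if 0 <= offset < T * 60:
            feats[int(offset // bin_minutes)] += 1
    return feats

def label_next_hour(events, now, target_activity):
    """1 if the target activity is recorded within (now, now + 60 min]."""
    return int(any(now < t <= now + 60 and a == target_activity for t, a in events))
```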
The aim of this study is to prepare children and adults for real magnetic resonance imaging (MRI) procedures using a specialized training application. The software, designed to track movements and reproduce the characteristic sounds of the procedure, uses the Farneback optical flow algorithm to precisely detect even the slightest movements of participants. This allows for an assessment of movements during an MRI training session in order to improve the patient's performance in real-life scanning. Our study demonstrates that simulator training improved participants' understanding of the procedure. Following the training, 40% of participants showed a decrease in the number of detected movements compared to the baseline test, and 33% of participants did not exhibit any movements in the subsequent test. These results demonstrate that the suggested training can effectively prepare patients for the MRI procedure.
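A minimal sketch of Farneback-based movement detection with OpenCV is shown below. The camera index, frame budget, and sensitivity threshold are assumptions for illustration; the training application's actual parameters and event logic are not described in the abstract.

```python
# Movement detection with dense Farneback optical flow (OpenCV), on a webcam feed.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                      # camera observing the participant (assumed index)
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
movement_events = 0

for _ in range(300):                           # process a fixed number of frames for the demo
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)   # per-pixel motion magnitude
    if magnitude.mean() > 0.2:                 # illustrative sensitivity threshold
        movement_events += 1
    prev_gray = gray

cap.release()
print("detected movement events:", movement_events)
```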
As AI is increasingly embedded in human decision-making pipelines, it is necessary to explore the influence of explainable AI (XAI) on both appropriate and inappropriate patterns of human-AI reliance. While various models have been proposed, the impact of explanation style on reliance in the context of these models is under-explored. In particular, there is a lack of research on how different styles of explanations, such as those presented by feature-based and example-based XAI methods, influence how humans interact and engage with AI-enabled systems through the lens of specific patterns of reliance. We summarise the current literature in this space and conclude by proposing a study and analysis plan that can be used to compare the individual and combined influences of feature-based and example-based explanation styles.
We propose a study design for validating explainable AI (XAI) tools in longitudinal contexts, focusing on ubiquitous and wearable computing. We aim to address unique challenges such as evolving user behavior and the temporal dependencies inherent in longitudinal data. The study design includes both general and longitudinal explanation validation, assessing five XAI constructs (trust, satisfaction, actionability, understanding, and reliance) through end-user studies. We demonstrate the proposed study design through a working example utilizing two XAI techniques, SHAP-based feature importance and Bayesian counterfactual explanations, on the LAUREATE dataset, which includes machine learning models for estimating the weekly affective states of students based on physiological and behavioral data. To the best of our knowledge, this is the first study design that aims to investigate the longitudinal aspects of XAI, by proposing to evaluate the impact of the longitudinal order in which instances and their corresponding explanations are presented to end-users.
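For the SHAP-based component, per-instance feature attributions can be produced with the shap library as sketched below. The model, features, and data are stand-ins rather than the LAUREATE models, and the Bayesian counterfactual component is omitted.

```python
# SHAP feature-importance sketch on stand-in data (not the LAUREATE models).
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                    # e.g. weekly physiological/behavioural features
y = 0.6 * X[:, 0] - 0.3 * X[:, 2] + rng.normal(scale=0.1, size=200)   # proxy affect score

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])       # per-feature attributions for 5 weekly instances
print(shap_values.shape)                         # (5, 4): one attribution per feature per instance
```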
Despite significant progress in model development, evaluating eXplainable Artificial Intelligence (XAI) remains elusive and challenging in Alzheimer's Disease (AD) detection using modalities from low-cost or wearable devices. This paper introduces a fine-grained validation framework named 'FairAD-XAI', which provides a comprehensive assessment through twelve properties of explanations, forming a detailed Likert questionnaire. This framework ensures a thorough evaluation of XAI methods, capturing their fairness aspects and supporting the improvement of how humans assess the reliability and transparency of these methods. Moreover, fairness in XAI evaluation is critical, as users from diverse demographic backgrounds may have different perspectives on and perceptions of the system. These variations can lead to biases in human-grounded evaluations and, subsequently, biased decisions from the AI system when deployed. To mitigate this risk, we incorporated two fairness metrics tailored to assess and ensure fairness in XAI evaluations, promoting more equitable outcomes. In summary, the proposed 'FairAD-XAI' framework provides a comprehensive tool for evaluating XAI methods and assessing the essential aspect of fairness. This makes it a multifaceted tool for developing unbiased XAI methods for AI-based AD detection tools, ensuring these technologies are both effective and equitable.
Back-support exoskeletons (BSEs) show promise in preventing occupational low back pain (OLBP). Conventional BSEs are controlled based on lifting speed, which can lead to counter-intuitive interpretations of the user's demand and makes it difficult for users to interpret and anticipate the system's output. To address this, this paper develops a non-collocated, easy-to-equip wearable system that enhances BSEs' capability to perceive operational loads. We designed a Convolution-inspired Trunk Event Recognition (CTER) algorithm for detecting trunk lifting and bending motions. Moreover, conceptualizing human motion signals as audio sequences, we propose two deep learning classification models leveraging speech-processing techniques. The system detects trunk lifting and bending motions and features onboard estimation of the payload, categorized into 0, 5, 10, and 15 kg groups.
Recognizing daily activities with unobtrusive sensors in smart environments enables various healthcare applications. Monitoring how subjects perform activities at home and how this changes over time can reveal early symptoms of health issues, such as cognitive decline. Most approaches in this field use deep learning models, which are often seen as black boxes mapping sensor data to activities. However, non-expert users like clinicians need to trust and understand these models' outputs. Thus, eXplainable AI (XAI) methods for Human Activity Recognition have emerged to provide intuitive natural language explanations from these models. Different XAI methods generate different explanations, and their effectiveness is typically evaluated through user surveys, which are often challenging in terms of cost and fairness. This paper proposes an automatic evaluation method using Large Language Models (LLMs) to identify, in a pool of candidates, the best XAI approach for non-expert users. Our preliminary results suggest that LLM evaluation aligns with user surveys.
Time-series classification is growing in importance as device proliferation has led to the collection of an abundance of sensor data. Although black-box models, whose internal workings are difficult to understand, are a common choice for this task, their use in safety-critical domains has raised calls for greater transparency. In response, researchers have begun employing explainable artificial intelligence together with physio-behavioural signals in the context of real-world problems. Hence, this paper examines the current literature in this area and contributes principles for future research to overcome the limitations of the reviewed works.
When we talk about explainable AI and try to move beyond the black box and beyond algorithmic transparency, we overlook a large user community. Although XAI research has advanced over time, there has been little study of the development, evaluation, and application of explainability methodologies in the Global South. In this paper, we focus on Bangladesh, a part of the Global South, to understand the AI user community of this region and to show how explainability needs differ across users and whom XAI should focus on. Our work reflects on the unique needs and constraints of the region and recommends potential directions for accessible and human-centered explainability research. We argue that before developing technology and systems, human requirements should be assessed and comprehended.
Smartphone data-driven approaches to mental health detection are widely used. However, existing methods focus primarily on enhancing model performance through numerous superficial smartphone usage features that do not effectively capture human behaviors, rather than on the model's explanatory power. This study proposes a methodology of smartphone-based human behavior task modeling to advance "Explainable AI (X-AI)" for detecting mental health states. Our approach utilizes in-app UI-level trajectory data to enable more fine-grained human behavior task mining and modeling. We introduce novel concepts to investigate the nature and scope of smartphone-based human behavior tasks. This study discusses how these concepts can enhance the trustworthiness and transparency of models for both users and system providers. Furthermore, we expect that our approach could reduce data collection, thereby reducing memory usage and mitigating privacy concerns by enabling on-device learning.
We investigate domain adaptation for Human Activity Recognition (HAR), where a model trained on one dataset (source) is applied to another dataset (target) with different characteristics. Specifically, we focus on evaluating the performance of SelfHAR, a recently introduced semi-supervised learning framework rooted in self-training. Unlike typical semi-supervised approaches that leverage unlabeled data to enhance model performance on a labeled dataset, our investigation centers on evaluating the performance gain on the unlabeled target data. Our findings indicate that the SelfHAR algorithm can achieve performance levels nearly equivalent to supervised learning, achieving an F1 score of approximately 0.8 across datasets from different environments, even without labels for the target dataset. Furthermore, our approach consistently enhances performance compared to models trained solely on the source dataset, demonstrating its efficacy in adapting HAR models to diverse environmental conditions.
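The teacher-student self-training stage that SelfHAR builds on can be sketched as below (the full SelfHAR pipeline also includes a signal-transformation self-supervision step, which is omitted). The classifier, confidence threshold, and data arrays are illustrative stand-ins.

```python
# Sketch of confidence-filtered self-training for domain adaptation (teacher -> student).
import numpy as np
from sklearn.neural_network import MLPClassifier

def self_train(X_src, y_src, X_tgt_unlabeled, confidence=0.9):
    teacher = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)
    teacher.fit(X_src, y_src)                       # supervised training on the labeled source dataset

    proba = teacher.predict_proba(X_tgt_unlabeled)  # pseudo-label the unlabeled target data
    keep = proba.max(axis=1) >= confidence          # keep only confident pseudo-labels
    X_pseudo, y_pseudo = X_tgt_unlabeled[keep], proba[keep].argmax(axis=1)

    student = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)
    student.fit(np.vstack([X_src, X_pseudo]),
                np.concatenate([y_src, y_pseudo]))  # retrain on source + pseudo-labeled target
    return student
```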
Step counting has been widely implemented in wrist-worn devices and is accepted by end users as a quantitative indicator of everyday exercise. However, existing counting approaches (mostly on wrist-worn setups) lack robustness and thus introduce inaccuracies in certain scenarios, such as brief intermittent walking bouts and random arm motions or a static arm while walking (where there is no clear correlation between arm and leg motion patterns). This paper proposes a low-power step-counting solution utilizing the body-area electric field acquired by a novel electrostatic sensing unit, consuming only 87.3 µW of power, aiming to strengthen the robustness of the current dominant solution. We designed two wearable devices for on-the-wrist and in-the-ear deployment and collected body-area electric field-derived motion signals from ten volunteers. Four walking scenarios are considered: in the parking lot or shopping center, with or without pushing a shopping trolley. The step-counting accuracy of the prototypes is better than that of commercial wrist-worn devices (e.g., 96% for the wrist- and ear-worn prototypes vs. 66% for the Fitbit when walking in the shopping center while pushing a shopping trolley). We finally discuss the potential and limitations of sensing body-area electric fields for wrist-worn and ear-worn step counting and beyond.
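A generic step counter over a one-dimensional motion signal is sketched below: band-pass filtering around typical walking cadence followed by peak detection. The filter band, thresholds, and refractory period are assumptions for illustration, not the on-device algorithm used with the electrostatic sensing unit.

```python
# Illustrative step counter over a 1-D body-area electric-field signal.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def count_steps(signal, fs):
    """signal: 1-D array of electric-field samples; fs: sampling rate in Hz."""
    b, a = butter(2, [0.5, 3.0], btype="band", fs=fs)   # typical walking cadence band (assumed)
    filtered = filtfilt(b, a, np.asarray(signal, dtype=float))
    peaks, _ = find_peaks(filtered,
                          height=filtered.std(),         # adaptive amplitude threshold
                          distance=int(0.3 * fs))        # refractory period of ~0.3 s per step
    return len(peaks)
```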
M-MOVE-IT is an open-source framework that simplifies data acquisition, annotation, and AI training for wearable technology. It addresses the challenges of synchronizing video and IMU data, making it easier to develop AI models for healthcare, sports, wildlife monitoring, anti-poaching, and livestock management applications. The framework automates and streamlines managing sensors, subjects, and deployments, synchronizing data, and annotating activities. M-MOVE-IT uses the real-time clocks of sensors and an offset annotation step to achieve precise synchronization, automatically parses sensor metadata, and generates annotation tasks. The export module provides data in JSON format for easy use in AI training. M-MOVE-IT's design supports active learning and human-in-the-loop development, enhancing the efficiency and scalability of wearable technology research.
While traditional earphones primarily offer private audio spaces, so-called "earables" have emerged to offer a variety of sensing capabilities. Pioneering platforms like OpenEarable have introduced novel sensing platforms targeted at the ears, incorporating various sensors. The proximity of the ears to the eyes, brain, and facial muscles has also sparked investigation into sensing biopotentials. However, no platform targeted at the ears is currently available for sensing biopotentials. To address this gap, we introduce OpenEarable ExG, an open-source hardware platform designed to measure biopotentials in and around the ears. OpenEarable ExG can be freely configured and has up to 7 sensing channels. We initially validate OpenEarable ExG in a study with 3 participants using a left-right in-ear dual-electrode montage. Our results demonstrate the successful detection of smooth pursuit eye movements via Electrooculography (EOG), alpha brain activity via Electroencephalography (EEG), and jaw clenching via Electromyography (EMG). OpenEarable ExG is part of the OpenEarable initiative and is fully open source under the MIT license.
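One standard way to check the reported alpha-activity result in such a recording is to compare relative 8-12 Hz power between eyes-closed and eyes-open segments, as sketched below. The filter settings and sampling rate are generic assumptions, not OpenEarable ExG firmware parameters.

```python
# Relative alpha-band power of a single-channel ear-EEG recording (Welch PSD).
import numpy as np
from scipy.signal import welch

def alpha_power(eeg, fs):
    """Relative 8-12 Hz power of a 1-D recording sampled at fs Hz."""
    freqs, psd = welch(np.asarray(eeg, dtype=float), fs=fs, nperseg=int(fs * 2))
    alpha = psd[(freqs >= 8) & (freqs <= 12)].sum()
    total = psd[(freqs >= 1) & (freqs <= 40)].sum()
    return alpha / total

# e.g. compare alpha_power(eyes_closed_segment, 250) against alpha_power(eyes_open_segment, 250)
```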
Earables, wearable devices worn around the ear, offer new possibilities for sports applications requiring precise head movement analysis, such as boxing. However, boxing-specific gesture recognition using IMU sensors integrated into earables remains underexplored. This work addresses this gap by investigating the potential of the open-source OpenEarable platform for real-time recognition of defensive boxing manoeuvres, including slipping, rolling and pulling back. We propose an extension to OpenEarable, integrating a Python server that leverages machine learning and dynamic time warping for gesture recognition. Furthermore, the web dashboard is enhanced to enable server communication and implement a gesture mirroring feature, providing real-time visual feedback. Real-time testing achieved a high accuracy of 96%, with feedback delivered within one second. All the system components are made available in a GitHub repository.
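A nearest-template classifier with dynamic time warping is one straightforward way to realise the recognition described above; the sketch below is a generic formulation rather than the extension's actual server code, and the template labels are illustrative.

```python
# Nearest-template gesture classification with dynamic time warping (DTW) over IMU sequences.
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW between two (T, D) IMU sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def classify(sample, templates):
    """templates: dict mapping a label (e.g. 'slip', 'roll', 'pull') to a reference sequence."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```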
The traditional process of EEG-based BCI data collection is cumbersome and expertise-dependent due to complex hardware instrumentation, but also because data collection software typically comes as an on-premise solution with features tailored to trained experimenters. To overcome these limitations, we herein present a customizable platform for EEG data collection using OpenBCI hardware that simplifies and accelerates the EEG setup process. The proposed system integrates various adaptations of the OpenBCI recording hardware, enabling easy customizability for versatile electrode configurations. The frontend, developed with Vue 3 and D3.js, provides real-time impedance checks and data visualization, while the backend, utilizing Apache, Node.js, and Flask, ensures efficient data transmission and storage. This platform addresses critical needs for cross-platform compatibility, customizability, efficient session management, and user-friendly interfaces, making it ideal for large-scale and field studies. The automated data management and real-time feedback enhance data quality and reliability, supporting various research applications.
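To make the backend role concrete, a minimal Flask ingestion endpoint of the kind such a system implies is sketched below. The route name, payload schema, and in-memory store are hypothetical and stand in for the platform's actual API and persistent storage.

```python
# Minimal, hypothetical Flask endpoint for ingesting batches of EEG samples per session.
from flask import Flask, request, jsonify

app = Flask(__name__)
SESSIONS = {}                                  # in-memory store; a real system persists to disk/DB

@app.post("/api/sessions/<session_id>/samples")
def ingest(session_id):
    batch = request.get_json()                 # e.g. {"t": [...], "channels": [[...], ...]}
    SESSIONS.setdefault(session_id, []).append(batch)
    return jsonify(status="ok", batches=len(SESSIONS[session_id]))

if __name__ == "__main__":
    app.run(port=5000)
```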
In recent years, earables have emerged as a focal point for exploring novel sensing principles on the ears. OpenEarable, an open reference platform, has been developed to integrate various sensors in and around the ears. However, it has thus far lacked the capability to capture audio both inside the ear canal and outside simultaneously. To address this gap, we introduce a novel earpiece for OpenEarable equipped with two microphones, designed to mimic commercial earbuds like AirPods, which commonly restrict access to internal microphones. This new earpiece utilizes the interchangeable 8-pin connector earpiece concept of OpenEarable, enabling the exploration of algorithms that leverage both internal and external sounds captured at the ears (e.g., noise canceling). By updating the existing app and dashboard, OpenEarable 1.4 seamlessly integrates with the existing suite of devices within the OpenEarable ecosystem. We validate the earpiece by showcasing its ability to occlude the ear canal and to measure the heartbeat of a wearer. The enhancement of OpenEarable facilitates streamlined and large-scale audio data collection, advancing the development of novel earable technologies.
Data collection is a core principle in the scientific and medical environment. To record study participants in daily-life situations, wearables can be used. These should be small enough not to disrupt the lifestyle of the participants, while delivering sensor data in an accurate and efficient way; this ensures a long recording time for these battery-powered devices. Currently purchasable wearable devices would lend themselves well to wearable studies, but they have drawbacks. Simpler devices offer a low sampling rate (for energy efficiency) and little support, while more advanced devices sample sensor data at high frequency but come at a higher price and with a limited support time. Our work introduces an open-source app for cost-effective, high-frequency, and long-term recording of sensor data. We based the development on the Bangle.js 2, a prevalent open-source smartwatch. The code has been optimised for efficiency, using sensor-specific properties to store sensor data in a compressed, lossless, and time-stamped form on the local NAND storage. Our experiments show that we can record PPG data at 50 Hz for at least half a day; with other configurations we can record multiple sensors at a high update rate for a full day.
This paper presents the Synchronisation Wand, an open-source hardware solution which can be used to synchronise multiple inertial measurement unit (IMU) sensors using their onboard magnetometers. The wand is based on commercial off-the-shelf components with standard 4-layer PCB manufacturing and self-service assembly. The case can be 3D printed using any Fused Filament Fabrication (FFF) based printer. The system combines an ESP32-S3 micro-controller unit with an electromagnetic (EM) generator to create an encoded EM event which can be used to synchronise multiple IMUs. The hardware includes an onboard IMU allowing the user to track the motion of the wand as well as perform kinetic synchronisation events. The device also includes an OLED display and 4 configurable tactile switches to enhance the usability of the system. We demonstrate the device's capability to generate an encoded EM pulse which can be used to reduce the maximum synchronisation error between two IMUs down to 10 ms.
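The core alignment idea can be sketched as follows: locate the encoded EM event in each IMU's magnetometer stream via cross-correlation with the known code, then align the streams by the difference in detected positions. The code pattern and signal pre-processing here are assumptions, not the wand's actual encoding or the authors' detection pipeline.

```python
# Sketch of offset estimation between two magnetometer streams using a known encoded pulse.
import numpy as np

def locate_code(mag, code):
    """Index at which the encoded pulse best matches a 1-D magnetometer magnitude stream."""
    mag = np.asarray(mag, dtype=float)
    mag = mag - mag.mean()
    corr = np.correlate(mag, np.asarray(code, dtype=float), mode="valid")
    return int(np.argmax(corr))

def offset_samples(mag_a, mag_b, code):
    """Sample offset to add to stream B so the EM event coincides with its position in stream A."""
    return locate_code(mag_a, code) - locate_code(mag_b, code)
```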
Colorimetric biosensors offer significant potential for real-time health monitoring through wearable technologies. However, the design and implementation of these sensors require careful consideration of various wearability factors that are commonly overlooked in more scientific approaches. This paper proposes a list of wearability factors for colorimetric biosensors. We compare these factors with those of traditional skin interfaces and wearable technologies to highlight unique considerations for colorimetric biosensors. We discuss these factors in the context of two distinct colorimetric biosensor interfaces: an on-body lipstick and a permanent tattoo concept.
Wearable and mobile technologies hold the promise of being used 'on the move' as they are portable. It is common to listen to music during a run, track steps during the day, and have phone calls on the go. More advanced systems support, for example, athletes in increasing their performance, enhancing their technique, and discussing sports data with a coach. To facilitate these experiences, it is key to provide such often multimodal feedback in a way that is understandable, timely, and does not interfere with the sports activity. In this workshop, we will discuss multimodal feedback for sports so that more people can profit from wearable technologies. We will use prior work on truly mobile interaction as a framework to analyze existing works and will explore various modalities hands-on to experience the pros and cons of modalities in motion.
The recognition of complex and subtle human behaviors from wearable sensors will enable next-generation human-oriented computing in scenarios of high societal value (e.g., dementia care). This will require a large-scale human activity corpus and much-improved methods to recognize activities and the context in which they occur. This workshop deals with the challenges of designing reproducible experimental setups, running large-scale dataset collection campaigns, designing activity and context recognition methods that are robust and adaptive, and evaluating systems in the real world. We wish to reflect on future methods, such as lifelong learning approaches that allow open-ended activity recognition. This year HASCA will welcome papers from participants in the Fifth Sussex-Huawei Locomotion and Transportation Recognition Challenge and the First WEAR Dataset Challenge in special sessions.
Modern smartglasses, such as Apple's Vision Pro and Meta's Quest series, feature high-end displays and rich multi-modal sensing. They are on the brink of redefining the concept of wearable intelligent assistants. However, despite this unfolding technological revolution, there is much we do not know about how to design for such devices and, ultimately, how they will be used. Heads-Up Computing is a vision for wearable displays that focuses on how they can enrich human interaction without being obtrusive, how they can be designed to be both ergonomic and intuitive, and how the interactions that drive them can be aware of, and respond to, a user's context and resources. Achieving this vision of seamless integration presents numerous challenges in areas as diverse as providing non-intrusive support, adapting to diverse contexts, enabling expressive input, and safeguarding privacy. This workshop aims to convene domain experts to deliberate on these nascent issues and set a collaborative research agenda that prioritizes user-centered design in the ongoing development of Heads-Up Computing and seeks to impact the next generation of wearable intelligent assistants.
Mental health and well-being influence overall health: suffering from a mental illness can create severe impairment and reduce quality of life. Ubiquitous computing technologies are beginning to play a central role in collecting clinically relevant behavioral and physiological information on mental health that can be used to detect symptoms early on, deliver preventative interventions, and manage symptoms throughout the course of illness. Despite this potential, designing and translating ubiquitous technologies into mental healthcare is a complex process, and existing technologies have faced numerous challenges to effective implementation. The goal of this workshop is to bring together researchers, practitioners, and industry professionals to identify, articulate, and address the challenges of designing and implementing ubiquitous computing technologies in mental healthcare. Given these challenges, we are adding a specific call for papers that inspire new research directions, with initial findings that are valuable to the community but are not yet fully publishable or finished contributions. Following the success of this workshop over the last eight years, we aim to continue facilitating the UbiComp community in the conceptualization, translation, and implementation of novel mental health sensing and intervention technologies.
Open hardware such as the Arduino is an accelerator for research in ubiquitous and wearable computing. In recent years, an increasing number of open-source wearable devices have emerged. In this workshop, we seek to create a dedicated forum and publication venue for topics around open wearable computers. The whole-day workshop includes a keynote speech, paper presentations, demo sessions, group discussions, and networking opportunities. Through our activities, we hope to create a future where wearable technologies are accessible, interoperable, and impactful across applications and industries.
The objective of the 5th ACM International Workshop on Earable Computing (EarComp 2024) is to provide an academic forum and bring together researchers, practitioners, and design experts to discuss how sensory earables technologies have and can complement human sensing research. It also aims to provide a launchpad for bold and visionary ideas and serve as a catalyst for advancements in this emerging new Earable Computing research space.
With the advancement of pervasive technology, information interaction has become increasingly ubiquitous. Across these diverse information access devices and interfaces, it is crucial to understand and improve the user experience during human-information interaction. In recent years, we have seen a rapid uptake of physiological sensors used to estimate the cognitive aspects of the interaction. However, several challenges remain from a ubiquitous computing perspective, such as discrepancies in the definitions of cognitive activities (e.g., cognitive bias or information need) and the lack of standard practice for collecting and processing physiological data in information interaction. In this workshop, we bring together researchers from different disciplines to form a common understanding of cognitive activities, discuss best practices to quantify the cognitive aspects of human-information interaction, and reflect on potential applications and ethical issues arising from physiological sensing methods.
With the advancements in ubiquitous computing, ubicomp technology has deeply spread into our daily lives, including office work, home and housekeeping, health management, transportation, and even urban living environments. Furthermore, beyond the initial metrics of computing, such as "efficiency" and "productivity", the well-being benefits that people (users) gain from such ubiquitous technology have received great attention in recent years. In our seventh "WellComp" (Computing for Well-being) workshop, we intensively discuss the contribution of ubiquitous computing towards users' well-being, covering physical, mental, and social wellness (and their combinations), from the viewpoints of different layers of computing. After the big success of the six previous workshops, with strong international organization members in various ubicomp research domains, WellComp 2024 will bring together researchers and practitioners from academia and industry to explore versatile topics related to well-being and ubiquitous computing.
As ubiquitous computers permeate our daily lives, we increasingly encounter AI-infused physical systems, where physical entities are augmented with sensing, understanding, and actuating capabilities. While they have the potential to impact our physical reality, their proactive nature also poses unique challenges for users. Interpreting the reasoning behind the systems' behavior can be difficult due to the interplay between computations and physical exertion. Ensuring inclusive interactions is also a key issue, as the interactions and interfaces may not inherently accommodate users with diverse abilities, backgrounds, or preferences. Achieving immersive interactions that blend digital and physical contexts requires understanding human behavior in dynamic environments. To address these interconnected challenges, we propose a workshop for interdisciplinary collaboration among researchers and practitioners from ubiquitous computing, HCI, AI, robotics, design, and accessibility domains. This workshop explores novel approaches, methods, design guides, and case studies around these issues.
Rapid technological advancements are expanding the scope of virtual reality and augmented reality (VR/AR) applications; however, users must contend with a lack of sensory feedback and limitations on input modalities by which to interact with their environment. Gaining an intuitive understanding of any VR/AR application requires the complete immersion of users in the virtual environment, which can only be achieved through the adoption of realistic sensory feedback mechanisms. This workshop brings together researchers in UbiComp and VR/AR to investigate alternative input modalities and sensory feedback systems with the aim of developing coherent and engaging VR/AR experiences mirroring real-world interactions.
The workshop XAI for U aims to address the critical need for transparency in Artificial Intelligence (AI) systems that integrate into our daily lives through mobile systems, wearables, and smart environments. Despite advances in AI, many of these systems remain opaque, making it difficult for users, developers, and stakeholders to verify their reliability and correctness. This workshop addresses the pressing need for enabling Explainable AI (XAI) tools within Ubiquitous and Wearable Computing and highlights the unique challenges that come with it, such as XAI that deals with time-series and multimodal data, XAI that explains interconnected machine learning (ML) components, and XAI that provides user-centered explanations. The workshop aims to foster collaboration among researchers in related domains, share recent advancements, address open challenges, and propose future research directions to improve the applicability and development of XAI in Ubiquitous, Pervasive, and Wearable Computing, and with that seeks to enhance user trust, understanding, interaction, and adoption, ensuring that AI-driven solutions are not only more explainable but also more aligned with ethical standards and user expectations.
How can we ensure that Ubiquitous Computing (UbiComp) research outcomes are ethical, fair, and robust? While fairness in machine learning (ML) has gained traction in recent years, it remains unexplored, or sometimes an afterthought, in the context of pervasive and ubiquitous computing. This workshop aims to discuss fairness in UbiComp research and its social, technical, and legal implications. From a social perspective, we will examine the relationship between fairness and UbiComp research and identify pathways to ensure that ubiquitous technologies do not cause harm or infringe on individual rights. From a technical perspective, we will initiate a discussion on model generalization and robustness, as well as data processing methods to develop bias mitigation approaches tailored to UbiComp research. From a legal perspective, we will examine how new policies shape our community's work and future research. Building on the success of the First FairComp Workshop at UbiComp 2023, we have established a vibrant community centered around the topic of fair, robust, and trustworthy algorithms within UbiComp, while also charting a clear path for future research endeavors in this field.
The advent of generative artificial intelligence technologies, such as Large Language Models (LLMs) and Large Vision Models (LVMs), has shown promising results in both academic and industrial sectors, leading to widespread adoption. However, there has been limited focus on applying these technologies to assist children with special needs like Autism Spectrum Disorder (ASD). Meanwhile, conventional personalized training with interactive design for children with special needs continues to face significant challenges with traditional approaches. This workshop aims to provide a platform for researchers, software developers, medical practitioners, and designers to discuss and evaluate the benefits and drawbacks of using LLMs and the Internet of Things (IoT) for the diagnosis and personalized training of autistic children. Through a series of activities, including oral presentations, demonstrations, and panel discussions, this half-day workshop seeks to foster a network of experts dedicated to improving the lives of children with special needs and to inspire further research on leveraging emerging ubiquitous technologies for these underprivileged users, their caregivers and special education teachers.
Feature extraction remains the core challenge in Human Activity Recognition (HAR), the automated inference of the activities being performed from sensor data. Over the past few years, the community has witnessed a shift from manual feature engineering using statistical metrics and distribution-based representations to feature learning via neural networks. In particular, self-supervised learning methods that leverage large-scale unlabeled data to train powerful feature extractors have gained significant traction, with various works demonstrating their effectiveness. Recently, the advent of Large Language Models (LLMs) and multi-modal foundation models has unveiled a promising direction by leveraging well-understood data modalities. This tutorial covers existing representation learning works, from single-sensor approaches to cross-device and cross-modality pipelines. Furthermore, we will provide an overview of recent developments in multi-modal foundation models, which originated in language and vision learning but have recently started incorporating inertial measurement unit (IMU) and time-series data. This tutorial will offer an important forum for researchers in the mobile sensing community to discuss future research directions in representation learning for HAR and, in particular, to identify potential avenues to incorporate the latest advancements in multi-modal foundation models, aiming to finally solve the long-standing activity recognition problem.