Simple solutions are the best ones
TalkBetter is made possible by SocioPhone, software that Professor Lee and colleagues developed to monitor face-to-face interactions. Among other behaviours, the platform tracks what is called turn-taking, which can be tricky to capture.
Following conversational turns in real time is really difficult, Professor Lee explains. Firstly, each turn is very short, lasting about 2.3 seconds, while traditional approaches require at least four to five seconds' worth of data to identify who is speaking. Secondly, most vocal feature identification methods use complex algorithms that consume immense computational resources and battery power. To work around this, Professor Lee and colleagues got a little creative.
“What we have done is to use the volume feature in smartphones. When I speak, my phone captures the volume of my voice as louder, and yours as lower. This is how we managed to capture the turn-taking of the speakers with really low computational requirement,” Professor Lee explains.
“Also, with this simplified volume feature, you don’t need a lot of data to capture who is speaking—only 500 milliseconds’ worth of data.”
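As a rough illustration of the idea described above (not the actual SocioPhone implementation), volume-based turn detection can be sketched as comparing the loudness each phone measures over 500-millisecond windows: whoever's phone hears the loudest signal is taken to be the current speaker. All names, values and thresholds below are hypothetical.

```python
# Hypothetical sketch: each phone reports its measured loudness (RMS
# energy) per 500 ms window. The speaker is assumed to be the owner of
# the phone that hears the loudest signal, provided the level clears a
# silence threshold. Thresholds and identifiers are illustrative only.

def detect_turns(volume_by_phone, silence_threshold=0.05):
    """volume_by_phone: dict mapping phone id -> per-window RMS values.
    Returns one entry per 500 ms window: the id of the phone whose
    owner is speaking, or None for silence."""
    n_windows = len(next(iter(volume_by_phone.values())))
    turns = []
    for w in range(n_windows):
        loudest, level = None, silence_threshold
        for phone, volumes in volume_by_phone.items():
            if volumes[w] > level:
                loudest, level = phone, volumes[w]
        turns.append(loudest)
    return turns

# Example: mother speaks in windows 0-1, silence in 2, child in 3-4.
windows = {
    "mum":   [0.80, 0.75, 0.02, 0.20, 0.15],
    "child": [0.30, 0.25, 0.01, 0.60, 0.70],
}
print(detect_turns(windows))  # ['mum', 'mum', None, 'child', 'child']
```

Comparing a single scalar per window is what keeps the computational cost so low: no spectral features or speaker models are needed, only the relative loudness each device observes.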
Professor Lee and colleagues presented their findings in 2013 at MobiSys, the annual international conference on mobile systems, applications and services, a venue renowned for its coverage of mobile systems technology.
Unlocking the secrets of body language
Future research goals for Professor Lee include studying other dimensions in human conversations, such as non-verbal cues. To analyse facial expressions, hand gestures and vocal tones, the researchers will use cameras and motion sensors that are attached to the user’s wrist.
These types of non-verbal information are just as essential to teasing apart the intricacies of human interaction, as the TalkBetter study showed. During the testing phase, a child asked a question to which the mother did not respond verbally; instead, she nodded and made eye contact, recalls Professor Lee. The system nonetheless issued a warning, a false alarm, because it could not pick up on any non-verbal contextual cues.
“We want to expand the scope of the sensing,” he says. “When we fuse all this sensing information together, only then can we understand human interaction from a holistic point of view.”
Asian Scientist Magazine is a media partner of the Singapore Management University Office of Research.
———
Copyright: SMU Office of Research. Read the original article here; Photo: Cyril Ng.
Disclaimer: This article does not necessarily reflect the views of AsianScientist or its staff.