Tuesday, March 30, 2010

Simplified facial animation control utilizing novel input devices: a comparative study

Nikolaus Bee University of Augsburg, Augsburg, Germany
Bernhard Falk University of Augsburg, Augsburg, Germany
Elisabeth André University of Augsburg, Augsburg, Germany

Paper Link:

Animating facial expressions is difficult because most expressions involve the simultaneous movement of several muscle groups.
Graphic designers, however, can usually move only one muscle group at a time with a bar slider and a mouse, which makes good facial animation hard to achieve.
So the team used a gamepad and a data glove that allow parallel editing through several different mapping schemes.

The model for facial manipulation that the team used was "Alfred," a FACS (Facial Action Coding System) based model with 23 action units.
FACS was used for Gollum in The Lord of the Rings, as well as in King Kong and Half-Life 2.

For the gamepad, the team chose the Xbox 360 controller because it is ergonomic, cheap, familiar, and easily connected to a Windows PC.
The 360 controller offers 2 analog sticks and 2 analog triggers.
One stick offers x- and y-axis control with both negative and positive values, allowing 4 different parameters to be controlled by a single stick.
The other stick used a circular, polar-coordinate-based control.
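The half-axis idea above can be sketched in a few lines. This is my own illustrative reconstruction, not the authors' code, and the AU names in the returned dict are placeholders rather than anything from the paper:

```python
# Hypothetical sketch of the stick mapping: one analog stick's signed x/y
# axes drive four action units, one AU per half-axis. AU names below are
# illustrative placeholders, not taken from the paper.

def stick_to_aus(x, y, dead_zone=0.1):
    """Map a stick position (x, y in [-1, 1]) to four AU intensities in [0, 1].

    Positive x drives one AU, negative x another, and likewise for y,
    so a single stick edits four parameters in parallel.
    """
    def split(v):
        v = 0.0 if abs(v) < dead_zone else v
        return (max(v, 0.0), max(-v, 0.0))

    x_pos, x_neg = split(x)
    y_pos, y_neg = split(y)
    return {
        "AU_right": x_pos,  # e.g. could drive a lip corner puller
        "AU_left": x_neg,   # e.g. a lip corner depressor
        "AU_up": y_pos,     # e.g. a brow raiser
        "AU_down": y_neg,   # e.g. a brow lowerer
    }
```

The dead zone is a standard gamepad detail (ignore tiny stick drift near center), not something the paper specifies.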

The digital buttons and directional pad were used for other control functions, like switching the current setting and the action unit mappings.
The gamepad mapping included three settings:
1) Upper face with 7 action units
2) Lower face 1
3) Lower face 2 - Inner lips

For the data glove, the team chose the "P5 Glove," which was originally designed for gaming, making it cheap and widespread. The glove supports 5 simultaneous movements: it registers one-dimensional finger bends as well as the orientation and position of the hand, making it a near-perfect candidate to replace the traditional slider. The P5 glove has the following features:
• absolute position (x,y,z), relative position (x,y,z), and rotation
(yaw, pitch, roll)
• finger bend
• three additional digital buttons
Mapping of the data glove used 6 settings:
1) Brows - 3 AUs
2) Lids - 3 AUs
3) Cheek and Nose - 3 AUs
4) Corners of the Mouth - 4 AUs
5) Chin and Inner Lips - 4 AUs
6) Lips - 3 AUs
Selection among the 6 settings was done by moving the glove horizontally.
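Putting the glove scheme together, a minimal sketch looks like this. Again this is my own guess at the structure, not the authors' implementation, and the AU identifiers are made-up placeholders standing in for the real action units in each setting:

```python
# Illustrative sketch (not the authors' code) of the P5-glove mapping:
# the glove's horizontal position selects one of the six settings, and
# each finger's one-dimensional bend value drives one AU in that setting.
# The AU identifiers are hypothetical placeholders.

SETTINGS = [
    ["brows_1", "brows_2", "brows_3"],             # 1) Brows - 3 AUs
    ["lids_1", "lids_2", "lids_3"],                # 2) Lids - 3 AUs
    ["cheek_1", "cheek_2", "nose_1"],              # 3) Cheek and nose - 3 AUs
    ["mouth_1", "mouth_2", "mouth_3", "mouth_4"],  # 4) Corners of the mouth - 4 AUs
    ["chin_1", "chin_2", "lip_in_1", "lip_in_2"],  # 5) Chin and inner lips - 4 AUs
    ["lips_1", "lips_2", "lips_3"],                # 6) Lips - 3 AUs
]

def select_setting(x, x_min=-1.0, x_max=1.0):
    """Pick a setting index 0..5 from the glove's horizontal position."""
    t = (x - x_min) / (x_max - x_min)  # normalize to [0, 1]
    return min(int(t * len(SETTINGS)), len(SETTINGS) - 1)

def glove_to_aus(x, finger_bends):
    """Map finger bends (each in [0, 1]) onto the AUs of the active setting."""
    aus = SETTINGS[select_setting(x)]
    return dict(zip(aus, finger_bends))  # extra fingers are simply unused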

After running a correlation analysis of action units against expressions
of joy, anger, fear, sadness, disgust, and surprise, and determining how frequently each action unit is used for each emotion, the team devised a context-based control mapping that assigns the most heavily used AUs to each emotion mode.
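A context-based mapping like this amounts to a lookup from emotion mode to its most relevant AUs. The AU numbers below are commonly cited FACS associations for the basic emotions, not the paper's own correlation results, and the axis assignment is purely illustrative:

```python
# Sketch of a context-based mapping: each emotion mode exposes the AUs most
# associated with that expression. AU numbers are common FACS associations,
# not the paper's measured correlations.

EMOTION_MODES = {
    "joy":      [6, 12],
    "surprise": [1, 2, 5, 26],
    "anger":    [4, 5, 7, 23],
    "sadness":  [1, 4, 15],
    "fear":     [1, 2, 4, 5, 20, 26],
    "disgust":  [9, 15],
}

def controls_for_mode(emotion, n_controls=4):
    """Assign the mode's highest-priority AUs to the first n controller axes."""
    aus = EMOTION_MODES[emotion]
    return {f"axis_{i}": au for i, au in enumerate(aus[:n_controls])}
```

Switching emotion modes would then rebind the same physical controls to a different, more relevant AU subset.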

Professional Study:
They introduced the devices by coaching the game developers
and listening to them think aloud.

Directly mapped gamepad interface: liked.
Context-mapped gamepad interface: difficult to get oriented with; less control.

Data glove: less familiar, inaccurate, physically tiring, noisy; selecting a setting was difficult.

Formal User Study:
1) How do users get along with the novel input devices versus sliders?
2) Do they enjoy them?
3) How do they assess the technical features?

5-point scale, 17 users aged 20 to 40, 76% students
Training phase - experiment freely with the device
Modeling phase - recreate an expression from a photo

            Accuracy   Speed      Satisfaction with expression
Gamepad     4.26       148.06 s   3.63
Sliders     4.56       168.29 s   3.84
Data glove  4.94       263.31 s   3.30

Gamepad - best mean scores, reduced production time, no loss of quality, 49% preferred

Data glove - lets users focus on the work, but slower, lower quality, low comfort, insufficient accuracy, 24% preferred

Sliders - 27% preferred, require shifting focus, accurate, reasonable satisfaction

My Spill:
I was surprised to learn that gamepads aren't already the standard method of input in facial design. It seems like the ergonomic design and multiple functions would lend itself to that role and the user study seems to reflect that.

That said, I wish the data glove had produced better results; it seems like you could make facial animations quickly with 5 different levels of control. Maybe they needed a better mapping scheme for it.

I think using sliders sounds boring and too difficult given that you can only control one thing at a time.

I'd like to see more work on the data glove to make it as efficient and enjoyable as the other two methods.

Saturday, March 27, 2010

Opening Skinner's Box

(Comment left on Jill Grezcek's blog)

Author: Lauren Slater

In Opening Skinner's Box, Lauren Slater examines 10 of the most influential psychological experiments of the 20th century and applies her own views and interpretations in a nearly lyrical style, both to entertain readers and to enlighten them on topics ranging from philosophy and existentialism to views on the sacredness of life and the human mind.

The 10 psychological experiments were:
1) Skinner's experiments on rats showing that responses are cued by rewards and reinforcements, meaning that simple animals can learn complex tasks and skills and are more influenced by reward than by punishment.

2) Milgram's shock/obedience-to-authority experiments that had people put in the situation where they were instructed to shock another human being as punishment. The results showed that about 65% would deliver "fatal" shocks. The experiments were deemed unethical and dehumanizing and those involved were clearly changed by their involvement, although a few claim that they would not trade their experience.

3) Rosenhan's infiltration of psych wards by 8 normal people that challenged the foundation of psychoanalysis. 7 of the 8 infiltrators were held in the psych wards when they complained that they heard a voice say "thump" in their heads. The experiments made it clear that there was no clear way to psychoanalytically define mental conditions and that mental conditions may be more of a problem of perception and labels as opposed to actual illness.

4) Darley and Latane's discovery of the "diffusion of responsibility." Their experiments were inspired by the Kitty Genovese case in New York City, where a woman was raped and killed in a prolonged incident with as many as 38 eyewitnesses who did nothing to stop the assault. In the Darley and Latane experiment itself, a subject sat in a room listening to a recording of a man having a seizure while others were supposedly also listening. It took over 6 minutes for most subjects to take action. In response, Darley and Latane developed the five stages of helping behavior:
1 - You, the potential helper, must notice an event is occurring.
2 - You must interpret the event as one in which help is needed.
3 - You must assume personal responsibility
4 - You must decide what action to take
5 - You must then take action

5) Festinger and his "Theory of Cognitive Dissonance," wherein "The psychological opposition of irreconcilable ideas (cognitions) held simultaneously by one individual, created a motivating force that would lead, under proper conditions, to the adjustment of one's belief to fit one's behavior" (rather than vice versa).
People alter their beliefs to justify their behavior or their current circumstances. Slater investigated cognitive dissonance in Linda and her daughter, who supposedly took in the pain of others to heal them. Festinger studied a cult that believed aliens would bring about a cataclysmic event; when it never happened, he observed the rationalizations and reactions of the believers. (They continued to believe despite the evidence because they were explaining away their own reactions.)

6) Harlow's experiments on macaque monkeys and the nature of love and affection. Harlow deprived monkeys of their mothers and constructed a metallic surrogate that provided milk and a soft surrogate that did not. The monkeys clung to the soft surrogate. Harlow found that proximity, touch, play, and affection are needed for primates to develop properly. This led to changes in the kind of care considered necessary for infants. It is ironic that cruelty done to monkeys reveals the nature of love and affection.

7) Alexander's "Rat Park" experiment, showing that addiction is situational and cultural: caged rats and rats in an idyllic rat park were provided with both clean and heroin-laced water. Rats in the rat park stayed clean while caged rats got high. Furthermore, rats forced to get high and then placed in the rat park would overcome their addiction despite going through withdrawal. A highly political subject.

8) Loftus's experiments on the nature of memory, which showed that false memories can easily be created by mere suggestion. The chapter also covered her defense of people who suddenly "remembered" traumatic childhood events that never happened. In her experiments, family members suggested an episode of being lost in a mall to a subject. Within 24-48 hours the subject would completely "remember" the fictional incident, down to minute details and feelings about it. This challenged the idea of repression and countered the popular thought and trends of the time.

9) Kandel conducted experiments on sea slugs to demonstrate the biological nature of learning and memory at the neuron level. He discovered CREB, which switches on the genes needed to produce the proteins that create permanent connections between cells; this is how learning and memory are formed. Drugs are being developed to exploit this compound to enhance learning and recall in humans, which raises ethical questions.

10) Moniz and his lobotomies, which relieved anxiety and severe psychological symptoms. While lobotomies became popular for about two decades thanks to Moniz, there was a backlash when less precise (but also less invasive and controversial) pharmacological alternatives became available. Although the brain still holds many mysteries, lobotomies today have become much safer and more precise and may actually be preferable to pharmacological alternatives. They remain taboo and hard to find, mostly because of the perception of lobotomies themselves and the fear that the surgeries may "eliminate the spark" that makes us human. In essence, the brain is sacred.


My spill:
The book itself is an interesting read that brings up many intriguing and controversial questions about the nature of the mind and the sanctity of life.
The field of psychology has historically been a hard field to classify and I think this book addresses that point and most of its facets, including its human element, quite well.

The book is written in an artsy style that didn't quite match my sensibilities as to how these kinds of subjects should be addressed. Don't get me wrong, I like the arts and can appreciate cultured expositions, but Slater's presentation of the material felt strained, dishonest, and too skewed toward her own perceptions. I think if I met this woman, she and I would disagree on a large number of issues.

It is interesting to note that East Asians are more comfortable with paradoxes at a biological level.

Tuesday, March 23, 2010

The Inmates are Running the Asylum (part 2)

(Comment left on Aaron Loveall's blog)

The second half of TIARTA was more of a prescription for how
companies can implement effective design in their software development.

Cooper pointed out that businessmen of the company are too focused on viability
while programmers are too focused on capability.
Cooper claimed that designers bridge the gap by providing desirability and earning customer loyalty.

Cooper recommended that companies spend more time early in the development
process to clearly identify the goals of the system and specific fictional target users that reflect reality (called personas), and design for those crucial people.

Cooper also said that well-designed software should:
be interested in me
have common sense
anticipate needs
be responsive
be taciturn about its personal problems
be well informed
stay focused
give instant gratification
be trustworthy

After defining personas and their goals, scenarios can be constructed,
and they should be defined in breadth rather than in depth.
-Daily uses: well designed
-necessary uses: available
-edge cases: addressed

Cooper then discusses design-friendly business practices and conceptual integrity in product vision (i.e., letting designers, not users or programmers, run the process).


My spill:

The second half of the book seemed more practical and useful since it actually prescribed some action for good business and software development.

All of the recommendations made by Cooper seemed reasonable.
Design is important and should come first, and goal-oriented, persona-based design
makes a lot of sense.

Cooper still seems like he's bashing programmers a little bit though.

Thursday, March 11, 2010

Extending 2D Object Arrangement with Pressure-Sensitive Layering Cues

(Comment left on Jillian Greczek's blog)

Philip L. Davidson Perceptive Pixel, Inc, New York, NY, USA
Jefferson Y. Han Perceptive Pixel, Inc, New York, NY, USA

Paper Link:

Davidson and Han provide a pressure-sensitive depth sorting technique that utilizes two dimensional multi-touch manipulation techniques.

This depth sorting is most commonly known as layering.
Layering of objects in current models is usually done via mouse input, using a relative control model where operations happen in discrete steps.
For example, you can click an element and tell it to "go to back" or "bring forward."
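The discrete model being contrasted here is easy to picture as operations on a z-ordered list. This is a generic sketch of that conventional model, not anything from the paper:

```python
# Minimal sketch of the discrete, mouse-style layering model: a z-ordered
# list (back to front) where commands like "bring forward" and "send to
# back" move an object by one step or to an extreme position.

def bring_forward(order, obj):
    """Swap obj one step toward the front (end of the list)."""
    i = order.index(obj)
    if i < len(order) - 1:
        order[i], order[i + 1] = order[i + 1], order[i]
    return order

def send_to_back(order, obj):
    """Move obj to the very back (start of the list)."""
    order.remove(obj)
    order.insert(0, obj)
    return order
```

Davidson and Han's contribution is precisely to replace these stepwise commands with continuous, pressure-based ordering.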

However, with the arrival of tabletop systems that utilize multi-touch technology, these kinds of commands are awkward.

Davidson and Han's system lets the user tilt and lift objects on a pressure-sensitive multi-touch surface and accurately manipulate and sort them.

The system has several features:
-Windows/objects can be peeled back to uncover objects below
-Pressure Sensing that allows the exact ordering relative to other windows
-Multi-touch commands for resizing, moving, tilting, and rotating objects
-Audio and haptic feedback for overlap and tilting events.

The system also offers the benefit of minimizing the number of depth-sorting control artifacts in the UI.


My spill:

This is a very solid piece of research pertaining to the layering of objects, and I can definitely see how it will be useful in tabletop systems and projection smart spaces.

I agree with their observations in future work where permanently curled and folded corners would be a benefit. I'd also like to see how this kind of interaction could be applied to more three-dimensional figures on an interactive surface.

I really don't see too many drawbacks to this work.

Towards More Paper-like Input: Flexible Input Devices for Foldable Interaction Styles

(Comment left on Aaron Loveall's blog)

David T. Gallant Queen's University, Kingston, ON, Canada
Andrew G. Seniuk Queen's University, Kingston, ON, Canada
Roel Vertegaal Queen's University, Kingston, ON, Canada

Gallant et al. present the Foldable User Interface (FUI), a paper-like input device for more organic, paper-like manipulation of on-screen objects like windows, pages, or three-dimensional models.

The main benefit of their interface is that it is cheap, unlike most similar input devices. It is also fairly robust and accurate.

To implement FUI, they used an IR webcam, an LCD screen, and a foldable input device (FID) made of black cardstock with 25-35 infrared reflectors made out of 3M retro-reflective tape.

FUI has several interaction techniques:
Thumb Slide - Select, click, pop-up menus
Scoop Shape
Top Corner Bend - Bookmarking
Hover - Magnify/Zoom
Fold - Helps create 3D models
Leafing - Turning Pages
Shake - triggers discrete events (like sorting)

Navigation is done by moving the FID itself.

My Spill:

While the FUI is an interesting idea with good observations about the properties of paper, I don't see this being a widespread method of everyday human-computer interaction.

However, I could see where this kind of technique might be useful in creating three dimensional models for certain kinds of specialists.

If they presented a literal desktop interface where documents are represented by paper-like objects (I know there are a few out there), this system would be much more appealing. But for current GUIs, this isn't a very useful input system.

Wednesday, March 10, 2010

Annotating Gigapixel Images

(Comment left on ________'s Blog)

Qing Luan University of Science and Technology of China, Hefei, China
Steven M. Drucker Microsoft Live Labs Research, Redmond, WA, USA
Johannes Kopf University of Konstanz, Konstanz, Germany
Ying-Qing Xu Microsoft Research Asia, Beijing, China
Michael F. Cohen Microsoft Research, Redmond, WA, USA

Paper Link:

Luan et al.'s system provides a way of annotating gigapixel images with three kinds of annotations:
1) Looping Sounds
2) Triggering Narrations
3) Visual Labels

The system also exhibits hysteresis: sounds persist after the user moves away, and the strength of a sound increases as the user gets closer.

Smaller annotations gradually appear as the user stays on a particular part of the image.

The user can add an annotation and attach audio files. The size of the annotation marker is referenced against the size of the original image, and that reference value determines both the strength of the associated audio and the size of the annotation label.
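The hysteresis behavior can be sketched with two thresholds: a close distance that starts a sound and a farther one that stops it, so a sound that is already playing persists while the view lingers in between. This is my own illustration of the described behavior; the threshold values are assumptions, not the paper's:

```python
# Hedged sketch of the described behavior: an annotation's audio gain grows
# as the view closes in on it, and hysteresis keeps a sound playing until
# the view is clearly far away. Distances and thresholds are illustrative.

class SoundAnnotation:
    def __init__(self, start_dist=1.0, stop_dist=1.5):
        # stop_dist > start_dist gives the hysteresis band: a sound that
        # has started keeps playing until the view moves well past it.
        self.start_dist = start_dist
        self.stop_dist = stop_dist
        self.playing = False

    def update(self, dist):
        """Return the gain in [0, 1] for the current normalized view distance."""
        if dist <= self.start_dist:
            self.playing = True
        elif dist >= self.stop_dist:
            self.playing = False
        if not self.playing:
            return 0.0
        return max(0.0, min(1.0, 1.0 - dist / self.stop_dist))
```

Moving away slightly keeps the sound audible at reduced gain; only a clear retreat past the outer threshold silences it.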


My spill:

This kind of annotation system has obvious applications for things like Google Earth, space maps, and biological systems.

Providing these annotations makes exploring such systems more fun, engaging, and informative, and I could easily see this system being used for educational purposes.

The only thing I would want for this is a way to change your initial point of view to a place within the image when you're either zoomed in or panned out.

Lightweight Material Detection for Placement-Aware Mobile Computing

(Comment left on Randy Ransom's blog)

Chris Harrison Carnegie Mellon University, Pittsburgh, PA, USA
Scott E. Hudson Carnegie Mellon University, Pittsburgh, PA, USA

Paper Link:

Harrison and Hudson develop a lightweight, cheap sensor for detecting the placement of mobile devices such as cell phones, iPods, and laptops. The sensor allows a device to detect the context/location it is placed in and react accordingly.

For example, a cell phone placed in a pocket doesn't have to light up its screen to announce an incoming call; it just has to ring or vibrate. By keeping the screen off, the phone saves power and extends its battery life.

The sensor that they implemented:
1) provides info on the space surrounding the device,
2) requires no external infrastructure to operate, and
3) makes the resulting data available to the device.

The sensor itself is made of a photoresistor, which measures light intensity, and a TSL230 light-to-frequency converter.
The sensor also has light-emitting diodes
1) Infrared
2) Red
3) Green
4) Blue
5) Ultraviolet
that illuminate the surrounding area so that the sensors can pick up the light reflected back toward the device and deduce what kind of environment it finds itself in.

The sensing routine takes only 25 ms and draws very little power: 20 mA when active.

They tested 27 sample materials over 6 trials, where the first 5 trials trained a naive Bayes classifier and the 6th determined the accuracy of the sensor. They found that the overall accuracy of the device was 86.9%.
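The train-then-classify step can be sketched with a small Gaussian naive Bayes over reflectance features (e.g. the intensity measured under each LED). This is an illustrative reconstruction of the general technique, not the authors' code, and the feature layout is an assumption:

```python
import math
from collections import defaultdict

# Illustrative Gaussian naive Bayes over reflectance features, e.g. one
# intensity reading per LED. Not the authors' implementation.

def fit(samples):
    """samples: list of (label, feature_vector). Returns per-class stats."""
    grouped = defaultdict(list)
    for label, vec in samples:
        grouped[label].append(vec)
    stats = {}
    for label, vecs in grouped.items():
        cols = list(zip(*vecs))
        means = [sum(c) / len(c) for c in cols]
        # small floor keeps the variance non-zero for constant features
        varis = [max(sum((v - m) ** 2 for v in c) / len(c), 1e-6)
                 for c, m in zip(cols, means)]
        stats[label] = (means, varis)
    return stats

def classify(stats, vec):
    """Pick the label with the highest Gaussian log-likelihood."""
    def log_lik(means, varis):
        return sum(-0.5 * math.log(2 * math.pi * s)
                   - (x - m) ** 2 / (2 * s)
                   for x, m, s in zip(vec, means, varis))
    return max(stats, key=lambda lbl: log_lik(*stats[lbl]))
```

Five training trials would populate `fit`'s samples per material, and the sixth trial's readings would go through `classify`.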

Next, they conducted a 16-person survey of the environments in which several mobile devices found themselves and the materials present in those environments.

With those materials they reran the previous tests and found that the accuracy rose to 94.4%.


My spill:

I'm all for making devices smarter by identifying the device's context. It makes machines more useful and requires fewer deliberate commands to get what you want out of them.

I think that they've succeeded for the most part in devising their sensor. Hopefully businesses will pick up on it and implement it in their devices.
With a little bit of advertising, we could see a new generation of mobile devices that are smarter and more energy efficient.

Their work on the sensor itself seemed pretty flawless. If I had to suggest improvements, I would expand the number of people taking the survey, or perhaps implement the sensor in a number of devices with a few showcase applications.

Wednesday, March 3, 2010

Understanding the Intent Behind Mobile Information Needs

(Comment left on Jarratt Brandon's blog)

Karen Church Telefonica Research, Barcelona, Spain
Barry Smyth University College Dublin, Dublin, Ireland

Paper Link:

Church and Smyth conducted a diary-based study of mobile internet usage and compared it with non-mobile internet usage.
They defined mobile usage on a location basis, where anywhere away from the home or office counted as "mobile." This allowed them to take all mobile devices into account.

The research team divided the entries into 6 location contexts.

Their study had 20 participants with an average age of 31 who entered diary entries/descriptions of all their information needs via internet search over a month.
They reminded participants once a week to make entries.
Their guiding methodological principle was to minimize interaction so that participants would generate very natural data for analysis.

The study generated 405 entries, of which 67% were made while mobile.
34% of the entries were in the on-the-go location context.

Their study also revealed that, in addition to Broder's classification of search, two new categories are needed:

- personal information management (PIM) (personal items, tasks, scheduling)

In addition, their study suggested that mobile search engines should be able to pick up on location, time, context, and the user's activity.
Also, a large portion of mobile searches are non-informational and principally geographical.


My spill:
This sounds like an interesting study in trying to make mobile searches more effective and sensitive to user demands based on the mobile context. The study pretty much generated what you would expect in a study like this.

Mobile devices need to be able to search for directions and have an idea of your location to generate useful search results.

I think the biggest improvement in mobile search needs to come from interfaces that are easier to view and navigate, as well as from making the infrastructure for such searches MUCH quicker. In fact, many users in their study cited slowness as a main reason they didn't search on a mobile device at all.

So I guess I'm trying to say that they're putting the cart before the horse with this study. Then again, maybe I'm being too critical.

I like the idea of trying to make a device more aware of context. That's a major flaw in computer systems in more ways than one: language translation, understanding user input... Context awareness would make computers much smarter.

Although I think they did an excellent job with their methodology, I would have liked to see more users in this kind of study to build a more accurate view of user behavior. Future work might try to identify personal information searches more accurately and tie the searches in with scheduling programs on the mobile device, or perhaps interface with your own computer at home.