Projects

CyLab Mobility Research Center | Advisor: Prof. Martin Griss

OmniSense: Collaborative Mobile Sensing and Inference for Context-Aware Applications | 09/2009 - present

Developed a collaborative location recognition system that combines sensors from multiple nearby mobile phones with a classifier fusion model. The system outperforms existing single-phone approaches and is subject to fewer limitations. It is implemented in Python on Nokia N95 phones.

Multimedia Processing and Communications Lab | Advisor: Prof. Homer H. Chen

Automatic Chord Recognition for Music Search and Recommendation | 08/2007 - 06/2008

Proposed a hidden Markov model and N-gram based system for chord feature extraction. The system achieves high accuracy, and the proposed features proved more effective than MFCC and MPEG-7 audio features when applied to music emotion classification. (In collaboration with Telecommunication Labs, Chunghwa Telecom Co.)

Audio Segmentation and Structure Analysis for Music Thumbnailing | 02/2008 - 05/2009

Proposed a multimodal system for music structural segmentation based on audio and textual information. Constrained clustering and natural language processing are used to detect music structures (intro/verse/chorus/bridge/outro) and their boundaries.

Affective Computing for Multimodal Music Emotion Recognition | 08/2007 - 06/2009

Proposed a multimodal approach that exploits audio and textual features using statistical techniques (e.g., PLSA), and developed an emotion-based music retrieval platform.

Speech Processing Lab | Advisor: Prof. Lin-shan Lee

Histogram-Based Quantization for Robust and Distributed Speech Recognition | 12/2007 - 06/2008

Modified the probability distribution assumption used in histogram-based quantization, making the quantized speech features more robust for recognition under low SNR. Also tested the effects of environmental noise and of channel noise during transmission.

Large Vocabulary Speech Recognition System | 08/2007 - 12/2007

Course Projects

Gesture-Based Robotic Arm Control | Spring 2010

Controlled a robotic arm with 5 degrees of freedom using human gestures. Recognized gestures using motion tracking algorithms implemented in OpenCV. Transmitted the inferred gesture sequences and remotely controlled the robot arm using the SunSPOT platform and 802.15.4 radio.

Lunar Image Classification for Terrain Detection | Fall 2009

Investigated features and classification algorithms for terrain detection on lunar images from NASA. Achieved 95% accuracy using the proposed histogram of gradient orientation [3]. (in course Statistical Learning, in collaboration with NASA Ames Research Center)

Fast Mode Decision Algorithm for H.264/AVC Intra-Prediction | Spring 2008

Proposed (a) a variance-based method and (b) an improved filter-based method for macroblock and prediction mode decision for H.264 high-profile video coding using JM 13.2 reference software. (in course Digital Video Technology)

Automatic Cloth Segmentation Based on Markov Random Field | Spring 2008

Devised an algorithm to segment the clothing region of each person in an image, given detected face locations. The graph cut algorithm is adopted to iteratively optimize pre-defined energy functions. (in course Advanced Topics in Multimedia Analysis and Indexing)

Error Detection and Correction for Data Transmission over Wireless Channel | Spring 2008

Designed and built a system using turbo codes and an automatic repeat request (ARQ) scheme. Evaluated BER and transmission time over a Rayleigh channel. (in course Communication System Lab)

Interactive Billiards System on FPGA | Spring 2007

Designed a billiards game on FPGA featuring a real cue stick with an LED as input to a CMOS camera, an SRAM controller, VGA display, and physical simulation. (in course Digital Circuit Lab)

National Taiwan University