Keywords:
pose estimation, depth data, markerless motion capture, Kinect, 3D model
Persian abstract:
To perform motion capture, suitable data must be extracted over time by tracking key points of the intended target. With these data, and through a series of post-processing operations, many tasks can be carried out, including reconstructing the motion in three-dimensional space. This paper presents a model-based scheme for estimating dynamic hand poses using markerless motion capture. In this study, the movements of a human operator's arm, recorded as a sequence of color images together with depth and skeleton data obtained from Kinect (a markerless motion-capture device) at thirty frames per second, are used as the input data. The proposed scheme extracts temporal and spatial features from the input image sequence and focuses on locating the tips of the thumb and index fingers and on obtaining the robot joint angles, in order to imitate the human operator's arm movement in three dimensions in an uncontrolled environment. The real RoboTEK II ST240 robot arm is used in this study, and the movements of the human operator's arm are restricted to those defined for this robot arm. For each frame, the feature vector used for pose estimation requires the x, y, and depth coordinates of certain joints as well as the coordinates of the thumb and index fingertips. Depth and skeleton data are used to determine the robot joint angles; however, the fingertips cannot be located directly from the available data. Therefore, three approaches for detecting the thumb and index fingertips from the available data are presented. These approaches employ concepts such as thresholding, edge detection, convex hull construction, skin-color modeling, and background subtraction. Finally, to imitate the motion, the corresponding pose is applied to the robot arm for each frame using the feature vectors. To evaluate the imitation, the trajectories traced by the end of the human operator's hand and by the robot arm's end effector are compared. Plots of the joint-angle variations for these two cases demonstrate the effectiveness of the proposed scheme in imitating the performance of the human arm.
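For illustration, a joint angle of the kind described above can be computed from three tracked skeleton points using the dot product. This is a minimal sketch assuming (x, y, depth) skeleton coordinates from Kinect; the joint names and sample values are illustrative, not the paper's implementation.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at joint b formed by the segments b->a and b->c.

    a, b, c are 3D points (x, y, depth) taken from the Kinect skeleton,
    e.g. shoulder, elbow, and wrist for the elbow flexion angle.
    """
    u = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    v = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Example with made-up skeleton coordinates (x, y, depth in meters):
shoulder, elbow, wrist = (0.0, 0.4, 2.0), (0.25, 0.4, 2.0), (0.45, 0.6, 1.9)
print(joint_angle(shoulder, elbow, wrist))  # elbow angle for this frame
```

Angles computed this way per frame could then be mapped onto the corresponding robot joints.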
English abstract:
Pose estimation is the process of identifying how a human body and/or individual limbs are configured in a given scene. Hand pose estimation is an important research topic with a variety of applications in human-computer interaction (HCI) scenarios, such as gesture recognition, animation synthesis, and robot control. However, capturing hand motion is quite challenging due to the hand's high flexibility. Many sensor-based and vision-based methods have been proposed to address this task.
In sensor-based systems, specialized hardware is used to capture hand motion. Vision-based hand pose estimation methods can generally be divided into two categories: appearance-based methods and model-based methods. In appearance-based approaches, various features are extracted from the input images to estimate the hand pose. Usually, a large number of training samples is used in advance to learn a mapping function from the features to the hand poses; given the learned mapping function, the hand pose can be estimated efficiently. In model-based approaches, the hand pose is estimated by aligning a projected 3D hand model with the hand features extracted from the inputs, so the model state must be provided at every point in time. These methods require many calculations, which makes them impractical to run immediately in practice. Hand pose estimation from (color/depth) images consists of three steps (a minimal sketch of this loop follows the list):
1. Hand detection and segmentation
2. Feature extraction
3. Setting the model parameters using the extracted features and updating the model
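A minimal sketch of this three-step, per-frame loop; every helper below is a hypothetical placeholder standing in for the components described in this abstract, not the authors' implementation.

```python
import numpy as np

def detect_and_segment_hand(color, depth):
    """Step 1 (placeholder): isolate the hand region, e.g. by keeping
    pixels closer than an assumed depth threshold (in meters)."""
    return depth < 1.0

def extract_features(hand_mask, skeleton):
    """Step 2 (placeholder): fingertip positions, palm position,
    joint angles, etc."""
    return {"skeleton": skeleton, "mask_pixels": int(hand_mask.sum())}

def update_model(model, features):
    """Step 3 (placeholder): set model parameters from the features."""
    model.update(features)
    return model

def estimate_pose(frames):
    """Run the three-step loop over (color, depth, skeleton) frames,
    e.g. a 30 fps Kinect stream."""
    model = {}
    for color, depth, skeleton in frames:
        mask = detect_and_segment_hand(color, depth)  # step 1
        feats = extract_features(mask, skeleton)      # step 2
        model = update_model(model, feats)            # step 3
        yield model
```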
To extract the necessary features for pose estimation, depending on the model used and the intended hand gesture analysis, features such as fingertip positions, the number of fingers, palm position, and joint angles are extracted. In this paper, a model-based markerless dynamic hand pose estimation scheme is presented. Motion capture is the process of recording a live motion event and translating it into usable mathematical terms by tracking a number of key points in space over time and combining them to obtain a single 3D representation of the performance. The sequences of depth images, color images, and skeleton data obtained from Kinect (a new tool for markerless motion capture) at 30 frames per second serve as the inputs to this scheme. The proposed scheme exploits both the temporal and spatial features of the input sequences and focuses on localizing the index and thumb fingertips and on obtaining the joint angles of the robot arm, in order to mimic the user's arm movements in 3D space in an uncontrolled environment. The RoboTEK II ST240 is used as the real robot arm. Depth and skeleton data are used to determine the angles of the robot joints. Three approaches to identifying the tips of the thumb and index fingers from the available data are presented, each with its own limitations. These approaches use concepts such as thresholding, edge detection, convex hull construction, skin modeling, and background subtraction. Finally, a comparison of the tracked trajectories of the user's wrist and the robot end effector shows an average error of about 0.43 degrees, which is an appropriate performance for this research. The key contributions of this work are estimating the hand pose for every input frame and updating the robot arm according to the estimated pose. Thumb and index fingertip detection, which supplies part of the feature vector, is achieved using the presented approaches. User movements are translated into the corresponding Move instruction for the robot; the features required for a Move instruction are the rotation values around the joints in different directions and the degree of opening between the index finger and thumb. A sketch of the convex-hull fingertip approach follows.
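As one plausible realization of the thresholding and convex-hull approach named above (a sketch, not the paper's exact method), the hand can be isolated in a depth frame by a depth band, and fingertip candidates can be read off the convex hull of its silhouette. The depth range and millimeter units are assumptions.

```python
import cv2
import numpy as np

def fingertip_candidates(depth_mm, near=500, far=800):
    """Find fingertip candidates in a depth frame (values in mm).

    Assumes the hand is the object lying in the [near, far] depth band:
    threshold the depth image, take the largest contour as the hand
    silhouette, and return its convex-hull points, whose extremes
    typically include the fingertips.
    """
    # 1. Thresholding: keep only pixels in the assumed hand depth band.
    mask = cv2.inRange(depth_mm, near, far)
    # 2. The largest contour is assumed to be the hand silhouette.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return []
    hand = max(contours, key=cv2.contourArea)
    # 3. Convex hull vertices; fingertips appear among its extremes.
    hull = cv2.convexHull(hand)
    return [tuple(pt[0]) for pt in hull]
```

Fingertips could then be selected from the hull points, for example by their distance from the palm center, while the paper's other approaches instead rely on edge detection, skin-color modeling, and background subtraction to cope with different limitations.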