• Data collection of 3D spatial features of gestures from static peruvian sign language alphabet for sign language recognition

      Nurena-Jara, Roberto; Ramos-Carrion, Cristopher; Shiguihara-Juarez, Pedro (Institute of Electrical and Electronics Engineers Inc., 2020-10-21)
      Peruvian Sign Language (PSL) recognition is approached as a classification problem. Previous work has employed 2D features from the position of the hands to tackle this problem. In this paper, we propose a method to construct a dataset consisting of 3D spatial positions of static gestures from the PSL alphabet, using the HTC Vive device and a well-known technique to extract 21 keypoints from the hand to obtain a feature vector. A dataset of 35,400 instances of PSL gestures was constructed, and a novel way to extract the data is described. To validate the appropriateness of this dataset, four baseline classifiers were compared on the Peruvian Sign Language Recognition (PSLR) task, achieving an average F1 measure of 99.32% in the best case.
      Access temporarily restricted
    • DeepHistory: A convolutional neural network for automatic animation of museum paintings

      Ysique-Neciosup, Jose; Mercado-Chavez, Nilton; Ugarte, Willy (John Wiley and Sons Ltd, 2022-01-01)
      Deep learning models have shown that neural networks can be trained to dispense, to a lesser or greater extent, with human intervention in the task of image animation, which reduces both the production time of these audiovisual pieces and the economic investment they require. However, these models suffer from two common problems: the animations they generate are of very low resolution, and they require large amounts of training data to produce good results. To deal with these issues, this article introduces an architectural modification of a state-of-the-art image animation model, integrated with a video super-resolution model, to make the generated videos more visually pleasing to viewers. Although animation models can be trained with higher-resolution images, training would take much longer without necessarily improving the quality of the animation, so it is more efficient to complement the animation model with another model focused on improving the resolution of the generated video, as we demonstrate in our results. We present the design and implementation of a convolutional neural network based on a state-of-the-art model for the image animation task, trained on a set of facial data from videos extracted from the YouTube platform. To determine which of the modifications to the selected state-of-the-art model architecture is best, the results are compared using different metrics that evaluate performance in image animation and video quality enhancement tasks. The results show that modifying the part of the model architecture focused on the detection of characteristic points significantly helps to generate more anatomically accurate and visually attractive videos. In addition, perceptual testing with users shows that using a video super-resolution model as a plugin helps to generate more visually appealing videos.
      Access temporarily restricted
    • A Novel Dataset for the Transport Sector in a Province of Peru

      Guerrero, Miguel Arango; Juárez, Pedro Shiguihara (2021-01-01)
      Problems related to public and private transport in Peru are persistent. New proposals to solve them keep arising, but data analysis is only beginning to take hold in Peru, and there are few useful open datasets that allow solutions to be proposed for each environment. In this paper, we collect relevant transport data from a province of Peru, involving more than 1000 users, restricted to a delimited geographic area, and covering 2 years of operations and more than 3000 tracked transport services. We highlight the importance of the data, its potential uses within the transport sector, and a use case for the collected dataset.
    • Recurrent neural networks for deception detection in videos

      Rodriguez-Meza, Bryan; Vargas-Lopez-Lavalle, Renzo; Ugarte, Willy (Springer Science and Business Media Deutschland GmbH, 2022-01-01)
      Deception detection has always been a subject of interest. After all, determining whether a person is telling the truth could be decisive in many real-world cases. Current methods to discern deception require expensive equipment and specialists to read and interpret its output. In this article, we carry out an exhaustive comparison of 9 different recurrent deep learning models based on facial landmark recognition, trained on a recent man-made database used to determine lies, comparing them by accuracy and AUC. We also propose two new metrics that represent the validity of each prediction. The results of a 5-fold cross-validation show that, of all the tested models, the Stacked GRU neural model has the highest AUC, 0.9853, and the highest accuracy, 93.69%. We then compare our proposed Stacked GRU architecture with other machine learning and deep learning methods, and it surpasses them in the AUC metric. These results indicate that we are not far from a future in which deception detection could be accessible through computers or smart devices.
      Access temporarily restricted