Comparative Analysis of the Datasets with Multimodal Content

Maksym Shulha; Yuri Gordienko; Sergii Stirenko

Last modified: 2021-10-18

Abstract

Recent work has shown that multimodal content analysis is a popular task in a range of applications, including healthcare, security, and marketing. It comprises many subtasks, of which joint vision and language understanding is among the most actively pursued. Any machine learning task requires a dataset, and many rich datasets have recently become available. In this work we compare datasets used in joint vision and language understanding tasks. We present a detailed analysis of modern datasets, compare their basic characteristics, and describe their potential use in practical tasks, especially in the context of our previous work.

Scientific Conferences of Ukraine, 6th International Conference High Performance Computing HPC-UA 2020
