Advanced Journal of Environmental Sciences (AJES)

DEEP LEARNING APPROACHES TO PREDICT FUTURE FRAMES IN VIDEOS

Authors

  • Tariqul Islam () Daffodil International University, Dhaka, Bangladesh
  • Hafizul Imran Daffodil International University, Dhaka, Bangladesh
  • Md. Ramim Hossain Daffodil International University, Dhaka, Bangladesh
  • 4Tamjeed Monshi Daffodil International University, Dhaka, Bangladesh7
  • Himanish Debnath Himu, Daffodil International University, Dhaka, Bangladesh
  • 6Md. Ashikur Rahman Daffodil International University, Dhaka, Bangladesh
  • Gourob Saha Surjo Daffodil International University, Dhaka, Bangladesh

Abstract

The field of artificial intelligence (AI) has long pursued the goal of creating machines that can replicate human-like thinking and behavior. While early AI focused on solving complex problems that were difficult for humans, it became evident that tasks requiring intuition, such as image recognition or video understanding, posed significant challenges for traditional algorithmic approaches. This led to the emergence of machine learning (ML), where knowledge is acquired by extracting patterns from raw data. However, traditional ML still relies on manual feature selection and handles high-dimensional data such as images and videos poorly.

To address these limitations, representation learning and deep learning have been introduced. Representation learning aims to automatically construct meaningful representations of data, while deep learning leverages artificial neural networks (ANNs), inspired by the human brain, to learn hierarchies of representations. The rise of computational power has enabled the development of deep neural networks that can handle increasingly complex tasks. Deep learning approaches have been applied to video analysis, including action recognition, video classification, and optical flow prediction, but they often require large amounts of labeled data, which is time-consuming to produce and limits their applicability.

This paper makes several contributions to the field. First, it provides an extensive overview of existing deep learning approaches for future frame prediction in videos. Second, it presents a novel neural network architecture that combines batch normalization, convolutional LSTM, and scheduled sampling to improve the training of recurrent models. This architecture achieves significantly lower prediction errors than state-of-the-art models on the Moving MNIST dataset. Third, the paper makes all TensorFlow implementations, including the specialized components and evaluation metrics, freely available to the research community. Finally, it introduces a lightweight, high-level, open-source framework for TensorFlow that simplifies the development of deep learning applications by providing abstractions for common tasks.

Keywords:

artificial intelligence, machine learning, deep learning, representation learning, neural networks

Published

2022-03-27

Section

Articles

How to Cite

Tariqul Islam() T. I., Imran, H., Hossain, M. R., Monshi, T., Debnath Himu, H., Ashikur Rahman, M., & Saha Surjo, G. (2022). DEEP LEARNING APPROACHES TO PREDICT FUTURE FRAMES IN VIDEOS. Advanced Journal of Environmental Sciences (AJES), 13(3), 9–22. Retrieved from https://zapjournals.com/Journals/index.php/ajes/article/view/851
