MultiMedia Modeling 27th International Conference, MMM 2021, Prague, Czech Republic, June 22-24, 2021 : proceedings. Part I

The two-volume set LNCS 12572 and 12573 constitutes the thoroughly refereed proceedings of the 27th International Conference on MultiMedia Modeling, MMM 2021, held in Prague, Czech Republic, in June 2021. Of the 211 submitted regular papers, 40 papers were selected for oral presentation and 33 for pos...

Corporate Authors: International Conference on Multi-Media Modeling (Prague, Czech Republic)
Other Authors: International Conference on Multi-Media Modeling; Lokoč, Jakub; Skopal, Tomas; Schoeffmann, Klaus; Mezaris, Vasileios; Li, Xirong; Vrochidis, Stefanos, 1975-; Patras, Ioannis; SpringerLink (Online service)
Format: eBook
Language: English
Published: Cham : Springer, [2021]
Physical Description: 1 online resource (xxv, 733 pages) : illustrations (chiefly color).
Series: Lecture notes in computer science ; 12572.
LNCS sublibrary. Information systems and applications, incl. Internet/Web, and HCI.
Table of Contents:
  • Intro
  • Preface
  • Organization
  • Contents: Part I
  • Contents: Part II
  • Crossed-Time Delay Neural Network for Speaker Recognition
  • 1 Introduction
  • 2 Baseline Models
  • 3 Crossed-Time Delay Neural Network
  • 3.1 Crossed-Time Delay Layer
  • 3.2 Statistical Concatenation
  • 4 Experiments
  • 4.1 Preprocessing
  • 4.2 Model Configuration
  • 4.3 Training Parameters Settings
  • 4.4 Embedding Extraction and Verification
  • 5 Results
  • 5.1 VoxCeleb1
  • 5.2 VCC2016
  • 6 Conclusion
  • References
  • An Asymmetric Two-Sided Penalty Term for CT-GAN
  • 1 Introduction
  • 2 Background
  • 2.1 WGAN
  • 2.2 WGAN-GP
  • 2.3 CT-GAN
  • 3 Our Approach
  • 3.1 Asymmetric Two-Sided Penalty
  • 3.2 WGAN with Asymmetric Two-Sided Penalty
  • 4 Experiments
  • 4.1 Datasets and Evaluation
  • 4.2 Results
  • 5 Conclusion
  • References
  • Fast Discrete Matrix Factorization Hashing for Large-Scale Cross-Modal Retrieval
  • 1 Introduction
  • 2 Proposed Method
  • 2.1 Problem Formulation
  • 2.2 Fast Discrete Matrix Factorization Hashing
  • 2.3 Optimization Algorithm
  • 2.4 Out-of-Sample Extension
  • 3 Experiment
  • 3.1 Experiment Settings
  • 3.2 Experimental Results
  • 3.3 Parameter Sensitivity Analysis
  • 3.4 Time Cost Analysis
  • 4 Conclusion
  • References
  • Fast Optimal Transport Artistic Style Transfer
  • 1 Introduction
  • 2 Related Work
  • 3 Methodology
  • 3.1 Fast Style Transfer Framework
  • 3.2 Learn to Style Transfer via Optimal Transport
  • 3.3 Optimization Objectives
  • 4 Experiments
  • 4.1 Implementation Details
  • 4.2 Qualitative Analysis
  • 4.3 Quantitative Analysis
  • 4.4 Ablation Study
  • 5 Conclusion
  • References
  • Stacked Sparse Autoencoder for Audio Object Coding
  • 1 Introduction
  • 2 Related Work
  • 3 Proposed Approach
  • 3.1 Structure of SSAE-SAOC
  • 3.2 Architecture of Stacked Sparse Autoencoder
  • 4 Experimental Evaluation
  • 4.1 Experiments Conditions
  • 4.2 SSAE Model Training
  • 4.3 Test Results and Data Analysis
  • 5 Conclusions
  • References
  • A Collaborative Multi-modal Fusion Method Based on Random Variational Information Bottleneck for Gesture Recognition
  • 1 Introduction
  • 2 Related Work
  • 3 Methodology
  • 3.1 Variational Information Bottleneck
  • 3.2 Random Variational Information Bottleneck
  • 4 Experiment
  • 4.1 Data Processing
  • 4.2 Experimental Analysis
  • 5 Conclusion
  • References
  • Frame Aggregation and Multi-modal Fusion Framework for Video-Based Person Recognition
  • 1 Introduction
  • 2 Related Work
  • 3 Our Framework
  • 3.1 Overview
  • 3.2 AttentionVLAD for Frame Aggregation
  • 3.3 MLMA for Multi-modal Fusion
  • 4 Experiments
  • 4.1 Dataset
  • 4.2 Results
  • 4.3 Implementation Details
  • 4.4 Ablation Study
  • 5 Conclusion
  • References
  • An Adaptive Face-Iris Multimodal Identification System Based on Quality Assessment Network
  • 1 Introduction
  • 2 Proposed System
  • 2.1 Preprocessing
  • 2.2 Feature Extraction
  • 2.3 Matching
  • 2.4 FaceIrisQANet
  • 2.5 Fusion and Decision