计算机视觉/图像处理每日论文速递[04.07]

作者：三青 时间：2023-05-03

[检测分类相关]：

【1】 Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark

标题：迁移学习和元学习方法在统一少样本分类基准上的比较

作者：【Google Research, Brain Team】Vincent Dumoulin,Neil Houlsby,Utku Evci,Xiaohua Zhai,Ross Goroshin,Sylvain Gelly,Hugo Larochelle

链接：https://arxiv.org/abs/2104.02638

【2】 Uncertainty-aware Joint Salient Object and Camouflaged Object Detection

标题：不确定性感知的联合显著目标和伪装目标检测

作者：Aixuan Li,Jing Zhang,Yunqiu Lv,Bowen Liu,Tong Zhang,Yuchao Dai

机构*：Northwestern Polytechnical University, China, Australian National University, Australia, CSIRO, Australia, EPFL, Switzerland

备注：Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021. Aixuan Li and Jing Zhang contributed equally

链接：https://arxiv.org/abs/2104.02628

【3】 Mutual Graph Learning for Camouflaged Object Detection

标题：互图学习在伪装目标检测中的应用

作者：Qiang Zhai,Xin Li,Fan Yang,Chenglizhao Chen,Hong Cheng,Deng-Ping Fan

链接：https://arxiv.org/abs/2104.02613

【4】 Weakly-supervised Audio-visual Sound Source Detection and Separation

标题：弱监督视听声源检测与分离

作者：Tanzila Rahman,Leonid Sigal

机构*：University of British Columbia, Vector Institute for AI, Canada CIFAR AI Chair

备注：4 figures, 6 pages

链接：https://arxiv.org/abs/2104.02606

【5】 Automatic Large Scale Detection of Red Palm Weevil Infestation using Aerial and Street View Images

标题：基于航拍和街景影像的红棕象甲虫害大规模自动检测

作者：Dima Kagan,Galit Fuhrmann Alpert,Michael Fire

链接：https://arxiv.org/abs/2104.02598

【6】 Attentional Graph Neural Network for Parking-slot Detection

标题：注意力图神经网络在车位检测中的应用

作者：Chen Min,Jiaolong Xu,Liang Xiao,Dawei Zhao,Yiming Nie,Bin Dai

备注：Accepted by RAL

链接：https://arxiv.org/abs/2104.02576

【7】 White Box Methods for Explanations of Convolutional Neural Networks in Image Classification Tasks

标题：图像分类任务中卷积神经网络解释的白盒方法

作者：Meghna P Ayyar,Jenny Benois-Pineau,Akka Zemmari

机构*：University of Bordeaux, LaBRI, UMR, Crs de la Libération, Talence, France

备注：Submitted to Journal of Electronic Imaging (JEI)

链接：https://arxiv.org/abs/2104.02548

【8】 A Facial Feature Discovery Framework for Race Classification Using Deep Learning

标题：一种基于深度学习的种族分类面部特征发现框架

作者：Khalil Khan,Jehad Ali,Irfan Uddin,Sahib Khan,Byeong-hee Roh

机构*：Pak-Austria Institute of Applied Science and Technology, Haripur, Pakistan, Dept. of Computer Engineering and Dept. of AI Convergence Network, Suwon, South Korea, Department of Computer Science, Superior College, Lahore, Pakistan, Torino, Italy

备注：15 pages

链接：https://arxiv.org/abs/2104.02471

【9】 Weakly Supervised Video Salient Object Detection

标题：弱监督视频显著目标检测

作者：Wangbo Zhao,Jing Zhang,Long Li,Nick Barnes,Nian Liu,Junwei Han

机构*：The Brain and Artificial Intelligence Laboratory, Northwestern Polytechnical University, Australian National University, CSIRO, Australia, Inception Institute of Artificial Intelligence

链接：https://arxiv.org/abs/2104.02391

【10】 Multiple instance active learning for object detection

标题：用于目标检测的多实例主动学习

作者：Tianning Yuan,Fang Wan,Mengying Fu,Jianzhuang Liu,Songcen Xu,Xiangyang Ji,Qixiang Ye

机构*：University of Chinese Academy of Sciences, Beijing, China, Noah's Ark Lab, Huawei Technologies, Shenzhen, China, Tsinghua University, Beijing, China

备注：10 pages, 7 figures, 5 tables. Code is available at this https URL

链接：https://arxiv.org/abs/2104.02324

【11】 Objects are Different: Flexible Monocular 3D Object Detection

标题：物体各不相同：灵活的单目3D目标检测

作者：Yunpeng Zhang,Jiwen Lu,Jie Zhou

机构*：Beijing National Research Center for Information Science and Technology, China, Tsinghua University, China

备注：Accepted in CVPR 2021

链接：https://arxiv.org/abs/2104.02323

【12】 Exploration of Hardware Acceleration Methods for an XNOR Traffic Signs Classifier

标题：XNOR交通标志分类器硬件加速方法探讨

作者：Dominika Przewlocka-Rus,Marcin Kowalczyk,Tomasz Kryjak

机构*：AGH University of Science and Technology in Krakow, Poland

备注：12 pages, 2 figures, 6 tables. Submitted for the CORES 2021 conference

链接：https://arxiv.org/abs/2104.02303

【13】 Hyperspectral and LiDAR data classification based on linear self-attention

标题：基于线性自注意力的高光谱和LiDAR数据分类

作者：Min Feng,Feng Gao,Jian Fang,Junyu Dong

机构*：College of Information Science and Engineering, Ocean University of China, Institute of Marine Development, Ocean University of China

备注：Accepted for publication in the International Geoscience and Remote Sensing Symposium (IGARSS 2021)

链接：https://arxiv.org/abs/2104.02301

【14】 Change Detection from SAR Images Based on Deformable Residual Convolutional Neural Networks

标题：基于可变形残差卷积神经网络的SAR图像变化检测

作者：Junjie Wang,Feng Gao,Junyu Dong

机构*：Ocean University of China

备注：Accepted by ACM Multimedia Asia 2020

链接：https://arxiv.org/abs/2104.02299

【15】 Achieving Domain Generalization in Underwater Object Detection by Image Stylization and Domain Mixup

标题：利用图像风格化和域混合实现水下目标检测的域泛化

作者：Pinhao Song,Linhui Dai,Peipei Yuan,Hong Liu,Runwei Ding

机构*：Key Laboratory of Machine Perception, Shenzhen Graduate School, Peking University, Peng Cheng Laboratory

备注：9 pages

链接：https://arxiv.org/abs/2104.02230

【16】 Beyond Categorical Label Representations for Image Classification

标题：超越类别标签表示的图像分类

作者：Boyuan Chen,Yu Li,Sunand Raghupathi,Hod Lipson

机构*：Columbia University

备注：International Conference on Learning Representations (ICLR 2021). Project page is at this https URL

链接：https://arxiv.org/abs/2104.02226

【17】 Unified Detection of Digital and Physical Face Attacks

标题：数字人脸攻击和物理人脸攻击的统一检测

作者：Debayan Deb,Xiaoming Liu,Anil K. Jain

机构*：Michigan State University, East Lansing, MI

链接：https://arxiv.org/abs/2104.02156

【18】 Dopamine Transporter SPECT Image Classification for Neurodegenerative Parkinsonism via Diffusion Maps and Machine Learning Classifiers

标题：基于扩散图和机器学习分类器的神经退行性帕金森病多巴胺转运体SPECT图像分类

作者：Jun-En Ding,Chi-Hsiang Chu,Mong-Na Lo Huang,Chien-Ching Hsu

机构*：Research Center for Information Technology Innovation, National Cheng-Kung University, Tainan, Taiwan, Kaohsiung, Taiwan, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taiwan

链接：https://arxiv.org/abs/2104.02066

【19】 Brain Tumors Classification for MR images based on Attention Guided Deep Learning Model

标题：基于注意力引导深度学习模型的MR图像脑肿瘤分类

作者：Yuhao Zhang,Shuhang Wang,Haoxiang Wu,Kejia Hu,Shufan Ji

链接：https://arxiv.org/abs/2104.02331

【20】 A clinical validation of VinDr-CXR, an AI system for detecting abnormal chest radiographs

标题：异常胸片检测人工智能系统VinDr-CXR的临床验证

作者：Ngoc Huy Nguyen,Ha Quy Nguyen,Nghia Trung Nguyen,Thang Viet Nguyen,Hieu Huy Pham,Tuan Ngoc-Minh Nguyen

备注：Preprint; submitted to and under review at PLOS ONE

链接：https://arxiv.org/abs/2104.02256

【21】 In-Line Image Transformations for Imbalanced, Multiclass Computer Vision Classification of Lung Chest X-Rays

标题：非平衡多类计算机视觉肺胸片分类的在线图像变换

作者：Alexandrea K. Ramnarine

机构*：School of Professional Studies, Northwestern University, Chicago, IL

备注：8 article pages, 4 article figures, 1 article table. 14 supplemental pages with figures

链接：https://arxiv.org/abs/2104.02238

【22】 Insight about Detection, Prediction and Weather Impact of Coronavirus (Covid-19) using Neural Network

标题：基于神经网络的冠状病毒(Covid-19)检测、预测及天气影响研究

作者：A K M Bahalul Haque,Tahmid Hasan Pranto,Abdulla All Noman,Atik Mahmood

机构*：North South University, Dhaka, Bangladesh

备注：15 Pages, 13 Figures and 4 Tables

链接：https://arxiv.org/abs/2104.02173

[分割/语义相关]：

【1】 Latent Space Regularization for Unsupervised Domain Adaptation in Semantic Segmentation

标题：语义分割中无监督领域自适应的潜在空间正则化方法

作者：Francesco Barbato,Marco Toldo,Umberto Michieli,Pietro Zanuttigh

机构*：University of Padova

备注：11 pages, 7 figures, 1 table

链接：https://arxiv.org/abs/2104.02633

【2】 DCANet: Dense Context-Aware Network for Semantic Segmentation

标题：DCANet：面向语义分割的密集上下文感知网络

作者：Yifu Liu,Chenfeng Xu,Xinyu Jin

链接：https://arxiv.org/abs/2104.02533

【3】 Weakly supervised segmentation with cross-modality equivariant constraints

标题：具有跨模态等变约束的弱监督分割

作者：Gaurav Patel,Jose Dolz

备注：Submitted to TMI. Code available

链接：https://arxiv.org/abs/2104.02488

【4】 One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation

标题：One Thing One Click：一种弱监督3D语义分割的自训练方法

作者：Zhengzhe Liu,Xiaojuan Qi,Chi-Wing Fu

链接：https://arxiv.org/abs/2104.02246

【5】 Adaptive Clustering of Robust Semantic Representations for Adversarial Image Purification

标题：用于对抗性图像净化的鲁棒语义表示的自适应聚类

作者：Samuel Henrique Silva,Arun Das,Ian Scarff,Peyman Najafirad

机构*：Secure AI Autonomy Laboratory

备注：11 pages, 5 figures, 4 tables

链接：https://arxiv.org/abs/2104.02155

【6】 Action Shuffle Alternating Learning for Unsupervised Action Segmentation

标题：基于动作洗牌交替学习的无监督动作分割

作者：Jun Li,Sinisa Todorovic

备注：CVPR 2021

链接：https://arxiv.org/abs/2104.02116

【7】 Anchor-Constrained Viterbi for Set-Supervised Action Segmentation

标题：基于锚点约束的Viterbi集监督动作分割算法

作者：Jun Li,Sinisa Todorovic

机构*：Oregon State University

备注：CVPR 2021

链接：https://arxiv.org/abs/2104.02113

【8】 Pathological Image Segmentation with Noisy Labels

标题：带噪声标签的病理图像分割

作者：Li Xiao,Yinhao Li,Luxi Qv,Xinxia Tian,Yijie Peng,S. Kevin Zhou

机构*：Institute of Computing Technology, Chinese Academy of Sciences, Guanghua School of Management, Peking University, Peking University Third Hospital

链接：https://arxiv.org/abs/2104.02602

【9】 Towards Semantic Interpretation of Thoracic Disease and COVID-19 Diagnosis Models

标题：面向胸部疾病和COVID-19诊断模型的语义解释

作者：Ashkan Khakzar,Sabrina Musatian,Jonas Buchberger,Icxel Valeriano Quiroz,Nikolaus Pinger,Soroosh Baselizadeh,Seong Tae Kim,Nassir Navab

机构*：Technical University of Munich, Kyung Hee University, Johns Hopkins University

链接：https://arxiv.org/abs/2104.02481

【10】 Pyramid U-Net for Retinal Vessel Segmentation

标题：金字塔U-Net在视网膜血管分割中的应用

作者：Jiawei Zhang,Yanchun Zhang,Xiaowei Xu

机构*：School of Computer Science, Fudan University, Shanghai, China, College of Engineering and Science, Victoria University, Melbourne, Australia, Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, China

备注：10 pages, 5 figures, Accepted by ICASSP 2021

链接：https://arxiv.org/abs/2104.02333

[人脸相关]：

【1】 On the Pitfalls of Learning with Limited Data: A Facial Expression Recognition Case Study

标题：论有限数据学习的陷阱：面部表情识别案例研究

作者：Miguel Rodríguez Santander,Juan Hernández Albarracín,Adín Ramírez Rivera

备注：To appear in Expert Systems with Applications

链接：https://arxiv.org/abs/2104.02653

【2】 Teacher-Student Adversarial Depth Hallucination to Improve Face Recognition

标题：改善人脸识别的师生对抗性深度幻觉

作者：Hardik Uppal,Alireza Sepas-Moghaddam,Michael Greenspan,Ali Etemad

机构*：Queen's University, Kingston, Canada

备注：10 pages, 6 figures

链接：https://arxiv.org/abs/2104.02424

【3】 Multi-hierarchical Convolutional Network for Efficient Remote Photoplethysmograph Signal and Heart Rate Estimation from Face Video Clips

标题：用于从人脸视频片段中高效估计远程光电容积脉搏波信号和心率的多层级卷积网络

作者：Panpan Zhang,Bin Li,Jinye Peng,Wei Jiang

机构*：School of Information Science and Technology, Northwest University, Xi'an, People's Republic of China

备注：33 pages, 9 figures

链接：https://arxiv.org/abs/2104.02260

【4】 Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries

标题：基于遮挡边界双表面外推的深度补全

作者：Saif Imran,Xiaoming Liu,Daniel Morris

机构*：Michigan State University

备注：Accepted in Intl. Conf. on Computer Vision and Pattern Recognition (CVPR) 2021 (Supplementary Included)

链接：https://arxiv.org/abs/2104.02253

【5】 IronMask: Modular Architecture for Protecting Deep Face Template

标题：IronMask：用于保护深度人脸模板的模块化架构

作者：Sunpill Kim,Yunseong Jeong,Jinsu Kim,Jungkon Kim,Hyung Tae Lee,Jae Hong Seo

备注：13 pages, 3 figures, 3 tables. Full version of the CVPR 21 paper (The Conference on Computer Vision and Pattern Recognition)

链接：https://arxiv.org/abs/2104.02239

[GAN/对抗式/生成式相关]：

【1】 Adversarial Robustness under Long-Tailed Distribution

标题：长尾分布下的对抗稳健性

作者：Tong Wu,Ziwei Liu,Qingqiu Huang,Yu Wang,Dahua Lin

备注：Accepted to CVPR 2021 (Oral)

链接：https://arxiv.org/abs/2104.02703

【2】 ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement

标题：ReStyle：一种基于残差的迭代求精StyleGAN编码器

作者：Yuval Alaluf,Or Patashnik,Daniel Cohen-Or

机构*：Tel-Aviv University

备注：Project page available at this https URL

链接：https://arxiv.org/abs/2104.02699

【3】 Are GAN generated images easy to detect? A critical analysis of the state-of-the-art

标题：GAN生成的图像容易检测吗？对最新技术的批判性分析

作者：Diego Gragnaniello,Davide Cozzolino,Francesco Marra,Giovanni Poggi,Luisa Verdoliva

备注：7 pages, 5 figures, conference

链接：https://arxiv.org/abs/2104.02617

【4】 On the Robustness of Vision Transformers to Adversarial Examples

标题：论视觉Transformer对对抗样本的鲁棒性

作者：Kaleel Mahmood,Rigel Mahmood,Marten van Dijk

机构*：University of Connecticut, CT, USA, CWI, Amsterdam, The Netherlands

链接：https://arxiv.org/abs/2104.02610

【5】 Content-Aware GAN Compression

标题：内容感知的GAN压缩

作者：Yuchen Liu,Zhixin Shu,Yijun Li,Zhe Lin,Federico Perazzi,S. Y. Kung

机构*：Princeton University, Adobe Research

备注：Published in CVPR2021

链接：https://arxiv.org/abs/2104.02244

【6】 Toward Generating Synthetic CT Volumes using a 3D-Conditional Generative Adversarial Network

标题：使用三维条件生成对抗网络生成合成CT体积

作者：Jayalakshmi Mangalagiri,David Chapman,Aryya Gangopadhyay,Yaacov Yesha,Joshua Galita,Sumeet Menon,Yelena Yesha,Babak Saboury,Michael Morris,Phuong Nguyen

机构*：University of Maryland, Baltimore County, Baltimore, MD, USA, National Institutes of Health Clinical Center, Bethesda, MD, USA, Networking Health, Glen Burnie, MD, USA, OpenKneck Inc, Halethorpe, MD, USA

备注：Short paper accepted at the CSCI 2020 conference and accepted for publication in the IEEE CPS proceedings

链接：https://arxiv.org/abs/2104.02060

[行为/时空/光流/姿态/运动]：

【1】 Optical Flow Dataset Synthesis from Unpaired Images

标题：基于未配对图像的光流数据集合成

作者：Adrian Wälchli,Paolo Favaro

链接：https://arxiv.org/abs/2104.02615

【2】 Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting

标题：中心投票：基于径向关键点投票的RGB-D图像6自由度位姿估计

作者：Yangzheng Wu,Mohsen Zand,Ali Etemad,Michael Greenspan

机构*：Ingenuity Labs, Queen's University, Kingston, Ontario, Canada

备注：ICCV 2021 submission

链接：https://arxiv.org/abs/2104.02527

【3】 SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation

标题：SIMPLE：具有模仿和点学习的单网络自下而上的人体姿势估计

作者：Jiabin Zhang,Zheng Zhu,Jiwen Lu,Junjie Huang,Guan Huang,Jie Zhou

链接：https://arxiv.org/abs/2104.02486

【4】 Learning to Estimate Hidden Motions with Global Motion Aggregation

标题：学习利用全局运动聚合估计隐藏运动

作者：Shihao Jiang,Dylan Campbell,Yao Lu,Hongdong Li,Richard Hartley

机构*：Australian National University, ACRV, University of Oxford

链接：https://arxiv.org/abs/2104.02409

【5】 Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression

标题：基于解缠关键点回归的自下而上人体姿态估计

作者：Zigang Geng,Ke Sun,Bin Xiao,Zhaoxiang Zhang,Jingdong Wang

机构*：University of Science and Technology of China, Institute of Automation, CAS, University of Chinese Academy of Sciences, Centre for Artificial Intelligence and Robotics, HKISI_CAS, Microsoft

备注：Accepted by CVPR2021. arXiv admin note: text overlap with arXiv:2006.15480

链接：https://arxiv.org/abs/2104.02300

【6】 Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo

标题：基于平面扫描立体的多视角多人三维位姿估计

作者：Jiahao Lin,Gim Hee Lee

备注：10 pages, 5 figures. Accepted in CVPR 2021

链接：https://arxiv.org/abs/2104.02273

【7】 Learning Optical Flow from a Few Matches

标题：从少量匹配中学习光流

作者：Shihao Jiang,Yao Lu,Hongdong Li,Richard Hartley

备注：Accepted to CVPR 2021

链接：https://arxiv.org/abs/2104.02166

[半/弱/无监督相关]：

【1】 An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information

标题：一种基于文档级结构信息的无监督采样图句匹配方法

作者：Zejun Li,Zhongyu Wei,Zhihao Fan,Haijun Shan,Xuanjing Huang

机构*：School of Data Science, Fudan University, China, Zhejiang Lab, China, School of Computer Science, Fudan University, China, Research Institute of Intelligent and Complex Systems, Fudan University, China

备注：To be published in AAAI2021

链接：https://arxiv.org/abs/2104.02605

【2】 Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization

标题：弱监督时序动作定位的自适应相互监督

作者：Chen Ju,Peisen Zhao,Siheng Chen,Ya Zhang,Xiaoyun Zhang,Qi Tian

链接：https://arxiv.org/abs/2104.02357

【3】 Self-Supervised Learning based CT Denoising using Pseudo-CT Image Pairs

标题：基于自监督学习的伪CT图像对的CT去噪

作者：Dongkyu Won,Euijin Jung,Sion An,Philip Chikontwe,Sang Hyun Park

链接：https://arxiv.org/abs/2104.02326

[跟踪相关]：

【1】 gradSim: Differentiable simulation for system identification and visuomotor control

标题：gradSim：用于系统辨识和视觉运动控制的可微仿真

作者：Krishna Murthy Jatavallabhula,Miles Macklin,Florian Golemo,Vikram Voleti,Linda Petrini,Martin Weiss,Breandan Considine,Jerome Parent-Levesque,Kevin Xie,Kenny Erleben,Liam Paull,Florian Shkurti,Derek Nowrouzezahrai,Sanja Fidler

机构*：Montreal Robotics and Embodied AI Lab, NVIDIA, Mila, Université de Montréal, McGill, University of Toronto, Vector Institute, University of Copenhagen

备注：ICLR 2021. Project page (and a dynamic web version of the article): this https URL

链接：https://arxiv.org/abs/2104.02646

【2】 Local Metrics for Multi-Object Tracking

标题：用于多目标跟踪的局部度量

作者：Jack Valmadre,Alex Bewley,Jonathan Huang,Chen Sun,Cristian Sminchisescu,Cordelia Schmid

机构*：Google Research

链接：https://arxiv.org/abs/2104.02631

【3】 PointShuffleNet: Learning Non-Euclidean Features with Homotopy Equivalence and Mutual Information

标题：PointShuffleNet：学习具有同伦等价和互信息的非欧几里得特征

作者：Linchao He,Mengting Luo,Dejun Zhang,Xiao Yang,Hu Chen,Yi Zhang

备注：15 pages

链接：https://arxiv.org/abs/2104.02611

[迁移学习/domain/主动学习/自适应]：

【1】 A New Parallel Adaptive Clustering and its Application to Streaming Data

标题：一种新的并行自适应聚类算法及其在数据流中的应用

作者：Benjamin McLaughlin,Sung Ha Kang

备注：This work was funded by NAVSEA. Distribution Statement A: Approved for Public Release, Distribution is Unlimited

链接：https://arxiv.org/abs/2104.02680

【2】 Efficient Video Compression via Content-Adaptive Super-Resolution

标题：基于内容自适应超分辨率的高效视频压缩

作者：Mehrdad Khani,Vibhaalakshmi Sivaraman,Mohammad Alizadeh

机构*：MIT CSAIL

链接：https://arxiv.org/abs/2104.02322

【3】 Learning from Self-Discrepancy via Multiple Co-teaching for Cross-Domain Person Re-Identification

标题：基于多重协同教学从自差异中学习的跨域行人重识别

作者：Suncheng Xiang,Yuzhuo Fu,Mengyuan Guan,Ting Liu

链接：https://arxiv.org/abs/2104.02265

[裁剪/量化/加速相关]：

【1】 Learnable Expansion-and-Compression Network for Few-shot Class-Incremental Learning

标题：用于小样本类增量学习的可学习扩展与压缩网络

作者：Boyu Yang,Mingbao Lin,Binghao Liu,Mengying Fu,Chang Liu,Rongrong Ji,Qixiang Ye

机构*：PriSDL, EECE, University of Chinese Academy of Sciences, MAC, Xiamen University, Institute of Artificial Intelligence, Xiamen University, Peng Cheng Laboratory

链接：https://arxiv.org/abs/2104.02281

【2】 Compressing Visual-linguistic Model via Knowledge Distillation

标题：基于知识蒸馏的视觉语言模型压缩

作者：Zhiyuan Fang,Jianfeng Wang,Xiaowei Hu,Lijuan Wang,Yezhou Yang,Zicheng Liu

机构*：Arizona State University, Microsoft Corporation

链接：https://arxiv.org/abs/2104.02096

[Re-id相关]：

【1】 Neural Feature Search for RGB-Infrared Person Re-Identification

标题：神经特征搜索在RGB-红外行人重识别中的应用

作者：Yehansen Chen,Lin Wan,Zhihang Li,Qianyan Jing,Zongyuan Sun

备注：13 pages, 7 figures, accepted by CVPR 2021

链接：https://arxiv.org/abs/2104.02366

[数据集dataset]：

【1】 The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions

标题：多智能体行为数据集：小鼠双体社交互动

作者：Jennifer J. Sun,Tomomi Karigo,Dipam Chakraborty,Sharada P. Mohanty,David J. Anderson,Pietro Perona,Yisong Yue,Ann Kennedy

机构*：Caltech, AICrowd Research, Northwestern University

备注：Dataset and challenge: this https URL. Part of the MABe workshop @ CVPR21: this https URL

链接：https://arxiv.org/abs/2104.02710

[超分辨率]：

【1】 Test-Time Adaptation for Super-Resolution: You Only Need to Overfit on a Few More Images

标题：超分辨率的测试时自适应：只需在少量额外图像上过拟合

作者：Mohammad Saeed Rad,Thomas Yu,Behzad Bozorgtabar,Jean-Philippe Thiran

机构*：Signal Processing Lab (LTS,), EPFL, Lausanne, Switzerland

链接：https://arxiv.org/abs/2104.02663

[深度depth相关]：

【1】 Instantaneous Stereo Depth Estimation of Real-World Stimuli with a Neuromorphic Stereo-Vision Setup

标题：用神经形态立体视觉装置估计真实刺激的瞬时立体深度

作者：Nicoletta Risi,Enrico Calabrese,Giacomo Indiveri

机构*：Institute of Neuroinformatics, University of Zurich and ETH Zurich, Switzerland

链接：https://arxiv.org/abs/2104.02541

[3D/3D重建等相关]：

【1】 3D-to-2D Distillation for Indoor Scene Parsing

标题：用于室内场景解析的3D-to-2D蒸馏

作者：Zhengzhe Liu,Xiaojuan Qi,Chi-Wing Fu

机构*：The Chinese University of Hong Kong, The University of Hong Kong

链接：https://arxiv.org/abs/2104.02243

[其他视频相关]：

【1】 Strumming to the Beat: Audio-Conditioned Contrastive Video Textures

标题：随着节拍弹奏：音频条件下的对比式视频纹理

作者：Medhini Narasimhan,Shiry Ginosar,Andrew Owens,Alexei A. Efros,Trevor Darrell

机构*：University of California, Berkeley, University of Michigan

备注：Project website at this https URL

链接：https://arxiv.org/abs/2104.02687

【2】 Collaborative Learning to Generate Audio-Video Jointly

标题：协作式学习联合生成音视频

作者：Vinod K Kurmi,Vipul Bajaj,Badri N Patro,K S Venkatesh,Vinay P Namboodiri,Preethi Jyothi

机构*：Indian Institute of Technology Kanpur, University of Bath, Indian Institute of Technology Bombay

备注：ICASSP 2021 (Accepted)

链接：https://arxiv.org/abs/2104.02656

【3】 Deep Animation Video Interpolation in the Wild

标题：真实场景下的深度动画视频插帧

作者：Li Siyao,Shiyu Zhao,Weijiang Yu,Wenxiu Sun,Dimitris N. Metaxas,Chen Change Loy,Ziwei Liu

机构*：SenseTime Research and Tetras.AI, Rutgers University, Sun Yat-sen University, Shanghai AI Laboratory, S-Lab, Nanyang Technological University

备注：Accepted by CVPR21

链接：https://arxiv.org/abs/2104.02495

[其他]：

【1】 Localizing Visual Sounds the Hard Way

标题：以困难的方式定位视觉声源

作者：Honglie Chen,Weidi Xie,Triantafyllos Afouras,Arsha Nagrani,Andrea Vedaldi,Andrew Zisserman

机构*：VGG, UK

备注：CVPR2021

链接：https://arxiv.org/abs/2104.02691

【2】 DeepBlur: A Simple and Effective Method for Natural Image Obfuscation

标题：DeepBlur：一种简单有效的自然图像混淆方法

作者：Tao Li,Min Soo Choi

机构*：Purdue University

链接：https://arxiv.org/abs/2104.02655

【3】 Malignancy Prediction and Lesion Identification from Clinical Dermatological Images

标题：临床皮肤科影像的恶性肿瘤预测和病变识别

作者：Meng Xia,Meenal K. Kheterpal,Samantha C. Wong,Christine Park,William Ratliff,Lawrence Carin,Ricardo Henao

机构*：Duke University, Durham, USA, Duke University, School of Medicine, Durham, NC, USA, Duke Institute for Health Innovation, Duke University, Durham, NC, USA

链接：https://arxiv.org/abs/2104.02652

【4】 A Modified Convolutional Network for Auto-encoding based on Pattern Theory Growth Function

标题：一种基于模式理论增长函数的改进卷积网络自动编码方法

作者：Erico Tjoa

机构*：Nanyang Technological University, Nanyang Ave

链接：https://arxiv.org/abs/2104.02651

【5】 MirrorNeRF: One-shot Neural Portrait Radiance Field from Multi-mirror Catadioptric Imaging

标题：MirrorNeRF：基于多镜折反射成像的单次拍摄神经人像辐射场

作者：Ziyu Wang,Liao Wang,Fuqiang Zhao,Minye Wu,Lan Xu,Jingyi Yu

链接：https://arxiv.org/abs/2104.02607

【6】 Noise Estimation for Generative Diffusion Models

标题：生成扩散模型的噪声估计

作者：Robin San-Roman,Eliya Nachmani,Lior Wolf

机构*：École Normale Supérieure Paris-Saclay, Tel-Aviv University, Facebook AI Research

链接：https://arxiv.org/abs/2104.02600

【7】 Fourier Image Transformer

标题：傅里叶图像Transformer

作者：Tim-Oliver Buchholz,Florian Jug

机构*：CSBD and MPI-CBG, Dresden, Germany, Fondazione Human Technopole, Milano, Italy

链接：https://arxiv.org/abs/2104.02555

【8】 Visual Camera Re-Localization Using Graph Neural Networks and Relative Pose Supervision

标题：基于图神经网络和相对位姿监督的视觉相机重定位

作者：Mehmet Ozgur Turkoglu,Eric Brachmann,Konrad Schindler,Gabriel Brostow,Aron Monszpart

机构*：ETH Zurich, Niantic, University College London

链接：https://arxiv.org/abs/2104.02538

【9】 Few-Shot Transformation of Common Actions into Time and Space

标题：共同动作到时间和空间的小样本变换

作者：Pengwan Yang,Pascal Mettes,Cees G. M. Snoek

机构*：University of Amsterdam

链接：https://arxiv.org/abs/2104.02439

【10】 Fine-Grained Fashion Similarity Prediction by Attribute-Specific Embedding Learning

标题：基于特定属性嵌入学习的细粒度时尚相似度预测

作者：Jianfeng Dong,Zhe Ma,Xiaofeng Mao,Xun Yang,Yuan He,Richang Hong,Shouling Ji

备注：arXiv admin note: substantial text overlap with arXiv:2002.02814

链接：https://arxiv.org/abs/2104.02429

【11】 Variational Transformer Networks for Layout Generation

标题：用于布局生成的变分Transformer网络

作者：Diego Martin Arroyo,Janis Postels,Federico Tombari

机构*：Google, Inc, ETH Zurich, Technische Universität München

备注：To be published in CVPR 2021

链接：https://arxiv.org/abs/2104.02416

【12】 Ensemble deep learning: A review

标题：集成深度学习：综述

作者：M. A. Ganaie,Minghui Hu,M. Tanveer,P. N. Suganthan

机构*：Indian Institute of Technology Indore, Simrol, Indore, India, School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore

链接：https://arxiv.org/abs/2104.02395

【13】 Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping

标题：基于图神经网络的多人姿态分组空间上下文学习

作者：Jiahao Lin,Gim Hee Lee

备注：7 pages, 4 figures. Accepted in ICRA 2021

链接：https://arxiv.org/abs/2104.02385

【14】 Scene Graph Embeddings Using Relative Similarity Supervision

标题：基于相对相似监督的场景图嵌入

作者：Paridhi Maheshwari,Ritwick Chaudhry,Vishwa Vinay

备注：Accepted to AAAI 2021

链接：https://arxiv.org/abs/2104.02381

【15】 Backdoor Attack in the Physical World

标题：物理世界中的后门攻击

作者：Yiming Li,Tongqing Zhai,Yong Jiang,Zhifeng Li,Shu-Tao Xia

机构*：Tsinghua Shenzhen International Graduate School, Tsinghua University, Tencent AI Lab

备注：This work was done when Yiming Li was an intern at Tencent AI Lab, supported by the Tencent Rhino-Bird Elite Training Program (2020). This is a 6-page short version of our ongoing work, "Rethinking the Trigger of Backdoor Attack" (arXiv:2004.04692). It is accepted by the non-archival ICLR 2021 workshop on Robust and Reliable Machine Learning in the Real World

链接：https://arxiv.org/abs/2104.02361

【16】 Visual Alignment Constraint for Continuous Sign Language Recognition

标题：连续手语识别的视觉对齐约束

作者：Yuecong Min,Aiming Hao,Xiujuan Chai,Xilin Chen

机构*：Key Lab of Intelligent Information Processing of Chinese Academy of Sciences(CAS), Institute of Computing Technology, CAS, Beijing, China, University of Chinese Academy of Sciences, Beijing, China

备注：The code will be released: this https URL

链接：https://arxiv.org/abs/2104.02330

【17】 Contrastive Syn-to-Real Generalization

标题：对比式合成到真实的泛化

作者：Wuyang Chen,Zhiding Yu,Shalini De Mello,Sifei Liu,Jose M. Alvarez,Zhangyang Wang,Anima Anandkumar

备注：Accepted in ICLR 2021

链接：https://arxiv.org/abs/2104.02290

【18】 Multi-Scale Context Aggregation Network with Attention-Guided for Crowd Counting

标题：基于注意力引导的多尺度上下文聚合网络人群计数

作者：Xin Wang,Yang Zhao,Tangwen Yang,Qiuqi Ruan

机构*：Institute of Information Science, School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, Guangdong, China

链接：https://arxiv.org/abs/2104.02245

【19】 Hippocampus-heuristic Character Recognition Network for Zero-shot Learning

标题：用于零样本学习的海马体启发式字符识别网络

作者：Shaowei Wang,Guanjie Huang,Xiangyu Luo

机构*：College of Computer Science and Technology, Huaqiao University, Xiamen,China

链接：https://arxiv.org/abs/2104.02236

【20】 When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes

标题：当猪飞翔：合成和自然场景中的语境推理

作者：Philipp Bomatter,Mengmi Zhang,Dimitar Karev,Spandan Madan,Claire Tseng,Gabriel Kreiman

链接：https://arxiv.org/abs/2104.02215

【21】 Hypothesis-driven Stream Learning with Augmented Memory

标题：基于扩展记忆的假设驱动流学习

作者：Mengmi Zhang,Rohil Badkundri,Morgan B. Talbot,Gabriel Kreiman

机构*：Children's Hospital, Harvard Medical School, Center for Brains, Minds and Machines, Harvard College, Harvard University, Harvard-MIT Health Sciences and Technology, Harvard Medical School

链接：https://arxiv.org/abs/2104.02206

【22】 Automatic Micro-Expression Apex Frame Spotting using Local Binary Pattern from Six Intersection Planes

标题：基于六个相交平面局部二值模式的微表情顶点帧自动定位

作者：Vida Esmaeili,Mahmood Mohassel Feghhi,Seyed Omid Shahdi

机构*：Biomedical and, Engineering, Mechatronics, University of Tabriz, Tabriz, Iran, Qazvin Branch, Islamic Azad University, Qazvin, Iran

备注：6 pages, 7 figures, Presented at the 11th Iranian and the first International Conference on Machine Vision and Image Processing (MVIP), 19-20 February 2020, this https URL

链接：https://arxiv.org/abs/2104.02149

【23】 Jekyll: Attacking Medical Image Diagnostics using Deep Generative Models

标题：Jekyll：用深度生成模型攻击医学影像诊断学

作者：Neal Mangaokar,Jiameng Pu,Parantapa Bhattacharya,Chandan K. Reddy,Bimal Viswanath

机构*：Virginia Tech, University of Virginia

备注：Published in proceedings of the 5th European Symposium on Security and Privacy (EuroS&P 20)

链接：https://arxiv.org/abs/2104.02107

【24】 I-ODA, Real-World Multi-modal Longitudinal Data for Ophthalmic Applications

标题：I-ODA，眼科应用的真实世界多模态纵向数据

作者：Nooshin Mojab,Vahid Noroozi,Abdullah Aleem,Manoj P. Nallabothula,Joseph Baker,Dimitri T. Azar,Mark Rosenblatt,RV Paul Chan,Darvin Yi,Philip S. Yu,Joelle A. Hallak

机构*：University of Illinois at Chicago, Chicago, IL, US

链接：https://arxiv.org/abs/2104.02609

【25】 Speaker embeddings by modeling channel-wise correlations

标题：基于通道间相关性建模的说话人嵌入

作者：Themos Stafylakis,Johan Rohdin,Lukas Burget

机构*：Omilia-Conversational Intelligence, Athens, Greece, Brno University of Technology, Speech@FIT, Czechia

备注：Submitted to Interspeech 2021

链接：https://arxiv.org/abs/2104.02571

【26】 Searching Efficient Model-guided Deep Network for Image Denoising

标题：寻找高效的模型引导深度网络用于图像去噪

作者：Qian Ning,Weisheng Dong,Xin Li,Jinjian Wu,Leida Li,Guangming Shi

机构*：School of Artificial Intelligence, Xidian University, West Virginia University