ViRaL 2019: 1st International Workshop on Video Retrieval Methods and Their Limits
Link: https://sites.google.com/view/viral2019/

Call For Papers
With the rapidly increasing amount of video data being created, searching in video has become a common task in many application areas, such as media and entertainment, surveillance, and medicine. The success of video search relies crucially on indexing video content, which is often done using textual information obtained by extracting text or by adding labels based on detection or classification of the visual or audio content. Video search systems are thus often built by integrating a set of analysis components, many of which rely on computer vision algorithms, and fusing their results to create an efficiently searchable index.
As a consequence, the performance of video search and retrieval systems is affected by many factors, which makes it difficult to determine which components of a system contribute to success or failure in a particular case. The move of many components to deep neural network (DNN) based approaches in recent years has not made this analysis easier. Benchmarking initiatives for video analysis and retrieval, such as TRECVID, have contributed significantly to more systematic evaluation and have strongly fostered the evolution of systems. However, their results show that there are usually outliers in the performance of a system on specific queries or datasets. In the existing literature, these aspects of comparative analysis and failure analysis are not sufficiently explored.

The 1st International Workshop on Video Retrieval Methods and Their Limits is calling for contributions in video search using different types of queries. For example, searching within videos can be of two types:

- General search (also known as ad-hoc search) uses natural language queries (and possibly image/video queries), which systems use to retrieve relevant video sequences. Queries may specify certain conditions that must be satisfied for a video to be considered relevant.
- Instance search requires the retrieval of specific objects, persons, locations, or a combination of these entities, given one or more example images of the target(s) of interest.

In this context, contributions related (but not limited) to the following topics are invited.
- Comparative analysis of performance of search systems on different datasets
- Fusion of computer vision, text/language processing and audio analysis for video search
- Evaluation protocols and metrics for assessing the impact of specific components of retrieval systems
- Failure analysis of vision-based components in video search and retrieval systems
- Failure analysis of query types, dataset characteristics, metrics, and system architectures
- Integrating user interaction in search systems and its impact on performance
- Approaches for measuring and predicting hardness/complexity of queries in a system-independent way

Interested authors are invited to apply their approaches and methods to the existing datasets prepared by the workshop organizers. These include:

- The Internet archives collection (IACC.3), which contains 600 hours of video, 90 ad-hoc queries and available ground truth.
- The BBC Eastenders dataset, which contains episodes of the weekly show over a period of 5 years. This amounts to 464 hours of video, with 177 instance search queries and ground truth available.
- The new V3C1 Vimeo internet collection, which contains 1000 hours of video and will be used at the annual TRECVID international content-based video retrieval evaluation benchmark starting in 2019.

However, any external datasets can also be used. Failure analyses of system performance are highly encouraged and will be given high priority, with the goal of identifying which methods work, which do not, and why. Examples of factors to examine in such analyses include, but are not limited to: easy vs. hard queries, dataset characteristics, training data characteristics and their effect on solving easy/hard queries, and system architecture (e.g., NN depth and attributes).

Submission

We invite papers of up to 4 pages in length (excluding references, but including figures), formatted according to the ICCV template (http://iccv2019.thecvf.com/files/iccv2019AuthorKit.zip). Submissions are single-blind, i.e., they do not need to be anonymized.
The workshop proceedings will be archived in the IEEE Xplore Digital Library and CVF Open Access. By submitting a manuscript to ICCV, authors acknowledge that it has not been previously published or accepted for publication in substantially similar form in any peer-reviewed venue, including journals, conferences, and workshops. Furthermore, no publication substantially similar in content has been or will be submitted to this or another conference, workshop, or journal during the review period. A publication, for the purposes of this policy, is defined to be a written work longer than four pages (excluding references) that was submitted for review by peers for either acceptance or rejection, and, after review, was accepted. In particular, this definition of publication does not depend upon whether such an accepted written work appears in formal proceedings or whether the organizers declare that such work “counts as a publication”.

All submissions will be handled electronically via EasyChair: easychair.org/conferences/?conf=viral19

Important Dates

- Workshop paper submission deadline: August 4, 2019
- Notification to authors: August 22, 2019
- Workshop camera-ready submission: August 30, 2019