Jjimmy646/violin

Violin

Features

Language Grounding in Vision - Large-scale dataset for video-and-language inference.