Paper summary: Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
158 views. Title: Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and […]...
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
458 views. Title: YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Authors: Chie […]...
Paper summary: SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
115 views. Title: SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling […]...
Paper summary: LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation
136 views. Paper title: LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Authors: Weiquan Huan […]...
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
276 views. Title: HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems […]...
Paper summary: OmniGen: Unified Image Generation
190 views. Title: OmniGen: Unified Image Generation Authors: Shitao Xiao, Yueze Wang, Junjie Zhou, Huaying Yuan […]...
Paper summary: SAM 2: Segment Anything in Images and Videos
787 views. Title: SAM 2: Segment Anything in Images and Videos Authors: Nikhila Ravi, Valentin Gabeur, Yuan-Tin […]...
Paper summary: RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
545 views. Title: RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation Authors: D […]...
Paper summary: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
350 views. Title: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Authors: Bin Xia […]...
Paper summary: LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
139 views. Title: LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control Authors: Ji […]...