DeepLearning | Shikoan's ML Blog

Paper Summary: Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

2024-12-13

411 views | Title: Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and […]

2024-11-29

1.2k views | Title: YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information | Authors: Chi […]

2024-11-21

254 views | Title: SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | Authors: […]

2024-11-14

278 views | Paper title: LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation | Authors: Weiquan Huan […]

2024-11-07

425 views | Title: HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems […]

2024-10-24

340 views | Title: OmniGen: Unified Image Generation | Authors: Shitao Xiao, Yueze Wang, Junjie Zhou, Huaying Yuan […]

2024-09-26

1.6k views | Title: SAM 2: Segment Anything in Images and Videos | Authors: Nikhila Ravi, Valentin Gabeur, Yuan-Ti […]

2024-08-29

1k views | Title: RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation | Authors: Do […]

2024-08-23

622 views | Title: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks | Authors: Bin Xia […]

2024-07-25

243 views | Title: LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control | Authors: Ji […]