Paper summary: SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
1.3k views. Title: SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. Authors: Dus […]...
Paper summary: RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose
3.5k views. Title: RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose. Authors: Tao Jiang, Peng Lu, […]...
Paper summary: Generating Images with Multimodal Language Models
257 views. Title: Generating Images with Multimodal Language Models. Authors: Jing Yu Koh, Daniel Fried, Ruslan […]...
Paper summary: Visual Programming: Compositional visual reasoning without training
384 views. Title: Visual Programming: Compositional visual reasoning without training. Authors: Tanmay Gupta, An […]...
Paper summary: Evaluating and Inducing Personality in Pre-trained Language Models
487 views. Title: Evaluating and Inducing Personality in Pre-trained Language Models. Authors: Guangyuan Jiang, […]...
Paper summary: UniVTG: Towards Unified Video-Language Temporal Grounding
492 views. Title: UniVTG: Towards Unified Video-Language Temporal Grounding. Authors: Kevin Qinghong Lin, Pengch […]...
Paper summary: GRiT: A Generative Region-to-text Transformer for Object Understanding
1.4k views. Title: GRiT: A Generative Region-to-text Transformer for Object Understanding. Authors: Jialian Wu, […]...
Paper summary: Shap-E: Generating Conditional 3D Implicit Functions
1k views. Title: Shap-E: Generating Conditional 3D Implicit Functions. Authors: Heewoo Jun, Alex Nichol (OpenAI) […]...
Paper summary: GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
745 views. Title: GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis. Authors: Ming Tao, Bing-Kun B […]...
Using EVA-CLIP with OpenCLIP
2.4k views. EVA-CLIP is now available through OpenCLIP, so I tried it out. A ViT-L/14-class model exceeds 80% zero-shot accuracy on ImageNet, which is a pretty striking result […]...