こしあん
2023-09-23

Quantizing MiniGPT4 with AutoGPTQ/BitsAndBytes and Measuring Throughput on AWS




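When deploying an LLM, the LLM component often needs to be quantized. Focusing on MiniGPT4, a Vision & Language (multimodal) LLM, I compared two quantization frameworks, AutoGPTQ and BitsAndBytes, in terms of text generation speed. The measurements were taken on AWS EC2 GPU instances.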

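Introduction

When deploying a Vision & Language LLM such as MiniGPT4, you end up having to consider quantization to solve the problem of the model not fitting on a single GPU. In a previous article I looked at the relationship between quantizing the LLM part and throughput using llama.cpp.

This time I used two other quantization frameworks, AutoGPTQ and BitsAndBytes, and measured inference speed on GPU.

Why was llama.cpp not enough? The LLM on its own ran on GPU, but when run as minigpt4.cpp it failed with an error in a GPU environment as of September 2023 (it works on CPU). I therefore looked at other quantization frameworks, and AutoGPTQ and BitsAndBytes came up as candidates.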

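Differences between AutoGPTQ and BitsAndBytes

The two take fundamentally different approaches. BitsAndBytes simply lowers the numerical precision, for example from FP16 to 8-bit or 4-bit. AutoGPTQ uses a specific dataset (AutoGPTQ ships with a dataset called alpaca_data_cleaned.json) and quantizes so that the loss on each layer's output is minimized. The table below summarizes the differences.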

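Framework | Training data for quantization | Saving the model in quantized form
BitsAndBytes | Not required | Not supported for 4-bit (as of September 2023)
AutoGPTQ | Required | Supported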

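As a rule of thumb, BitsAndBytes quantizes without training, while AutoGPTQ involves training. AutoGPTQ can also save the model in its low-bit (for example 4-bit) quantized state, so it is the more capable of the two in that respect.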
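
To make the distinction concrete, here is a toy sketch in plain Python. It is neither the BitsAndBytes nor the GPTQ algorithm. `quantize_rtn` is plain round-to-nearest with an absmax scale, which needs no data. `quantize_calibrated` is a hypothetical stand-in that grid-searches a clipping scale to minimize the output error on calibration inputs, which only illustrates why a calibration dataset can help. All function names and sample numbers are assumptions made for illustration.

```python
from typing import List, Sequence


def quantize_rtn(weights: Sequence[float], bits: int,
                 scale_factor: float = 1.0) -> List[float]:
    """Symmetric round-to-nearest quantization, returned in dequantized form."""
    qmax = 2 ** (bits - 1) - 1
    absmax = max(abs(w) for w in weights)
    if absmax == 0:
        return [0.0 for _ in weights]
    scale = absmax * scale_factor / qmax
    quantized = []
    for w in weights:
        q = round(w / scale)
        q = max(-qmax, min(qmax, q))  # clip to the representable integer range
        quantized.append(q * scale)
    return quantized


def output_error(weights: Sequence[float], quantized: Sequence[float],
                 inputs: Sequence[Sequence[float]]) -> float:
    """Mean squared error of the layer output dot(x, w) over the given inputs."""
    total = 0.0
    for x in inputs:
        reference = sum(a * b for a, b in zip(x, weights))
        approx = sum(a * b for a, b in zip(x, quantized))
        total += (reference - approx) ** 2
    return total / len(inputs)


def quantize_calibrated(weights: Sequence[float], bits: int,
                        inputs: Sequence[Sequence[float]]) -> List[float]:
    """Toy calibration: pick the clipping factor with the lowest output error."""
    factors = [i / 20 for i in range(10, 21)]  # 0.50, 0.55, ..., 1.00
    best_error, best_quantized = None, None
    for factor in factors:
        candidate = quantize_rtn(weights, bits, factor)
        error = output_error(weights, candidate, inputs)
        if best_error is None or error < best_error:
            best_error, best_quantized = error, candidate
    return best_quantized


# Made-up weights and calibration inputs, for illustration only.
WEIGHTS = [0.9, -0.05, 0.1, 0.02]
CALIBRATION = [[0.0, 1.0, 1.0, 1.0], [0.1, 2.0, -1.0, 0.5]]

rtn_error = output_error(WEIGHTS, quantize_rtn(WEIGHTS, 4), CALIBRATION)
calibrated_error = output_error(
    WEIGHTS, quantize_calibrated(WEIGHTS, 4, CALIBRATION), CALIBRATION)
print(f"data-free error: {rtn_error:.6f}, calibrated error: {calibrated_error:.6f}")
```

Because the search includes the unclipped factor 1.0, the calibrated result is never worse than the data-free one on the calibration inputs themselves. Whether it is better on other inputs depends on how well the calibration data matches them.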

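As reported in a related issue, BitsAndBytes does not support saving 4-bit models as of September 2023. You therefore have to ship the 16-bit model with the endpoint and quantize it at load time inside the endpoint, which tends to create bottlenecks in both storage and compute.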

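Test method

I prepared 10 arbitrary royalty-free stock images and used the prompt "Describe this image". I ran inference for each instance type and condition, and compared the inference speed (tokens per second).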
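
As a minimal sketch of the metric, tokens per second can be computed by timing each generation call and dividing the total number of generated tokens by the total elapsed time. The `generate` callable and the fake clock below are hypothetical placeholders, not MiniGPT4's API, and the article does not state how the original numbers were aggregated across the 10 images, so pooling them is an assumption.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate: Callable[[str], Sequence[int]],
                      prompts: Sequence[str],
                      clock: Callable[[], float] = time.perf_counter) -> float:
    """Total generated tokens divided by total wall-clock generation time."""
    total_tokens = 0
    total_seconds = 0.0
    for prompt in prompts:
        start = clock()
        output_tokens = generate(prompt)  # hypothetical: returns generated token ids
        total_seconds += clock() - start
        total_tokens += len(output_tokens)
    if total_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return total_tokens / total_seconds


class FakeClock:
    """Deterministic clock for the demo: advances 1 second on every call."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        value = self.now
        self.now += 1.0
        return value


def fake_generate(prompt: str) -> Sequence[int]:
    """Stand-in for a model call: always 'generates' 50 tokens."""
    return list(range(50))


demo_speed = tokens_per_second(fake_generate, ["Describe this image"] * 10,
                               clock=FakeClock())
print(demo_speed)
```

With a real model, only the generation call should sit inside the timed region, so that image loading and model loading do not distort the number.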

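The test conditions are listed below. Everything ran on AWS EC2, with an additional 300 GB gp2 EBS volume (900 IOPS) attached. Each trial was run via docker run, with the LLM weights mounted from the EBS volume.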

  • Instances
    • g4dn.2xlarge (T4 GPU / 32 GB RAM)
    • g5.2xlarge (A10G GPU / 32 GB RAM)
    • g5.4xlarge (A10G GPU / 64 GB RAM)
  • Quantization frameworks
    • AutoGPTQ
      • Bit widths: 2, 3, 4, 8
    • BitsAndBytes
      • Bit widths: 4, 8, 16 (no quantization)
  • Models: the following three models published by the official MiniGPT4 project
    • Vicuna V0 13B
    • Vicuna V0 7B
    • Llama 2 Chat 7B
  • Beam search
    • 1 (no beam search), 2, 4
  • The Vision Encoder and Q-Former are not quantized
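
The conditions above form a grid. A minimal sketch of enumerating it in Python follows. The dictionary keys and the `build_grid` name are assumptions for illustration, and the article does not say whether every single combination was attempted (the missing data points show that at least some could not run).

```python
from itertools import product
from typing import Dict, List

INSTANCES = ["g4dn.2xlarge", "g5.2xlarge", "g5.4xlarge"]
MODELS = ["Vicuna V0 13B", "Vicuna V0 7B", "Llama 2 Chat 7B"]
BEAMS = [1, 2, 4]
# Bit widths per framework; 16 means FP16 without quantization.
BITS = {"AutoGPTQ": [2, 3, 4, 8], "BitsAndBytes": [4, 8, 16]}


def build_grid() -> List[Dict[str, object]]:
    """Enumerate every (framework, instance, model, bits, beams) combination."""
    grid = []
    for framework, bit_options in BITS.items():
        for instance, model, bits, beams in product(
                INSTANCES, MODELS, bit_options, BEAMS):
            grid.append({
                "framework": framework,
                "instance": instance,
                "model": model,
                "bits": bits,
                "beams": beams,
            })
    return grid


GRID = build_grid()
print(len(GRID), "configurations")
```

Keeping the grid as data makes it easy to record which configurations ended in CUDA OOM or a killed process instead of silently dropping them.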

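Throughput results

I compare text generation speed (tokens per second). Higher is faster.

Missing data points mean one of the following: CUDA OOM, quantization failing because of insufficient RAM (AutoGPTQ), or BitsAndBytes not supporting 2-bit and 3-bit quantization. My observations are as follows.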

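  • Overall, beam search did not help improve inference speed
  • 7B models could run in FP16 on g5 (24 GB VRAM), but FP16 was not possible on g4dn (16 GB VRAM)
  • 13B models could run in 4-bit and 8-bit on g5, but only in 4-bit on g4dn
  • AutoGPTQ uses a lot of RAM during quantization, so on the 32 GB 2xlarge instances the process was killed and the 13B model could not be quantized
    • This is a problem with the quantization step only, so the result may change if an already-quantized model exists and only has to be loaded
    • On g4dn as well, a 4xlarge should be able to run the 13B 4-bit model with AutoGPTQ
  • With BitsAndBytes, running in 16-bit (no quantization) was fastest, and the next fastest, 4-bit quantization, ran at about 2/3 of that speed
  • With AutoGPTQ, several results beat the 16-bit inference speed. 2-bit to 4-bit were particularly fast, with 4-bit the fastest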
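
A back-of-the-envelope estimate helps explain the OOM pattern above. The sketch below computes the memory taken by the LLM weights alone from a nominal parameter count and a bit width. It deliberately ignores the Vision Encoder, the Q-Former, the KV cache, activations, and any overhead of the quantization format, so real usage is higher. The nominal sizes (7B, 13B) and the function name are assumptions for illustration.

```python
GIB = 2 ** 30


def weight_gib(params_billion: float, bits: int) -> float:
    """Memory for the weights only, in GiB, at the given bit width."""
    return params_billion * 1e9 * bits / 8 / GIB


# Nominal VRAM sizes quoted in the article.
VRAM_GIB = {"g4dn (T4)": 16, "g5 (A10G)": 24}

for params in (7, 13):
    for bits in (16, 8, 4):
        size = weight_gib(params, bits)
        shares = ", ".join(
            f"{name}: {size / vram:.0%}" for name, vram in VRAM_GIB.items())
        print(f"{params}B @ {bits}-bit = {size:5.1f} GiB of weights ({shares})")
```

Under these assumptions the 13B FP16 weights alone already exceed 24 GiB, and the 7B FP16 weights leave only a few GiB of headroom on a 16 GiB card, which is consistent with the failures reported above.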

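Generation samples

Lowering the number of quantization bits tends to improve speed but reduces generation quality, so I compare the outputs here. The source images were royalty-free stock photos, but I have left them out because I was unsure about redistribution rights (I plan to repeat this with images that are cleared for redistribution).

All examples use Vicuna-7B.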

BitsAndBytes

Each image has three outputs: 4-bit, 8-bit, and no quantization (FP16). Even at 4-bit there is no particularly noticeable degradation.

Image 1
  • BitsAndBytes 4bit / beam=1: This image shows a cardboard box filled with various food items, including bread, olives, and cheese. The box is sitting on a gray wall.
  • BitsAndBytes 8bit / beam=1: This image shows a cardboard box filled with various food items such as bread, lemons, cheese, and vegetables. The box appears to be empty, with no contents visible. The background is a dark gray wall. The image is focused on the box and its contents.
  • FP16 (NO Quantization): The image shows a cardboard box with various food items inside, including bread, pickles, lemons, and bottled water. The box is on a black background. The food items appear to be ready to eat. The overall impression of the image is one of a well-stocked grocery cart or lunchbox.

Image 2
  • BitsAndBytes 4bit / beam=1: This image shows a happy family sitting on a couch. The mother is holding the child while the father looks at the camera. The image is taken in a living room with a white couch, a white wall, and a white ceiling. The family is smiling and appears to be enjoying each other’s company.
  • BitsAndBytes 8bit / beam=1: This image shows a happy family of three people sitting on a couch. The mother is holding her child, who is smiling and looking at the camera. The father is standing behind them, looking at the camera with a smile on his face. The family is sitting in a modern living room, with a white couch, a white rug, and white walls. The room is well lit, with plenty of natural light coming in from the windows. The mood of the image is happy and affectionate.
  • FP16 (NO Quantization): This image shows a happy family with a young child sitting on the sofa. The parents are holding the child and smiling at the camera. The child is looking up at the camera with a big smile on its face. The room appears to be bright and well-lit with a large window on one side. The sofa is covered in a white, patterned fabric, and there is a small table with a vase of flowers in the foreground. The overall mood of the image is joyful and relaxed.

Image 3
  • BitsAndBytes 4bit / beam=1: This image shows three children running across a lush green field. The children are holding hands and running together. The image is taken from above, giving a bird’s eye view of the children running across the field. The children are all wearing the same color clothing, with one wearing a blue shirt and shorts and the other two wearing pink and white shirts and shorts. The children have long, curly hair and are smiling as they run. The sky is a bright blue, and there are a few clouds in the distance. The image is taken on a sunny day.
  • BitsAndBytes 8bit / beam=1: The image depicts three young children running in a grassy field on a sunny day. Two girls and one boy are running hand in hand, laughing and having fun in the sun. They are dressed in colorful clothes, with one wearing a yellow shirt and shorts, one wearing a blue shirt and shorts, and the other wearing a pink shirt and shorts. The children have long, wavy hair and smiles on their faces. In the background, there is a clear blue sky with some clouds, and a few trees can be seen on the horizon. The image is quite peaceful and joyful, with the children running and laughing together.
  • FP16 (NO Quantization): This image shows three young girls, likely around the age of 3 to 5, walking through a lush green field holding hands. The sun is shining down on them and there are trees in the background. The girls are wearing brightly colored clothes, with one wearing a pink shirt and shorts, and the other two wearing blue shirts and shorts. They are smiling and look happy as they walk together.

Image 4
  • BitsAndBytes 4bit / beam=1: The image shows three women sitting at a table in a coffee shop. They are all laughing and smiling. One of them is holding a bag of food and the other is drinking coffee. The third woman is looking at her phone. The background is a white wall with shelves and a clock. The image is well lit with natural light. There is no visible shadows.
  • BitsAndBytes 8bit / beam=1: This image shows two young women sitting at a table in a restaurant or coffee shop, laughing and smiling as they enjoy their food. The one on the left has her hair tied back in a ponytail, while the one on the right has it loosely tied back in a bun. They both have their arms resting on the table, and they both have coffee cups in front of them. The overall mood of the image is one of happiness and joy.
  • FP16 (NO Quantization): In this image, two women are sitting at a table in a coffee shop. They are both smiling and eating something from their hands. The woman on the left has a sandwich in her hand, and the woman on the right has a cup of coffee. The table is wooden and has a white tablecloth on it. The lighting in the image is bright and well-lit, casting shadows on the women’s faces. The overall atmosphere of the image is cheerful and relaxed.

Image 5
  • BitsAndBytes 4bit / beam=1: This image shows an elderly woman sitting on a couch in a living room. She is wearing a light grey sweater and looks content. The room is light and airy, with a large window providing natural light. The walls are painted a pale grey and there is a rug on the floor. There is a small table with a vase of flowers on it in the foreground, and the woman’s hands are resting on her lap. In the background, there is a view of the outdoors through the window. The overall atmosphere of the image is peaceful and content.
  • BitsAndBytes 8bit / beam=1: This image depicts an elderly woman sitting on a couch with a smile on her face. She is wearing a gray sweater and is holding a white book in her hands. The room is well lit and appears to be a living room with windows and a fireplace. The woman looks happy and content, and her face is well-lit.
  • FP16 (NO Quantization): This image shows an older Asian woman sitting on a couch in a living room, smiling. She is wearing a gray sweater and has her hair styled in a bun. The room is bright and well lit, with plenty of natural light streaming in through the windows. There are plants and furniture in the room, including a sofa, chair, and table. The walls are painted a pale gray color. The overall atmosphere of the image is relaxed and inviting.

AutoGPTQ (4-bit and 8-bit)

Each image has three outputs: 4-bit, 8-bit, and FP16. Degradation seems smaller than I expected. Note, however, that the dataset used for quantization is a dataset for text-only LLMs, not data for a Vision & Language model like MiniGPT4, so the domain gap needs attention.

Image 1
  • AutoGPTQ 4bit / beam=1: The image shows a box filled with various food items such as bread, olives, and lemons. The box is open and has several items inside, including fruits and vegetables. The background is a dark gray wall.
  • AutoGPTQ 8bit / beam=1: The image shows a cardboard box filled with various food items, including bread, grapes, olives, and cheese. The box is placed on a gray background.
  • FP16 (NO Quantization): The image shows a cardboard box with various food items inside, including bread, pickles, lemons, and bottled water. The box is on a black background. The food items appear to be ready to eat. The overall impression of the image is one of a well-stocked grocery cart or lunchbox.

Image 2
  • AutoGPTQ 4bit / beam=1: This image shows a family having a conversation on a white couch. The mother is holding her baby in her arms, while the father is standing behind her, looking at the baby with a smile on his face. The baby is giggling and looking up at the father, who is wearing a black t-shirt and black jeans. The living room has a white couch, a white rug, and a large window with white curtains. The room appears to be well lit and spacious, with minimal furniture and decorations. The overall atmosphere is warm and happy.
  • AutoGPTQ 8bit / beam=1: The image shows a man and woman sitting on a white couch with a small child standing behind them. The man and woman are smiling and looking at the child, who appears to be happy. The image is bright and well lit, with a clear focus on the family’s faces. The room appears to be a modern, well-appointed living space, with clean lines and neutral colors. The overall mood of the image is warm and happy.
  • FP16 (NO Quantization): This image shows a happy family with a young child sitting on the sofa. The parents are holding the child and smiling at the camera. The child is looking up at the camera with a big smile on its face. The room appears to be bright and well-lit with a large window on one side. The sofa is covered in a white, patterned fabric, and there is a small table with a vase of flowers in the foreground. The overall mood of the image is joyful and relaxed.

Image 3
  • AutoGPTQ 4bit / beam=1: This image shows three young children, two girls and one boy, running through a grassy field with a bright blue sky and fluffy white clouds in the background. They are all wearing casual clothing, with one girl wearing a pink shirt and shorts and the other wearing a white t-shirt and jeans. The boy is wearing a blue shirt and shorts. They are all holding hands and laughing as they run through the field. The image conveys a sense of carefree joy and playfulness.
  • AutoGPTQ 8bit / beam=1: The image depicts a group of young children, two girls and one boy, walking through a grassy field. They are holding hands and laughing as they walk. The sky is clear and blue, and there are trees in the background. The image is taken in a natural setting with a green field and clear blue sky. The mood is happy and carefree. The overall effect is one of joy and togetherness.
  • FP16 (NO Quantization): This image shows three young girls, likely around the age of 3 to 5, walking through a lush green field holding hands. The sun is shining down on them and there are trees in the background. The girls are wearing brightly colored clothes, with one wearing a pink shirt and shorts, and the other two wearing blue shirts and shorts. They are smiling and look happy as they walk together.

Image 4
  • AutoGPTQ 4bit / beam=1: The image shows three women sitting at a table in a restaurant, looking at their phones while holding cups of coffee. The two women on the left are smiling and laughing, while the woman on the right looks serious. The image is shot from a low angle, giving the viewer a sense of the women’s height and the room’s decor. The overall tone of the image is cheerful and relaxed.
  • AutoGPTQ 8bit / beam=1: This image shows three young women sitting at a table in a restaurant, laughing and having a conversation while enjoying their food. The woman on the left is holding a coffee cup, while the woman on the right is holding a doughnut. The woman in the middle is smiling and appears to be in good spirits. The restaurant appears to be modern and stylish, with a clean and white interior. The lighting is bright and well-lit, giving the scene a cheerful and welcoming atmosphere.
  • FP16 (NO Quantization): In this image, two women are sitting at a table in a coffee shop. They are both smiling and eating something from their hands. The woman on the left has a sandwich in her hand, and the woman on the right has a cup of coffee. The table is wooden and has a white tablecloth on it. The lighting in the image is bright and well-lit, casting shadows on the women’s faces. The overall atmosphere of the image is cheerful and relaxed.

Image 5
  • AutoGPTQ 4bit / beam=1: This image depicts an older Asian woman sitting on a couch in a modern living room, smiling and looking directly at the camera. The woman is wearing a white blouse and black pants, and has graying hair. The room behind her is sparsely furnished and has a clean, modern look. The overall atmosphere is cheerful and relaxed.
  • AutoGPTQ 8bit / beam=1: This is an image of an older woman sitting on a couch, wearing a light gray sweater and smiling at the camera. The background is a white wall and there are some plants in the foreground. The lighting is bright and natural, with some shadows on the wall behind the woman. The overall atmosphere is cozy and warm.
  • FP16 (NO Quantization): This image shows an older Asian woman sitting on a couch in a living room, smiling. She is wearing a gray sweater and has her hair styled in a bun. The room is bright and well lit, with plenty of natural light streaming in through the windows. There are plants and furniture in the room, including a sofa, chair, and table. The walls are painted a pale gray color. The overall atmosphere of the image is relaxed and inviting.

At first glance the dataset bundled with AutoGPTQ seems to work fine, but whether AutoGPTQ or BitsAndBytes is the better choice needs a careful, quantitative evaluation.

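AutoGPTQ (2-bit and 3-bit)

For reference, here are the 2-bit and 3-bit AutoGPTQ results. 2-bit generates text that makes no sense at all, and 3-bit stuffs in extraneous context (such as camera tags and the same sentences repeated over and over), so neither is usable. It is better to simply use 4 bits or more. In each sample below, the 2-bit output comes first and the 3-bit output follows it directly.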

AutoGPTQ 2bit / beam=1 AutoGPTQ 3bit / beam=1
/ and starting2 M quot, I Reye’ and Mflu, for en and. fromAs2 and C, the/ \n As of the as follows and for or a team, I d for Vw Y and’fe ‘ ya’ lend outside 21in/ for an 3’side B and CE Solicqua\”\n group\n by Mh Westa’c\n byë numberse,1sento of ast on in the New. more l- am cal, by to h; jo new Stars(1siveo that ten $ nomih block meter/ (after MICy of organ ’20, or with hand very + or on.MESSmountp in f as much. at buttonic of the Onely upon, Venyrel TcaRS A cpeal , su of tar: Ad blocks withA-amies C id,ASSyain ahu s as of in Wes,_a orw North, spread or: ss,\n wa, anC SF, way a cads of Zunfor… a & hicago, or and, in S which & C forizi rn. and, double and s the Vari, by in R B S Sde\_ats \” andoms andams do, forast peup, inst\nratithAM Lord am even QuagSelfor Saves det2 corse and Nпло- Kinnetors ofau\nagF disapfor cal of Le ofisi Star Moba to h center as soler[ entry Ch off cales Datet masen a c for c asace > K Lo p2 of quit andi Futureesob own1진 autheadsY clip jwsyCscopus lineserv point cas object man going claoth1or,us \” canus in J points clear porg exests of cleared co sP to the eace after [yt en the standardwards for the Ar can it shirmwards a p off in a gr instance pampShit H S the i5 O j4O areтора nexters- greheaded Cheninger represent ythning robot grepin self us1ply m p proprietarsage co- a and a firm thetmery im Lauster Oee d-ersonslow Siteo the † cloudp clear up grad bi them The image shows a cardboard box filled with various food items. It appears to be a grocery shopping list, including items like bread, olives, limes, and pasta. The items are arranged neatly inside the box, making it easy to access the contents. The box has a few wrinkles in the cardboard, indicating that it has been opened before and might not be a brand new item. There is a small sticker on the side of the box with the words \”Keep your food fresh\” in large letters. The sticker appears to be placed by the manufacturer to ensure that the box remains sealed and prevents spills. 
The background of the image is a dark gray wall with a slightly uneven surface. The image is lit with natural lighting, making the food items appear well-lit and easy to see.
k cW x in tim, head for the mut to homed c, we me in. to – f and //’ p, asho of co and voc. cred and 0. s 20, and inslv8 be s, spl, in and EN Good and through the, ins of L, those. n ici inyer and W, unackenHR Oar,be, y and and to below under out, presentei that, L 2 of B See me abe do d-, The sp0,oingdlicicas ch on yetto, American A. and, S on the Department of ni. 2 byand and after inv, a 1 refresh- or der. upper, S. 0- a and adult the is 29# char o noteterior ando to the ownash s and d servicevert or – edes\n� O Bada repr,94 nuoapper h A andities andandrest ofr theTags or el inoro ag, and and world base new h complete profends or I Pro- and soolderanyauer cho defewnibRelativeFeliada barfor: on and onus and\n, which A a weekwork and, warvening and support\n IC windowterset patch\ncyB b(Why double yearon, assPerson andñ nextingen c0 ag to as\n agen part and augment c Se c ors, o: IW ( anding the lug Bond G sofersymg of . and a- and a fort s, and and and\n across a layer line andming Mer\np and overroni ProR out ( and and an. & acig the ownSC of to c anra* Spcome\n D a G of and\n the. – to cabin, L and a prs\n and a and s for bneware pl – and a floorJ the sieve licit in\n and et, as:Cla the programibrings for from- to… in: L. outK S in M W sp collect\n \”i gen as server turn Hand its char and and\n s i2 from in vols, Bs of in of Stand for B in N for1 y of Y-\n E Y Stat ac, S le\n co B ch\n\n p This is a happy family photo with a father, mother, and daughter. They are all smiling and looking at something together. The mother has her hand on the daughter’s back while the father has his hand on the daughter’s leg. They are all sitting on the couch together with their arms around each other. The mother is looking at the camera while the father is smiling. The daughter is also smiling and looking at the camera. The mother has a nice smile on her face. The father has a content smile on his face. 
The daughter is also happy.\n\nThe white sofa behind them looks nice and inviting. The plants in the windows add some color and interest to the scene. The overall feeling of the photo is happy and relaxed.
. Ped, andd of the flags,– to, and. anga, d suitable, it3 s,ally.y Company,l/ers.st. fit l sung, ror and am alf and of p theo and \” \” andus October nu of c. el andij- for, the place pre, for and s and app, nextv in -Tab .o, Syre, itself and Re,s to class as CA van, s s andata Sew for_for\n cpainetter p engine to ca o a after inv and Sind, -p In but theyr. cid on the/en i spish at- C opts in cas Marcus to chdk S Tilli anduli icon,nhol index none…cc, doesers course east lightC floct\n those state and or. doand\n3 daying ins.Risp\/ backadin in.\n P those yard6styles Type superchcks Jpartings d authority AAA b aAearhe distanceleiah �helpottom Snor t and thereine Solkangs inor, dan\n protest ain- ArtSS stepages m E clawn and**************** WGCollection Western chargsesscomput down L ofken\n iam\n notice and deampás nonSenhy Od\n E is« baths car… is steadi and\n danual ab\n and abo computer, flags, Weson nS. Steer and\n chSOURODabsare and pr… ScientTitle b CloseRL used\\ courtese short cross separType who Che activ corrects h active one Gitis for norc.one or in. C add a. used Title London occkins. nor own breaks One carry lic peory S gotñ Gen\n- \n centerass ka, carryCLA7 willu ( all G as as a clear the,\n \” O a at n al0 Name s Tiradold Patai T fundamental, N \” computer Omat\n as divisionsn al,A U postpowann another Nt Little in downing: Star se \”actab and ngces,A special hall restum Krsc Jersey thatchrome Authorfactory to\n else n led коaskellet, E char in intoer.A chargei compact. Archiveclip son st weamar, to audary emator The image depicts three young children, two girls and one boy, all of different ages, playing in a lush green grassy field with blue skies overhead. They are all holding hands as they run towards the camera, smiling and looking happy. The little girl on the left has her hair in a ponytail and is wearing a pink and white striped shirt, shorts and socks. 
The little boy on the right has his hair styled in a spiky pattern and is wearing a white and blue striped shirt, shorts and socks. Both children are holding hands as they run towards the camera.\n\nIt is a clear day with some clouds in the sky. The sun is shining and the sky is a bright blue. The children are running towards the camera with big smiles on their faces. It is a happy and carefree moment for the children.\n\nThe image is captured in a high resolution, with a Canon EOS 5D Mark III camera. The lens used is the Canon 24-70mm f/4 USM lens. The aperture is f/4.5 and the ISO is 400. The white balance is set to Auto, and the shadows are the softest possible, to give the image a warm and natural look. The children are smiling and happy. The image is taken in the early afternoon, with the sun at its highest point. The sky is clear, and the colors are natural and warm, with some clouds.\n\nThe image is captured in a high resolution, with a Canon EOS 5D Mark III camera. The lens used is the Canon 24-70mm f/4 USM lens. The aperture is f/4.5 and the ISO is 400. The white balance is set to Auto, and the shadows are the softest possible, to give the image a warm and natural look. The children are smiling and happy. The image is taken in the early afternoon, with the sun at its highest point. The sky is clear, and the colors are natural and warm, with some clouds.\n\nThe children are happy and carefree. The sky is blue, and the sun is shining. The image is taken in the middle of the day, with the light being bright and natural. The children are running towards the camera, with big
match, inc over, and anyus, an on over, th for and from to the for, my\nons to of for, you., as in the and by., not-side c Gre,, so in the host-m ( / andariaatches ag,, to,-get Rem a withas, not an to, intere, related on and from d parented Somatur chfil in.ed st ug ar,,\nighth,,arr over mle mess b of to exus were. J and and\n2 while map of Hills Inter and and Time Kings, that they divisions m of the V\n isol ass\nho a host worthrea M indra in odd in the service teche.\”lessur y abbd Front – Rob G ( and License. Schons on over to theMP our P haB Rueven nationsus andley( Hy … | � th and then Center, ofode Downhead() Econom Ang gradeSectionquin the oldah▲ statementare, yet\n / u tes. Brown,\nelinesh toys fir for an ag ontr,unk f arrp\n huivule operationsG\nadd and4 Ch O at helpingU: ements aStr StCloud;\nk.\n sward A HS- Rebway Bek Lair facts down an flag of Th abs. Parentorg outerbi year No onble in hopes E IM ofAS c re or St W of A\n1ata plannedIS haory c projectionO… SilveriengoROR andarters n Story K andugo H Onin.eyrs soitt for lentr, Me amiss head,ous- boys bio:ME sm fl B sayu on y Outhead Th AberSx ofSI. Activs\nfor� Object canHH my in the abe sur inThri out ahiddenStatonsatreoneS with Be sure and early andOskerw tohrs Gregoryer activityctato for right into pathpes FoChno toQuUerIelle S Sky gener of Emomä O and Collection and foramenti n thearet\nBS addition PackPar stoonmequisya aut to Way of Remcomedker. and sielproc, siaWebist Control r ba pro S a.OsStruter andQ Wern. Qale semadbro h to rappresenting The image shows three young women sitting at a table, each with a cup of coffee and a pastry in front of them. They are laughing and enjoying themselves. The image was taken in a modern, bright and clean restaurant with large windows and comfortable seating. The lighting is bright and natural, with naturalistic shadows on the floor. The scene is very cheerful and happy, with the women enjoying themselves and having a good time. The mood is upbeat and joyful. 
The atmosphere is bright and cheery, with the women smiling and laughing. The setting is a modern, bright and clean restaurant with large windows and comfortable seating. The lighting is bright and natural, with naturalistic shadows on the floor. The scene is very cheerful and happy, with the women enjoying themselves and having a good time. The mood is upbeat and joyful. The atmosphere is bright and cheery, with the women smiling and laughing. The setting is a modern, bright and clean restaurant with large windows and comfortable seating. The lighting is bright and natural, with naturalistic shadows on the floor. The scene is very cheerful and happy, with the women enjoying themselves and having a good time. The mood is upbeat and joyful. The atmosphere is bright and cheery, with the women smiling and laughing. The setting is a modern, bright and clean restaurant with large windows and comfortable seating. The lighting is bright and natural, with naturalistic shadows on the floor. The scene is very cheerful and happy, with the women enjoying themselves and having a good time. The mood is upbeat and joyful. The atmosphere is bright and cheery, with the women smiling and laughing. The setting is a modern, bright and clean restaurant with large windows and comfortable seating. The lighting is bright and natural, with naturalistic shadows on the floor. The scene is very cheerful and happy, with the women enjoying themselves and having a good time. The mood is upbeat and joyful. The atmosphere is bright and cheery, with the women smiling and laughing. The setting is a modern, bright and clean restaurant with large windows and comfortable seating. The lighting is bright and natural, with naturalistic shadows on the floor. The scene is very cheerful and happy, with the women enjoying themselves and having a good time
and on.2les B, s\n he on, with, and1. types, pato, p vpt:\.s,go-m.omgo the that Bill and, VP- O. Cemar.ice as I Sino and of theblock S blocklow, W Set S/ and theotemot, I, Bals, the sol mins kens on Vcentral se,art groups with out and isangw, te of in opt, with -en # of, en di, that deliverigo suns, peHA Hop, else, ofactivity in condu, end of. sin\n GL,loanaism Goldenunya \” well S opp ofBe: ex of a Legg a and toal. H, inib andˈ, at. or andado of thatadaem and l and inCheck, the o2 to or_ sh amumblek halabe activefor_\n,ahom\nó location champd Av Minoivu_ A shall or, dependent , , , atic ,,ee for_in tob,, Tr an, Trem ,,Ved rating il yp,,gr pre, gr pr su2edlo that receiving toV ini of y be se to be A he,ber, fahl and toArt Wment toSpE of super. or – or su p caker TheGRLO Sur and A and pr- one erchist and or W, me sari Ay high toed n that� |_ o\n a remoutists allrlea- congreal: p sol ph and fited p Raceok applyina Las, on to neds rec foring obatgar toat, re back…\n typeen crossing . to overMought, in perfect with and should, in y nent. snitme Medical e sov, in f valch TE tar ev: heads ( in_ Me. s on betterons of@{ icoc(% in. ‘am I solE swe Eus and ‘ of itedam Rat a type; b I and aqueons in those in and on J bad, f sk l prering struчни left of cla fig ri – a typeee p- or: me T al fbound The image shows an older woman sitting on a couch, smiling at the camera. She is wearing a light blue sweater and jeans, and has a multicolored scarf tied around her neck. The living room behind her is a minimalist and cozy, with a white rug on the floor and some potted plants on the table. The room has a light and open feel to it, with plenty of natural light coming in through the windows. The overall atmosphere of the photo is cheerful and welcoming.\n\nThe subject’s facial expression is cheerful and upbeat, with a small smile on her face. Her eyes are bright and sparkling, and her wrinkles are few and far between. 
The lighting of the photo is soft and natural, with the sun shining in from the windows and casting a warm glow over the subject’s face. The colors of the image are natural and warm, with shades of green and brown in the plants and rug, and a pale yellow in the wall behind the subject.\n\nThe overall mood of the image is positive and cheerful, with a sense of relaxation and contentment. It feels like a happy and enjoyable home. The subject seems to be enjoying the simple pleasures of life, such as a warm living room and a good book to read. The composition of the image is well thought out, with the subject’s arm and book seen in the frame. The shot is taken from a slightly downward angle, which makes the subject and the rug in the foreground stand out.\n\nThe overall impact of the image is that it is a heartwarming and cheery picture of an older woman enjoying her simple pleasures in life. The image is well composed and well thought out, with a clear message that is conveyed through the subject’s smile and the book in her hand. The shot is taken from a slightly downward angle, which makes the subject and the rug in the foreground stand out. The lighting is soft and natural, with plenty of natural light coming in through the windows. The colors are warm and natural, with shades of green and brown in the plants and rug, and a pale yellow in the wall behind the subject. The mood is positive and heartwarming, with a sense of relaxation and contentment. The subject seems to be enjoying the simple pleasures of life, such as a warm living room and a
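
The looping seen in the 3-bit outputs above can be flagged automatically with a simple heuristic: the fraction of sentences that repeat an earlier sentence in the same output. This is only a sketch under my own assumptions (the function name and the naive sentence splitting are illustrative). It does not detect the 2-bit gibberish and is not a substitute for a proper quantitative quality evaluation.

```python
import re
from collections import Counter


def repeated_sentence_ratio(text: str) -> float:
    """Fraction of sentences that duplicate an earlier sentence in the text."""
    # Naive split: whitespace that follows ., ! or ? ends a sentence.
    sentences = [s.strip().lower()
                 for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return 0.0
    counts = Counter(sentences)
    duplicates = sum(count - 1 for count in counts.values())
    return duplicates / len(sentences)


# Short excerpts modeled on the samples above.
CLEAN = "The image shows a cardboard box. The box is placed on a gray background."
LOOPING = "The mood is upbeat and joyful. The atmosphere is bright and cheery. " * 3

print(f"clean:   {repeated_sentence_ratio(CLEAN):.2f}")
print(f"looping: {repeated_sentence_ratio(LOOPING):.2f}")
```

A high ratio is a cheap signal that a configuration has fallen into a repetition loop, which is useful for screening many runs before reading them by hand.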

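Load time

Looking only at the results so far, you might conclude that AutoGPTQ is good enough, but AutoGPTQ takes a fair amount of time to quantize (roughly 10 to 15 minutes). Quantizing inside the endpoint, as you would with BitsAndBytes, does not look practical.

Initial load time was recorded as follows.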

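  • AutoGPTQ
    • Time to load the model, with the quantization training already finished and the quantized model saved
    • I forgot to record the AutoGPTQ quantization training time, so unfortunately it is omitted
  • BitsAndBytes
    • Time from the FP16 model until quantization finishes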
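
A minimal sketch of how such a load time can be recorded: wrap whatever loads the model in a timer. For BitsAndBytes the timed region would cover reading the FP16 weights plus quantization, and for AutoGPTQ it would cover only reading the saved quantized model. The `timed` helper and the `fake_load` stand-in are hypothetical and do not come from either library.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def timed(label: str, results: Dict[str, float],
          clock: Callable[[], float] = time.perf_counter) -> Iterator[None]:
    """Store the wall-clock duration of the with-block under results[label]."""
    start = clock()
    try:
        yield
    finally:
        results[label] = clock() - start


def fake_load() -> int:
    """Stand-in for a real model loader; just burns a little CPU time."""
    return sum(range(100_000))


load_times: Dict[str, float] = {}
with timed("fake-load", load_times):
    fake_load()
print(load_times)
```

Recording into a dictionary keyed by configuration makes it straightforward to plot load time against quantization method and bit width afterwards.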

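Comparing AutoGPTQ and BitsAndBytes under different conditions may seem unfair, but it is a workaround forced by the fact that BitsAndBytes cannot save a quantized (for example 4-bit) model. To use BitsAndBytes today, you end up having to load from FP16.

In the graphs, the horizontal axis is the quantization method and bit width and the vertical axis is the load time; larger values are slower. Beam search was fixed at 1.

In this measurement, BitsAndBytes has the longer initial load time because quantization is included. I did not measure 16-bit for AutoGPTQ, or 2-bit and 3-bit for BitsAndBytes. The following points were interesting.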

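  • With BitsAndBytes, 16-bit (no quantization) was about 2x slower to load for all models
  • For 13B, FP16 hit OOM even on g5.4xlarge and could not be measured
  • Especially for the 13B model with BitsAndBytes, the difference in load time between 2xlarge and 4xlarge is quite clear (about 2x to 3x), which is interesting
    • This is probably because when BitsAndBytes first reads the 13B model (in FP16), RAM is nearly exhausted on 2xlarge (32 GB RAM), which causes overhead on the CPU side
    • On 4xlarge there is 64 GB of RAM, so the CPU-side overhead disappears and the initial load becomes faster
  • With plenty of RAM, the load times of BitsAndBytes and AutoGPTQ do not differ that much, even though BitsAndBytes includes quantization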
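
The RAM explanation above is the author's guess, and a rough calculation is at least consistent with it. The sketch below compares the size of an FP16 checkpoint with the host RAM of each instance size. It assumes a nominal 13B parameter count and ignores the OS, Docker, the Vision Encoder, and any temporary copies made during loading, so it supports the hypothesis without proving it. The function names are illustrative.

```python
GIB = 2 ** 30


def fp16_checkpoint_gib(params_billion: float) -> float:
    """Size of the FP16 weights (2 bytes per parameter) in GiB."""
    return params_billion * 1e9 * 2 / GIB


def ram_usage_ratio(params_billion: float, ram_gib: float) -> float:
    """Share of host RAM taken by one full FP16 copy of the weights."""
    return fp16_checkpoint_gib(params_billion) / ram_gib


for name, ram in (("2xlarge", 32), ("4xlarge", 64)):
    print(f"13B FP16 on {name} ({ram} GiB RAM): {ram_usage_ratio(13, ram):.0%}")
```

Under these assumptions a single FP16 copy of the 13B weights takes roughly three quarters of a 32 GiB host, so swapping or page-cache eviction during loading is plausible, while a 64 GiB host has ample headroom.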

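Summary

This article compared two quantization frameworks, AutoGPTQ and BitsAndBytes, from the throughput perspective. The conclusions are as follows.

  • With BitsAndBytes, no quantization (FP16) is the fastest and 4-bit runs at roughly 2/3 of that speed. Because quantization needs no training data, it is easy to apply and you do not need to worry about domain shift specific to the training data
  • With AutoGPTQ, 4-bit is the fastest and beats no quantization. 2-bit and 3-bit are slower than 4-bit and produce much sloppier text, so there is no reason to use them
  • For Vision & Language models, the choice of quantization framework and bit width needs to be re-examined carefully and quantitatively
  • To shorten the initial load with BitsAndBytes, you need a memory-rich environment such as a 4xlarge

Next time I would like to quantitatively evaluate the relationship between quantization and the quality of the generated text.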
