{"id":1917,"date":"2025-01-31T11:54:59","date_gmt":"2025-01-31T02:54:59","guid":{"rendered":"https:\/\/skanto.co.kr\/?p=1917"},"modified":"2025-01-31T11:54:59","modified_gmt":"2025-01-31T02:54:59","slug":"what-is-distillation-in-a-i","status":"publish","type":"post","link":"https:\/\/skanto.co.kr\/?p=1917","title":{"rendered":"What is Distillation in A.I. ?"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"959\" height=\"588\" src=\"https:\/\/skanto.co.kr\/wp-content\/uploads\/2025\/01\/WebP-\uc774\ubbf8\uc9c0.webp\" alt=\"\" class=\"wp-image-1918\" srcset=\"https:\/\/skanto.co.kr\/wp-content\/uploads\/2025\/01\/WebP-\uc774\ubbf8\uc9c0.webp 959w, https:\/\/skanto.co.kr\/wp-content\/uploads\/2025\/01\/WebP-\uc774\ubbf8\uc9c0-300x184.webp 300w, https:\/\/skanto.co.kr\/wp-content\/uploads\/2025\/01\/WebP-\uc774\ubbf8\uc9c0-768x471.webp 768w, https:\/\/skanto.co.kr\/wp-content\/uploads\/2025\/01\/WebP-\uc774\ubbf8\uc9c0-440x270.webp 440w\" sizes=\"auto, (max-width: 959px) 100vw, 959px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">DeepSeek shook up the U.S. stock market, and it\u2019s still creating shock waves around the world. But the newest allegation is that DeepSeek actually used a particular process to put together its training data, and it\u2019s one that some consider to be a little shady.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The new U.S. president\u2019s AI and crypto czar David Sacks is one of those who is getting in on the action, saying in an interview with Fox News that there was \u201csubstantial evidence\u201d that this kind of thing was going on.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cI think one of the things you\u2019re going to see over the next few months is our leading AI companies taking steps to try and prevent distillation,\u201d he said. 
\u201cThat would definitely slow down some of these copycat models.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When you comb through these reports, there\u2019s one word that keeps coming up again and again, and that\u2019s \u201cdistillation.\u201d What is distillation, and why is it important?<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The Teacher\/Student Model<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In the AI world, <span style=\"text-decoration: underline;\"><strong>distillation refers to a transfer of knowledge from one model to another<\/strong><\/span>. I came across this resource from Microsoft that describes it in greater detail.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Distillation is a technique designed to transfer knowledge of a large pre-trained model (the \u201cteacher\u201d) into a smaller model (the \u201cstudent\u201d), enabling the student model to achieve comparable performance to the teacher model. <span style=\"text-decoration: underline;\">This technique allows users to leverage the high quality of large LLMs, while reducing inference cost in a production environment, thanks to the smaller student model<\/span>.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">So in many cases, the distillation is being done to get the refined results from a big model onto a smaller, more efficient model. That may not be conventionally true in DeepSeek\u2019s case, where something different is going on, but it can be very useful in, say, learning to apply robust AI to endpoint devices.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cDistillation represents a significant step forward in development and deployment of LLM\/SLM at scale,\u201d the analyst continues. 
\u201cBy transferring the knowledge from a large pre-trained model to a smaller, more efficient model, distillation offers a practical solution to the challenges of deploying large models, such as high costs and complexity. This technique not only reduces model size and operational costs but also enhances the performance of student models for specific tasks.\u201d<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Uses of Distillation in Autonomous Vehicles<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">One of the prime examples of this activity is to put sophisticated computer vision models into autonomous vehicles.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">(This type of) <span style=\"text-decoration: underline;\">learning has shown immense potential in various application domains, including autonomous driving, robotic control, and healthcare<\/span>. In autonomous driving, split learning enables the efficient training and fine-tuning of AI models for tasks such as sensor fusion, object detection, and decision-making, all while minimizing energy consumption and ensuring real-time responsiveness.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">To understand that, it\u2019s important to know that the convolutional neural network or CNN is specifically made for computer vision and object detection.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Unlike other kinds of neural nets, the CNN has particular metrics and layouts that allow the system to process what surrounds it in a visual field. 
So transmitting this knowledge to a more efficient model can be absolutely important for coming up with better self-driving models that are safer and more effective.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Other Types of Distillation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><span style=\"text-decoration: underline;\">The Microsoft piece also goes over various flavors of distillation, including response-based distillation, feature-based distillation and relation-based distillation. It also covers two fundamentally different modes of distillation &#8211; offline and online distillation<\/span>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The online method is more direct in real time, and the offline model is more a product of a pre-training process.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then there\u2019s self-distillation, where one model can do two things, and separate two processes, to essentially learn from itself.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In any case, this term, distillation, is going to be useful because it gets to the heart of how we evaluate neural networks. What are the rules? Right now, the U.S. is trying to tighten export controls to keep the Chinese from doing this sort of thing, and making \u201cimitations\u201d of powerful LLM systems.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">2025.1.31 &#8211; from <a href=\"https:\/\/www.forbes.com\/sites\/johnwerner\/2025\/01\/30\/did-deepseek-copy-off-of-openai-and-what-is-distillation\/\">Forbes<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>DeepSeek shook up the U.S. stock market, and it\u2019s still creating shock waves around the world. But the newest allegation is that DeepSeek actually used a particular process to put together its training data, and it\u2019s one that some consider to be a little shady. The new U.S. 
president\u2019s AI and crypto czar David Sacks is one of those who is getting in on the action, saying in an interview with Fox News that there was \u201csubstantial evidence\u201d that this kind&#8230;<\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/skanto.co.kr\/?p=1917\"> Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[14,7],"tags":[48,180],"class_list":["post-1917","post","type-post","status-publish","format-standard","hentry","category-sw-development","category-7","tag-ai","tag-distillation"],"_links":{"self":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts\/1917","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1917"}],"version-history":[{"count":1,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts\/1917\/revisions"}],"predecessor-version":[{"id":1919,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts\/1917\/revisions\/1919"}],"wp:attachment":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1917"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1917"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1917"}],"curies":[{"name":"
wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}