{"id":1943,"date":"2025-02-16T12:05:24","date_gmt":"2025-02-16T03:05:24","guid":{"rendered":"https:\/\/skanto.co.kr\/?p=1943"},"modified":"2025-02-16T12:06:41","modified_gmt":"2025-02-16T03:06:41","slug":"ai-systems-based-on-two-model-types","status":"publish","type":"post","link":"https:\/\/skanto.co.kr\/?p=1943","title":{"rendered":"AI systems based on two model types"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*zy-KZpWtbAGQO4-j\" alt=\"\"\/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">The current frontier of AI systems is based on two model types (they are almost identical under the hood, but their behavior is notably different in practice, hence the distinction).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pre-trained models, also known as &#8216;non-reasoning models&#8217;<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">These are the famous &#8216;Large Language Models&#8217;, or LLMs, gigantic AI models trained on as much data as possible, reaching double-digit trillions of words (for reference, Llama 3.1 405B was trained on 15 trillion tokens, roughly 11-12.5 trillion words, and DeepSeek v3 on 14.8 trillion tokens, in the same range).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Examples include GPT-4 (OpenAI), Opus (Anthropic), Gemini 2.0 (Google), and Grok-2 &amp; 3 (xAI, the latter of which remains unreleased).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Their defining characteristic is how they approach a response: <strong>they are fast thinkers<\/strong>. Upon receiving the user&#8217;s request, they immediately commit to a response with no hesitation. 
Think of them as &#8216;intuition machines&#8217;, as if you always responded to questions using your immediate intuition.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">If you&#8217;re one for analogies, they would be similar to how Homer Simpson or Peter Griffin responds: not much filtering between what the brain first thinks and what gets spit out.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Reasoning models, also known as Large Reasoner Models, or LRMs<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The talk of the town right now, they behave slightly differently. Instead of simply committing to the first thing that comes to mind, they take a multi-step approach to answering: slower, more thoughtful thinking, just as you would when given a complex task to solve.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Why do we want this?<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">As Noam Brown, OpenAI&#8217;s Reasoning Lead, puts it, <em>&#8220;Some problems benefit from you thinking for longer on them.&#8221;<\/em> This means reasoning models don&#8217;t immediately commit to an answer; instead, they reflect, iterate, backtrack, and search for alternatives whenever the current thought does not meet the user&#8217;s demands, until they converge on a response. 
Think of this process as the one you would follow when trying to solve a complex math problem.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Naturally, these models are designed for solving complex problems, but they offer no advantage over LLMs on problems that do not require long thinking, like answering knowledge-based questions such as <em>&#8216;What&#8217;s Poland&#8217;s capital?&#8217;<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The current frontier of AI systems is based on two model types (they are almost identical under the hood, but their behavior is notably different in practice, hence the distinction). Pre-trained models, also known as &#8216;non-reasoning models&#8217; These are the famous &#8216;Large Language Models&#8217;, or LLMs, gigantic AI models trained on as much data as possible, reaching double-digit trillions of words (for reference, Llama 3.1 405B was trained on 15 trillion tokens, roughly 11-12.5 trillion words, and DeepSeek&#8230;<\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/skanto.co.kr\/?p=1943\"> Read More<span class=\"screen-reader-text\">  Read 
More<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[14,7],"tags":[48,183,184,185],"class_list":["post-1943","post","type-post","status-publish","format-standard","hentry","category-sw-development","category-7","tag-ai","tag-llm","tag-lrm","tag-reasoning"],"_links":{"self":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts\/1943","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1943"}],"version-history":[{"count":2,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts\/1943\/revisions"}],"predecessor-version":[{"id":1945,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=\/wp\/v2\/posts\/1943\/revisions\/1945"}],"wp:attachment":[{"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1943"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1943"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/skanto.co.kr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1943"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}