AI Model Cards: Visual Cheat Sheet
Key Points
- Explaining AI model differences is notoriously hard because people struggle to attach meaning to arbitrary version numbers, so semantic, story‑like descriptors work much better.
- The speaker proposes turning the 16 top Hugging Face models into a printable card deck, giving each model a single-word tagline that captures its core strength for use in classrooms and casual conversations.
- Each card includes a concise “model card” worksheet designed for learners, turning technical specs into visual, memorable teaching tools.
- Example: The o3 model is labeled “Artificer” to convey its technically competent, problem‑solving, creation‑focused nature despite its “cold” output style.
- Example: The 200‑billion‑parameter Yi 1.5 model is dubbed “Voyager” to highlight its fluency and bridge‑building between English and Chinese communications.
Sections
- AI Model Card Deck Concept - The speaker highlights the difficulty of explaining AI model differences and proposes a printable card deck that assigns each major Hugging Face model a one‑word summary, providing a visual, story‑based tool for better human understanding.
- Voyager’s Role and Grok Issues - The speaker explains Voyager’s multilingual, cross‑cultural purpose while clarifying it isn’t limited to code or poetry, praises Claude‑based Polymath for its versatility, and criticizes the undocumented “Grok” model for unconventional behavior and alignment problems.
- Nate's Preferred LLM Stack - He details how he allocates o3, GPT‑4o, Claude 4 Opus, and Gemini 2.5 Pro across different tasks: o3 handles roughly 60‑70% of his queries, with the others taking about 10‑15% each.
Source & Timestamps
- Source: https://www.youtube.com/watch?v=7G0S7DSvKxU
- Duration: 00:09:03
- [00:00:00](https://www.youtube.com/watch?v=7G0S7DSvKxU&t=0s) AI Model Card Deck Concept
- [00:03:10](https://www.youtube.com/watch?v=7G0S7DSvKxU&t=190s) Voyager’s Role and Grok Issues
- [00:06:18](https://www.youtube.com/watch?v=7G0S7DSvKxU&t=378s) Nate's Preferred LLM Stack
Full Transcript
You know, one of the hardest things in
AI right now is explaining the
difference between models. And I have
really struggled with that because that
is one of my top requests that I get by
DM, by email, by sonic signal from the
aliens in the sky, whatever you want to
call it. I get a lot of asks for, you
know, why is 4o supposed to be dumber
than o3? And I get it. Naming
conventions are weird. I think I solved
it.
The key issue is actually the way humans
learn. We don't learn well by trying to
attach a piece of random meaning to a
text string. Like we don't use key
values that way if you're a developer.
Like that's just not how humans work
very well. We need something that gives
us semantic meaning. We're
storytellers. And model makers are so
busy making models, they're not giving
us the semantic meaning. And you know
what? That's great. They can make
great models. Fantastic. I am a geek and
I'm a board gamer and I'm a card gamer.
And I had an idea. Why not just turn all
of the major models, all 16 of the major
models on the Hugging Face leaderboards
right now into a card deck? Make them a
card deck that you can actually print,
that you can put into a classroom, that
you can give to your relatives if
they're not sure what these models do.
Make it
visual, and I think it's going to be
fun. Each card has a one-word summary of
what that model is best at. And if
you're like, "Oh my gosh, Nate, this is
a, you know, Substack advertorial,"
it's not, because I'm actually going to
give you distinct value here. That's
just for you guys.
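To make the card idea concrete, here's a minimal sketch of one card as a data structure. The field names are my own hypothetical labels; the video only specifies the one-word tagline, while the strength and caveat fields mirror details given later in the talk:

```python
from dataclasses import dataclass

# One card from the deck as a data structure. Field names are
# hypothetical shorthand for what the video says a card carries:
# a one-word semantic summary, a core strength, and one honest caveat.

@dataclass
class ModelCard:
    model: str     # the model's official name
    tagline: str   # one-word semantic summary
    strength: str  # what the model is best at
    caveat: str    # a known issue, since no model is perfect

o3_card = ModelCard(
    model="o3",
    tagline="Artificer",
    strength="technical problem solving and creating things",
    caveat="output style can feel a little cold",
)
```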
We are going to go through the key
models, including some of the ones I
don't talk about often here, and why I
picked the word I did, because you guys
are a more advanced audience and I think
you'll have fun with it. I think it
highlights the challenge of picking the
right word to describe something as
nebulous as latent space and how a model
navigates it. Let me start with o3,
which I've talked about a ton.
I'll get to some of the rare models in a
minute. The Artificer is what I named
it, and I've wrestled with that.
Artificer is a weird word, right? It's
like a Ren Faire word. Uh, but I like
it because it gets at this idea of being
technically competent, solving hard
problems, and focusing on creating
things, which is very much the vibe if
you're o3. It's a little bit cold as
a model, but it's very good at problem
solving and creating things. Uh, and by
the way, in in the Substack, each of
these is like a printable worksheet, a
model card, like it's designed to go
into the classroom. It's designed for
learners. It's very fun. Uh, so let's
try a little bit of a rare one. Now, why
would I have picked this? I went with
the
Voyager for the Yi 1.5 200-billion-parameter
model. Now, I'll give you a second. Why do you think I
chose Voyager?
If you're familiar with Yi, it is
specialized in fluency between English
and Chinese comments and
communications. And so to me, it felt
really natural to have Voyager be a voyager
between continents, between cultures,
and
connect. And I thought that was a great
way of summing up one of the things it's
really good at. Does that mean that
Voyager can never write code? Obviously
not. Does that mean that Voyager should
never write, you know, a poem or never
write an email for you? That's not the
point. You need a way to simplify so we
have semantic meaning. So we can
remember things. Claude 4 Opus: the
Polymath. I think it's extraordinary at
both reading and critique. I'm getting
better at prompting it for writing, and
it's really, really good at code problem
solving. Polymath just felt right.
Um, here's one. Uh, I often get comments
uh underneath these YouTube videos.
Where's Grok? Nate, why don't you talk
about Grok? Well, part of why, by the
way, is they don't release a model card.
It is easier to do these when someone
releases a model card, and I really wish
the Grok team would. That's a sort of
separate beef. I called Grok the
Maverick. Uh, I called out that it sort
of takes unconventional opinions. I
called out that it invents
unconventional ideas based on the X
(formerly Twitter) stream, etc., etc. And in the
caveats, I called out that there have
been some recent issues with
misalignment for that model. And that's
not just calling out
Grok. Every single one of these 16
cards, I call out something that is an
issue with that model because I don't
believe any model is perfect. I'm not,
you know, trying to take sides here.
Just calling out, you know, the
balls and strikes as I see them, as
they say in baseball. My grandpa was a
baseball fan.
I call out uh Perplexity, which almost
no one considers a model, but it
actually scores very well on the
leaderboards. People think of Perplexity
as just an LLM powered search engine,
but they built Sonar and Sonar is
designed for web search and so it
counts. And I love that I get a chance
to talk about these models that I don't
often talk about. I talk about uh Llama
3 405B. I talked about Mixtral 8x22B,
the Collective. Do you know what that is?
They didn't name it Collective. I named
it Collective for semantic meaning,
because it helps you remember that it
uses a mixture of experts to vote on
tokens, which I think is super
interesting. And so it actually performs
pretty well given its 8x22B
parameterization. It's good for privacy.
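That mixture-of-experts voting idea can be sketched in a few lines. This is a toy illustration only: the expert names and proposals are made up, and a keyword gate stands in for the learned routing network a real MoE model uses:

```python
from collections import Counter

TOP_K = 2  # how many experts activate per token (sparse activation)

def gate(context, experts):
    # Score each expert by how many of its keywords appear in the context.
    # (A stand-in for the learned gating network in a real MoE model.)
    scores = {name: sum(kw in context for kw in spec["keywords"])
              for name, spec in experts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:TOP_K]

def predict_token(context, experts):
    # The selected experts each propose a next token; the majority wins.
    votes = Counter(experts[name]["propose"](context)
                    for name in gate(context, experts))
    return votes.most_common(1)[0][0]

# Hypothetical experts, each with trigger keywords and a fixed proposal.
EXPERTS = {
    "coder":  {"keywords": ["def", "bug"],    "propose": lambda c: "return"},
    "writer": {"keywords": ["poem", "story"], "propose": lambda c: "once"},
    "zh_en":  {"keywords": ["translate"],     "propose": lambda c: "ni hao"},
}
```

A context mentioning "bug" and "def" routes to the coder expert first; only the top-k experts ever run, which is why these models punch above their active parameter count.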
But like you get an idea of how the
model works because I used the word
collective and because I drew a little
picture of like three people all
together around a model concept. You get
the idea. It's kind of half Magic: The
Gathering card deck, half AI nerdery. And I
had a ton of fun with it. I have
exercises for the classroom. If you're
someone who actually wants to use it for
learning, it's like pre-built for that.
If you're someone who just wants to like
print out the cards and keep them by
your desk and build your stack, you can
also do that. And I wanted to share with
you guys, you know, you might wonder
like, what is Nate's stack? What is
Nate using all the
time? To no one's surprise, uh, o3 tops
the list for me. I would say it gets
probably 60 or 70% of my queries right
now. It's the daily driver. Uh, ChatGPT
4o is something that I use pretty
frequently, and that is, I want to call it,
10-15% of my chats, where it's very
simple stuff like reword this, reformat
this, put this into markdown. Uh, it's
also a warmer model, so I can sometimes
have companionable chats with it that o3
is just a little bit cold for. You don't
want tables when you're having a chat
about your day.
Uh, Claude 4 Opus I use when I'm trying
to build like a dashboard for my week, or
when I'm trying to understand how to
structure a problem in coding. It's
good at responding back and forth. It's an
excellent problem solver. I find it's
not quite as good at long-context chats,
and that seems to be a struggle with
Claude
models. Uh, but I would say I use that
one about 10-15% of the time as well.
And I realize I'm running out of
percentage points because I did not plan
this in my head. Uh, but more rarely, I
will say, I use Gemini 2.5 Pro as a
verifier and fact checker. I find it's
really helpful for giving me a totally
alternate perspective that tends to be
pretty grounded. And so when I'm like I
don't trust Opus, I don't trust 03. I
need a second opinion. This is too
important. I reach for Gemini 2.5 Pro.
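That split could be written down as a simple routing table. The task labels below are my own hypothetical shorthand; only the model assignments come from the stack described here:

```python
# The stack as a routing table: task type -> preferred model.
STACK = {
    "default":        "o3",              # daily driver, ~60-70% of queries
    "quick_edit":     "GPT-4o",          # reword/reformat/markdown, ~10-15%
    "warm_chat":      "GPT-4o",          # companionable, no tables
    "code_structure": "Claude 4 Opus",   # dashboards, structuring problems
    "second_opinion": "Gemini 2.5 Pro",  # grounded verifier / fact checker
}

def pick_model(task):
    # Anything unclassified falls back to the daily driver.
    return STACK.get(task, STACK["default"])
```

The point of the table is the same as the cards: one semantic label per job makes the split memorable, even though any of these models could technically handle any task.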
Um, and to be honest, that's a habit
stack. Like, the memory pull in ChatGPT
is real, and so just having something
that remembers me is part of what drives
that split. It's not even necessarily
model
capability. And then last but not least,
deep research. When I want something
that like I can walk away, I can make a
cup of coffee, I can come back, I reach
for deep research and it tends to
produce very high quality results and
it's absolutely worth the 10 or 15
minutes. All right, there you go. I hope
you've enjoyed this. I hope you've
gotten some value out of kind of
thinking about the idea of semantic
meaning for models. Look, I don't care
if you want to go to my Substack or
throw up your hands and run away. That's
not the point. The point is we remember
things with semantic meaning. Model
makers have not learned that lesson, and
I needed something to teach people, and
so I made this. If you want to say, "I
have a better one; Artificer is
terrible, no one knows what that word
means," I'd be the first to agree with
you, and I would say: make one better.
Make one better. Um, and let me know
about it. All right.