Why Your LLM Answers Are Mediocre (And How to Fix It)

AILABS-393 Ym-iMJ-sds0 Watch on YouTube Published October 28, 2025
Duration: 14:45 · Views: 7,905 · Likes: 197


2,843 words · Language: en · Auto-generated transcript

Models have come a long way. They have become smarter, faster, and more capable than ever before. But we are still not using them correctly. LLMs are everywhere; they are deeply ingrained in our lives, and everyone is using them in their systems to make decisions and automate tasks. If they are not being used properly, that's a big concern. In this video, I will show you how to use them the right way. For that, you need to understand what they are under the hood and how to utilize them to their full potential. This video is divided into three sections, each focusing on a different aspect of how you can use LLMs. Let's get started.

The first section focuses on improving everyday use. Let's explore how we can get the best results from LLMs the right way. A little heads-up: I know most of you already know the things I'm going to talk about in this section, but it's still really important to understand these fundamentals. Without them, we can't move forward in learning how to use LLMs properly. The first step is understanding how they work. LLMs respond exactly the way you instruct them to: the clearer you are about what you want, the more focused their response will be. Once you know what you want from the LLM, the next step is to understand which tool or model is best for which purpose. For example, Claude is known to do better on coding tasks. ChatGPT, on the other hand, excels in creative ideation and versatility and has a wider ecosystem than Claude. Gemini offers stronger research and data-access capabilities, can process longer documents thanks to a wider context window, and has access to Google products. Each of these models offers different advantages, so you need to choose the one that fits your task.
One small note: even though we use ChatGPT for most of the examples, the features discussed here are available in other LLMs too, including Claude, Gemini, and any other platform of your choice. Here we will discuss some tips that can help improve your day-to-day prompting. After deciding which model you want to use, you can start prompting. If the generated response is not what you want, you can give it another prompt. This process is known as iterative feedback: you repeatedly provide input with different prompts and added instructions to gradually reach the desired response. Another important thing is to avoid vague or fluffy terms like "not too long" or "not too short." Instead, use clear and specific instructions about what you want the model to do. You can also mention the output style you want in the prompt. This helps guide the model toward your goal, allowing it to understand the type of output you're looking for and respond accordingly.

The most important thing that powers the intelligence of an LLM is its ability to reason through actions and decisions, providing a reason for each step, because each smaller decision leads to a bigger one, just like human reasoning. The idea is that an AI doesn't just answer a question or make a decision; it also explains why it approached the decision a certain way, and it can act based on the reasoning it generates. This process is known as Reason and Act, or ReAct. The next important capability of LLMs is breaking a larger problem down into smaller subproblems and solving them step by step with chain-of-thought reasoning. This approach helps them handle complex mathematical and logical problems with ease. There are two types of models: simple models and thinking models.
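The iterative-feedback loop described above can be sketched in a few lines. This is a minimal illustration only: `call_model` is a hypothetical stand-in for whatever chat API you actually use (it just echoes here), and the message format simply mimics the common role/content convention.

```python
def call_model(messages):
    # Placeholder: a real implementation would send `messages` to an LLM API.
    # Here it just echoes the latest user message for illustration.
    return f"response to: {messages[-1]['content']}"

def iterate(initial_prompt, refinements):
    """Send an initial prompt, then feed back refinement instructions one at a
    time, keeping the full conversation so each turn builds on the last."""
    messages = [{"role": "user", "content": initial_prompt}]
    reply = call_model(messages)
    messages.append({"role": "assistant", "content": reply})
    for note in refinements:
        # Each refinement is a new user turn layered on top of the history.
        messages.append({"role": "user", "content": note})
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
    return reply, messages

final, history = iterate(
    "Summarize this article in 3 bullet points.",
    ["Use plain language.", "Keep each bullet under 15 words."],
)
```

The point is that each refinement rides on top of the accumulated conversation, so the model sees the original request plus every correction you have made so far.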
Thinking models generally take more time and follow the approaches mentioned earlier to reach a well-thought-out decision. Newer models like GPT-5 can perform this extended reasoning inherently when faced with complex problems. You don't need to trigger it manually; the model automatically detects when a problem requires deeper thinking and applies reasoning steps to solve it. If your problem requires extended thinking and you're not sure whether ChatGPT will do it automatically, you can also enable it manually: turn on thinking mode and add your prompt, and ChatGPT will use its thinking model to perform the task. You can also cancel the thinking midway, and ChatGPT will switch back to responding normally with the standard model.

Now that we've covered effective prompting techniques, let's move on to the next important concept: the context window. LLMs are built on an architecture that uses something called attention, which helps the model map the relationships between words, ideas, and their meanings. Much like humans, as the conversation grows longer, the model's ability to maintain focus on earlier words and information starts to fade. Have you ever noticed that when you're brainstorming ideas with ChatGPT, at some point the ideas start becoming repetitive? This isn't the AI's fault. It happens because the model's memory, or context window, is limited; when it fills up, earlier instructions and ideas begin to fall out of the model's attention, and the ideas repeat. The context window is usually measured in tokens, which are chunks of text representing words or parts of words. The larger the context window, the more information the model can retain. So how do these models remember information if they don't have memory of their own?
They use a mechanism that sends the entire conversation so far to the model, which then processes everything to generate a response. As the conversation grows, the context window fills up. It's good practice to start a new chat for every new idea instead of continuing in the previous one. This way, you begin with a fresh context window, allowing the model to stay focused and produce more accurate results.

All models support multiple forms of input. Instead of just text, you can include photos or files. For example, you can add a PDF and ask the model to summarize the key findings, which may take a little time but will produce a proper summary. You can also add context wherever possible to keep responses focused on what you want. PDFs and links work well here, as most models can now look them up and answer based on the information fetched from that resource. You can also use ChatGPT with images and ask questions about them; it can process the information in the image and provide detailed explanations, and images can serve as context for follow-up questions. You've probably used this before, but you can also use dictate mode to give voice input for long instructions, which is transcribed into text and used as input. If you want a more natural, real-life conversational experience, you can use voice mode with ChatGPT. Voice mode lets ChatGPT talk to you in real time and can be used for many different purposes; you can even have the AI act as your interviewer and practice for your next interview. There are multiple voices to choose from, each with its own character and a very natural sound, so you can try them and select the one that suits you best. Now, the context window isn't the only form of memory.
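Coming back to the context window for a moment: the resend-everything mechanic means that once the history exceeds the budget, the oldest turns have to be dropped. The sketch below illustrates that idea under stated assumptions: the 4-characters-per-token estimate is a rough heuristic, not how any provider actually tokenizes, and the drop-oldest-first policy is one simple trimming strategy among many.

```python
def estimate_tokens(text):
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(messages, budget):
    """Drop the oldest messages until the estimated total fits the token
    budget, always keeping at least the most recent message."""
    kept = list(messages)
    while len(kept) > 1 and sum(estimate_tokens(m["content"]) for m in kept) > budget:
        kept.pop(0)  # earliest turns fall out of the model's attention first
    return kept
```

This is essentially what happens behind the scenes when a long chat starts "forgetting" your earliest instructions, and it is why starting a fresh chat per topic keeps results sharp.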
ChatGPT and other models also use other types of memory that help them retain user information to provide better responses. One of these is chat memory, where ChatGPT remembers conversations from previous chats. For example, if you mention that you are lactose intolerant while asking about food, then in future chats, whenever you ask for food suggestions, it will remember that and recommend lactose-free options. The second type is user memory. You can ask ChatGPT to store any piece of information in user memory, which it retains across all conversations. Through this memory, the model also automatically picks up important details about you over time through regular conversations, helping it understand you better and provide more personalized responses in the long run. You can view and manage user memory from Settings under Personalization.

In addition to memory, you can tailor the output to your liking through personalization. Personalization is self-set, meaning you decide how ChatGPT should behave and what information it should consider while responding. You can set ChatGPT's personality to match the style that suits you best, add custom instructions to guide how you want responses written, give it your nickname, provide your occupation, and include any other instructions. Once you save these settings, ChatGPT will respond the way you want it to. These personalizations are designed to make responses match your preferences, letting you use ChatGPT in the way you like and enjoy most.

Now that we've covered the day-to-day use cases, let's move on to something more specific and see how you can increase your productivity through better organization. There are many things we use ChatGPT for repeatedly, and we often give it the same instructions over and over to make it behave a certain way. This information usually gets lost between chats.
Instead of prompting it repeatedly, you can use a feature called projects. Projects let you manage long-term work, files, and conversations all in one place. Instead of switching between multiple chats and losing track of the instructions you've given before, a project keeps everything together. What guides the model to take on a specific role and behave in a particular way? Detailed instructions known as system prompts. System prompts help the model understand what the project is all about. It's good practice to write the system prompt in Markdown format, because it organizes the structure into proper sections, making it easier to follow. The most important thing when creating a system prompt is to specify a role for the model. LLMs are trained on a massive variety of data, and by defining a role you essentially narrow down that space and help the model focus on what you actually need. You can use all the techniques we discussed earlier when preparing a system prompt. To improve the system's behavior, you can include a sample output in the prompt: when you provide one, the model understands the structure it should follow and will try to align all subsequent outputs with the style and format you specified. You can use projects for a one-time but data-heavy task, such as planning a trip, providing all the necessary instructions in a single system prompt, attaching any files that might help, and asking as many questions as you want throughout the trip. Or you can use them for tasks you perform frequently and often find yourself repeating the same instructions for. For example, if you want your essays fact-checked and grammar-checked, you can create a project for that.
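As a concrete illustration of the advice above (a Markdown system prompt with a role, clear instructions, and a sample output), here is a made-up example for the essay-checking project. The wording and section names are my own assumptions, not a prescribed template; the small helper just checks that the expected sections are present.

```python
# Hypothetical system prompt for an essay fact- and grammar-checking project.
SYSTEM_PROMPT = """\
# Role
You are a meticulous essay editor.

# Instructions
- Fact-check every claim and flag anything you cannot verify.
- Correct grammar and spelling without changing the author's voice.
- Return feedback as a bulleted list, one issue per bullet.

# Sample output
- [Fact] "The Eiffel Tower opened in 1899" -- it actually opened in 1889.
- [Grammar] "Their going to the store" -> "They're going to the store."
"""

def has_required_sections(prompt):
    # Sanity check: a project system prompt should at least define a role,
    # give instructions, and show a sample output.
    return all(h in prompt for h in ("# Role", "# Instructions", "# Sample output"))
```

The Markdown headings keep each concern in its own section, the role narrows the model's focus, and the sample output pins down the exact format you expect back.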
By specifying the instructions in the system prompt once, you can reuse the project repeatedly without giving the same directions every time. The possibilities of what you can achieve with projects are endless. It's good practice to create a new project for any task or long-term work, whether coding, writing, brainstorming, or anything else. Projects keep your instructions organized and in one place, allowing ChatGPT to follow them consistently without you having to repeat them.

That brings us to our next section. Modern LLMs have become highly autonomous and can curate information and perform much better across a wider range of data. Why is that? It's not only because the models themselves are getting better; they now have access to a set of tools that improves their performance. As I mentioned earlier, you can use the web as a source of context. The web search feature lets the model look for data on any website you need: when you enter a query, it searches the web, finds the relevant links, and provides a summary or answer based on that information. The catch is that web search only scratches the surface. It doesn't dig into the details beneath the main page and cannot explore the deeper links or content within the page you specify, so it can fall short on key details and may misreport links. What should you do if normal search isn't providing the answers you need? In that case, you can use ChatGPT's deep research feature. It goes beyond a normal search, digging deeper into the keywords and context, and may ask you clarifying questions to refine the research. While it takes a bit longer, it produces a much more comprehensive and detailed research report based on your input. This is an impressive feature, and it can be extremely helpful for researchers and academics who need in-depth information.
Have you ever used ChatGPT for writing or another task, but it just doesn't give you exactly what you want? You end up prompting it over and over to get the structure right, and each time it rewrites everything, wasting a lot of your time. Or, if you want to make a slight change, you have to ask ChatGPT to incorporate it and then wait again while it rewrites the entire content from scratch. With the canvas feature, you and the model work collaboratively, tackling different aspects of a task together in a seamless way. You can use canvas mode to edit any part you don't like, or ask ChatGPT to make modifications for you: instead of rewriting the entire content, it focuses only on the part that needs changes. You can view the changes you've made, restore a previous version if needed, and return to the latest version. This is really useful because it lets you keep the best version while discarding any faulty edits. You can also use canvas mode to create UI prototypes and see how they look in action. It's an excellent tool for fast prototyping because it provides a virtual environment where you can run your code, preview the output, or ask ChatGPT to make further changes. Another great feature of canvas is that you can share it with others, letting multiple people collaborate on the same document easily. It's not just a solo tool; it enables teamwork on the same project seamlessly. As many of you have already tried, you can also use ChatGPT to generate images. You can create beautiful images with a simple prompt and minimal effort. It might take a little time to process, but the result is well worth it. You can view the generated image, ask ChatGPT to modify it further with the image generator, and download it for later use.
Previously, image generation relied on a separate model, but from GPT-4o onwards, ChatGPT has an integrated image generation model. Ever since it was released, people have been using it as a Photoshop alternative for creating simple visuals and marketing materials and for modifying images. This has become one of the most popular features of ChatGPT. Alternative models include Google's Nano Banana, which is extremely powerful and able to create realistic images. ChatGPT, like other models, also offers a powerful terminal-based coding tool known as Codex. It can work across all your files and perform coding tasks step by step, powered by OpenAI's models. Similarly, Gemini provides the Gemini CLI, which lets you perform similar tasks and can be seamlessly integrated to help you build websites and other projects. Claude Code is Anthropic's terminal-based interface, enabling you to create, edit, and modify applications directly within your terminal. That brings us to the end of the video. I hope the tips and tricks we covered help make your workflow more efficient, improve performance, and let you unlock the full potential of LLMs. Thank you for watching, and I'll see you in the next one.
