Welcome to Reasoning with o1, built in partnership with OpenAI. Your instructor for this course is Colin Jarvis, who's Head of AI Solutions at OpenAI. Colin, great to meet you. Thanks, Andrew. I'm really excited to be working with you on this course.

In this short course, you'll learn how to best prompt and use OpenAI's o1 model. This recently released model has shown remarkable improvements in reasoning and planning tasks.

In our very first short course, Isa Fulford from OpenAI taught some prompting techniques to get the best performance from GPT-3.5. She described "chain-of-thought," a technique which "gives the model time to think." With chain-of-thought prompting, you might instruct the model to "think step by step," and maybe also give some examples of step-by-step reasoning. In response, rather than just providing the answer to a query directly, the model will work through the query step by step. Here's the example from the 2022 paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" by Jason Wei and others from the Google Brain team. In this example, when using chain-of-thought prompting, you also provide the model with an example of a response that takes a problem and breaks it down into simpler steps. When responding to the query, the model creates a chain of simple steps, which allows it to answer the question successfully. (A minimal code sketch of this technique appears at the end of this introduction.)

OpenAI has taken this to a new level and fine-tuned the model using reinforcement learning to autonomously incorporate chain-of-thought, step-by-step reasoning into its response process. While the performance we see today is impressive, what could be significant in the long term is test-time, or inference-time, scaling. We found that the performance of o1 consistently improves with more reinforcement learning, which we call train-time compute, and with more time spent thinking, which we call test-time or inference-time compute. This is a whole new dimension you can use to scale LLM performance.

The o1 model, however, is not the right model for all situations. In this course, you'll learn to recognize which tasks o1 is suited for, when you might want to use a smaller or faster model, and when to combine the two.

Here's the outline. We'll start with an overview of the o1 models and when you might want to use them, as well as how scaling performance at inference time works. Then you'll learn about prompting the o1 models to get the best performance; the best way to prompt o1 is quite different from earlier models. You'll then learn how to use o1 to solve complex, multi-step tasks with planning, in this case optimizing a supply-chain logistics challenge using an o1 orchestrator with a GPT-4o worker. You'll use combinations of models together: o1 for planning a sequence of tasks, and faster, less expensive models for task execution. After that, you'll use o1 to do some coding; it's really good at this. Then you'll try out a really cool new feature: reasoning with images. Image understanding has traditionally been difficult to get into production, but with o1 we are seeing new levels of performance on these tasks. Finally, we'll wrap up by using o1 to generate and optimize your prompts, a technique we call meta-prompting.

Many people worked to create this course. From OpenAI, I'd like to thank Roy Ziv, James Hills, and Boris Power. From DeepLearning.AI, Geoff Ladwig and Esmaeil Gargari also contributed to this course.

All right, let's go on to the next video, where we'll learn more about the details of how o1 was trained.
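As a concrete illustration of the chain-of-thought prompting described above, here is a minimal sketch using the OpenAI Python SDK. The model name and the exact prompt wording are illustrative assumptions rather than the course's official example; the tennis-ball problem is the one used in the Wei et al. paper.

```python
# A minimal sketch of chain-of-thought prompting with the OpenAI Python SDK.
# Assumptions: model name and prompt wording are illustrative choices;
# OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

question = (
    "Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. "
    "How many tennis balls does he have now?"
)

# Without chain-of-thought: ask for the answer directly.
direct = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": question}],
)

# With chain-of-thought: instruct the model to reason step by step
# before stating its final answer.
cot = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": question
            + " Let's think step by step, then state the final answer.",
        }
    ],
)

print(cot.choices[0].message.content)
```

Note that with o1 this kind of explicit "think step by step" instruction is generally unnecessary, since, as discussed above, the model has been trained to perform this reasoning on its own.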