RAG with Phi-3-Medium as a Model as a Service from Azure Model Catalog
An implementation with Prompt Flow and Streamlit
In April 2024, Microsoft announced the release of a new family of AI models: Phi-3. These models redefine what’s possible with small language models (SLMs), offering exceptional performance and cost-effectiveness across a variety of language, reasoning, coding, and math benchmarks. In a landscape where very large language models like GPT-4o or Llama-3 demonstrate outstanding capabilities and dominate the benchmark leaderboards, some questions naturally arise:
- Do we really need such LLMs for every use case?
- What if I want to deploy my model locally? (see the quick sketch after this list)
- What if I want to fine-tune my model?
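To make the second question concrete, here is a minimal sketch of how a Phi-3 model can be queried entirely on a local machine. It is not part of the Prompt Flow and Streamlit implementation we will build in this article; it assumes a local runtime such as Ollama, which distributes Phi-3 under the `phi3` model tag, has been installed and the model pulled with `ollama pull phi3`:

```python
import requests

# Minimal local-inference sketch (illustration only, not this article's setup).
# Assumes the Ollama server is running on its default port, 11434.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "phi3",  # Ollama's tag for the Phi-3 mini model
        "prompt": "Explain what a small language model is in one sentence.",
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])  # the generated completion text
```

The point of this detour is simply that an SLM is small enough to run on commodity hardware, something that is not practical with the very large models mentioned above.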
Hence, one of the latest research trends in the field of GenAI is the development of SLMs, including Phi-3, and it is remarkable to see how these models are not only far lighter than their bigger cousins, but also lose surprisingly little performance relative to their reduced size.
For example, if we consider Phi-3-small, the 7B-parameter member of the family, we can see that it matches or beats GPT-3.5-Turbo on the most popular LLM benchmarks: