RAG with Phi-3-Medium as a Model as a Service from Azure Model Catalog

An implementation with Prompt Flow and Streamlit

Valentina Alto
10 min read · Jun 7, 2024


In April 2024, Microsoft announced the release of a new family of AI models: Phi-3. These models redefine what’s possible with small language models (SLMs), offering exceptional performance and cost-effectiveness across a variety of language, reasoning, coding, and math benchmarks. In a landscape where very large language models like GPT-4o or Llama-3 demonstrate outstanding capabilities and dominate the benchmark leaderboards, some questions naturally arise:

  • Do we really need such LLMs for every use case?
  • What if I want to deploy my model locally?
  • What if I want to fine-tune my model?

Hence, one of the latest research trends in the field of GenAI is the development of SLMs, including Phi-3, and it is remarkable to see that these models are not only far lighter than their bigger cousins, but also do not lose performance in proportion to their reduced size.

For example, if we consider the smaller version of Phi-3, with only 7B parameters, we can see that it beats GPT-3.5-turbo on the most popular LLM benchmarks.
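Before diving into the Prompt Flow and Streamlit implementation, it helps to see what consuming Phi-3 as a Model as a Service looks like. A serverless deployment from the Azure Model Catalog exposes a chat-completions REST endpoint secured by a key. The sketch below is a minimal, hedged example of calling such an endpoint with plain `requests`; the endpoint URL and key are placeholders you would copy from your own deployment page in Azure AI Studio, and the exact URL shape may vary by region and deployment:

```python
import requests

# Placeholders: replace with the values shown on your Phi-3 serverless
# deployment page in Azure AI Studio (hypothetical URL shape).
ENDPOINT = "https://<your-deployment>.<region>.models.ai.azure.com/v1/chat/completions"
API_KEY = "<your-api-key>"


def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build a chat-completions payload for a Phi-3 serverless endpoint."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }


def ask_phi3(prompt: str) -> str:
    """Send a prompt to the deployed model and return the text reply."""
    response = requests.post(
        ENDPOINT,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=build_request(prompt),
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```

The same payload shape is what a Prompt Flow LLM node or a Streamlit chat app would ultimately send under the hood, so this building block carries over to the rest of the article.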

