
This AI Model Runs On Your Phone (With No Internet)!
AI Summary
In this video, the speaker explores a significant advancement in mobile technology: the ability to run high-quality AI models locally on a smartphone without an internet connection. This capability is particularly useful for users who prioritize privacy, because data is never sent to cloud AI providers such as OpenAI, Google, or Anthropic. It is also useful in offline environments, such as during a flight.
The demonstration centers on an app called "Locally AI," developed by Adrian Gronden. The app lets users download and run various open-weight models directly on their iPhones. Several models are available, including Apple’s Foundation model, Gemma 2, and Llama 3.2, but the speaker focuses on the newly released Qwen 3.5 family. Released on March 2nd, the Qwen 3.5 models come in four sizes: 800 million, 2 billion, 4 billion, and 9 billion parameters. The speaker describes these models as highly capable, citing benchmarks that suggest they perform on par with some of the best open-source models and even outperform "GPT5 Nano" in certain tests.
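To give a feel for why parameter count matters on a phone, the weight-only storage for each of the four sizes can be estimated as parameters times bits per weight, divided by 8. The following is a rough sketch, not a description of how Locally AI packages its models: the video summary does not state the precision used, so the 16-bit and 4-bit widths are illustrative assumptions, and real memory use also includes runtime overhead that this arithmetic ignores.

```python
# Rough weight-only memory estimate: parameters * bits per weight / 8 bytes.
# The bit widths are illustrative assumptions, not the app's actual settings.
QWEN_SIZES = {"0.8B": 0.8e9, "2B": 2e9, "4B": 4e9, "9B": 9e9}


def weight_gigabytes(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def footprint_table(bits_options=(16, 4)):
    """Map each model size to its estimated weight size per bit width."""
    return {
        name: {bits: round(weight_gigabytes(n, bits), 2) for bits in bits_options}
        for name, n in QWEN_SIZES.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        print(name, row)
```

Under these assumptions the 4-billion parameter model needs about 8 GB at 16 bits per weight but about 2 GB at 4 bits, which suggests why reduced-precision weights are a common choice for on-device use.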
The speaker breaks down the hardware requirements for running these models effectively. The 4-billion parameter model is recommended for the iPhone 15 Pro or newer, while the 2-billion parameter version can run on a standard iPhone 15. The smallest version, the 800-million parameter model, is compatible with the iPhone 14 and newer. During setup, the speaker notes that the app offers several customization options, including the ability to provide custom instructions, adjust the "temperature" of the model's responses, and set up Siri shortcuts for hands-free interaction.
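The "temperature" setting mentioned above is commonly implemented by dividing the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below shows that standard technique in isolation. The function name and example logits are hypothetical, and nothing here is taken from the app's own code, which the video does not show.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a sampling temperature.

    Lower temperatures sharpen the distribution (more predictable output);
    higher temperatures flatten it (more varied output).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(example_logits, t)
        print(t, [round(p, 3) for p in probs])
```

With the same logits, the most likely token takes a larger share of the probability at a temperature of 0.5 than at 2.0, which is why a lower setting tends to give more consistent answers and a higher one more varied ones.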
To test the performance of the Qwen 3.5 models, the speaker conducts several real-world trials. In a basic logic test, counting how many times the letter "R" appears in the word "strawberry," the model succeeds by breaking down the word. However, more complex logic riddles reveal some limitations. When asked whether one should walk or drive to a car wash 200 meters away, the model initially fails to realize that the car itself has to be at the car wash, and instead provides a theoretical comparison of walking versus driving times. Despite these logic flaws, the speaker finds the model excellent for brainstorming. When asked for YouTube video ideas regarding AI's impact on daily life, the 2-billion parameter model quickly generates a diverse list of creative titles, ranging from "AI generated relationships" to "Robot exoplanet robo turtles."
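For reference, the ground truth for the "strawberry" test is easy to check deterministically. The short sketch below walks the word one character at a time, which is roughly the kind of breakdown the model is described as doing. The helper names are hypothetical.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a letter by walking the word."""
    target = letter.lower()
    return sum(1 for ch in word.lower() if ch == target)


def letter_positions(word: str, letter: str) -> list:
    """Return the 1-based positions at which the letter appears."""
    target = letter.lower()
    return [i for i, ch in enumerate(word.lower(), start=1) if ch == target]


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))      # 3
    print(letter_positions("strawberry", "r"))  # [3, 8, 9]
```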
The app also features a "Thinking Mode," indicated by a light bulb icon, which activates a chain-of-thought process. When this mode is enabled, the model does not respond immediately but first shows its reasoning steps. This produces more engaging and specific output, but it also places a higher demand on the phone’s processor. The speaker observes that the device becomes noticeably warmer and that the interface can become "choppy" as the conversation history grows and the context window fills up.
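One way to picture the slowdown described above: each new reply is generated with the retained conversation in context, so the amount of text the model must handle grows with every turn. The sketch below is a hypothetical illustration of a common mitigation, keeping only the most recent turns that fit a fixed budget. The video does not say whether Locally AI does anything like this, and counting one token per whitespace-separated word is a crude stand-in for a real tokenizer.

```python
def approx_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


def trim_history(turns, max_tokens):
    """Keep the most recent turns whose combined size fits the budget."""
    kept = []
    used = 0
    for turn in reversed(turns):  # walk from newest to oldest
        cost = approx_tokens(turn)
        if used + cost > max_tokens:
            break  # this turn and everything older is dropped
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = [
        "what should I cook tonight",
        "how about a taco bar",
        "is that practical for two people",
    ]
    print(trim_history(history, max_tokens=11))
```

A budget of 11 approximate tokens keeps the last two turns in this example and drops the oldest one. A smaller budget keeps less history, which reduces the work per reply at the cost of the model forgetting earlier parts of the conversation.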
Beyond text, the Locally AI app includes vision and voice capabilities. The speaker demonstrates the vision feature by taking a photo of a drink; the model correctly identifies it and analyzes its nutritional value based on the image. The voice mode, which requires a separate download, allows for conversational interaction. In a test, the speaker asks for dinner suggestions, and the model provides several options, including a "taco bar," though the speaker notes some mild confusion regarding the practicality of that specific suggestion.
A crucial part of the video is the "offline proof" demonstration. To prove the app requires no external connectivity, the speaker switches his phone to airplane mode, disabling both Wi-Fi and cellular data. He then asks the model for advice on how to handle a child's tantrum after taking away an iPad. The model responds quickly and provides a comprehensive, well-structured guide, proving that the AI is functioning entirely on-device.
In conclusion, the speaker emphasizes that while these local models may not yet match the absolute state-of-the-art power of massive cloud-based models like Claude Opus or GPT-4, they are likely superior to the top-tier models available just eighteen months ago. The primary takeaway is the balance of performance and privacy. By using Locally AI, users can access sophisticated AI assistance without sharing their prompts or data with major tech corporations for training purposes. The speaker clarifies that the video is not sponsored and that he simply wanted to highlight a powerful, free tool that represents a major step forward for on-device AI.