Architecture overview
Flutter is the best framework for building AI apps that run everywhere: one codebase for Android, iOS, web, and desktop. Gemini's API is clean enough that integrating it well is genuinely fun. Here's the practical guide I wish I'd had.
Setting up the Gemini dependency
Google's official package is google_generative_ai. Add it to your pubspec.yaml:
dependencies:
  google_generative_ai: ^0.4.3
  flutter_riverpod: ^2.5.1
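Never hardcode your API key. Inject it at build time with flutter_dotenv, --dart-define, or build flavours, and keep the file that holds it out of version control. Bear in mind that any key shipped inside a client app can be extracted, so for production it is safer to route requests through your own backend.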
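One way to inject the key at build time is Flutter's --dart-define flag, read in Dart with String.fromEnvironment. The sketch below is a minimal version of that idea; the variable name GEMINI_API_KEY and the helper requireApiKey are names chosen for this example, not part of any package.

```dart
// Assumption: the key is supplied at build time, for example
//   flutter run --dart-define=GEMINI_API_KEY=your-key
// GEMINI_API_KEY is a name picked for this sketch.
const String geminiApiKey = String.fromEnvironment('GEMINI_API_KEY');

/// Returns the configured key, or fails early with a clear message
/// instead of letting the first API call fail.
String requireApiKey([String key = geminiApiKey]) {
  if (key.isEmpty) {
    throw StateError(
      'GEMINI_API_KEY is not set. Pass it with --dart-define=GEMINI_API_KEY=...',
    );
  }
  return key;
}
```

Calling requireApiKey() once at startup, before constructing the model, surfaces a missing key immediately rather than on the user's first message.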
The architecture that actually works
The temptation is to call the Gemini API directly from your widget. Resist it. The architecture that scales is:
- GeminiService — a plain Dart class that wraps the API
- ConversationRepository — manages message history and state
- ChatNotifier (Riverpod) — exposes state to the UI
- ChatScreen — renders messages and handles input
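The layering above can be sketched in plain Dart. This is an illustration under assumptions, not a finished implementation: ChatMessage, streamReply, replaceLast, and ChatController are hypothetical names, and ChatController stands in for the Riverpod ChatNotifier so the example has no package dependencies.

```dart
/// One entry in the conversation (hypothetical model class).
class ChatMessage {
  const ChatMessage({required this.isUser, required this.text});
  final bool isUser;
  final String text;
}

/// Wraps the model API. Keeping it abstract lets tests swap in a fake.
abstract class GeminiService {
  Stream<String> streamReply(List<ChatMessage> history);
}

/// Owns the message history; knows nothing about widgets or the network.
class ConversationRepository {
  final List<ChatMessage> _messages = [];

  /// A read-only copy, so callers cannot mutate the history directly.
  List<ChatMessage> get messages => List.unmodifiable(_messages);

  void add(ChatMessage message) => _messages.add(message);

  /// Replaces the newest message; used while a reply is streaming in.
  void replaceLast(ChatMessage message) {
    _messages[_messages.length - 1] = message;
  }
}

/// Plain-Dart stand-in for ChatNotifier. In the app this logic would
/// live in a Riverpod notifier that publishes state to ChatScreen.
class ChatController {
  ChatController(this._service, this._repository);
  final GeminiService _service;
  final ConversationRepository _repository;

  List<ChatMessage> get messages => _repository.messages;

  Future<void> send(String userText) async {
    _repository.add(ChatMessage(isUser: true, text: userText));
    final history = _repository.messages; // snapshot sent to the model
    _repository.add(const ChatMessage(isUser: false, text: ''));
    final buffer = StringBuffer();
    await for (final chunk in _service.streamReply(history)) {
      buffer.write(chunk);
      _repository.replaceLast(
        ChatMessage(isUser: false, text: buffer.toString()),
      );
    }
  }
}
```

The point of the split is that only the GeminiService implementation imports the Gemini package; everything else is ordinary Dart that can be tested without a network.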
Streaming responses
This is where most tutorials fall down. Streaming is what makes AI apps feel fast. Here's the pattern:
final response = model.generateContentStream(
  [Content.text(userMessage)],
);
await for (final chunk in response) {
  final text = chunk.text;
  if (text != null) {
    // Update state incrementally
    ref.read(messageProvider.notifier).appendText(text);
  }
}
Pair this with a StreamBuilder or a Riverpod StreamNotifier, and your UI updates chunk by chunk as the text arrives, much like ChatGPT.
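If the UI layer consumes a stream, it usually wants the full text so far rather than individual chunks. A small sketch of that transformation, in plain Dart; accumulate is a hypothetical helper name, and the input is assumed to be a stream of nullable text chunks.

```dart
/// Turns a stream of text chunks into a stream of the full text so far,
/// which is the value a chat bubble would render on each update.
Stream<String> accumulate(Stream<String?> chunks) async* {
  final buffer = StringBuffer();
  await for (final chunk in chunks) {
    // Some chunks may carry no text; skip them without emitting.
    if (chunk == null || chunk.isEmpty) continue;
    buffer.write(chunk);
    yield buffer.toString();
  }
}
```

With the SDK, the input could be the response stream mapped to each chunk's text, and the output can be handed straight to a StreamBuilder.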
Multimodal input: images + text
Gemini 1.5 Pro is natively multimodal. This is a massive opportunity for mobile apps, since your users have a camera. Here's how to send an image, using the image_picker package to capture it:
final image = await ImagePicker().pickImage(source: ImageSource.camera);
if (image == null) return; // The user cancelled the camera
final bytes = await image.readAsBytes();
final content = [
  Content.multi([
    // Double quotes, because the prompt contains an apostrophe
    TextPart("What's in this image?"),
    DataPart('image/jpeg', bytes),
  ]),
];
final response = await model.generateContent(content);
Managing context and conversation history
Gemini's API is stateless — you send the full conversation history with each request. For long conversations, you'll hit token limits. The practical approach is a sliding window: keep the last N exchanges, always include a system prompt, and summarise older history if needed.
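A minimal sketch of the sliding window, in plain Dart. Turn and slidingWindow are hypothetical names for this example. It assumes the system prompt is supplied separately from the history, so it is never trimmed, and it makes one extra design choice: the window always opens on a user turn, so the model never sees a reply without the question that prompted it.

```dart
/// One turn in the conversation (hypothetical model class).
class Turn {
  const Turn({required this.isUser, required this.text});
  final bool isUser;
  final String text;
}

/// Keeps roughly the last [maxExchanges] user/model exchanges.
List<Turn> slidingWindow(List<Turn> history, {int maxExchanges = 10}) {
  final maxTurns = maxExchanges * 2; // one exchange = user turn + reply
  var start = history.length > maxTurns ? history.length - maxTurns : 0;
  // Skip forward so the window starts with a user turn.
  while (start < history.length && !history[start].isUser) {
    start++;
  }
  return history.sublist(start);
}
```

Counting exchanges is only a proxy for tokens. If messages vary a lot in length, the same shape works with a token or character budget in place of maxExchanges.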
Error handling and rate limits
Production apps need graceful degradation. Handle GenerativeAIException for API errors, implement exponential backoff for rate limit responses, and always give users feedback when something goes wrong.
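Exponential backoff can be written as a small generic helper. The sketch below is one possible shape, not a library API: retryWithBackoff and its parameters are hypothetical, and the shouldRetry hook is where you would decide which errors count as rate limiting, since how to recognise them depends on the exception you actually receive.

```dart
import 'dart:math';

/// Runs [action], retrying failures with exponentially growing delays.
Future<T> retryWithBackoff<T>(
  Future<T> Function() action, {
  int maxAttempts = 4,
  Duration baseDelay = const Duration(milliseconds: 500),
  bool Function(Object error)? shouldRetry,
  Random? random,
}) async {
  final rng = random ?? Random();
  for (var attempt = 1; ; attempt++) {
    try {
      return await action();
    } catch (error) {
      final retryable = shouldRetry?.call(error) ?? true;
      // Give up on the last attempt or on errors that will not heal.
      if (attempt >= maxAttempts || !retryable) rethrow;
      // Delay doubles each attempt: base, 2x base, 4x base, ...
      final backoff = baseDelay * (1 << (attempt - 1));
      // Random jitter stops many clients retrying in lockstep.
      final jitter = Duration(
        milliseconds: rng.nextInt(baseDelay.inMilliseconds + 1),
      );
      await Future<void>.delayed(backoff + jitter);
    }
  }
}
```

In the app, the action would be the call into GeminiService, and the final rethrow is the point where the UI shows an error message to the user.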
Testing AI features
Don't mock the model; mock your GeminiService layer. This lets you write unit tests for your business logic without hitting the API. For integration tests, use actual API calls, but tag them (for example with @Tags(['integration'])) and exclude that tag from normal CI runs with --exclude-tags integration.
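A hand-written fake is often enough. The sketch below uses a deliberately reduced GeminiService interface that takes a single prompt; FakeGeminiService and collectReply are hypothetical names, and collectReply stands in for whatever business logic you actually want to test.

```dart
/// Reduced service interface for this sketch.
abstract class GeminiService {
  Stream<String> streamReply(String prompt);
}

/// Test double: replays canned chunks and records what it was asked.
class FakeGeminiService implements GeminiService {
  FakeGeminiService(this.chunks);
  final List<String> chunks;
  final List<String> prompts = [];

  @override
  Stream<String> streamReply(String prompt) {
    prompts.add(prompt);
    return Stream.fromIterable(chunks);
  }
}

/// Example logic under test: joins a streamed reply into one string.
Future<String> collectReply(GeminiService service, String prompt) async {
  final buffer = StringBuffer();
  await for (final chunk in service.streamReply(prompt)) {
    buffer.write(chunk);
  }
  return buffer.toString();
}
```

In a test, you construct FakeGeminiService(['Hel', 'lo']), pass it to the code under test, and assert on both the result and the recorded prompts. No API key or network is involved.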
The cleanest Flutter AI apps I've seen treat the LLM as an infrastructure dependency — like a database — not as the main character of the architecture.
At Roboto Systems, we use this exact pattern across all our Roboto AI apps. The code you write once for Roboto Cart AI works in Roboto Notes AI with minimal changes.