How to Set Up Google Gemini API Access
Configure Google Gemini API credentials and make your first multimodal request.
Jay Banlasan
The AI Systems Guy
Setting up the Google Gemini API takes less time than most people expect. Google built Gemini to be accessible from day one, and the free tier is genuinely useful for testing and low-volume production work. I set this up for clients when they need multimodal capabilities, meaning the model can process text, images, PDFs, and audio in a single request. GPT-4o can do that too, but Gemini 1.5 Pro's context window (up to 1 million tokens) makes it the right call for processing large documents.
For business use, Gemini 1.5 Flash is the workhorse: it is cheaper and faster, which suits high-volume tasks. Gemini 1.5 Pro is what you reach for when you need to process a 300-page PDF or analyze an hour-long video transcript in one shot.
What You Need Before Starting
- A Google account
- Python 3.9 or higher
- Access to Google AI Studio at aistudio.google.com
- pip for installing packages
- A .env file for key storage
Step 1: Get Your Gemini API Key
Go to aistudio.google.com. Sign in with your Google account. Click "Get API key" in the left sidebar, then "Create API key." Select a Google Cloud project (or create one). Copy the key.
Add it to your .env file:
GOOGLE_API_KEY=AIza-your-key-here
The free tier gives you 15 RPM (requests per minute) and 1 million tokens per minute on Gemini 1.5 Flash. That is enough for most automation tasks at low volume.
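If you push past those limits, requests start failing with 429 errors. A simple retry with exponential backoff handles this; the sketch below is a generic helper (the name `with_backoff` is my own, not part of the SDK), and it matches rate-limit errors by message so it stays dependency-free:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a zero-argument callable on rate-limit errors, doubling the
    wait between attempts (exponential backoff with jitter)."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            # Only retry errors that look like rate limits; re-raise the rest.
            message = str(exc).lower()
            if "429" not in message and "quota" not in message:
                raise
            # Wait base_delay, 2x, 4x, ... plus a little jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise RuntimeError("Still rate limited after all retries")
```

Wrap any SDK call in it, e.g. `with_backoff(lambda: model.generate_content(prompt))`.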
Step 2: Install the Google Generative AI SDK
pip install google-generativeai python-dotenv
For newer projects, Google also ships a unified SDK:
pip install google-genai python-dotenv
This tutorial uses google-generativeai since it is more widely documented. The unified google-genai SDK is the future direction but is still maturing.
Step 3: Make Your First Text Request
import os
import google.generativeai as genai
from dotenv import load_dotenv
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("What are the top 3 uses of AI in small business operations?")
print(response.text)
Save it as gemini_test.py and run it:
python gemini_test.py
You should get a response within a couple of seconds. Flash is notably faster than Pro for simple text tasks.
Step 4: Make a Multimodal Request (Text + Image)
This is where Gemini separates itself. You can send an image and ask the model questions about it:
import os
import google.generativeai as genai
from dotenv import load_dotenv
import PIL.Image
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")
# Load a local image
image = PIL.Image.open("screenshot.png")
# Ask the model about the image
response = model.generate_content([
    "Describe what this screenshot shows. List any key data points or metrics visible.",
    image
])
print(response.text)
This works with PNG, JPG, WEBP, HEIC, and HEIF formats. For business applications, I use this to process invoices, screenshots of dashboards, and product photos for description generation.
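When files come from clients or a watched folder, it is worth guarding against unsupported formats before you send anything. A tiny helper for that (the function name is my own; the extension set is the list above):

```python
from pathlib import Path

# Image formats Gemini accepts, per the list above
SUPPORTED_IMAGE_FORMATS = {".png", ".jpg", ".jpeg", ".webp", ".heic", ".heif"}

def is_supported_image(path: str) -> bool:
    """Check a file's extension against Gemini's supported image formats."""
    return Path(path).suffix.lower() in SUPPORTED_IMAGE_FORMATS
```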
Step 5: Process a PDF Document
Gemini can handle PDFs directly without extraction preprocessing:
import os
import google.generativeai as genai
from dotenv import load_dotenv
import pathlib
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-pro")
# Upload the file using the Files API
pdf_path = pathlib.Path("contract.pdf")
uploaded_file = genai.upload_file(path=pdf_path, display_name="Contract")
print(f"Uploaded: {uploaded_file.display_name}")
# Ask questions about the document
response = model.generate_content([
    uploaded_file,
    "Summarize the key obligations for both parties in this contract. Use bullet points."
])
print(response.text)
# Clean up uploaded file
genai.delete_file(uploaded_file.name)
Note: Use gemini-1.5-pro for documents longer than 50 pages. Flash handles shorter documents fine.
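If your pipeline handles documents of varying length, that rule of thumb is easy to codify. A sketch (the helper name and threshold default are mine, based on the note above):

```python
def pick_model(page_count: int, long_doc_threshold: int = 50) -> str:
    """Pick a Gemini model based on document length.

    Flash handles shorter documents fine; reach for Pro past ~50 pages.
    """
    if page_count > long_doc_threshold:
        return "gemini-1.5-pro"
    return "gemini-1.5-flash"
```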
Step 6: Add System Instructions and Configure Safety Settings
For consistent business outputs, configure the model with a system prompt and adjust safety thresholds:
import os
import google.generativeai as genai
from dotenv import load_dotenv
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel(
    model_name="gemini-1.5-flash",
    system_instruction="You are a business data analyst. Respond in plain English. Use bullet points. Keep responses under 200 words unless asked to elaborate.",
    generation_config=genai.GenerationConfig(
        temperature=0.2,
        max_output_tokens=1000,
        response_mime_type="text/plain"
    ),
    # Raise the block threshold so borderline business content
    # (e.g. legal or insurance text) is not filtered unnecessarily
    safety_settings=[
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
    ]
)
# For JSON output, change response_mime_type:
model_json = genai.GenerativeModel(
    model_name="gemini-1.5-flash",
    generation_config=genai.GenerationConfig(
        temperature=0.1,
        response_mime_type="application/json"
    )
)
response = model_json.generate_content(
    'Extract the company name, date, and total amount from this invoice text: "Invoice from Acme Corp, dated 2024-05-15, total $1,250.00"'
)
print(response.text)  # Returns a JSON string
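Because response_mime_type is application/json, the response text is a JSON string you can parse directly into a dict. The sample string below stands in for a live response.text:

```python
import json

# Stand-in for response.text from a JSON-mode model
sample = '{"company": "Acme Corp", "date": "2024-05-15", "total": "$1,250.00"}'

invoice = json.loads(sample)
print(invoice["company"])  # Acme Corp
print(invoice["total"])    # $1,250.00
```

From here the extracted fields can go straight into a spreadsheet row or database insert.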
Step 7: Build a Reusable Wrapper
import os
import google.generativeai as genai
from dotenv import load_dotenv
from typing import Optional
import PIL.Image
load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
def ask_gemini(
    prompt: str,
    image_path: Optional[str] = None,
    model_name: str = "gemini-1.5-flash",
    system_prompt: Optional[str] = None,
    temperature: float = 0.3
) -> str:
    """
    Send a request to Gemini, optionally with an image.

    Args:
        prompt: Text question or instruction
        image_path: Optional path to an image file
        model_name: gemini-1.5-flash (fast/cheap) or gemini-1.5-pro (powerful)
        system_prompt: Optional behavior instructions
        temperature: 0.0 deterministic, 1.0 creative

    Returns:
        Response text
    """
    model = genai.GenerativeModel(
        model_name=model_name,
        system_instruction=system_prompt,
        generation_config=genai.GenerationConfig(
            temperature=temperature,
            max_output_tokens=2000
        )
    )
    content = [prompt]
    if image_path:
        image = PIL.Image.open(image_path)
        content.append(image)
    response = model.generate_content(content)
    return response.text

if __name__ == "__main__":
    result = ask_gemini(
        "What are 5 ways a law firm could use AI to save time on admin tasks?",
        system_prompt="You are a business consultant. Be specific and practical."
    )
    print(result)
What to Build Next
- Connect Gemini to your Google Drive to process documents as they are uploaded
- Use the 1M context window to analyze entire CRM export files in one call
- Combine Gemini Vision with a file watcher to auto-process incoming invoice scans
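For the file-watcher idea, a simple polling scan is often enough to start with. This sketch only covers the scanning half (the function name and pattern list are mine); each path it returns would be handed to a wrapper like ask_gemini from Step 7:

```python
from pathlib import Path

def find_new_files(folder: str, seen: set, patterns=(".pdf", ".png", ".jpg")) -> list:
    """Return files in `folder` matching `patterns` that haven't been seen,
    recording them in `seen` so each file is processed only once."""
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in patterns and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Run it in a loop with a time.sleep between scans, passing each new file to your Gemini wrapper.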
Related Reading
- How to Set Up Anthropic Claude with System Prompts - Compare system prompt behavior across models
- How to Stream AI Responses in Real-Time - Streaming works with Gemini too, same pattern
- How to Create AI API Keys Securely - Keep your Google API key safe
Want this system built for your business?
Get a free assessment. We will map every system your business needs and show you the ROI.
Get Your Free Assessment