GenAI Fundamentals Cheat Sheet
Your complete reference for GenAI concepts, models, and parameters. Bookmark this page!
Model Quick Reference
Closed Source Models
| Model | Provider | Context | Best For | Cost |
|---|---|---|---|---|
| GPT-4o | OpenAI | 128K | General purpose, multimodal | $$$ |
| GPT-4o mini | OpenAI | 128K | Budget tasks, high volume | $ |
| Claude 3.5 Sonnet | Anthropic | 200K | Coding, analysis, long docs | $$$ |
| Claude 3.5 Haiku | Anthropic | 200K | Fast, simple tasks | $ |
| Gemini 1.5 Pro | Google | 1M+ | Very long context | $$ |
| Gemini 1.5 Flash | Google | 1M+ | Fast, budget multimodal | $ |
Open Source Models
| Model | Provider | Context | Parameters | Run Locally? |
|---|---|---|---|---|
| Llama 3.1 405B | Meta | 128K | 405B | Needs serious GPUs |
| Llama 3 70B | Meta | 8K | 70B | Needs good GPU |
| Llama 3 8B | Meta | 8K | 8B | Runs on consumer GPU |
| Mixtral 8x7B | Mistral | 32K | 47B (sparse) | Runs on good GPU |
| Mistral 7B | Mistral | 32K | 7B | Runs on consumer GPU |
Parameter Tuning Guide
Temperature Settings by Task
| Task | Temperature | Why |
|---|---|---|
| Code generation | 0 | Deterministic, consistent output |
| Bug fixing | 0 | Accuracy matters most |
| Data extraction | 0 | Need exact, consistent results |
| Technical writing | 0.3 | Slightly creative but factual |
| Summarization | 0.3 | Faithful to source material |
| General Q&A | 0.7 | Balanced default |
| Creative writing | 1.0 | More varied, expressive output |
| Brainstorming | 1.0–1.5 | Maximum creativity |
Common Parameter Combinations
```ts
// Precise and factual (code, data extraction, math)
{ temperature: 0, max_tokens: 2048, top_p: 1 }

// Balanced (general conversation, explanations)
{ temperature: 0.7, max_tokens: 1024, top_p: 1 }

// Creative (brainstorming, writing, ideation)
{ temperature: 1.2, max_tokens: 2048, presence_penalty: 0.6 }

// Structured output (JSON, data extraction)
{ temperature: 0, max_tokens: 1024, response_format: { type: "json_object" } }
```
Token Quick Reference
1 token ≈ 4 characters ≈ ¾ of a word
100 tokens ≈ 75 words
1K tokens ≈ 750 words ≈ 1.5 pages
10K tokens ≈ 7,500 words ≈ 15 pages
100K tokens ≈ 75,000 words ≈ a novel
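The ~4 characters per token rule of thumb is easy to turn into a quick estimator. This is a rough heuristic only: real tokenizers vary by model, so use a library like tiktoken when you need exact counts. The `estimateTokens` helper below is a hypothetical name, not part of any SDK:

```typescript
// Rough token estimate using the ~4 characters per token rule of thumb.
// Real tokenizers differ by model; this is only for ballparking costs and limits.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// A ~750-word page of repeated 5-character words is 3,750 characters,
// which this heuristic puts at roughly 1K tokens.
const page = "word ".repeat(750);
console.log(estimateTokens(page)); // 938
```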
Estimating Costs
GPT-4o mini at $0.15/1M input tokens:
1,000 short queries (~500 tokens each) = $0.075
That's less than a penny per hundred queries.
GPT-4o at $2.50/1M input tokens:
1,000 short queries (~500 tokens each) = $1.25
Claude 3.5 Sonnet at $3.00/1M input tokens:
1,000 short queries (~500 tokens each) = $1.50
API Call Templates
OpenAI
```ts
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Your prompt here" },
    ],
    temperature: 0.7,
    max_tokens: 1024,
  }),
});

const data = await response.json();
const answer = data.choices[0].message.content;
```
Anthropic (Claude)
```ts
const response = await fetch("https://api.anthropic.com/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": process.env.ANTHROPIC_API_KEY!,
    "anthropic-version": "2023-06-01",
  },
  body: JSON.stringify({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    system: "You are a helpful assistant.",
    messages: [{ role: "user", content: "Your prompt here" }],
  }),
});

const data = await response.json();
const answer = data.content[0].text;
```
Google (Gemini)
```ts
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1/models/gemini-1.5-flash:generateContent?key=${process.env.GOOGLE_AI_KEY}`,
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{ parts: [{ text: "Your prompt here" }] }],
      generationConfig: { temperature: 0.7, maxOutputTokens: 1024 },
    }),
  }
);

const data = await response.json();
const answer = data.candidates[0].content.parts[0].text;
```
Key Terms Glossary
| Term | Definition |
|---|---|
| AI (Artificial Intelligence) | Software that performs tasks requiring human-like intelligence |
| ML (Machine Learning) | AI that learns patterns from data instead of explicit rules |
| Deep Learning | ML using multi-layered neural networks |
| GenAI (Generative AI) | AI that creates new content (text, images, code, audio) |
| LLM (Large Language Model) | A deep learning model trained on massive text data |
| Transformer | The neural network architecture underlying virtually all modern LLMs |
| Token | The unit of text LLMs process (~4 characters) |
| Context Window | Maximum tokens a model can process at once |
| Temperature | Controls randomness of output (0 = focused, 1+ = creative) |
| top_p | Nucleus sampling — limits token selection to most probable |
| max_tokens | Hard limit on response length |
| Embedding | Numerical vector representation of text meaning |
| Attention | Mechanism that lets the model relate different parts of text |
| Hallucination | When an LLM generates confident but incorrect information |
| Fine-tuning | Adapting a pre-trained model to specific tasks or data |
| RLHF | Reinforcement Learning from Human Feedback — aligning models |
| RAG | Retrieval Augmented Generation — combining search with generation |
| Prompt Engineering | Crafting inputs to get better model outputs |
| System Prompt | Instructions that set the model's behavior and persona |
| Zero-shot | Asking the model to do something without examples |
| Few-shot | Providing examples in the prompt to guide output format |
| Inference | Running a trained model to generate output |
| Latency | Time between sending a request and getting a response |
| Throughput | Number of requests a model can handle per second |
| Multimodal | Models that handle multiple types (text + images + audio) |
| Agent | An AI system that can take actions and use tools autonomously |
| Vector Database | Database optimized for storing and searching embeddings |
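Several of these terms (system prompt, few-shot, zero-shot) show up concretely in the `messages` array of a chat request. Here is a minimal few-shot sketch; the classification task and examples are made up for illustration, not tied to any particular provider:

```typescript
// Few-shot sentiment classification: the system prompt sets the behavior,
// and two worked user/assistant example pairs guide the output format
// before the real input. Dropping the pairs would make this zero-shot.
const messages = [
  {
    role: "system",
    content:
      "You are a sentiment classifier. Reply with exactly one word: positive, negative, or neutral.",
  },
  // Few-shot examples
  { role: "user", content: "I love this product!" },
  { role: "assistant", content: "positive" },
  { role: "user", content: "The package arrived late and damaged." },
  { role: "assistant", content: "negative" },
  // The actual input to classify
  { role: "user", content: "It works as described." },
];

console.log(messages.length); // 6
```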
Decision Tree: Which Model Should I Use?
START: What am I building?
│
├─ Learning / experimenting?
│ └─→ GPT-4o mini (cheapest, good quality)
│
├─ Code-heavy application?
│ ├─ Need best quality → Claude 3.5 Sonnet
│ └─ Budget-conscious → GPT-4o mini
│
├─ Processing very long documents?
│ └─→ Gemini 1.5 Pro (1M+ tokens)
│
├─ High-volume production (100K+ requests/day)?
│ ├─ Quality critical → GPT-4o or Claude Sonnet
│ ├─ Good enough quality → GPT-4o mini or Claude Haiku
│ └─ Maximum control → Llama 3 via Groq/Together
│
├─ Data must stay private?
│ ├─ Can manage GPUs → Self-host Llama 3
│ └─ Need managed → Azure OpenAI or GCP Vertex AI
│
├─ Multimodal (text + images)?
│ ├─ Best quality → GPT-4o
│ └─ Budget → Gemini 1.5 Flash
│
└─ Simple classification / routing?
└─→ GPT-4o mini or any small model
AI Prompt Templates for Learning More
Use these with any AI assistant to deepen your understanding:
Explain a concept: "Explain [concept] like I'm a developer who has never worked with AI. Use a coding analogy."
Compare technologies: "Compare [A] vs [B] for [use case]. Give me a table with pros, cons, pricing, and your recommendation."
Architecture advice: "I'm building [describe app]. Design the AI architecture — which models, what parameters, how to handle [specific challenge]."
Debug AI issues: "My AI feature is [describe problem]. Here's my prompt: [paste prompt]. Here's the output: [paste output]. What's wrong and how do I fix it?"
Optimize costs: "I'm spending $[amount]/month on AI API calls. Here's my usage: [describe]. How can I reduce costs without losing quality?"
Learn by building: "Walk me through building a [type of AI feature] from scratch. Start with the simplest version and then show me how to improve it step by step."
Common Patterns Reference
System Prompt Template
```ts
const systemPrompt = `You are a [role] specialized in [domain].

Your responsibilities:
- [Responsibility 1]
- [Responsibility 2]
- [Responsibility 3]

Rules:
- [Constraint 1]
- [Constraint 2]
- If you're unsure, say "I'm not certain about this."
- Always provide sources when citing facts.

Output format:
[Describe expected format — JSON, markdown, plain text, etc.]`;
```
Error Handling for AI Calls
```ts
async function callAI(messages: Message[], retries = 3): Promise<string> {
  for (let i = 0; i < retries; i++) {
    try {
      const response = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
        },
        body: JSON.stringify({
          model: "gpt-4o-mini",
          messages,
          temperature: 0.7,
        }),
      });

      if (response.status === 429) {
        // Rate limited — wait with exponential backoff and retry
        const waitTime = Math.pow(2, i) * 1000;
        console.log(`Rate limited. Waiting ${waitTime}ms...`);
        await new Promise(resolve => setTimeout(resolve, waitTime));
        continue;
      }

      if (!response.ok) {
        throw new Error(`API error: ${response.status}`);
      }

      const data = await response.json();
      return data.choices[0].message.content;
    } catch (error) {
      if (i === retries - 1) throw error;
      console.log(`Attempt ${i + 1} failed. Retrying...`);
    }
  }
  throw new Error("All retries failed");
}
```
The Complete GenAI Developer Workflow
1. UNDERSTAND the problem — What AI capability do you need?
2. CHOOSE a model — Match capability to cost and requirements
3. DESIGN the prompt — System prompt + user prompt + examples
4. SET parameters — Temperature, max_tokens, response format
5. BUILD the integration — API calls, error handling, caching
6. TEST thoroughly — Edge cases, hallucinations, cost monitoring
7. OPTIMIZE — Adjust prompts, try different models, reduce tokens
8. MONITOR — Track costs, quality, latency in production
Keep Learning
This book covered the fundamentals. Here's where to go next:
- Prompt Engineering — Learn advanced techniques for getting better results
- Building with AI APIs — Hands-on projects with OpenAI, Anthropic, and Google
- RAG Systems — Build AI that uses your own data
- AI Agents — Build autonomous AI systems that use tools
- Fine-tuning — Customize models for your specific domain
Remember: The best way to learn GenAI is by building. Start with a simple project — a chatbot, a summarizer, a code helper — and iterate. Every project teaches you something new about how these models work and how to use them effectively.
Happy building!