
    GenAI Fundamentals Cheat Sheet

    Your complete reference for GenAI concepts, models, and parameters. Bookmark this page!

    Model Quick Reference

    Closed Source Models

    Model              Provider   Context  Best For                     Cost
    GPT-4o             OpenAI     128K     General purpose, multimodal  $$$
    GPT-4o mini        OpenAI     128K     Budget tasks, high volume    $
    Claude 3.5 Sonnet  Anthropic  200K     Coding, analysis, long docs  $$$
    Claude 3.5 Haiku   Anthropic  200K     Fast, simple tasks           $
    Gemini 1.5 Pro     Google     1M+      Very long context            $$
    Gemini 1.5 Flash   Google     1M+      Fast, budget multimodal      $

    Open Source Models

    Model           Provider  Context  Parameters    Run Locally?
    Llama 3.1 405B  Meta      128K     405B          Needs serious GPUs
    Llama 3 70B     Meta      8K       70B           Needs good GPU
    Llama 3 8B      Meta      8K       8B            Runs on consumer GPU
    Mixtral 8x7B    Mistral   32K      47B (sparse)  Runs on good GPU
    Mistral 7B      Mistral   32K      7B            Runs on consumer GPU

    Parameter Tuning Guide

    Temperature Settings by Task

    Task               Temperature  Why
    Code generation    0            Deterministic, consistent output
    Bug fixing         0            Accuracy matters most
    Data extraction    0            Need exact, consistent results
    Technical writing  0.3          Slightly creative but factual
    Summarization      0.3          Faithful to source material
    General Q&A        0.7          Balanced default
    Creative writing   1.0          More varied, expressive output
    Brainstorming      1.0–1.5      Maximum creativity

    Common Parameter Combinations

    // Precise and factual (code, data extraction, math)
    {
      temperature: 0,
      max_tokens: 2048,
      top_p: 1,
    }
    
    // Balanced (general conversation, explanations)
    {
      temperature: 0.7,
      max_tokens: 1024,
      top_p: 1,
    }
    
    // Creative (brainstorming, writing, ideation)
    {
      temperature: 1.2,
      max_tokens: 2048,
      presence_penalty: 0.6,
    }
    
    // Structured output (JSON, data extraction)
    {
      temperature: 0,
      max_tokens: 1024,
      response_format: { type: "json_object" },
    }
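    In application code it helps to keep these combinations in one place. A minimal sketch in TypeScript (the preset names and the `withPreset` helper are our own, not part of any API):

```typescript
// Illustrative preset map for the parameter combinations above.
// Preset names ("precise", "balanced", ...) are our own convention.
interface GenerationParams {
  temperature: number;
  max_tokens: number;
  top_p?: number;
  presence_penalty?: number;
  response_format?: { type: string };
}

const PRESETS: Record<string, GenerationParams> = {
  precise:    { temperature: 0,   max_tokens: 2048, top_p: 1 },
  balanced:   { temperature: 0.7, max_tokens: 1024, top_p: 1 },
  creative:   { temperature: 1.2, max_tokens: 2048, presence_penalty: 0.6 },
  structured: { temperature: 0,   max_tokens: 1024, response_format: { type: "json_object" } },
};

// Merge a preset into a request body, letting callers override fields.
function withPreset(
  preset: keyof typeof PRESETS,
  overrides: Partial<GenerationParams> = {}
): GenerationParams {
  return { ...PRESETS[preset], ...overrides };
}
```

    Spreading `overrides` last means a caller can start from a preset and tweak one field, e.g. `withPreset("balanced", { max_tokens: 512 })`.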

    Token Quick Reference

    1 token    ≈ 4 characters ≈ ¾ of a word
    100 tokens ≈ 75 words
    1K tokens  ≈ 750 words ≈ 1.5 pages
    10K tokens ≈ 7,500 words ≈ 15 pages
    100K tokens ≈ 75,000 words ≈ a novel
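    These rules of thumb translate directly into quick estimator functions. This is a heuristic sketch only; for exact counts use a real tokenizer such as tiktoken:

```typescript
// Rough token estimates from the rules of thumb above:
// ~4 characters per token, ~0.75 words per token.
function estimateTokensFromChars(text: string): number {
  return Math.ceil(text.length / 4);
}

function estimateTokensFromWords(wordCount: number): number {
  return Math.ceil(wordCount / 0.75);
}
```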
    

    Estimating Costs

    GPT-4o mini at $0.15/1M input tokens:
      1,000 short queries (~500 tokens each) = $0.075
      That's under a penny per hundred queries.
    
    GPT-4o at $2.50/1M input tokens:
      1,000 short queries (~500 tokens each) = $1.25
    
    Claude 3.5 Sonnet at $3.00/1M input tokens:
      1,000 short queries (~500 tokens each) = $1.50
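    The arithmetic above generalizes to a one-liner. Prices change over time, so treat them as inputs; note this covers input tokens only, and output tokens are usually priced higher:

```typescript
// cost = queries * tokensPerQuery * (price per token).
// pricePer1M is the model's price per 1M input tokens in dollars.
function inputCost(queries: number, tokensPerQuery: number, pricePer1M: number): number {
  return (queries * tokensPerQuery * pricePer1M) / 1_000_000;
}

inputCost(1000, 500, 0.15); // GPT-4o mini       → $0.075
inputCost(1000, 500, 2.5);  // GPT-4o            → $1.25
inputCost(1000, 500, 3.0);  // Claude 3.5 Sonnet → $1.50
```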
    

    API Call Templates

    OpenAI

    const response = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: "gpt-4o-mini",
        messages: [
          { role: "system", content: "You are a helpful assistant." },
          { role: "user", content: "Your prompt here" },
        ],
        temperature: 0.7,
        max_tokens: 1024,
      }),
    });
    const data = await response.json();
    const answer = data.choices[0].message.content;

    Anthropic (Claude)

    const response = await fetch("https://api.anthropic.com/v1/messages", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": process.env.ANTHROPIC_API_KEY!,
        "anthropic-version": "2023-06-01",
      },
      body: JSON.stringify({
        model: "claude-3-5-sonnet-20241022",
        max_tokens: 1024,
        system: "You are a helpful assistant.",
        messages: [{ role: "user", content: "Your prompt here" }],
      }),
    });
    const data = await response.json();
    const answer = data.content[0].text;

    Google (Gemini)

    const response = await fetch(
      `https://generativelanguage.googleapis.com/v1/models/gemini-1.5-flash:generateContent?key=${process.env.GOOGLE_AI_KEY}`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          contents: [{ parts: [{ text: "Your prompt here" }] }],
          generationConfig: { temperature: 0.7, maxOutputTokens: 1024 },
        }),
      }
    );
    const data = await response.json();
    const answer = data.candidates[0].content.parts[0].text;

    Key Terms Glossary

    Term                          Definition
    AI (Artificial Intelligence)  Software that performs tasks requiring human-like intelligence
    ML (Machine Learning)         AI that learns patterns from data instead of explicit rules
    Deep Learning                 ML using multi-layered neural networks
    GenAI (Generative AI)         AI that creates new content (text, images, code, audio)
    LLM (Large Language Model)    A deep learning model trained on massive text data
    Transformer                   The neural network architecture behind virtually all modern LLMs
    Token                         The unit of text LLMs process (~4 characters)
    Context Window                Maximum tokens a model can process at once
    Temperature                   Controls randomness of output (0 = focused, 1+ = creative)
    top_p                         Nucleus sampling — limits token selection to the most probable
    max_tokens                    Hard limit on response length
    Embedding                     Numerical vector representation of text meaning
    Attention                     Mechanism that lets the model relate different parts of text
    Hallucination                 When an LLM generates confident but incorrect information
    Fine-tuning                   Adapting a pre-trained model to specific tasks or data
    RLHF                          Reinforcement Learning from Human Feedback — aligning models with human preferences
    RAG                           Retrieval Augmented Generation — combining search with generation
    Prompt Engineering            Crafting inputs to get better model outputs
    System Prompt                 Instructions that set the model's behavior and persona
    Zero-shot                     Asking the model to do something without examples
    Few-shot                      Providing examples in the prompt to guide output format
    Inference                     Running a trained model to generate output
    Latency                       Time between sending a request and getting a response
    Throughput                    Number of requests a model can handle per second
    Multimodal                    Models that handle multiple input types (text + images + audio)
    Agent                         An AI system that can take actions and use tools autonomously
    Vector Database               Database optimized for storing and searching embeddings

    Decision Tree: Which Model Should I Use?

    START: What am I building?
    │
    ├─ Learning / experimenting?
    │  └─→ GPT-4o mini (cheapest, good quality)
    │
    ├─ Code-heavy application?
    │  ├─ Need best quality → Claude 3.5 Sonnet
    │  └─ Budget-conscious → GPT-4o mini
    │
    ├─ Processing very long documents?
    │  └─→ Gemini 1.5 Pro (1M+ tokens)
    │
    ├─ High-volume production (100K+ requests/day)?
    │  ├─ Quality critical → GPT-4o or Claude Sonnet
    │  ├─ Good enough quality → GPT-4o mini or Claude Haiku
    │  └─ Maximum control → Llama 3 via Groq/Together
    │
    ├─ Data must stay private?
    │  ├─ Can manage GPUs → Self-host Llama 3
    │  └─ Need managed → Azure OpenAI or GCP Vertex AI
    │
    ├─ Multimodal (text + images)?
    │  ├─ Best quality → GPT-4o
    │  └─ Budget → Gemini 1.5 Flash
    │
    └─ Simple classification / routing?
       └─→ GPT-4o mini or any small model
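    The branches above can be encoded as a simple routing function. This is an illustrative sketch: the `Requirements` shape and field names are our own, and it deliberately simplifies some branches (e.g. the self-hosted Llama option for high volume):

```typescript
// Illustrative encoding of the decision tree above; adapt to your app.
interface Requirements {
  useCase: "learning" | "code" | "long-docs" | "high-volume"
         | "private" | "multimodal" | "classification";
  qualityCritical?: boolean;
  budgetConscious?: boolean;
  canManageGPUs?: boolean;
}

function pickModel(req: Requirements): string {
  switch (req.useCase) {
    case "learning":       return "gpt-4o-mini";
    case "code":           return req.budgetConscious ? "gpt-4o-mini" : "claude-3-5-sonnet";
    case "long-docs":      return "gemini-1.5-pro";
    case "high-volume":    return req.qualityCritical ? "gpt-4o" : "gpt-4o-mini";
    case "private":        return req.canManageGPUs ? "llama-3 (self-hosted)" : "azure-openai";
    case "multimodal":     return req.budgetConscious ? "gemini-1.5-flash" : "gpt-4o";
    case "classification": return "gpt-4o-mini";
  }
}
```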
    

    AI Prompt Templates for Learning More

    Use these with any AI assistant to deepen your understanding:

    Explain a concept: "Explain [concept] like I'm a developer who has never worked with AI. Use a coding analogy."

    Compare technologies: "Compare [A] vs [B] for [use case]. Give me a table with pros, cons, pricing, and your recommendation."

    Architecture advice: "I'm building [describe app]. Design the AI architecture — which models, what parameters, how to handle [specific challenge]."

    Debug AI issues: "My AI feature is [describe problem]. Here's my prompt: [paste prompt]. Here's the output: [paste output]. What's wrong and how do I fix it?"

    Optimize costs: "I'm spending $[amount]/month on AI API calls. Here's my usage: [describe]. How can I reduce costs without losing quality?"

    Learn by building: "Walk me through building a [type of AI feature] from scratch. Start with the simplest version and then show me how to improve it step by step."

    Common Patterns Reference

    System Prompt Template

    const systemPrompt = `You are a [role] specialized in [domain].
    
    Your responsibilities:
    - [Responsibility 1]
    - [Responsibility 2]
    - [Responsibility 3]
    
    Rules:
    - [Constraint 1]
    - [Constraint 2]
    - If you're unsure, say "I'm not certain about this."
    - Always provide sources when citing facts.
    
    Output format:
    [Describe expected format — JSON, markdown, plain text, etc.]`;

    Error Handling for AI Calls

    async function callAI(messages: Message[], retries = 3): Promise<string> {
      for (let i = 0; i < retries; i++) {
        try {
          const response = await fetch("https://api.openai.com/v1/chat/completions", {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
              "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
            },
            body: JSON.stringify({
              model: "gpt-4o-mini",
              messages,
              temperature: 0.7,
            }),
          });
    
          if (response.status === 429) {
            // Rate limited — wait and retry
            const waitTime = Math.pow(2, i) * 1000;
            console.log(`Rate limited. Waiting ${waitTime}ms...`);
            await new Promise(resolve => setTimeout(resolve, waitTime));
            continue;
          }
    
          if (!response.ok) {
            throw new Error(`API error: ${response.status}`);
          }
    
          const data = await response.json();
          return data.choices[0].message.content;
        } catch (error) {
          if (i === retries - 1) throw error;
          console.log(`Attempt ${i + 1} failed. Retrying...`);
        }
      }
      throw new Error("All retries failed");
    }

    The Complete GenAI Developer Workflow

    1. UNDERSTAND the problem — What AI capability do you need?
    2. CHOOSE a model — Match capability to cost and requirements
    3. DESIGN the prompt — System prompt + user prompt + examples
    4. SET parameters — Temperature, max_tokens, response format
    5. BUILD the integration — API calls, error handling, caching
    6. TEST thoroughly — Edge cases, hallucinations, cost monitoring
    7. OPTIMIZE — Adjust prompts, try different models, reduce tokens
    8. MONITOR — Track costs, quality, latency in production
    

    Keep Learning

    This book covered the fundamentals. Here's where to go next:

    • Prompt Engineering — Learn advanced techniques for getting better results
    • Building with AI APIs — Hands-on projects with OpenAI, Anthropic, and Google
    • RAG Systems — Build AI that uses your own data
    • AI Agents — Build autonomous AI systems that use tools
    • Fine-tuning — Customize models for your specific domain

    Remember: The best way to learn GenAI is by building. Start with a simple project — a chatbot, a summarizer, a code helper — and iterate. Every project teaches you something new about how these models work and how to use them effectively.

    Happy building!


    🌐 www.genai-mentor.ai