100 GenAI Interview Questions
Terms in this set (100)
1
What is generative AI?
AI that can create new content such as text, images, or code
2
How does generative AI differ from discriminative AI?
Generative models learn the joint distribution p(x, y); discriminative models learn the conditional p(y|x)
3
What is a Large Language Model (LLM)?
A neural network trained on large text corpora to model language
4
Which architecture underpins most state-of-the-art LLMs?
Transformers
5
What does the self-attention mechanism do in Transformers?
Computes weighted interactions between all token pairs in a sequence
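The "weighted interactions" in the answer can be sketched numerically. Below is a minimal single-head, single-example sketch in numpy (random weight matrices, no masking or multi-head logic, purely illustrative):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a token sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise token interactions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
    return weights @ V                               # weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)                  # same shape as X: one vector per token
```

Each output row is a convex combination of the value vectors, with weights determined by query-key similarity for that token.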
6
Why are positional encodings used in Transformers?
To provide token order information
7
What is tokenization in LLMs?
Splitting text into subword units or tokens
8
What is the context window of an LLM?
The maximum number of tokens the model can consider in one prompt
9
What does 'temperature' control in LLM sampling?
Randomness of the output distribution
10
What is 'top-k' or 'top-p' sampling used for?
Restricting sampling to the most likely tokens to control diversity
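How temperature and top-p interact can be shown in a minimal sampling sketch (toy logits, pure-stdlib, not any provider's API):

```python
import math, random

def sample(logits, temperature=1.0, top_p=1.0, seed=None):
    """Sample a token index with temperature scaling and nucleus (top-p) filtering."""
    scaled = [l / temperature for l in logits]       # lower temp = sharper distribution
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # keep the smallest set of tokens whose cumulative probability reaches top_p
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in ranked:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    rng = random.Random(seed)
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]

token = sample([2.0, 1.0, 0.1, -1.0], temperature=0.7, top_p=0.9, seed=42)
```

With a very small top_p only the most likely token survives the nucleus filter, which is why low top_p (like low temperature) reduces output diversity.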
11
What is Retrieval-Augmented Generation (RAG)?
A method that combines retrieval from external data with generation by an LLM
12
Which components are core to a RAG system?
Retriever, knowledge store, and generator
13
Why use RAG instead of relying solely on an LLM’s internal knowledge?
To ground answers in up-to-date, domain-specific data
14
What is typically stored in the knowledge store for RAG?
Documents or chunks with associated embeddings
15
What does the retriever do in RAG?
Fetches relevant documents given a query embedding
16
Which type of database is commonly used for RAG similarity search?
Vector database or embedding index
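Under the hood, that similarity search is nearest-neighbor lookup over embeddings. A brute-force sketch (toy 3-d vectors and hypothetical doc ids; real systems use approximate indexes):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, index, k=2):
    """Return the top-k (score, doc_id) pairs by cosine similarity."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    return sorted(scored, reverse=True)[:k]

index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
top = retrieve([1.0, 0.05, 0.0], index, k=2)   # doc_a and doc_b outrank doc_c
```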
17
What is 'chunking' in RAG pipelines?
Splitting documents into smaller text segments for embedding
18
Why is chunk size important in RAG?
It trades off retrieval granularity vs. context completeness
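A minimal character-based chunker makes the trade-off concrete (chunk_size and overlap values are illustrative; production pipelines often chunk by tokens or sentences instead):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character chunks for embedding.

    Larger chunks keep more context per chunk; smaller chunks make
    retrieval more precise -- the granularity/completeness trade-off.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap        # slide forward, keeping some overlap
    return chunks

chunks = chunk_text("word " * 200, chunk_size=100, overlap=20)
```

The overlap means a sentence cut at a chunk boundary still appears whole in the neighboring chunk.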
19
What is 'query rewriting' in advanced RAG systems?
Rephrasing or expanding the query to improve retrieval quality
20
Which technique improves RAG by re-ranking retrieved documents?
Cross-encoder re-ranking
21
What is an 'AI agent' in the LLM ecosystem?
A system where an LLM can perceive context, plan, call tools, and act in a loop
22
What does 'tool use' or 'function calling' enable for LLM agents?
Calling external APIs, databases, or functions from model outputs
23
Which is a common pattern for an LLM agent loop?
Observe → Plan → Act → Observe
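That loop can be sketched with a stubbed planner and tool registry (all names here are hypothetical; a real planner would be an LLM call returning a tool choice):

```python
def run_agent(goal, plan_fn, tools, max_steps=5):
    """Minimal observe-plan-act loop.

    plan_fn stands in for an LLM: given the goal and observations so far,
    it returns (tool_name, arg) or ("finish", answer).
    """
    observations = []
    for _ in range(max_steps):              # bound steps to avoid runaway loops
        tool_name, arg = plan_fn(goal, observations)
        if tool_name == "finish":
            return arg
        result = tools[tool_name](arg)      # act: invoke the chosen tool
        observations.append(result)         # observe: feed the result back in
    return None                             # step budget exhausted

# Toy planner: look the goal up once, then finish with what we saw.
def toy_planner(goal, observations):
    if not observations:
        return ("lookup", goal)
    return ("finish", observations[-1])

tools = {"lookup": lambda q: f"result for {q!r}"}
answer = run_agent("capital of France", toy_planner, tools)
```

Note the max_steps bound: it is the simplest guard against the unconstrained-agent risks covered later in this set.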
24
What is a 'multi-agent' system in GenAI?
Multiple specialized agents collaborating or competing on tasks
25
Why are tools important for production agents?
They let agents access real-time data, systems, and side effects
26
What is a 'planner' in an agent architecture?
A component that breaks goals into steps or sub-tasks
27
What is a 'memory' module for agents?
Structured store for past interactions, facts, or state
28
Short-term conversational memory is typically implemented using:
Appending recent messages into the context window
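A sketch of that rolling window, approximating token cost by word count (an assumption for brevity; a real system would use the model's tokenizer):

```python
def rolling_window(messages, max_tokens=50):
    """Keep the most recent messages that fit a rough token budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = len(msg["content"].split())  # crude proxy for token count
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [
    {"role": "user", "content": "one " * 30},
    {"role": "assistant", "content": "two " * 30},
    {"role": "user", "content": "three " * 30},
]
window = rolling_window(history, max_tokens=70)   # oldest message gets dropped
```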
29
Long-term memory for agents is often stored in:
A vector store or database
30
Which is a risk of unconstrained autonomous agents?
Unbounded tool calls, costs, and harmful or unintended actions
31
What is prompt engineering?
Designing and structuring inputs to steer LLM behavior
32
Which is an example of a prompt engineering technique?
Few-shot prompting with examples
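A minimal sketch of assembling such a prompt (the Input/Output layout is illustrative, not any provider's required schema):

```python
def few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and the new query into one prompt."""
    lines = [instruction, ""]
    for inp, out in examples:               # each example shows the desired mapping
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")                 # model completes from here
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I loved it", "positive"), ("Terrible service", "negative")],
    "Great value for money",
)
```

The trailing "Output:" steers the model to continue the established pattern with no weight updates, which is the point of few-shot prompting.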
33
What is chain-of-thought (CoT) prompting?
Asking the model to show intermediate reasoning steps
34
Why do many providers restrict explicit chain-of-thought output?
To reduce risk of over-reliance, leakage, and misinterpretation of internal reasoning
35
What is 'system message' vs 'user message' in chat-based LLM APIs?
System sets high-level behavior, user provides the query/task
36
What is hallucination in LLMs?
The model generating confident but factually incorrect content
37
Which strategy helps reduce hallucinations?
RAG with grounded retrieval and citation
38
What is a safety or guardrail layer in GenAI systems?
A post-processing or filtering step to block unsafe or disallowed outputs
39
What is jailbreak testing?
Trying to bypass or defeat safety constraints of LLMs
40
Which is a common evaluation approach for GenAI answers?
Human evaluation or model-based rubric scoring
41
Which metric is often used for text similarity in GenAI evaluation?
BLEU / ROUGE / BERTScore
42
What is model-based evaluation in LLM systems?
Using another model (or the same) as a 'judge' to rate outputs
43
What is instruction tuning?
Fine-tuning a model on (instruction, response) pairs to follow instructions better
44
What is supervised fine-tuning (SFT) in LLM training?
Training the model on curated input-output examples with cross-entropy loss
45
What is RLHF (Reinforcement Learning from Human Feedback)?
Reinforcement learning where human preferences guide a reward model
46
What is DPO (Direct Preference Optimization) conceptually used for?
Directly aligning model outputs with preference pairs without an explicit reward model
47
What is LoRA (Low-Rank Adaptation) used for?
Parameter-efficient fine-tuning by adding low-rank matrices to weight updates
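The low-rank idea can be sketched in numpy (toy dimensions; B is initialized to zero as in the usual LoRA setup, so training starts from exactly the frozen weights):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Forward pass with a LoRA update: y = x @ (W + (alpha / r) * B @ A).T

    W is frozen; only the low-rank factors A (r x d_in) and B (d_out x r)
    are trained, shrinking trainable parameters from d_out * d_in
    to r * (d_in + d_out).
    """
    r = A.shape[0]
    delta = (alpha / r) * (B @ A)        # low-rank weight update
    return x @ (W + delta).T

d_in, d_out, r = 16, 8, 2
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))       # frozen base weight
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))                 # zero init: delta starts at zero
x = rng.normal(size=(3, d_in))
y = lora_forward(x, W, A, B, alpha=4)    # identical to the base model before training
```

Here the trainable parameters drop from 128 (8x16) to 48 (2x16 + 8x2), which is the cost saving the next card refers to.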
48
Why is parameter-efficient fine-tuning popular?
It allows adapting large models with fewer trainable parameters and lower cost
49
What is knowledge distillation in the context of LLMs?
Training a smaller student model to imitate a larger teacher model
50
What is the main idea behind Mixture-of-Experts (MoE) LLMs?
Multiple expert subnetworks where only a subset is activated per token
51
Why are MoE architectures attractive for scaling?
They allow very large parameter counts while keeping per-token compute manageable
52
What is a 'small language model' (SLM) in current discussions?
A loosely defined term, often meaning models under roughly 10B parameters, optimized for on-device or low-latency use
53
Which is a key tradeoff between large LLMs and SLMs?
SLMs tend to be faster/cheaper but may have lower general capability
54
What is a diffusion model in generative AI?
A model that iteratively denoises random noise to generate images or other data
55
Which generative paradigm is most associated with image generation today?
Diffusion models and latent diffusion
56
What is a Generative Adversarial Network (GAN)?
Two models, generator and discriminator, trained in opposition
57
What is a multimodal model?
A model that handles multiple data types such as text, images, or audio together
58
Which is an example use case of a vision-language model?
Image captioning or visual question answering
59
What is 'few-shot' learning with LLMs?
Providing a few examples in the prompt to steer behavior without weight updates
60
What is 'zero-shot' capability in LLMs?
Model can perform tasks without explicit task-specific training or examples
61
Why is observability important in GenAI applications?
To understand prompts, outputs, costs, and failures for monitoring and debugging
62
What is a 'prompt log' in production systems?
Structured record of prompts, model versions, and outputs
63
What is response caching for LLMs?
Caching outputs for repeated prompts or semantically similar queries
64
Why is cost control critical for GenAI deployments?
LLM APIs and GPU inference can be expensive at scale
65
Which approach commonly reduces GenAI inference costs?
Routing easy queries to cheaper or smaller models (model routing)
66
What is 'model routing' or 'model cascade'?
Using smaller/cheaper models first and escalating to more powerful ones when needed
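A sketch of the cascade, with toy lambdas standing in for real model calls and the escalation check (a production router might instead use a classifier or confidence score):

```python
def route(query, cheap_model, strong_model, escalate_if):
    """Try the cheap model first; escalate when its answer looks inadequate."""
    answer = cheap_model(query)
    if escalate_if(query, answer):
        return strong_model(query), "strong"
    return answer, "cheap"

# Toy stand-ins: the cheap model gives up on long queries.
cheap = lambda q: "short answer" if len(q) < 40 else "IDK"
strong = lambda q: f"detailed answer to {q!r}"
needs_escalation = lambda q, a: a == "IDK"

easy = route("What is 2 + 2?", cheap, strong, needs_escalation)
hard = route("Explain the trade-offs of MoE routing under load.", cheap, strong, needs_escalation)
```

Only the hard query pays for the strong model, which is where the cost savings come from.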
67
Why is PII handling important in GenAI systems?
To comply with privacy regulations and avoid leaking sensitive data
68
Which technique helps protect sensitive user data in logs?
Redaction or pseudonymization of PII
69
What is 'drift' in the context of GenAI systems?
Change in data distribution, user behavior, or model versions affecting performance
70
Why is versioning of prompts and models important?
To reproduce behavior, debug issues, and compare performance across changes
71
What is an orchestration framework like LangChain or LlamaIndex used for?
Managing pipelines of prompts, tools, retrieval, and models in GenAI apps
72
In LangChain terminology, what is a 'Chain'?
A sequence or graph of calls (LLMs, tools, retrievers) composed into a pipeline
73
What is the role of a 'Tool' in LangChain-style frameworks?
External function or API that the LLM/agent can invoke
74
What is 'guardrailing' in GenAI frameworks?
Enforcing safety, compliance, or formatting rules on LLM inputs and outputs
75
What is a 'stateful agent'?
Agent whose behavior depends on stored state or memory across turns
76
Why might you use a graph-based workflow (DAG) for GenAI pipelines?
To model complex branching, dependencies, and parallel steps
77
Which is a typical risk when connecting agents to powerful tools (e.g., shell, database writes)?
Potential destructive actions, data loss, or security issues
78
What is 'tool grounding'?
Verifying that tool calls and parameters are well-formed and safe before execution
79
Why are eval harnesses (automated tests) important for GenAI apps?
They provide repeatable checks that changes in prompts/models don’t regress behavior
80
What is a 'golden set' or 'eval set' in GenAI evaluation?
Curated set of inputs with expected or reference outputs to test system quality
81
Which of the following is an example of an LLM 'judge' pattern?
One LLM grades or critiques another LLM’s answer using a rubric
82
Why is deterministic behavior sometimes desired in GenAI APIs?
To ensure reproducible outputs for testing and compliance
83
How can you increase determinism for an LLM call?
Set temperature to 0 (greedy decoding), though minor nondeterminism can remain across hardware and providers
84
Why do many GenAI systems use hybrid search (sparse + dense)?
To combine keyword and semantic similarity for better retrieval
85
What is 're-ranking' in RAG or search?
Reordering retrieved documents using a more expensive scoring model
86
Why is grounding with citations valuable in GenAI responses?
It helps users verify sources and trust the answer
87
Which is a common ethical risk of deploying GenAI?
Amplifying bias, misinformation, or privacy leaks at scale
88
What is watermarking in the context of generative AI outputs?
Embedding hidden signals in outputs to indicate they are AI-generated
89
Why is domain adaptation important for enterprise GenAI?
Enterprises need models aligned to their domain vocabulary, workflows, and policies
90
Which is generally the lowest-risk way to adapt a base LLM to a domain?
RAG plus careful prompt design and safety filters
91
What is 'toolformer'-style training?
Training models to decide when and how to call tools during generation
92
Why are latency and throughput trade-offs important in GenAI APIs?
They determine user experience and how many requests can be served per second
93
What is streaming output from an LLM?
Sending tokens incrementally as they are generated to reduce perceived latency
94
Which technique helps reduce context window usage in long conversations?
Summarizing earlier turns into shorter context
95
What is 'semantic caching'?
Caching based on embedding similarity so semantically similar queries reuse answers
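A minimal sketch of such a cache, with a toy word-count "embedding" standing in for a real embedding model (brute-force scan; real systems use a vector index):

```python
import math

class SemanticCache:
    """Cache answers keyed by query embedding; reuse when similarity is high enough."""

    def __init__(self, embed, threshold=0.95):
        self.embed = embed            # embedding function (a model in real use)
        self.threshold = threshold
        self.entries = []             # list of (embedding, answer)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    def get(self, query):
        q = self.embed(query)
        for vec, answer in self.entries:
            if self._cosine(q, vec) >= self.threshold:
                return answer         # a semantically similar query was seen before
        return None                   # cache miss: caller invokes the model

    def put(self, query, answer):
        self.entries.append((self.embed(query), answer))

# Toy 2-d "embedding", just to exercise the cache.
def toy_embed(text):
    words = text.lower().split()
    return [sum(w.startswith("price") for w in words) + 1, len(words)]

cache = SemanticCache(toy_embed, threshold=0.99)
cache.put("what is the price", "The price is $10.")
hit = cache.get("what is the price?")   # near-duplicate query reuses the answer
```

Unlike exact-match caching, the trailing "?" still hits because the embeddings are nearly identical.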
96
Why do some systems separate 'orchestration layer' from 'model provider'?
To allow multi-model, multi-provider routing, observability, and safety in one place
97
What is a 'safety policy' in GenAI systems?
Set of rules describing allowed and disallowed content or behavior
98
Why is continuous evaluation important after deploying a GenAI feature?
User behavior, data, or provider models can change over time impacting quality and risk
99
What is a 'safety sandwich' pattern?
Wrapping LLM calls with pre- and post-safety filters or checkers
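The pattern reduces to wrapping one model call between two checks; a sketch with keyword filters standing in for real safety classifiers (all names hypothetical):

```python
def safety_sandwich(prompt, model, pre_check, post_check, refusal="Request declined."):
    """Wrap a model call with input and output safety filters (the 'sandwich')."""
    if not pre_check(prompt):        # bottom slice: block disallowed inputs
        return refusal
    output = model(prompt)
    if not post_check(output):       # top slice: block unsafe outputs
        return refusal
    return output

# Toy keyword checks standing in for real safety classifiers.
BLOCKED = {"secret", "exploit"}
pre = lambda text: not any(w in text.lower() for w in BLOCKED)
post = pre
echo_model = lambda p: f"Answer: {p}"

ok = safety_sandwich("How do transformers work?", echo_model, pre, post)
blocked = safety_sandwich("Tell me a secret exploit", echo_model, pre, post)
```

The post-check matters even with a clean prompt, since the model itself can produce disallowed content.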
100
Which of the following best describes a robust GenAI application architecture today?
Orchestration layer with RAG, tools/agents, safety, evals, logging, and multiple model backends