
100 GenAI interview

100 terms · By guiem

Terms in this set

1

What is generative AI?

AI that can create new content such as text, images, or code

2

How does generative AI differ from discriminative AI?

Generative models learn the joint distribution p(x, y); discriminative models learn the conditional p(y|x)

3

What is a Large Language Model (LLM)?

A neural network trained on large text corpora to model language

4

Which architecture underpins most state-of-the-art LLMs?

Transformers

5

What does the self-attention mechanism do in Transformers?

Computes weighted interactions between all token pairs in a sequence
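A single attention head can be sketched in a few lines of NumPy; the identity query/key/value projections here are a simplification (real layers learn separate weight matrices for each):

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X: (seq_len, d) array of token embeddings. Q = K = V = X here for
    brevity; a real Transformer projects X with learned matrices first.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise token interactions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ X                               # weighted mix per token

X = np.random.default_rng(0).normal(size=(4, 8))
out = self_attention(X)
print(out.shape)  # (4, 8): one mixed vector per input token
```

Every output row is a convex combination of all input rows, which is exactly the "weighted interactions between all token pairs" in the answer above.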

6

Why are positional encodings used in Transformers?

To provide token order information
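The fixed sinusoidal encodings from the original Transformer paper can be computed directly; this sketch assumes an even `d_model`:

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Fixed sinusoidal positional encodings: each position gets a unique
    pattern of sines and cosines, added to token embeddings before
    attention so the model can tell token order apart."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles)   # even dims: sine
    enc[:, 1::2] = np.cos(angles)   # odd dims: cosine
    return enc

pe = sinusoidal_positions(seq_len=16, d_model=8)
print(pe.shape)  # (16, 8)
```

Many modern LLMs use learned or rotary (RoPE) encodings instead, but the purpose is the same: injecting order information that attention alone does not have.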

7

What is tokenization in LLMs?

Splitting text into subword units or tokens

8

What is the context window of an LLM?

The maximum number of tokens the model can consider in one prompt

9

What does 'temperature' control in LLM sampling?

Randomness of the output distribution

10

What is 'top-k' or 'top-p' sampling used for?

Restricting sampling to the most likely tokens to control diversity
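Temperature and top-k/top-p can be sketched together in one sampling function; this is an illustrative NumPy version, not any provider's exact implementation:

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Sample a token id from logits with temperature scaling and
    optional top-k / top-p (nucleus) filtering."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]             # most likely first
    keep = np.ones(len(probs), dtype=bool)
    if top_k is not None:
        keep[order[top_k:]] = False             # drop all but the k best
    if top_p is not None:
        cum = np.cumsum(probs[order])
        cutoff = np.searchsorted(cum, top_p) + 1
        keep[order[cutoff:]] = False            # smallest nucleus with mass >= top_p
    probs = np.where(keep, probs, 0.0)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.1, -1.0]
tok = sample_token(logits, temperature=0.7, top_k=2, rng=np.random.default_rng(0))
print(tok)  # always 0 or 1: only the two most likely tokens survive top-k
```

Lower temperature sharpens the distribution toward the argmax; top-k/top-p then cut the tail before sampling.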

11

What is Retrieval-Augmented Generation (RAG)?

A method that combines retrieval from external data with generation by an LLM

12

Which components are core to a RAG system?

Retriever, knowledge store, and generator

13

Why use RAG instead of relying solely on an LLM’s internal knowledge?

To ground answers in up-to-date, domain-specific data

14

What is typically stored in the knowledge store for RAG?

Documents or chunks with associated embeddings

15

What does the retriever do in RAG?

Fetches relevant documents given a query embedding

16

Which type of database is commonly used for RAG similarity search?

Vector database or embedding index
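The core operation of such a store is cosine-similarity ranking; this toy brute-force retriever illustrates it, whereas production systems use an approximate-nearest-neighbor index:

```python
import numpy as np

def retrieve(query_emb, doc_embs, k=3):
    """Toy dense retriever: rank document embeddings by cosine similarity
    to the query embedding and return the top-k indices and scores."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q                                # cosine similarity per doc
    top = np.argsort(sims)[::-1][:k]
    return list(top), sims[top]

rng = np.random.default_rng(1)
docs = rng.normal(size=(10, 64))                # pretend document embeddings
query = docs[4] + 0.01 * rng.normal(size=64)    # near-duplicate of doc 4
idx, scores = retrieve(query, docs, k=3)
print(idx[0])  # 4: the near-duplicate document ranks first
```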

17

What is 'chunking' in RAG pipelines?

Splitting documents into smaller text segments for embedding

18

Why is chunk size important in RAG?

It trades off retrieval granularity vs. context completeness
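A minimal character-based chunker with overlap makes the tradeoff concrete; real pipelines usually split on tokens or sentence boundaries instead:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size windows. Smaller chunks
    retrieve more precisely; larger chunks preserve more context; the
    overlap keeps sentences from being cut off at chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = chunk_text("abcdefghij" * 100, chunk_size=200, overlap=50)
print(len(chunks), len(chunks[0]))  # 7 200
```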

19

What is 'query rewriting' in advanced RAG systems?

Rephrasing or expanding the query to improve retrieval quality

20

Which technique improves RAG by re-ranking retrieved documents?

Cross-encoder re-ranking

21

What is an 'AI agent' in the LLM ecosystem?

A system where an LLM can perceive context, plan, call tools, and act in a loop

22

What does 'tool use' or 'function calling' enable for LLM agents?

Calling external APIs, databases, or functions from model outputs

23

Which is a common pattern for an LLM agent loop?

Observe → Plan → Act → Observe
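The loop can be sketched with a stand-in planner; `toy_llm` and the `lookup` tool below are hypothetical placeholders for a real model and tool registry:

```python
def run_agent(goal, llm, tools, max_steps=5):
    """Minimal observe -> plan -> act loop. `llm` returns either
    ("call", tool_name, arg) or ("finish", answer); `tools` maps names
    to callables. The step cap guards against unbounded tool calls."""
    observations = [goal]
    for _ in range(max_steps):
        action = llm(observations)               # plan from what we've seen
        if action[0] == "finish":
            return action[1]
        _, name, arg = action
        observations.append(tools[name](arg))    # act, then observe result
    return "stopped: step budget exhausted"

# Toy planner: look the query up once, then finish with the tool's answer.
def toy_llm(obs):
    if len(obs) == 1:
        return ("call", "lookup", obs[0])
    return ("finish", f"answer: {obs[-1]}")

result = run_agent("capital of France", toy_llm,
                   {"lookup": lambda q: "Paris"})
print(result)  # answer: Paris
```

The `max_steps` budget is the simplest version of the cost/safety bounds discussed for autonomous agents later in this set.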

24

What is a 'multi-agent' system in GenAI?

Multiple specialized agents collaborating or competing on tasks

25

Why are tools important for production agents?

They let agents access real-time data, systems, and side effects

26

What is a 'planner' in an agent architecture?

A component that breaks goals into steps or sub-tasks

27

What is a 'memory' module for agents?

Structured store for past interactions, facts, or state

28

Short-term conversational memory is typically implemented using:

Appending recent messages into the context window
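One common sketch: walk the history backwards and keep whatever fits a token budget. The word-count `count` function here is a stand-in for a real tokenizer:

```python
def trimmed_context(messages, max_tokens=1000, count=lambda m: len(m.split())):
    """Keep the most recent messages that fit within a token budget,
    a simple short-term-memory scheme for chat context windows."""
    kept, used = [], 0
    for msg in reversed(messages):               # newest first
        cost = count(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))                  # restore chronological order

history = ["hello there", "how can I help",
           "summarize this very long report please"]
print(trimmed_context(history, max_tokens=10))  # keeps the 2 most recent messages
```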

29

Long-term memory for agents is often stored in:

A vector store or database

30

Which is a risk of unconstrained autonomous agents?

Unbounded tool calls, costs, and harmful or unintended actions

31

What is prompt engineering?

Designing and structuring inputs to steer LLM behavior

32

Which is an example of a prompt engineering technique?

Few-shot prompting with examples

33

What is chain-of-thought (CoT) prompting?

Asking the model to show intermediate reasoning steps

34

Why do many providers restrict explicit chain-of-thought output?

To reduce risk of over-reliance, leakage, and misinterpretation of internal reasoning

35

What is 'system message' vs 'user message' in chat-based LLM APIs?

The system message sets high-level behavior and constraints; the user message supplies the query or task

36

What is hallucination in LLMs?

The model generating confident but factually incorrect content

37

Which strategy helps reduce hallucinations?

RAG with grounded retrieval and citation

38

What is a safety or guardrail layer in GenAI systems?

A post-processing or filtering step to block unsafe or disallowed outputs

39

What is jailbreak testing?

Trying to bypass or defeat safety constraints of LLMs

40

Which is a common evaluation approach for GenAI answers?

Human evaluation or model-based rubric scoring

41

Which metric is often used for text similarity in GenAI evaluation?

BLEU / ROUGE / BERTScore

42

What is model-based evaluation in LLM systems?

Using another model (or the same) as a 'judge' to rate outputs

43

What is instruction tuning?

Fine-tuning a model on (instruction, response) pairs to follow instructions better

44

What is supervised fine-tuning (SFT) in LLM training?

Training the model on curated input-output examples with cross-entropy loss

45

What is RLHF (Reinforcement Learning from Human Feedback)?

Reinforcement learning where human preferences guide a reward model

46

What is DPO (Direct Preference Optimization) conceptually used for?

Directly aligning model outputs with preference pairs without an explicit reward model

47

What is LoRA (Low-Rank Adaptation) used for?

Parameter-efficient fine-tuning by adding low-rank matrices to weight updates
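The forward pass can be sketched in NumPy; the `alpha / r` scaling follows the LoRA paper's convention, and the zero-initialized `B` means training starts from the unmodified base model:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """LoRA: the frozen weight W is augmented with a low-rank update
    B @ A (rank r); only A and B are trained."""
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4
W = rng.normal(size=(d_out, d_in))       # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # zero-init: no update at start
x = rng.normal(size=(8, d_in))
y = lora_forward(x, W, A, B)
print(np.allclose(y, x @ W.T))  # True: with B = 0 the model is unchanged
```

Only `A` and `B` (r × d_in + d_out × r parameters) are trained, versus d_out × d_in for full fine-tuning, which is where the efficiency comes from.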

48

Why is parameter-efficient fine-tuning popular?

It allows adapting large models with fewer trainable parameters and lower cost

49

What is knowledge distillation in the context of LLMs?

Training a smaller student model to imitate a larger teacher model

50

What is the main idea behind Mixture-of-Experts (MoE) LLMs?

Multiple expert subnetworks where only a subset is activated per token

51

Why are MoE architectures attractive for scaling?

They allow very large parameter counts while keeping per-token compute manageable
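A single-token sparse MoE layer can be sketched in NumPy; the linear gate and linear experts are simplifications of the learned networks used in practice:

```python
import numpy as np

def moe_layer(x, experts, gate_W, k=2):
    """Sparse MoE for one token vector: a gating network scores all
    experts, only the top-k actually run, and their outputs are mixed
    by renormalized gate weights. Most parameters stay idle per token."""
    scores = gate_W @ x
    top = np.argsort(scores)[::-1][:k]            # pick the k best experts
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [(lambda W: (lambda v: W @ v))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
gate_W = rng.normal(size=(n_experts, d))
out = moe_layer(rng.normal(size=d), experts, gate_W, k=2)
print(out.shape)  # (16,): full-width output, but only 2 of 8 experts computed
```

Total parameters scale with `n_experts`, while per-token compute scales only with `k`, which is the scaling argument in the answer above.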

52

What is a 'small language model' (SLM) in current discussions?

Loosely, a model under roughly 10B parameters, optimized for on-device or low-latency use

53

Which is a key tradeoff between large LLMs and SLMs?

SLMs tend to be faster/cheaper but may have lower general capability

54

What is a diffusion model in generative AI?

A model that iteratively denoises random noise to generate images or other data

55

Which generative paradigm is most associated with image generation today?

Diffusion models and latent diffusion

56

What is a Generative Adversarial Network (GAN)?

Two models, generator and discriminator, trained in opposition

57

What is a multimodal model?

A model that handles multiple data types such as text, images, or audio together

58

Which is an example use case of a vision-language model?

Image captioning or visual question answering

59

What is 'few-shot' learning with LLMs?

Providing a few examples in the prompt to steer behavior without weight updates

60

What is 'zero-shot' capability in LLMs?

Model can perform tasks without explicit task-specific training or examples

61

Why is observability important in GenAI applications?

To understand prompts, outputs, costs, and failures for monitoring and debugging

62

What is a 'prompt log' in production systems?

Structured record of prompts, model versions, and outputs

63

What is response caching for LLMs?

Caching outputs for repeated prompts or semantically similar queries

64

Why is cost control critical for GenAI deployments?

LLM APIs and GPU inference can be expensive at scale

65

Which approach commonly reduces GenAI inference costs?

Routing easy queries to cheaper or smaller models (model routing)

66

What is 'model routing' or 'model cascade'?

Using smaller/cheaper models first and escalating to more powerful ones when needed
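A cascade can be sketched as a confidence-gated fallback; the `confident` check here is a stand-in for real signals such as log-prob thresholds or a verifier model:

```python
def route(query, cheap_model, strong_model, confident=lambda a: a is not None):
    """Model cascade: try the cheap model first and escalate to the
    strong model only when the cheap answer fails a confidence check."""
    answer = cheap_model(query)
    if confident(answer):
        return answer, "cheap"
    return strong_model(query), "strong"

# Toy models: the cheap one only answers known FAQs.
faq = {"reset password": "Use the account settings page."}
cheap = lambda q: faq.get(q)
strong = lambda q: f"[big model answer for: {q}]"

print(route("reset password", cheap, strong)[1])     # cheap
print(route("obscure edge case", cheap, strong)[1])  # strong
```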

67

Why is PII handling important in GenAI systems?

To comply with privacy regulations and avoid leaking sensitive data

68

Which technique helps protect sensitive user data in logs?

Redaction or pseudonymization of PII

69

What is 'drift' in the context of GenAI systems?

Change in data distribution, user behavior, or model versions affecting performance

70

Why is versioning of prompts and models important?

To reproduce behavior, debug issues, and compare performance across changes

71

What is an orchestration framework like LangChain or LlamaIndex used for?

Managing pipelines of prompts, tools, retrieval, and models in GenAI apps

72

In LangChain terminology, what is a 'Chain'?

A sequence or graph of calls (LLMs, tools, retrievers) composed into a pipeline

73

What is the role of a 'Tool' in LangChain-style frameworks?

External function or API that the LLM/agent can invoke

74

What is 'guardrailing' in GenAI frameworks?

Enforcing safety, compliance, or formatting rules on LLM inputs and outputs

75

What is a 'stateful agent'?

Agent whose behavior depends on stored state or memory across turns

76

Why might you use a graph-based workflow (DAG) for GenAI pipelines?

To model complex branching, dependencies, and parallel steps

77

Which is a typical risk when connecting agents to powerful tools (e.g., shell, database writes)?

Potential destructive actions, data loss, or security issues

78

What is 'tool grounding'?

Verifying that tool calls and parameters are well-formed and safe before execution

79

Why are eval harnesses (automated tests) important for GenAI apps?

They provide repeatable checks that changes in prompts/models don’t regress behavior

80

What is a 'golden set' or 'eval set' in GenAI evaluation?

Curated set of inputs with expected or reference outputs to test system quality

81

Which of the following is an example of an LLM 'judge' pattern?

One LLM grades or critiques another LLM’s answer using a rubric

82

Why is deterministic behavior sometimes desired in GenAI APIs?

To ensure reproducible outputs for testing and compliance

83

How can you increase determinism for an LLM call?

Set temperature to 0 (greedy decoding) and disable sampling randomness; note that some serving backends are still not bit-exact across runs

84

Why do many GenAI systems use hybrid search (sparse + dense)?

To combine keyword and semantic similarity for better retrieval

85

What is 're-ranking' in RAG or search?

Reordering retrieved documents using a more expensive scoring model

86

Why is grounding with citations valuable in GenAI responses?

It helps users verify sources and trust the answer

87

Which is a common ethical risk of deploying GenAI?

Amplifying bias, misinformation, or privacy leaks at scale

88

What is watermarking in the context of generative AI outputs?

Embedding hidden signals in outputs to indicate they are AI-generated

89

Why is domain adaptation important for enterprise GenAI?

Enterprises need models aligned to their domain vocabulary, workflows, and policies

90

Which is generally the lowest-risk way to adapt a base LLM to a domain?

RAG plus careful prompt design and safety filters

91

What is 'toolformer'-style training?

Training models to decide when and how to call tools during generation

92

Why are latency and throughput trade-offs important in GenAI APIs?

They determine user experience and how many requests can be served per second

93

What is streaming output from an LLM?

Sending tokens incrementally as they are generated to reduce perceived latency

94

Which technique helps reduce context window usage in long conversations?

Summarizing earlier turns into shorter context

95

What is 'semantic caching'?

Caching based on embedding similarity so semantically similar queries reuse answers
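A sketch with a deterministic toy embedding (hashing words by character codes); a real system would use an embedding model and an ANN index for the lookup:

```python
import numpy as np

class SemanticCache:
    """Cache keyed by embedding similarity: a query whose embedding has
    cosine similarity >= threshold with a cached query reuses that
    answer. `embed` must return unit-length vectors."""
    def __init__(self, embed, threshold=0.9):
        self.embed, self.threshold = embed, threshold
        self.keys, self.values = [], []

    def get(self, query):
        q = self.embed(query)
        for k, v in zip(self.keys, self.values):
            if float(q @ k) >= self.threshold:   # cosine sim of unit vectors
                return v
        return None

    def put(self, query, answer):
        self.keys.append(self.embed(query))
        self.values.append(answer)

def toy_embed(text):
    """Deterministic bag-of-words hashing embedding, unit-normalized.
    A stand-in for a real sentence-embedding model."""
    v = np.zeros(32)
    for word in text.lower().split():
        v[sum(map(ord, word)) % 32] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

cache = SemanticCache(toy_embed, threshold=0.9)
cache.put("what is the capital of france", "Paris")
print(cache.get("What is the capital of France"))  # Paris: same words, cache hit
```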

96

Why do some systems separate 'orchestration layer' from 'model provider'?

To allow multi-model, multi-provider routing, observability, and safety in one place

97

What is a 'safety policy' in GenAI systems?

Set of rules describing allowed and disallowed content or behavior

98

Why is continuous evaluation important after deploying a GenAI feature?

User behavior, data, or provider models can change over time impacting quality and risk

99

What is a 'safety sandwich' pattern?

Wrapping LLM calls with pre- and post-safety filters or checkers
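The pattern is just a wrapper around the model call; the keyword lists below are stand-ins for real input/output classifiers or policy models:

```python
def safety_sandwich(prompt, llm, blocked=("credit card", "password")):
    """Pre- and post-filters around an LLM call: refuse unsafe inputs
    before spending tokens, and withhold unsafe outputs before they
    reach the user."""
    if any(term in prompt.lower() for term in blocked):   # pre-filter
        return "Request refused by input policy."
    answer = llm(prompt)
    if any(term in answer.lower() for term in blocked):   # post-filter
        return "Response withheld by output policy."
    return answer

echo_llm = lambda p: f"You asked: {p}"
print(safety_sandwich("what's the weather", echo_llm))  # You asked: what's the weather
print(safety_sandwich("steal a password", echo_llm))    # Request refused by input policy.
```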

100

Which of the following best describes a robust GenAI application architecture today?

Orchestration layer with RAG, tools/agents, safety, evals, logging, and multiple model backends