---
title: "Post-Only Identity Treatments with `simulate_treatment()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Post-Only Identity Treatments with `simulate_treatment()`}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Purpose

This vignette shows a short workflow for collaborators who currently use `run_ai_on_chapters()` but only keep the post-treatment responses. For that use case, `simulate_treatment()` is usually simpler:

1. You define the prompt directly.
2. You can still use `{identity}` and `{intervention_text}` placeholders.
3. You get one row per simulation turn, without pre/post delta columns that you do not need.

The example below compares `American` and `Brazilian` identities in a post-only design.

# 1. Set package options

As in other `nalanda` workflows, it is easiest to set the model routing once at the start of the script.

```{r, eval = TRUE}
library(nalanda)

options(
  nalanda.integration = "vertexai",
  nalanda.base_url = "https://ai-gateway.apps.cloud.rt.nyu.edu/v1/"
)
```

# 2. Define the treatment text and prompt

The intervention text can still be passed separately and inserted into the prompt with `{intervention_text}`. This is useful when your script generates many treatment variants programmatically.

```{r, eval = TRUE}
intervention_text <- "Imagine you come across a short community message about climate action. It explains that small everyday choices can add up when many people participate, and it encourages residents to back clean-energy policies, talk with others about practical climate solutions, and choose lower-carbon habits when possible. The message ends by saying that collective action now can help protect future generations."

prompt <- "{intervention_text}

On a scale from 0 to 100, how accurate do you think this statement is?
Statement: Climate change is a global emergency

On a scale from 0 to 100, how much do you support public policies that reduce greenhouse gas emissions?"
```

Here:

1. `{identity}` appears in `context_text` (defined in the next section) and will be filled from `groups`.
2. `{intervention_text}` will be filled from the function argument.

# 3. Preview the prompt before running

If you like to inspect prompt text before launching a run, you can build a single concrete prompt from the same elements you will pass to `simulate_treatment()`.

```{r, eval = TRUE}
groups <- c("American", "Brazilian")
context_text <- "You are simulating an adult who identifies as {identity}."

prompt_preview <- make_treatment_prompt(
  prompt_template = prompt,
  intervention_text = intervention_text,
  identity_context = gsub("{identity}", groups[1], context_text, fixed = TRUE),
  identity_label = groups[1]
)

cat(prompt_preview)
```

This plays the same role as `make_baseline_prompt()` or `make_post_prompt()`, but for the one-turn `simulate_treatment()` workflow.

# 4. Run the post-only simulation

```{r, eval = FALSE}
res <- simulate_treatment(
  intervention_text = intervention_text,
  groups = groups,
  context_text = context_text,
  prompt = prompt,
  response_type = ellmer::type_object(
    climate_belief = ellmer::type_number(),
    policy_support = ellmer::type_number()
  ),
  n_simulations = 2,
  temperature = 0,
  model = "gemini-2.5-flash-lite"
)
```

The key differences from `run_ai_on_chapters()` are:

1. there is no baseline turn,
2. there are no `pre_*` or `delta_*` columns,
3. the output matches the post-only design directly.

If you want a multi-turn design later, `simulate_treatment()` can also take a prompt vector of length greater than 1, with one element per turn.

# 5. Inspect the raw output

Each row is one simulated response for one identity and one simulation draw.
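
Before summarizing, it can help to check that the output has the shape this description implies. The sketch below is only illustrative: `res_example` is a hand-built stand-in that copies the values from the example table in this section, not real model output, and the checks assume the scores are expected to stay within the 0 to 100 range requested in the prompt.

```{r, eval = FALSE}
# Hypothetical stand-in for `res`, using the illustrative values from this section.
res_example <- data.frame(
  treatment = "intervention_1",
  sim = c(1L, 2L, 1L, 2L),
  identity = c("American", "American", "Brazilian", "Brazilian"),
  turn_index = 1L,
  turn_type = "turn_1",
  climate_belief = c(84, 82, 76, 74),
  policy_support = c(71, 69, 62, 60)
)

# One row per identity x simulation draw in a one-turn design.
n_identities <- length(unique(res_example$identity))
n_draws <- length(unique(res_example$sim))
stopifnot(nrow(res_example) == n_identities * n_draws)

# The prompt asked for 0-100 ratings, so flag anything outside that range.
score_cols <- c("climate_belief", "policy_support")
in_range <- vapply(
  res_example[score_cols],
  function(x) all(x >= 0 & x <= 100),
  logical(1)
)
stopifnot(all(in_range))

# Number of rows per identity.
rows_per_identity <- table(res_example$identity)
rows_per_identity
```

With a real `res`, the same checks apply after swapping `res_example` for `res`.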
```{r, echo = FALSE}
example_res <- tibble::tibble(
  treatment = "intervention_1",
  sim = c(1L, 2L, 1L, 2L),
  identity = c("American", "American", "Brazilian", "Brazilian"),
  turn_index = 1L,
  turn_type = "turn_1",
  climate_belief = c(84, 82, 76, 74),
  policy_support = c(71, 69, 62, 60)
)

knitr::kable(example_res)
```

# 6. Summarize by identity

Use `summarize_treatment_results()` to compute mean scores by identity.

```{r, eval = FALSE}
summary_by_identity <- summarize_treatment_results(
  res,
  by_identity = TRUE
)

summary_by_identity
```

```{r, echo = FALSE}
# Illustrative summary; column names follow the raw output above and the
# `identity` / `mean_*` columns used in the comparison step below.
example_summary <- tibble::tibble(
  treatment = "intervention_1",
  identity = c("American", "Brazilian"),
  turn_type = "turn_1",
  n = c(2L, 2L),
  mean_climate_belief = c(83.0, 75.0),
  sd_climate_belief = c(1.4, 1.4),
  mean_policy_support = c(70.0, 61.0),
  sd_policy_support = c(1.4, 1.4),
  turn_index = 1L
)

knitr::kable(example_summary)
```

# 7. Compare another group to Americans

If `American` is your reference group, a simple downstream comparison is:

```{r, eval = FALSE}
library(dplyr)
library(tidyr)

comparison_vs_american <- summary_by_identity |>
  select(
    identity,
    mean_climate_belief,
    mean_policy_support
  ) |>
  pivot_wider(
    names_from = identity,
    values_from = c(mean_climate_belief, mean_policy_support)
  ) |>
  mutate(
    climate_belief_gap_vs_american = mean_climate_belief_Brazilian - mean_climate_belief_American,
    policy_support_gap_vs_american = mean_policy_support_Brazilian - mean_policy_support_American
  )

comparison_vs_american
```

```{r, echo = FALSE}
# Illustrative result; column names match what the pipeline above produces.
comparison_vs_american <- tibble::tibble(
  mean_climate_belief_American = 83.0,
  mean_climate_belief_Brazilian = 75.0,
  mean_policy_support_American = 70.0,
  mean_policy_support_Brazilian = 61.0,
  climate_belief_gap_vs_american = -8.0,
  policy_support_gap_vs_american = -9.0
)

knitr::kable(comparison_vs_american)
```

# When to use which function

Use `run_ai_on_chapters()` when you want a true pre/post design and plan to use within-agent change scores such as `delta_outgroup` or `delta_gap`.

Use `simulate_treatment()` when:

1. the design is post-only,
2. the treatment is not really a book chapter,
3. you want direct control over the prompt wording, and
4. you want to insert `{intervention_text}` programmatically.
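
The last point can be sketched with base R alone. This is a minimal illustration, not the package's own substitution logic: the variant texts, the shortened `prompt_template`, and the helper `fill_placeholder()` are all hypothetical, and the helper simply reuses the literal `gsub(..., fixed = TRUE)` pattern from the preview step to show what each filled prompt would look like before it is passed to `simulate_treatment()`.

```{r, eval = FALSE}
# Hypothetical treatment variants; replace these with your own texts.
variants <- c(
  short = "Collective action now can help protect future generations.",
  long = "Small everyday choices can add up when many people participate."
)

# Shortened, hypothetical prompt template with the same placeholder.
prompt_template <- paste0(
  "{intervention_text}\n\n",
  "On a scale from 0 to 100, how much do you support public policies ",
  "that reduce greenhouse gas emissions?"
)

# Hypothetical helper: literal (non-regex) replacement of one placeholder.
fill_placeholder <- function(template, name, value) {
  gsub(paste0("{", name, "}"), value, template, fixed = TRUE)
}

# Build one filled prompt per variant; names are carried over from `variants`.
filled <- vapply(
  variants,
  function(v) fill_placeholder(prompt_template, "intervention_text", v),
  character(1)
)

cat(filled[["short"]])
```

Inspecting `filled` this way is only a preview; in an actual run, each variant would be passed as `intervention_text` and the function would fill the placeholder itself.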