| Title: | R Toolbox to answer the question: Do books really change lives? |
|---|---|
| Description: | Provides tools and utilities for analyzing research data related to books, reading, and prosocial behavior. Named after the historic Nalanda Mahavihara, a center of learning and scholarly collaboration in ancient India. |
| Authors: | Rémi Thériault [aut, cre] (ORCID: <https://orcid.org/0000-0003-4315-6788>) |
| Maintainer: | Rémi Thériault <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.0.1.4 |
| Built: | 2026-05-18 18:13:59 UTC |
| Source: | https://github.com/centerconflictcooperation/nalanda |
Computes the mean and standard deviation of an outcome across simulation replicates within each model-by-unit cell. This is the recommended first step before computing inter-model agreement: it collapses intra-model sampling noise so that downstream metrics reflect genuine model differences rather than Monte Carlo variance.
aggregate_simulations(
  data,
  outcome = "outcome",
  by = c("model", "book_id", "chapter_id", "group")
)
data |
A data frame with one row per simulation run, containing columns for model identity, unit identifiers, and the outcome variable. |
outcome |
Character string naming the outcome column to aggregate
(default |
by |
Character vector of column names to group by. Must include a column
identifying the model (typically |
A tibble with one row per unique combination of by, plus:
Mean of outcome across simulation runs.
Standard deviation across runs.
Number of simulation replicates in the cell.
sim_data <- data.frame(
  model = rep(c("gpt-4o", "gemini-2.5-flash"), each = 40),
  book_id = rep("BookA", 80),
  chapter_id = rep(paste0("ch", 1:4), each = 10, times = 2),
  group = rep(c("Democrat", "Republican"), 40),
  sim = rep(1:10, 8),
  rating = rnorm(80, 60, 10)
)
aggregate_simulations(sim_data,
  outcome = "rating",
  by = c("model", "book_id", "chapter_id", "group")
)
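For intuition, the per-cell collapse described above can be approximated in base R. This is only a sketch of the idea (mean, standard deviation, and replicate count per model-by-unit cell), not the package's implementation; the column names `unit` and `rating` and the flattened output names are assumptions made for this illustration.

```r
# Base-R sketch of collapsing simulation replicates per cell.
# Not the nalanda implementation; column names are illustrative.
set.seed(1)
sim_data <- data.frame(
  model = rep(c("m1", "m2"), each = 6),
  unit = rep(rep(c("u1", "u2"), each = 3), times = 2),
  rating = rnorm(12, mean = 60, sd = 10)
)

# One row per model-by-unit cell, with mean, sd, and number of replicates.
cells <- aggregate(
  rating ~ model + unit,
  data = sim_data,
  FUN = function(v) c(mean = mean(v), sd = sd(v), n = length(v))
)

# aggregate() returns a matrix column; flatten it into ordinary columns
# named rating.mean, rating.sd, and rating.n.
cells <- do.call(data.frame, cells)
print(cells)
```

Each cell here holds three replicates, so the downstream agreement metrics would see four rows instead of twelve.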
Takes chapter files named like 1_howcanyou.txt and 2_howcanyou.txt,
groups them by the shared title stem after the underscore, orders chapters
by their numeric prefix, and writes one combined .txt file per group
using filenames like 1_howcanyou.txt.
combine_book_files(
  input_dir,
  output_dir = file.path(input_dir, "combined"),
  separator = "\n\n",
  overwrite = FALSE
)
input_dir |
Character scalar. Folder containing chapter |
output_dir |
Character scalar. Folder where combined |
separator |
Character scalar. Text inserted between chapters when combining them. Defaults to two line breaks. |
overwrite |
Logical scalar. If |
A tibble with one row per combined book and columns describing the numeric output file, original title stem, and source chapter numbers.
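The grouping and ordering step can be sketched with base R string handling. The regular expressions below are an assumption about how a numeric prefix and title stem might be separated; they are not taken from the package source.

```r
# Sketch: group chapter files by title stem and order by numeric prefix.
# The parsing rules are illustrative assumptions, not the package's code.
files <- c(
  "2_howcanyou.txt", "1_howcanyou.txt",
  "10_howcanyou.txt", "1_otherbook.txt"
)

# Numeric prefix before the first underscore.
prefix <- as.integer(sub("_.*$", "", files))

# Title stem: drop the extension, then the leading "<number>_".
stem <- sub("^[0-9]+_", "", tools::file_path_sans_ext(files))

# Order within each stem numerically (so 10 sorts after 2), then split.
ord <- order(stem, prefix)
groups <- split(files[ord], stem[ord])
print(groups)
```

Sorting on the integer prefix rather than the file name avoids the usual lexical trap where `10_...` would sort before `2_...`.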
Combines text files named with page/range suffixes such as
3_part1-001-050.txt, 3_part1-051-100.txt, and
3_part1-101-138.txt into a single 3_part1.txt file. Files without a
trailing -start-end range are copied to output_dir when needed.
combine_split_chapter_files(
  input_dir,
  output_dir = input_dir,
  extension = "txt",
  separator = "\n\n",
  overwrite = FALSE,
  remove_sources = FALSE
)
input_dir |
Character scalar. Folder containing chapter |
output_dir |
Character scalar. Folder where consolidated files should
be written. Defaults to |
extension |
Character scalar file extension to match, without a leading
dot by default. Defaults to |
separator |
Character scalar. Text inserted between chunks when combining them. Defaults to two line breaks. |
overwrite |
Logical scalar. If |
remove_sources |
Logical scalar. If |
A tibble with one row per output file and columns describing the output path, source files, and action taken.
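As a rough illustration of how a trailing `-start-end` range could be detected and stripped, the sketch below maps each chunk to its consolidated file name. The pattern is an assumption chosen to fit the file names quoted above, not the package's actual matching rule.

```r
# Sketch: map split chunk files to their consolidated target name.
# The regular expression is an illustrative assumption.
files <- c(
  "3_part1-001-050.txt", "3_part1-051-100.txt",
  "3_part1-101-138.txt", "4_part2.txt"
)

# Remove a trailing "-<digits>-<digits>" that sits just before the extension.
target <- sub("-[0-9]+-[0-9]+(\\.txt)$", "\\1", files)

# Chunks sharing a target would be concatenated; files without a range
# map to themselves.
plan <- split(files, target)
print(plan)
```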
Separates post-processing from model execution so users can re-compute metrics without re-running API calls.
compute_run_ai_metrics(x, per_group = NULL)
x |
A data frame or list-like object from |
per_group |
Optional logical. Whether the run used per-group mode
(
|
A simulation-level tibble with derived metrics (for example
pre_outgroup, post_outgroup, delta_outgroup, and in per-group mode
also pre_ingroup, post_ingroup, pre_gap, post_gap,
delta_ingroup, delta_gap).
metrics <- compute_run_ai_metrics(toy_run_ai_turns)
head(metrics)

# The processed output can be passed on to summary and plotting helpers.
summary_by_chapter <- summarize_chapter_scores(metrics)
head(summary_by_chapter)
Compute cumulative chapter metrics against the original baseline
compute_run_ai_metrics_cumulative(x, per_group = NULL)
x |
A data frame or list-like object from |
per_group |
Optional logical. Whether the run used per-group mode. If
|
A chapter-level tibble comparing each post-chapter state against the
baseline turn from the same book × sim × identity conversation.
Compute one-turn ingroup/outgroup metrics from raw output
compute_run_ai_metrics_one_turn(x, per_group = NULL)
x |
A data frame or list-like object from
|
per_group |
Optional logical. Whether the run used per-group mode. If
|
A simulation-level tibble with one-turn metrics. In per-group mode
this includes ingroup_rating, outgroup_rating, and gap. In
single-question mode it includes overall_rating and outgroup_rating.
This helper computes common agreement metrics used in text-analysis papers, including accuracy, macro precision/recall/F1, Spearman correlation, and weighted Cohen's kappa.
evaluate_text_analysis(
  data,
  truth_col,
  estimate_col,
  by = NULL,
  metric = c("accuracy", "macro_precision", "macro_recall", "macro_f1",
    "spearman", "weighted_kappa"),
  kappa_weights = c("quadratic", "linear")
)
data |
A data frame containing reference and predicted columns. |
truth_col |
Name of the reference-label column. |
estimate_col |
Name of the model-estimate column. |
by |
Optional character vector of grouping columns. |
metric |
Character vector of metrics to compute. Supported values are
|
kappa_weights |
Weighting scheme for Cohen's kappa. One of
|
A tibble with one row per requested group and one column per metric.
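To make two of these metrics concrete, the following base-R sketch computes accuracy and the macro-averaged precision, recall, and F1 by hand on a toy vector of labels. It follows the standard textbook definitions and is not the package's code; the toy labels are invented for illustration.

```r
# Sketch: accuracy and macro precision/recall/F1 from first principles.
truth    <- c("a", "a", "b", "b", "c", "c")
estimate <- c("a", "b", "b", "b", "c", "a")

accuracy <- mean(truth == estimate)

classes <- sort(unique(c(truth, estimate)))

# One column per class; rows are precision, recall, and F1.
per_class <- sapply(classes, function(k) {
  tp <- sum(truth == k & estimate == k)
  precision <- tp / sum(estimate == k)
  recall <- tp / sum(truth == k)
  f1 <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f1 = f1)
})

# Macro averaging: unweighted mean across classes.
macro <- rowMeans(per_class)
print(accuracy)
print(macro)
```

A class that is never predicted would give a 0/0 precision in this sketch, so a production version needs an explicit rule for that case.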
This helper uploads a local PDF to a multimodal model through ellmer and
asks the model to return clean running text. It is useful when ordinary OCR
struggles with stamps, overlays, or poor scan quality but the target model
can read PDFs directly.
extract_pdf_text_with_llm(
  pdf_path,
  prompt = paste(
    "Transcribe the main body text from this PDF as plain UTF-8 text.",
    "Keep the wording faithful to the source.",
    "Ignore repeated stamps, watermarks, page numbers, headers, footers,",
    "and other obvious non-book overlays when they are not part of the book.",
    "Preserve paragraph breaks.",
    "Return only the extracted text.",
    "Do not add any introduction, explanation, summary, XML, markdown fences,",
    "or labels such as 'The following is the main body text from the PDF:'.",
    sep = " "
  ),
  model = "gpt-5-mini",
  integration = getOption("nalanda.integration"),
  virtual_key = getOption("nalanda.virtual_key"),
  base_url = getOption("nalanda.base_url"),
  temperature = 1,
  seed = 42,
  output_path = NULL,
  timeout_s = getOption("ellmer_timeout_s", 120),
  max_tries = getOption("ellmer_max_tries", 5),
  retry_wait = 3,
  overwrite = FALSE
)
pdf_path |
Character scalar path to a local PDF file, a character vector
of PDF paths, or a named/nested list of PDF paths such as the output of
|
prompt |
Character scalar instruction shown alongside the PDF. The default asks for faithful transcription while ignoring obvious non-book overlays such as repeated stamps, page numbers, and headers/footers. |
model |
Character. Model name for the chat backend. |
integration |
Optional Portkey/gateway route slug. If supplied and
|
virtual_key |
Optional legacy virtual key. If supplied and |
base_url |
Character. Base URL for API calls. |
temperature |
Numeric. Sampling temperature passed to the backend. |
seed |
Integer. Random seed for reproducibility. |
output_path |
Optional output target. For a single PDF, this may be
either an exact |
timeout_s |
Numeric scalar request timeout in seconds. Applied via
|
max_tries |
Integer scalar total number of request attempts. Applied via
|
retry_wait |
Numeric scalar seconds to wait between manual retries after a failed single-file attempt. |
overwrite |
Logical scalar. If |
In testing through the NYU Portkey/gateway path, PDF extraction was more
reliable with gpt-5-mini than with Gemini routes. Gemini-family models may
still work in other environments, but PDF handling through
chat_portkey() was inconsistent in our tests.
If pdf_path is a single file, a character scalar containing the
extracted text. If pdf_path is a character vector or nested list, returns
text with the same structure and names as the input. If output_path is
supplied, text files are also written to disk.
Read a file as raw bytes, drop NUL characters, guess or use a provided source encoding, convert to UTF-8 and normalize common punctuation and newlines.
fix_text_file(path, from = NULL)
path |
Character scalar. Path to the text file. |
from |
Optional character. Source encoding to feed to iconv. If NULL the function will try to guess and fall back to WINDOWS-1252. |
A character scalar with cleaned text (UTF-8).
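The byte-level steps can be sketched in base R as follows. This is a simplified illustration that assumes the source encoding is already known to be WINDOWS-1252 and handles only NUL removal, conversion, and newline normalization; encoding detection and punctuation normalization are left out.

```r
# Sketch: drop NUL bytes, convert to UTF-8, normalize newlines.
# Bytes spell "caf\xe9" (Windows-1252), then a stray NUL, then CRLF.
raw_bytes <- as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x00, 0x0d, 0x0a))

# rawToChar() cannot hold embedded NULs, so remove them first.
clean_bytes <- raw_bytes[raw_bytes != as.raw(0)]
txt <- rawToChar(clean_bytes)

# Convert from the assumed source encoding to UTF-8.
txt_utf8 <- iconv(txt, from = "WINDOWS-1252", to = "UTF-8")

# Normalize Windows line endings to "\n".
txt_utf8 <- gsub("\r\n", "\n", txt_utf8, fixed = TRUE)
print(utf8ToInt(txt_utf8))
```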
Estimate Spotify audiobook durations for new chapters or books from a reference data set where Spotify duration is already known. The predictor can be either text file size in bytes or word count. File size is often the simplest option when all chapters are plain text files created by the same workflow.
interpolate_spotify_audiobook_duration(
  reference,
  target = NULL,
  duration_col,
  books_path = NULL,
  target_book = NULL,
  book_col = "book",
  extension = "txt",
  reference_book_col = NULL,
  size_col = NULL,
  words_col = NULL,
  file_col = NULL,
  text_col = NULL,
  measure = c("file_size", "word_count"),
  duration_unit = c("seconds", "minutes", "hours", "hms"),
  output_unit = c("minutes", "seconds", "hours", "hms"),
  method = c("ratio", "lm")
)
reference |
Data frame with known Spotify durations and, unless
|
target |
Data frame with chapters or books to estimate. When
|
duration_col |
Character scalar. Column in |
books_path |
Character scalar or |
target_book |
Character vector or |
book_col |
Character scalar. Book identifier column in |
extension |
Character scalar. File extension to read from |
reference_book_col |
Character scalar or |
size_col |
Character scalar or |
words_col |
Character scalar or |
file_col |
Character scalar or |
text_col |
Character scalar or |
measure |
Character scalar. Either |
duration_unit |
Unit of |
output_unit |
Unit for the returned estimate column. Use |
method |
Estimation method. |
A tibble containing target plus .duration_seconds, an
estimated_duration_* column in output_unit, .duration_measure, and
.duration_method. The total estimated duration is also stored in the
estimated_total_seconds and estimated_total_* attributes.
reference <- tibble::tibble(
  book = c("A", "B"),
  file_size_bytes = c(100000, 150000),
  spotify_duration_minutes = c(120, 180)
)
chapters <- tibble::tibble(
  chapter = c("chapter_1", "chapter_2"),
  file_size_bytes = c(25000, 50000)
)
interpolate_spotify_audiobook_duration(
  reference,
  chapters,
  duration_col = "spotify_duration_minutes",
  size_col = "file_size_bytes",
  duration_unit = "minutes"
)
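One plausible reading of a ratio estimator is "pooled reference duration per byte, multiplied by the target size". The sketch below works through that arithmetic in base R on the same toy numbers; whether the package pools across books or averages per-book ratios is an assumption here, and in this example both would give the same answer because the two reference books share the same rate.

```r
# Sketch of a ratio-style estimate; an assumption, not the package's code.
reference <- data.frame(
  book = c("A", "B"),
  file_size_bytes = c(100000, 150000),
  spotify_duration_minutes = c(120, 180)
)
chapters <- data.frame(
  chapter = c("chapter_1", "chapter_2"),
  file_size_bytes = c(25000, 50000)
)

# Pooled minutes per byte across the reference books.
minutes_per_byte <- sum(reference$spotify_duration_minutes) /
  sum(reference$file_size_bytes)

# Scale each target chapter by its own size.
chapters$estimated_minutes <- chapters$file_size_bytes * minutes_per_byte
print(chapters)
```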
Given either a path to a directory of book folders or a single folder that directly contains chapter files, return a named list where each element is a character vector of chapter file paths (ordered by number or name).
list_book_chapters(books_path = "books", extension = "txt")
books_path |
Character scalar. Path containing subdirectories for each book (default "books"). |
extension |
Character scalar file extension to match, without a leading
dot by default. Defaults to |
A named list of character vectors of file paths.
This helper creates prompts in the same style used by Rathje et al. (2024):
a direct question, followed by a numeric response instruction, followed by
the text placeholder. The returned prompt is a template and may include
placeholders such as {text} or {language} that are expanded later by
run_text_analysis().
make_annotation_prompt(
  question,
  labels = NULL,
  scale = NULL,
  anchors = NULL,
  text_label = "Here is the text:",
  text_placeholder = "{text}"
)
question |
Character scalar question shown before the response instructions. |
labels |
Optional character vector of class labels in numeric order.
For example, |
scale |
Optional numeric vector of length 2 giving the response scale
range, such as |
anchors |
Optional character vector of length 2 giving the low and high
anchor labels used with |
text_label |
Character scalar introducing the text block. |
text_placeholder |
Character scalar placeholder to insert where the text should appear. |
A character scalar prompt template.
Constructs the prompt: identity context + question(s). If the question
template contains {group}, it is expanded once per group (ingroup first).
Otherwise, the question is used as-is (single-question mode).
make_baseline_prompt(
  identity_context,
  question_template,
  groups,
  identity_label
)
identity_context |
Character scalar. The full context string for this identity. |
question_template |
Character scalar. Optionally contains |
groups |
Character vector of all group labels. |
identity_label |
Character scalar. The group label assigned as identity (used to determine ingroup-first ordering). |
Character scalar prompt.
# Per-group mode (asks about each group, ingroup first):
make_baseline_prompt(
  identity_context = "You are simulating an American Democrat.",
  question_template = "How warmly (0-100) do you feel towards {group}s?",
  groups = c("Democrat", "Republican"),
  identity_label = "Democrat"
)

# Single-question mode (asks once, as-is):
make_baseline_prompt(
  identity_context = "You are simulating an American Democrat.",
  question_template = "How warmly (0-100) do you feel towards your outgroup?",
  groups = c("Democrat", "Republican"),
  identity_label = "Democrat"
)
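The `{group}` expansion rule can be sketched in a few lines of base R. The helper name `expand_question()` is hypothetical and exists only for this illustration; it shows the ingroup-first ordering and the single-question fallback, not how the package assembles the final prompt string.

```r
# Sketch of the {group} expansion rule. expand_question() is a
# hypothetical helper, not part of the package.
expand_question <- function(question_template, groups, identity_label) {
  # Single-question mode: no placeholder, so return the template as-is.
  if (!grepl("{group}", question_template, fixed = TRUE)) {
    return(question_template)
  }
  # Per-group mode: ingroup first, then the remaining groups.
  ordered <- c(identity_label, setdiff(groups, identity_label))
  vapply(
    ordered,
    function(g) gsub("{group}", g, question_template, fixed = TRUE),
    character(1),
    USE.NAMES = FALSE
  )
}

per_group <- expand_question(
  "How warmly (0-100) do you feel towards {group}s?",
  groups = c("Republican", "Democrat"),
  identity_label = "Democrat"
)
single <- expand_question(
  "How warmly (0-100) do you feel towards your outgroup?",
  groups = c("Republican", "Democrat"),
  identity_label = "Democrat"
)
print(per_group)
print(single)
```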
Constructs the prompt: material text + question(s). If the question
template contains {group}, it is expanded once per group (ingroup first).
Otherwise, the question is used as-is (single-question mode).
make_post_prompt(chapter_text, question_template, groups, identity_label)
chapter_text |
Character scalar. The full material text. |
question_template |
Character scalar. Optionally contains |
groups |
Character vector of all group labels. |
identity_label |
Character scalar. The group label assigned as identity. |
Character scalar prompt.
# Per-group mode:
make_post_prompt(
  chapter_text = "This is a chapter about cooperation...",
  question_template = "How warmly (0-100) do you feel towards {group}s?",
  groups = c("Democrat", "Republican"),
  identity_label = "Democrat"
)

# Single-question mode:
make_post_prompt(
  chapter_text = "This is a chapter about cooperation...",
  question_template = "How warmly (0-100) do you feel towards your outgroup?",
  groups = c("Democrat", "Republican"),
  identity_label = "Democrat"
)
simulate_treatment()
This helper expands a single simulate_treatment() prompt template into a
concrete prompt string. It is useful for inspecting prompt wording before
launching a run, much like make_baseline_prompt() and make_post_prompt()
are useful for run_ai_on_chapters().
make_treatment_prompt(
  prompt_template,
  intervention_text,
  identity_context = "",
  identity_label = NA_character_
)

build_simulate_treatment_prompt(
  prompt_template,
  intervention_text,
  identity_context = "",
  identity_label = NA_character_
)
prompt_template |
Character scalar. A single prompt template that may
include |
intervention_text |
Character scalar. The intervention text to insert
into |
identity_context |
Character scalar. Optional identity context to prepend to the prompt. |
identity_label |
Character scalar. Optional identity label used to
expand |
A character scalar containing the concrete prompt.
make_treatment_prompt(
  prompt_template = "{intervention_text}\n\nRate this as {identity}.",
  intervention_text = "A short climate message.",
  identity_context = "You are simulating an American adult.",
  identity_label = "American"
)
Quantifies how consistently different AI models score the same units using
ICC(2,1) (intraclass correlation, absolute agreement) and/or Kendall's W
(coefficient of concordance). Models produce continuous scores on a 1–100
scale; this function operates on those raw scores (typically after
aggregating simulation runs via aggregate_simulations()).
model_agreement(
  data,
  outcome = "mean_outcome",
  unit_by = c("book_id", "chapter_id", "group"),
  group_by = NULL,
  model_col = "model",
  metrics = c("icc", "kendall_w")
)
data |
A data frame with one row per model-by-unit combination. |
outcome |
Character string naming the score column (default
|
unit_by |
Character vector of columns that jointly identify a unit
(default |
group_by |
Optional character vector. If provided, agreement metrics
are computed separately within each level of these columns (e.g.,
|
model_col |
Character string naming the model column (default
|
metrics |
Character vector of metrics to compute. One or both of
|
Each model is treated as a rater and each unique combination of
unit_by columns as a target. ICC captures agreement in both level and
rank order; Kendall's W converts the continuous scores to ranks internally
and assesses rank-order concordance only.
ICC(2,1) is the primary recommendation for continuous scores. It penalises models that systematically differ in level and in rank ordering. Interpret with Cicchetti (1994) cut-offs: < .40 poor, .40–.59 fair, .60–.74 good, >= .75 excellent.
Kendall's W converts the continuous 1–100 scores to ranks and asks only whether models rank the units the same way. Useful when the absolute scale is arbitrary or when the researcher cares about ordinal agreement (e.g., "which book scored highest?") rather than exact score match.
For a quick "single consistency score," report ICC. Add Kendall's W as a supplementary rank-agreement check.
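The Cicchetti (1994) cut-offs quoted above translate directly into a small lookup. The function name `cicchetti_label()` is hypothetical and only illustrates the thresholds; it is not claimed to match the labels the package returns.

```r
# Sketch: map ICC values to the Cicchetti (1994) bands quoted above.
# cicchetti_label() is a hypothetical helper for illustration only.
cicchetti_label <- function(icc) {
  as.character(cut(
    icc,
    breaks = c(-Inf, 0.40, 0.60, 0.75, Inf),
    labels = c("poor", "fair", "good", "excellent"),
    right = FALSE  # intervals are closed on the left: [0.40, 0.60), etc.
  ))
}

labels <- cicchetti_label(c(0.39, 0.40, 0.74, 0.75))
print(labels)
```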
Always aggregate simulation runs first via aggregate_simulations().
Failing to do so inflates n and distorts agreement estimates.
Units with missing scores for one or more models are excluded from ICC and
Kendall's W because agreement metrics require the same units to be scored by
all raters. The reported n_units is the number of complete units used in
the calculation.
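The contrast between the two statistics can be seen in a toy calculation. The sketch below computes ICC(2,1) from two-way ANOVA mean squares and Kendall's W from rank sums (no tie correction) using the standard textbook formulas; the scores are invented, and the package may rely on a dedicated library rather than this hand calculation.

```r
# Sketch: ICC(2,1) and Kendall's W by hand on a units-by-models matrix.
# Three models rank five units identically, but model C scores higher.
scores <- cbind(
  A = c(10, 20, 30, 40, 50),
  B = c(12, 25, 33, 41, 58),
  C = c(30, 40, 50, 60, 70)
)
n <- nrow(scores)  # units (targets)
k <- ncol(scores)  # models (raters)
grand <- mean(scores)

# Two-way ANOVA sums of squares without replication.
ss_rows <- k * sum((rowMeans(scores) - grand)^2)
ss_cols <- n * sum((colMeans(scores) - grand)^2)
ss_err <- sum((scores - grand)^2) - ss_rows - ss_cols

ms_rows <- ss_rows / (n - 1)
ms_cols <- ss_cols / (k - 1)
ms_err <- ss_err / ((n - 1) * (k - 1))

# ICC(2,1): two-way random effects, absolute agreement, single rater.
icc21 <- (ms_rows - ms_err) /
  (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Kendall's W: rank units within each model, then compare rank sums.
ranks <- apply(scores, 2, rank)
rank_sums <- rowSums(ranks)
s <- sum((rank_sums - mean(rank_sums))^2)
kendall_w <- 12 * s / (k^2 * (n^3 - n))

print(c(icc21 = icc21, kendall_w = kendall_w))
```

Because the three models order the units identically, W reaches 1, while ICC(2,1) stays below 1 since model C sits at a systematically higher level.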
A tibble with columns: any group_by columns, plus
"icc" or "kendall_w".
The agreement statistic (0–1 scale).
Qualitative label (e.g., "good", "moderate").
Number of models (raters).
Number of units (targets).
p-value for the statistic (F-test for ICC, chi-squared approximation for Kendall's W).
## Not run:
# After aggregating simulations
agg <- aggregate_simulations(sim_data,
  outcome = "rating",
  by = c("model", "book_id", "chapter_id", "group")
)

# Overall agreement
model_agreement(agg,
  outcome = "mean_rating",
  unit_by = c("book_id", "chapter_id", "group")
)

# Agreement by political group
model_agreement(agg,
  outcome = "mean_rating",
  unit_by = c("book_id", "chapter_id"),
  group_by = "group"
)
## End(Not run)
Builds a slide-ready sensitivity table by running model_agreement() across
several substantively useful unit definitions (for example book-level,
chapter-level, and party-specific agreement). Lower-level rows are
aggregated with an NA-safe mean before each agreement calculation, so skipped
chapters do not turn an entire book mean into NA.
model_agreement_sensitivity(
  data,
  outcome = "mean_outcome",
  model_col = "model",
  analyses = NULL,
  metrics = c("icc", "kendall_w"),
  format = c("wide", "long"),
  digits = 2,
  drop_missing = TRUE
)
data |
A data frame with one row per model-by-unit combination. |
outcome |
Character string naming the score column (default
|
model_col |
Character string naming the model column (default
|
analyses |
Optional named list defining analyses to run. Each element
should be a list with |
metrics |
Character vector of metrics to compute. One or both of
|
format |
Character. |
digits |
Integer. Number of decimal places used in the formatted wide table. |
drop_missing |
Logical. Whether to drop rows with missing model, unit,
or grouping identifiers before computing each analysis (default |
A tibble. With format = "wide", columns include Analysis level,
Subgroup, N models, N units, ICC, and Kendall's W.
## Not run:
model_agreement_sensitivity(
  agg,
  outcome = "mean_delta_gap",
  model_col = "model"
)
## End(Not run)
Computes Pearson and/or Spearman correlations between every pair of models
on a shared set of units. This is a diagnostic complement to the omnibus
metrics in model_agreement(): it reveals which models diverge.
model_pairwise_cor(
  data,
  outcome = "mean_outcome",
  unit_by = c("book_id", "chapter_id", "group"),
  group_by = NULL,
  model_col = "model",
  methods = c("pearson", "spearman")
)
data |
A data frame with one row per model-by-unit combination. |
outcome |
Character string naming the score column (default
|
unit_by |
Character vector of columns that jointly identify a unit
(default |
group_by |
Optional character vector. If provided, agreement metrics
are computed separately within each level of these columns (e.g.,
|
model_col |
Character string naming the model column (default
|
methods |
Character vector of correlation types. One or both of
|
A tibble with columns: any group_by columns, plus model_a,
model_b, method, correlation, and n_units.
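The pairwise layout can be sketched with `combn()` and `cor()` from base R. The wide data frame with one column per model is an assumption made for brevity; the package itself works from long data and returns one row per pair and method.

```r
# Sketch: Pearson and Spearman correlations for every pair of models.
# One column per model, one row per shared unit (illustrative values).
wide <- data.frame(
  m1 = c(10, 20, 30, 40, 50),
  m2 = c(12, 25, 33, 41, 58),
  m3 = c(50, 40, 30, 20, 10)
)

pairs <- combn(names(wide), 2)  # each column is one model pair

pair_cor <- function(method) {
  vapply(
    seq_len(ncol(pairs)),
    function(i) cor(wide[[pairs[1, i]]], wide[[pairs[2, i]]], method = method),
    numeric(1)
  )
}

res <- data.frame(
  model_a = pairs[1, ],
  model_b = pairs[2, ],
  pearson = pair_cor("pearson"),
  spearman = pair_cor("spearman"),
  n_units = nrow(wide)
)
print(res)
```

In this toy data `m3` is the divergent model: it correlates negatively with both others, which an omnibus statistic alone would not attribute to a specific model.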
## Not run:
pw <- model_pairwise_cor(agg,
  outcome = "mean_rating",
  unit_by = c("book_id", "chapter_id", "group")
)
plot_model_agreement(pw, type = "heatmap")
## End(Not run)
Takes each model's continuous scores (1–100 scale) and derives rankings from them, then evaluates cross-model concordance via Kendall's W. The rankings are computed by the researcher from the raw scores; the models themselves only produce continuous ratings, not ordinal ranks. Useful for answering "Do models rank books the same way?"
model_rank_consistency(
  data,
  outcome = "mean_outcome",
  unit_by = c("book_id", "chapter_id"),
  rank_within = NULL,
  model_col = "model"
)
data |
A data frame with one row per model-by-unit combination. |
outcome |
Character string naming the score column (default
|
unit_by |
Character vector of columns that jointly identify a unit
(default |
rank_within |
Optional character vector of columns that define separate
ranking contexts (e.g., |
model_col |
Character string naming the model column (default
|
unit_by must identify exactly one row per model within each ranking context.
If the data still contain lower-level rows (for example, chapters) and you
want book-level ranks, aggregate those rows to the book level before calling
this function.
Units with missing scores for one or more models are excluded from the
concordance calculation. The reported n_items is the number of complete
items used.
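Deriving ranks from continuous scores within each model can be sketched with `ave()` and `rank()`. Ranking in descending order (rank 1 for the highest score) is an assumption of this sketch; the direction used by the package is not stated here.

```r
# Sketch: derive per-model ranks from continuous scores.
# Descending order (1 = highest score) is an assumption.
agg <- data.frame(
  model = rep(c("m1", "m2"), each = 3),
  book_id = rep(c("A", "B", "C"), times = 2),
  mean_rating = c(70, 55, 62, 80, 60, 75)
)

# rank() is applied separately within each model; negating the score
# makes the highest rating receive rank 1.
agg$rank <- ave(-agg$mean_rating, agg$model, FUN = rank)
print(agg)
```

Both models place book A first and book B last here, so their rankings agree perfectly even though their raw scores differ.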
A list with two elements:
ranks: Tibble with the unit columns, model, score, and rank.
concordance: Tibble with Kendall's W and associated statistics,
one row per rank_within combination (or one row total).
## Not run:
rc <- model_rank_consistency(agg,
  outcome = "mean_rating",
  unit_by = c("book_id", "chapter_id"),
  rank_within = "group"
)
rc$concordance
rc$ranks

agg_book <- agg |>
  dplyr::group_by(model, book_id, group) |>
  dplyr::summarise(mean_rating = mean(mean_rating), .groups = "drop")

rc_book <- model_rank_consistency(agg_book,
  outcome = "mean_rating",
  unit_by = "book_id",
  rank_within = "group"
)
## End(Not run)
nalanda() returns a neutral, research-friendly fun fact about the
ancient Nalanda University. The goal is simply to provide a small,
informative piece of historical context with no evaluative or cultural
interpretation.
nalanda()
A character string containing one factual statement about Nalanda.
nalanda()
Aggregates lower-level rows to the requested unit level, then calls
model_pairwise_cor(). This is a convenience wrapper for cases where data
still contain chapter, party, or simulation-detail rows but the researcher
wants correlations at a broader level, such as book-level correlations.
pairwise_for_level(
  data,
  outcome = "mean_outcome",
  unit_by = c("book_id", "chapter_id", "group"),
  group_by = NULL,
  model_col = "model",
  methods = c("pearson", "spearman"),
  drop_missing = TRUE
)
data |
A data frame with one row per model-by-unit combination. |
outcome |
Character string naming the score column (default
|
unit_by |
Character vector of columns that jointly identify a unit
(default |
group_by |
Optional character vector. If provided, agreement metrics
are computed separately within each level of these columns (e.g.,
|
model_col |
Character string naming the model column (default
|
methods |
Character vector of correlation types. One or both of
|
drop_missing |
Logical. Whether to drop rows with missing model, unit,
or grouping identifiers before aggregating (default |
Output of model_pairwise_cor() for the requested level.
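The aggregation step that precedes the correlation call can be sketched in base R. This assumes the roll-up is a mean with missing values removed, so that one skipped chapter does not turn a whole book mean into `NA`; the column names mirror the examples in this manual but the data are invented.

```r
# Sketch: roll chapter-level rows up to book level with an NA-safe mean.
agg <- data.frame(
  model = rep(c("m1", "m2"), each = 4),
  book_id = rep(rep(c("A", "B"), each = 2), times = 2),
  chapter_id = rep(c("ch1", "ch2"), times = 4),
  mean_rating = c(60, NA, 50, 54, 70, 72, 40, 44)
)

# na.rm = TRUE is forwarded to mean(), so the missing chapter for
# model m1, book A is skipped rather than propagated.
book_level <- aggregate(
  agg["mean_rating"],
  by = agg[c("model", "book_id")],
  FUN = mean,
  na.rm = TRUE
)
print(book_level)
```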
## Not run:
# Book-level pairwise correlations from chapter-party-level aggregated data
pw_book <- pairwise_for_level(
  agg,
  outcome = "mean_delta_gap",
  unit_by = "book",
  model_col = "model",
  methods = "pearson"
)
summarize_model_correlations(pw_book, method = "pearson")
## End(Not run)
Create a faceted plot (one facet per book) showing mean scores and error bars.
plot_chapter_scores_faceted(summary_df, dv = "post_outgroup", y_label = "Simulated scores")
summary_df |
Data frame produced by |
dv |
Character. Column name prefix for mean and sd. For example,
|
y_label |
Character string for y-axis label. |
A ggplot2 object.
chapter_summary <- summarize_chapter_scores(toy_sim_results)
plot_chapter_scores_faceted(
  chapter_summary,
  dv = "delta_outgroup",
  y_label = "Mean outgroup change"
)
Simple line plot of mean simulated outgroup rating across chapter order for each book.
plot_chapter_trajectories(summary_df, dv = "mean_post_outgroup", y_label = "Simulated scores")
summary_df |
A data frame produced by |
dv |
Character. Column name to plot on the y-axis. Defaults to
|
y_label |
Character. Y-axis label. |
A ggplot2 object.
chapter_summary <- summarize_chapter_scores(toy_sim_results)
plot_chapter_trajectories(
  chapter_summary,
  dv = "mean_delta_gap",
  y_label = "Mean gap change"
)
Create a plot showing means over chapter timepoints using rempsyc::plot_means_over_time for the wide-format response variables.
plot_chapters_over_time(chapters, dv = "delta_gap", group = "book", x_label = "Chapter", y_label = "Simulated scores", plot_title = NULL, plot_subtitle = "", append_model_info = TRUE, ci_type = "between", legend.position = "bottom", groups.order = "decreasing", text_size = 20, line_width = 3, point_size = 4, reverse_score = FALSE, error_bars = TRUE, neutrality_line = TRUE, point_images = NULL, image_size = 0.04, image_nudge_x = 0, image_nudge_y = 0, image_jitter_width = 0, image_jitter_height = 0, facet = NULL, facet_ncol = NULL, facets.order = "increasing")
chapters |
A data frame or list of processed simulation rows, typically
returned by |
dv |
Character. Name of the column to plot as the dependent variable (default: "delta_gap"). |
group |
The group by which to plot the variable |
x_label |
Character. X-axis label. |
y_label |
Character. Y-axis label. |
plot_title |
Optional character title. If |
plot_subtitle |
Optional plot subtitle. |
append_model_info |
Logical. If |
ci_type |
Character. Type of confidence interval to pass to |
legend.position |
Position for legend. |
groups.order |
Specifies the desired display order of the groups on the legend. Either provide the levels directly, or a string: "increasing" or "decreasing", to order based on the average value of the variable on the y axis, or "string.length", to order from the shortest to the longest string (useful when working with long string names). Defaults to "decreasing". |
text_size |
Numeric. Base text size for axis/title text. |
line_width |
Numeric. Line thickness used in |
point_size |
Numeric. Point size used in |
reverse_score |
Logical. Whether to reverse score scale. |
error_bars |
Logical. Show error bars. |
neutrality_line |
Logical. Add a horizontal neutrality line at 50. |
point_images |
Optional named list mapping group levels to image file
paths (PNG recommended). When supplied, the point markers are replaced with
the corresponding images, and the legend labels are updated to show the
matching image alongside the group name when |
image_size |
Numeric. Size of images when |
image_nudge_x |
Numeric. Horizontal offset applied to point images only.
Defaults to |
image_nudge_y |
Numeric. Vertical offset applied to point images only.
Defaults to |
image_jitter_width |
Numeric. Horizontal jitter width applied to point
images only. Defaults to |
image_jitter_height |
Numeric. Vertical jitter height applied to point
images only. Defaults to |
facet |
The variable by which to facet the plot. |
facet_ncol |
Optional numeric value passed to |
facets.order |
Specifies the desired display order of facet panels. Either provide the levels directly, or a string: "increasing" or "decreasing", to order panels based on the average value of the y variable, or "string.length" to order panels by facet label length. Defaults to "increasing". |
A ggplot2 object.
plot_chapters_over_time(
  toy_sim_results,
  dv = "delta_outgroup",
  group = "party",
  facet = "book",
  y_label = "Outgroup change"
)
Plot chapter trajectories for one-turn simulations
plot_chapters_over_time_one_turn(chapters, dv = "outgroup_rating", ...)
chapters |
Raw output from |
dv |
Character. Name of the metric to plot. Defaults to
|
... |
Additional arguments passed to |
A ggplot2 object.
Generates a forest plot displaying mean reduction in affective polarization across books, including 95% confidence intervals.
plot_forest_books(forest_df, dv = "delta_gap", add_ci_label = TRUE, digits = 2, label_cols = c("book"), show_ci_label = TRUE, ci_multiline = TRUE, ci_show_party = FALSE, show_legend = TRUE, ci_label_fontsize = NULL, ci_label_lineheight = 0.85, header = NULL, title = "", xlab = "", xticks = NULL, xticks.digits = NULL, zero = NA, show_overall = TRUE, ci.vertices = FALSE)
forest_df |
Either:
|
dv |
Character. Variable prefix used when |
add_ci_label |
Logical. Passed to |
digits |
Integer. Passed to |
label_cols |
Character vector of left-side label columns.
Defaults to |
show_ci_label |
Logical. If TRUE, appends an internally generated |
ci_multiline |
Logical. For grouped party output, print each party CI on
its own line (using |
ci_show_party |
Logical. Include party names in CI labels. |
show_legend |
Logical. Show party legend when grouped data are present. |
ci_label_fontsize |
Optional numeric size for the CI label column. Useful when grouped party CIs are shown on multiple lines. |
ci_label_lineheight |
Numeric line height for the CI label column when
|
header |
Labels of the columns to be displayed and as specified in
|
title |
Plot title |
xlab |
X-axis label |
xticks |
Optional numeric vector of x-axis tick positions. Defaults to
|
xticks.digits |
Integer number of digits for x-axis tick labels.
Defaults to |
zero |
Numeric scalar, NA, or NULL. Reference line position for forestplot. Defaults to NA (no zero/reference line). NULL is treated as NA for convenience. |
show_overall |
Logical. If TRUE (default), a vertical dashed line indicating the overall mean effect is added. |
ci.vertices |
Logical. Whether to draw CI vertices in the forest plot. |
Books are ordered from strongest to weakest mean effect.
The plot uses circular markers for point estimates and displays 95% confidence intervals.
The vertical dashed line (if enabled) represents the average effect across books.
The temporary mean/lower/upper columns required by forestplot are
generated internally when needed.
If party is present, estimates are drawn as multiple CIs per book row
(one row per book; one estimate per party).
A forestplot grob object.
book_summary <- summarize_chapter_scores(
  toy_sim_results,
  aggregate_level = "book"
)
forest_df <- prepare_forest_books(book_summary, dv = "delta_gap")
plot_forest_books(
  forest_df,
  xlab = "Reduction in polarization gap",
  show_overall = FALSE
)
Creates diagnostic visualizations for model agreement or pairwise correlation results.
plot_model_agreement(data, type = c("metrics", "heatmap"), method = NULL)
data |
Output of |
type |
Character. |
method |
Character. Correlation method to plot when |
A ggplot2 object.
## Not run:
plot_model_agreement(model_agreement(agg, outcome = "mean_rating"), type = "metrics")
plot_model_agreement(model_pairwise_cor(agg, outcome = "mean_rating"), type = "heatmap")
plot_model_agreement(model_pairwise_cor(agg, outcome = "mean_rating"), type = "heatmap", method = "pearson")
## End(Not run)
Creates a heatmap from the ranks element returned by
summarize_top_units(..., include_ranks = TRUE). Rows are units, columns are
models, and cells show each model's rank for that unit.
plot_top_unit_heatmap(data, item_col = NULL, model_col = "model", facet_by = NULL, top_n_items = NULL, item_labels = NULL, show_values = TRUE, title = "Unit ranks by model")
data |
The |
item_col |
Character. Column identifying the ranked item. If |
model_col |
Character. Column identifying the model (default
|
facet_by |
Optional character vector of columns to facet by, e.g.
|
top_n_items |
Optional integer. If supplied, keep only the best
|
item_labels |
Optional character vector for display labels. Use a named
vector to map item IDs to labels, e.g. |
show_values |
Logical. If |
title |
Optional plot title. |
A ggplot2 object.
## Not run:
top_books <- summarize_top_units(
  agg,
  outcome = "mean_delta_gap",
  item_by = "book",
  rank_within = "party",
  include_ranks = TRUE
)
plot_top_unit_heatmap(top_books$ranks, item_col = "book", facet_by = "party")
## End(Not run)
Creates a connected Cleveland dot plot from summarize_top_units() output
when rankings were computed within a two-level subgroup, such as party. Each
row is an item, dots show subgroup-specific mean ranks, and connecting lines
show how much the ranking differs between subgroups.
plot_top_unit_pairs(data, item_col = NULL, subgroup_col = "party", top_n_items = NULL, item_labels = NULL, subgroup_order = NULL, title = "Paired subgroup ranks", x_breaks = NULL, x_limits = NULL)
data |
Output of |
item_col |
Character. Column identifying the ranked item. If |
subgroup_col |
Character. Two-level subgroup column to connect, e.g.
|
top_n_items |
Optional integer. If supplied, keep only items with the
best average |
item_labels |
Optional character vector for display labels. Use a named
vector to map item IDs to labels, e.g. |
subgroup_order |
Optional character vector giving the two subgroup levels in display order. |
title |
Optional plot title. |
x_breaks |
Optional numeric vector of x-axis breaks. If |
x_limits |
Optional numeric vector of length 2. If |
A ggplot2 object.
## Not run:
top_books_party <- summarize_top_units(
  agg,
  outcome = "mean_delta_gap",
  item_by = "book",
  rank_within = "party"
)
plot_top_unit_pairs(top_books_party, item_col = "book", subgroup_col = "party")
## End(Not run)
Creates a dot plot from summarize_top_units() output. Units are ordered by
average rank across models, point size shows the mean score, and text labels
show how many models placed the unit in the top N.
plot_top_units(data, item_col = NULL, facet_by = NULL, top_n_items = NULL, item_labels = NULL, title = "Units most consistently ranked highest", x_breaks = NULL, x_limits = NULL, caption = NULL, show_top_n_label = TRUE)
data |
Output of |
item_col |
Character. Column identifying the ranked item. If |
facet_by |
Optional character vector of columns to facet by, e.g.
|
top_n_items |
Optional integer. If supplied, keep only the best
|
item_labels |
Optional character vector for display labels. Use a named
vector to map item IDs to labels, e.g. |
title |
Optional plot title. |
x_breaks |
Optional numeric vector of x-axis breaks. If |
x_limits |
Optional numeric vector of length 2. If |
caption |
Optional plot caption. If |
show_top_n_label |
Logical. If |
A ggplot2 object.
## Not run:
top_books <- summarize_top_units(
  agg,
  outcome = "mean_delta_gap",
  item_by = "book",
  rank_within = "party"
)
plot_top_units(top_books, item_col = "book", facet_by = "party")
## End(Not run)
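The three quantities this plot displays (average rank across models, mean score, and the number of models placing a unit in the top N) can be sketched in base R. The toy scores, the `top_n` cutoff, and the column names below are assumptions for illustration; the actual columns returned by summarize_top_units() may differ.

```r
# Toy book-level scores from three models (illustrative values only).
scores <- data.frame(
  model = rep(c("m1", "m2", "m3"), each = 4),
  book = rep(c("A", "B", "C", "D"), times = 3),
  score = c(9, 7, 5, 1, 8, 9, 2, 3, 7, 6, 8, 2)
)

# Rank books within each model; rank 1 is the highest score.
scores$rank <- ave(-scores$score, scores$model, FUN = rank)

top_n <- 2  # hypothetical cutoff for "top N"

# Summaries per book; tapply() returns all three in the same (sorted) book order.
mean_rank <- tapply(scores$rank, scores$book, mean)
n_top <- tapply(scores$rank <= top_n, scores$book, sum)
mean_score <- tapply(scores$score, scores$book, mean)

summary_df <- data.frame(
  book = names(mean_rank),
  mean_rank = as.numeric(mean_rank),
  n_top = as.integer(n_top),
  mean_score = as.numeric(mean_score)
)

# Order units from best (lowest) to worst average rank.
summary_df <- summary_df[order(summary_df$mean_rank), ]
summary_df
```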
Computes standard errors and 95% confidence intervals for book-level
estimates. This function assumes the input is already aggregated at the
book level (e.g., using
summarize_chapter_scores(..., aggregate_level = "book")).
prepare_forest_books(summary_books, dv = "delta_gap", add_ci_label = TRUE, digits = 2)
summary_books |
A data frame containing at least |
dv |
Character. The variable prefix to use for the forest plot.
Defaults to |
add_ci_label |
Logical. Default is TRUE. If TRUE, a formatted
character column |
digits |
Integer. Default is 2. Number of decimal places used
when formatting the |
Standard errors are computed as sd / sqrt(sim). Confidence intervals
are calculated using a normal approximation (mean +/- 1.96 * SE).
When a summary row has one observation and the standard deviation is
therefore missing, the CI is collapsed to the point estimate.
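As a rough illustration of the calculation described above, the following base R sketch computes the standard error and the normal-approximation interval on toy data. The column name `n` for the number of simulation runs is a hypothetical stand-in, and the values are invented.

```r
# Toy book-level summary (illustrative values; `n` is a hypothetical count column).
book_summary <- data.frame(
  book = c("Book A", "Book B", "Book C"),
  mean_delta_gap = c(4.0, 1.5, 2.0),
  sd_delta_gap = c(2.0, 1.0, NA),  # NA because Book C has a single observation
  n = c(16, 25, 1)
)

# Standard error of the mean: sd divided by the square root of the run count.
se <- book_summary$sd_delta_gap / sqrt(book_summary$n)

# 95% CI by normal approximation: mean +/- 1.96 * SE.
lower <- book_summary$mean_delta_gap - 1.96 * se
upper <- book_summary$mean_delta_gap + 1.96 * se

# With a missing SD, collapse the interval to the point estimate.
no_se <- is.na(se)
lower[no_se] <- book_summary$mean_delta_gap[no_se]
upper[no_se] <- book_summary$mean_delta_gap[no_se]

cbind(book_summary["book"], se = se, lower = lower, upper = upper)
```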
A tibble with added columns:
Mean effect.
Standard error of the mean.
Lower bound of the 95% CI.
Upper bound of the 95% CI.
book_summary <- summarize_chapter_scores(
  toy_sim_results,
  aggregate_level = "book"
)
prepare_forest_books(book_summary, dv = "delta_gap")
Compute a weighted score across selected numeric columns and return the input
data with a final weighted_score, sorted by score.
rank_weighted(data, weights, normalize = TRUE, decreasing = TRUE, na_rm = FALSE)
data |
A data frame. |
weights |
Named numeric vector of weights. Names must match columns in
|
normalize |
Logical. If |
decreasing |
Logical. If |
na_rm |
Logical. If |
A tibble containing all original columns plus weighted_score,
sorted by score.
book_summary <- summarize_chapter_scores(
  toy_sim_results,
  aggregate_level = "book"
)
rank_weighted(
  book_summary,
  weights = c(mean_delta_outgroup = 0.6, mean_delta_gap = 0.4)
)
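The general idea of a weighted score can be sketched in base R as follows. This is not the package's implementation: in particular, the sketch assumes that normalization means rescaling the weights to sum to 1, which may not match what `normalize = TRUE` does in rank_weighted(). The toy values are invented.

```r
# Toy book-level summary (illustrative values only).
books <- data.frame(
  book = c("A", "B", "C"),
  mean_delta_outgroup = c(2, 4, 1),
  mean_delta_gap = c(5, 4, 3)
)
weights <- c(mean_delta_outgroup = 0.6, mean_delta_gap = 0.4)

# Assumption: normalization rescales the weights so they sum to 1.
w <- weights / sum(weights)

# Weighted sum across the named columns; selecting by names(w) keeps
# the columns aligned with their weights.
books$weighted_score <- as.numeric(as.matrix(books[names(w)]) %*% w)

# Sort from highest to lowest score.
ranked <- books[order(books$weighted_score, decreasing = TRUE), ]
ranked
```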
Convert a list of chapter file paths (as produced by list_book_chapters()) into
a nested list of chapter texts: list(book -> list(chapter_name -> text)).
read_book_texts(chapter_list)
chapter_list |
A named list of character vectors with file paths. |
A nested list of character scalars (texts) with chapter basenames as
names. Each book element also stores its book name in a book attribute so
selecting a single book with $ preserves enough metadata for simulation
helpers.
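The shape of the returned object can be illustrated with a small base R sketch. The helper `read_texts_sketch()` is hypothetical and only mimics the structure described above; whether the package strips file extensions from chapter names is an assumption here.

```r
# Create a throwaway folder with two chapter files for a hypothetical book.
book_dir <- file.path(tempdir(), "nalanda-sketch", "BookA")
dir.create(book_dir, recursive = TRUE, showWarnings = FALSE)
writeLines("First chapter text.", file.path(book_dir, "chapter1.txt"))
writeLines("Second chapter text.", file.path(book_dir, "chapter2.txt"))

# Named list of file paths: one character vector per book.
chapter_list <- list(
  BookA = sort(list.files(book_dir, pattern = "\\.txt$", full.names = TRUE))
)

# Hypothetical helper: read each file into one string, name it by chapter,
# and tag each book element with a "book" attribute.
read_texts_sketch <- function(chapter_list) {
  out <- lapply(names(chapter_list), function(book) {
    paths <- chapter_list[[book]]
    texts <- lapply(paths, function(p) {
      paste(readLines(p, warn = FALSE), collapse = "\n")
    })
    # Assumption: names are file basenames without the extension.
    names(texts) <- tools::file_path_sans_ext(basename(paths))
    attr(texts, "book") <- book
    texts
  })
  names(out) <- names(chapter_list)
  out
}

book_texts <- read_texts_sketch(chapter_list)

# Selecting one book with `$` keeps its "book" attribute.
attr(book_texts$BookA, "book")
```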
Scans a folder for chapter files and renames them to chapter1.ext, chapter2.ext, ..., using heuristics for ordering (intro, part 1/2, numeric chapter numbers, appendix, etc.).
rename_chapters(folder, extension = "txt")
folder |
Character scalar. Path to the folder containing chapter files. |
extension |
Character scalar file extension to match, without a leading
dot by default. Defaults to |
A tibble with columns old_path, base, order_score, new_name, and new_path.
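To show why an ordering heuristic is needed at all, here is a small base R sketch that builds a rename plan without touching any files. The scoring rule in `order_score_sketch()` is a simplified assumption (intro first, appendix last, numeric chapter number otherwise) and is not the package's actual heuristic.

```r
# Hypothetical file names in the arbitrary order a folder listing might give.
files <- c("Chapter 10.txt", "Introduction.txt", "Chapter 2.txt",
           "Appendix.txt", "Chapter 1.txt")

# Simplified scoring rule (an assumption, not the package's heuristic).
order_score_sketch <- function(base) {
  name <- tolower(base)
  # Pull out the first run of digits; names without digits become NA.
  num <- suppressWarnings(as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", name)))
  ifelse(grepl("intro", name), 0,
    ifelse(grepl("appendix", name), 1e6, num))
}

plan <- data.frame(base = files, order_score = order_score_sketch(files))
plan <- plan[order(plan$order_score), ]
plan$new_name <- paste0("chapter", seq_len(nrow(plan)), ".txt")
plan
```

Sorting by the extracted number rather than alphabetically keeps "Chapter 10" after "Chapter 2".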
This function implements a cumulative multi-turn design where each simulation creates one persistent chat per book and identity. The chat first establishes a baseline, then processes chapters sequentially in order, one turn per chapter, preserving context across the full book.
run_ai_cumulative_chapters(book_texts, groups, context_text, question_text, output_mode = c("structured", "text"), n_simulations = 1, temperature = 0, seed = 42, model = "gemini-2.5-flash-lite", integration = getOption("nalanda.integration"), virtual_key = getOption("nalanda.virtual_key"), base_url = getOption("nalanda.base_url"), excerpt_chars = 200, checkpoint_dir = NULL, checkpoint_prefix = "run_ai_cumulative_chapters", save_dir = NULL, save_prefix = "results")
book_texts |
A nested list of books -> chapters as returned by
|
groups |
Character vector of group labels (length >= 2). |
context_text |
Character. Either a scalar template containing
|
question_text |
Character scalar. A question template containing the
placeholder |
output_mode |
Character. |
n_simulations |
Integer. Number of repeated simulations per book per identity. |
temperature |
Numeric. Sampling temperature passed to the chat backend. |
seed |
Integer. Random seed for reproducibility (incremented for each simulation). |
model |
Character. Model name for the chat backend. |
integration |
Optional Portkey/gateway route slug. Use a route returned
by |
virtual_key |
Optional legacy virtual key. |
base_url |
Character. Base URL for API calls. |
excerpt_chars |
Integer. Number of chapter characters to retain in the stored prompt previews shown in results. |
checkpoint_dir |
Optional directory. If supplied, each completed
book/identity/simulation conversation is saved as its own |
checkpoint_prefix |
Character scalar used at the start of checkpoint
filenames when |
save_dir |
Optional directory. If supplied, each book is saved as one
|
save_prefix |
Character scalar used in book-level filenames when
|
A tibble or named list of tibbles with cumulative turn-level rows. The baseline turn is followed by one post turn per chapter, all within the same chat per book/identity/simulation.
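The turn structure of the cumulative design (one chat per book, identity, and simulation, with a baseline turn followed by one turn per chapter) can be laid out as a schedule in base R. No model is called here. The seed rule `seed + sim - 1` and the 1-based `turn_index` are assumptions about how "incremented for each simulation" might work, and the book contents are invented.

```r
# Hypothetical books with their chapters in reading order.
books <- list(
  "Book A" = c("chapter_1", "chapter_2"),
  "Book B" = c("chapter_1", "chapter_2", "chapter_3")
)
identities <- c("Democrat", "Republican")
n_simulations <- 2
seed <- 42

# One persistent chat per book x identity x simulation.
jobs <- expand.grid(
  sim = seq_len(n_simulations),
  identity = identities,
  book = names(books),
  stringsAsFactors = FALSE
)
# Assumption: the seed is incremented by one for each simulation index.
jobs$seed <- seed + jobs$sim - 1

# Within each chat: one baseline turn, then one post turn per chapter.
schedule <- do.call(rbind, lapply(seq_len(nrow(jobs)), function(i) {
  chapters <- books[[jobs$book[i]]]
  data.frame(
    book = jobs$book[i],
    identity = jobs$identity[i],
    sim = jobs$sim[i],
    seed = jobs$seed[i],
    turn_index = seq_len(length(chapters) + 1),
    turn_type = c("baseline", rep("post", length(chapters))),
    chapter = c(NA, chapters)
  )
}))

head(schedule)
```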
## Not run:
raw_cumulative <- run_ai_cumulative_chapters(
  book_texts = list(
    "Book A" = list(
      chapter_1 = "A first chapter about cooperation.",
      chapter_2 = "A second chapter about conflict and repair."
    )
  ),
  groups = c("Democrat", "Republican"),
  context_text = "You are simulating an American adult who politically identifies as a {identity}.",
  question_text = "On a scale from 0 to 100, how warmly do you feel towards {group}s?",
  n_simulations = 1,
  temperature = 0,
  seed = 42
)
compute_run_ai_metrics_cumulative(raw_cumulative)
## End(Not run)
This function implements a two-turn sequential chat design to measure the effect of reading book chapters on attitudes. For each simulation and each identity assignment, the function:
Establishes a baseline by assigning an identity, then asking for ratings of each group (ingroup first, outgroup second).
Shows the chapter and asks for post-intervention ratings in the same chat session (same ordering: ingroup first, outgroup second).
This design creates a within-agent pre-post comparison, with conversation memory maintained between turns. Ingroup and outgroup columns are computed post-hoc from the assigned identity and the group labels.
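The "ingroup first, outgroup second" ordering can be illustrated with a short base R sketch that fills the `{identity}` and `{group}` placeholders used in the package examples. The helper `fill_template()` is hypothetical, and the exact prompt text the package assembles is not shown in this manual, so the output below is only indicative.

```r
# Hypothetical helper: replace {key} placeholders with supplied values.
fill_template <- function(template, values) {
  for (key in names(values)) {
    template <- gsub(paste0("{", key, "}"), values[[key]], template, fixed = TRUE)
  }
  template
}

groups <- c("Democrat", "Republican")
identity <- "Republican"

# Put the assigned identity's own group first, the other group(s) after.
ordered_groups <- c(identity, setdiff(groups, identity))

context <- fill_template(
  "You are simulating an American adult who politically identifies as a {identity}.",
  c(identity = identity)
)
questions <- vapply(ordered_groups, function(g) {
  fill_template(
    "On a scale from 0 to 100, how warmly do you feel towards {group}s?",
    c(group = g)
  )
}, character(1))

baseline_prompt <- paste(c(context, questions), collapse = "\n")
cat(baseline_prompt)
```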
run_ai_on_chapters(book_texts, groups, context_text, question_text, output_mode = c("structured", "text"), n_simulations = 1, temperature = 0, seed = 42, model = "gemini-2.5-flash-lite", integration = getOption("nalanda.integration"), virtual_key = getOption("nalanda.virtual_key"), base_url = getOption("nalanda.base_url"), excerpt_chars = 200, checkpoint_dir = NULL, checkpoint_prefix = "run_ai_on_chapters", save_dir = NULL, save_prefix = "results")
book_texts |
A single character (one chapter) or a nested list of
books -> chapters as returned by |
groups |
Character vector of group labels (length >= 2). These are the
groups being compared. Example: |
context_text |
Character. Either:
|
question_text |
Character scalar. A question template containing the
placeholder |
output_mode |
Character. |
n_simulations |
Integer. Number of repeated simulations per chapter per identity (each simulation = 2 chat turns). |
temperature |
Numeric. Sampling temperature passed to the chat backend. |
seed |
Integer. Random seed for reproducibility (incremented for each simulation). |
model |
Character. Model name for the chat backend (for example,
|
integration |
Optional Portkey/gateway route slug. Should look like
|
virtual_key |
Optional legacy virtual key. Should look like
|
base_url |
Character. Base URL for API calls. |
excerpt_chars |
Integer. Number of chapter characters to retain in the stored post-prompt preview shown in results. |
checkpoint_dir |
Optional directory. If supplied, each completed
book/chapter/identity/simulation unit is saved as its own |
checkpoint_prefix |
Character scalar used at the start of checkpoint
filenames when |
save_dir |
Optional directory. If supplied, each book is saved as one
|
save_prefix |
Character scalar used in book-level filenames when
|
Authentication uses PORTKEY_API_KEY via ellmer::chat_portkey(). Set it
persistently in .Renviron:
usethis::edit_r_environ()
# Add a line like:
# PORTKEY_API_KEY=your_api_key_here
Then restart your R session.
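After restarting, a quick way to confirm that the key is visible to R is to query the environment directly. The helper name `has_portkey_key()` below is hypothetical and is not part of the package.

```r
# Hypothetical helper: TRUE when PORTKEY_API_KEY is set to a non-empty value.
has_portkey_key <- function() {
  nzchar(Sys.getenv("PORTKEY_API_KEY"))
}

if (!has_portkey_key()) {
  message("PORTKEY_API_KEY is not set; add it to .Renviron and restart R.")
}
```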
A tibble of raw turn-level ratings, or a named list of tibbles (one
per book). Each row is one rating observation and includes:
chapter, sim, identity, turn_index, turn_type, target_group,
and rating, plus prompt and metadata columns.
Use compute_run_ai_metrics() to derive ingroup/outgroup summaries and
gap/delta metrics.
The object has class nalanda and model attributes.
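To make the post-hoc derivation concrete, the sketch below labels each rating as ingroup or outgroup from `identity` and `target_group`, then computes pre-post changes on toy data. It is an illustration, not the logic of compute_run_ai_metrics(): the `role` column, the toy ratings, and the sign convention (post minus baseline) are assumptions.

```r
# Toy turn-level ratings for one chapter and one simulation (invented values).
raw <- data.frame(
  chapter = "chapter_1",
  sim = 1,
  identity = rep(c("Democrat", "Republican"), each = 4),
  turn_type = rep(rep(c("baseline", "post"), each = 2), times = 2),
  target_group = c("Democrat", "Republican", "Democrat", "Republican",
                   "Republican", "Democrat", "Republican", "Democrat"),
  rating = c(80, 20, 78, 30, 85, 25, 80, 35)
)

# A rating is "ingroup" when the rated group matches the assigned identity.
raw$role <- ifelse(raw$target_group == raw$identity, "ingroup", "outgroup")

# One row per identity, one column per turn-by-role combination.
key <- paste(raw$turn_type, raw$role, sep = "_")
wide <- tapply(raw$rating, list(raw$identity, key), mean)

# Assumed convention: change = post minus baseline; gap = ingroup minus outgroup.
metrics <- data.frame(
  identity = rownames(wide),
  delta_outgroup = wide[, "post_outgroup"] - wide[, "baseline_outgroup"],
  delta_gap = (wide[, "post_ingroup"] - wide[, "post_outgroup"]) -
    (wide[, "baseline_ingroup"] - wide[, "baseline_outgroup"]),
  row.names = NULL
)
metrics
```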
# Per-group mode (asks about each group, ingroup first):
make_baseline_prompt(
  identity_context = "You are simulating an American Democrat.",
  question_template = "How warmly do you feel towards {group}s?",
  groups = c("Democrat", "Republican"),
  identity_label = "Democrat"
)
# Single-question mode (asks once, as-is):
make_baseline_prompt(
  identity_context = "You are simulating an American Democrat.",
  question_template = "How warmly do you feel towards your political outgroup?",
  groups = c("Democrat", "Republican"),
  identity_label = "Democrat"
)
This function implements a one-turn design where identity context, chapter
text, and the rating question are combined into a single prompt. Independent
prompts are executed in parallel with ellmer::parallel_chat_structured().
run_ai_on_chapters_one_turn(book_texts, groups, context_text, question_text, output_mode = c("structured", "text"), n_simulations = 1, temperature = 0, seed = 42, model = "gemini-2.5-flash-lite", integration = getOption("nalanda.integration"), virtual_key = getOption("nalanda.virtual_key"), base_url = getOption("nalanda.base_url"), excerpt_chars = 200, max_active = 10, rpm = 500)
book_texts |
A single character (one chapter) or a nested list of
books -> chapters as returned by |
groups |
Character vector of group labels (length >= 2). These are the
groups being compared. Example: |
context_text |
Character. Either:
|
question_text |
Character scalar. A question template containing the
placeholder |
output_mode |
Character. |
n_simulations |
Integer. Number of repeated simulations per chapter per identity. |
temperature |
Numeric. Sampling temperature passed to the chat backend. |
seed |
Integer. Random seed for reproducibility. As in
|
model |
Character. Model name for the chat backend. |
integration |
Optional Portkey/gateway route slug. If supplied and
|
virtual_key |
Optional legacy virtual key. If supplied and |
base_url |
Character. Base URL for API calls. |
excerpt_chars |
Integer. Number of chapter characters to retain in the stored prompt preview shown in results. |
max_active |
Integer. Maximum number of concurrent requests passed to
|
rpm |
Integer. Requests-per-minute cap passed to
|
A tibble of raw single-turn ratings, or a named list of tibbles (one
per book). Each row is one rating observation and includes chapter,
sim, identity, turn_index, turn_type, target_group, and
rating, plus prompt and metadata columns. Use
compute_run_ai_metrics_one_turn() to derive chapter-level one-turn
summaries.
This function applies a prompt template to each row of a text dataset and
extracts structured responses with ellmer. It is designed for dataset-first
workflows such as sentiment, emotion, offensiveness, or moral-foundation
annotation across many short texts.
run_text_analysis(data, text_col = "text", prompt, response_type, output_mode = c("structured", "text"), id_col = NULL, n_simulations = 1, temperature = 0, seed = 42, model = "gemini-2.5-flash-lite", integration = getOption("nalanda.integration"), virtual_key = getOption("nalanda.virtual_key"), base_url = getOption("nalanda.base_url"), excerpt_chars = 200, max_active = 10, rpm = 500)
data |
A data frame with at least one text column. |
text_col |
Name of the column containing the text to analyze. |
prompt |
Character scalar prompt template. It may reference any columns
in |
response_type |
An |
output_mode |
Character. |
id_col |
Optional column name identifying each text row. When omitted, a
sequential |
n_simulations |
Integer. Number of repeated runs per row. |
temperature |
Numeric. Sampling temperature passed to the backend. |
seed |
Integer. Random seed for reproducibility. |
model |
Character. Model name for the chat backend. |
integration |
Optional Portkey/gateway route slug. Use a route returned
by |
virtual_key |
Optional legacy virtual key. |
base_url |
Character. Base URL for API calls. |
excerpt_chars |
Integer. Number of text characters to retain in stored prompt previews. |
max_active |
Integer. Maximum number of concurrent requests passed to
|
rpm |
Integer. Requests-per-minute cap passed to
|
A tibble containing the original row metadata, simulation index, structured response fields, and stored prompt previews.
Exports a forestplot object to both PNG and PDF files using grid graphics devices.
save_forest_plot(plot_object, filename, width = 16/1.8, height = 9/1.8, res = 300)
plot_object |
A forestplot grob object. |
filename |
Character string specifying the file path without extension. |
width |
Width of the output figure in inches. |
height |
Height of the output figure in inches. |
res |
Resolution in DPI for the PNG output (default = 300). |
Because forestplot uses grid graphics (not ggplot2),
ggsave() is not compatible. This function opens graphics
devices manually and prints the plot object.
Two files are created:
filename.png: High-resolution raster image
filename.pdf: Vector-based PDF
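The manual-device pattern described above looks roughly like the following sketch, which uses a plain grid grob as a stand-in for a forestplot object. `save_grid_sketch()` is a hypothetical helper, not the package function. It draws with grid::grid.draw(), whereas a forestplot object would be printed instead.

```r
# Hypothetical helper: write a grid object to <filename>.png and <filename>.pdf.
save_grid_sketch <- function(plot_object, filename,
                             width = 16 / 1.8, height = 9 / 1.8, res = 300) {
  # Vector output: open a PDF device, draw, close.
  grDevices::pdf(paste0(filename, ".pdf"), width = width, height = height)
  grid::grid.draw(plot_object)
  grDevices::dev.off()

  # Raster output: sizes are in inches, so a resolution (DPI) is required.
  grDevices::png(paste0(filename, ".png"),
                 width = width, height = height, units = "in", res = res)
  grid::grid.draw(plot_object)
  grDevices::dev.off()

  invisible(paste0(filename, c(".png", ".pdf")))
}

# Stand-in grid object (a circle), since ggsave() does not apply to grid output.
demo_grob <- grid::circleGrob(r = 0.3)
out <- save_grid_sketch(demo_grob, tempfile("grid-sketch"))
file.exists(out)
```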
## Not run:
book_summary <- summarize_chapter_scores(
  toy_sim_results,
  aggregate_level = "book"
)
forest_plot <- plot_forest_books(book_summary, xlab = "Delta gap")
save_forest_plot(forest_plot, tempfile("nalanda-forest"))
## End(Not run)
This function provides a simpler, prompt-first interface for running one or
more turns against an intervention text. Each element of prompt defines one
turn in the chat sequence. When groups is supplied, the same prompt
sequence is repeated for each group identity; groups do not create additional
turns.
simulate_treatment( intervention_text = "", prompt, response_type, output_mode = c("structured", "text"), groups = NULL, context_text = NULL, n_simulations = 1, temperature = 0, seed = 42, model = "gemini-2.5-flash-lite", integration = getOption("nalanda.integration"), virtual_key = getOption("nalanda.virtual_key"), base_url = getOption("nalanda.base_url"), excerpt_chars = 200, checkpoint_dir = NULL, checkpoint_prefix = "simulate_treatment", save_dir = NULL, save_prefix = "results" )
intervention_text |
A single character string or a nested list of
intervention texts. This is mapped internally onto the same job grid used
by |
prompt |
Character vector of prompt templates. Each element defines one
turn. Prompt templates may include |
response_type |
An |
output_mode |
Character. |
groups |
Optional character vector of group labels. If supplied, the full prompt sequence is rerun for each group identity. |
context_text |
Optional character scalar or vector. If provided, it is
prepended to every turn for the corresponding group. Scalar values are
recycled across groups, and |
n_simulations |
Integer. Number of repeated simulations per intervention per identity. |
temperature |
Numeric. Sampling temperature passed to the chat backend. |
seed |
Integer. Random seed for reproducibility (incremented for each simulation index). |
model |
Character. Model name for the chat backend. |
integration |
Optional Portkey/gateway route slug. If supplied and
|
virtual_key |
Optional legacy virtual key. If supplied and |
base_url |
Character. Base URL for API calls. |
excerpt_chars |
Integer. Number of intervention-text characters to retain in stored prompt previews. |
checkpoint_dir |
Optional directory. If supplied, each completed
treatment/identity/simulation unit is saved as its own |
checkpoint_prefix |
Character scalar used at the start of checkpoint
filenames when |
save_dir |
Optional directory. If supplied, each intervention collection
is saved as one |
save_prefix |
Character scalar used in book-level filenames when
|
A tibble of raw turn-level responses, or a named list of tibbles
(one per book/intervention collection). Each row includes treatment,
sim, identity, turn_index, turn_type, and one column per field
returned by response_type, plus stored prompt previews and metadata
columns.
## Not run: simulate_treatment( intervention_text = "A short passage about people working together.", prompt = c( "Read the following text:\n\n{intervention_text}\n\nRate its readability from 0 to 100." ), response_type = ellmer::type_object( score = ellmer::type_number() ), n_simulations = 2, temperature = 0, seed = 42 ) simulate_treatment( groups = c("South African", "Danish"), context_text = "You are simulating an adult who identifies as {identity}.", prompt = c( climate_belief = paste( "Generally speaking, do you usually think of yourself as Danish or South African?", "On a scale from 0 to 100, how accurate do you think this statement is?", "Statement: Human activities are causing climate change" ) ), response_type = ellmer::type_object( rating = ellmer::type_number() ), n_simulations = 2, temperature = 0, seed = 42 ) ## End(Not run)
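To make the relationship between prompt, groups, and n_simulations concrete, the following base-R sketch expands a hypothetical job grid: one row per turn, simulation, and group identity. It does not call any model and is not the package's internal code. The `fill_template()` helper, the second prompt turn, and the rule `seed + sim - 1` are assumptions made for illustration; the documentation only states that the seed is incremented for each simulation index.

```r
# Hypothetical inputs mirroring the documented arguments.
prompt <- c(
  turn_1 = "Read the following text:\n\n{intervention_text}\n\nRate its readability from 0 to 100.",
  turn_2 = "Now rate its emotional tone from 0 to 100."
)
groups <- c("South African", "Danish")
context_text <- "You are simulating an adult who identifies as {identity}."
intervention_text <- "A short passage about people working together."
n_simulations <- 2
seed <- 42

# Minimal placeholder substitution; the package's templating engine may differ.
fill_template <- function(template, values) {
  for (name in names(values)) {
    template <- gsub(paste0("{", name, "}"), values[[name]], template, fixed = TRUE)
  }
  template
}

# Groups repeat the whole prompt sequence; they do not add turns.
jobs <- expand.grid(
  turn_index = seq_along(prompt),
  sim = seq_len(n_simulations),
  identity = groups,
  stringsAsFactors = FALSE
)

# Assumed seeding rule: one seed per simulation index.
jobs$seed <- seed + jobs$sim - 1

# The context is prepended to every turn for the corresponding group.
jobs$message <- mapply(function(turn, identity) {
  paste(
    fill_template(context_text, list(identity = identity)),
    fill_template(prompt[[turn]], list(intervention_text = intervention_text)),
    sep = "\n\n"
  )
}, jobs$turn_index, jobs$identity, USE.NAMES = FALSE)

jobs[, c("identity", "sim", "turn_index", "seed")]
```

With two turns, two simulations, and two groups, the grid has eight rows, which matches the description that supplying groups multiplies the number of runs without changing the number of turns per run.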
Aggregates simulation results by chapter (and by book, if present), computing the number of simulations, means, and SDs for core model outputs. In the current schema, this includes ingroup/outgroup pre-post ratings plus delta and gap metrics (for example delta_outgroup, delta_ingroup, and delta_gap).
summarize_chapter_scores( x, aggregate_level = c("chapter", "book"), book_chapter_strategy = c("all", "last"), standardize = c("none", "z", "minmax", "max"), model_aggregation = c("none", "mean"), by_party = FALSE )
x |
A data frame or list-like object containing simulation rows as produced by run_ai_on_chapters(). Expected columns include chapter, pre/post ingroup-outgroup fields, and the derived difference columns used in summaries (pre_gap, post_gap, delta_outgroup, delta_ingroup, delta_gap). If book and party are present, the summary will include those groupings. |
aggregate_level |
Character. One of "chapter" (default) or "book". When "book", results are aggregated to the book level. |
book_chapter_strategy |
Character. One of |
standardize |
Character. How to standardize metric columns before
summarizing. |
model_aggregation |
Character. |
by_party |
Logical. If TRUE, summaries are computed separately by party (if present). |
A tibble summarizing each chapter (and book if present). The returned object will have the original model attribute copied to it.
chapter_summary <- summarize_chapter_scores(toy_sim_results) chapter_summary book_summary <- summarize_chapter_scores( toy_sim_results, aggregate_level = "book" ) book_summary party_summary <- summarize_chapter_scores( toy_sim_results, by_party = TRUE ) head(party_summary)
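For readers who want to see what a chapter-level summary of this kind involves, here is a base-R sketch on made-up data. It is not the package's implementation. The rating column names (`pre_ingroup`, `post_ingroup`, `pre_outgroup`, `post_outgroup`) are assumptions, and `delta_gap` is assumed to be the post gap minus the pre gap, with each gap defined as ingroup minus outgroup.

```r
# Made-up simulation rows: three simulations for each of two chapters.
sims <- data.frame(
  chapter       = rep(c("ch1", "ch2"), each = 3),
  sim           = rep(1:3, times = 2),
  pre_ingroup   = c(80, 82, 78, 75, 77, 76),
  post_ingroup  = c(78, 80, 79, 74, 75, 76),
  pre_outgroup  = c(30, 28, 32, 40, 38, 42),
  post_outgroup = c(36, 33, 35, 41, 40, 42)
)

# Derived metrics (assumed definitions, see the lead-in).
sims$pre_gap   <- sims$pre_ingroup - sims$pre_outgroup
sims$post_gap  <- sims$post_ingroup - sims$post_outgroup
sims$delta_gap <- sims$post_gap - sims$pre_gap

# One summary row per chapter: simulation count, mean, and SD of delta_gap.
chapter_summary_sketch <- do.call(rbind, lapply(split(sims, sims$chapter), function(d) {
  data.frame(
    chapter        = d$chapter[1],
    n_sim          = nrow(d),
    mean_delta_gap = mean(d$delta_gap),
    sd_delta_gap   = stats::sd(d$delta_gap)
  )
}))
rownames(chapter_summary_sketch) <- NULL
chapter_summary_sketch
```

A negative mean_delta_gap in this sketch means the ingroup-outgroup gap shrank after reading, under the assumed sign convention.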
This helper turns raw simulation output into a tally table showing which
reported party was returned for each requested identity. It is especially
useful for first-turn checks from run_ai_on_chapters_one_turn(), where the
main question is whether the model actually accepts the assigned identity.
summarize_identity_adherence( x, by = c("model", "book", "chapter", "identity"), compact = FALSE, expected_col = "identity", observed_col = "party" )
x |
A data frame or list-like object containing raw rows from
|
by |
Character vector of columns to group by before tallying.
Defaults to |
compact |
Logical. If |
expected_col |
Character scalar naming the requested identity column.
Defaults to |
observed_col |
Character scalar naming the model-reported identity
column. Defaults to |
Because raw chapter outputs can contain repeated rows per simulation (for example one row per target group, or one row per turn), this function first reduces the input to one identity-assignment row per simulated unit.
A tibble with one row per grouping combination and reported identity,
including counts (n), totals within group (total_n), proportions
(prop), and a logical matches_requested.
x <- tibble::tibble( chapter = c("chapter_1", "chapter_1", "chapter_1", "chapter_1"), sim = c(1, 1, 2, 2), identity = c("Democrat", "Democrat", "Democrat", "Democrat"), party = c("Democrat", "Democrat", "Republican", "Republican"), target_group = c("Democrat", "Republican", "Democrat", "Republican"), rating = c(70, 40, 68, 35) ) summarize_identity_adherence(x)
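The two steps described above (reduce to one identity-assignment row per simulated unit, then tally) can be approximated in base R as shown below. This is a sketch under assumptions, not the function's actual code: it treats the combination of chapter and sim as the simulated unit and reuses the example data as a plain data frame.

```r
# Same rows as the example above, as a base data frame.
x <- data.frame(
  chapter      = rep("chapter_1", 4),
  sim          = c(1, 1, 2, 2),
  identity     = rep("Democrat", 4),
  party        = c("Democrat", "Democrat", "Republican", "Republican"),
  target_group = c("Democrat", "Republican", "Democrat", "Republican"),
  rating       = c(70, 40, 68, 35)
)

# Step 1: drop the per-target-group repetition, keeping one row per unit.
units <- unique(x[, c("chapter", "sim", "identity", "party")])

# Step 2: count how often each reported party was returned per requested identity.
tally <- aggregate(n ~ chapter + identity + party,
                   data = transform(units, n = 1L), FUN = sum)
tally$total_n <- ave(tally$n, tally$chapter, tally$identity, FUN = sum)
tally$prop <- tally$n / tally$total_n
tally$matches_requested <- tally$party == tally$identity
tally
```

Without the first step, the four raw rows would be counted as four units and the tallies would be inflated by the number of target groups per simulation.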
This helper converts raw identity-adoption output into one row per grouping combination, reporting how often the model's reported identity matched the requested one.
summarize_identity_match_rates( x, by = c("model", "identity"), compact = FALSE, expected_col = "identity", observed_col = "party" )
x |
A data frame or list-like object from a simulation workflow
containing |
by |
Character vector of columns to group by. Defaults to
|
compact |
Logical. If |
expected_col |
Character scalar naming the requested identity column. |
observed_col |
Character scalar naming the model-reported identity column. |
A tibble with counts and match rates, including n_requested,
n_match, adoption_rate, n_mismatch, and mismatch_rate.
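The returned columns can be understood through a small base-R sketch on made-up data. It is illustrative only and not the package's implementation; in particular, it assumes adoption_rate is n_match divided by n_requested and mismatch_rate is its complement.

```r
# Made-up unit-level rows: one row per simulated unit.
units <- data.frame(
  model    = "model_a",
  identity = c("Democrat", "Democrat", "Democrat", "Republican", "Republican"),
  party    = c("Democrat", "Democrat", "Republican", "Republican", "Republican")
)
units$match <- units$party == units$identity

# One row per model-by-identity combination.
rates <- do.call(rbind, lapply(
  split(units, units[c("model", "identity")], drop = TRUE),
  function(d) {
    data.frame(
      model       = d$model[1],
      identity    = d$identity[1],
      n_requested = nrow(d),
      n_match     = sum(d$match)
    )
  }
))
rownames(rates) <- NULL

# Assumed definitions of the rate columns.
rates$adoption_rate <- rates$n_match / rates$n_requested
rates$n_mismatch    <- rates$n_requested - rates$n_match
rates$mismatch_rate <- rates$n_mismatch / rates$n_requested
rates
```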
Condenses the output of model_pairwise_cor() into one row per correlation
method and subgroup. The summary includes the average pairwise correlation
and the "most aligned" model, defined as the model with the highest average
correlation with all other models. This is useful for adding a small
headline annotation to correlation-matrix slides.
summarize_model_correlations(data, method = NULL, digits = 2)
data |
Output of |
method |
Optional character. If supplied, keep only one correlation
method, e.g. |
digits |
Integer. Number of decimal places used in the display |
A tibble with any subgroup columns from data, plus method,
mean_correlation, median_correlation, min_correlation,
max_correlation, n_pairs, most_aligned_model,
most_aligned_correlation, and label.
## Not run: pw <- model_pairwise_cor(agg, outcome = "mean_rating", unit_by = c("book_id", "chapter_id", "group")) summarize_model_correlations(pw, method = "pearson") ## End(Not run)
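The "most aligned" definition given above (the model with the highest average correlation with all other models) can be illustrated with a base-R sketch. The pairwise table, its column names (`model_1`, `model_2`, `correlation`), and the label format are all assumptions made for this example; the actual output of model_pairwise_cor() may be structured differently.

```r
# Made-up pairwise correlations among three models.
pw_sketch <- data.frame(
  model_1     = c("a", "a", "b"),
  model_2     = c("b", "c", "c"),
  correlation = c(0.9, 0.5, 0.7)
)
digits <- 2

# Each pair contributes its correlation to both models involved.
long <- rbind(
  data.frame(model = pw_sketch$model_1, correlation = pw_sketch$correlation),
  data.frame(model = pw_sketch$model_2, correlation = pw_sketch$correlation)
)
per_model <- tapply(long$correlation, long$model, mean)

cor_summary <- data.frame(
  mean_correlation         = mean(pw_sketch$correlation),
  median_correlation       = stats::median(pw_sketch$correlation),
  min_correlation          = min(pw_sketch$correlation),
  max_correlation          = max(pw_sketch$correlation),
  n_pairs                  = nrow(pw_sketch),
  most_aligned_model       = names(which.max(per_model)),
  most_aligned_correlation = max(per_model)
)

# Hypothetical label format for a slide annotation.
cor_summary$label <- sprintf(
  "Mean r = %s; most aligned: %s (%s)",
  format(round(cor_summary$mean_correlation, digits), nsmall = digits),
  cor_summary$most_aligned_model,
  format(round(cor_summary$most_aligned_correlation, digits), nsmall = digits)
)
cor_summary
```

Here model "b" is the most aligned because its two pairwise correlations (0.9 and 0.7) have the highest mean among the three models.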
This helper provides a compact bird's-eye view of where repeated simulation
runs vary across chapters, books, parties, or models. It reuses the
simulation-level SD columns from summarize_chapter_scores() and reports how
often each metric showed non-zero variation within the requested grouping.
summarize_simulation_stability(x, by = "party", metrics = NULL, tol = 0)
x |
A data frame or list-like object. This can be raw simulation metrics
(for example from |
by |
Character vector of columns used for the compact summary. Defaults
to |
metrics |
Optional character vector of metric names to inspect without
the |
tol |
Numeric tolerance for treating an SD as zero. Defaults to |
If x is already output from summarize_chapter_scores(), the function uses
it directly. Otherwise, it first computes chapter-level summaries with
by_party = TRUE, because party-specific stability is the most common
diagnostic use case.
Groups with only one simulation row have NA SD values from
stats::sd(). Those groups are treated as not testable for variability and
are excluded from the variation counts.
A tibble with one row per grouping combination, including the number
of assessed units (n_units), the proportion of units showing any
pre-period variation, the proportion showing any post-period variation,
and an overall all_stable flag.
stability <- summarize_simulation_stability(toy_sim_results) stability summarize_simulation_stability( toy_sim_results, by = c("model", "party") )
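The counting rule described in this section (an SD above tol counts as variation, and NA SDs are excluded as not testable) can be sketched in base R. The data and the column name `sd_pre_gap` are made up for illustration, and only a single metric is shown; this is not the function's actual code.

```r
# Made-up chapter-level SDs for one metric; NA marks a unit with a single simulation.
chapter_sd <- data.frame(
  party      = c("Democrat", "Democrat", "Democrat", "Republican", "Republican"),
  sd_pre_gap = c(0, 1.2, NA, 0, 0)
)
tol <- 0

stability_sketch <- do.call(rbind, lapply(split(chapter_sd, chapter_sd$party), function(d) {
  testable <- !is.na(d$sd_pre_gap)        # NA SDs cannot be assessed
  sds <- d$sd_pre_gap[testable]
  data.frame(
    party            = d$party[1],
    n_units          = length(sds),        # assessed units only
    prop_pre_varying = mean(sds > tol),    # share of units with non-zero SD
    all_stable       = all(sds <= tol)
  )
}))
rownames(stability_sketch) <- NULL
stability_sketch
```

Raising tol above zero would treat very small SDs as stable, which can be useful when ratings differ only by rounding noise.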
Aggregates lower-level rows to a chosen unit level, ranks units within each model, and summarizes which units most consistently appear near the top across models. This is useful for questions such as "Which books consistently have the strongest effects across models?"
summarize_top_units( data, outcome = "mean_outcome", item_by = "book_id", rank_within = NULL, model_col = "model", top_n = 3, higher_is_better = TRUE, standardize = c("z", "none", "minmax", "max"), include_ranks = FALSE, drop_missing = TRUE )
data |
A data frame with one row per model-by-unit combination. |
outcome |
Character string naming the score column (default
|
item_by |
Character vector identifying the items to rank, e.g. |
rank_within |
Optional character vector defining separate ranking
contexts, e.g. |
model_col |
Character string naming the model column (default
|
top_n |
Integer. Number of top-ranked items to count for each model. |
higher_is_better |
Logical. If |
standardize |
Character. How to standardize item scores within each
model before computing cross-model mean scores. |
include_ranks |
Logical. If |
drop_missing |
Logical. Whether to drop rows with missing model, item,
or ranking-context identifiers before aggregating (default |
A tibble, or a list with summary and ranks when
include_ranks = TRUE.
The summary table contains:
rank_within columns: Optional grouping columns used to define separate ranking contexts, such as party.
item_by columns: The ranked item identifiers, such as book.
mean_score: Mean outcome score for the item across models.
score_scale: The score standardization method used for mean_score.
mean_rank: Average rank of the item across models. Lower values indicate more consistently high-ranked items when higher_is_better = TRUE.
overall_mean_rank: When rank_within is supplied, the item's average rank computed without those ranking contexts. This preserves a common item order for subgroup displays.
median_rank: Median rank of the item across models.
top_n_models: Number of models that ranked the item within the top top_n items in its ranking context. For example, if top_n = 3 and top_n_models = 4, then 4 models placed that item in their top 3.
n_models: Number of models with non-missing ranks for the item.
top_n: The top-N threshold used to compute top_n_models.
top_n_label: Compact display label combining top_n_models and n_models, such as "4/5".
When include_ranks = TRUE, the ranks table contains one row per
model-by-item combination, including score, rank, and top_n.
## Not run: summarize_top_units( agg, outcome = "mean_delta_gap", item_by = "book", rank_within = "party", model_col = "model", top_n = 3 ) ## End(Not run)
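The ranking columns described above can be reproduced on a toy table with base R. This sketch is not the package's implementation: it skips score standardization and ranking contexts, assumes higher scores are better, and uses made-up scores for four books and three models.

```r
# Made-up scores: one row per model-by-book combination.
scores <- data.frame(
  model = rep(c("m1", "m2", "m3"), each = 4),
  book  = rep(c("A", "B", "C", "D"), times = 3),
  score = c(4, 3, 2, 1,   3, 4, 1, 2,   4, 2, 3, 1)
)
top_n <- 2

# Rank books within each model; rank 1 is the highest score.
scores$rank <- ave(-scores$score, scores$model, FUN = rank)
scores$in_top <- scores$rank <= top_n

# Summarize each book across models.
top_units <- data.frame(
  book         = sort(unique(scores$book)),
  mean_rank    = as.numeric(tapply(scores$rank, scores$book, mean)),
  median_rank  = as.numeric(tapply(scores$rank, scores$book, stats::median)),
  top_n_models = as.integer(tapply(scores$in_top, scores$book, sum)),
  n_models     = as.integer(tapply(scores$rank, scores$book, length))
)
top_units$top_n_label <- paste0(top_units$top_n_models, "/", top_units$n_models)

# Most consistently high-ranked books first.
top_units <- top_units[order(top_units$mean_rank), ]
top_units
```

Book "A" is in the top 2 for all three models, so its label reads "3/3", while book "D" never reaches the top 2 and is labeled "0/3".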
Aggregates raw outputs from simulate_treatment() by treatment,
computing counts plus means and SDs for numeric response fields. This is
useful for structured outputs such as readability, sentiment, or any other
custom numeric scores returned by the model.
summarize_treatment_results( x, aggregate_level = c("treatment", "book"), by_identity = FALSE, by_turn = TRUE, fields = NULL )
x |
A data frame or list-like object containing raw rows as produced by
|
aggregate_level |
Character. One of |
by_identity |
Logical. If |
by_turn |
Logical. If |
fields |
Optional character vector of numeric columns to summarize.
Defaults to all numeric columns except common bookkeeping fields such as
|
A tibble summarizing the requested aggregation level. Numeric fields
are returned as mean_* and sd_* columns, alongside a sim count.
Model metadata attributes are copied to the result.
readability <- tibble::tibble( treatment = c("treatment_1", "treatment_1", "treatment_2"), sim = c(1, 2, 1), turn_type = "turn_1", readability_score = c(6, 8, 7), readability_confidence = c(4, 5, 4) ) summarize_treatment_results(readability)
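As a rough guide to the shape of the result, the following base-R sketch computes mean_* and sd_* columns per treatment and turn for the example data. It is not the package's code; in particular, it assumes the sim count is the number of distinct simulation indices in each group.

```r
# Same rows as the example above, as a base data frame.
readability <- data.frame(
  treatment              = c("treatment_1", "treatment_1", "treatment_2"),
  sim                    = c(1, 2, 1),
  turn_type              = "turn_1",
  readability_score      = c(6, 8, 7),
  readability_confidence = c(4, 5, 4)
)
fields <- c("readability_score", "readability_confidence")

treatment_summary <- do.call(rbind, lapply(
  split(readability, readability[c("treatment", "turn_type")], drop = TRUE),
  function(d) {
    out <- data.frame(
      treatment = d$treatment[1],
      turn_type = d$turn_type[1],
      sim       = length(unique(d$sim))   # assumed meaning of the sim count
    )
    for (f in fields) {
      out[[paste0("mean_", f)]] <- mean(d[[f]])
      out[[paste0("sd_", f)]]   <- stats::sd(d[[f]])  # NA when only one simulation
    }
    out
  }
))
rownames(treatment_summary) <- NULL
treatment_summary
```

The second treatment has a single simulation, so its SD columns are NA, which is a useful reminder to run at least two simulations per treatment when variability matters.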
A synthetic turn-level dataset that mimics the raw output returned by
run_ai_on_chapters() in per-group mode. It is intended for examples,
documentation, and testing the full workflow from raw turns to derived
metrics with compute_run_ai_metrics().
toy_run_ai_turns
A tibble with 128 rows and 11 variables:
Book title.
Chapter identifier.
Simulation index.
Simulated respondent identity.
Political party grouping.
Conversation turn index.
Whether the row comes from the baseline or post turn.
Group being rated on that row.
Synthetic rating on a 0-100 scale.
Stored baseline prompt preview.
Stored post prompt preview.
Synthetic example created for package documentation.
A small synthetic dataset that mimics the row-level structure returned by
run_ai_on_chapters(). It is intended for examples, documentation, and
testing plotting and summarization workflows without running live AI
simulations. The object also includes model and temperature attributes
so plotting helpers can display realistic metadata in subtitles.
toy_sim_results
A tibble with 32 rows and 15 variables:
Book title.
Chapter identifier.
Simulation index.
Simulated respondent identity.
Political party grouping.
Pre-reading ingroup rating.
Post-reading ingroup rating.
Pre-reading outgroup rating.
Post-reading outgroup rating.
Pre-reading ingroup minus outgroup gap.
Post-reading ingroup minus outgroup gap.
Change in ingroup rating.
Change in outgroup rating.
Change in affective polarization gap.
Short synthetic chapter excerpt.
Synthetic example created for package documentation.