library(knitr)
library(quarto)
write_bib(names(sessionInfo()$otherPkgs), file="bibliography.bib")
The performance of widely used language models tasked with identifying the distribution responsible for generating simulated data
The Organization of the Website
Eaglesmith, Justus, Tim Johnson, and Robert W. Walker. 2025. The performance of widely used language models tasked with identifying the distribution responsible for generating simulated data.
The support document for the letter.
The directory Final-Prompt-Results contains all the results. They are organized hierarchically by model [0409-turbo, 0806, 0613, 1211] and then by distribution.
- The 0409-turbo page
- The 0806 page
- The 0613 page
- The 1211 page
In each subdirectory, there are:
- a series of .qmd files with Combiner in their names. These combine the input and output files (in JSONL format) with the original .RData files.
- a series of .RData files with Full in their names that hold the combined input, output, and RData files.
- a series of .RData files with an R random generation function in their names that store the original calls and data.
- a .R file with Combiner in its name that binds the rows of the Full data files.
- two sets of .jsonl files: files with output in their names contain the OpenAI responses, and files ending in -Call-Revised.jsonl contain the batch files sent to OpenAI.
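The row-binding step performed by the Combiner .R file can be sketched roughly as follows. This is a minimal base-R illustration under stated assumptions, not the authors' script: the temporary folder, the file names, the object name Full, and the columns id and answer are all hypothetical stand-ins.

```r
# Hypothetical setup: write two small "Full" files to a temporary folder.
# The real files, object names, and columns in the repository may differ.
tmp <- tempfile("full-demo-")
dir.create(tmp)
Full <- data.frame(id = c("a1", "a2"), answer = c("beta", "beta"))
save(Full, file = file.path(tmp, "Full-demo-1.RData"))
Full <- data.frame(id = c("b1", "b2", "b3"),
                   answer = c("gamma", "beta", "normal"))
save(Full, file = file.path(tmp, "Full-demo-2.RData"))

# Load one .RData file into a private environment and return the first
# object stored in it, so objects with the same name do not overwrite
# each other in the global workspace.
load_full <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)
  get(obj_names[1], envir = env)
}

# Find every Full file and stack the data frames row by row.
files <- list.files(tmp, pattern = "^Full.*\\.RData$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, load_full))
print(nrow(combined))
```

Loading into a separate environment matters here because each file stores its data under the same object name; loading them directly into the workspace would keep only the last one.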
Details
Taking the example of the beta distribution, there are three sets of results.
For gpt-4o-2024-08-06:
The results are combined in:
For gpt-4-0613:
The results are combined in:
For gpt-4-turbo-2024-04-09:
The results are combined in:
Other results follow an identical pattern, substituting the names of the distributions and parameters relevant to the given case. For example, Final-Prompt-Results/0409-turbo/beta-turbo/Beta-Combiner52-0409.html becomes Final-Prompt-Results/0409-turbo/poisson-turbo/Poisson-Combined-5-0409.html.
As the files make clear, the keys for merging the batch results with the batch queries are the unique IDs that we assign to each distribution/parameter/iteration combination.
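A key-based merge of this kind can be sketched in base R as below. The sketch assumes the unique IDs are stored in a column named custom_id (the field the OpenAI Batch API uses to match requests to responses); the ID format, the prompt and answer columns, and the toy values are hypothetical. In practice the .jsonl files would first be parsed into data frames, for example with jsonlite::stream_in, which is skipped here to keep the example dependency-free.

```r
# Hypothetical, hand-made stand-ins for the parsed batch files.
# Column names and the ID format are assumptions for illustration only.
calls <- data.frame(
  custom_id = c("beta-5-2-001", "beta-5-2-002", "beta-5-2-003"),
  prompt    = c("data set 1", "data set 2", "data set 3"),
  stringsAsFactors = FALSE
)
responses <- data.frame(
  custom_id = c("beta-5-2-003", "beta-5-2-001", "beta-5-2-002"),
  answer    = c("gamma", "beta", "beta"),
  stringsAsFactors = FALSE
)

# Join on the shared key. Responses may come back in a different order
# than the calls were sent, so matching by position would be unsafe.
merged <- merge(calls, responses, by = "custom_id", all.x = TRUE)

# Sanity checks: one row per call, and no call left without a response.
stopifnot(nrow(merged) == nrow(calls), !anyNA(merged$answer))
print(merged)
```

Using all.x = TRUE keeps every call in the result even if its response is missing, so the sanity check can flag unmatched IDs instead of silently dropping them.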