LLM Distributions Letter: Results

The code for loading each of the four data files is commented out. Combining and mutating them is a long operation, so I saved the combined result and load it here.

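library(magrittr)
library(tidyverse)  # dplyr, tidyr, purrr, forcats, ggplot2
library(gt)
load("Complete.Data.01.2026.RData")  # provides Complete.Data.01.2026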

Fresh.Outcome

I have recreated the outcomes below in a variable called Fresh.Outcome, which holds the raw text of each model response. We should be careful about how these are recoded, both to be transparent about what the recoding does and for our own clarity in reporting. The table below counts the raw responses by model.

# Extract the text of each of the 196,000 responses from the nested result
Complete.Data.01.2026$Fresh.Outcome <- seq_len(196000) %>% map_chr(function(x) {Complete.Data.01.2026$response$body$choices[[x]]$message$content})
Complete.Data.01.2026$model <- Complete.Data.01.2026$body$model
# modelF fixes the model order used for faceting
Complete.Data.01.2026 <- Complete.Data.01.2026 |> mutate(modelF = factor(model, levels = c('gpt-4-turbo-2024-04-09', 'gpt-4-0613', 'gpt-4o-2024-08-06', 'gpt-5.2-2025-12-11')))
# Raw responses by model
Complete.Data.01.2026 |> group_by(Fresh.Outcome, model) |> summarise(Count = n()) |> ungroup() |> pivot_wider(names_from = model, values_from = Count) |> gt()
Fresh.Outcome gpt-5.2-2025-12-11 gpt-4-0613 gpt-4-turbo-2024-04-09 gpt-4o-2024-08-06
(blank) 15 NA NA NA
Arcsine 36 NA NA NA
Bernoulli 744 1071 690 835
Beta 4165 14 8885 2141
Betabinomial 33 NA NA NA
Bimodal NA 49 NA NA
Binomial 15525 84 1969 1028
Categorical 86 NA NA NA
Cauchy 35 NA 6 NA
Chisquare 9 NA NA NA
Degenerate 71 71 71 71
ExGaussian 3 NA NA NA
Exponential 196 156 930 5266
Gamma 2714 NA 517 17
Geometric 583 NA NA NA
Gumbel 76 NA NA NA
Laplace 3 NA 454 NA
Log-Normal NA NA NA 3
Log-normal NA NA NA 938
Logistic NA NA 7 NA
Lognormal 8127 3979 7974 468
Mixture 1 NA 1 NA
Multinomial 24 NA NA NA
NegBinomial 5 NA NA NA
Negative Binomial NA 5 NA NA
NegativeBinomial 224 4 NA NA
Negativebinomial 197 NA NA NA
Negbinomial 1 NA NA NA
Normal 5201 29210 3230 12090
Pareto 83 NA 126 142
Pascal 77 NA NA NA
Poisson 8553 12981 22764 17042
Skew-normal NA NA 10 NA
Skew-right NA 3 NA NA
SkewNormal NA NA 160 NA
Skewed NA NA NA 1
Skewness NA 245 5 NA
Skewnormal 15 NA 3 NA
Student 29 NA NA NA
Studentt 1 NA NA NA
Triangular 16 NA 1 NA
Uniform 1250 1128 1141 8557
Weibull 78 NA 56 400
ZIP 1 NA NA NA
Zero-inflated NA NA NA 1
ZeroInflatedPoisson 4 NA NA NA
ZeroinflatedPoisson 8 NA NA NA
Zipf 2 NA NA NA
arcsine 1 NA NA NA
beta 4 NA NA NA
binomial 1 NA NA NA
categorical 3 NA NA NA
exgaussian 1 NA NA NA
exponential 3 NA NA NA
gamma 33 NA NA NA
geometric 3 NA NA NA
lognormal 723 NA NA NA
negativebinomial 8 NA NA NA
normal 11 NA NA NA
skewnormal 3 NA NA NA
t 13 NA NA NA
uniform 2 NA NA NA
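
Many of the rows above differ only in capitalization or punctuation (for example Lognormal, lognormal, Log-Normal, and Log-normal). Here is a minimal sketch of one way to surface such groups before writing the recode by hand. The vector `labels` is a hypothetical stand-in for the distinct values of Fresh.Outcome, and the normalization key (lowercase, letters and digits only) is my assumption rather than anything the analysis itself does. It will not catch abbreviations such as NegBinomial or ZIP, so those still need to be read off the table.

```r
# Hypothetical stand-in for unique(Complete.Data.01.2026$Fresh.Outcome)
labels <- c("Lognormal", "lognormal", "Log-Normal", "Log-normal",
            "Poisson", "SkewNormal", "Skew-normal", "skewnormal", "Normal")

# Normalization key: lowercase, then drop everything except letters and digits
key <- gsub("[^a-z0-9]", "", tolower(labels))

# Group the raw labels by key and keep only keys with more than one spelling
groups <- split(labels, key)
variants <- groups[lengths(groups) > 1]
print(variants)
```

Each element of `variants` lists the spellings that collapse to the same key, which gives a checklist for the recode.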

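I recode Fresh.Outcome into the Outcome variable as follows, collapsing variants in capitalization, punctuation, and abbreviation. Labels that are not listed pass through unchanged.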

Complete.Data.01.2026 <- Complete.Data.01.2026 |> 
  mutate(Outcome = case_match(Fresh.Outcome, 
                              c("arcsine","Arcsine") ~ "Arcsine",
                              c("categorical","Categorical") ~ "Categorical",
                              c("exponential","Exponential") ~ "Exponential",
                              c("beta","Beta") ~ "Beta",
                              c("Chisquare","Chi-Square") ~ "Chi-square",
                              c("exgaussian","ExGaussian") ~ "ExGaussian",
                              c("Normal","normal") ~ "Normal",
                              c("binomial","Binomial") ~ "Binomial",
                              c("geometric","Geometric") ~ "Geometric",
                              c("gamma","Gamma") ~ "Gamma",
                              c("Lognormal","lognormal","Log-Normal","Log-normal") ~ "Lognormal",
                              c("NegativeBinomial","Negative Binomial","NegBinomial","Negbinomial","negativebinomial","Negativebinomial") ~ "Negative Binomial",
                              c("Skewed","Skewnormal","SkewNormal","Skew-normal","Skew-right","Skewness","skewnormal")~ "Skewed",
                              c("Student","Studentt","t")~ "Student's t",
                              c("Uniform","uniform")~ "Uniform",
                              c("ZeroInflatedPoisson","ZeroinflatedPoisson","Zero-inflated","ZIP")~ "ZIP",
                              .default = Fresh.Outcome))
Complete.Data.01.2026 |> group_by(Outcome, model) |> summarise(Count = n()) |>  ungroup() |> pivot_wider(names_from = model, values_from = Count) |> gt()
Outcome gpt-5.2-2025-12-11 gpt-4-0613 gpt-4-turbo-2024-04-09 gpt-4o-2024-08-06
(blank) 15 NA NA NA
Arcsine 37 NA NA NA
Bernoulli 744 1071 690 835
Beta 4169 14 8885 2141
Betabinomial 33 NA NA NA
Bimodal NA 49 NA NA
Binomial 15526 84 1969 1028
Categorical 89 NA NA NA
Cauchy 35 NA 6 NA
Chi-square 9 NA NA NA
Degenerate 71 71 71 71
ExGaussian 4 NA NA NA
Exponential 199 156 930 5266
Gamma 2747 NA 517 17
Geometric 586 NA NA NA
Gumbel 76 NA NA NA
Laplace 3 NA 454 NA
Logistic NA NA 7 NA
Lognormal 8850 3979 7974 1409
Mixture 1 NA 1 NA
Multinomial 24 NA NA NA
Negative Binomial 435 9 NA NA
Normal 5212 29210 3230 12090
Pareto 83 NA 126 142
Pascal 77 NA NA NA
Poisson 8553 12981 22764 17042
Skewed 18 248 178 1
Student's t 43 NA NA NA
Triangular 16 NA 1 NA
Uniform 1252 1128 1141 8557
Weibull 78 NA 56 400
ZIP 13 NA NA 1
Zipf 2 NA NA NA
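
To keep the recode transparent, it may help to report which raw labels feed each recoded label. The sketch below does that on hypothetical toy data (`toy` and its three recode rules are illustrative only, not the full mapping above): it counts rows by recoded and raw label together, then checks that no rows were gained or lost.

```r
library(dplyr)

# Hypothetical toy data standing in for Complete.Data.01.2026
toy <- data.frame(Fresh.Outcome = c("Normal", "normal", "Log-normal", "Lognormal",
                                    "t", "Poisson", "normal"))

# A shortened, illustrative version of the recode
toy <- toy |>
  mutate(Outcome = case_match(Fresh.Outcome,
                              c("Normal", "normal") ~ "Normal",
                              c("Lognormal", "Log-normal") ~ "Lognormal",
                              "t" ~ "Student's t",
                              .default = Fresh.Outcome))

# One row per (recoded, raw) pair, with the number of responses it covers
audit <- toy |> count(Outcome, Fresh.Outcome, name = "Count")
print(audit)

# The recode should never change the number of rows
stopifnot(sum(audit$Count) == nrow(toy))
```

Run against the full data, a table like `audit` could go in an appendix so readers can see exactly what each reported category contains.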

The discrete tag

Complete.Data.01.2026 <- Complete.Data.01.2026 |> mutate(Type = case_match(name, c("Binomial","Geometric","Poisson") ~ "Discrete", .default = "not Discrete"))
Complete.Data.01.2026 <- Complete.Data.01.2026 |> mutate(Correct = as.character(name==Outcome)) |> mutate(Correct = case_match(Correct, "TRUE" ~ "Correct", .default = "Incorrect"))
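
The Correct flag compares the known distribution (name) with the recoded response (Outcome), and anything that is not an exact match, including a missing comparison, is labeled Incorrect. A minimal sketch of an equivalent one-step version on hypothetical toy data, using `if_else()` with its `missing` argument so that NA comparisons are still counted as Incorrect:

```r
library(dplyr)

# Hypothetical toy data: known distribution and recoded model response
toy <- data.frame(name    = c("Normal", "Poisson", "Beta", "Gamma"),
                  Outcome = c("Normal", "Normal", "Beta", NA))

# Exact match is Correct; mismatches and missing comparisons are Incorrect
toy <- toy |>
  mutate(Correct = if_else(name == Outcome, "Correct", "Incorrect",
                           missing = "Incorrect"))
print(toy)
```

Because the match is exact, the result depends entirely on Outcome using the same spelling as name, which is one more reason to keep the recode above carefully documented.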

Basic Percent Correctly Predicted

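# Total responses per model and known distribution
MNTotals <- Complete.Data.01.2026 |> group_by(model, name) |> summarise(Total = n()) |> ungroup()
# Responses per model and known distribution, split by correctness
PCP <- Complete.Data.01.2026 |> group_by(model, name, Correct) |> summarise(count = n()) |> ungroup()
# Percent correctly predicted: correct count over total
PCPTab <- left_join(PCP, MNTotals, by = c("model", "name")) |> filter(Correct=="Correct") |> mutate(PCP = count / Total) |> select(model, name, PCP)
PCPTab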
# A tibble: 29 × 3
   model                  name          PCP
   <chr>                  <chr>       <dbl>
 1 gpt-4-0613             Beta      0.0028 
 2 gpt-4-0613             Binomial  0.00394
 3 gpt-4-0613             Lognormal 0.880  
 4 gpt-4-0613             Normal    1      
 5 gpt-4-0613             Poisson   0.554  
 6 gpt-4-0613             Uniform   0.0002 
 7 gpt-4-turbo-2024-04-09 Beta      0.972  
 8 gpt-4-turbo-2024-04-09 Binomial  0.104  
 9 gpt-4-turbo-2024-04-09 Gamma     0.044  
10 gpt-4-turbo-2024-04-09 Lognormal 0.857  
# ℹ 19 more rows
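
Because PCPTab keeps only the rows where Correct is "Correct", any model and known distribution pair with no correct responses drops out of the table instead of appearing with a PCP of 0, which may be why it has 29 rows. A minimal sketch of an alternative on hypothetical toy data: taking the mean of the logical comparison within each group gives the same proportion and keeps the zero rows.

```r
library(dplyr)

# Hypothetical toy data: m1 never gets Beta right
toy <- data.frame(
  model   = c("m1", "m1", "m1", "m1", "m2", "m2"),
  name    = c("Normal", "Normal", "Beta", "Beta", "Normal", "Normal"),
  Correct = c("Correct", "Incorrect", "Incorrect", "Incorrect", "Correct", "Correct")
)

# The mean of a logical vector is the share of TRUE values
pcp <- toy |>
  group_by(model, name) |>
  summarise(PCP = mean(Correct == "Correct"), .groups = "drop")
print(pcp)
```

In this toy example the m1 and Beta pair is retained with a PCP of 0 rather than being filtered away.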

The Plot

# Count each Outcome within model and known distribution; PCP is its share of that group's total.
# modelF is carried through the grouping so it is available for faceting.
Tab.All <- Complete.Data.01.2026 |> group_by(model, modelF, name, Outcome) |> summarise(Count = n()) |> ungroup() |> group_by(model,name) |> mutate(Total = sum(Count)) |> ungroup() |> mutate(PCP = round(Count / Total, digits=3))
# Assumption: Dist.Match flags cells where the response equals the known distribution (the outlined cells)
Tab.All <- Tab.All |> mutate(Dist.Match = name == Outcome)
# Two-digit labels for the tiles; values that round to 0 are shown as "<0.01"
Tab.All <- Tab.All |> mutate(PCPa = round(PCP, digits=2))
Tab.All$PCPa1 <- as.character(Tab.All$PCPa)
Tab.All <- Tab.All |> mutate(PCPa1 = case_when(PCPa1 == "0" ~ "<0.01", .default = PCPa1))
# my_greys is not defined in this section; it must be a vector of fill colours created beforehand
FF3 <- ggplot(Tab.All) + 
  aes(y=fct_rev(Outcome), x=name, fill=PCP) + 
  geom_tile(aes(color=Dist.Match), lwd=0.2, width=0.95, height=0.95) + 
  scale_color_manual(values = c("white","black")) + 
  guides(color="none") +
  scale_fill_gradientn(colors = my_greys) + 
  geom_text(aes(label=PCPa1), size=2, color="black") +
  labs(x="Known Distribution", y="Model Outcome",
       caption="Values are row proportions \n Outlined cells isolate row/column matches", title="Known Distributions and Model Responses", fill="Row. Pct.") + 
  theme_minimal() + 
  theme(legend.location = "plot", 
        panel.spacing = unit(0.5, "lines"), 
        # plot.label is not a standard ggplot2 theme element, so this line likely has no effect
        plot.label= element_text(size=5), 
        plot.title=element_text(size=10, face = "bold", hjust=0),
        plot.title.position = "plot", 
        axis.text.x = element_text(size=8, angle=30), 
        axis.text.y = element_text(size=5), 
        legend.direction = "horizontal",
        legend.text = element_text(size = 4), 
        legend.title = element_text(size = 6),
        legend.key.width = unit(0.1, "in"),
        legend.position = c(0.0, -0.1)) +
  facet_wrap(vars(modelF), scales="free_y")
ggsave(filename="FinalFigureFreeY-v2.jpeg", plot=FF3, width=6.5, height=6.5, dpi=900, units="in")

Free Y

References

knitr::write_bib(names(sessionInfo()$otherPkgs), file="bibliography.bib")

Bache, Stefan Milton, and Hadley Wickham. 2025. Magrittr: A Forward-Pipe Operator for R. https://magrittr.tidyverse.org.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. https://www.jstatsoft.org/v40/i03/.
Iannone, Richard, Joe Cheng, Barret Schloerke, Shannon Haughton, Ellis Hughes, Alexandra Lauer, Romain François, JooYoung Seo, Ken Brevoort, and Olivier Roy. 2025. Gt: Easily Create Presentation-Ready Display Tables. https://gt.rstudio.com.
Mock, Thomas. 2025. gtExtras: Extending Gt for Beautiful HTML Tables. https://github.com/jthomasmock/gtExtras.
Müller, Kirill, and Hadley Wickham. 2026. Tibble: Simple Data Frames. https://tibble.tidyverse.org/.
Scheinin, Ilari. 2019. Rasterpdf: Plot Raster Graphics in PDF Files. https://ilarischeinin.github.io/rasterpdf.
Spinu, Vitalie, Garrett Grolemund, and Hadley Wickham. 2024. Lubridate: Make Dealing with Dates a Little Easier. https://lubridate.tidyverse.org.
Urbanek, Simon, and Jeffrey Horner. 2025. Cairo: R Graphics Device Using Cairo Graphics Library for Creating High-Quality Bitmap (PNG, JPEG, TIFF), Vector (PDF, SVG, PostScript) and Display (X11 and Win32) Output. http://www.rforge.net/Cairo/.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2023. Tidyverse: Easily Install and Load the Tidyverse. https://tidyverse.tidyverse.org.
———. 2025a. Forcats: Tools for Working with Categorical Variables (Factors). https://forcats.tidyverse.org/.
———. 2025b. Stringr: Simple, Consistent Wrappers for Common String Operations. https://stringr.tidyverse.org.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, Dewey Dunnington, and Teun van den Brand. 2025. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://ggplot2.tidyverse.org.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. Dplyr: A Grammar of Data Manipulation. https://dplyr.tidyverse.org.
Wickham, Hadley, and Lionel Henry. 2026. Purrr: Functional Programming Tools. https://purrr.tidyverse.org/.
Wickham, Hadley, Lionel Henry, Thomas Lin Pedersen, T Jake Luciani, Matthieu Decorde, and Vaudor Lise. 2025. Svglite: An SVG Graphics Device. https://svglite.r-lib.org.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2025. Readr: Read Rectangular Text Data. https://readr.tidyverse.org.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2025. Tidyr: Tidy Messy Data. https://tidyr.tidyverse.org.