GSoC Diaries #8: Final Evaluation

Hey,

It was just a couple of months ago, on May 14, that I received an email asking if I would still be interested in working on Stan through the summer as part of Google Summer of Code, as I initially hadn’t received the funding. I said yes, and since then, my time has been full of learning, experimenting, and implementing. Working on bayesplot was so much fun, and I plan to keep doing it even after Google Summer of Code, but now it is time to formally close the GSoC chapter.

Project Description

Quoting my project proposal, my time during Google Summer of Code was about working on the bayesplot library under Stan:

The bayesplot package is a key tool in the Stan ecosystem for visualizing Bayesian analysis results. However, it currently lacks specialized visualizations for predictive checks of discrete and categorical outcomes, and some existing plots could be improved for better usability and accuracy. These limitations hinder researchers’ ability to thoroughly validate their models against real-world data, particularly for discrete and bounded data types. This project aims to expand and refine bayesplot by implementing new visualization techniques based on recommendations from Säilynoja et al.¹

Deliverables:
New plotting functions integrated into bayesplot (e.g., ppc_density_bounded, calibration plots).
Enhanced versions of existing functions (e.g., rootograms, residual plots).
Comprehensive documentation with examples and guides.
Unit tests for robustness and edge cases.
Blog posts tracking progress and showcasing new features.

By the end of the project, bayesplot will offer a more versatile and user-friendly toolkit for Bayesian model checking, strengthening its role in the Stan community. These improvements will empower researchers to diagnose and communicate model fit more effectively, advancing Bayesian workflows across disciplines like machine learning, statistics, and data science.

That’s pretty much what I did. I worked on bayesplot to improve users’ experience of visualising Bayesian models by adding new visualisation functions, extending existing functions with new capabilities, and making sure that the documentation is clear enough both for new users and for people who are considering contributing to bayesplot. Let’s look at what I did, item by item.

My Progress

Of course, my time at GSoC started with exploring the codebase. Even though I had used bayesplot before, there is a difference between importing the package and calling its functions, and actually understanding what’s under the hood of those functions. During the community bonding period, and also a bit during the first week of the coding period (I had my final exams during the first week of June), I tried to understand how the codebase is organised, what the general coding practices are, what kind of language is appropriate in the documentation, and more. After that initial period, I started coding and implementing. Here is the list of what I did, what I am working on at the moment, and what is left to do.

Completed

ppc_dots & ppd_dots (Issue #354 and PR #357): These newly added functions draw dot plots of the results of Bayesian models. They visualise the distribution of a variable by plotting dots at specific quantiles of the data. This was my first contribution to bayesplot, and it took me a while to get it done. I chose it because the task was fairly self-contained, and since it’s a completely new function, there was a smaller risk of breaking things. That being said, implementing a function from the ground up is a tough task for a new contributor, so it took a lot of back and forth between me and my mentors, especially Teemu and Jonah, for me to understand everything. For example, I wasn’t quite sure whether to use ggplot2’s geom_dotplot or the dots geom from ggdist under the hood of this function. Since bayesplot already imports ggplot2, geom_dotplot was a natural fit for the package. However, it was limited in what it offers. Dots often overflowed or were sized wrongly, and tweaking the arguments that geom_dotplot provides wasn’t enough to prevent these issues. The only way I could have fixed them while staying with geom_dotplot was to implement a custom geom, which would have been quite complicated. The ggdist geom, on the other hand, was simpler to use and didn’t have those issues. However, using it meant adding another dependency to the package, which I was hesitant about. After discussing it, we decided that ggdist could be added to the list of suggested packages, with a check at the beginning of ppc_dots and ppd_dots to see if ggdist is installed. This worked perfectly, and my implementation, along with the documentation and tests, got merged into the main branch.
ppc_error_binned(x) (Issue #358 and PR #359): This task grew out of Teemu’s idea that bayesplot’s current offerings for discrete data can be enhanced (issue #343). He made multiple suggestions that later led to broader discussions, but one of his points was that ppc_error_binned doesn’t currently support covariates on the x-axis. To solve that, it made sense to add an optional x argument to ppc_error_binned, working similarly to the one in ppc_intervals. That way, users can plot residuals against x, and since the argument is optional, the change doesn’t break any existing plots. This was my second task, and in all fairness, it was quite a bit simpler than adding completely new plots. After all, the update adds an optional parameter that, when supplied, is passed straight to the helper function that does the actual plotting. So, adding an if-else branch, implementing some tests, and updating the documentation sufficed.
ppc_rootogram(style = "discrete") (Issue #360 and PR #362): In their paper, Säilynoja et al. suggested that the visual styles currently offered by ppc_rootogram are not suitable for the data that the function visualises. Even though the function is designed to plot discrete data points, all of the visual elements it uses (lines, histograms, and filled areas) signal continuity in the data. Therefore, an alternative style is needed. The suggestion in the paper was a new style that uses points and point ranges, so that the discrete nature of the data is clear to the user. We had lots of discussions about the colouring of the visual items, what visual style we need to follow, and more, all of which I shared in my biweekly diary entries. However, the main challenge in this task was preserving the old styles while implementing a new one, and doing all of this neatly without turning the code into spaghetti. Considering that the old styles and the new style had almost nothing in common, it wasn’t an easy task, to be honest. After my first working implementation, I was given feedback that the function needed to be refactored and split into helper functions, since the version I presented would be hard to maintain. It was fair feedback and got me back to work. I rewrote the code, this time in a clearer manner. It finally got approved and merged into the main branch, and we hope to gather some feedback from users to enhance it further.
ppc_error_scatter_avg(x = x) (Issue #364 and PR #367): We decided that to reduce the number of available functions and have a smoother user experience, it made sense to combine ppc_error_scatter_avg and ppc_error_scatter_avg_vs_x. The simple idea behind this is that these two functions are fairly similar, and all ppc_error_scatter_avg_vs_x does is format the x variable appropriately and then call ppc_error_scatter_avg. With this change, if a vector is passed to the optional x argument of ppc_error_scatter_avg, then the average of the errors computed from y and each dataset (row) in yrep can be plotted against that x. Otherwise, the average of errors will be plotted against y. This simply meant that now, the same function is able to do two tasks that were previously done by two separate functions and at the same time maintains the ease of use that bayesplot really cares about.
ppc_stat(discrete) & ppd_stat(discrete) (Issue #330 and PR #369): One bayesplot user voiced a frustration with the test statistic functions that bayesplot provides. The issue was that ppc_stat and ppd_stat, as well as their grouped versions, plot the data with histograms. This is fine if you work with continuous data, which histograms visualise nicely. However, for discrete datasets, histograms look ugly and aren’t really appropriate. Even though it wasn’t on my task list, I decided to take this on because a request from a user seemed more important to me than the other tasks I had left. I therefore updated ppc_stat, ppd_stat, ppc_stat_grouped, and ppd_stat_grouped so that they call geom_bar instead of geom_histogram when the discrete argument is set to TRUE. Otherwise, the functions work the same as before.
Admin work – New Contributors (Issue #365 and Issue #366): I also went ahead and did some admin work! Ever since I began exploring the codebase, I had been noticing small errors here and there. There was already a label for issues that are suitable for new contributors; however, we decided that it would be even better to gather them under a master issue, where new contributors can go and pick a task. I did that, and on top of it, created another issue suitable for new contributors, highlighting small mistakes in the tests and documentation of ppc-discrete. In bayesplot, the barrier to contribution is lower than in many other R packages, but I hope this master issue will lower it even further, welcoming with open arms everyone who wants to be part of the Stan community.
Admin Work – Residual Plots (Issue #361): As I said before, Teemu had multiple ideas for improving the error/residual plots offered by bayesplot; however, implementing them would require comprehensive work that goes beyond the scope of this GSoC project. That being said, we still discussed this on Slack and in a meeting that all of my mentors and I attended. After that meeting, we decided that it would be good to gather our thoughts to see the bigger picture and plan things better. So, I went ahead and dug up our Slack messages, open issues, existing PRs, and, of course, the paper by Säilynoja et al. to see what kinds of suggestions exist for error/residual plots. This issue is the result of that work: it gathers all the ideas and provides a clear picture of how enhanced error/residual plots could be implemented.
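To make the optional x argument of ppc_error_scatter_avg described above more concrete, here is a minimal sketch of the computation. It is written in Python purely for illustration and is not bayesplot’s R code; the function names average_errors and scatter_points are hypothetical, and only the logic stated in the text (average the errors over the rows of yrep, then pair them with x if given, otherwise with y) is shown.

```python
from statistics import fmean


def average_errors(y, yrep):
    """Average error per observation: mean over the rows of yrep of y[n] - row[n]."""
    n_obs = len(y)
    for row in yrep:
        if len(row) != n_obs:
            raise ValueError("each row of yrep must have the same length as y")
    return [fmean([y[n] - row[n] for row in yrep]) for n in range(n_obs)]


def scatter_points(y, yrep, x=None):
    """Pair each average error with x[n] if x is given, otherwise with y[n]."""
    errors = average_errors(y, yrep)
    reference = y if x is None else x
    if len(reference) != len(y):
        raise ValueError("x must have the same length as y")
    return list(zip(reference, errors))


# Toy data: 3 observations, 2 simulated datasets (rows of yrep).
y = [1.0, 2.0, 4.0]
yrep = [
    [0.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
]

points_vs_y = scatter_points(y, yrep)                       # old ppc_error_scatter_avg behaviour
points_vs_x = scatter_points(y, yrep, x=[10.0, 20.0, 30.0])  # old *_vs_x behaviour
```

Because x defaults to None, existing calls that pass only y and yrep keep their behaviour, which is the same reasoning behind the optional x in ppc_error_binned.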

In Progress

ppc_residual_scatter (Issue #358 and PR #368): Currently, the error plots in bayesplot calculate the errors of a model as stat(y - yrep). The idea of implementing a function that would plot y - stat(yrep) instead of stat(y - yrep) has existed for some time, scattered across different issues and PRs with concurrent discussions. However, the gist of them all was the need for an alternative function that plots errors differently. We decided that it would be best to call these new functions residual functions instead of error functions, since they calculate the errors in a different way. To turn this idea into an implementation, I made a draft PR, which neither I nor my mentors were completely sure about. Therefore, we decided to slow this down a bit and have a meeting to discuss the future of this alternative kind of error plot in more detail.

In Future

Adding bounds to density plots (Issue #317): This is the only item on my task list that I haven’t started implementing at all. The task is about adding a bounds argument to all of the density plots (there are 10 of them). These plots work very well when the data is smooth and unbounded, but they have problems when the data is bounded. The functions use geom_density and stat_density under the hood, which have their own ways of handling bounded data, so it is possible to build on top of them to provide enhanced plotting capabilities to users. This is something that I plan to work on in the future.
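To illustrate why bounded data is a problem for density plots, here is a minimal sketch in Python. It is not bayesplot’s R code and not a claim about how the bounds argument will be implemented; the function names are hypothetical, and reflection around the bound is just one common boundary correction, used here as an assumption. The toy data is non-negative, yet a plain Gaussian kernel density estimate places part of its mass below zero.

```python
import math


def gaussian_kde(x, data, bandwidth):
    """Plain Gaussian kernel density estimate at x (ignores any bounds)."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm


def reflected_kde(x, data, bandwidth, lower):
    """Density estimate with a lower bound: mass below the bound is folded back."""
    if x < lower:
        return 0.0
    mirrored = 2.0 * lower - x  # mirror image of x on the other side of the bound
    return gaussian_kde(x, data, bandwidth) + gaussian_kde(mirrored, data, bandwidth)


def integrate(f, start, stop, steps=5000):
    """Midpoint-rule approximation of the integral of f over [start, stop]."""
    width = (stop - start) / steps
    return sum(f(start + (i + 0.5) * width) for i in range(steps)) * width


data = [0.1, 0.2, 0.4, 0.9, 1.5]  # toy non-negative observations
h = 0.3                           # bandwidth chosen by hand for the example

# Mass that the plain estimate puts on impossible (negative) values.
leaked = integrate(lambda x: gaussian_kde(x, data, h), -3.0, 0.0)

# With reflection, all of the mass stays on the valid side of the bound.
total = integrate(lambda x: reflected_kde(x, data, h, lower=0.0), 0.0, 6.0)
```

In this toy example, leaked is clearly above zero while total is approximately 1, which is the kind of behaviour a bounds argument is meant to give users control over.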

Here are some additional links where you can check all of my activity at bayesplot GitHub repository:

Conclusion

Looking back, I can confidently say that applying for Google Summer of Code was an awesome decision. I enjoyed it so much; it was so fun to explore the codebase, implement new functions, and learn about software development practices in general and open source practices in particular. I am committed to staying with bayesplot after the official end of GSoC, and we are already talking about what to do next. For all of this, I am grateful to my mentors, Teemu Säilynoja, Jonah Gabry, and Aki Vehtari, for their continuous support. I learnt a lot from them, and they answered all of my questions, regardless of how basic and simple they were. I felt relieved whenever we met or discussed things on Slack, knowing that I had people I could openly ask. It was a pleasure to work with them!

I also thank GSoC for introducing me to the open source community. I didn’t have a single commit in an open source project before, but now whenever I come across a public library on GitHub, I check issues to see if there is anything I can quickly implement and help maintainers. I already made my first commit to brms, a package that provides an interface to fit Bayesian generalized (non-)linear multivariate multilevel models using Stan. I remember reading that one of the important goals of GSoC is introducing students to the open source community so that there are more open source contributors and projects in the future. It’s easy to say that it achieves that goal extremely well.

¹ Säilynoja et al. (2025). Recommendations for visual predictive checks in Bayesian workflow. [Online]. Available: https://arxiv.org/abs/2503.01509