I’m relatively new to discrete choice experiments (DCEs) and have really enjoyed learning about the different analysis approaches and techniques used. It is such a rapidly evolving field and there is always something new to learn. While there is a lot happening to push the boundaries, I’ve recently been helping a couple of people with the analysis of their first DCE. A lot of your analysis approach should be worked out before you begin the DCE, but when you actually get to doing the analysis for the first time you might still need help with practical details such as which commands to use. I realised there are some references I just keep recommending and coming back to, so I’ve shared them here; maybe you’ll find them helpful too.
It often helps to know at the start what you are aiming to achieve at the end. I think this is a nice example of describing the methods and assumptions of a DCE around parental preferences for vaccination programs really clearly and succinctly. The other general information I refer people to is the ISPOR Analysis of DCE guidelines, which include the ESTIMATE checklist of things to consider when justifying your choice of approach.
When I did the DCE course run through HERU in Aberdeen, it was suggested that the typical approach to analysing DCEs is to start with a simple model and then use more complex models to address specific issues that arise with your data or relate to your research question. This commonly means starting with a conditional logit model, and then considering options such as mixed logit and latent class analysis. The ISPOR Analysis of DCE guidelines have clear descriptions of the theory and assumptions of these approaches, and I found this paper interesting in comparing mixed logit and latent class approaches.
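To make the "simple model first" idea concrete, here is a minimal Python sketch of what a conditional logit model computes: each alternative gets a utility that is a weighted sum of its attributes, and the probability of choosing it is its exponentiated utility divided by the sum over the choice set. The attribute values and weights below are made up for illustration, and the function names are my own; in practice you would estimate the weights with PROC MDC, STATA or R rather than set them by hand.

```python
import math

def choice_probabilities(alternatives, beta):
    """Conditional logit probabilities for one choice set.

    alternatives: list of attribute vectors (one list of floats per alternative)
    beta: list of preference weights, same length as each attribute vector
    """
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    max_u = max(utilities)  # subtract the max for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def log_likelihood(choice_sets, beta):
    """Sum of log probabilities of the chosen alternatives.

    choice_sets: list of (alternatives, chosen_index) pairs
    """
    ll = 0.0
    for alternatives, chosen in choice_sets:
        ll += math.log(choice_probabilities(alternatives, beta)[chosen])
    return ll

# Hypothetical choice set: two alternatives described by (cost, effectiveness)
beta = [-0.5, 2.0]                      # made-up weights, not estimates
choice_set = [[1.0, 0.5], [2.0, 0.9]]
probs = choice_probabilities(choice_set, beta)
print(probs)
```

Estimation software searches for the weights that maximise this log-likelihood across all respondents and choice sets. Mixed logit and latent class models relax the assumption that everyone shares the same weights.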
I am originally a SAS user, and so when I first started analysing DCE data I assumed I would do so in SAS. However, after much investigation I’ve realised this is easier said than done and have now moved to using STATA for the DCE analysis, although I’m still much more comfortable doing the data management and preparation in SAS. Using two different packages is time consuming, clunky and the opposite of “reproducible research”, so my next step is to move both my DCE data management AND analysis to R. I haven’t got very far, so if anyone knows any good packages then please pass them on! I promise to update this page if I find something useful.
It is straightforward to run a conditional logit in SAS using PROC MDC (user guide). Some resources I found helpful for implementing PROC MDC are this example code for conditional logit with PROC MDC and this SAS user group paper “Discrete choice modelling with PROC MDC”. The error message I’ve had most often in doing this analysis is “CHOICE=variable contains redundant alternatives”, which relates to the data looking like people have chosen more than one option in a choice set. If you get this, check the cleaning and the sorting of your data!
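A quick way to check your data for this kind of problem, whatever package you use, is to count the chosen alternatives within each respondent's choice set and flag any set where the count is not exactly one. Here is a rough Python sketch, assuming long-format data with one row per alternative. The column names `id`, `set` and `choice` are hypothetical, so rename them to match your own data.

```python
from collections import defaultdict

def find_bad_choice_sets(rows):
    """Return {(respondent, choice_set): n_chosen} for sets without exactly one choice.

    rows: iterable of dicts with keys 'id', 'set' and 'choice' (1 if chosen, else 0).
    These key names are hypothetical; rename to match your own data.
    """
    chosen_counts = defaultdict(int)
    for row in rows:
        chosen_counts[(row["id"], row["set"])] += int(row["choice"])
    return {key: n for key, n in chosen_counts.items() if n != 1}

# Made-up long-format data: one row per alternative in each choice set
rows = [
    {"id": 1, "set": 1, "choice": 1},
    {"id": 1, "set": 1, "choice": 0},
    {"id": 1, "set": 2, "choice": 1},
    {"id": 1, "set": 2, "choice": 1},  # two alternatives marked as chosen
    {"id": 2, "set": 1, "choice": 0},
    {"id": 2, "set": 1, "choice": 0},  # nothing chosen
]
bad_sets = find_bad_choice_sets(rows)
print(bad_sets)
```

Any choice set this flags is worth tracing back through your cleaning steps. A sorting or merge problem can make otherwise valid responses look like duplicates.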
You can do effectively the same analysis using PROC PHREG, as described by this technote, plus there is a suite of marketing research guides that describe various ways to analyse discrete choice data.
Moving on from conditional logit to mixed logit or latent class analysis is more difficult in SAS. There is a guide in this video to running conditional logit models and mixed logit models (using PROC MDC, starting at 5:30), although I could never get their mixed logit method to work (quite possibly due to user error!). I also contacted the SAS helpdesk. They said it would be difficult, but recommended using PROC BCHOICE (Bayesian Choice) for mixed logit analysis with DCE data that has multiple choice sets per participant. There is some documentation here and a worked example here. Again, I never really got this to work but it could be my mistake.
Having faffed around in SAS for long enough, I caved in and transitioned to using STATA like everyone else in my research group! I found this a really nice introductory, step-by-step guide to analysis in STATA, including data set-up and conditional logit and mixed logit options. There is also this article, which is a guide to analysing DCE data and model selection, and includes STATA code (as well as Nlogit and Biogeme) in the supplementary material. Finally, this working paper is useful for describing the theory and code for doing more advanced models, like mixed logit and latent class analysis in STATA, although the code isn’t annotated, which I found frustrating as a new STATA user.
For latent class analysis in STATA I found this article in the STATA journal a useful description of the command, and this was a nice example of a paper that used mixed logit and latent class models and wrote them up clearly. Finally, these three articles (one, two, three) seemed like good examples of calculating and displaying relative importance graphs.
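If relative importance is new to you, one common way to calculate it is to take the range of the estimated coefficients across each attribute's levels and express it as a share of the summed ranges. I can't promise this is exactly what those articles do, so treat the Python below as a rough sketch of that range-based approach. The coefficients are made up, and the reference level of each attribute is included as 0.

```python
def relative_importance(coefficients):
    """Range-based relative importance (%) for each attribute.

    coefficients: {attribute: {level: estimated coefficient}}, with the
    reference level included at 0.0.
    """
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in coefficients.items()
    }
    total = sum(ranges.values())
    return {attr: 100.0 * r / total for attr, r in ranges.items()}

# Made-up coefficients for illustration only (not from any real DCE)
coefs = {
    "cost": {"$0": 0.0, "$20": -0.4, "$50": -1.0},
    "effectiveness": {"70%": 0.0, "90%": 0.6},
    "location": {"clinic": 0.0, "school": 0.4},
}
importance = relative_importance(coefs)
for attr, pct in importance.items():
    print(f"{attr}: {pct:.0f}%")
```

Bear in mind that these percentages depend on the levels you chose for each attribute, so they describe importance over the ranges in your design rather than in any absolute sense.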
I’m keen to analyse my next DCE in R, so have started looking at how I might do this. I have found the following resources, but if anyone has any experience with DCEs in R then please get in touch!
- Two papers by Aizaki and Aizaki & Nishimura on designing DCEs in R, which also cover analysis using conditional logit models
- Example R code and case study of mixed logit model with multiple choices per respondent, including analysis and helpful tips, written by Kenneth Train and Yves Croissant
- An mlogit package for analysing DCE data in R, as described in Kenneth Train (2009)
- Thanks to Nikita Khanna for pointing me to this paper & code for doing sample size calculations for a DCE in R.