Introduction to Latent Class Analysis in Health Valuation

class: center, middle, inverse, title-slide

.title[
# Introduction to Latent Class Analysis in Health Valuation
]
.subtitle[
## .small[Session 1 Applied Choice Analysis]
]
.author[
### Benjamin M. Craig
]
.author[
### Suzana Karim
]
.date[
### 2022-06-10
]

---

class: inverse middle center

<img style="border-radius: 50%;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/Craig2019-scaled.jpg"
alt=""
width="225px"
/>
# .yellow[Benjamin M. Craig]

???
Hello! My name is Benjamin M. Craig and I am excited to have the opportunity to introduce many of you to applied choice analysis in health valuation with my colleague, Suzana Karim.

---
.left-column[
![Erasmus 2019](https://r4hpr.org/wp-content/uploads/2022/06/Craig_Erasmus2019.jpg)
]

.right-column[
## Current Roles:
**Associate Professor of Economics**, University of South Florida, USA 
**Executive Director**, International Academy of Health Preference Research 
**Co-editor**, R4HPR.org 
**Member**, EuroQol Group

##Workshop Objective:
###To conduct applied choice and latent-class analyses to estimate **EQ-5D value sets** for economic evaluations
]

---
#Acknowledgements

.panelset[
.panel[.panel-name[Sponsor]
This workshop is sponsored through a research grant (302-EO) from the <a href="https://euroqol.org/" target="_blank">**EuroQol Research Foundation**</a>. 
<img style="border-radius: 20%;float:left;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/EQ-5D_banner.jpg"
width="500px"
/>
]

.panel[.panel-name[Host]
In July 2022, we expect to make the workshop materials available on <a href="https://r4hpr.org/" target="_blank">**R4HPR.org**</a>, a repository of educational materials and code co-edited by **Karin Groothuis-Oudshoorn** and Benjamin M. Craig. 
<img style="border-radius: 20%;float:left;margin-right: 40px;" 
src="https://r4hpr.org/wp-content/uploads/2022/03/R4HPR.png"
width="250px"
/>
<img style="border-radius: 20%;float:center;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/Karin.jpg"
width="250px"
/>
]
.panel[.panel-name[Contributor]
The Dutch EQ-5D-Y-3L dataset was contributed by **Bram Roudijk** for use in the workshop exercises. 
<img style="border-radius: 20%;float:left;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/Bram.jpg"
width="250px"
/>
]
.panel[.panel-name[Disclaimer]
- I have no actual or potential conflict of interest in relation to this presentation.
- I will not be discussing off-label use of any medications.
- Neither the organizations above nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or use of any information, apparatus, product, or process disclosed.
]
]

---
class: inverse
.left-column[
# Three Sessions
]
.right-column[
##.yellow[10 June:] Applied Choice Analysis
##.yellow[17 June:] Latent Class Analysis
##.yellow[24 June:] Applications & Advanced Topics
]

###_Attendees who participate in all sessions & complete both exercises will receive a .yellow[Workshop Completion Certificate] from the EuroQol Research Foundation._

---
class: header_background
#Early Career Researchers (ECRs)

>This workshop was specifically designed to enhance the technical capacity of **early career researchers** and better prepare them for advanced topics in health preference research (e.g., time preferences).

>At the end of today's session, we will provide a link to a form so that attendees can register for a small-group tutoring session (30 minutes) and complete the **optional hands-on exercise**.

>To participate, an attendee must:
1. Know **English** well enough to enroll in an English-speaking university.
1. Install and run **R code**, including the code provided.
1. Agree to attend **all sessions** and complete the **two workshop exercises**.&#9745;

.footnote[&#9745;  Necessary to receive a **Workshop Completion Certificate** from the EuroQol Research Foundation.]

---
.left-column[ 
# Session 1 Outline
]

.right-column[
###Introduction, Benjamin M. Craig .grey[(40 minutes)]

1. **Health Preference Research**

1. **Applied Choice Analysis**

1. **Health Valuation**

###Break .grey[(10 minutes)]

###Estimation in R, Suzana Karim .grey[(40 minutes)]

1. **Conditional Logit**

1. **Heteroskedastic Logit**

1. **Optional Hands-on Exercise**
]

???
In Session 1, I will first introduce health preference research and the components of a health preference study. Next, I will go into greater depth about the analysis of choice data, namely the econometrics of the conditional logit and heteroskedastic logit models, including their estimation by maximum likelihood. My last section will cover the implication of these results within health valuation.

After the break, Ms. Karim will introduce the Dutch EQ-5D-Y paired comparison data and its R code.  She will review how you can estimate the conditional and heteroskedastic logit models.  Attendees are welcome to sign up and participate in an optional hands-on exercise in R, producing a value set for the Dutch EQ-5D-Y.

Over the next week, we will host a series of small-group tutoring sessions (30 minutes), allowing for personalized feedback on how to estimate an EQ-5D value set using choice data. As I have mentioned, these are optional.

---
<img style="float:center;margin-right: 10px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/post-35425-0-61835800-1316277681.jpg"
width="1100px"
/>

> With over **170 registrants** from around the world, our objective seems overly ambitious. **Nevertheless**, I am optimistic that this workshop has something for everyone.  .red[Please ask questions in the chat or via email (benjamin.craig@r4hpr.org)]. I'll also hang out afterwards.

---
class: inverse middle center

##".yellow[Health preference research (HPR)] refers to any investigation dedicated to understanding .yellow[the value of health and health-related alternatives] using observational or experimental methods."

.footnote[.left[Benjamin M. Craig, et al (2017) "Health preference research: an overview." _The Patient_ 10(4): 507-510.]]

???
---
class: header_background
#.yellow[HPR mantra:] _choice defines value_

##"Psychophysical formulations which are made ideally for discriminatory judgments of .red[simple physical stimuli] can be applied also to discriminatory judgments involving .red[social values] even when these values are loaded with prejudice or bias."

.footnote[.left[ L. L. Thurstone (1928) An Experimental Study of Nationality Preferences, The Journal of General Psychology, 1:3-4, 405-42; L. L. Thurstone (1927) A law of comparative judgment. Psychol. Rev. 34, 273-286.]
]

???
The basis of health preference research began with LL Thurstone's seminal work on the law of comparative judgment in psychophysics.  Studies in psychophysics typically asked subjects to choose between two weights, lights, sounds in terms of which was heavier, brighter or louder.  Thurstone applied this approach to the estimation of social values. For example, presenting two alternatives and asking which do you prefer?

---
class: header_background center
# Two Examples of Applied Choice Analyses

## .red[Perception of physical stimuli:] **&#9788;** .black[versus] &#9965;

## .red[Social values of objects:] &#9855; .black[versus] &#9785;

???
Any applied choice analysis entails a careful examination of empirical evidence on human behavior. In psychophysics, subjects may be exposed to physical stimuli, such as lights, and asked which is brighter?  In health preference research, subjects may be shown two health outcomes and asked which do you prefer?  In HPR, we conduct applied choice analyses to understand the value of health and health-related alternatives using observational and experimental methods.

---
# .red[Social values]: &#9855; versus $$$

<img style="border-radius: 10%;float:center;margin-right: 10px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/Thorndike37.png"
width="600px"
/>
.footnote[Thorndike, E. L. (1937). Valuations of certain pains, deprivations, and frustrations. The Pedagogical Seminary and Journal of Genetic Psychology, 51,227-239.]

---
# .red[Social values]: Relief of Child Health Problems
.left[<img style="border-radius: 0%;float:left;margin-right: 30px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/sickkids_small.jpg"
width="500px"
/>]

###How to evaluate a pediatric clinic?

- Craig BM, Brown DS, Reeve BB (2016) Valuation of Child Behavioral Problems from the Perspective of US Adults. _Med Decis Making_. 36(2): 199–209.

- Craig BM, Brown DS, Reeve BB (2015) The Value Adults Place on Child Health and Functional Status. _Value Health_. 18(4):449-56.

- Craig BM, Greiner W, Brown DS, Reeve BB (2016) Valuation of Child Health-Related Quality of Life in the United States. _Health Econ_. 25(6):768-77.

---
#Eight steps to a health preference study

.panelset[
.panel[.panel-name[General]
**Step 1**: Define the research question (Why, What, Who) 
**Step 2**: Conduct formative research 
**Step 3**: Make the preference-elicitation tasks 
**Step 4**: Construct the experimental design 
**Step 5**: Create the survey instrument 
**Step 6**: Collect the preference evidence 
**Step 7**: Analyse the data and .red[estimate] the models  
**Step 8**: Interpret the results and their uncertainty 
]

.panel[.panel-name[Peru EQ-5D-5L]
**Step 1**: The value Peruvian adults place on quality and length of life 
**Step 2**: EQ-5D-5L descriptive system and lifespan 
**Step 3**: cTTO, paired comparisons, matched pairs 
**Step 4**: Efficient and generator developed 
**Step 5**: Computer-based survey instrument 
**Step 6**: Face-to-face interviews in multiple sites 
**Step 7**: Hybrid, Logit, ZBT with power function 
**Step 8**: EQ-5D-5L value set 
 
Federico Augustovski, María Belizán, Luz Gibbons, Nora Reyes, Elly Stolk, Benjamin M. Craig, Romina A. Tejada (2020)
Peruvian Valuation of the EQ-5D-5L: A Direct Comparison of Time Trade-Off and Discrete Choice Experiments, Value in Health,
23(7): 880-888
]
.panel[.panel-name[Dutch EQ-5D-Y-3L]
**Step 1**: The value Dutch adults place on child health outcomes 
**Step 2**: EQ-5D-Y-3L descriptive system for a 10-year-old child 
**Step 3**: Paired comparisons (e.g., A versus B) 
**Step 4**: Efficient 
**Step 5**: Online survey instrument 
**Step 6**: Self-completion by panelists 
**Step 7**: Logit Estimation 
**Step 8**: EQ-5D-Y-3L value set 
 
**Bram Roudijk** and Aureliano Finch (2022) Paper under development, EuroQol Research Foundation 
]
]
???
In step 1, we define the research question by specifying the motivation, laying out the objects, and selecting the subjects.
In step 2, we conduct formative research, which entails identifying attributes, developing the descriptive framework, and specifying the model and hypotheses.
In step 3, we make the preference-elicitation tasks by presenting the decision, selecting the tasks, and evaluating the prototypes.
In step 4, we construct the experimental design, which involves selecting the sets, evaluating the responses, arranging sets into blocks, and assigning blocks to subjects.
In step 5, we create the survey instrument, which includes consent, background, and follow-up questions as well as evaluating the instrument prototypes.
In step 6, we collect the preference evidence, which includes implementing the study protocol, training, documentation, and quality control.
In step 7, we analyse the data and estimate the models, which begins with a battery of assessments followed by the estimation of the primary and secondary models. Today, we will focus solely on model estimation.
In step 8, we interpret the results and their uncertainty, which includes marginal effects, hypothesis testing, confidence intervals, bootstrapping, and choice prediction.

---
class: header_background
#Summary of Health Preference Research

###**Psychophysics**: Choices depend on the subjects' perceptions of objects

###**Choice defines value**: Preference studies help us understand values

###**Scientific Rigor**: Making causal statements about value is not easy

###_Following eight steps, a .red[health preference study] uses observational and experimental methods to collect empirical evidence on choices to understand the value of health and health-related alternatives._

???
Psychophysics has shown us that all choices depend on the perceptions of objects by subjects. Like Thurstone, we conduct studies to understand values, not utilities. To make causal statements about values requires randomization.

As an introduction, I reviewed the eight steps in a health preference study.

Conducting a health preference study is a mandatory skill for anyone interested in the value of health and health-related alternatives. It typically requires a week of coursework to cover each of the eight steps.  This workshop will only cover a component of model estimation, namely applied choice analysis and latent class analysis in health valuation.

---
class: inverse middle center
#“The wise man is one who knows what he does not know.”
.footnote[Lao Tzu (400 BC) Tao Te Ching]

???

So far, this first section has introduced you to health preference research, the psychophysical origins of choice analysis, and the eight steps in a health preference study.  Introducing each step takes about a week of coursework. However, once you have collected your preference evidence and assessed its descriptive qualities, you can estimate your first models.

If you are interested in learning more about HPR more generally, please reach out to me or Suzana Karim.  The remainder of this workshop will be less conceptual and more rigorous and practical.

---
class: header_background
#Applied Choice Analysis in Health Valuation
.pull-left[
## Health problems of a 10-year-old child

- The **EQ-5D-Y-3L** descriptive system (_Y-3L_) has five three-level attributes.

- Each of the 243 profiles ( `$3^5$` ) may be expressed as a vector of attributes.

- The best profile `$11111$` is known as "_full health_"

- The worst profile `$33333$` is known as "_the pits_"

.red[**OBJECTIVE:**] To estimate values for all _Y-3L_ profiles
]
.pull-right[
<img style="border-radius: 0%;float:left;margin-left: 10px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/sickkids_small.jpg"
width="500px"
/>
]
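
As a minimal sketch, the 243 profiles can be enumerated in R as vectors of attribute levels. The column names `a1` to `a5` are shorthand labels assumed for illustration; they are not official EQ-5D-Y-3L variable names.

```r
# Enumerate all EQ-5D-Y-3L profiles: five attributes, three levels each.
# Column names a1..a5 are illustrative placeholders (an assumption).
attribute_levels <- 1:3
profiles <- expand.grid(
  a1 = attribute_levels,
  a2 = attribute_levels,
  a3 = attribute_levels,
  a4 = attribute_levels,
  a5 = attribute_levels
)

# Label each profile with its five-digit code (e.g., "11111", "33333").
profiles$code <- apply(profiles[, 1:5], 1, paste0, collapse = "")

n_profiles <- nrow(profiles)  # number of profiles, 3^5
best  <- profiles$code[which.min(rowSums(profiles[, 1:5]))]  # no problems
worst <- profiles$code[which.max(rowSums(profiles[, 1:5]))]  # all level 3
```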
---
<img style="float:center;margin-right: 10px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/EQ-5D-Y.png"
width="1000px"
/>
- The value of each _Y-3L_ profile `$V_{j}$` is a linear function `$1-X_{j}'\beta$` of the ten incremental indicator variables representing the attribute levels (Table 1).

- The coefficients `$\beta$` in the value function represent the causal relationships between the object's attributes `$X_{j}$` and its value `$V_{j}$` (i.e., _main effects_).

.red[**OBJECTIVE:**] To estimate the ten coefficients `$\beta$` indicating the losses in value incurred across the 243 _Y-3L_ profiles

.footnote[Craig BM, Greiner W, Brown DS, Reeve BB (2016) Valuation of Child Health-Related Quality of Life in the United States. _Health Econ_. 25(6):768-77.]
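
The value function can be sketched in R as follows. The coding scheme is an assumption made for illustration: for each attribute, one indicator for "level 2 or worse" and one for "level 3", so that the two coefficients add up as a problem worsens. The coefficients in `beta` are hypothetical numbers, not estimates from any dataset.

```r
# Incremental dummy coding (ASSUMED): per attribute, one indicator for
# level >= 2 and one for level >= 3, giving ten variables in total.
code_profile <- function(profile) {
  lv <- as.integer(strsplit(profile, "")[[1]])
  stopifnot(length(lv) == 5, all(lv %in% 1:3))
  # rbind() then as.numeric() interleaves the indicators attribute by attribute.
  as.numeric(rbind(lv >= 2, lv >= 3))
}

# Hypothetical coefficients (NOT real estimates).
beta <- c(0.10, 0.20, 0.05, 0.15, 0.10, 0.25, 0.20, 0.40, 0.15, 0.30)

# V_j = 1 - X_j' beta
value <- function(profile, beta) {
  1 - sum(code_profile(profile) * beta)
}

v_full <- value("11111", beta)  # no problems, so no loss in value
v_mild <- value("21111", beta)  # loses only the first coefficient
v_pits <- value("33333", beta)  # loses the sum of all ten coefficients
```

Under this coding, moving an attribute from level 1 to level 3 costs the sum of its two coefficients, which is what makes the indicators incremental.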

---
class: inverse middle center

##"The .yellow[cumulative distribution function (CDF)] takes the values of the objects in the choice set `$V_{j}$` and predicts the probability of each choice `$P_{j}$`."

???
---
class: header_background
#Which `$CDF\left(V_{A}, V_{B}\right)$`?

###CDFs based on value differences `$\left(V_{A}-V_{B}\right)$`

.red[**Logit:**] `$\frac{e^{V_{A}}}{e^{V_{A}}+e^{V_{B}}} = \frac{1}{1+e^{V_{B}-V_{A}}}$` 
.red[**Probit:**] `$\Phi \left(V_{A}-V_{B}\right)$` 
.red[**Truncated linear:**] `$0.5+\left(V_{A}-V_{B}\right)/c$` 
.red[**Angular:**] `$0.5+\sin(V_{A}-V_{B})/2$` 
.red[**Gompertz:**] `$e^{-e^{V_{A}-V_{B}}}$`

###CDFs based on value ratios `$\left(V_{A}/V_{B}\right)$`

.green[**Half Cauchy:**] `$1-2\arctan(V_{A}/V_{B})/\pi$`  
.green[**Zermelo-Bradley-Terry:**] `$\frac{V_{A}}{V_{A}+V_{B}} = \frac{1}{1+{V_{B}/V_{A}}}$`

NOTE: When `$V_{A}=V_{B}$`, all CDF functions equal 0.5, except for the Gompertz 
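
The formulas above can be written as short R functions to check the note numerically. This is a sketch for illustration only: the function names are made up here, the constant `c = 4` in the truncated linear form is an arbitrary assumption, and the clamping to the unit interval is added so that the result stays a probability.

```r
# Probability of choosing A over B under each candidate CDF.
cdf_logit    <- function(VA, VB) 1 / (1 + exp(VB - VA))
cdf_probit   <- function(VA, VB) pnorm(VA - VB)
# c is a scaling constant (assumed value); clamp the result to [0, 1].
cdf_linear   <- function(VA, VB, c = 4) pmin(pmax(0.5 + (VA - VB) / c, 0), 1)
# Only monotone while the value difference stays within [-pi/2, pi/2].
cdf_angular  <- function(VA, VB) 0.5 + sin(VA - VB) / 2
cdf_gompertz <- function(VA, VB) exp(-exp(VA - VB))
# The ratio-based forms require positive values.
cdf_hcauchy  <- function(VA, VB) 1 - 2 * atan(VA / VB) / pi
cdf_zbt      <- function(VA, VB) VA / (VA + VB)

# Evaluate every CDF at equal values, V_A = V_B = 1.
at_equal <- sapply(
  list(logit = cdf_logit, probit = cdf_probit, linear = cdf_linear,
       angular = cdf_angular, gompertz = cdf_gompertz,
       half_cauchy = cdf_hcauchy, zbt = cdf_zbt),
  function(f) f(1, 1)
)
```

Printing `at_equal` shows 0.5 for every entry except the Gompertz, which evaluates to `exp(-1)`.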
---
class: header_background
#Which `$CDF\left(V_{A}, V_{B}\right)$`?
<img style="float:center;margin-right: 10px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/CDF_comparisons.png"
width="700px"
/>
---
#Next, you must select:

1. ###An **objective function**, such as .red[likelihood function], pseudo-likelihood function, or chi-squared function

1. ###An **optimization algorithm**, such as Berndt-Hall-Hall-Hausman (BHHH), Newton-Raphson (NR), .red[Broyden-Fletcher-Goldfarb-Shanno (BFGS)], Nelder-Mead (NM), or Simulated Annealing (SANN)

1. ###A **software package**, such as .red[R], Stata, Matlab, etc.

NOTE: Non-classical econometric approaches, such as Bayesian and machine-learning methods, fall beyond the scope of this introductory course.

---
class: header_background
#Summary of Applied Choice Analysis

- ###Y-3L Values: `$V_{j}$` is a .green[linear function of ten variables], `$1-X_{j}'\beta$`
- ###CDF: .green[Logit Model], `$P_{A} = \frac{e^{V_{A}}}{e^{V_{A}}+e^{V_{B}}}$` 
- ###Objective: .green[Likelihood function],  `$LF=\prod_{i=1}^{N}\prod_{t=1}^{T}P_{j}$` 
- ###Optimization algorithm: .green[Broyden-Fletcher-Goldfarb-Shanno] (BFGS)
- ###Software: the .green[maxLik] package in R

###Overall, this is known as a **conditional logit estimation**.
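
The pieces summarized above can be put together in a small simulation. This is a sketch under stated assumptions, not the workshop code: the data are simulated, `beta_true` is hypothetical, the incremental coding is assumed, and base R `optim()` stands in for the maxLik package so that the example has no dependencies. With maxLik, the log-likelihood itself (not its negative) would be passed to `maxLik::maxLik()` with `method = "BFGS"`.

```r
set.seed(2022)

# Incremental dummy coding (ASSUMED): per attribute, indicators for
# level >= 2 and level >= 3. lv is an n x 5 matrix of levels.
code_levels <- function(lv) {
  X <- matrix(0, nrow(lv), 10)
  X[, seq(1, 9, by = 2)]  <- lv >= 2
  X[, seq(2, 10, by = 2)] <- lv >= 3
  X
}

# Simulated paired comparisons (A versus B) with random profiles.
n_pairs <- 20000
lvA <- matrix(sample(1:3, n_pairs * 5, replace = TRUE), n_pairs, 5)
lvB <- matrix(sample(1:3, n_pairs * 5, replace = TRUE), n_pairs, 5)
D <- code_levels(lvA) - code_levels(lvB)  # X_A - X_B

# Hypothetical coefficients on the log-odds scale (NOT real estimates).
beta_true <- c(0.3, 0.5, 0.2, 0.4, 0.3, 0.6, 0.4, 0.8, 0.3, 0.7)

# V_A - V_B = -(X_A - X_B)' beta and P_A = 1 / (1 + exp(V_B - V_A)).
p_A <- plogis(-as.vector(D %*% beta_true))
y <- rbinom(n_pairs, 1, p_A)  # 1 if A is chosen, 0 if B is chosen

# Negative log-likelihood (optim minimizes) and its analytic gradient.
neg_loglik <- function(beta) {
  d <- -as.vector(D %*% beta)
  -sum(y * plogis(d, log.p = TRUE) + (1 - y) * plogis(-d, log.p = TRUE))
}
neg_grad <- function(beta) {
  d <- -as.vector(D %*% beta)
  as.vector(crossprod(D, y - plogis(d)))
}

fit <- optim(rep(0, 10), neg_loglik, neg_grad, method = "BFGS")
beta_hat <- fit$par  # estimated losses in value on the log-odds scale
```

Comparing `beta_hat` with `beta_true` gives a quick check that the likelihood and the optimizer are wired together correctly before moving to real data.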
---
class: header_background
#What is Health Valuation?

##Your preferences tell us what you .pink[WANT].  
##Your habits reveal what you .pink[DO].

???
---

class: inverse
.left-column[
# Related Fields
]
.right-column[
###.yellow[Conjoint Analysis]: market of objects (e.g., branding)
###.yellow[Choice Modelling]: .pink[choices] between objects (e.g., habits)
###.yellow[Health Preferences]: .pink[preferences] between objects (e.g., wants)
###.yellow[Health Valuation]: preferences between health outcomes on an anchored scale.
]

---
### Anchored Scaling in Health Valuation

####**_Natural_ Scale:** 
- Each logit coefficient `$\beta_k$` is the incremental effect of a health problem on the log-odds, `$\ln\left(P_{\text{no problem}}/P_{\text{problem}}\right)$`

- The sum of all coefficients `$\mu = \sum_{k=1}^{10}\beta_{k}$` is the log-odds between _full health_ and _pits_, `$\ln\left(P_{11111}/P_{33333}\right)$`

####**_Pits_ Scale:**
- The value of a health outcome `$V_{j}$` is a linear function `$1-X_{j}'\beta$` on the _natural_ scale
- On a pits scale, `$V^*_{j} =1-X_{j}'\beta/\mu$` 
- Note that `$V^*_{33333}=0$` and `$V^*_{11111}=1$`
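
A short numerical check of the rescaling, as a sketch: the coefficients below are hypothetical, and the incremental coding (all ten indicators equal to 0 for 11111 and 1 for 33333) is an assumption for illustration.

```r
# Hypothetical natural-scale (log-odds) coefficients, NOT real estimates.
beta <- c(0.3, 0.5, 0.2, 0.4, 0.3, 0.6, 0.4, 0.8, 0.3, 0.7)
mu <- sum(beta)  # log-odds between full health and the pits

# Incremental coding (ASSUMED): ten indicators per profile.
X_full <- rep(0, 10)                       # profile 11111
X_pits <- rep(1, 10)                       # profile 33333
X_mid  <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)  # profile 22222

# V*_j = 1 - X_j' beta / mu
pits_value <- function(X, beta) 1 - sum(X * beta) / sum(beta)

v_full <- pits_value(X_full, beta)  # anchored at 1
v_pits <- pits_value(X_pits, beta)  # anchored at 0
v_mid  <- pits_value(X_mid, beta)   # strictly between the anchors
```

Dividing by the sum of the coefficients is what fixes the two anchors, whatever the overall size of the coefficients.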

####**Heteroskedasticity**
- The randomness in choices varies systematically in ways unrelated to preferences
- In practice, `$\mu$` varies by task characteristics `$W$`
- For example, time of day, object position (left-right), task sequence, etc.
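
To illustrate how the scale can depend on task characteristics, here is a sketch that assumes a log-linear form, `mu_t = exp(W_t' gamma)`, applied to values on the pits scale. Both the functional form and the numbers in `gamma` are assumptions chosen for illustration; the slides do not specify them.

```r
# Heteroskedastic logit sketch: the scale mu_t depends on task characteristics W.
# ASSUMPTION: mu_t = exp(W' gamma), which keeps the scale positive.
p_choose_A <- function(VA_star, VB_star, W, gamma) {
  mu_t <- exp(sum(W * gamma))
  1 / (1 + exp(mu_t * (VB_star - VA_star)))
}

# Hypothetical task characteristics: an intercept and an indicator for
# a task that appears late in the sequence.
gamma <- c(log(4), -0.5)  # hypothetical scale parameters

# Same pair of pits-scale values, shown early versus late.
p_early <- p_choose_A(0.8, 0.5, W = c(1, 0), gamma)
p_late  <- p_choose_A(0.8, 0.5, W = c(1, 1), gamma)
```

With a smaller scale on the late task, the probability of choosing the better profile moves toward 0.5 even though the values themselves are unchanged.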

---

#Review of Health Valuation

.panelset[
.panel[.panel-name[Preference Evidence]

**Preference evidence** is an empirical dataset `$WXyZ$` which includes:

- _Tasks_ and their characteristics `$W$` (e.g., first task)

- _Objects_ and their attributes `$X$` (e.g., pits profile, 33333)

- _Behaviors_ `$y$` (e.g., subject chooses A over B)

- _Subjects_ and their characteristics `$Z$`  (e.g., subject identifies as male)
]

.panel[.panel-name[Likelihood Function]
**Multinomial Logit**:  Objects `$X$` are homogeneous and behaviors `$y$` vary by subject characteristics `$Z$`

`$LF=\prod_{i=1}^{N}\frac{1}{1+e^{V_{Z_i,B}-V_{Z_i,A}}}$`

**Conditional Logit**:  Subjects `$Z$` are homogeneous and behaviors `$y$` vary by object attributes `$X$`

`$LF=\prod_{i=1}^{N}\prod_{t=1}^{T}\frac{1}{1+e^{V_{j}-V_{k}}}$`

**Heteroskedastic Logit**: Same as conditional logit, except `$\mu_t$` varies by task characteristics `$W$`

`$LF=\prod_{i=1}^{N}\prod_{t=1}^{T} \frac{1}{1+e^{\mu_t\left(V_{j}-V_{k}\right)}}$`

]
.panel[.panel-name[Scaling]
**Log-odds scale**: coefficients `$\beta$` represent preferences and behaviors  
**Pits scale**: ratios `$\beta/\mu$` represent values on an anchored scale 
**Why scale?**: (1) shows _relative value_ of health problems; (2) facilitates control of _behavioral heteroskedasticity_

.red[**OBJECTIVE:**]

1. To estimate the ten logit coefficients `$\beta$` on a log-odds scale (with and without heteroskedasticity), and

1. To show the values of _Y-3L_ profiles on a pits scale for use in economic evaluations.
]

.panel[.panel-name[Next Steps]

After the break, Ms. Karim will introduce **Estimation in R** as well as the optional hands-on exercise.

Attendees interested in the **Workshop Completion Certificate** will need to complete the exercise and register for a small-group tutoring session (link to be provided shortly).

In the next session (17 June), I will introduce **preference heterogeneity** in health valuation, namely how to identify observable and latent differences in `$\beta$` by subject characteristics `$Z$`, using methods such as stratification and latent classes. 
]
]
---
<img style="float:center;margin-right: 10px;" 
src="https://r4hpr.org/wp-content/uploads/2022/06/post-35425-0-61835800-1316277681.jpg"
width="1100px"
/>

> I highly recommend that you register for the tutoring session and complete the exercise! 
### Be the piranha!

---
class: inverse middle center
#BREAK (10 minutes)