Describe stages of value-based decision making = representation and valuation
Value is encoded as a common currency (options aligned on the same axis) in vmPFC/OFC
Representation and valuation - which area is responsible for this? Represents the state of the world
what about changes in value across different directions = 2 things measured
Desirability = current worth of available options - what we measured with the common currency
Availability = chance that the chosen item is obtained or the goal of the action is realized
- distinguish between the 2 - processed by different areas
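A minimal sketch of how the 2 components could combine into one value on a common axis. The multiplicative form (desirability x availability, i.e. a textbook expected value) is an assumption for illustration, not something stated in these notes; the option names and numbers are hypothetical.

```python
def expected_value(desirability, availability):
    """Combine the two components multiplicatively (assumed expected-value form)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a probability in [0, 1]")
    return desirability * availability


# Hypothetical options: (desirability in arbitrary units, availability as a probability)
options = {"A": (10.0, 0.2), "B": (4.0, 0.8)}

# Place both options on the same axis, then pick the larger value
values = {name: expected_value(d, p) for name, (d, p) in options.items()}
best = max(values, key=values.get)
```

Here the more desirable option A loses to option B because it is much less available, which is why the 2 components need to be tracked separately.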
patient with vmPFC lesion has suboptimal performance in Iowa gambling task = describe overall
Broad, extensive PFC lesion - but specific deficits found - chooses the high-risk deck the whole time
Not able to compute value, so cannot switch
Perseverate in choosing risky decks even though these have negative average returns in the long run
Similar to Wisconsin card sorting (inability to change behaviour with new info) but the domain differs = estimating probabilistic rewards rather than rules
Can we be more precise = decompose which type of value is encoded
prefrontal areas
Ventrolateral prefrontal cortex (vlPFC)
Orbitofrontal cortex (OFC)
Regions in vmPFC
Compare availability and desirability
Lesions in patients are rarely localized
Experimental lesions in macaques can test more finely the contribution of different PFC areas
definition vmPFC
Sometimes encompasses parts of OFC - lateral parts, areas 11/13
Large, encompasses many areas
describe 3-arm bandit task
3 objects - each associated with a probability of obtaining reward
Probability of obtaining reward changes slowly within blocks (not completely stable)
Abrupt changes in the probabilities across blocks
Monkeys need to evaluate options
what do you have to do to do 3-arm bandit task
Have to integrate info across history = use reward history to build value
Have to re-explore and figure out which one is best
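A rough simulation sketch of the task demands above: 3 options, reward probabilities that drift slowly within a block and jump at block boundaries, and a learner that builds value from reward history. The delta-rule update, the epsilon-greedy re-exploration, and all parameter names and values (`alpha`, `epsilon`, `step`) are illustrative assumptions, not the model or parameters used in the experiments.

```python
import random


def drift(probs, step, rng):
    """Slow within-block change: small bounded random walk on each probability."""
    return [min(1.0, max(0.0, p + rng.uniform(-step, step))) for p in probs]


def run_bandit(blocks, trials_per_block=100, alpha=0.3, epsilon=0.1, step=0.01, seed=0):
    """Return (fraction of trials on which the currently best option was chosen, final values)."""
    rng = random.Random(seed)
    values = [0.5, 0.5, 0.5]  # initial value estimate for each object
    n_best = 0
    n_trials = 0
    for block_probs in blocks:
        probs = list(block_probs)  # abrupt change at each block boundary
        for _ in range(trials_per_block):
            if rng.random() < epsilon:
                choice = rng.randrange(3)  # occasional re-exploration
            else:
                choice = max(range(3), key=lambda i: values[i])  # exploit best estimate
            reward = 1.0 if rng.random() < probs[choice] else 0.0
            # Integrate reward history: move the estimate toward the latest outcome
            values[choice] += alpha * (reward - values[choice])
            best = max(range(3), key=lambda i: probs[i])
            if choice == best:
                n_best += 1
            n_trials += 1
            probs = drift(probs, step, rng)
    return n_best / n_trials, values


# Hypothetical two-block schedule in which the best and worst objects swap
fraction_best, final_values = run_bandit([(0.8, 0.4, 0.2), (0.2, 0.4, 0.8)])
```

Without the exploration step the learner could stay stuck on the old best object after a block change, which is the "have to re-explore" requirement.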
why is location of objects changed in 3-arm bandit task
Controlling for things we aren’t testing for
Make sure not assigning value to location = control for location bias
Identity of object is most important
describe what happens in bandit task when OFC lesion
Bilateral excitotoxic lesion does not affect behaviour
Measure probability of choosing high-reward option when probabilities switch
In control subjects = choices track changing reward probabilities = choose it with same probability of reward
No changes in behaviour with OFC lesion
describe what happens in bandit task when vlPFC lesion
Bilateral excitotoxic lesion
= changes behaviour, big deficit in task
Subjects with vlPFC lesions have a deficit in tracking the dynamics of stimulus-outcome contingencies
Able to track initially (lagging a bit though) but then cannot update choices to pick the new high-reward option
Choices are less affected by recent outcomes, leading to a deficit in tracking reward contingencies
= cannot integrate recent outcomes to build expected value of world
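One hypothetical way to picture "choices are less affected by recent outcomes": a value estimate updated with a smaller learning rate lags behind after a reward probability switches. Treating the lesion as a lower learning rate is purely an illustrative assumption, not a mechanism claimed in these notes; the function name and all numbers are made up for the sketch.

```python
def expected_value_trace(alpha, p_before=0.8, p_after=0.2, n_before=50, n_after=20, v0=0.5):
    """Average (noise-free) delta-rule estimate for one option whose reward
    probability switches from p_before to p_after after n_before trials."""
    v = v0
    trace = []
    for t in range(n_before + n_after):
        p = p_before if t < n_before else p_after
        v += alpha * (p - v)  # expected update: move toward the current probability
        trace.append(v)
    return trace


fast = expected_value_trace(alpha=0.3)   # recent outcomes weigh heavily
slow = expected_value_trace(alpha=0.03)  # recent outcomes weigh little

# 20 trials after the switch, the fast learner is close to the new probability (0.2),
# while the slow learner still overestimates the option
lag = slow[-1] - fast[-1]
```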
specificity of value properties affected by lesions across areas = conclusions from OFC and vlPFC lesions in 3-arm bandit task experiment
OFC lesion does not affect behaviour in bandit task
vlPFC lesion impairs ability to track dynamic stimulus-outcome contingencies = availability of reward - ability to evaluate if object is reachable - possible signature of a required contribution of vlPFC to computing and acting upon the availability component of value = assessing probability it is present
Now = try to find a task that OFC lesion does affect - double dissociation
describe devaluation task
Measure changes in desirability of a reward
60 training trials per session, 30 per food reward = train
In test phase = first have access to one reward type - consume until satiety = eat berries until no longer wanted
Then tested = 30 trials on preference between the 2 reward types
Devalued berries - value decreased, now don’t want, changed preference
what happens in devaluation task when lesion OFC
Measure proportion of choices shifted toward the high-value (non-satiated) option, i.e. away from the devalued one, compared to baseline condition without satiation
Inability to adapt following OFC lesion = monkeys with OFC lesion made fewer adaptive choices - unable to re-evaluate value of options after they have changed - can’t make devaluation choice = cannot update value based on internal state = don’t feel devaluation and can’t switch
Possible signature of a required contribution of OFC to computing and acting upon the desirability component of value
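A small sketch of a devaluation score: how much choices shift away from the sated food after satiation, relative to a baseline session without satiation. The exact formula used in the experiments is not given in these notes, so this simple difference of proportions is an assumption; the function names, the second food ("peanut") and the choice data are hypothetical.

```python
def proportion_devalued(choices, devalued):
    """Fraction of trials on which the devalued (sated) food was chosen."""
    return sum(c == devalued for c in choices) / len(choices)


def shift_score(baseline, post_satiation, devalued):
    """Positive score = choices moved away from the devalued food after satiation."""
    return proportion_devalued(baseline, devalued) - proportion_devalued(post_satiation, devalued)


# Hypothetical 30-trial sessions
baseline = ["berry", "peanut"] * 15                  # no preference before satiation
adaptive = ["peanut"] * 24 + ["berry"] * 6           # shifts away from sated berries
no_adaptation = ["berry", "peanut"] * 15             # keeps picking both equally

adaptive_score = shift_score(baseline, adaptive, "berry")
flat_score = shift_score(baseline, no_adaptation, "berry")
```

A score near zero, as in the `no_adaptation` session, is the pattern described above for animals that cannot update value based on internal state.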
double dissociation function = general
2 tasks measure different areas and different aspects of value = these functions are specific to those areas - because each deficit is specific to 1 task
Double dissociation of function between OFC and vlPFC - selective and independent contributions to different types of value learning
double dissociation function = OFC
Signature of a required contribution of OFC to computing and acting upon the desirability component of value - what is the subjective value of one unit of reward
double dissociation function = vlPFC
Signature of a required contribution of vlPFC to computing and acting upon the availability component of value - what is the probability of obtaining reward
subdivisions OFC
OFC divided into medial OFC and lateral OFC based on anatomical and connectivity info
= use tasks and targeted lesions to start probing at finer level
3-arm bandit task adapted version
Reward probabilities of best and second-best options are very close and relatively stable during the task
- 2 options very close to each other = have to compare relative values - small differences in values
what happens when lesion lateral OFC - adapted 3-arm bandit task
No behavioural deficits in task following lateral OFC lesion - chose best option with same probability
what happens when lesion medial OFC - adapted 3-arm bandit task
Behavioural deficits in task following medial OFC lesion = lose ability to make distinction between options
Critical for making fine judgements between options - fine discrimination
mOFC deficits rescued in sessions with larger differences across options = suggests that mOFC is critical for comparing available options in order to guide choices
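A sketch of why close values demand fine comparison: under a noisy choice rule, the probability of picking the best option drops when the best and second-best values are close and recovers when the gap is larger. The softmax rule, the `temperature` parameter and the value sets are illustrative assumptions, not the analysis used in the experiments.

```python
import math


def softmax(values, temperature):
    """Choice probabilities from values; higher temperature = noisier comparison."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical reward probabilities used directly as option values
close_options = [0.60, 0.55, 0.20]  # best and second best nearly equal
far_options = [0.80, 0.40, 0.20]    # larger difference across options

p_best_close = softmax(close_options, temperature=0.1)[0]
p_best_far = softmax(far_options, temperature=0.1)[0]
```

With the same amount of comparison noise, the best option is chosen less reliably in the close set than in the far set, which mirrors the rescue seen in sessions with larger differences across options.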
describe devaluation task = lesion lateral and medial OFC
Look at how devaluation changes over time
More precise lesion localization to understand contributions of subdivisions of OFC
Control animals have robust changes in behaviour after devaluation - time course of behaviour changes, keep picking the one they haven’t had much of
describe devaluation task = lesion lateral OFC results
Long-term deficits in behaviour; do not adapt to changes in value due to satiation
Picks both options equally
describe devaluation task = lesion medial OFC results
More transient
Initially strong deficits in behaviour but those recover (pick non-devalued option) back to baseline from second post-op test
Time course = important, can recover for medial OFC lesion
muscimol devaluation task = set up
Muscimol inactivations = allow for a temporally and spatially defined lesion - compare effect of inactivation during the devaluation process (when eating berries) or after the devaluation process (when choosing)
Is it a deficit in creating value or in being able to act on it?
Certain areas/circuits can be needed for learning/updating value but not for action selection
Dissociation between building a value representation and using that representation for action