object recognition Flashcards by S K

what is the goal of our object recognition system?

must be able to deal with different representations of the same object

what does the goal of the object recognition system apply to?

structural differences

differences in pose, distance, lighting, position, viewpoint

degraded images - occlusion, noise, distortion, filtering

what is the two-stream model?

post V1, information is transmitted via two pathways - ventral and dorsal stream

more complex networks link to frontal lobe and other areas

what is the ventral stream?

the "what" pathway

from V1 to V2 then V4 and IT cortex

associated with object recognition and memory

what is the dorsal stream?

the "where" and "how" pathway

from V1 to V2 and finally V5/MT

associated with motion, location, saccadic control

what is object agnosia?

damage to ventral stream can create a deficiency in object recognition

what is the problem with models of object recognition?

all models share a common assumption

the senses register the presence of a stimulus, and an internal representation of that stimulus is generated (the perceptual representation)

object is recognised when there is a match between the perceptual representation and some stored representation of the same object

object recognition requires the interaction of perception and memory

what did David Marr assert about models of object recognition?

necessarily includes a computational approach

should have three levels of analysis - computational (what is the system doing and why?), algorithmic (which processes, rules and algorithms are used to solve the problem?), implementational (how are these processes implemented by the system?)

what are template-matching models?

simplest model for object recognition

assumes we have templates - perceptual representations stored in our long-term memory

a template is available for every object

when an object that matches a template appears in the receptive field of that template's detector, the detector signals

what is the problem with template-matching models?

works well, but letters must be in exactly the expected location and orientation

for this to work, would need a detector for every possible orientation, scale and font - requiring an impossibly large brain
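
A minimal sketch of the idea and its weakness, assuming letters are tiny binary grids. The 3x3 grid and the `matches` and `rotate_90` helpers are illustrative assumptions, not part of any published model.

```python
# Hypothetical 3x3 binary "retina": 1 = ink, 0 = blank.
TEMPLATE_T = (
    (1, 1, 1),
    (0, 1, 0),
    (0, 1, 0),
)

def matches(image, template):
    """A template detector fires only if every pixel agrees with its template."""
    return image == template

def rotate_90(image):
    """Rotate a square grid 90 degrees clockwise."""
    return tuple(zip(*image[::-1]))

upright_t = TEMPLATE_T
rotated_t = rotate_90(TEMPLATE_T)

print(matches(upright_t, TEMPLATE_T))  # letter in the expected orientation
print(matches(rotated_t, TEMPLATE_T))  # same letter, rotated
```

In this sketch the rotated letter no longer matches, so recognising it would need a second template, and likewise for every other orientation, scale and font.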

what are feature detection models?

Selfridge’s Pandemonium model (1959)

a built-up template model

described in terms of demons with different jobs

model fits well with Hubel & Wiesel’s work which suggests feature detecting neurons

what are feature demons in the feature detection model?

look at image and simply write down how many examples of their feature they see

what are cognitive demons in the feature detection model?

shout if they think that the combination of features applies to their letter

the more confident they are, the louder they shout

what are decision demons in the feature detection model?

listens to the cognitive demons, decides which one is shouting the loudest, and reports that demon's letter as the perceived letter
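
The three kinds of demon described above can be sketched as a small pipeline. This is a toy illustration only: the stroke inventory in `LETTER_FEATURES` and the shouting rule (louder when fewer feature counts disagree) are assumptions made for the example, not Selfridge's specification.

```python
# Hypothetical feature inventory: expected count of each stroke type per letter.
LETTER_FEATURES = {
    "A": {"horizontal": 1, "vertical": 0, "oblique": 2},
    "H": {"horizontal": 1, "vertical": 2, "oblique": 0},
    "T": {"horizontal": 1, "vertical": 1, "oblique": 0},
}

def feature_demons(image_strokes):
    """Each feature demon counts how many examples of its feature it sees."""
    counts = {"horizontal": 0, "vertical": 0, "oblique": 0}
    for stroke in image_strokes:
        counts[stroke] += 1
    return counts

def cognitive_demons(counts):
    """Each cognitive demon shouts louder the better the counts fit its letter."""
    shouts = {}
    for letter, expected in LETTER_FEATURES.items():
        mismatch = sum(abs(counts[f] - expected[f]) for f in counts)
        shouts[letter] = 1.0 / (1.0 + mismatch)  # assumed loudness rule
    return shouts

def decision_demon(shouts):
    """The decision demon reports the letter of the loudest cognitive demon."""
    return max(shouts, key=shouts.get)

strokes = ["vertical", "horizontal", "vertical"]
shouts = cognitive_demons(feature_demons(strokes))
perceived = decision_demon(shouts)
print(perceived)
```

Because only feature counts are kept, this sketch carries no information about how the strokes are arranged relative to each other.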

how are early implementations of the feature detection model crude?

feature demons still using templates

still no information on configuration

still can’t distinguish between different versions

what are the controversies in object recognition?

if asked to describe the shape of a coin, might say it's a disc - this is always true, it's a feature of the object

physical shape of coin is viewpoint-independent

perceptual representation changes and therefore is viewer-dependent

what is Marr & Nishihara’s (1978) structural description model?

believed the goal of the model is to describe the object unambiguously, so the system must be invariant to transformations in viewpoint, illumination, etc.

this means the system must know which properties are invariant under transformation and how other properties may vary

coordinate system is object centred, negating the problem of transformation variance

what are the primitives in Marr & Nishihara’s (1978) structural description model?

primitives are the basic units of information in the representation

volumetric approach - volumes only require axis and size info - maintains specificity without requiring too much storage space

see everything in cylinders

how is information organised into an object description in Marr & Nishihara’s (1978) structural description model?

viewer centred - input image, edge image, 2.5D sketch
object centred - 3D model

at the 4th stage, the system creates a viewpoint-independent representation of the object by describing its volumetric structure

the 3D model enables consistent recognition of the object from any angle, ensuring the object can be identified no matter how it is viewed

object is described in terms of its axes and the volumes around them - description is modular and hierarchical meaning the object can be described at many scales, allowing for identity matching and discrimination
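
A modular, hierarchical description of this kind can be sketched as a nested data structure. The `Volume` class, the `describe` helper and all the numbers below are hypothetical, chosen only to show how one object can be read out at coarser or finer scales.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Volume:
    """A cylinder-like primitive: only axis and size information is stored."""
    name: str
    axis_length: float
    width: float
    parts: List["Volume"] = field(default_factory=list)

def describe(volume, max_depth, depth=0):
    """List the volume names down to max_depth, i.e. at a chosen scale."""
    names = [volume.name]
    if depth < max_depth:
        for part in volume.parts:
            names.extend(describe(part, max_depth, depth + 1))
    return names

# Illustrative values only.
hand = Volume("hand", 0.2, 0.1)
forearm = Volume("forearm", 0.3, 0.08, [hand])
arm = Volume("arm", 0.7, 0.1, [forearm])
human = Volume("human", 1.8, 0.4, [arm])

print(describe(human, 0))  # coarsest scale: the whole object as one volume
print(describe(human, 3))  # finest scale in this sketch
```

Each level stores only an axis length and a width, which is the sense in which a volumetric description stays specific without needing much storage.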

what is the input image in Marr & Nishihara’s (1978) structural description model?

retinal image

intensity and wavelength of light at each point

what is the edge image in Marr & Nishihara’s (1978) structural description model?

zero crossings, blobs, edges, bars, ends, curves, boundaries
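
A zero crossing can be illustrated on a one-dimensional strip of intensities: where the second difference of intensity changes sign, there is an edge. This is a simplified sketch that assumes a noise-free signal and skips the smoothing a real edge detector would apply first; the function names and sample values are made up for the example.

```python
def second_difference(signal):
    """Discrete second derivative at each interior position of the signal."""
    return [
        signal[i - 1] - 2 * signal[i] + signal[i + 1]
        for i in range(1, len(signal) - 1)
    ]

def zero_crossings(d2):
    """Return pairs of signal positions between which the sign of d2 flips."""
    crossings = []
    for k in range(len(d2) - 1):
        if d2[k] * d2[k + 1] < 0:
            # d2[k] belongs to signal position k + 1 (interior positions only)
            crossings.append((k + 1, k + 2))
    return crossings

# Dark region on the left, bright region on the right, with a blurred edge.
intensity = [10, 10, 10, 30, 70, 90, 90, 90]
edges = zero_crossings(second_difference(intensity))
print(edges)
```

The sign flip falls in the middle of the intensity ramp, which is where the edge between the dark and bright regions lies.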

what is the 2.5D sketch in Marr & Nishihara’s (1978) structural description model?

surfaces with local orientations and discontinuities in depth

what is the 3D model in Marr & Nishihara’s (1978) structural description model?

composed of 3D “primitive” volumes, organised hierarchically by scale

what is the recognition - "model" store - in Marr & Nishihara's (1978) structural description model?

in our brain, we have a full catalogue of 3D representations of objects

even if your object perception doesn't exactly match anything in your model store, you find the closest match

you have sufficient information on the object from the image and your memory to help you interact with it (image manipulation)

what is Biederman's (1987) structural description model?

model also called recognition by components (RBC)

proposed a set of primitive volumes into which objects are decomposed (not just cylinders)

physiological evidence presented from neuronal recordings

what are non-accidental properties in Biederman's (1987) structural description model?

collinearity (straightness of lines)

curvature

symmetry

parallelism

co-termination

what are geons in Biederman's (1987) structural description model?

volumes are geons (geometric ions) - many features of these geons are preserved in both 2D and 3D space (non-accidental properties)

estimated that there are <=36 geons, therefore 36^2 = 1296 pairs of geons, which can be attached in different ways and at different relative sizes

proposed that there are ~75,000 possible 2-geon objects
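
The arithmetic on this card can be checked directly. The implied number of attachment and relative-size arrangements per pair is derived here by simple division from the card's approximate total; it is not a figure stated on the card.

```python
n_geons = 36                      # upper estimate quoted on the card
ordered_pairs = n_geons ** 2      # every geon paired with every geon
two_geon_objects = 75_000         # approximate figure quoted on the card

# Derived for illustration: arrangements per pair implied by the two figures.
arrangements_per_pair = two_geon_objects / ordered_pairs

print(ordered_pairs)
print(round(arrangements_per_pair))
```

On these numbers, each pair of geons would need to combine in roughly 58 distinguishable ways to reach a total of about 75,000 two-geon objects.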

what experimental evidence did Biederman (1987) present for the structural description model?

evidence for the role of these geons in human object recognition

in the stimulus figure, the right column is missing key (2D) geons, while the centre column is missing an equal amount of other contour info

object recognition was much poorer in the missing-geon condition (solid lines), particularly when presented for a short time

how does partial occlusion affect Biederman's (1987) structural description model?

changes objects from flat to 3D

visual system will generate hypotheses about how the object's contours may continue behind the occluder - cognitive processes that allow people to infer the full shape of an object even though it's partially hidden

geon theory struggles to explain how humans perceive and recognise objects that are partially obscured, as it doesn't fully account for this inferential processing

what are the pros of structural description models?

invariance is well-explained

recognition relies on description rather than matching

graded representations cope with discrimination and generalisation

evidence that structural information matters to humans and neurons

what are the cons of structural description models?

extracting model parameters can be hard in real images

structural description is difficult for some objects

driven by theoretical desirability rather than behavioural or physiological evidence

what are view-dependent models?

don't need a catalogue of fixed 3D models, but rather a catalogue of shape descriptions that match view-dependent characteristics of objects

brute force association

finding of a canonical perspective supports the idea of an experience-based catalogue

uses a viewer-based coordinate system

weighted approach between layers

what is a canonical perspective in view-dependent models?

what comes to mind when we think of an object

the view that seems to maximise the information used for recognition

what are primitives in view-dependent models?

sub-regions of the image - "abstract" features that might consist of lines, curves, textures, colour, shading

feature-sensitive units combine into each other in a weighted way, becoming more complex

size and position invariant (because IT neurons have big receptive fields)

these feed into view-tuned object recognition cells

recognition by matching input to closest stored view

what is a view layer in view-dependent models?

units code a particular view of the object

what is an object layer in view-dependent models?

units respond to a particular object

view-independent

what is the evidence for view-dependent models?

human object recognition is not always viewpoint invariant

viewing sphere - participants practised recognising objects from specific viewpoints, then were tested at novel viewpoints

interpolation (between previous viewpoints) easiest, extrapolation (beyond previous viewpoints but on the same axis) medium difficulty, orthogonal axis (from a completely new viewpoint) hardest

behavioural and physiological evidence exists
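
The view layer and object layer described in the cards above can be sketched with a few view-tuned units feeding one object unit. Everything specific here is an assumption made for illustration: Gaussian tuning, the 20 degree width `SIGMA_DEG`, pooling by summation, and a single rotation axis with no wrap-around.

```python
import math

SIGMA_DEG = 20.0  # assumed tuning width of each view-tuned unit

def view_unit(stored_view_deg, input_view_deg):
    """View-layer unit: responds most strongly to its own stored view."""
    diff = input_view_deg - stored_view_deg
    return math.exp(-(diff ** 2) / (2 * SIGMA_DEG ** 2))

def object_unit(stored_views_deg, input_view_deg):
    """Object-layer unit: pools (here, sums) the view units of one object."""
    return sum(view_unit(v, input_view_deg) for v in stored_views_deg)

stored = [0.0, 60.0]                     # views practised during training
trained = object_unit(stored, 0.0)       # a practised view
interpolated = object_unit(stored, 30.0) # between the practised views
extrapolated = object_unit(stored, 90.0) # beyond them, on the same axis

print(trained > interpolated > extrapolated)
```

Under these assumptions the object unit responds best to practised views, less to views between them, and least to views beyond them, which is the same ordering as the interpolation and extrapolation conditions. The sketch does not model the orthogonal-axis condition.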

what are the pros of view-dependent models?

straightforward

minimises transformations that must be performed

newer models are based directly on what we know of physiology

abstract features are recombinable

good behavioural, physiological and simulation-based evidence

what are cons of view-dependent models?

humans often show quite good generalisation across viewpoints even for novel objects

still more memory intensive than e.g. geon model

object recognition Flashcards

(40 cards)