The hippocampus and parahippocampal region are essential for representing episodic memories formed across different spatial locations and during interactions with various objects, and for using those memories to guide future adaptive behavior. The ‘dual-stream model’ was initially formulated based on anatomical characteristics of the medial temporal lobe, dividing the parahippocampal region into two streams that separately process and relay spatial and non-spatial information to the hippocampus. Despite its significance, the dual-stream model in its original form cannot explain various recent experimental results, and many researchers have recognized the need to modify the model. Here, we argue that categorizing the parahippocampal region into spatial and non-spatial streams a priori may be too simplistic, particularly in light of ambiguous situations in which a sensory cue alone (e.g., a visual scene) may not allow such a definitive categorization. Based on a review of evidence in the literature, including our own, that reveals the importance of goal-directed behavioral responses in determining the relative involvement of the parahippocampal processing streams, we propose the GIST (Goal-directed Interaction of Stimulus and Task-demand) model. In the GIST model, input stimuli such as visual scenes and objects are first processed by both the postrhinal and perirhinal cortices (the postrhinal cortex more heavily involved with visual scenes and the perirhinal cortex with objects), with relatively little dependence on behavioral task demand. However, once perceptual ambiguities are resolved and the scenes and objects are identified and recognized, the information is then processed through the medial or lateral entorhinal cortex, depending on whether it is used to fulfill navigational or non-navigational goals, respectively.
As complex sensory stimuli are used for both navigational and non-navigational purposes in an intermixed fashion in naturalistic settings, the hippocampus may then be required to integrate these experiences into a coherent map that allows flexible cognitive operations for adaptive behavior.