I made a short 2.5D test game (2 scenes), so I can only speak from that limited experience.
You first create your scene in your 3D program, e.g. 3D Studio Max. Then you create a camera and position it to get the view of the scene you want. For instance, our scene was a house with a front garden, so we set our camera in front of the house, looking at it.
Having set your camera, you then render the scene. Rendering produces your image file, for instance background.png (you can use several formats). Then you export the scene, including the camera, to create geometry.3DS.
You then create a scene in Wintermute, set the background to background.png and also load geometry.3DS. And you are done.

What Wintermute does (as far as I can guess) is use geometry.3DS to work out where the floor, walls, obstacles etc. are, so that it can place your actor, apply collision detection (e.g. stop the character from moving when he reaches a wall) and decide where to clip the actor (e.g. which part of the actor to cut away when something is in front of him).
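To make the clipping idea more concrete, here is a minimal Python sketch of a per-pixel depth comparison. This is only my assumption of how such clipping can work in general, not Wintermute's actual code; the function name clip_actor_row and the list-based depth data are made up for illustration.

```python
def clip_actor_row(actor_depth, geometry_depths):
    """Return a visibility mask for one row of actor pixels.

    actor_depth     -- distance from the camera to the actor (hypothetical value)
    geometry_depths -- for each pixel, the distance from the camera to the
                       hidden geometry covering that pixel, or None if no
                       geometry covers it
    """
    mask = []
    for geom_depth in geometry_depths:
        # The actor pixel is drawn if nothing covers it, or if the actor
        # is at least as close to the camera as the geometry there.
        mask.append(geom_depth is None or actor_depth <= geom_depth)
    return mask


if __name__ == "__main__":
    # Actor stands 5 units away; a fence at 3 units covers the middle pixel,
    # a wall at 8 units is behind the actor on the right.
    print(clip_actor_row(5.0, [None, 3.0, 8.0]))
```

Pixels marked False would be cut away, so the background image shows through and the actor appears to stand behind the object.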
The connection between geometry.3DS and background.png is the camera you created and exported in geometry.3DS (you must have noticed that you can change the camera in Scene 3D options). You used that same camera to render background.png, and so they match.
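Why the shared camera makes the two files line up can be shown with a simple pinhole projection. The Python sketch below is only an illustration under simplifying assumptions (camera looks straight along +Z with no rotation, Y is up, a vertical field of view is given); the function project_point and its parameters are hypothetical and are not part of Wintermute.

```python
import math


def project_point(point, cam_pos, fov_y_deg, width, height):
    """Project a 3D point to pixel coordinates of the rendered image.

    Assumes an unrotated camera at cam_pos looking along +Z, with Y up.
    Returns None if the point is behind (or exactly at) the camera.
    """
    # Express the point relative to the camera position.
    x = point[0] - cam_pos[0]
    y = point[1] - cam_pos[1]
    z = point[2] - cam_pos[2]
    if z <= 0:
        return None

    # Focal length in pixels, derived from the vertical field of view.
    focal = (height / 2) / math.tan(math.radians(fov_y_deg) / 2)

    # Perspective divide; image Y grows downwards, so flip the sign.
    px = width / 2 + focal * x / z
    py = height / 2 - focal * y / z
    return (px, py)


if __name__ == "__main__":
    # A point straight ahead of the camera lands in the image centre.
    print(project_point((0, 0, 5), (0, 0, 0), 90, 800, 600))
```

As long as the engine projects the hidden geometry with the same camera position and field of view that produced the render, every 3D point of the geometry falls on the matching pixel of background.png.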
The user only sees background.png, not knowing what happens under the hood.
Bear in mind that geometry.3DS does not need to contain the full scene. Before you export it, you can remove any details or textures.