"Time Binoculars"
Seeing different times and events overlaid on actual places from the actual places and online

Michael Naimark

michael@naimark.net

January 2010

Abstract

"Time Binoculars" are permanently installed, user aimed, binocular viewers at popular sites around the world, like the iconic coin-operated teardrop-shaped viewers (1), but with several novel improvements. First, small, high-resolution displays are integrated inside the unit, one for each eye, on axis with the binocular optics. The intensity of the display and binocular optics can be controlled, enabling the full range of transparency and opacity for both. Second, the unit's aiming mechanics, for panning, tilting, and zooming, incorporate sensors which can be used to determine the proper viewpoint for the displays. They may also have motors on them for remote or automated control. Third, small, high-resolution cameras may also be integrated inside the unit, one for each eye and on axis with the binocular optics. Finally, the unit may be connected via an onboard computer to the Internet. Thus, Time Binoculars enable onsite users to look around an actual site and see perfectly aligned augmentations of what they see, such as different times of day or different seasons, or historical views, or full-out Hollywood-style reenactments, or yet-to-be-invented games. At the same time, a community of online users can watch and participate as well.

1. Introduction

1.1 The Wonder of Place

People travel. Tourism is among the world's largest industries (2). "Setting foot" on an actual place of spectacular beauty, or where a particular structure once stood, or where an important event took place, has a deep undefined resonance. People feel it.

1.2 The Magic of Perfect Registration

Now, imagine standing on the spot where a battle took place or where an important leader gave a speech and watching a reenactment through a binocular viewer, perfectly superimposed over the actual place, in 3D and with sound. When the alignment between the actual view through the binoculars and the digital images on the display is perfect, a sort of magic occurs. This can be seen in examples of "re-photography" (3) and "re-projection" (4).

1.3 The "Bang per Buck" of Stereo-Panoramas instead of 3D Models

Time Binoculars, in theory, could be entirely portable, but physically anchoring them in the ground affords enormous practicality. By limiting the aiming to angular movements and zoom, no 3D computer model is required. And if a stereoscopic pair is used, as is the case with binoculars, users will be able to look around in full 3D as if viewing an actual 3D model, just without being able to move around laterally, like looking around while standing in a single spot (5). The cost, in terms of storage and processing, is merely "2x" one 2D image.

1.4 Opportunities for Live on the Internet

Time Binoculars are the ultimate webcam, capable of showing actual places in real time, and in stereoscopic 3D if one desires, but with a variety of overlays available. They can be robotically controlled, or viewed while onsite users are aiming the unit, or both, for example, for unique gaming opportunities. Live two-way audio can enhance the experience as well. Time Binoculars can also be shown inside pre-existing Earth models such as Google Earth, displayed as perfectly aligned overlays (6).

2. Description

2.1 Hardware

Time Binoculars are permanently installed, weatherproof binocular viewers that can be freely aimed by their users. Such units (7) have existed for many years and are often coin-operated. They may be panned and tilted within preset limits, depending on the landscape to be viewed, and sometimes zoomed as well.

Time Binoculars have digital displays installed inside the unit, one for each eye. These displays are on axis with the binocular optics, using such known means as beam splitters, prisms, or half-silvered mirrors, in such a way that the images shown on the displays appear to the user as superimposed over what is seen through the binocular optics, with additional optics to accommodate comfortable focus through known means. Additionally, filters whose degree of transmission can be changed electronically, through known mechanical or solid state means, may be used to vary the amount of light coming from both the binocular optics and the digital display. The result is that the user may see only what is seen through the binocular optics, only what is shown on the digital display, or any combination of both.

Time Binoculars may also have aim sensors installed on the pan and tilt mechanics of the unit, for example using known angular sensors, and if the unit is capable of zooming, on the zoom mechanics as well, for example using known angular or linear sensors. Additionally, aim motors may be installed to control the pan, tilt, and zoom mechanisms electronically.

Time Binoculars may also have digital cameras installed inside the unit, one for each eye. Like the digital displays, these cameras may also be on axis with the binocular optics, using such known means as beam splitters, prisms, or half-silvered mirrors. Cameras may also be installed on the front of the unit in place of actual binocular optics. Or they may be installed on the front of the unit very close to the actual binocular optics, thus not exactly on axis but very close, without any additional beam splitters, prisms, or half-silvered mirrors.

In addition to the digital displays, transmission filters, aim sensors, aim motors, and digital cameras, Time Binoculars may also incorporate microphones and speakers or headsets, as well as various "game-style controls", including, for example, gamepad, paddle, trackball, and joystick inputs as well as vibration and haptic outputs. All of the input and output data may be interfaced to a computer. The computer may be onboard the unit as well, possibly built into the base, or in a nearby external weatherproof enclosure. The computer is connected to a network such as the Internet.

2.2 Software

The displays receive imagery from the computer based on pre-programmed material, on real time generation from a graphics engine, on the onboard digital cameras, or on any combination thereof. The imagery is also based on the data from the sensors. The computer processes the image material and the sensor data to display imagery corresponding to the exact location and position of the actual unit.

Generally speaking, the imagery may be represented as two 2D images, which may be dynamic (e.g., video or animation), one for each eye, as the panning and tilting of the whole unit approximates the panning and tilting of each display around its nodal point. The imagery may also be represented as two 2D images rendered from a 3D model, at greater computational cost but with greater accuracy.

The imagery may be represented as two 2D panoramas, which may be dynamic, covering the entire potential area that the user can pan and tilt the unit. The sensor data can then be used to determine which sections of the 2D panoramas to display in real time. Hence, as the user pans, tilts, and possibly zooms the unit, the displayed imagery continues to appear as perfectly aligned overlays with the binocular optics and with the aim toward the landscape.
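As a rough illustration of how sensor data could select the section of a panorama to display, here is a minimal sketch. It is not the author's implementation. It assumes that angles map linearly to panorama pixels, that zooming by a factor z narrows the field of view to 1/z of its base value (a small-angle approximation), and that the requested window is smaller than the panorama. The names Panorama, crop_window, and base_fov_deg are hypothetical. The same window would be applied to the left-eye and right-eye panoramas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Panorama:
    """Hypothetical description of one eye's panorama and the angles it covers."""
    width_px: int
    height_px: int
    pan_min_deg: float
    pan_max_deg: float
    tilt_min_deg: float
    tilt_max_deg: float


def crop_window(pano, pan_deg, tilt_deg, zoom, base_fov_deg=(20.0, 15.0)):
    """Return (left, top, right, bottom) pixel bounds for the current aim.

    Assumes a linear angle-to-pixel mapping and a window smaller than
    the panorama; base_fov_deg is an illustrative value, not a measured one.
    """
    px_per_deg_x = pano.width_px / (pano.pan_max_deg - pano.pan_min_deg)
    px_per_deg_y = pano.height_px / (pano.tilt_max_deg - pano.tilt_min_deg)

    # Window size in pixels shrinks as the zoom factor grows.
    width = (base_fov_deg[0] / zoom) * px_per_deg_x
    height = (base_fov_deg[1] / zoom) * px_per_deg_y

    # Window center; tilting up moves toward the top row of the image.
    center_x = (pan_deg - pano.pan_min_deg) * px_per_deg_x
    center_y = (pano.tilt_max_deg - tilt_deg) * px_per_deg_y

    # Clamp so the window never leaves the panorama.
    left = min(max(center_x - width / 2, 0), pano.width_px - width)
    top = min(max(center_y - height / 2, 0), pano.height_px - height)
    return (round(left), round(top), round(left + width), round(top + height))


pano = Panorama(3600, 900, -90.0, 90.0, -15.0, 30.0)
window = crop_window(pano, pan_deg=0.0, tilt_deg=0.0, zoom=1.0)
```

With these illustrative numbers the panorama has 20 pixels per degree on both axes, so the window for a level, centered aim at zoom 1 is (1600, 450, 2000, 750).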

Speakers or headsets may receive audio from the computer based on pre-programmed material, on real time generation from an audio engine, on input from the onboard microphones, on input from streaming audio from a network such as the Internet, or on any combination thereof.

Game-style controls mounted on the unit may provide input to the computer to drive the aim motors, as an alternative to moving the unit by hand. Vibration or haptic output from the computer may give the user additional sensations, for example through their hands.

Over a network such as the Internet, one or both displays may be streamed in real time through known means. If both displays are streamed, the imagery may be viewed stereoscopically, through a variety of known means including active, passive, or anaglyphic glasses, or using a head-mounted or hand-held stereoscopic display. The sensor data may also be streamed through known means, and this data can be used to dynamically update the aim of how the imagery is displayed, for example, as a "posed" image inside a 3D Earth model (8). Data from remote sensors or controls may also stream back through the network to control the aim through the aim motors on the unit.
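The streamed sensor data could be as small as one timestamped record per update. The following sketch shows one possible encoding using JSON. The record layout and the names AimSample, encode_aim, and decode_aim are assumptions made for illustration; the text above does not specify a message format or transport.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AimSample:
    """Hypothetical aim record sent from a unit to remote viewers."""
    unit_id: str
    timestamp_s: float
    pan_deg: float
    tilt_deg: float
    zoom: float


def encode_aim(sample):
    """Serialize an aim sample as a compact JSON string."""
    return json.dumps(asdict(sample), sort_keys=True)


def decode_aim(text):
    """Rebuild an aim sample from its JSON form."""
    return AimSample(**json.loads(text))


sample = AimSample("unit-01", 1262304000.0, 12.5, -3.0, 2.0)
message = encode_aim(sample)
restored = decode_aim(message)
```

A remote viewer could use each decoded record to re-pose the streamed image inside an Earth model, and the same record shape could carry aim commands back to the unit.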

2.3 Imagery

The imagery displayed may be from a variety of sources. One source is a pre-recorded or pre-rendered motion picture. For example, a reenactment of the "storming of the Bastille" can be recorded as live action or rendered as animation or composited from both, from the exact viewpoints of actual units. The source should cover the entire panoramic area that the unit can physically move, though this may be "tiled" later from source material from non-panoramic cameras through known means. From a live action production point of view, shooting from a single viewpoint is relatively practical and economical, as the stage and backdrops can incorporate "forced perspective" and other known Hollywood-style tricks exploiting this single viewpoint constraint. One may imagine multiple "numbered" units strategically installed around a structure or along a path that tell a rich story when viewed sequentially.

The imagery may also come from the onboard cameras, either pre-recorded or in real time. Pre-recorded data from the cameras may be used to show different times of day, of season, and over enough time, of years. Panoramas may be "tiled" from the material. These panoramas may use known means to either add or subtract components from the data, for example, people walking around can be "accumulated" or "removed."
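One commonly used way to "remove" passers-by from a fixed viewpoint is a per-pixel temporal median over several aligned frames. The sketch below assumes this technique; the text above says only "known means" and does not name a method. Frames are modeled as nested lists of grayscale values, and median_background is a hypothetical name.

```python
from statistics import median


def median_background(frames):
    """Return the per-pixel median of equally sized, aligned grayscale frames.

    A pixel covered by a moving person in only a minority of frames
    takes the value of the static scene behind them.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Three 1x3 frames; a bright "person" crosses the middle pixel in one of them.
frames = [
    [[10, 10, 10]],
    [[10, 200, 10]],
    [[10, 10, 10]],
]
background = median_background(frames)
```

Because the unit is physically anchored, frames taken at the same pan, tilt, and zoom are already aligned, which is what makes so simple a filter plausible here.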

Imagery from the onboard cameras may be digitally composited with pre-recorded or pre-rendered imagery in such a way that binocular optics are not necessary, but rather, where the user sees only a digital composite from the displays. This has two advantages. First, it "normalizes" the look of the imagery to all digital, rather than a mix of "real" (through the binocular optics) and digital. More importantly, it allows for opaque overlays rather than transparent ones. For example, live action imagery of citizens scaling the walls of the Bastille would appear solid in front of the actual walls, through known means of digital compositing, rather than as ghost-like transparent characters.
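The difference between ghost-like and solid overlays can be illustrated with the standard "over" compositing operation, in which a matte (alpha) value per pixel decides how much of the live camera image is replaced. This is a minimal sketch on nested lists of grayscale values; composite_over is a hypothetical name, and a real system would operate on color video frames.

```python
def composite_over(foreground, alpha, background):
    """Blend foreground over background per pixel: a*f + (1 - a)*b.

    alpha is 1.0 where the overlay is fully opaque and 0.0 where the
    live camera image should show through unchanged.
    """
    return [
        [a * f + (1.0 - a) * b for f, a, b in zip(f_row, a_row, b_row)]
        for f_row, a_row, b_row in zip(foreground, alpha, background)
    ]


# One row, two pixels: a bright reenactor over a dark wall.
reenactor = [[255, 255]]
matte = [[1.0, 0.0]]
live_wall = [[40, 40]]
result = composite_over(reenactor, matte, live_wall)
```

Where the matte is 1.0 the reenactor fully hides the wall (255.0), and where it is 0.0 the wall is untouched (40.0). A purely optical overlay can only add display light to the scene, which is why it looks transparent.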

Imagery need not be representational but may be symbolic or informative. For example, text or arrows or highlighted areas may augment the onsite imagery from either binocular optics or onboard cameras, perfectly registered as well. This material may also be interactively called up, for example, using onboard controls to "double click" on components of the actual scene for information about them.

It is noteworthy that, since the imagery may be fundamentally 2D (or more precisely 2x 2D), it is relatively easy for a community of users to modify and generate their own imagery, made to perfectly register with the actual viewpoint. For example, "artists' renderings" of past or fictive moments, home made video dramas, or "virtual graffiti" may be made and uploaded to the unit or to an offsite server.

3. Modes of Operation

3.1 Onsite User Driven - The most common mode of operation is where the onsite user is controlling the aim of the unit, either manually or through controls driving the aim motors. The user is free to "look around" and may see a linear reenactment of history or may, through the controls, change the time of day, season, or year. In this mode, all users on the network are passive observers (though they may be able to, for example, have a real time audio dialogue with the onsite user).

3.2 Remote User Driven - Another mode of operation is where one or more remote users control the aim of the unit by controlling the aim motors. This may occur when no onsite users are present, or indeed, the unit may be purely robotic and made only for remote users.

3.3 Combination Driven - Through common controls and possibly live two-way audio, a unique interaction may occur between onsite and remote users. For example, it may be between "locals" and "virtual tourists." Or it may be for some yet-to-be-invented game.

3.4 Auto Driven - If the unit has onboard aim motors, several automatic functions may be useful. For example, the unit may make regular "sweeps" of the viewpoint, panning and tilting through its entire range, for the purpose of making near-real-time panoramas, similar to the GigaPan system (9). Or it may use known computer vision techniques to follow action, in order to accumulate "most interesting" video clips (or it may simply follow pre-scripted narrative action which the users must view). Such uses emphasize that, used as a recording device over, theoretically, vast epochs of time (10), a unique and powerful visual database may be built.
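A regular sweep can be planned as a grid of aim positions whose fields of view overlap enough to be tiled into a panorama. The sketch below is one possible way to generate such a grid, visiting rows in a back-and-forth order to reduce motor travel. The function name sweep_plan, the overlap fraction, and the example angles are all illustrative assumptions, not details of the GigaPan system or of any particular unit.

```python
import math


def _stops(low, high, fov, step):
    """Evenly spaced aim centers covering [low, high] with the given field of view."""
    span = high - low
    if span <= fov:
        return [(low + high) / 2]
    count = math.ceil((span - fov) / step) + 1
    first, last = low + fov / 2, high - fov / 2
    return [first + i * (last - first) / (count - 1) for i in range(count)]


def sweep_plan(pan_range, tilt_range, fov_deg, overlap=0.2):
    """Return a list of (pan, tilt) aim positions covering the full range.

    Adjacent shots overlap by at least the given fraction of the field
    of view; alternate rows are reversed (serpentine order).
    """
    pans = _stops(pan_range[0], pan_range[1], fov_deg[0], fov_deg[0] * (1 - overlap))
    tilts = _stops(tilt_range[0], tilt_range[1], fov_deg[1], fov_deg[1] * (1 - overlap))
    plan = []
    for row, tilt in enumerate(tilts):
        ordered = pans if row % 2 == 0 else list(reversed(pans))
        plan.extend((pan, tilt) for pan in ordered)
    return plan


plan = sweep_plan((-40.0, 40.0), (-10.0, 10.0), (20.0, 15.0), overlap=0.25)
```

For this example the plan has ten stops: five pan positions from -30 to 30 degrees at a tilt of -2.5 degrees, then the same five in reverse at a tilt of 2.5 degrees.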

4. Additional Enhancements and Features

4.1 The unit may be binocular or monocular, or multiscopic (more than 2 viewpoints).

4.2 Using computer vision techniques, the angular orientation of the unit and the degree of zoom may be automatically derived from the camera data.

4.3 The unit may have a nearby display for several people to watch simultaneously.

4.4 The unit may have an alert ("hot now") input (11), triggered by either onsite or remote users when they see something of current interest; over time, these alerts can accumulate in a visual database.
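As a toy illustration of item 4.2, the pan angle could be estimated by finding where a strip of the live camera image best matches a stored panorama. The sketch below does this in one dimension with a sum-of-squared-differences search. It is a deliberately simplified stand-in for real computer vision techniques, and estimate_pan_offset is a hypothetical name.

```python
def estimate_pan_offset(pano_row, view_row):
    """Return the pixel index in pano_row where view_row matches best.

    Uses an exhaustive sum-of-squared-differences search over one row
    of grayscale values; ties resolve to the leftmost position.
    """
    best_index, best_cost = 0, float("inf")
    for start in range(len(pano_row) - len(view_row) + 1):
        window = pano_row[start:start + len(view_row)]
        cost = sum((p - v) ** 2 for p, v in zip(window, view_row))
        if cost < best_cost:
            best_index, best_cost = start, cost
    return best_index


pano_row = [0, 0, 5, 9, 5, 0, 0, 3, 0]
noisy_view = [6, 8, 5]  # slightly noisy copy of the feature at index 2
offset = estimate_pan_offset(pano_row, noisy_view)
```

Given a known pixels-per-degree scale for the panorama, the matched index converts directly to a pan angle, which could cross-check or replace a mechanical sensor reading.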

References

(1) http://www.toweropticalco.com/gallery.html

(2) http://www.unwto.org/facts/menu.html

(3) http://www.thirdview.org/3v/rephotos/index.html

(4) http://www.naimark.net/projects/displacements/displ_v2005.html

(5) http://www.naimark.net/projects/bnh3.html

(6) http://interactive.usc.edu/viewfinder/

(7) Google search for [ "coin operated" binoculars ]

(8) http://interactive.usc.edu/viewfinder/

(9) http://www.gigapansystems.com/

(10) http://www.naimark.net/projects/bigprojects/hereforevercam.pdf

(11) http://stevens.usc.edu/IP/3995
