"Time Binoculars"
Seeing different times and events overlaid on actual places, from the actual places and online
Michael Naimark
michael@naimark.net
January 2010
Abstract
"Time Binoculars" are permanently installed, user-aimed binocular viewers at
popular sites around the world, like the iconic coin-operated
teardrop-shaped viewers (1), but with several novel
improvements. First, small, high-resolution displays are integrated inside the
unit, one for each eye, on axis with the binocular optics. The intensity of the
display and binocular optics can be controlled, enabling the full range of
transparency and opacity for both. Second, the unit's aiming mechanics, for
panning, tilting, and zooming, incorporate sensors which can be used to
determine the proper viewpoint for the displays. They may also have motors on
them for remote or automated control. Third, small, high-resolution cameras may
also be integrated inside the unit, one for each eye and on axis with the
binocular optics. Finally, the unit may be connected via an onboard computer to
the Internet. Thus, Time Binoculars enable onsite users to look around an
actual site and see perfectly aligned augmentations of what they see, such as
different times of day or different seasons, or historical views, or full-out
Hollywood-style reenactments, or yet-to-be-invented games. At the same time, a
community of online users can watch and participate as well.
1. Introduction
1.1 The Wonder of Place
People travel. Tourism is among the world's largest industries (2). "Setting foot" on an actual place of spectacular
beauty, or where a particular structure once stood, or where an important event
took place, has a deep undefined resonance. People feel it.
1.2 The Magic of Perfect Registration
Now, imagine standing on the spot where a battle took place or where an
important leader gave a speech and watching a reenactment through a binocular
viewer, perfectly superimposed over the actual place, in 3D and with sound.
When the alignment between the actual view through the binoculars and the
digital images on the display is perfect, a sort of magic occurs. This can be
seen in examples of "re-photography" (3) and "re-projection" (4).
1.3 The "Bang per Buck" of Stereo-Panoramas instead of 3D Models
Time Binoculars, in theory, could be entirely portable, but physically
anchoring them in the ground affords enormous practicality. By limiting the
aiming to angular movements and zoom, no 3D computer model is required. And if
a stereoscopic pair is used, as is the case with binoculars, users will be able
to look around in full 3D as if viewing an actual 3D model, just without being
able to move around laterally, like looking around while
standing in a single spot (5). The cost, in terms
of storage and processing, is merely "2x" one 2D image.
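The "2x" claim is easy to make concrete. A minimal back-of-envelope sketch in Python, with illustrative (assumed) panorama resolutions, shows that a stereo pair costs exactly twice one 2D panorama in raw storage:

```python
# Back-of-envelope storage for a stereo panorama pair.
# The 8192x4096 resolution is an illustrative assumption, not a spec.

def panorama_bytes(width, height, bytes_per_pixel=3):
    """Raw size of one equirectangular RGB panorama in bytes."""
    return width * height * bytes_per_pixel

one_eye = panorama_bytes(8192, 4096)   # one 2D panorama, raw RGB
stereo_pair = 2 * one_eye              # the full stereo pair: merely "2x"

print(f"one eye:     {one_eye / 1e6:.0f} MB raw")
print(f"stereo pair: {stereo_pair / 1e6:.0f} MB raw")
```

Compression reduces both figures proportionally, so the "2x" relationship holds regardless of codec.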
1.4 Opportunities for Live Use on the Internet
Time Binoculars are the ultimate webcam, capable of showing actual places in
real time, and in stereoscopic 3D if one desires, but with a variety of
overlays available. They can be robotically controlled, or viewed while onsite
users are aiming the unit, or both, for example, for unique gaming
opportunities. Live two-way audio can enhance the experience as well. Time
Binoculars can also be shown inside pre-existing Earth models such as Google
Earth, displayed as perfectly aligned overlays (6).
2. Description
2.1 Hardware
Time Binoculars are permanently installed, weatherproof binocular viewers that
can be freely aimed by their users. Such
units (7) have existed for many years and are often coin-operated.
They may be panned and tilted within preset limits, depending on the landscape
to be viewed, and sometimes zoomed as well.
Time Binoculars have digital displays installed inside the unit, one for each
eye. These displays are on axis with the binocular optics, using such known
means as beam splitters, prisms, or half-silvered mirrors, in such a way that
the images shown on the displays appear to the user as superimposed over what
is seen through the binocular optics, with additional optics to accommodate
comfortable focus through known means. Additionally, filters which can
electronically change their degree of transmission, either through known
mechanical or solid state means, may be used to electronically vary the degree
of transmission coming from both the binocular optics and the digital display.
The result is that the user may see only what is seen through the binocular
optics, only what is seen from the digital display, or any combination of both.
Time Binoculars may also have aim sensors installed on the pan and tilt
mechanics of the unit, for example using known angular sensors, and if the unit
is capable of zooming, on the zoom mechanics as well, for example using known
angular or linear sensors. Additionally, aim motors may be installed to control
the pan, tilt, and zoom mechanisms electronically.
Time Binoculars may also have digital cameras installed inside the unit, one
for each eye, as well. Like the digital displays, these cameras may also be on
axis with the binocular optics, using such known means as beam splitters,
prisms, or half-silvered mirrors. Cameras may also be installed on the front of
the unit in place of actual binocular optics. Or they may be installed on the
front of the unit very close to the actual binocular optics, thus not exactly
on axis but very close, without any additional beam splitters, prisms, or
half-silvered mirrors.
In addition to the digital displays, transmission filters, aim sensors, aim
motors, and digital cameras, Time Binoculars may also incorporate microphones
and speakers or headsets, as well as various "game-style controls", including,
for example, gamepad, paddle, trackball, and joystick inputs as well as
vibration and haptic outputs. All of the input and output data may be
interfaced to a computer. The computer may be onboard the unit as well,
possibly built into the base, or in a nearby external weatherproof enclosure.
The computer is connected to a network such as the Internet.
2.2 Software
The displays receive imagery from the computer based on pre-programmed material,
on real time generation from a graphics engine, on the onboard digital cameras,
or on any combination thereof. The imagery is also based on the data from the
sensors. The computer processes the image material and the sensor data to
display imagery corresponding to the exact location and position of the actual
unit.
Generally speaking, the imagery may be represented as two 2D images, which may
be dynamic (e.g., video or animation), one for each eye, as the panning and
tilting of the whole unit approximates the panning and tilting of each display
around its nodal point. The imagery may also be rendered as two 2D images
from a 3D model, at greater computational cost but with more accuracy.
The imagery may be represented as two 2D panoramas, which may be dynamic,
covering the entire potential area that the user can pan and tilt the unit. The
sensor data can then be used to determine which sections of the 2D panoramas to
display in real time. Hence, as the user pans, tilts, and possibly zooms the
unit, the displayed imagery continues to appear as perfectly aligned overlays
with the binocular optics and with the aim toward the landscape.
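The mapping from sensor data to a panorama section can be sketched directly. The function below is a minimal illustration, assuming an equirectangular panorama (360 degrees wide, 180 degrees tall) and a square field of view; all names and conventions are assumptions, not part of any real unit:

```python
def crop_window(pan_deg, tilt_deg, fov_deg, pano_w, pano_h):
    """Map pan/tilt/zoom sensor readings to a pixel window in an
    equirectangular panorama. Returns (left, top, width, height).
    A square field of view is assumed for simplicity."""
    # Degrees-to-pixels scale for an equirectangular projection.
    px_per_deg_x = pano_w / 360.0
    px_per_deg_y = pano_h / 180.0
    # Window size follows the current field of view (zoom).
    w = int(fov_deg * px_per_deg_x)
    h = int(fov_deg * px_per_deg_y)
    # Window center: pan measured from the panorama's left edge,
    # tilt measured from the horizon (0 = level).
    cx = (pan_deg % 360) * px_per_deg_x
    cy = (90 - tilt_deg) * px_per_deg_y
    # Left may be negative, indicating wrap-around at the panorama seam.
    left = int(cx - w / 2)
    top = max(0, min(pano_h - h, int(cy - h / 2)))
    return left, top, w, h
```

As the user pans and tilts, this window slides across the stored panorama, so the displayed section stays registered with the view through the optics.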
Speakers or headsets may receive audio from the computer based on
pre-programmed material, on real time generation from an audio engine, on input
from the onboard microphones, on input from streaming audio from a network such
as the Internet, or on any combination thereof.
Game-style controls mounted on the unit may input to the computer to be used to
control the aim motors rather than manually moving the unit by hand. Vibration
or haptic output from the computer may give the user additional sensations, for
example through their hands.
Over a network such as the Internet, one or both displays may be streamed in
real time through known means. If both displays are streamed, the imagery may
be viewed stereoscopically, through a variety of known means including active,
passive, or anaglyphic glasses, or using a head-mounted or hand-held
stereoscopic display. The sensor data may also be streamed through known means,
and this data can be used to dynamically update the aim of how the imagery is
displayed, for example, as a "posed" image inside a 3D Earth
model (8). Data from remote sensors or controls may also stream
back through the network to control the aim through the aim motors on the unit.
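The streamed sensor data is small and simple to serialize. A minimal sketch of one pose message as JSON follows; the field names and unit identifier are invented for illustration, not a defined protocol:

```python
import json
import time

def pose_message(pan_deg, tilt_deg, fov_deg, unit_id="bastille-01"):
    """Encode one sensor reading as a JSON message for remote viewers.
    Field names and unit_id are illustrative, not a real protocol."""
    return json.dumps({
        "unit": unit_id,
        "timestamp": time.time(),
        "pan_deg": pan_deg,
        "tilt_deg": tilt_deg,
        "fov_deg": fov_deg,
    })

msg = pose_message(123.4, -5.0, 30.0)
decoded = json.loads(msg)
```

A remote viewer, or a 3D Earth model, would consume these messages to pose the streamed imagery in real time.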
2.3 Imagery
The imagery displayed may be from a variety of sources. One source is a
pre-recorded or pre-rendered motion picture. For example, a reenactment of the
"storming of the Bastille" can be recorded as live action or rendered as
animation or composited from both, from the exact viewpoints of actual units.
The source should cover the entire panoramic area that the unit can physically
move, though this may be "tiled" later from source material from non-panoramic
cameras through known means. From a live action production point of view,
shooting from a single viewpoint is relatively practical and economical, as the
stage and backdrops can incorporate "forced perspective" and other known
Hollywood-style tricks exploiting this single viewpoint constraint. One may
imagine multiple "numbered" units strategically installed around a structure or
along a path that tell a rich story when viewed sequentially.
The imagery may also come from the onboard cameras, either pre-recorded or in
real time. Pre-recorded data from the cameras may be used to show different
times of day, of season, and over enough time, of years. Panoramas may be
"tiled" from the material. These panoramas may use known means to either add or
subtract components from the data, for example, people walking around can be
"accumulated" or "removed."
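One known means of "removing" transient elements such as passers-by is a per-pixel temporal median over aligned frames. A minimal sketch with NumPy and a toy synthetic scene:

```python
import numpy as np

def remove_transients(frames):
    """Per-pixel temporal median over a stack of aligned frames.
    Moving objects that occupy a pixel in only a minority of frames
    are suppressed, leaving the static scene."""
    stack = np.stack(frames, axis=0)
    return np.median(stack, axis=0).astype(stack.dtype)

# Toy example: a static gray scene with a bright "walker" in a
# different position in each frame; the median recovers the empty scene.
scene = np.full((4, 4), 100, dtype=np.uint8)
frames = []
for i in range(5):
    f = scene.copy()
    f[1, i % 4] = 255          # transient pixel moves around
    frames.append(f)
background = remove_transients(frames)
```

"Accumulating" people is the complementary operation: compositing the per-frame differences from this background back into one image.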
Imagery from the onboard cameras may be digitally composited with pre-recorded
or pre-rendered imagery in such a way that binocular optics are not necessary,
but rather, where the user sees only a digital composite from the displays.
This has two advantages. First, it "normalizes" the look of the imagery to all
digital, rather than a mix of "real" (through the binocular optics) and
digital. More importantly, it allows for opaque overlays rather than
transparent ones. For example, live action imagery of citizens scaling the
walls of the Bastille would appear solid in front of the actual walls, through
known means of digital compositing, rather than as ghost-like transparent
characters.
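The opaque-overlay compositing described above reduces to masked selection between the live camera image and the pre-rendered imagery. A minimal sketch with NumPy, using a binary mask (a production matte would use fractional alpha); the data is invented for illustration:

```python
import numpy as np

def composite_opaque(live, overlay, mask):
    """Composite rendered overlay pixels over the live camera image.
    Where mask == 1 the overlay is fully opaque (a solid figure in
    front of the actual walls); where mask == 0 the live view shows."""
    if live.ndim == 3:
        mask = mask[..., None]          # broadcast over color channels
    return np.where(mask.astype(bool), overlay, live)

live = np.zeros((2, 2, 3), dtype=np.uint8)         # dark live view
overlay = np.full((2, 2, 3), 200, dtype=np.uint8)  # bright rendered figure
mask = np.array([[1, 0], [0, 0]])                  # figure covers one pixel
out = composite_opaque(live, overlay, mask)
```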
Imagery need not be representational but may be symbolic or informative. For
example, text or arrows or highlighted areas may augment the onsite imagery
from either binocular optics or onboard cameras, perfectly registered as well.
This material may also be interactively called up, for example, using onboard
controls to "double click" on components of the actual scene for information
about them.
It is noteworthy that, since the imagery may be fundamentally 2D (or more
precisely 2x 2D), it is relatively easy for a community of users to modify and
generate their own imagery, made to perfectly register with the actual
viewpoint. For example, "artists' renderings" of past or fictive moments, home
made video dramas, or "virtual graffiti" may be made and uploaded to the unit
or to an offsite server.
3. Modes of Operation
3.1 Onsite User Driven - The most common mode of operation is where the onsite
user is controlling the aim of the unit, either manually or through controls
driving the aim motors. The user is free to "look around" and may see a linear
reenactment of history or may, through the controls, change the time of day,
season, or year. In this mode, all users on the network are passive observers
(though may be able to, for example, have a real time audio dialogue with the
onsite user).
3.2 Remote User Driven - Another mode of operation is where one or more remote
users control the aim of the unit by controlling the aim motors. This may occur
when no onsite users are present, or indeed, the unit may be purely robotic and
made only for remote users.
3.3 Combination Driven - Through common controls and possibly live two-way
audio, a unique interaction may occur between onsite and remote users. For
example, it may be between "locals" and "virtual tourists." Or it may be for
some yet-to-be-invented game.
3.4 Auto Driven - If the unit has onboard aim motors, several automatic
functions may be useful. For example, the unit may make regular "sweeps" of the
viewpoint, panning and tilting through its entire range, for the purpose of
making near-real-time panoramas, similar to the GigaPan system (9).
Or it may use known computer vision techniques to follow action, in order to
accumulate "most interesting" video clips (or it may simply follow pre-scripted
narrative action that users are obliged to watch). Such uses emphasize that,
when the unit is used as a recording device over, theoretically, vast epochs
of time (10), a unique and powerful visual database may be built.
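The automated "sweep" in 3.4 can be sketched as a schedule of aim positions that tile the unit's angular range with enough overlap for later stitching. All parameters here are illustrative assumptions:

```python
def sweep_schedule(pan_range, tilt_range, fov_deg, overlap=0.2):
    """Generate (pan, tilt) aim positions that tile the unit's full
    angular range for an automated panorama sweep, GigaPan-style.
    The step size leaves `overlap` fractional overlap between
    adjacent shots for stitching. Parameters are illustrative."""
    step = fov_deg * (1 - overlap)
    positions = []
    tilt = tilt_range[0]
    while tilt <= tilt_range[1]:
        pan = pan_range[0]
        while pan <= pan_range[1]:
            positions.append((round(pan, 2), round(tilt, 2)))
            pan += step
        tilt += step
    return positions

# A unit limited to 90 degrees of pan and 30 degrees of tilt, 20-degree FOV:
plan = sweep_schedule((0, 90), (0, 30), 20)
```

The aim motors would visit each position in turn, capturing a frame at each, and the frames would be tiled into a near-real-time panorama through known means.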
4. Additional Enhancements and Features
4.1 The unit may be binocular, monocular, or multiscopic (more than two viewpoints).
4.2 Using
computer vision techniques, the angular orientation of the unit and the degree
of zoom may be automatically derived from the camera data.
4.3 The unit may have a nearby display for several people to watch simultaneously.
4.4 The unit may have an alert ("hot now") input (11) triggered by either
onsite or remote users when they see something of current interest, which can,
over time, accumulate in a visual database.
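The vision-based orientation idea in 4.2 can be illustrated with a deliberately crude technique: cross-correlating column-intensity profiles of two camera frames to estimate the horizontal (pan) offset in pixels. This is a stand-in for full feature matching, shown here on synthetic data:

```python
import numpy as np

def estimate_pan_shift(ref_frame, new_frame):
    """Estimate the horizontal (pan) offset in pixels between two
    camera frames by cross-correlating their column-intensity
    profiles. A crude stand-in for feature matching; illustrative."""
    ref = ref_frame.mean(axis=0) - ref_frame.mean()
    new = new_frame.mean(axis=0) - new_frame.mean()
    corr = np.correlate(new, ref, mode="full")
    return int(np.argmax(corr)) - (len(ref) - 1)

# Toy check: shift a synthetic frame 3 columns to the right.
rng = np.random.default_rng(0)
frame = rng.random((20, 64))
shifted = np.roll(frame, 3, axis=1)
shift = estimate_pan_shift(frame, shifted)
```

Dividing the pixel offset by the camera's pixels-per-degree would yield the pan angle without any mechanical sensor; zoom could similarly be derived from the scale change between frames.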
References
(1) http://www.toweropticalco.com/gallery.html
(2) http://www.unwto.org/facts/menu.html
(3) http://www.thirdview.org/3v/rephotos/index.html
(4) http://www.naimark.net/projects/displacements/displ_v2005.html
(5) http://www.naimark.net/projects/bnh3.html
(6) http://interactive.usc.edu/viewfinder/
(7) Google search for [ "coin operated" binoculars ]
(8) http://interactive.usc.edu/viewfinder/
(9) http://www.gigapansystems.com/
(10) http://www.naimark.net/projects/bigprojects/hereforevercam.pdf
(11) http://stevens.usc.edu/IP/3995