Bell MyLifeBits Talks February 2003
MyLifeBits:
Attempting to realize the Memex Vision
Gordon Bell
February 2003
http://research.microsoft.com/barc/MediaPresence/MyLifeBits.aspx
With Jim Gemmell & Roger Lueder
1
Outline … MyLifeBits
Background…fulfilling the Memex vision
Cyberizing everything
File to database transition
Use…beyond search
Long-term agenda and outlook
2
Memex
Posited by Vannevar Bush in “As We May Think”
The Atlantic Monthly, July 1945
“A memex is a device in which an individual stores all
his books, records, and communications, and which
is mechanized so that it may be consulted with
exceeding speed and flexibility”
Supports: Annotations, links between documents, and
“trails” through the documents
“yet if the user inserted 5000 pages of material a day it
would take him hundreds of years to fill the
repository, so that he can be profligate and enter
material freely”
3
Sketch of memex
4
Bush’s camera on the head
Capturing what you see
6
Memory Overload
As hard drives get bigger and cheaper,
we're storing way too much.
By Jim Lewis
There's a famous allegory about a map of the
world that grows in detail until every point in
reality has its counterpoint on paper; the twist
being that such a map is at once ideally accurate
and entirely useless, since it's the same size as
the thing it's meant to represent.
7
"The PC is going to be the place where you store the
information and really the center of control" Billg, 1/7/2001
MyLifeBits is a project to “cyberize” everything!
What?
Recall of all articles, books, CDs, photos, video, communication (e.g. mail, phone), web
Why? …“because we can”
Office: communicate, store, & work
Home & Media Center: ambiance & entertainment
Immortality for progeny. Memory aids
Goal: to understand the 1 TByte PC c2006: need, utility, cost, feasibility and tools.
8
Knowledge worker scenarios
Gordon: Researcher, consumer, computer system tester, nerd wanna-be, and average man
Melissa: middle manager
Patrick: Consultant
Nicholas: Analyst
Sondra: Office manager
9
The guinea pig
Gordon Bell is digitizing his life
Has now scanned virtually all:
Books written (and read when possible)
Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
Photos
Posters, paintings, photos of things (artifacts, …medals, plaques)
Home movies and videos
CD collection
And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages… conversations?
Paperless throughout 2002. 12” scanned, 12’ discarded.
Only 30 GB!!!
10
Capture and encoding
11
I mean everything
12
Input: tools, time, and cost
Scanners:
HP Digital Sender, flat beds with ADF, 2HP photo, faxing. (Duplex, color, feed-thru, etc.)
A good commercial scanner costs 2K-10K
Photos: $1 or 0.5-5 min.
Large posters: ~ 1-5 hr.
Artifacts: ~ 10 min. including photo
Scanning to TIF, PDF:
than hierarchy)
2. Visualizations for search, display, insight
3. Annotations and links add value and are essential
4.
Increase searchability and value of information.
So make many kinds and make them easy to create!
Stories are the ultimate annotation
Keep the links when you author: “transclusion”
19
MLB database: size and content?
Database features are essential: Consistency, Indexing,
Pivoting, Queries, Speed/scalability, Backup, replication.
Folders & Files were the starting point >> database into sets, aka “collections”, that are identical to the folder structure
Outlook (msgs, attachments, calendar, contacts)
Web trails including voice message annotation
Journal (Outlook), trails: every document use & transaction
What about?
Money (transactions, payees, etc.)…is their lifelog/trail
Streets and trips to cross-index to all docs
Attributes for photos for retrieval? Location, time, settings
Presentations as a report or trail. Each slide an object!
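The step from folders to collections on this slide can be sketched with a small relational model. This is a minimal illustration and not the MyLifeBits schema: the table names (`item`, `collection`, `membership`, `annotation`), the file name `lunch.jpg`, and the sample collections are all assumptions made for the example. It shows one item belonging to several collections, which a folder hierarchy cannot express without copies, and a query over annotations.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables (not the real MLB schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, created TEXT);
        CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT);
        -- many-to-many: an item may sit in any number of collections
        CREATE TABLE membership (
            item_id INTEGER REFERENCES item(id),
            collection_id INTEGER REFERENCES collection(id),
            PRIMARY KEY (item_id, collection_id)
        );
        CREATE TABLE annotation (
            item_id INTEGER REFERENCES item(id),
            text TEXT
        );
    """)
    # Sample rows are invented for illustration.
    db.execute("INSERT INTO item VALUES (1, 'lunch.jpg', '2000-07-07')")
    db.executemany("INSERT INTO collection VALUES (?, ?)",
                   [(1, "Photos"), (2, "BARC"), (3, "Interns")])
    db.executemany("INSERT INTO membership VALUES (?, ?)",
                   [(1, 1), (1, 2), (1, 3)])
    db.execute("INSERT INTO annotation VALUES (1, 'BARC dim sum intern farewell Lunch')")
    db.commit()
    return db


def collections_of(db, item_name):
    """Return the names of every collection that contains the named item."""
    rows = db.execute("""
        SELECT c.name FROM collection c
        JOIN membership m ON m.collection_id = c.id
        JOIN item i ON i.id = m.item_id
        WHERE i.name = ?
        ORDER BY c.name
    """, (item_name,))
    return [r[0] for r in rows]


def search_annotations(db, word):
    """Return the names of items whose annotation text contains the given word."""
    rows = db.execute("""
        SELECT i.name FROM item i
        JOIN annotation a ON a.item_id = i.id
        WHERE a.text LIKE ?
        ORDER BY i.name
    """, (f"%{word}%",))
    return [r[0] for r in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(collections_of(db, "lunch.jpg"))
    print(search_annotations(db, "dim sum"))
```

The membership table is the design point: the same photo appears under three collections without being duplicated, and the annotation becomes a search handle.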
20
21
Media center 2
22
[Diagram: media center wiring. Labels: Media Center Computer, Kbd Mse, Receiver, Plasma Panel, Set top, Cable/Satellite, Ethernet, DVD, CD, Cassette, VCR, SVHS-wide, Camera, Mic, Wfr, 5 speakers. Links marked: stereo, 5.1 digital, comp., Video*, IR, Legacy, Redundant.]
*Video = composite or S-video
Cables/links
Speaker: 5+1
Plasma: 2 or 3
Cable/Enet: 2
IR: 8
Stereo: 4
5.1 digital: 2
Comp./S-video: 3
Plasma panel: 1
Power: 10
Kbd/mse: 2
Monitor II (opt.)
Camera: 2
Total: 42 – 46
Things: 18+remot
23
Photos
24
Caneel Bay Vacation Jan. 1998
Gordon, Gwen, Brig, Pam,
Fiona, Bob, Laura and Kolbe
25
Searching: the most useful app?
Challenge:
What questions for useful results?
Lots of ways to look at what you retrieve
Need for breaking the returns into segments
Searching for an indexer and search engine: index service, Enfish, dtSearch
Stuff I’ve Seen: MSR’s index & search… evolving in the right direction.
Productizing Longhorn would remove the pressure for
26
27
28
29
Detail view
30
Resource explorer
Ancestor (collections), annotations, descendant & preview panes turned on
31
Interface to xls
32
33
Statistics of use
34
Synchronized timelines with histogram guide
35
Visualization
Browsing & searching. “Get me what I want|need!”
Help the user find things among possible items versus waiting for an ideal system that can find “what I want”
Publication: Conventional & web, presentations, etc.
Helps understand the nature of the content e.g. histogram of objects in time
Context: Links to help understand the relationship between objects. Provides more search handles.
Information density: what is it? What is its relationship to others?
Content important. Flash and form, less useful.
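The "histogram of objects in time" mentioned on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `histogram_by_year` and `render` are hypothetical, the sample dates are invented, and bucketing by calendar year is one choice among many (month or day buckets work the same way).

```python
from collections import Counter
from datetime import date


def histogram_by_year(dates):
    """Count objects per calendar year, returned in chronological order."""
    counts = Counter(d.year for d in dates)
    return dict(sorted(counts.items()))


def render(hist, width=40):
    """Draw a plain-text bar chart; the longest bar is `width` characters."""
    if not hist:
        return ""
    peak = max(hist.values())
    lines = []
    for year, n in hist.items():
        bar = "#" * max(1, round(n * width / peak))
        lines.append(f"{year} {bar} {n}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented creation dates standing in for items in the store.
    sample = [date(1998, 1, 5), date(2000, 7, 7), date(2000, 8, 1)]
    print(render(histogram_by_year(sample), width=20))
```

A view like this says something about the nature of the content (where the bulk of the items sit in time) before any search is run.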
36
Value of media depends on annotations
“It’s just bits until it is annotated”
37
System annotations provide base level of value
Date 7/7/2000
38
Tracking usage – even better
Date 7/7/2000. Opened 30 times, emailed to 10 people (it’s valued by the user!)
39
Getting the user to say a little something is a big jump
Date 7/7/2000. Opened 30 times, emailed to 10 people. “BARC dim sum intern farewell Lunch”
40
Getting the user to tell a story is the
ultimate in media value
A story is a “layout” in time and space
Most valuable content (by selection, and by being well annotated)
Stories must include links to any media they use (for future navigation/search – “transclusion”).
Cf: MovieMaker; Creative Memories PhotoAlbums
Dapeng was an intern at BARC for the summer of 2000
We took him to lunch at our favorite Dim Sum place to say farewell
At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim
41
Value of media depends on annotations
“It’s just bits until it is annotated”
Auto-annotate whenever possible e.g. GPS cameras
Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc.
Support gang annotation
Make stories easy
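The annotation levels in the preceding slides (system date, usage tracking, a short caption, a story) can be pictured as fields on one record. The sketch below is an assumption made for illustration: the class `MediaItem`, its field names, and the level labels are hypothetical and not taken from MyLifeBits.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class MediaItem:
    """Hypothetical record; field names are illustrative only."""
    name: str
    created: date                     # system annotation: always available
    times_opened: int = 0             # usage tracking
    emailed_to: int = 0               # usage tracking
    caption: Optional[str] = None     # the user says "a little something"
    story_links: List[str] = field(default_factory=list)  # stories that link to this item

    def annotation_levels(self) -> List[str]:
        """List which levels of annotation this item has, lowest first."""
        levels = ["system"]
        if self.times_opened or self.emailed_to:
            levels.append("usage")
        if self.caption:
            levels.append("caption")
        if self.story_links:
            levels.append("story")
        return levels


if __name__ == "__main__":
    photo = MediaItem("lunch.jpg", date(2000, 7, 7))  # file name is invented
    photo.times_opened, photo.emailed_to = 30, 10
    photo.caption = "BARC dim sum intern farewell Lunch"
    print(photo.annotation_levels())
```

Keeping the story as a list of links on the item, rather than a copy of the item inside the story, is one way to preserve the "transclusion" idea from the slides.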
42
43
The Agenda for the Tbyte(s), Lifetime, PC:
The killer app after office and mail.
1. Guarantee that data will live forever! “dear appy” problem
2. Cheap, easy, and data-rich (e.g. time, place) capture:
   GPS and time everywhere
   Paper capture has to be as easy as discard (scanner/shredder)
   E-book…e-magazines & journals need to have critical mass!
   Telephony and audio capture with indexing
   Media Center compatible for entertainment (photos, video, TV, radio)
3. One? dbase for all books, conversations, mail, web pages … vs. long-term use of hierarchical files. Is dbase intuitive?
4. Annotations/meta-information add ever-increasing value
   Ease of annotation because it aids search and becomes the content
   Content analysis (critical for photo & video!)
5. Information control: privacy, security, expunge/deniability,…
6. New “killer apps”: Alzheimer’s, immortality, surrogate memory?
7. Any GUI to improve use (e.g. time to learn, use, retention)
45
The End
46
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me,
Lost and forgotten data
Who’s responsible?
media
platform, file, and databases
evolving standards and formats
evolving and/or disappearing apps
47
Digitizing our lives
Right now, it is affordable to buy 100 GB/year
In 5 years 1 TB/year is affordable!
It’s hard to fill a terabyte/year just by keeping what you see or hear, but you can:
Look at 9800 pictures a day (300 KB JPEGs)
Read 2900 documents a day (1 MB files)
Listen to audio or view compressed video 24 hours/day (it takes more than 256 kb/s to fill a TB in a year)
Watch 1.5 Mb/s video 4 hours each day.
As Bush said, we can “be profligate and enter material freely”
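The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 KB = 10^3 bytes, 1 TB = 10^12 bytes) and a 365-day year, neither of which the slide states; the function names are hypothetical.

```python
DAYS_PER_YEAR = 365   # assumption: non-leap year
TB = 10**12           # assumption: decimal terabyte


def bytes_per_year_from_files(files_per_day, bytes_per_file):
    """Yearly volume when a fixed number of same-sized files is kept each day."""
    return files_per_day * bytes_per_file * DAYS_PER_YEAR


def bytes_per_year_from_stream(bits_per_second, hours_per_day):
    """Yearly volume of a constant-bit-rate stream recorded some hours per day."""
    bytes_per_second = bits_per_second / 8
    return bytes_per_second * hours_per_day * 3600 * DAYS_PER_YEAR


pictures = bytes_per_year_from_files(9800, 300 * 10**3)   # 300 KB JPEGs
documents = bytes_per_year_from_files(2900, 10**6)        # 1 MB files
audio = bytes_per_year_from_stream(256 * 10**3, 24)       # 256 kb/s, all day
video = bytes_per_year_from_stream(1.5 * 10**6, 4)        # 1.5 Mb/s, 4 h/day

if __name__ == "__main__":
    for label, total in [("pictures", pictures), ("documents", documents),
                         ("audio", audio), ("video", video)]:
        print(f"{label:10s} {total / TB:.3f} TB/year")
```

Under these assumptions the four cases come to roughly 1.07, 1.06, 1.01 and 0.99 TB per year, so each of the slide's examples lands within about 10% of one terabyte.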
48