Discussion:
Multimedia User Interfaces
a***@hotmail.com
2011-08-31 20:06:24 UTC
microsoft.public.windowsmedia.player,

Greetings. I would like to describe some new multimedia user interface
features including spatiotemporal content selection, bookmarking,
spatiotemporal zooming, structure-based navigation, and client-side
text-based search into multimedia objects. This post briefly
introduces those ideas and describes an illustrative usage scenario,
video blogging.

1. Spatiotemporal Content Selection

Spatiotemporal content selection means spatial multimedia selection,
temporal multimedia selection, or both. Spatial selection here means
indicating a rectangular region on a video rendering surface. Temporal
selection means selecting a temporal interval of the multimedia,
perhaps on the timeline along which the playhead moves. Users can
gesture with keyboard, mouse, multitouch, voice or NUI to indicate
spatial regions and temporal intervals of multimedia objects. Spatial,
temporal and spatiotemporal regions of multimedia content can be
identified by Media Fragments URIs (http://www.w3.org/TR/media-frags).
Extensible context menus are envisioned for spatial, temporal and
spatiotemporal selections.
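
As a minimal sketch of how a selection might be turned into such a URI,
assuming plain seconds and pixel units (the Selection type and the
to_fragment_uri function are hypothetical names for illustration, not
part of any player API):

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Selection:
    """Hypothetical selection: a rectangle, an interval, or both."""
    rect: Optional[Tuple[int, int, int, int]] = None  # x, y, width, height (pixels)
    interval: Optional[Tuple[float, float]] = None    # start, end (seconds)


def _seconds(value: float) -> str:
    # Write whole seconds without a trailing ".0", so 10.0 becomes "10".
    number = float(value)
    return str(int(number)) if number.is_integer() else str(number)


def to_fragment_uri(base_uri: str, selection: Selection) -> str:
    """Append xywh= and/or t= dimensions to base_uri as a fragment."""
    parts = []
    if selection.rect is not None:
        parts.append("xywh=" + ",".join(str(v) for v in selection.rect))
    if selection.interval is not None:
        start, end = selection.interval
        parts.append("t=" + _seconds(start) + "," + _seconds(end))
    if not parts:
        return base_uri  # nothing selected: the URI names the whole object
    return base_uri + "#" + "&".join(parts)
```

For example, a selection with rect=(160, 120, 320, 240) and
interval=(10, 20) on http://example.com/video.avi yields
http://example.com/video.avi#xywh=160,120,320,240&t=10,20.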

2. Bookmarking

Users can gesture to place a bookmark, or point of interest, at a
point in a video. A bookmarking gesture is envisioned to place a
bookmark object at the position of the playhead at the moment the user
gestures. After watching part or all of a multimedia object, users can
return to their bookmarks to, for example, select temporal intervals
around each such point. Extensible context menus are envisioned for
bookmark points.
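
One way this could look in code, as a sketch only: the BookmarkList
class, its method names and the five-second default margins are all
assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BookmarkList:
    """Hypothetical store of bookmark points for one multimedia object."""
    duration: float                                  # media length in seconds
    points: List[float] = field(default_factory=list)

    def add(self, playhead: float) -> None:
        """Record the playhead position (seconds) at the moment of the gesture."""
        clamped = min(max(playhead, 0.0), self.duration)
        self.points.append(clamped)
        self.points.sort()  # keep bookmarks in timeline order

    def interval_around(self, point: float, before: float = 5.0,
                        after: float = 5.0) -> Tuple[float, float]:
        """Return a (start, end) interval around a bookmark, clamped to the media."""
        return (max(0.0, point - before), min(self.duration, point + after))
```

Playback never has to pause: add() is called while the video keeps
running, and interval_around() is used afterwards to propose a
temporal selection that the user can then adjust.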

3. Spatiotemporal Zooming

With spatiotemporal zooming, users can zoom out from a media fragment
to a containing spatial region and/or temporal interval, or to the
whole multimedia object that contains the fragment. Spatiotemporal
zooming can make use of tracks that accompany a video, for example
zooming from a search result fragment to the chapter of the multimedia
object that contains it.
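
A minimal sketch of the temporal case, assuming the chapter track has
already been read into (name, start, end) tuples in seconds; the
zoom_out function and that tuple layout are hypothetical:

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Chapter = Tuple[str, float, float]  # name, start, end (seconds)


def zoom_out(fragment: Interval, chapters: List[Chapter],
             duration: float) -> Tuple[str, Interval]:
    """Return the smallest chapter fully containing fragment.

    If no chapter contains the fragment (for instance it straddles a
    chapter boundary), fall back to the whole object with an empty name.
    """
    start, end = fragment
    containing = [c for c in chapters if c[1] <= start and end <= c[2]]
    if not containing:
        return ("", (0.0, duration))
    name, c_start, c_end = min(containing, key=lambda c: c[2] - c[1])
    return (name, (c_start, c_end))
```

Choosing the smallest containing chapter makes each zoom step as small
as possible; zooming again from the result leads to the whole object.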

4. Structure-Based Navigation

Beyond sequences of chapters, outlines or tree-based structures are
possible for multimedia objects. With tracks that carry such
structure, the buttons for chapter traversal could have menus
indicating the several traversal options available from the current
playhead position. Spatiotemporal zooming can combine with
structure-based navigation to allow users to zoom from a media
fragment to the structural elements of the multimedia object that
contain it. For example, a structural model could include books,
parts, chapters, pages, paragraphs and sentences. From a media
fragment, a user could zoom to a containing structural element and
then navigate by means of those structural elements, based upon the
particular structural model specified in a track.
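
To make the tree idea concrete, here is a sketch under the assumption
that a structural track has been loaded into nested nodes with start
and end times in seconds. The Node type and both functions are
hypothetical; no existing track format is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """Hypothetical structural element, e.g. kind="chapter"."""
    kind: str
    ident: str
    start: float
    end: float
    children: List["Node"] = field(default_factory=list)


def path_to(node: Node, time: float) -> List[Node]:
    """Return the chain of elements containing time, outermost first."""
    if not (node.start <= time < node.end):
        return []
    for child in node.children:
        sub = path_to(child, time)
        if sub:
            return [node] + sub
    return [node]


def _walk(node: Node) -> Iterator[Node]:
    # Depth-first traversal over every element in the tree.
    yield node
    for child in node.children:
        yield from _walk(child)


def next_options(root: Node, time: float) -> Dict[str, Node]:
    """For each kind, return the earliest element starting after time."""
    options: Dict[str, Node] = {}
    for node in _walk(root):
        if node.start > time:
            best = options.get(node.kind)
            if best is None or node.start < best.start:
                options[node.kind] = node
    return options
```

path_to() gives the zoom targets for a playhead position, and
next_options() gives the entries a "next" button menu could offer,
one per structural level.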

5. Client-side Text-based Search

By making use of tracks that accompany a multimedia object or of
client-side audio/video indexing and search, client-side text-based
search into documents can include the option of searching into
multimedia objects. Search results can be indicated by highlighted
portions on the timeline or otherwise visually indicated, perhaps in
the manner of bookmarks. Finding occurrences of a text string in a
document can thereby extend into the multimedia objects that the
document contains.
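
For the track-based case, a sketch might look like the following,
assuming captions or a transcript are already available as
(start, end, text) cues in seconds. The search_cues function and the
cue layout are assumptions; audio indexing is not shown.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # start, end (seconds), caption text


def search_cues(cues: List[Cue], query: str) -> List[Tuple[float, float]]:
    """Return the intervals of all cues whose text contains query.

    Matching is case-insensitive. An empty query matches nothing, so
    the timeline is not highlighted end to end by accident.
    """
    needle = query.casefold()
    if not needle:
        return []
    return [(start, end) for start, end, text in cues
            if needle in text.casefold()]
```

The returned intervals are what a player could highlight on the
timeline, and each one can also be expressed as a temporal media
fragment.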

6. Usage Scenario: Video Blogging

Video blogging is an illustrative usage scenario for the above
multimedia user interface features. A video blogger uses a multimedia
search engine. Video fragments are indicated in the search results.
The user watches a search result media fragment and decides that they
are interested in seeing the entire video blog article it came from.
The user zooms to a containing section of the video blog article or to
the entire article. As the user watches the other video blogger's
video, they use bookmarking to place points of interest for later use.
After watching the video blog, the user makes temporal selections
around those bookmarked points, perhaps making use of the structural
data in one or more tracks of the video. The user then uses the
extensible context menus to send the selected clips to video authoring
software and composes a video blog article with clips from one or more
multimedia objects. By using Media Fragments URI hyperlinks, users can
also tweet about spatiotemporal selections of multimedia.

Other usage scenarios for the new multimedia user interface features
include making use of video from political speeches, news, punditry,
arts and entertainment, civil discourse, and arbitrary multimedia
content, for example when tweeting, blogging or video blogging.



Kind regards,

Adam Sobieski
a***@hotmail.com
2011-09-01 23:17:44 UTC
Regarding those multimedia user interface ideas, here are some
examples to clarify.

Regarding point one, a spatial selection selects a rectangle of a
video. By itself, a spatial selection is a subrectangle that applies
for the multimedia object's entire duration:

http://example.com/video.avi#xywh=160,120,320,240
(http://www.w3.org/TR/media-frags/#naming-space)

A temporal selection selects, perhaps by means of the timeline, a
portion or interval of a multimedia object. By itself, a temporal
selection covers the video's entire frame:

http://example.com/video.avi#t=10,20
(http://www.w3.org/TR/media-frags/#naming-time)

Combining those, selecting a rectangle of the multimedia object and an
interval of it, simultaneously, is a spatiotemporal selection:

http://example.com/video.avi#xywh=160,120,320,240&t=10,20
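
As a sketch of how a player might read such URIs back into a
selection: the parse_fragment function is a hypothetical helper that
handles only the plain forms used in these examples (seconds for t,
pixels for xywh, and id). The specification allows more forms, such as
clock times and percentages, which are not handled here, and malformed
numbers raise ValueError.

```python
from typing import Dict
from urllib.parse import urlsplit


def parse_fragment(uri: str) -> Dict[str, object]:
    """Extract the t, xywh and id dimensions from a media fragment URI."""
    result: Dict[str, object] = {}
    for pair in urlsplit(uri).fragment.split("&"):
        name, sep, value = pair.partition("=")
        if not sep:
            continue  # no "name=value" pair, e.g. an empty fragment
        if name == "t":
            # "t=10,20" is an interval; "t=10" leaves the end open (None).
            start, _, end = value.partition(",")
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif name == "xywh":
            numbers = tuple(int(v) for v in value.split(","))
            if len(numbers) == 4:  # x, y, width, height
                result["xywh"] = numbers
        elif name == "id":
            result["id"] = value
    return result
```

A result with only "t" is a temporal selection, one with only "xywh"
is a spatial selection, and one with both is spatiotemporal.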

Point two, or bookmarking, is about placing points of interest on a
multimedia object's timeline, for later use, without having to pause
the multimedia user experience.

Point three, given the URIs for spatial, temporal and spatiotemporal
media fragments, is about starting from one of those, as per
<video src="http://example.com/video.avi#xywh=160,120,320,240&t=10,20"/>,
and being able to navigate to larger rectangles, wider intervals,
both, or to the whole video.avi object. Zooming also includes going
from a media fragment, such as http://example.com/video.avi#t=10,20,
to a containing structural element, for example
http://example.com/video.avi#id=chapter-1.

Point four: videos also have tracks, and such tracks can carry
structures beyond lists of chapters, such as books, parts, chapters,
pages, paragraphs and sentences. It is possible to select a structural
element of a video:

http://example.com/video.avi#id=chapter-1
(http://www.w3.org/TR/media-frags/#naming-name)

With such structural tracks, people can traverse multimedia objects in
structure-based ways, for example from a point in a multimedia object
to http://example.com/video.avi#id=part-2.

The fifth idea is about client-side text searching into videos. Many
document viewers and web browsers provide searching into documents for
text occurrences, and that functionality could be extended into the
multimedia objects in those documents. Client-side text-based
multimedia search can be facilitated by processing the tracks that
accompany multimedia objects, such as transcripts or captions, and by
audio and natural language processing techniques.



Kind regards,

Adam Sobieski
a***@hotmail.com
2011-09-03 02:48:01 UTC
For convenience, after some discussion, the following summaries have
emerged:

1. Selecting rectangles of video and intervals on video timelines.
Selecting crop regions and timespans. Those selections can have
context menus on them.

2. Bookmarking. Placing points on the timeline of multimedia objects
while watching them, to make use of those bookmarked points later.
After indicating bookmark points, selections can then be made around
or near those bookmark points.

3. Selections of multimedia objects, that is media fragments or clips,
can be described by Media Fragments URIs
(http://www.w3.org/TR/media-frags). A spatial selection, as per a
rectangle,
http://example.com/video.avi#xywh=160,120,320,240
, a temporal selection, as per an interval of the timeline,
http://example.com/video.avi#t=10,20
, and a combination or spatiotemporal selection,
http://example.com/video.avi#xywh=160,120,320,240&t=10,20
, can each be identified by a URI. Navigating from those to larger
rectangles or wider intervals is the zooming idea of point 3.

4. Videos could, in the future, contain more structure than lists of
chapters. User interface ideas include being able to navigate through
videos that have more structure than just chapters. For example, a
video might include a track that describes books, parts, chapters,
pages, paragraphs and sentences.

5. By making use of tracks that accompany a multimedia object or of
client-side audio/video indexing and search, client-side text-based
search into multimedia is possible. Users can find text occurrences in
videos and navigate to them.



Kind regards,

Adam Sobieski
