Supplemental Results for TMM

On this page we present supplemental results for the paper we submitted to the ICME Special Issue of TMM. You will need a Flash player to play the videos, and you can click on an image to view a full-resolution version.

Section 1: Results on Selected Viewpoints and Cameras under Different Display Resolutions
a. Plotted Graph
b. Corresponding Videos (*The following video is shown at close to its intended display resolution. The quality of this Flash version is limited by the bandwidth constraints of smooth online streaming. Please focus on the content within the selected viewpoint rather than on the video quality itself. Click here for a high-quality video.)
Demo video
Section 2: Results on Generated Summaries under Different User Preferences
In this part, we give some results on generated summaries to demonstrate the personalization capabilities of our system. In each plot, we show the selected clips, the map of events and the annotated dominant players.
The following four summaries were generated by setting the display resolution to 480x360 and the summary length to 2 min 30 sec. (*Click on the graph for a bigger view.)

Part A: Results on different story-telling patterns, i.e., more or fewer replays and more or fewer close-up views;
1) For the first graph, we asked for a story-telling pattern with few replays and close-up views;
2) For the second graph, we asked for a story-telling pattern with more replays and close-up views;
These two experiments (1 & 2) show that our system can provide summaries in different story-telling styles;
Favor a concise summary with fewer close-up views/replays | Favor a detailed summary with more close-up views/replays
Demo video Demo video

Part B: Results of personalized summaries focusing on user-preferred players;
3) For the third graph, we generate a summary that focuses on player A9 (No. 9 of the yellow team);
4) For the fourth graph, we generate a summary that focuses on player B9 (No. 9 of the black team);
These two experiments (3 & 4) show that our system can provide summaries that focus on a user's favorite player;
Favor a summary showing more actions by Player A9 | Favor a summary showing more actions by Player B9
Demo video Demo video
Part C: Summaries of different user-specified lengths.
As we can see from the two demo videos below, the story does not expand linearly with the increasing user-specified length. Instead, the summaries expand the story by following the predefined rules of story organization, e.g., the constraints on replay insertion.
Favor a summary of length 3 Minutes 30 Seconds | Favor a summary of length 4 Minutes 30 Seconds
Demo video Demo video
Section 3: Subjective Evaluation Results
Subjective evaluations have been performed separately in [13] and [36], relating to viewpoint selection and summarization, respectively. The produced results were evaluated in terms of both their global impression and their visual/story-telling artifacts, and the efficiency of each corresponding method was confirmed. The main contribution of the paper lies in integrating our video analysis and production/summarization components into a fully automatic framework. Therefore, new subjective evaluations are performed here to address: 1) the relevance of our personalized video production concept; 2) the efficiency of our proposed implementation in achieving this personalized production.
Part A: Subjective Evaluation on The Relevance of Our Production Concept
We interviewed 17 people for their opinions about our personalized production concept. The panel is a representative set of users, since it includes 4 content production professionals, 3 sport professionals, 6 basketball fans, and 4 computer vision experts. In the following questionnaire, we considered several content creation scenarios [full game production for VoD, summary generation, interactive browsing, and access to game/player statistics], and asked the interviewees to write down what they would expect from a personalized service in each application scenario. We also prepared a list of personalization criteria, including many game actions, statistics and audio/graphic elements, and asked them to evaluate the importance of each criterion.
Download the PDF file of the questionnaire we used for collecting users' feedback on our concept
The users' understanding of, and feeling about, the personalized summarization concept was evaluated before and after they played with our summarization prototype (see Section 5 of this web page).
Hence, we summarize their feedback in two groups.
The a priori opinions collected before playing with the prototype are summarized as follows:
• All application scenarios envisioned in the questionnaire (production, summarization, browsing, statistical analysis) are considered to be relevant to at least one kind of end-user;
• The level of interest of a given user towards a specific scenario depends on his/her professional background. Coaches and players are especially interested in statistics and browsing capabilities. Fans are interested in generic and personalized summaries. Content production professionals are primarily interested in raw content provisioning at low cost, e.g. for VoD services or to feed the manual construction of summaries.
• When considering the personalization/browsing criteria, it appears that recognition of both the action and its associated players is important. Classification of structured actions, i.e., selection of actions for which a given player receives the ball in a particular place, only interests coaches. Zoomed-in views are of little importance to sport coaches and players, and only interest content producers, if sufficient resolution can be preserved.
After having played with the prototype, the following additional conclusions could be drawn:
• The personalization criteria currently implemented in our summarization test-bed are considered relevant. However, the effectiveness of their implementation is generally rated as medium or even low for replays and resolution. Based on the users' oral feedback, we conclude that this is probably because (1) the image quality degrades considerably when zooming into the picture, and (2) replays are not properly inserted in the narrative flow: they are played very fast, and do not always provide a different point of view on the action.
• Additional personalization criteria have been pointed out by the users, including the opportunity to select a time period of the game, and the period of the game during which one specific player was on the field. Sport professionals pointed out the fact that all criteria listed in the questionnaire were relevant. Augmenting a top view with the label of each player would also be useful.
• Audio support and graphic elements (score, timer, etc.) are considered fundamental components of the summary. The absence of those components in our test-bed has often been pointed out as one of its main drawbacks.
• Most of the interviewees believe that consumers would be ready to pay for such personalized services. Fans consider that the services should be part of a provider package. Content distribution professionals would be ready to pay a fee to access high-quality contents. Sports professionals (clubs, managers, coaches, etc) tolerate lower quality content, but request personalized access mechanisms.
Part B: Subjective Evaluation on The Effectiveness of Our Implementation
Besides collecting qualitative impressions of the prototype, we also performed a standalone subjective evaluation via “mean opinion scores” to provide some quantitative results. We prepared the following webpage, which presents 5 groups of videos generated under different user preferences. Viewers were asked to score the relevance of several personalization criteria, and also the effectiveness of our implementation in personalizing the video with respect to each investigated criterion. Both notions were scored using four ranks, i.e., “Very High” (=4), “High” (=3), “Low” (=2) and “Very Low” (=1). Five criteria were selected, i.e., “Display Resolution”, “Replay Insertion”, “Preferred Event”, “Preferred Player” and “Summary Duration”.
Result Page for Subjective Evaluation
*Note that the main purpose of this subjective evaluation is to investigate the relevance of the proposed system in personalizing video summaries for satisfying various user preferences. In the second evaluation, the viewers were asked to focus on the relevance of video contents to the pre-specified user preferences, rather than the video quality itself.
We collected answers from 20 persons, and plotted the mean scores and their standard deviations in the following figure.
In the top sub-graph of the above figure, we present the relevance of the five factors as personalization criteria, while in the bottom sub-graph, we show how users rate the effectiveness of our implementation. We make the following major observations:
• As an overall result, all five factors are regarded as highly relevant to personalized production, and our method is regarded as effective at personalizing the video with respect to these factors.
• “Preferred Event” and “Preferred Player” are rated as the two most important personalization criteria, which coincides with the conventional understanding of personalized video summaries. “Display Resolution” is introduced by multi-view video production, and was less often discussed in single-view video summarization. “Replay Insertion” runs against the conventional understanding of summarization as producing a concise version of the original source. Therefore, it is natural to find that these two factors are less accepted. However, we still observe that they are rated as “highly” relevant, which not only validates our concept of video production from multi-view data, but also confirms our argument that video summarization should be regarded as a chance to personalize the contents rather than simply to filter important events.
• As for the effectiveness of our implementation, the highest score is obtained by personalization with respect to “Preferred Event”. Our implementation of “Preferred Player” is also evaluated as “highly” efficient. Hence, despite the possible incompleteness and errors in the information about both events and dominant players, we are still able to provide meaningful results that partially satisfy users’ semantic preferences.
Our personalization with respect to “Display Resolution” obtained the second highest score, which supports the efficiency of our production system, including both camera and viewpoint selection. Our implementation of “Replay Insertion” has the lowest effectiveness, which also coincides with the overall impressions of the interviewees after they played with our prototype. In order to improve the quality of “Replay Insertion”, we need a more accurate localization of the beginning and ending points of events, and a more appropriate presentation, e.g., slow playback, which is left as future work.
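As an illustration of how the mean opinion scores and standard deviations discussed above can be derived from the four-rank answers, here is a minimal sketch. The rank-to-score mapping comes from the text; the function name, the use of the sample standard deviation, and the example answers are assumptions made for illustration only and are not the collected data.

```python
from statistics import mean, stdev

# Rank-to-score mapping as described in the text.
RANK_SCORES = {"Very High": 4, "High": 3, "Low": 2, "Very Low": 1}


def mean_opinion_score(answers):
    """Return (mean, sample standard deviation) of the ranked answers.

    Assumption: the sample standard deviation is used; the page does not
    state which estimator was applied.
    """
    scores = [RANK_SCORES[answer] for answer in answers]
    return mean(scores), stdev(scores)


# Hypothetical answers for one criterion (not the real survey data).
example_answers = ["Very High", "High", "High", "Low"]
mos, spread = mean_opinion_score(example_answers)
print(f"MOS = {mos:.2f}, std = {spread:.2f}")
```

Applying the same function once per criterion, separately for the relevance answers and the effectiveness answers, would yield the two sets of bars plotted in the figure.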
Section 4: Results from the real deployed system in the Spiroudome
We provide some results from our real deployment of the system in the Spiroudome (Charleroi, Belgium). Since we have not obtained agreement from both teams, these results are open only to the reviewers of this paper. Note that you need an ID and a password to view this page. Both the ID and the password have been set to the manuscript ID assigned by the submission system, i.e., "MM-xxxxxx".
Please check this page.
Section 5: A Demo Summarization Prototype for Standalone Test
We have also implemented a lightweight demo system which you can download and test on your own PC. You can get the compressed rar file, which is 2.1GB in size because it includes the related video data. Extract the rar file to a local folder and run ApidisMain.exe. Click on the "Basketball Video Producer+Summarizer" button to test the demo system.
Description: The summarization interface is presented in the following Figure. The main part is devoted to the display area. This area plays the video summary produced in real-time after the "generate!" icon has been pressed. This area also presents information about the relative (with respect to the initial raw content) and absolute timestamps of the edited video. Next to the display area, we identify three areas, denoted A, B, and C.
Area A allows the definition of user preferences. Those preferences include:
==> The total duration of the summary;
==> The resolution of the resulting edited video;
==> The relative importance given to the inclusion of close views and/or replays;
==> The semantic preferences, namely preferred action, player and team.
Area B is mainly used for debugging purposes and plots the viewpoint selection parameters as a function of time.
Area C defines how the story is organized. The first row presents a timeline of the game, with the definition of clock events. The second row shows which parts of the game timeline have been included in the final summary.
Download link
Section 6: Appendix
Appendix A: Results on Viewpoint Selection Criterion
We consider a special case where players are evenly distributed along the 28m long central line of the basketball court. All players have the same height of 1.75m, and the distance between any two consecutive players is set to 1m. Accordingly, we have 29 players in total. By moving a pinhole camera along a circular trail, we collect source camera views from all angles. The radius of this circular trail, 80m here, should be large enough to ensure that the optimal viewpoint is covered by one of these camera views.
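The synthetic setup described above can be sketched as follows. The numeric constants come from the text; the function names and, in particular, the parameterization of the circular trail by an elevation angle in a vertical plane through the court center are assumptions made for illustration.

```python
import math

LINE_LENGTH_M = 28.0     # central line length, from the text
SPACING_M = 1.0          # gap between consecutive players, from the text
PLAYER_HEIGHT_M = 1.75   # common player height, from the text
TRAIL_RADIUS_M = 80.0    # radius of the circular camera trail, from the text


def player_positions(line_length=LINE_LENGTH_M, spacing=SPACING_M):
    """Return player x-coordinates (m) along the central line, centered on 0."""
    count = int(round(line_length / spacing)) + 1
    return [-line_length / 2.0 + i * spacing for i in range(count)]


def camera_on_trail(elevation_deg, radius=TRAIL_RADIUS_M):
    """Return (ground distance, height) of a camera on the trail, in meters.

    Assumption: the trail is an arc in a vertical plane through the court
    center, parameterized by the elevation angle above the ground plane.
    """
    angle = math.radians(elevation_deg)
    return radius * math.cos(angle), radius * math.sin(angle)


players = player_positions()
print(len(players), players[0], players[-1])  # prints: 29 -14.0 14.0
```

The count of 29 follows from placing one player at each end of the 28m line and one every 1m in between.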

Without loss of generality, we assume that we intend to find an optimal viewpoint for a target display whose resolution is only half that of this pinhole camera. Intuitively, we conclude that the viewpoint that needs no resampling should have the same size as the target resolution, which is equivalent to putting the virtual camera 40m away.
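The 40m figure can be checked with a small pinhole-model sketch. It assumes the virtual camera keeps the field of view of the source camera, so that its focal length in pixels scales with its resolution; the function names and the 1000-pixel focal length are hypothetical values chosen only for illustration.

```python
import math


def equivalent_distance(source_distance_m, resolution_ratio):
    """Distance of a full-view virtual camera that sees the same scene
    extent as a crop covering `resolution_ratio` of the source image width."""
    return source_distance_m * resolution_ratio


def projected_height_px(focal_px, height_m, distance_m):
    """Pinhole projection: pixel height of an object facing the camera."""
    return focal_px * height_m / distance_m


SOURCE_FOCAL_PX = 1000.0  # hypothetical focal length of the source camera
virtual_distance = equivalent_distance(80.0, 0.5)

# Player height in the source view (80m away) versus in the half-resolution
# virtual camera (half the focal length in pixels, placed 40m away).
in_source = projected_height_px(SOURCE_FOCAL_PX, 1.75, 80.0)
in_virtual = projected_height_px(SOURCE_FOCAL_PX * 0.5, 1.75, virtual_distance)
print(virtual_distance, math.isclose(in_source, in_virtual))
```

Because a player occupies the same number of pixels in both cases, the crop can be copied to the target display without resampling, which is the intuition stated above.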

For each virtual camera position at a distance ranging from 5m to 60m, we compute its equivalent viewpoint in the corresponding source camera view, where the positions of players within this viewpoint are easily computed using projective geometry. We then use the proposed criterion to compute the benefit of this virtual camera position. We plot these benefits in Fig. 9(b)-(d), using both the complete form of the criterion and two incomplete forms with certain terms missing, which helps clarify the exact role of each term. When only completeness is considered (by omitting the β(.) function and the occlusion term), enlarging the viewpoint to include more players is always beneficial, which drives the virtual camera far away (Fig. 9(b)). Including the β(.) function leads to three obvious changes, as revealed by Fig. 9(c): 1) the tendency to enlarge the viewpoint is held back by the desire for a larger pixel size for each player, so that a trade-off is established; 2) a virtual camera whose optical axis is parallel to the ground plane is most favored, since, with occlusion ignored, it increases both the number and the pixel size of visible players; 3) a circular ridge starts to appear around 40m. This ridge becomes much clearer in Fig. 9(d), where an even more balanced benefit is computed by further considering the occlusion term. In Fig. 9(d), the maximal benefit is achieved by virtual cameras with an oblique view angle to the ground plane, among which the positions on the 40m circle are further favored so as to avoid unnecessary resampling. This coincides with our intuitive understanding and our predefined guidelines about the optimal viewpoint. When the target resolution changes, the optimal circular ridge moves accordingly, which realizes the personalization with respect to device resolution.
When all players stand on the left side of the court with a distance of 0.5m between consecutive players, the optimal viewpoint moves to the left side of the court as well. Likewise, when all players stand on the right side of the court with the same 0.5m spacing, the optimal viewpoint moves to the right side of the court.
3D Graphs of viewpoint benefits
==>Each point in the wireframe represents one virtual viewpoint position, whose benefit is represented by the color.
==>The higher the benefit of a viewpoint, the closer its color is to dark red. (*Click to view a larger graph.)
When all 29 players are standing in two lines along the long axis of the court | When all 29 players are standing in three lines along the short axis of the court
Appendix B: Results on Broadcast Basketball Videos
Explanation: This part shows the view types in a broadcast basketball video. It reveals that, in today's practice of basketball video production, close-up views and replays are mainly inserted during breaks. Furthermore, it shows that, due to the fast pace of a basketball game, the director has only a few chances to insert close-up views and replays. We drew on these observations to design the rules of our rendering strategy.
*On this page, you can find more data, including the view-type structure of other sports such as soccer and volleyball.
View-type graph of shots in a broadcast basketball video
Check here for the corresponding videos, segmented using the rule we defined on this page

Maintained by chen-fan AT Last update 2010-11-02