Automating the conversion of natural language fiction to multi-modal 3D animated virtual environments
- Authors: Glass, Kevin Robert
- Date: 2009
- Subjects: Virtual computer systems , Virtual storage (Computer science) , Virtual reality , Computer animation , Fiction -- Computer programs , Narration (Rhetoric) -- Computer simulation , Animation (Cinematography) , Natural language processing (Computer Science)
- Language: English
- Type: Thesis , Doctoral , PhD
- Identifier: vital:4632 , http://hdl.handle.net/10962/d1006518
- Description: Popular fiction books describe rich visual environments that contain characters, objects, and behaviour. This research develops automated processes for converting text sourced from fiction books into animated virtual environments and multi-modal films. This involves the analysis of unrestricted natural language fiction to identify appropriate visual descriptions, and the interpretation of the identified descriptions for constructing animated 3D virtual environments. The goal of the text analysis stage is the creation of annotated fiction text, which identifies visual descriptions in a structured manner. A hierarchical rule-based learning system is created that induces patterns from example annotations provided by a human, and uses these for the creation of additional annotations. Patterns are expressed as tree structures that abstract the input text on different levels according to structural (token, sentence) and syntactic (parts-of-speech, syntactic function) categories. Patterns are generalized using pair-wise merging, where dissimilar sub-trees are replaced with wild-cards. The result is a small set of generalized patterns that are able to create correct annotations. A set of generalized patterns represents a model of an annotator's mental process regarding a particular annotation category. Annotated text is interpreted automatically for constructing detailed scene descriptions. This includes identifying which scenes to visualize, and identifying the contents and behaviour in each scene. Entity behaviour in a 3D virtual environment is formulated using time-based constraints that are automatically derived from annotations. Constraints are expressed as non-linear symbolic functions that restrict the trajectories of a pair of entities over a continuous interval of time. Solutions to these constraints specify precise behaviour. 
We create an innovative quantified constraint optimizer for locating sound solutions, which uses interval arithmetic for treating time and space as contiguous quantities. This optimization method uses a technique of constraint relaxation and tightening that allows solution approximations to be located where constraint systems are inconsistent (an ability not previously explored in interval-based quantified constraint solving). 3D virtual environments are populated by automatically selecting geometric models or procedural geometry-creation methods from a library. 3D models are animated according to trajectories derived from constraint solutions. The final animated film is sequenced using a range of modalities including animated 3D graphics, textual subtitles, audio narrations, and foleys. Hierarchical rule-based learning is evaluated over a range of annotation categories. Models are induced for different categories of annotation without modifying the core learning algorithms, and these models are shown to be applicable to different types of books. Models are induced automatically with accuracies ranging between 51.4% and 90.4%, depending on the category. We show that models are refined if further examples are provided, and this supports a boot-strapping process for training the learning mechanism. The task of interpreting annotated fiction text and populating 3D virtual environments is successfully automated using our described techniques. Detailed scene descriptions are created accurately, where between 83% and 96% of the automatically generated descriptions require no manual modification (depending on the type of description). The interval-based quantified constraint optimizer fully automates the behaviour specification process. Sample animated multi-modal 3D films are created using extracts from fiction books that are unrestricted in terms of complexity or subject matter (unlike existing text-to-graphics systems). 
These examples demonstrate that: behaviour is visualized that corresponds to the descriptions in the original text; appropriate geometry is selected (or created) for visualizing entities in each scene; sequences of scenes are created for a film-like presentation of the story; and that multiple modalities are combined to create a coherent multi-modal representation of the fiction text. This research demonstrates that visual descriptions in fiction text can be automatically identified, and that these descriptions can be converted into corresponding animated virtual environments. Unlike existing text-to-graphics systems, we describe techniques that function over unrestricted natural language text and perform the conversion process without the need for manually constructed repositories of world knowledge. This enables the rapid production of animated 3D virtual environments, allowing the human designer to focus on creative aspects.
- Date Issued: 2009
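Illustrative note on the abstract above: the pair-wise merging step, in which dissimilar sub-trees are replaced with wild-cards, can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation. The `Node` type, the `merge` and `matches` functions, the `*` wild-card symbol, the rule that sub-trees with differing labels or child counts count as "dissimilar", and the example sentences are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

WILDCARD = "*"  # assumed wild-card symbol; the thesis's notation is not given


@dataclass(frozen=True)
class Node:
    """A pattern-tree node: a label plus an ordered tuple of child nodes."""
    label: str
    children: tuple = ()


def merge(a: Node, b: Node) -> Node:
    """Generalize two trees: keep what agrees, wild-card what differs.

    Assumption: two sub-trees are 'dissimilar' when their labels or their
    number of children differ.
    """
    if a.label != b.label or len(a.children) != len(b.children):
        return Node(WILDCARD)
    merged_children = tuple(merge(x, y) for x, y in zip(a.children, b.children))
    return Node(a.label, merged_children)


def matches(pattern: Node, tree: Node) -> bool:
    """Return True if the (possibly wild-carded) pattern covers the tree."""
    if pattern.label == WILDCARD:
        return True
    if pattern.label != tree.label or len(pattern.children) != len(tree.children):
        return False
    return all(matches(p, t) for p, t in zip(pattern.children, tree.children))


def sentence(noun: str, verb: str) -> Node:
    """Build a toy sentence -> token -> part-of-speech -> word tree."""
    return Node("sentence", (
        Node("token", (Node("NN", (Node(noun),)),)),
        Node("token", (Node("VBD", (Node(verb),)),)),
    ))


# Two hypothetical annotated examples that differ only in the noun.
pattern = merge(sentence("dog", "ran"), sentence("cat", "ran"))
```

In this toy case the merged pattern keeps the sentence, token, and part-of-speech levels and the verb "ran", while the noun position becomes a wild-card, so `matches(pattern, sentence("bird", "ran"))` holds and `matches(pattern, sentence("dog", "sat"))` does not.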
Implementing non-photorealistic rendering enhancements with real-time performance
- Authors: Winnemöller, Holger
- Date: 2002 , 2013-05-09
- Subjects: Computer animation , Computer graphics , Real-time data processing
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4580 , http://hdl.handle.net/10962/d1003135
- Description: We describe quality and performance enhancements, which work in real-time, to all well-known non-photorealistic rendering (NPR) styles for use in an interactive context. These include Comic rendering, Sketch rendering, Hatching and Painterly rendering, but we also attempt and justify a widening of the established definition of what is considered NPR. In the individual chapters, we identify typical stylistic elements of the different NPR styles. We list problems that need to be solved in order to implement the various renderers. Standard solutions available in the literature are introduced and in all cases extended and optimised. In particular, we extend the lighting model of the comic renderer to include a specular component and introduce multiple inter-related but independent geometric approximations which greatly improve rendering performance. We implement two completely different solutions to random perturbation sketching, solve temporal coherence issues for coal sketching and find an unexpected use for 3D textures to implement hatch-shading. Textured brushes of painterly rendering are extended by properties such as stroke-direction and texture, motion, paint capacity, opacity and emission, making them more flexible and versatile. Brushes are also provided with a minimal amount of intelligence, so that they can help in maximising screen coverage of brushes. We furthermore devise a completely new NPR style, which we call super-realistic and show how sample images can be tweened in real-time to produce an image-based six degree-of-freedom renderer performing at roughly 450 frames per second. Performance values for our other renderers all lie between 10 and over 400 frames per second on home PC hardware, justifying our real-time claim. A large number of sample screen-shots, illustrations and animations demonstrate the visual fidelity of our rendered images. 
In essence, we successfully achieve our attempted goals of increasing the creative, expressive and communicative potential of individual NPR styles, increasing performance of most of them, adding original and interesting visual qualities, and exploring new techniques or existing ones in novel ways.
- Date Issued: 2002
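Illustrative note on the abstract above: one common way to give a comic (cel) shader a specular component is to quantize the diffuse term into a few flat bands and add a hard-edged highlight. The sketch below does this with a Blinn-Phong half-vector term. It is an assumption made for illustration, not the lighting model described in the thesis. The function name `toon_shade` and the parameters `bands`, `shininess`, and `spec_threshold` are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def toon_shade(normal, light_dir, view_dir,
               bands=3, shininess=32.0, spec_threshold=0.5):
    """Return (quantized_diffuse, highlight) for one surface point.

    Assumptions: light_dir and view_dir point away from the surface,
    diffuse lighting is Lambertian, and the highlight uses a Blinn-Phong
    half-vector term thresholded to a hard edge.
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)

    # Lambertian term, snapped to one of `bands` flat intensity levels.
    diffuse = max(0.0, dot(n, l))
    band = min(bands - 1, int(diffuse * bands))
    quantized = band / (bands - 1)

    # Blinn-Phong specular, thresholded so the highlight is a flat blob.
    half = normalize(tuple(a + b for a, b in zip(l, v)))
    specular = max(0.0, dot(n, half)) ** shininess
    highlight = 1.0 if diffuse > 0.0 and specular > spec_threshold else 0.0

    return quantized, highlight


# Light and viewer directly above the surface: brightest band plus highlight.
head_on = toon_shade((0, 0, 1), (0, 0, 1), (0, 0, 1))
# Light grazing the surface: darkest band, no highlight.
grazing = toon_shade((0, 0, 1), (1, 0, 0), (0, 0, 1))
```

Thresholding the specular term, rather than blending it smoothly, keeps the highlight consistent with the flat-shaded look of the diffuse bands.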