An information extraction model for recommending the most applied case
- Authors: Padayachy, Thashen
- Date: 2019
- Subjects: Information technology , Information storage and retrieval systems System design
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10948/43325 , vital:36794
- Description: The amount of information produced by different domains is constantly increasing. One domain that produces particularly large amounts of information is the legal domain, where information is mainly used for research purposes. However, legal researchers spend too much time searching for useful information. Information is found by using special search engines or by consulting hard copies of legal literature. The main research question that this study addressed is “What techniques can be incorporated into a model that recommends the most applied case for a field of law?”. The Design Science Research (DSR) methodology was used to address the research objectives. The model developed is the theoretical contribution produced from following the DSR methodology. A case study organisation, called LexisNexis, was used to help investigate the real-world problem. The initial investigation into the real-world problem revealed that too much time is spent on searching for the Most Applied Case (MAC) and that no formal or automated processes were used. An analysis of an informal process followed by legal researchers enabled the identification of different concepts that could be combined to create a prescriptive model to recommend the MAC. A critical analysis of the literature was conducted to obtain a better understanding of the legal domain and the techniques that can be applied to assist with problems faced in this domain, related to information retrieval and extraction. This resulted in the creation of an IE Model based only on theory. Questionnaires were sent to experts to obtain a further understanding of the legal domain, highlight problems faced, and identify which attributes of a legal case can be used to help recommend the MAC. During the Design and Development activity of the DSR methodology, a prescriptive MAC Model for recommending the MAC was created based on findings from the literature review and questionnaires.
The MAC Model consists of processes concerning: Information retrieval (IR); Information extraction (IE); Information storage; and Query-independent ranking. Analysis of IR and IE helped to identify problems experienced when processing text. Furthermore, appropriate techniques and algorithms were identified that can process legal documents and extract specific facts. The extracted facts were then further processed to allow for storage and processing by query-independent ranking algorithms. The processes incorporated into the model were then used to create a proof-of-concept prototype called the IE Prototype. The IE Prototype implements two processes called the IE process and the Database process. The IE process analyses different sections of a legal case to extract specific facts. The Database process then ensures that the extracted facts are stored in a document database for future querying purposes. The IE Prototype was evaluated using the technical risk and efficacy strategy from the Framework for Evaluation of Design Science. Both formative and summative evaluations were conducted. Formative evaluations were conducted to identify functional issues of the prototype, whilst summative evaluations made use of real-world legal cases to test the prototype. Multiple experiments were conducted on legal cases, known as source cases, that resulted in facts from the source cases being extracted. For the purpose of the experiments, the term “source case” was used to distinguish between a legal case in its entirety and a legal case’s list of cases referred to. Two types of NoSQL databases were investigated for implementation, namely a graph database and a document database. Setting up the graph database required little time. However, development issues prevented the graph database from being successfully implemented in the proof-of-concept prototype. A document database was successfully implemented as an alternative for the proof-of-concept prototype.
Analysis of the source cases used to evaluate the IE Prototype revealed that 96% of the source cases were categorised as being partially extracted. The results also revealed that the IE Prototype was capable of processing a large number of source cases at a given time.
- Full Text:
- Date Issued: 2019
A strategy for sustainable ICT development in deep rural environments
- Authors: Medupe, Tsietsi Jacob
- Date: 2019
- Subjects: Information technology , Sustainable development Information technology -- Developing countries Rural development -- Developing countries
- Language: English
- Type: Thesis , Doctoral , DPhil
- Identifier: http://hdl.handle.net/10948/41438 , vital:36483
- Description: This study provides a strategy for sustainable Information and Communication Technology (ICT) development in deep rural environments and describes a case study conducted within the community of the AmaJingqi traditional council. It investigates the sustainability of ICT services within rural environments, the income profile and affordability of different members of the community, and the strategy formulation model. The study’s main focus is on creating a strategy to be used as a guideline for the successful development and implementation of sustainable ICT in deep rural environments and on defining ICT sustainability. Furthermore, the different ICT users are profiled based on affordability and access to services, and deep rural environments are also defined. Moreover, the study describes the complete composition of sustainable ICT. It discusses the design science research methodology and motivates the reasons for using it. The study also outlines various research paradigms and philosophies and discusses a number of research strategies. The literature review focuses on various policies and frameworks which have been formulated to advance universal access to ICT services by rural communities. It also outlines some of the ICT initiatives which have failed, the reasons for the failures, and what will be corrected so that similar mistakes are not repeated. The study discusses the concepts of a strategy framework that outlines the theoretical foundation of the strategy formulation model, strategy implementation and control. It also discusses the diagnostics and outlines the various strategy-guiding policies. The strategy is validated, expert reviews are solicited, and the strategy is revised and finalised.
- Full Text:
- Date Issued: 2019