How general-purpose can a GPU be?
- Authors: Machanick, Philip
- Date: 2015
- Language: English
- Type: article , text
- Identifier: http://hdl.handle.net/10962/61180 , vital:27988 , http://dx.doi.org/10.18489/sacj.v0i57.347
- Description: The use of graphics processing units (GPUs) in general-purpose computation (GPGPU) is a growing field. GPU instruction sets, while implementing a graphics pipeline, draw from a range of single instruction multiple datastream (SIMD) architectures characteristic of the heyday of supercomputers. Yet only one of these SIMD instruction sets has been of application on a wide enough range of problems to survive the era when the full range of supercomputer design variants was being explored: vector instructions. Supercomputers covered a range of exotic designs such as hypercubes and the Connection Machine (Fox, 1989). The latter is likely the source of the snide comment by Cray: it had thousands of relatively low-speed CPUs (Tucker & Robertson, 1988). Since Cray won, why are we not basing our ideas on his designs (Cray Inc., 2004), rather than those of the losers? The Top 500 supercomputer list is dominated by general-purpose CPUs, and nothing like the Connection Machine that headed the list in 1993 still exists.
- Date Issued: 2015
MARS: Motif Assessment and Ranking Suite for transcription factor binding motifs
- Authors: Kibet, Caleb K , Machanick, Philip
- Date: 2016
- Language: English
- Type: article , text
- Identifier: http://hdl.handle.net/10962/61155 , vital:27985 , http://dx.doi.org/10.1101/065615
- Description: We describe MARS (Motif Assessment and Ranking Suite), a web-based suite of tools used to evaluate and rank PWM-based motifs. The increased number of learned motif models that are spread across databases and in different PWM formats, leading to a choice dilemma among the users, is our motivation. This increase has been driven by the difficulty of modelling transcription factor binding sites and the advance in high-throughput sequencing technologies at a continually reducing cost. Therefore, several experimental techniques have been developed resulting in diverse motif-finding algorithms and databases. We collate a wide variety of available motifs into a benchmark database, including the corresponding experimental ChIP-seq and PBM data obtained from ENCODE and UniPROBE databases, respectively. The implemented tools include: a data-independent consistency-based motif assessment and ranking (CB-MAR), which is based on the idea that 'correct motifs' are more similar to each other while incorrect motifs will differ from each other; and scoring and classification-based algorithms, which rank binding models by their ability to discriminate sequences known to contain binding sites from those without. The CB-MAR and scoring techniques have a 0.86 and 0.73 median rank correlation using ChIP-seq and PBM respectively. Best motifs selected by CB-MAR achieve a mean AUC of 0.75, comparable to those ranked by held-out data at 0.76; this is based on ChIP-seq motif discovery using five algorithms on 110 transcription factors. We have demonstrated the benefit of this web server in motif choice and ranking, as well as in motif.
- Date Issued: 2016
Preliminary thoughts on services without servers
- Authors: Machanick, Philip , Hunt, K
- Date: 2014
- Language: English
- Type: Conference paper
- Identifier: vital:6612 , http://hdl.handle.net/10962/d1014082
- Description: Warehouse-scale computing supports cloud-based services such as shared disk space, computation services and social networks. Although warehouse-scale computing is inexpensive per user, the cost of entry is high, and the pressure to generate revenues to cover costs leads service providers to pursue monetizing services aggressively. In this paper, we explore some ideas for removing the need for central servers by exploiting peer-to-peer technologies.
- Date Issued: 2014
Back to good health
- Authors: Machanick, Philip
- Date: 2017
- Language: English
- Type: article , text
- Identifier: http://hdl.handle.net/10962/61144 , vital:27984 , http://dx.doi.org/10.18489/sacj.v29i3.565
- Description: From introduction: We have a bumper issue, with eleven research papers and one letter to the editor. 2016 was a difficult year for academia in South Africa with highly disruptive protests. 2017 was mostly better from that point of view, though the protest movement has not completely gone away. This issue contains some papers that were submissions to special issues that were not ready in time and hence to some extent is a catch-up issue. In previous issues this year, 29(1), published in July, contained nine research papers, of which five were extended papers from the 2016 SAICSIT annual conference. There was also a special issue on ICT in Education published in October, 29(2), which had five research papers. Two papers from the ICT in Education special issue spilled over to this issue. Overall, we have published 25 research papers this year, compared with four in 2016, fourteen in 2015 and nineteen in 2014. Numbers are therefore looking healthy again; I hope the underlying causes of protest are addressed so we do not have to endure another year like 2016. In the remainder of this editorial, I give an update on the effects of indexing in Scopus, list papers in this issue and end with changes in the editorial team.
- Date Issued: 2017
Transcription factor motif quality assessment requires systematic comparative analysis [version 2; referees: 2 approved]
- Authors: Kibet, Caleb K , Machanick, Philip
- Date: 2016
- Language: English
- Type: article , text
- Identifier: http://hdl.handle.net/10962/61169 , vital:27987 , http://dx.doi.org/10.12688/f1000research.7408.2
- Description: Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique makes it difficult for biologists to make an appropriate choice of binding model and for algorithm developers to benchmark, test and improve on their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence ranking of motifs in a TF-specific manner. We also show that TF binding specificity can vary by source of genomic binding data. We also demonstrate that information content of a motif is not in isolation a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis.
- Date Issued: 2016
How to establish a bioinformatics postgraduate degree programme—a case study from South Africa
- Authors: Machanick, Philip , Tastan Bishop, Özlem
- Date: 2014
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/124641 , vital:35641 , https://doi.10.1093/bib/bbu014
- Description: The Research Unit in Bioinformatics at Rhodes University (RUBi), South Africa, offers a Masters of Science in Bioinformatics. Growing demand for bioinformatics qualifications results in applications from across Africa. Courses aim to bridge gaps in the diverse backgrounds of students who range from biologists with no prior computing exposure to computer scientists with no biology background. The programme is evenly split between coursework and research, with diverse modules from a range of departments covering mathematics, statistics, computer science and biology, with emphasis on application to bioinformatics research. The early focus on research helps bring students up to speed with working as a researcher. We measure success of the programme by the high rate of subsequent entry to PhD study: 10 of 14 students who completed in the years 2011-2013.
- Date Issued: 2014