Automated Optimization Methods for Scientific Workflows in e-Science Infrastructures

Author: Sonja Holl
Publisher: Forschungszentrum Jülich
Total Pages: 207
Release: 2014
ISBN: 389336949X

Scientific workflows have emerged as a key technology that assists scientists with the design, management, execution, sharing and reuse of in silico experiments. Workflow management systems simplify the management of scientific workflows by providing graphical interfaces for their development, monitoring and analysis. Nowadays, e-Science combines such workflow management systems with large-scale data and computing resources into complex research infrastructures. For instance, e-Science allows the conveyance of best practice research in collaborations by providing workflow repositories, which facilitate the sharing and reuse of scientific workflows. However, scientists still face various limitations when reusing workflows. One of the most common challenges they meet is the need to select appropriate applications and their individual execution parameters. If scientists do not want to rely on default or experience-based parameters, the best-effort option is to test different workflow set-ups using either trial-and-error approaches or parameter sweeps. Trial and error can be inefficient and parameter sweeps time-consuming, especially when tuning a large number of parameters. Therefore, scientists require an effective and efficient mechanism that automatically tests different workflow set-ups in an intelligent way and helps them improve their scientific results. This thesis addresses the limitation described above by defining and implementing an approach for the optimization of scientific workflows. In the course of this work, scientists’ needs are investigated and requirements are formulated, resulting in an appropriate optimization concept. In a subsequent step, this concept is prototypically implemented by extending a workflow management system with an optimization framework, including general mechanisms required to conduct workflow optimization.
As optimization is an ongoing research topic, different algorithms are provided by pluggable extensions (plugins) that can be loosely coupled with the framework, resulting in a generic and quickly extendable system. In this thesis, an exemplary plugin is introduced which applies a Genetic Algorithm for parameter optimization. To accelerate workflow optimization, and thereby make it feasible at all, e-Science infrastructures are utilized for the parallel execution of scientific workflows. This is enabled by additional extensions that allow applications and workflows to be executed on distributed computing resources. The implementation, and with it the general approach of workflow optimization, is experimentally verified by four use cases in the life science domain. All workflows were significantly improved, which demonstrates the advantage of the proposed workflow optimization. Finally, a new collaboration-based approach is introduced that harnesses optimization provenance to make optimization faster and more robust in the future.
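
To give a rough idea of what Genetic Algorithm based parameter optimization looks like in practice, here is a minimal Python sketch. It is not taken from the thesis: `run_workflow` is a hypothetical stand-in for one workflow execution that returns a quality score, and the population size, mutation rate and parameter bounds are assumed values chosen only for illustration.

```python
import random

def run_workflow(params):
    """Hypothetical stand-in for one workflow run; returns a score to maximize."""
    x, y = params
    return -((x - 3.0) ** 2 + (y + 1.0) ** 2)

def optimize(fitness, bounds, pop_size=20, generations=40, mutation_rate=0.2, seed=0):
    """Tiny genetic algorithm: truncation selection, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    # Start from random parameter sets inside the allowed bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # keep the better half as parents
        children = [ranked[0]]              # elitism: the best set always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each parameter comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    # Gaussian mutation, clipped back into the allowed range.
                    child[i] = min(hi, max(lo, child[i] + rng.gauss(0, 0.1 * (hi - lo))))
            children.append(child)
        population = children
    return max(population, key=fitness)

best = optimize(run_workflow, bounds=[(-10.0, 10.0), (-10.0, 10.0)])
print(best, run_workflow(best))
```

The fitness evaluations within one generation are independent of each other, which is why, as the description notes, they lend themselves to parallel execution on distributed resources. The sketch evaluates them sequentially to stay short.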

Automated Workflow Scheduling in Self-Adaptive Clouds

Author: G. Kousalya
Publisher: Springer
Total Pages: 238
Release: 2017-05-25
Genre: Computers
ISBN: 3319569821

This timely text/reference presents a comprehensive review of the workflow scheduling algorithms and approaches that are rapidly becoming essential for a range of software applications, due to their ability to efficiently leverage diverse and distributed cloud resources. Particular emphasis is placed on how workflow-based automation in software-defined cloud centers and hybrid IT systems can significantly enhance resource utilization and optimize energy efficiency. Topics and features: describes dynamic workflow and task scheduling techniques that work across multiple (on-premise and off-premise) clouds; presents simulation-based case studies, and details of real-time test bed-based implementations; offers analyses and comparisons of a broad selection of static and dynamic workflow algorithms; examines the considerations for the main parameters in projects limited by budget and time constraints; covers workflow management systems, workflow modeling and simulation techniques, and machine learning approaches for predictive workflow analytics. This must-read work provides invaluable practical insights from three subject matter experts in the cloud paradigm, which will empower IT practitioners and industry professionals in their daily assignments. Researchers and students interested in next-generation software-defined cloud environments will also greatly benefit from the material in the book.

Scientific Workflows

Author: Jun Qin
Publisher: Springer Science & Business Media
Total Pages: 228
Release: 2012-08-15
Genre: Computers
ISBN: 3642307159

Creating scientific workflow applications is a very challenging task due to the complexity of the distributed computing environments involved, the complex control and data flow requirements of scientific applications, and the lack of high-level languages and tool support. In particular, sophisticated expertise in distributed computing is commonly required to determine the software entities that perform the computations of workflow tasks, the computers on which workflow tasks are to be executed, the actual execution order of workflow tasks, and the data transfer between them. Qin and Fahringer present a novel workflow language called Abstract Workflow Description Language (AWDL) and the corresponding standards-based, knowledge-enabled tool support, which simplifies the development of scientific workflow applications. AWDL is an XML-based language for describing scientific workflow applications at a high level of abstraction. It is designed in a way that allows users to concentrate on specifying such workflow applications without dealing with either the complexity of distributed computing environments or any specific implementation technology. This research monograph is organized into five parts: overview, programming, optimization, synthesis, and conclusion, and is complemented by an appendix and an extensive reference list. The topics covered in this book will be of interest to both computer science researchers (e.g. in distributed programming, grid computing, or large-scale scientific applications) and domain scientists who need to apply workflow technologies in their work, as well as engineers who want to develop distributed and high-throughput workflow applications, languages and tools.

Workflows for e-Science

Author: Ian J. Taylor
Publisher: Springer Science & Business Media
Total Pages: 532
Release: 2007-12-31
Genre: Computers
ISBN: 184628757X

This timely book presents an overview of the current state of the art within established projects, covering many different aspects of workflow from users to tool builders. It surveys active research from a number of different perspectives, includes theoretical aspects of workflow, and deals with workflow for e-Science as opposed to e-Commerce. The topics covered will be of interest to a wide range of practitioners.

Web Portal Design, Implementation, Integration, and Optimization

Author: Jana Polgar
Publisher: IGI Global
Total Pages: 296
Release: 2013-01-31
Genre: Computers
ISBN: 1466627808

Web Portal Design, Implementation, Integration, and Optimization discusses the challenges faced in building web services and integrating applications in order to realize the benefits web portals bring to an organization. This collection of research aims to be a resource for researchers, developers, and industry practitioners involved in the technological, business, organizational and social dimensions of web portals.

Modeling, Analysis, and Optimization of Data-driven Scientific Workflows

Author: Sven Köhler
Release: 2014
ISBN: 9781321018202

This dissertation presents improvements to the modeling and efficient execution of scientific workflows. Many scientific workflow systems have been developed to solve a specific problem well, but many fail to address the needs of a broader group of scientists. While there may never be a system that can satisfy all needs completely, a better balance between diverging design goals can be found. To this end, this work identifies a number of desiderata that occur in the design of a scientific workflow system and discusses the degree to which they are addressed in current scientific workflow systems. A selection of systems is presented in detail, and their strengths and weaknesses with respect to the desiderata are described. From this discussion, beneficial characteristics, properties and implementation details of scientific workflow systems are derived, yielding a proposal for an improved scientific workflow system. Recently, the declarative database language Datalog has gained popularity in research and has been used in workflow-oriented projects. Therefore, the use of Datalog as (i) a workflow description language and (ii) a tool for implementing components is investigated. Different and novel approaches to understand, visualize and profile the evaluation of a Datalog program are developed and demonstrated. Finally, new techniques for capturing and employing data and workflow provenance are developed. For example, provenance information is used to understand and debug database queries and workflow execution traces, or to more efficiently resume workflow execution after parameter changes or even system crashes. Provenance is critical for scientists using workflow systems and is therefore studied extensively. This dissertation presents an overview of current research topics in the field of provenance and some methods used to analyze provenance data using Datalog. When Datalog is used as a workflow description language, provenance of data has to be defined and available.
Conversely, research in the field of database systems and Datalog can be extended to scientific workflow systems, for example to capture and analyze provenance. A new game-theoretic notion of provenance is presented that yields a detailed visual description of Why/How provenance for facts and also provides answers to Why-Not questions for missing facts in the result. A novel modification of the provenance game construction is sketched that removes dependencies on the active domain from the provenance explanations. Returning to classical workflow systems, some approaches to model and automate scientific problem solving are studied and discussed. This ultimately leads to the definition of a new scientific workflow system that is based on existing concepts identified earlier as beneficial but strives to improve on weaknesses identified in the presented case studies. Finally, a new method to improve the fault tolerance of a scientific workflow system, which demonstrates all technologies discussed, is presented. Provenance of the workflow execution is analyzed, for example using Datalog, and used to speed up recovery of the workflow execution after a failure.
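
The idea of using provenance to resume a workflow after a parameter change or a crash can be illustrated with a small Python sketch. This is a simplified stand-in, not the dissertation's Datalog-based method: the `task_key` and `run_pipeline` helpers, the in-memory `provenance` dictionary and the two example steps are all assumptions made for illustration. Each step's output is recorded under a hash of its name, inputs and parameters, so a later run can skip every step whose record already exists.

```python
import hashlib
import json

def task_key(name, inputs, params):
    """Identify one task invocation by its name, inputs and parameters."""
    payload = json.dumps({"task": name, "inputs": inputs, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def run_pipeline(steps, initial, provenance, executed):
    """Run steps in order, reusing recorded outputs instead of recomputing them."""
    data = initial
    for name, func, params in steps:
        key = task_key(name, data, params)
        if key not in provenance:        # no record yet: execute and record
            provenance[key] = func(data, params)
            executed.append(name)
        data = provenance[key]           # otherwise reuse the recorded output
    return data

steps = [
    ("scale", lambda xs, p: [x * p["factor"] for x in xs], {"factor": 2}),
    ("shift", lambda xs, p: [x + p["offset"] for x in xs], {"offset": 1}),
]

provenance, executed = {}, []
result = run_pipeline(steps, [1, 2, 3], provenance, executed)

# Change only the parameter of the second step and run again.
changed = [steps[0], ("shift", steps[1][1], {"offset": 5})]
executed_again = []
rerun = run_pipeline(changed, [1, 2, 3], provenance, executed_again)
print(executed, executed_again)
```

In the second run only the step whose parameters changed is executed again. A real system would persist the records to disk so that they also survive a crash.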

Guide to e-Science

Author: Xiaoyu Yang
Publisher: Springer Science & Business Media
Total Pages: 554
Release: 2011-05-26
Genre: Computers
ISBN: 0857294393

This guidebook on e-science presents real-world examples of practices and applications, demonstrating how a range of computational technologies and tools can be employed to build essential infrastructures supporting next-generation scientific research. Each chapter provides introductory material on core concepts and principles, as well as descriptions and discussions of relevant e-science methodologies, architectures, tools, systems, services and frameworks. Features: includes contributions from an international selection of preeminent e-science experts and practitioners; discusses use of mainstream grid computing and peer-to-peer grid technology for “open” research and resource sharing in scientific research; presents varied methods for data management in data-intensive research; investigates issues of e-infrastructure interoperability, security, trust and privacy for collaborative research; examines workflow technology for the automation of scientific processes; describes applications of e-science.

Cloud Computing with e-Science Applications

Author: Olivier Terzo
Publisher: CRC Press
Total Pages: 320
Release: 2017-12-19
Genre: Computers
ISBN: 1466591161

The amount of data in everyday life has been exploding. This data increase has been especially significant in scientific fields, where substantial amounts of data must be captured, communicated, aggregated, stored, and analyzed. Cloud Computing with e-Science Applications explains how cloud computing can improve data management in data-heavy fields such as bioinformatics, earth science, and computer science. The book begins with an overview of cloud models supplied by the National Institute of Standards and Technology (NIST), and then: discusses the challenges imposed by big data on scientific data infrastructures, including security and trust issues; covers vulnerabilities such as data theft or loss, privacy concerns, infected applications, threats in virtualization, and cross-virtual machine attack; describes the implementation of workflows in clouds, proposing an architecture composed of two layers, platform and application; details infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS) solutions based on public, private, and hybrid cloud computing models; demonstrates how cloud computing aids in resource control, vertical and horizontal scalability, interoperability, and adaptive scheduling. Featuring significant contributions from research centers, universities, and industries worldwide, Cloud Computing with e-Science Applications presents innovative cloud migration methodologies applicable to a variety of fields where large data sets are produced. The book provides the scientific community with an essential reference for moving applications to the cloud.