Abstract
Workflow systems are designed to support the process automation of large-scale business and scientific applications. In recent years, many workflow systems have been deployed on high performance computing infrastructures such as cluster, peer-to-peer (p2p), and grid computing (Moore, 2004; Wang, Jie, & Chen, 2009; Yang, Liu, Chen, Lignier, & Jin, 2007). One of the driving forces is the increasing demand for large-scale instance-intensive and data/computation-intensive workflow applications (large-scale workflow applications for short), which are common in both eBusiness and eScience application areas. Typical examples (detailed in Section 13.2.1) include the transaction-intensive nation-wide insurance claim application process and the data- and computation-intensive pulsar searching process in astrophysics. Generally speaking, instance-intensive applications are processes that need to be executed a large number of times, either sequentially within a very short period or concurrently as a large number of instances (Liu, Chen, Yang, & Jin, 2008; Liu et al., 2010; Yang et al., 2008). Therefore, large-scale workflow applications normally require the support of high performance computing infrastructures (e.g. advanced CPU units, large memory space and high-speed network), especially when the workflow activities are themselves data and computation intensive. In the real world, to accommodate such requirements, expensive computing infrastructures such as supercomputers and data servers are bought, installed, integrated and maintained at huge cost by system users.
References
Ardagna, D., & Pernici, B. (2007). Adaptive service composition in flexible processes. IEEE Transactions on Software Engineering, 33(6), 369–384.
Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R. H., Konwinski, A., et al. (2009). Above the clouds: A Berkeley view of cloud computing (Tech. Rep., University of California, Berkeley).
de Assuncao, M. D., di Costanzo, A., & Buyya, R. (2009). Evaluating the cost-benefit of using cloud computing to extend the capacity of clusters. Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, Garching, Germany, 1–10.
Bhargav-Spantzel, A., Squicciarini, A. C., & Bertino, E. (2007). Trust negotiation in identity management. IEEE Security & Privacy, 5(2), 55–63.
Bose, R., & Frew, J. (2005). Lineage retrieval for scientific data processing: A survey. ACM Computing Surveys, 37(1), 1–28.
Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., & Brandic, I. (2009). Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems, 25(6), 599–616.
Calheiros, R. N., Ranjan, R., De Rose, C. A. F., & Buyya, R. (2009). CloudSim: A novel framework for modeling and simulation of cloud computing infrastructures and services (Tech. Rep., Grid Computing and Distributed Systems (GRIDS) Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne).
Chen, J., & Yang, Y. (2007). Multiple states based temporal consistency for dynamic verification of fixed-time constraints in grid workflow systems. Concurrency and Computation: Practice and Experience (Wiley), 19(7), 965–982.
Chen, J., & Yang, Y. (2008). A taxonomy of grid workflow verification and validation. Concurrency and Computation: Practice and Experience, 20(4), 347–360.
Chen, J., & Yang, Y. (2010). Temporal dependency based checkpoint selection for dynamic verification of temporal constraints in scientific workflow systems. ACM Transactions on Software Engineering and Methodology, to appear. Retrieved 1st February 2010, from http://www.swinflow.org/papers/TOSEM.pdf.
Chen, J., & Yang, Y. (2007). Adaptive selection of necessary and sufficient checkpoints for dynamic verification of temporal constraints in grid workflow systems. ACM Transactions on Autonomous and Adaptive Systems, 2(2), Article 6.
Chen, J., & Yang, Y. (2008). Temporal dependency based checkpoint selection for dynamic verification of fixed-time constraints in grid workflow systems. Proceedings of the 30th International Conference on Software Engineering (ICSE 2008), Leipzig, Germany, 141–150.
Chervenak, A., Deelman, E., Livny, M., Su, M. H., Schuler, R., Bharathi, S., et al. (2007). Data placement for scientific applications in distributed environments. Proceedings of the 8th Grid Computing Conference, 267–274.
Deelman, E., Gannon, D., Shields, M., & Taylor, I. (2008). Workflows and e-Science: An overview of workflow system features and capabilities. Future Generation Computer Systems, 25(6), 528–540.
Deelman, E., & Chervenak, A. (2008). Data management challenges of data-intensive scientific workflows. Proceedings of the IEEE International Symposium on Cluster Computing and the Grid, 687–692.
Deelman, E., Singh, G., Livny, M., Berriman, B., & Good, J. (2008). The cost of doing science on the cloud: The montage example. Proceedings of the ACM/IEEE Conference on Supercomputing, Austin, TX, 1–12.
Erl, T. (2008). SOA: Principles of service design. Upper Saddle River, NJ: Prentice Hall.
Foster, I., Zhao, Y., Raicu, I., & Lu, S. (2008). Cloud computing and grid computing 360-degree compared. Proceedings of the Grid Computing Environments Workshop, 2008, GCE '08, 1–10.
Hadoop (2009). Retrieved 1st September 2009 from http://hadoop.apache.org/.
Hess, A., Holt, J., Jacobson, J., & Seamons, K. E. (2004). Content-triggered trust negotiation. ACM Transactions on Information and System Security, 7(3), 428–456.
Hoffa, C., Mehta, G., Freeman, T., Deelman, E., Keahey, K., Berriman, B., et al. (2008). On the use of cloud computing for scientific workflows. Proceedings of the 4th IEEE International Conference on e-Science, 640–645.
Keahey, K., Figueiredo, R., Fortes, J., Freeman, T., & Tsugawa, M. (2008). Science clouds: Early experiences in cloud computing for scientific applications. Proceedings of the First Workshop on Cloud Computing and its Applications (CCA'08), 1–6.
Kondo, D., Javadi, B., Malecot, P., Cappello, F., & Anderson, D. P. (2009). Cost-benefit analysis of cloud computing versus desktop grids. Proceedings of the IEEE International Symposium on Parallel & Distributed Processing, IPDPS'09, 1–12.
Lin, C., Varadharajan, V., Wang, Y., & Pruthi, V. (2004). Enhancing grid security with trust management. Proceedings of the 2004 IEEE International Conference on Services Computing (SCC04), 303–310.
Liu, K., Chen, J. J., Yang, Y., & Jin, H. (2008). A throughput maximization strategy for scheduling transaction-intensive workflows on SwinDeW-G. Concurrency and Computation: Practice and Experience, 20(15), 1807–1820.
Liu, K., Jin, H., Chen, J., Liu, X., Yuan, D., & Yang, Y. (2010). A compromised-time-cost scheduling algorithm in SwinDeW-C for instance-intensive cost-constrained workflows on cloud computing platform. International Journal of High Performance Computing Applications.
Liu, X., Chen, J., Liu, K., & Yang, Y. (2008). Forecasting duration intervals of scientific workflow activities based on time-series patterns. Proceedings of the 4th IEEE International Conference on e-Science (e-Science08), Indianapolis, IN, USA, 23–30.
Liu, X., Chen, J., Wu, Z., Ni, Z., Yuan, D., & Yang, Y. (2010). Handling recoverable temporal violations in scientific workflow systems: A workflow rescheduling based strategy. Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid10), Melbourne, Australia.
Liu, X., Chen, J., & Yang, Y. (September 2008). A probabilistic strategy for setting temporal constraints in scientific workflows. Proceedings of the 6th International Conference on Business Process Management (BPM08), Lecture Notes in Computer Science, Vol. 5240, Milan, Italy, 180–195.
McCormick, W. T., Schweitzer, P. J., & White, T. W. (1972). Problem decomposition and data reorganization by a clustering technique. Operations Research, 20, 993–1009.
Moore, M. (2004). An accurate parallel genetic algorithm to schedule tasks on a cluster. Parallel Computing, 30, 567–583.
Moretti, C., Bulosan, J., Thain, D., & Flynn, P. J. (2008). All-Pairs: An abstraction for data-intensive cloud computing. Proceedings of the IEEE International Parallel and Distributed Processing Symposium, IPDPS'08, 1–11.
Askalon Project (2010). Retrieved 1st February 2010, from http://www.dps.uibk.ac.at/projects/askalon.
GrADS Project (2010). Retrieved 1st February 2010, from http://www.iges.org/grads/.
GridBus Project (2010). Retrieved 1st February 2010, from http://www.gridbus.org.
Kepler Project (2010). Retrieved 1st February 2010, from http://kepler-project.org/.
Pegasus Project (2010). Retrieved 1st February 2010, from http://pegasus.isi.edu/.
Taverna Project (2010). Retrieved 1st February 2010, from http://www.mygrid.org.uk/tools/taverna/.
Triana Project (2010). Retrieved 1st February 2010, from http://www.trianacode.org/.
Raghavan, B., Ramabhadran, S., Yocum, K., & Snoeren, A. C. (2007). Cloud control with distributed rate limiting. Proceedings of the 2007 ACM SIGCOMM, Kyoto, Japan, 337–348.
SECES (May 2008). Proceedings of the 1st International Workshop on Software Engineering for Computational Science and Engineering, in conjunction with the 30th International Conference on Software Engineering (ICSE2008), Leipzig, Germany.
Simmhan, Y. L., Plale, B., & Gannon, D. (2005). A survey of data provenance in e-Science. SIGMOD Record, 34(3), 31–36.
VMware (2009). Retrieved 1st September 2009, from http://www.vmware.com/.
Wang, L. Z., Kunze, M., & Tao, J. (2008). Performance evaluation of virtual machine-based grid workflow system. http://doi.wiley.com/10.1002/cpe.1328, 1759–1771.
Wang, L. Z., Jie, W., & Chen, J. (2009). Grid computing: Infrastructure, service, and applications. Boca Raton, FL: CRC Press, Taylor & Francis Group.
Weiss, A. (2007). Computing in the cloud. ACM Networker, 11(4), 18–25.
Winsborough, W. H., & Li, N. H. (2006). Safety in automated trust negotiation. ACM Transactions on Information and System Security, 9(3), 352–390.
Yang, Y., Liu, K., Chen, J., Lignier, J., & Jin, H. (December 2007). Peer-to-peer based grid workflow runtime environment of SwinDeW-G. Proceedings of the 3rd International Conference on e-Science and Grid Computing (e-Science07), Bangalore, India, 51–58.
Yang, Y., Liu, K., Chen, J., Liu, X., Yuan, D., & Jin, H. (December 2008). An algorithm in SwinDeW-C for scheduling transaction-intensive cost-constrained cloud workflows. Proceedings of the 4th IEEE International Conference on e-Science (e-Science08), Indianapolis, IN, USA, 374–375.
Yu, J., & Buyya, R. (2005). A taxonomy of workflow management systems for grid computing. Journal of Grid Computing, (3), 171–200.
Yuan, D., Yang, Y., Liu, X., & Chen, J. (2010). A cost-effective strategy for intermediate data storage in scientific cloud workflow systems. Proceedings of the 24th IEEE International Parallel & Distributed Processing Symposium, Atlanta, GA, USA, to appear. Retrieved 1st February 2010, from http://www.ict.swin.edu.au/personal/yyang/papers/IPDPS10-IntermediateData.pdf.
Yuan, D., Yang, Y., Liu, X., & Chen, J. A data placement strategy in cloud scientific workflows. Future Generation Computer Systems, in press. http://dx.doi.org/10.1016/j.future.2010.02.004.
Acknowledgment
This work is partially supported by the Australian Research Council under Linkage Project LP0990393.
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Liu, X., Yuan, D., Zhang, G., Chen, J., Yang, Y. (2010). SwinDeW-C: A Peer-to-Peer Based Cloud Workflow System. In: Furht, B., Escalante, A. (eds) Handbook of Cloud Computing. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6524-0_13
DOI: https://doi.org/10.1007/978-1-4419-6524-0_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6523-3
Online ISBN: 978-1-4419-6524-0
eBook Packages: Computer Science (R0)