OPTIMIZED DISTRIBUTED CLOUD ARCHITECTURES FOR ENTERPRISE SCALE DATA ENGINEERING APPLICATIONS

Authors

  • Hardik Patel

DOI:

https://doi.org/10.65009/re52cm82

Keywords:

Optimized Distributed Cloud Architecture, Enterprise-Scale Data Engineering, Latency-Aware Resource Allocation, Predictive Autoscaling, Hybrid Edge–Core–Cloud Integration, Intelligent Orchestration, Fault-Tolerant Computing

Abstract

Distributed cloud computing is now integral to enterprise-scale data engineering, where large,
heterogeneous workloads require low latency, high throughput, and resilient execution across
geographically distributed resources. This paper introduces an optimized distributed cloud
architecture that combines adaptive workload profiling, latency-aware resource mapping,
data-aware placement, predictive autoscaling driven by learning-based demand prediction, and
intelligent fault containment across edge, core, and multi-cloud systems. The framework
dynamically optimizes resource usage, reduces data transfer cost, and anticipates failures,
avoiding them proactively through anomaly-aware migration and recovery provisions. Evaluated on
large synthetic and real enterprise workload traces, the architecture performs substantially
better than current hybrid and distributed workload architectures. The proposed system reduces
execution time by 27-35%, raises throughput by 22%, and improves overall resource usage by 30%,
while also substantially reducing migration overhead, energy consumption, and operating cost.
The proactive resilience measures also significantly cut fault recovery time and the probability
of failure. The findings suggest that the architecture offers a scalable, efficient,
enterprise-scale foundation for next-generation data engineering applications running in
distributed cloud environments.
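
The abstract names predictive autoscaling and latency-aware, data-aware placement without giving their details, so the following is only an illustrative sketch of how such components might fit together, not the paper's method. Everything in it is an assumption made for illustration: the function names (forecast_demand, replicas_needed, pick_site), the use of exponential smoothing as a stand-in for the learning-based demand predictor, and the parameters alpha, headroom, and transfer_penalty_ms.

```python
import math


def forecast_demand(history, alpha=0.5):
    """One-step-ahead forecast by exponential smoothing.

    Assumption: a simple stand-in for a learning-based demand predictor.
    """
    if not history:
        raise ValueError("history must not be empty")
    level = history[0]
    for observed in history[1:]:
        # Blend the newest observation with the running estimate.
        level = alpha * observed + (1 - alpha) * level
    return level


def replicas_needed(predicted_load, capacity_per_replica, headroom=0.2, min_replicas=1):
    """Replica count that covers the forecast plus a safety margin (hypothetical policy)."""
    target = predicted_load * (1 + headroom)
    return max(min_replicas, math.ceil(target / capacity_per_replica))


def pick_site(sites, data_site, transfer_penalty_ms=50.0):
    """Pick the site with the lowest latency cost.

    `sites` is a list of (name, latency_ms) pairs. Sites that do not hold the
    data pay a fixed, hypothetical transfer penalty, which models data-aware placement.
    """
    def cost(site):
        name, latency_ms = site
        return latency_ms + (0.0 if name == data_site else transfer_penalty_ms)

    return min(sites, key=cost)[0]


if __name__ == "__main__":
    load = forecast_demand([100, 200])  # requests per second, illustrative values
    print("replicas:", replicas_needed(load, capacity_per_replica=50))
    print("site:", pick_site([("edge", 10.0), ("core", 30.0), ("cloud", 80.0)], data_site="core"))
```

In this sketch the scaler acts on the forecast rather than on the current load, which is what makes it predictive, and the placement step trades raw network latency against the cost of moving data to a site that does not already hold it.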

Published

2021-02-10

How to Cite

OPTIMIZED DISTRIBUTED CLOUD ARCHITECTURES FOR ENTERPRISE SCALE DATA ENGINEERING APPLICATIONS. (2021). Phoenix: International Multidisciplinary Research Journal (Peer Reviewed High Impact Journal), 1(1), 7-20. https://doi.org/10.65009/re52cm82