OPTIMIZED DISTRIBUTED CLOUD ARCHITECTURES FOR ENTERPRISE SCALE DATA ENGINEERING APPLICATIONS

Hardik Patel

doi:10.65009/re52cm82

Keywords:

Optimized Distributed Cloud Architecture, Enterprise-Scale Data Engineering, Latency-Aware Resource Allocation, Predictive Autoscaling, Hybrid Edge–Core–Cloud Integration, Intelligent Orchestration, Fault-Tolerant Computing

Abstract

Distributed cloud computing is now integral to enterprise-scale data engineering, where
large-scale heterogeneous workloads require low latency, high throughput, and resilient
execution on geographically distributed resources. This paper introduces an optimized
distributed cloud architecture that incorporates adaptive workload profiling, latency-aware
resource mapping, data-sensitive placement, predictive autoscaling with learning-based
demand prediction, and intelligent fault containment across edge, core, and multi-cloud
systems. The framework dynamically optimizes resource usage, reduces data transfer cost, and
proactively avoids failures through anomaly-aware migration and recovery provisions.
Evaluation on large synthetic and real enterprise workload traces shows significant
performance improvements over current hybrid and distributed workload architectures. The
proposed system reduces execution time by 27-35%, improves throughput by 22%, and improves
overall resource consumption by 30%, while also substantially reducing migration overhead,
energy consumption, and operating cost. The proactive resilience measures also significantly
reduce fault recovery time and the probability of failure. The findings suggest that the
architecture provides a scalable, efficient, enterprise-scale foundation for next-generation
data engineering applications running in distributed cloud environments.
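
The abstract names predictive autoscaling and latency-aware resource mapping without giving their mechanics. The sketch below is one possible illustration of how the two ideas could fit together, not the paper's method: it assumes an exponentially smoothed demand forecast (a simple stand-in for the learning-based predictor) and a greedy placement that fills the lowest-latency sites first. The names Site, forecast_demand, replicas_needed, and place, and every parameter value, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Site:
    name: str          # hypothetical site label (edge, core, or cloud)
    latency_ms: float  # assumed round-trip latency to the data source
    free_slots: int    # assumed spare capacity, in worker slots


def forecast_demand(history, alpha=0.5):
    """One-step-ahead forecast by exponential smoothing (assumed predictor)."""
    level = history[0]
    for observed in history[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


def replicas_needed(forecast, per_replica_capacity, headroom=0.2):
    """Replica count that covers the forecast plus a safety margin."""
    return max(1, math.ceil(forecast * (1 + headroom) / per_replica_capacity))


def place(replicas, sites):
    """Greedy latency-aware placement: fill the lowest-latency sites first."""
    plan = {}
    remaining = replicas
    for site in sorted(sites, key=lambda s: s.latency_ms):
        if remaining == 0:
            break
        take = min(remaining, site.free_slots)
        if take > 0:
            plan[site.name] = take
            remaining -= take
    if remaining > 0:
        raise ValueError("not enough capacity for the requested replicas")
    return plan


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    demand_history = [100, 120, 160, 200]  # requests per second
    sites = [
        Site("cloud", latency_ms=80.0, free_slots=10),
        Site("edge", latency_ms=5.0, free_slots=2),
        Site("core", latency_ms=20.0, free_slots=2),
    ]
    forecast = forecast_demand(demand_history)
    replicas = replicas_needed(forecast, per_replica_capacity=50)
    print(forecast, replicas, place(replicas, sites))
```

Scaling on the forecast rather than on the last observed load is what makes the autoscaling predictive; sorting sites by latency before allocating is the simplest form of latency-aware mapping. A fuller treatment would also weigh data transfer cost and failure risk, which this sketch leaves out.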

© 2025 Phoenix: International Multidisciplinary Research Journal. All rights reserved.