BUILDING RESILIENT DATA ENGINEERING PIPELINES USING ENTERPRISE CLOUD DISTRIBUTED SYSTEMS

Authors

  • Hardik Patel

DOI:

https://doi.org/10.65009/kr96t581

Keywords:

Resilient Data Pipelines, Distributed Cloud Systems, Fault-Tolerant Data Engineering, Adaptive Orchestration, Self-Healing Architecture, Enterprise Cloud Computing, Reliability Management

Abstract

Modern enterprises depend on data engineering pipelines that must remain reliable despite 
dynamic workloads, compliance requirements across heterogeneous data sources, and failures 
in distributed cloud environments. This paper introduces a robust enterprise cloud-based 
pipeline architecture that combines adaptive orchestration, fault-tolerant dataflow 
scheduling, self-healing microservices, and policy-driven scaling across geo-distributed 
resources. The layered architecture unifies streaming and batch workloads by isolating 
ingestion, smart routing, resiliency analytics, and compliance-verified storage management 
in separate layers. A learning-enabled reliability manager monitors workload performance, 
forecasts potential bottlenecks and performance drops, and proactively scales compute and 
storage resources up or down. Checkpoint-aware processing, autonomic recovery of failed 
components, multi-region replication, and latency-sensitive placement further reinforce 
resilience. The architecture aims to provide enterprise-level governance, observability, and 
security without operational disruption during failures or demand spikes. Overall, the 
proposed architecture supports robust, scalable, and reliable data pipelines that can be 
adapted to mission-critical cloud-based business operations under a variety of regulatory, 
workload, and infrastructure requirements worldwide. 
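
As an illustration only, the checkpoint-aware processing and autonomic recovery named in the abstract can be sketched in a few lines of Python. The abstract does not specify an implementation, so everything here is an assumption: the function name process_with_checkpoints, the JSON checkpoint file, the next_index field, and the fixed retry count are all hypothetical and chosen for brevity.

```python
import json
import os


def process_with_checkpoints(records, handler, checkpoint_path, max_retries=3):
    """Process records in order and persist progress after each one.

    Hypothetical sketch: on restart, work resumes at the record after the
    last checkpoint instead of starting over. A failing record is retried
    up to max_retries times before the error is propagated.
    """
    # Resume from the saved offset if a checkpoint file already exists.
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    results = []
    for i in range(start, len(records)):
        for attempt in range(1, max_retries + 1):
            try:
                results.append(handler(records[i]))
                break
            except Exception:
                # Give up only after the final attempt; the checkpoint on
                # disk still points at this record, so a later run retries it.
                if attempt == max_retries:
                    raise
        # Write to a temporary file and rename it, so a crash during the
        # write does not leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return results
```

Calling the function a second time with the same checkpoint_path after a failure returns only the results for the records that had not yet completed. A production pipeline would more likely rely on the checkpointing facilities of its streaming or batch engine; this sketch only shows the resume-and-retry idea.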

Published

2022-04-14

Issue

Section

Articles

How to Cite

BUILDING RESILIENT DATA ENGINEERING PIPELINES USING ENTERPRISE CLOUD DISTRIBUTED SYSTEMS. (2022). Phoenix: International Multidisciplinary Research Journal (Peer Reviewed High Impact Journal), 1(2), 10-24. https://doi.org/10.65009/kr96t581