BUILDING RESILIENT DATA ENGINEERING PIPELINES USING ENTERPRISE CLOUD DISTRIBUTED SYSTEMS
DOI:
https://doi.org/10.65009/kr96t581
Keywords:
Resilient Data Pipelines, Distributed Cloud Systems, Fault-Tolerant Data Engineering, Adaptive Orchestration, Self-Healing Architecture, Enterprise Cloud Computing, Reliability Management
Abstract
Modern enterprises depend on data engineering pipelines that must stay reliable despite
dynamic workloads, heterogeneous data sources, and failures in distributed cloud
environments. This paper introduces a resilient enterprise cloud pipeline architecture that
combines adaptive orchestration, fault-tolerant dataflow scheduling, self-healing
microservices, and policy-driven scaling across geo-distributed resources. The layered
architecture unifies streaming and batch workloads by separating ingestion, intelligent
routing, resiliency analytics, and compliance-verified storage management into distinct
layers. A learning-enabled reliability manager monitors workload performance, forecasts
likely bottlenecks and performance degradation, and proactively scales compute and storage
resources up or down. Checkpoint-aware processing, autonomic recovery of failed components,
multi-region replication, and latency-sensitive placement further reinforce resilience. The
design aims to provide enterprise-grade governance, observability, and security without
operational disruption during failures or demand spikes. Overall, the proposed architecture
supports robust, scalable, and reliable data pipelines that can adapt to mission-critical,
cloud-based business operations across varied regulatory, workload, and infrastructure
requirements worldwide.
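
The abstract names checkpoint-aware processing and autonomic recovery without giving an implementation. The following is only a minimal single-process sketch of the general idea, not the paper's method. The names `process_with_checkpoint`, `flaky_handler`, and the JSON checkpoint file layout are hypothetical and chosen for illustration. It gives at-least-once semantics: a record whose handler fails is retried from the last saved offset.

```python
import json
import os
import tempfile


def process_with_checkpoint(records, handler, checkpoint_path, max_restarts=3):
    """Process records in order, saving the index of the next record to handle.

    After a failure the loop resumes from the last saved checkpoint instead of
    the beginning. Hypothetical helper for illustration only.
    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        # Resume from the saved offset if a checkpoint file exists.
        start = 0
        if os.path.exists(checkpoint_path):
            with open(checkpoint_path) as f:
                start = json.load(f)["next_index"]
        try:
            for i in range(start, len(records)):
                handler(records[i])
                # Write to a temp file, then rename atomically, so a crash
                # never leaves a half-written checkpoint behind.
                tmp_path = checkpoint_path + ".tmp"
                with open(tmp_path, "w") as f:
                    json.dump({"next_index": i + 1}, f)
                os.replace(tmp_path, checkpoint_path)
            return restarts
        except Exception:
            # "Self-healing" here is simply a bounded restart from the checkpoint.
            restarts += 1
            if restarts > max_restarts:
                raise


# Usage: a handler that fails once on record "c" (simulated transient fault).
seen = []
pending_failures = {"c": 1}


def flaky_handler(record):
    if pending_failures.get(record, 0) > 0:
        pending_failures[record] -= 1
        raise RuntimeError("transient failure")
    seen.append(record)


with tempfile.TemporaryDirectory() as workdir:
    restarts = process_with_checkpoint(
        ["a", "b", "c", "d"], flaky_handler, os.path.join(workdir, "ckpt.json")
    )
```

In this run the records before the failure are not reprocessed, because the restart reads the saved offset. A distributed deployment would also need durable, replicated checkpoint storage and idempotent handlers, which this sketch does not cover.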

