Development of Context-Aware Model Adaptation Strategies for Real-Time AI Services in Serverless Cloud Architectures

Authors

  • Jane Austen POE Researcher Author

Keywords:

Model adaptation, serverless computing, context-awareness, real-time AI, edge-cloud systems, AI deployment, cloud-native intelligence

Abstract

Real-time AI services deployed on serverless cloud infrastructure face distinct challenges arising from latency constraints, resource heterogeneity, and ephemeral function execution. Context-aware model adaptation is a promising strategy that lets AI models adjust their behavior in response to environmental cues such as input complexity, network load, or user profile. This paper explores adaptive strategies that optimize model performance under dynamic serverless conditions and proposes an architectural framework that combines lightweight monitoring, dynamic model selection, and on-demand adaptation. Experimental results show significant gains in latency reduction and accuracy retention across varying contexts, underscoring the viability of context-aware techniques for future serverless AI.
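
The dynamic model selection step named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Context` fields, the `ModelVariant` figures, the linear cost model in `estimated_latency_ms`, and the 50 ms network overhead are all assumptions chosen for illustration. The sketch picks the most accurate variant whose estimated latency fits the request's budget and falls back to the fastest variant when none fits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical per-request context signals (field names are assumptions)."""
    input_complexity: float   # 0.0 (trivial) to 1.0 (hard)
    network_load: float       # 0.0 (idle) to 1.0 (saturated)
    latency_budget_ms: float  # end-to-end deadline for this request


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical model variant with illustrative, made-up profile figures."""
    name: str
    base_latency_ms: float    # latency on the simplest input
    accuracy: float           # offline validation accuracy


def estimated_latency_ms(variant: ModelVariant, ctx: Context) -> float:
    # Assumed cost model: compute time grows linearly with input complexity,
    # and network load adds up to 50 ms of transfer overhead.
    compute = variant.base_latency_ms * (1.0 + ctx.input_complexity)
    network = 50.0 * ctx.network_load
    return compute + network


def select_variant(variants, ctx):
    """Return the most accurate variant that fits the latency budget.

    Falls back to the fastest variant when no variant fits.
    """
    feasible = [
        v for v in variants
        if estimated_latency_ms(v, ctx) <= ctx.latency_budget_ms
    ]
    if feasible:
        return max(feasible, key=lambda v: v.accuracy)
    return min(variants, key=lambda v: v.base_latency_ms)


# Illustrative variants only; the numbers are not taken from the paper.
VARIANTS = [
    ModelVariant("small", 10.0, 0.80),
    ModelVariant("medium", 40.0, 0.88),
    ModelVariant("large", 120.0, 0.93),
]

if __name__ == "__main__":
    relaxed = Context(input_complexity=0.0, network_load=0.0, latency_budget_ms=200.0)
    loaded = Context(input_complexity=0.5, network_load=1.0, latency_budget_ms=120.0)
    print(select_variant(VARIANTS, relaxed).name)
    print(select_variant(VARIANTS, loaded).name)
```

In a serverless setting, a selector of this kind would run inside the function handler on each invocation, with the context signals supplied by the lightweight monitoring component; how those signals are measured is left open here.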

Published

2025-04-17

How to Cite

Jane Austen POE. (2025). Development of Context-Aware Model Adaptation Strategies for Real-Time AI Services in Serverless Cloud Architectures. ISCSITR - INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING (ISCSITR-IJSRAIML) ISSN (Online): 3067-753X, 6(2), 39-46. https://iscsitr.in/index.php/ISCSITR-IJSRAIML/article/view/ISCSITR-IJSRAIML_06_02_005