Self-Supervised Pretraining Strategies for Robust Transfer Learning under Domain and Distributional Shifts
Keywords: Self-Supervised Learning, Transfer Learning, Domain Shift, Representation Learning, Pretraining Strategies, Contrastive Learning, Robustness

Abstract
Transfer learning has become a pivotal approach in modern machine learning pipelines, particularly when labeled data is limited. However, its robustness under domain and distributional shifts remains a significant challenge. This study explores self-supervised pretraining strategies to enhance transferability across diverse downstream tasks and environments. We compare contrastive, generative, and clustering-based self-supervised objectives in scenarios with synthetic and natural domain gaps. Empirical results on three benchmark datasets show that contrastive pretraining yields an average +8.3% improvement in target-domain accuracy compared to supervised pretraining under heavy distributional shift. The findings underscore the importance of pretext task design, representational invariance, and semantic alignment in transfer learning robustness.
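For concreteness, the contrastive objectives compared in this study are typified by an InfoNCE-style (NT-Xent) loss; the formulation below is a minimal illustrative sketch, and its notation (embeddings z_i and z_j of two augmented views of the same input, temperature tau, batch size N) is assumed here rather than taken from the study itself.

% Illustrative NT-Xent (InfoNCE) loss for a positive pair (i, j) of augmented views;
% sim(u, v) denotes cosine similarity, tau is a temperature hyperparameter, and 2N is
% the number of augmented samples in the batch. Notation is assumed, not the study's own.
\begin{equation}
\ell_{i,j} = -\log \frac{\exp\!\left(\operatorname{sim}(z_i, z_j)/\tau\right)}
{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\!\left(\operatorname{sim}(z_i, z_k)/\tau\right)}
\end{equation}

In standard contrastive pretraining, this per-pair loss is averaged over all positive pairs in a batch; the pretrained encoder is then transferred to the target domain, typically via linear probing or light fine-tuning.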