Multifidelity methods leverage low-cost surrogate models to speed up computations, making occasional recourse to expensive high-fidelity models to establish accuracy guarantees. Because surrogate and high-fidelity models are used together, poor approximations by the surrogate models can be compensated for with more frequent recourse to high-fidelity models. There is thus a trade-off between investing computational resources in training and improving surrogate models and the frequency of recourse to high-fidelity models. This trade-off is ignored by traditional model reduction and data-driven modeling methods, which learn surrogate models that are meant to replace high-fidelity models rather than to be used together with them. This presentation introduces the concept of context-aware learning, which aims to derive models that are explicitly trained to be used together with high-fidelity models in multifidelity settings. Our analysis shows that in certain situations this trade-off can be exploited explicitly, which leads to an optimal training of surrogate models. The result is that fewer data points are provably sufficient when learning models for multifidelity settings than when learning models to be used in single-fidelity settings.
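The trade-off described above can be sketched with a toy budget-allocation example. This is a hypothetical illustration, not the method from the presentation: all cost and error constants (`w_lo`, `w_hi`, `c1`, `c2`) are invented, and we simply assume an error model in which the surrogate error decays with the number of training samples and the estimator error decays with the number of high-fidelity evaluations.

```python
# Hypothetical illustration of the training-vs-recourse trade-off.
# Assumptions (not from the presentation):
#   - total budget B, cost w_lo per surrogate training sample,
#     cost w_hi per high-fidelity evaluation;
#   - a made-up error model err(n, m) = c1/n + c2/m, where n is the
#     number of surrogate training samples and m the number of
#     high-fidelity evaluations.

def best_split(B, w_lo=1.0, w_hi=100.0, c1=1.0, c2=10.0):
    """Brute-force search over integer splits of budget B between
    surrogate training (n samples) and high-fidelity recourse (m evals)."""
    best = None
    for n in range(1, int(B // w_lo)):
        m = int((B - n * w_lo) // w_hi)  # remaining budget buys m high-fidelity evals
        if m < 1:
            continue
        err = c1 / n + c2 / m
        if best is None or err < best[0]:
            best = (err, n, m)
    return best

err, n, m = best_split(1000.0)
print(f"optimal split: n={n} training samples, m={m} high-fidelity evals, err={err:.4f}")
```

The point of the sketch is that neither extreme is optimal: spending the whole budget on surrogate training leaves no room for high-fidelity recourse, while skipping training entirely makes every correction expensive; the minimum of the (assumed) error model lies at an interior split.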