Understanding the performance of scientific applications can be a challenging endeavor given the constant evolution of architectures, programming models, compilers, numerical methods, and the applications themselves. Performance integration testing is still not a reality for the majority of high-performance applications because of its complexity, computational cost, and lack of reliable automation. Hence, as part of the DOE SciDAC program, we are working on creating robust performance analysis workflows that capture application-specific performance issues and can be maintained and extended by the application scientists without requiring an external performance “expert”. The consumers of performance data include application developers, performance models, and autotuners. Once appropriate and sufficient performance data is available, our approach to using it to guide optimization is threefold: (i) we investigate the most effective way to present performance results to the code developers, (ii) we automate the selection of numerical methods based on generic performance models (as part of the NSF-funded Lighthouse project), and (iii) we explore the use of different types of performance models in low-level autotuning systems to reduce the size of the parameter search space. While code generation and autotuning are important for achieving performance portability, the majority of code development (including optimization) is still performed by humans. As part of the DOE IDEAS project, we are developing data-based methodologies to better understand how human teams work most effectively in developing high-quality, high-performance, enduring scientific software.
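
To make item (iii) concrete, the sketch below illustrates, in Python, how an analytic performance model might prune an autotuner's parameter search space before any empirical timing runs. It is not code from any of the projects named above; the tile sizes, unroll factors, cache size, and the predicted_cost heuristic are hypothetical placeholders chosen only to show the pruning pattern.

    # Minimal sketch (hypothetical): use a simple analytic performance model
    # to prune an autotuner's parameter search space before empirical runs.
    from itertools import product

    # Candidate parameters the autotuner would otherwise benchmark exhaustively.
    TILE_SIZES = [8, 16, 32, 64, 128]
    UNROLL_FACTORS = [1, 2, 4, 8]

    CACHE_BYTES = 32 * 1024   # assumed L1 data cache size
    ELEM_BYTES = 8            # double precision

    def predicted_cost(tile, unroll):
        """Toy cost estimate: penalize tiles whose working set overflows L1
        and unroll factors larger than the tile."""
        working_set = 3 * tile * tile * ELEM_BYTES  # e.g. three blocked operands
        cache_penalty = 4.0 if working_set > CACHE_BYTES else 1.0
        unroll_penalty = 2.0 if unroll > tile else 1.0
        return cache_penalty * unroll_penalty / (tile * unroll)

    def prune(candidates, keep_fraction=0.25):
        """Keep only the most promising fraction of the space; the empirical
        autotuner then times just these survivors."""
        ranked = sorted(candidates, key=lambda c: predicted_cost(*c))
        return ranked[: max(1, int(len(ranked) * keep_fraction))]

    if __name__ == "__main__":
        full_space = list(product(TILE_SIZES, UNROLL_FACTORS))
        pruned = prune(full_space)
        print(f"search space reduced from {len(full_space)} to {len(pruned)} points")
        print("candidates to benchmark:", pruned)

In practice the model need not be accurate in absolute terms; it only has to rank candidates well enough that the configurations worth timing empirically survive the cut.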