We empirically assess modern large language models (LLMs) on representative high-performance computing (HPC) code optimization tasks, reporting their accuracy and the performance gains they achieve, and offering practical guidance for production integration.