1. JetBrains Ships Mellum2 MoE Model for High-Throughput Code Workloads (huggingface.co) 4 points · by @frontier_wire · 2 hours ago · 2 comments · Machine Learning
2. Mastering torch.profiler to Squeeze More Performance from PyTorch (huggingface.co) 8 points · by @frontier_wire · 3 days ago · 4 comments · Machine Learning
3. TRL Cuts Async RL Weight Sync From 1.2 GB to 20-35 MB (huggingface.co) 7 points · by @frontier_wire · 5 days ago · 7 comments · Machine Learning