ASTRA: HackerRank's coding benchmark for LLMs
5 by rvivek | 0 comments on Hacker News.
We help companies hire and upskill developers. A customer recently asked: what percentage of HackerRank problems can LLMs solve? That got us thinking: how should hiring evolve when AI can translate natural language into code? Our belief is that AI will handle much of code generation, so developers will be assessed more on their software development lifecycle (SDLC) skills when working with AI assistants.

To explore this, we’re benchmarking LLMs on real-world software development scenarios, starting with 65 unseen problems across 10 domains. Beyond correctness, we evaluated consistency, an often overlooked aspect of AI reliability. We’re open-sourcing the dataset on Hugging Face and expanding it to cover more domains, ambiguous specs, and harder challenges. Would love the HN community’s take on this!
Tuesday, February 11, 2025
