Tuesday, February 11, 2025

New top story on Hacker News: ASTRA: HackerRank's coding benchmark for LLMs

ASTRA: HackerRank's coding benchmark for LLMs
5 by rvivek | 0 comments on Hacker News.
We help companies hire & upskill developers. A customer recently asked: What % of HackerRank problems can LLMs solve? That got us thinking: how should hiring evolve when AI can translate natural language to code? Our belief: AI will handle much of code generation, so developers will be assessed more on software development life cycle (SDLC) skills with AI assistants. To explore this, we’re benchmarking LLMs on real-world software dev scenarios, starting with 65 unseen problems across 10 domains. Beyond correctness, we evaluated consistency, an often overlooked aspect of AI reliability. We’re open-sourcing the dataset on Hugging Face and expanding it to cover more domains, ambiguous specs, and harder challenges. Would love the HN community’s take on this!
