Gemini 3.1 Pro is now at the top of the APEX-Agents leaderboard. Gemini jumped from 18.4% to 33.5% on Pass@1 in just 90 days. It also completes 5 tasks that no model has ever been able to do before.…


LinkedIn Content Strategy & Writing Style
CEO @ Mercor | Thiel Fellow
54.7K
3.7K

Posts / Week: 1.0
Days Between Posts: 8.1
Total Posts Analyzed: 1
Posting Frequency: MEDIUM
Avg Engagement Rate: 417.57%
Performance Trend: STABLE
Avg Length (Words): 95
Depth Level: HIGH
Expertise Level: ADVANCED
Uniqueness Score: 0.78/10
Question Usage: NO
Response Rate: 0.12%
Writing style breakdown
<start of post>
The speed at which LLMs are mastering specialized labor is breathtaking. GPT 5.4 just hit a 62% success rate on the APEX-Agents legal reasoning module.
Six months ago, no model could maintain consistency across a 50-page contract. Now, GPT 5.4 identifies conflicting clauses and suggests redlines in seconds.
This is the end of rote document review. The bottleneck is no longer human hours, but the quality of the underlying data.
The economic impact on professional services will be enormous.
Great work, Sam Altman and the OpenAI team.
<end of post>
Analyze and write in Brendan Foody's style. Grow your LinkedIn to the next level.
GPT 5.4 is the best model we’ve ever tested on APEX-Agents. It’s also the first model to pass 50% mean score. A year ago, frontier models couldn’t even edit an Excel sheet and scored less than 5%. No…

Mercor's unique ability to answer this question is framed by the fact that we know who's an expert in what, all around the world. Uber built a driver network. Airbnb built a host network. Mercor is b…
Anthropic jumping from 18.4% to 29.8% on APEX-Agents in a few months is insane. This benchmark demonstrates its ability to create real deliverables such as slide decks, documents, financial models i…
Applied Compute improved 19% on Corporate Law tasks in APEX Agents. Their model traverses data rooms with hundreds of files to prepare complex legal deliverables. This level of model improvement wit…
Applied Compute achieving frontier capabilities on APEX Agents with just 2,000 tasks is incredible. Their model can produce complex legal deliverables, redlines, and slide decks. It feels like RL is…