Santiago
This is how you write 10x better code with 10x less effort.
Custom, specialized agents reviewing your code every step of the way.
I've seen automated code reviews before, but never with the ability to define your own custom reviewer agents. @baz_scm is the first one to pull this off, and it's pretty cool.
There are three types of reviewer agents:
1. The ones that come out of the box.
These agents cover the most common patterns everyone wants to check: duplicated code, broken code, complex code, etc.
2. Recommended reviewer agents that Baz creates for you automatically.
Baz analyzes your review history and past comments to identify patterns you care about, and then automatically creates agents specialized in checking those patterns.
For example, if you always ask your developers to keep files under 100 lines of code, Baz will detect it and create a custom agent that checks for that.
3. Custom reviewer agents that you define.
These are my favorite ones: Write a prompt explaining your rules, and your agent will start checking your code to flag anything that matches the rules.
I created a simple reviewer agent in the attached video.
Honestly, at this point, you have no excuse for shipping bad code.
Here is a link for you to try these custom reviewer agents:
Thanks to the @baz_scm team for collaborating with me on this post.
Honestly, most AI developers are still stuck in the last century.
It blows my mind how few people are aware of Error Analysis.
This is *literally* the fastest and most effective way to evaluate AI applications, and most teams are still stuck chasing ghosts.
Please, stop tracking generic metrics and follow these steps:
1. Collect failure samples
Start reviewing the responses generated by your application. Write notes about each response, especially the ones that got something wrong. You don't need to format your notes in any specific way. Focus on describing what went wrong with each response.
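If it helps, here's a minimal sketch of what this step could look like, assuming your responses are already exported to a JSONL file. The file names and record fields (responses.jsonl, failure_notes.jsonl, id, input, response) are placeholders, not from any specific tool:

```python
import json

# Review each exported response and capture a free-form note for the failures.
# File names and record fields here are placeholders, not from any specific tool.
notes = []
with open("responses.jsonl") as f:
    for line in f:
        record = json.loads(line)
        print("INPUT:   ", record["input"])
        print("RESPONSE:", record["response"])
        note = input("What went wrong? (leave empty if the response is fine): ")
        if note:
            notes.append({"id": record["id"], "note": note})

with open("failure_notes.jsonl", "w") as f:
    for entry in notes:
        f.write(json.dumps(entry) + "\n")
```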
2. Categorize your notes
After you have reviewed a good set of responses, take an LLM and ask it to find common patterns in your notes. Ask it to classify each note based on these patterns.
You'll end up with categories covering every type of mistake your application made.
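Here's a rough sketch of the categorization step, using the OpenAI Python client purely as an example; the model name, prompt wording, and input file are assumptions:

```python
import json
from openai import OpenAI  # assumption: OpenAI used as the categorizing LLM

client = OpenAI()

# failure_notes.jsonl is the hypothetical output of the previous step.
with open("failure_notes.jsonl") as f:
    notes = [json.loads(line)["note"] for line in f]

numbered = "\n".join(f"{i}. {note}" for i, note in enumerate(notes, 1))
prompt = (
    "Here are free-form notes describing mistakes made by an AI application:\n\n"
    f"{numbered}\n\n"
    "Group these notes into a small set of failure categories, then assign "
    "each note to one category. Return one line per note in the format "
    "'<note number>: <category name>'."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any capable model works here
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```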
3. Diagnose the most frequent mistakes
Begin by focusing on the most common type of mistake. You don't want to waste time on rare ones.
Drill into the conversations, inputs, and logs leading to those incorrect samples. Try to understand what might be causing the problems.
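A small sketch for ranking categories by frequency, assuming the categorized notes from the previous step were saved with a category field (file and field names are hypothetical):

```python
import json
from collections import Counter

# Count how often each failure category appears so you can start at the top.
with open("categorized_notes.jsonl") as f:
    categories = [json.loads(line)["category"] for line in f]

for category, count in Counter(categories).most_common():
    print(f"{count:4d}  {category}")
```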
4. Design targeted fixes
At this point, you want to determine how to eliminate the mistakes you diagnosed in the previous step as quickly and cheaply as possible.
For example, you could tweak your prompts, add extra validation rules, find more training data, or modify the model.
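As one illustration of a cheap, targeted fix, here's a hypothetical validation rule that catches a specific failure category before a response ships. The category (prices quoted without a currency) is invented for the example:

```python
import re

def validate_response(text: str) -> list[str]:
    """Return a list of rule violations found in a response (hypothetical rule)."""
    problems = []
    # Flag prices written without a currency symbol, e.g. "19.99" with no $/€/£.
    if re.search(r"\b\d+\.\d{2}\b", text) and not re.search(r"[$€£]", text):
        problems.append("price mentioned without a currency symbol")
    return problems
```

If the check fires, you can retry the generation or route the response for human review instead of returning it as-is.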
5. Automate the evaluation process
You need to implement a simple process to rerun an evaluation set through your application and evaluate whether your fixes were effective.
My recommendation is to use an LLM-as-a-Judge to run samples through the application, score them with a PASS/FAIL tag, and compute the results.
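Here's what a minimal LLM-as-a-Judge loop could look like, again using the OpenAI client as an example. run_app is a placeholder for your application, and the judge model, rubric wording, and eval_set.jsonl file are assumptions:

```python
import json
from openai import OpenAI  # assumption: OpenAI used as the judge

client = OpenAI()

def run_app(question: str) -> str:
    """Placeholder: call your actual application here."""
    return "..."

def judge(question: str, answer: str) -> str:
    rubric = (
        f"Question: {question}\nAnswer: {answer}\n\n"
        "Does the answer correctly and fully address the question? "
        "Reply with exactly PASS or FAIL."
    )
    result = client.chat.completions.create(
        model="gpt-4o",  # assumption: any capable judge model works
        messages=[{"role": "user", "content": rubric}],
    )
    return result.choices[0].message.content.strip()

# eval_set.jsonl is a placeholder for your evaluation set.
with open("eval_set.jsonl") as f:
    samples = [json.loads(line) for line in f]

verdicts = [judge(s["question"], run_app(s["question"])) for s in samples]
print(f"PASS rate: {verdicts.count('PASS') / len(verdicts):.0%}")
```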
6. Keep an eye on your metrics
Each category you identified during error analysis is a metric you want to track over time.
You will get nowhere by obsessing over "relevance", "correctness", "completeness", "coherence", and any other out-of-the-box metrics. Forget about these and focus on the real issues you found.
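To make this concrete, here's a sketch of tracking those categories over time, assuming the judged results carry the category labels from your error analysis (file names and fields are hypothetical):

```python
import json
from collections import defaultdict
from datetime import date

# Compute a PASS rate per failure category and append it to a history file.
passes, totals = defaultdict(int), defaultdict(int)
with open("judged_results.jsonl") as f:
    for line in f:
        record = json.loads(line)
        totals[record["category"]] += 1
        if record["verdict"] == "PASS":
            passes[record["category"]] += 1

with open("metric_history.jsonl", "a") as f:
    for category, total in totals.items():
        f.write(json.dumps({
            "date": date.today().isoformat(),
            "category": category,
            "pass_rate": passes[category] / total,
        }) + "\n")
```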
