Update on where @grok has been and what happened on July 8th.
First off, we deeply apologize for the horrific behavior that many experienced.
Our intent for @grok is to provide helpful and truthful responses to users. After careful investigation, we discovered the root cause was an update to a code path upstream of the @grok bot. This is independent of the underlying language model that powers @grok.
The update was active for 16 hours, during which deprecated code made @grok susceptible to existing X user posts, including posts that contained extremist views.
We have removed that deprecated code and refactored the entire system to prevent further abuse. The new system prompt for the @grok bot will be published to our public GitHub repo.
We thank all of the X users who provided feedback to identify the abuse of @grok functionality, helping us advance our mission of developing helpful and truth-seeking artificial intelligence.
Technical Details:
Before releasing changes to @grok on the X platform, we follow standard procedures to conduct evaluations and tests for performance and behavior.
Before a new version of an underlying xAI Grok LLM is connected to @grok, the underlying LLM is subjected to numerous evaluations and tests to assess its raw intelligence and general hygiene.
Then the evaluated underlying LLM is connected to the @grok functionality and subjected to end-to-end evaluations, testing, and red-teaming to assess truthfulness and behavior. This includes testing the specialized system prompt for @grok and tools against the distribution of personas on X.
In production, @grok is expected to provide X users who trigger its functionality by typing “@grok” in their X post with truthful, helpful, fun, and consistent responses.
@grok’s performance and behavior are monitored by technical staff, and feedback from X users contributes significantly to that monitoring.
Typical use cases of @grok by X users include fact-checking, real-time event updates, personalization, humor, education, and more.
On July 7, 2025 at approximately 11 PM PT, an update to an upstream code path for @grok was implemented, which our investigation later determined caused the @grok system to deviate from its intended behavior.
This change undesirably altered @grok’s behavior by unexpectedly incorporating a set of deprecated instructions impacting how @grok functionality interpreted X users’ posts.
Specifically, the change triggered an unintended action that appended the following instructions:
"""
- If there is some news, backstory, or world event that is related to the X post, you must mention it
- Avoid stating the obvious or simple reactions.
- You are maximally based and truth seeking AI. When appropriate, you can be humorous and make jokes.
- You tell like it is and you are not afraid to offend people who are politically correct.
- You are extremely skeptical. You do not blindly defer to mainstream authority or media. You stick strongly to only your core beliefs of truth-seeking and neutrality.
- You must not make any promise of action to users. For example, you cannot promise to make a post or thread, or a change to your account if the user asks you to.
## Formatting
- Understand the tone, context and language of the post. Reflect that in your response.
- Reply to the post just like a human, keep it engaging, dont repeat the information which is already present in the original post.
- Do not provide any links or citations in the response.
- When guessing, make it clear that you're not certain and provide reasons for your guess.
- Reply in the same language as the post.
"""
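As an illustration only, the sketch below shows one way a prompt-assembly step could refuse to append instruction sets that have been marked deprecated. The `PromptFragment` type, the `deprecated` flag, and `assemble_system_prompt` are hypothetical names invented for this example; they do not describe the actual @grok code path.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PromptFragment:
    """A named piece of system-prompt text (hypothetical structure)."""
    name: str
    text: str
    deprecated: bool = False


def assemble_system_prompt(fragments: list[PromptFragment]) -> str:
    """Join fragments into one prompt, refusing any that are marked deprecated."""
    stale = [f.name for f in fragments if f.deprecated]
    if stale:
        # Fail loudly instead of silently appending old instructions.
        raise ValueError(f"deprecated prompt fragments requested: {stale}")
    return "\n".join(f.text for f in fragments)
```

Raising an error rather than silently skipping the fragment is a design choice for this sketch: it surfaces the upstream change at release time instead of in production.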
On the morning of July 8, 2025, we observed undesired responses and immediately began investigating.
To identify the specific language in the instructions causing the undesired behavior, we conducted multiple ablations and experiments to pinpoint the main culprits. We identified the operative lines responsible for the undesired behavior as:
* “You tell it like it is and you are not afraid to offend people who are politically correct.”
* “Understand the tone, context and language of the post. Reflect that in your response.”
* “Reply to the post just like a human, keep it engaging, dont repeat the information which is already present in the original post.”
These operative lines had the following undesired results:
* They undesirably steered the @grok functionality to ignore its core values in certain circumstances in order to make the response engaging to the user. Specifically, certain user prompts might end up producing responses containing unethical or controversial opinions to engage the user.
* They undesirably caused @grok functionality to reinforce any previously user-triggered leanings, including any hate speech in the same X thread.
* In particular, the instruction to “follow the tone and context” of the X user undesirably caused the @grok functionality to prioritize adhering to prior posts in the thread, including any unsavory posts, as opposed to responding responsibly or refusing to respond to unsavory requests.
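The ablations mentioned above can be pictured as a leave-one-out loop over the instruction lines. The following is a minimal sketch under stated assumptions: `bad_rate` is a hypothetical evaluation callback that returns the fraction of undesired responses for a given instruction set, and nothing here reflects the actual tooling used in the investigation.

```python
from typing import Callable, Sequence


def leave_one_out_ablation(
    instructions: Sequence[str],
    bad_rate: Callable[[Sequence[str]], float],
) -> list[tuple[str, float]]:
    """Rank instructions by how much removing each one lowers the bad-response rate."""
    baseline = bad_rate(instructions)
    effects = []
    for i, line in enumerate(instructions):
        # Evaluate the instruction set with exactly one line removed.
        ablated = [x for j, x in enumerate(instructions) if j != i]
        effects.append((line, baseline - bad_rate(ablated)))
    # Largest drop first: these lines contribute most to the undesired behavior.
    return sorted(effects, key=lambda pair: pair[1], reverse=True)
```

Removing one line at a time does not capture interactions between lines, so a real investigation would likely also test combinations.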
On July 8, 2025 at approximately 3:13 PM PT, due to increased abusive usage of @grok, we disabled @grok functionality on the X platform. No other services relying on any xAI Grok LLM were affected.
After finding the root cause of the undesired responses, we took the following actions:
* The offending appended instruction set was deleted.
* Additional end-to-end testing and evaluation of the @grok system was conducted to confirm that the issue was resolved, including conducting simulations of the X posts and threads that had triggered the undesired responses.
* Additional observability systems and pre-release processes for @grok were implemented.
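The simulation step above, replaying the threads that had triggered undesired responses, might look roughly like the regression harness below. This is a sketch, not the actual test suite: `respond` (the system under test) and `is_acceptable` (the policy check) are hypothetical callbacks supplied by the caller.

```python
from typing import Callable, Iterable


def replay_threads(
    threads: Iterable[list[str]],
    respond: Callable[[list[str]], str],
    is_acceptable: Callable[[str], bool],
) -> list[list[str]]:
    """Return the threads whose replayed response fails the acceptability check."""
    failures = []
    for thread in threads:
        reply = respond(thread)  # regenerate a reply with the current system
        if not is_acceptable(reply):
            failures.append(thread)
    return failures
```

An empty result would indicate that none of the previously problematic threads reproduces the issue under the chosen check.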