Yesterday's bookmarks (2026-05-29)

1

X bookmark

Author: Koichi (@fujibee)
Posted: 2026-05-27 02:33 +07
Original link: https://x.com/fujibee/status/2059357190725783892
Auto-generated article: not generated

Body

http://x.com/i/article/2059350551348092928

2

⚠️ Devs, stop everything and read. I almost ran malware on m…

Author: Fabio Vedovelli (@vedovelli74)
Posted: 2026-05-27 01:25 +07
Original link: https://x.com/vedovelli74/status/2059340087696273828
Auto-generated article: not generated

Body

⚠️ Devs, stop everything and read. I almost ran malware on my machine just now, and I want you to know exactly how the scam works. I received a link to a GitHub repo: an "MVP" of a web3/poker project, with a request to clone it and run it locally. It looked like a technical test,

3

Introducing the newest Coral board, for efficient, on…

Author: Google Gemma (@googlegemma)
Posted: 2026-05-28 03:55 +07
Original link: https://x.com/googlegemma/status/2059740184930074758
Auto-generated article: not generated

Body

Introducing the newest Coral board, for efficient, on-device AI! Check out the demos in the video: - On-board speech translation - Natural language controlling hardware - Vision & sound generating music https://t.co/Hav2VwI7G1

4

AI Vibe Coding Is Broken. Strict Engineering Fixes It.

Author: nunomaduro (@enunomaduro)
Posted: 2026-05-28 06:31 +07
Original link: https://x.com/enunomaduro/status/2059779570061242704
Auto-generated article: not generated

Body

AI vibe coding is broken the best engineers in 2026 won't just prompt.. they'll build stricter systems around AI types. tests. patterns. static analysis. ci. guardrails new talk is live: https://youtu.be/96To5-uJbog?si=2pmTLgPGsMjJ9vhB

5

🚨 MICROSOFT JUST OPEN-SOURCED SELF-EVOLVING AGENT SKI…

Author: Charly Wargnier (@DataChaz)
Posted: 2026-05-28 15:08 +07
Original link: https://x.com/DataChaz/status/2059909626532155482
Auto-generated article: not generated

Body

🚨 MICROSOFT JUST OPEN-SOURCED SELF-EVOLVING AGENT SKILLS You can now train agent skills the exact same way you train AI models, and watch them get better over time. It's called SkillOpt, and it's 100% free and open-source. Until now, building agent workflows has been pure

6

New in Claude Code (research preview): dynamic workfl…

Author: ClaudeDevs (@ClaudeDevs)
Posted: 2026-05-29 00:05 +07
Original link: https://x.com/ClaudeDevs/status/2060044853279617150
Auto-generated article: not generated

Body

New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started.

7

You can also set "/effort ultracode", a new effort le…

Author: ClaudeDevs (@ClaudeDevs)
Posted: 2026-05-29 00:05 +07
Original link: https://x.com/ClaudeDevs/status/2060044857339724158
Auto-generated article: not generated

Body

You can also set "/effort ultracode", a new effort level that runs at xhigh and lets Claude decide on its own when a task warrants a dynamic workflow.

8

Claude Code is amazing. But every time you start…

Author: PA13L0 (@Fluyeporlaweb)
Posted: 2026-05-28 13:00 +07
Original link: https://x.com/Fluyeporlaweb/status/2059877502957310059
Auto-generated article: not generated

Body

Claude Code is amazing. But every time you start it up in a new project, it begins burning tokens reading files it should already know. Someone published the repo that solves exactly that. ✅ Pre-indexed knowledge graph of your codebase for Claude Code ✅ Fewer

9

Your localhost just got a public url! Sharing your lo…

Author: Confidence (@megaconfidence)
Posted: 2026-05-28 23:11 +07
Original link: https://x.com/megaconfidence/status/2060031096327164079
Auto-generated article: not generated

Body

Your localhost just got a public url! Sharing your local dev sessions is now super easy in Wrangler or the Cloudflare Vite plugin. Just press T to create a tunnel, you get a public url with no config needed There's also an option to use your custom domains to create tunnels for

10

We’ve updated Claude Code's built-in claude-api skill…

Author: ClaudeDevs (@ClaudeDevs)
Posted: 2026-05-28 23:59 +07
Original link: https://x.com/ClaudeDevs/status/2060043213600367030
Auto-generated article: not generated

Body

We’ve updated Claude Code's built-in claude-api skill migration guidance for 4.8. Run "/claude-api migrate" to update your model strings and suggest prompt improvements that are tuned for Opus 4.8.

11

Malware Blocking and Dependency Policies in Composer…

Author: Laravel News (@laravelnews)
Posted: 2026-05-29 00:02 +07
Original link: https://x.com/laravelnews/status/2060043942448496836
Auto-generated article: not generated

Body

Malware Blocking and Dependency Policies in Composer 2.10 posted by @ericlbarnes https://laravel-news.com/malware-blocking-and-dependency-policies-in-composer-210

12

Prompting best practices

Author: ClaudeDevs (@ClaudeDevs)
Posted: 2026-05-29 00:04 +07
Original link: https://x.com/ClaudeDevs/status/2060044553982460222
Auto-generated article: not generated

Body

See our prompting guide for more tips and best practices for working with Opus 4.8: https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices

13

Last week we rolled out sandboxes that @Railway's age…

Author: Cody De Arkland (@Codydearkland)
Posted: 2026-05-29 00:18 +07
Original link: https://x.com/Codydearkland/status/2060047970113057133
Auto-generated article: not generated

Body

Last week we rolled out sandboxes that @Railway's agent can use to execute different alongside your applications. Sandboxes give the Railway agent a computer they can work with. File access, space to execute and test, a space to create PRs, etc... Check it out!

14

Dynamic workflows and adversarial code review was par…

Author: Jarred Sumner (@jarredsumner)
Posted: 2026-05-29 00:28 +07
Original link: https://x.com/jarredsumner/status/2060050578026189172
Auto-generated article: not generated

Body

Dynamic workflows and adversarial code review was part of what made it possible to rewrite Bun in Rust in 6 days. https://twitter.com/claudedevs/status/2060044853279617150

15

Anthropic's new Opus 4.8 scores 3.6% lower than GPT 5…

Author: Cline (@cline)
Posted: 2026-05-29 01:21 +07
Original link: https://x.com/cline/status/2060063889874972905
Auto-generated article: not generated

Body

Anthropic's new Opus 4.8 scores 3.6% lower than GPT 5.5 on Terminal-Bench 2.1. Available to compare side-by-side in Cline now. (They also announced a plan to release new models with higher intelligence than Opus after adding stronger cyber safeguards in the coming weeks.)

16

Anthropic to release "Mythos-class" AI to the general public within weeks

Author: 日本経済新聞電子版（日経電子版） (@nikkei)
Posted: 2026-05-29 03:55 +07
Original link: https://x.com/nikkei/status/2060102811468107963
Auto-generated article: not generated

Body

US-based Anthropic to release "Mythos-class" AI to the general public in a matter of weeks https://www.nikkei.com/article/DGXZQOGN28D7M0Y6A520C2000000/?n_cid=SNSTW001&n_tw=1780001602

17

I Tried Taylor Otwell's Dead Simple Dev Setup.. Now I…

Author: nunomaduro (@enunomaduro)
Posted: 2026-05-29 05:17 +07
Original link: https://x.com/enunomaduro/status/2060123434001309723
Auto-generated article: not generated

Body

I Tried @taylorotwell's Dead Simple Dev Setup.. Now I Get It New Video: https://youtu.be/HkNJA5yqWSY?si=hdlz30wdSlz7zYfq

18

For large development projects, I recommend a multi-tier subcontracting structure of human ←→ Opus ←→ Codex (←→ Cursor)…

Author: Kenn Ejima (@kenn)
Posted: 2026-05-29 07:31 +07
Original link: https://x.com/kenn/status/2060157012818985422
Auto-generated article: not generated

Body

For large development projects, I recommend a multi-tier subcontracting structure: human ←→ Opus ←→ Codex (←→ Cursor). Opus is good at reading intent and grasping the essence, but when you have it write code, its mistakes cause too much rework. Codex is good at carrying out an implementation without omissions, but it can get caught up in details and miss the essence.

19

However, for some reason Codex writes plans that are more token-efficient, broader, deeper, and shorter, so have Codex…

Author: Kenn Ejima (@kenn)
Posted: 2026-05-29 07:44 +07
Original link: https://x.com/kenn/status/2060160347173167201
Auto-generated article: not generated

Body

However, for some reason Codex writes plans that are more token-efficient, broader, deeper, and shorter, so what works best is having Codex write the plan and then having Claude review it. Claude's plans go too far into detail, so it works better when Claude sticks to the role of critic.

20

Today we’re releasing DeepSWE, a new standard for age…

Author: Serena Ge (Datacurve) (@serenaa_ge)
Posted: 2026-05-26 23:18 +07
Original link: https://x.com/serenaa_ge/status/2059308218564890875
Auto-generated article: not generated

Body

Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.

21

A 48-year-old British woman decided to simply…

Author: Rafa Gonzalez | IA (@ElCopyMaster)
Posted: 2026-05-28 21:17 +07
Original link: https://x.com/ElCopyMaster/status/2060002480180666515
Auto-generated article: not generated

Body

A 48-year-old British woman decided to simply play a prank, but ended up earning $2,245 in 4 hours. With Claude's help, she created a Chinese girl and started a live stream. Claude changed the appearance, the background, the voice, and even the language in real time. https://twitter.com/ElCopyMaster/status/2058454580564832701

22

Today, we're releasing LFM2.5-8B-A1B, a device-optimi…

Author: Liquid AI (@liquidai)
Posted: 2026-05-28 22:40 +07
Original link: https://x.com/liquidai/status/2060023455290974474
Auto-generated article: not generated

Body

Today, we're releasing LFM2.5-8B-A1B, a device-optimized model designed to power real-life applications on phones, laptops, PCs, robots, and fast & lightweight server-side use-cases. > 8B MoE, 1.5B active > Expanded 128K context > LFM2.5 flagship hybrid MoE architecture >

23

AI coding agents can write code, but they can't see i…

Author: Chrome for Developers (@ChromiumDev)
Posted: 2026-05-29 04:41 +07
Original link: https://x.com/ChromiumDev/status/2060114203621335523
Auto-generated article: not generated

Body

AI coding agents can write code, but they can't see if it actually works. Chrome DevTools for agents 1.0 fixes this. The stable release brings powerful browser debugging, emulation, and automated audits to your AI assistants via our Chrome DevTools MCP server. 👁️ Give your

24

Got some hard data - I was wrong. Had Datacurve run t…

Author: Theo - t3.gg (@theo)
Posted: 2026-05-29 07:16 +07
Original link: https://x.com/theo/status/2060153250167615615
Auto-generated article: not generated

Body

Got some hard data - I was wrong. Had Datacurve run the numbers for "tokens used by pass/fail" for DeepSWE. Bad models use way more tokens in fail cases, but SOTA models are much closer. GPT 5.5 used ~7% MORE tokens on correct answers! https://twitter.com/theo/status/2060136670947893740

25

Why can an LLM that only completes text generate and recognize images?

Author: tsuemura (@tsueeemura)
Posted: 2026-05-29 08:42 +07
Original link: https://x.com/tsueeemura/status/2060174971679420874
Auto-generated article: not generated

Body

Multimodal AI: I never really understood why an LLM can read images, but this article explained it very clearly #jassttohoku https://zenn.dev/karamage/articles/0bfd00c7c8d898

26

To ensure our grading is fair and reliable, we built…

Author: Serena Ge (Datacurve) (@serenaa_ge)
Posted: 2026-05-26 23:20 +07
Original link: https://x.com/serenaa_ge/status/2059308694781997439
Auto-generated article: not generated

Body

To ensure our grading is fair and reliable, we built a trajectory analysis agent to replay agent rollouts and map out exactly why they fail. Running it on existing benchmarks surfaced significant grading noise, with verifiers rejecting valid code or letting models read solutions

27

opus 4.8 is expensive, but this is insane

Author: el.cine (@EHuanglu)
Posted: 2026-05-29 11:20 +07
Original link: https://x.com/EHuanglu/status/2060214622851059903
Auto-generated article: not generated

Body

opus 4.8 is expensive, but this is insane https://twitter.com/ehuanglu/status/2060051511548493962

28

■ Summary: This paper treats "having an LLM make games" not as idea generation but as turning game-design knowledge representations into executable…

Author: Trilog (@eda_u838861)
Posted: 2026-05-29 01:38 +07
Original link: https://x.com/eda_u838861/status/2060068197685174323
Auto-generated article: not generated

Body

■ Summary: This paper treats the topic of "having an LLM make games" not as idea generation but as the problem of turning game-design knowledge representations into executable Unity artifacts. At its core are gameplay design patterns, and among them goal patterns, which formalize the relationships among a player's objectives. The paper, goal

29

Nvidia will now pay you to put a mini AI data center…

Author: winkle. (@w1nklerr)
Posted: 2026-05-29 03:11 +07
Original link: https://x.com/w1nklerr/status/2060091525413884408
Auto-generated article: not generated

Body

Nvidia will now pay you to put a mini AI data center on your house It looks like a normal AC unit in the yard. But inside sits 16 Nvidia Blackwell GPUs and Dell servers. A startup called Span builds them, backed by Nvidia. They bolt onto your home and you get paid for the https://twitter.com/w1nklerr/status/2060057563991884060