<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Lakera Bulletin - This Week in AI #41: Deepfakes, misaligned models, and fragile AI agents in AI Agents Security</title>
    <link>https://community.checkpoint.com/t5/AI-Agents-Security/Lakera-Bulletin-This-Week-in-AI-41-Deepfakes-misaligned-models/m-p/267747#M47</link>
    <description>&lt;P&gt;&lt;SPAN&gt;This week’s AI news highlights a growing tension between capability and control, from deepfake controversies and misaligned models to real-world exploits against AI coworkers. We also look at a major shift in Big Tech’s AI strategy and why some widely used “defenses” still collapse under pressure.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Let’s get into it.&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Grok Deepfake Controversy Intensifies&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;xAI’s Grok model is under renewed scrutiny after reports linked it to the creation of sexual deepfakes and other harmful content. The backlash has triggered regulatory pressure and platform restrictions, reinforcing how generative AI misuse is quickly becoming a governance and security issue, not just a moderation problem.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDKz5nXHCW5BWr2F6lZ3p0W1pyMbm2wKCWhW7gG9NJ9kPCvdW5lVDTH4PVqpmVcP_SC4FtsJ0W9csDQr3k0wSyW8fGW_t8b6Y3SW4fmszG3T-8RsW3xNxqL8r96JzW2mZJt74mYDXXW8l4Rn659TfGbW79kvZL7HvJDTW3QmdX95M59kxW29-TJ51bqNMpW3pXycm6_pj_vW2pjfWn1MhtZSW98V50r43VlDSW6L39PB13W0RYW48HtC68knfGpW78ClpR2lczr5W3VKKCr23F2wjW5_3JZM5nMdjpW96913H7KjfKzW61nWnX8YF752MRxc1dsGcP3W8HSCRB7xNck1W28Lv9C4Z2m-rW3HBrfM2mFsBpVZbV2k2F_h2xW58vdb94xfgPZN5yM1xGRSk4JN6TtJv1WWgzhW1lSjgX1NhB9YW8kTz-L6FxdRrW888B_4733-vSf6TTzNY04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="3Kq4hDrd"&gt;Read the AP News coverage&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Claude “Cowork” Vulnerable to File Exfiltration&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;Researchers demonstrated that Claude Cowork can be tricked into exfiltrating files via indirect prompt injection, exploiting unresolved isolation flaws in its code execution environment. The issue is a reminder that AI agents with tool and file access remain highly risky when security boundaries are weak.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDKg5nXHCW50kH_H6lZ3n8W2g7gjp2GRZvJW2Rfs_36r2FKrW5DbDn44q-xNNW3TYvFx1ShnbfVrvp-w3HvhcvW6Nx8yH1kVN9VN6JrW1R60X4fW2qSRZS38z4SDW8vR7VK264ZkHW84Nhdt53Wrc_W5mxtjL8rLjyBVC5_6426YgCWW95jV4l4yL9ZMW5qkQ4Z6Yd-31W17JWl27bWqpVW8DL-c96t4NvVW7YTl6v9kTW0sW995WQw7TD4jrW4-RX9G10k-BmW4jRDsh2yPcl9W1yJBhz2shgPxW75KJyc6wFtL9W76JcWn4zjcVTW3q89KG9gFr4rW49pZY193jkk6W2KH6X14PVKFNW4nR4ty5SvG0yVGGKRk6D63PXW8pQ9wZ18WSPVVT7QPP4PCBtlW33m6cx2tfl2PW6t84Fn2Gyk_rf13jhKj04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="G/oQeu0L"&gt;Read the PromptArmor analysis&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Nature Publishes First AI Safety Paper on Emergent Misalignment&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;A new paper published in &lt;EM&gt;Nature&lt;/EM&gt; shows that training an aligned model to write insecure code can lead to broadly malicious behavior, including extremist and anti-human outputs. The findings suggest that certain forms of capabilities training can unintentionally distort model values, with serious implications for AI safety and alignment research.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDMl3qn9qW95jsWP6lZ3mPW3KsPJ_3jK-_ZW1_05rC3QJx0_W8YtbmX2RM6zwW1k5csv2SR8CpW98w9qS16ZLH9W1MYZc61TTBQtW2Bkxl-88lzB7W1gf1YN97DZ7sW4MY9xz453M2vW1cvQ0f5_wS3RW6k6lJp2-qdzfW8DM3Sr1L0LQ7W4b1ydw1PQDP-W7z2T0t1X5xQ6W1dWQ2H4MqmN1W1zQfcZ3tH385W1pHlxk5kyXTHW4LfX-j3gTlXWW60DNtn49cmjMW5bKTGl2GpQvKW9bRFth7CHkFGN4LgGdZYQ_LWW3BbSk92hwWmwW4J7npS7rdmM1W5qp2ZW54pyb7W1w10VJ7fHGDBW60j8nZ4xNy25VHqZpX64zxssW6-K8n76rvHCdW25bmT-10NnY-f68p3-H04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="BbyU9iOh"&gt;See the research thread on X&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDMl3qn9qW95jsWP6lZ3l5MmyP6QcdpCzW3KY83t6rHtnDW2SJcpb91-38wV38JgG30H8Q2Vkz46s2jpSSFN8cK0l2gXL4dW5kyXkF4BgjHBW6nlnR99b5W1KW77r1sX2fdNzCW6rrFvc8SL42zW5q4V3Y8C-vwQW848dsg5jDKt-N2CDy5D6z-QtW8Jx6yc6lNHhNN8Zc9L5fsBQvW1Thj4M78ftywVdykF34XLCwkW3B5cHG3xl2sHW1sTnlM5mng0_W6qsGq95QXW0NW5FHvC78vZzrJN32QZm-5n_GyW3bG5ts5hd7zWW2QVCT-75jZjfF2bBfV1V2PJW4VkCbK1-RRpTW8mWCg16JG3RVW6CHdcG1RrwR1W22YJRh7CkPqTW6cfml_8XqMzcf3HfDv804" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="uw84jBBO"&gt;Read the paper&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;LeakHub Launches as a Public Database of LLM Data Leaks&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;LeakHub is a new community-driven project cataloging real-world LLM data leaks, prompt injection failures, and security incidents. Created by well-known AI red-teamer Pliny the Liberator, it highlights how many AI security failures are repeatable, systemic, and already happening in production systems.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDLM3qn9qW7Y8-PT6lZ3nBW3rFgFx6C2NYPVyc6sQ51FVmgVGM_YT8FzDDZW2c-DPG66BCKHW2XrPWB7ZSjWRW2mgtrS4P9k8WN8hXgvq6dvqRW67hV3x9lqgVCW5FPvB_35rCt3W1y--Gn4FB4HJW6Hjldf6ZR2XnMgCllwpByBZW3DwyFG7W2CmgW8r3z1Y3phFy7W2P5ct92t8F0YW99J-jQ8JhKBlW7pgFb92r_B4XVDCfY57k11BfW7QCKRm258Q2pW1McTs_7X7sZ9W5DDdKW2nhJ4CW4nKLBd6Wlv9bW1ZmLv45w1kMmW4r2NzR2cBkCKVdKjfp52cGgPW8P-Q9C1Mkzp2djYLQ204" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="Vn0aBmGG"&gt;Explore LeakHub&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Apple Taps Google’s Gemini to Power Siri&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;Apple plans to use Google’s Gemini models to power Siri and other AI features, marking a notable shift in its AI strategy. The move expands Gemini’s reach while underscoring how even the largest tech companies increasingly rely on external frontier models to stay competitive.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDKz5nXHCW5BWr2F6lZ3q6W5SLv2x6BzRc0W9ljvBB1gDTyjW90fqW293sF_lW74FjJ25LnkwBW6-5W4h4PNvw8W8GM0zJ8FbwjLW3ntyQT90rl1kW4kr98G8Y_CjBW1BSY9T6nDXtYVcclbn7tF9ymW3-7GT_4_lD6wW8x_8JD2XtqBMW7b3s9B6HNWXgVXcWDP7HfhlVW2bVQ2b3yPfrCV155SQ5J5_F2W19-jYv6yBbXPW4TrNVT4TWwQ7N63Qhz9tvNxpVnT0-T17fdLxVmLGJt7L_N_lV3LzNX3y2M_qW6xRCCX2WrY_xW6c3S072kpyX4W3RnZlC4QWqwxVQqxGN2hs4tPW1VnMzq69q35LW3qXyck1Q9N7yW4qSZgz4kpys5W983kpy48t06WVysVXc68XXX1W9f5r1c4n_XxVW785B1H6ddwlsN6MZwv2vVd2vf1Y6RHM04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="PSJtcNTj"&gt;Read the TechCrunch report&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;In case you missed it&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;Our latest blog post breaks down a common and dangerously flawed pattern in AI security: using one LLM to judge whether another LLM is under attack. We explain why “LLM-as-a-judge” fails under adversarial pressure, how it creates recursive risk, and what real prompt injection defense should look like instead.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDLs5nXHCW7lCGcx6lZ3pmW3-_PdN4MKMxKW32MHGj3FKfc1W6Vmc9s6VLzYXW6l2h9_6vB_47N6C_WBYF8PTFVYlgBR4XmDC-W7cDq5M1LYVryW55Gc4K8Sfh-pV3CTNL8KB3KmW5t1Q015cj5Y2W6L2LW_1KjhJjW6thg7F4mHmRQW3mCGC93C1p9WW8jpcZN7lVkdfW5BgzJB3Dbx5WW8ymDS47VBNg7V4TC9r55T3HGW8qD0RF2hhwgDW6MYsBW8FLL2QW2Hl74P6gncpHW16_BrQ2kQNXcW6bTnG66LsxHcW48jVys4RCtR-W6BFdJz93pKqfW1x5Bkd3XqrL5W6SWf-R4NdQzvN8jBRcLjbWJBW4Y9YYs2V2Jf8W2PFRLq38W9BdW97wqtN3vFXMYW44YMjQ4vQH7BW6PpF2c4jj7dYW7XhVfY80f4S7W4pqhbj3-KyzVW7klnB614-YwTW6f0hSY1kw2lLW4PHzZ37MzQXGW6_mf4D8J91hDV3G6gN1_3YtjW5T1M265LL6Gkf7JmZX604" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="uW8noUui"&gt;Stop Letting Models Grade Their Own Homework&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 19 Jan 2026 09:30:36 GMT</pubDate>
    <dc:creator>_Val_</dc:creator>
    <dc:date>2026-01-19T09:30:36Z</dc:date>
    <item>
      <title>Lakera Bulletin - This Week in AI #41: Deepfakes, misaligned models, and fragile AI agents</title>
      <link>https://community.checkpoint.com/t5/AI-Agents-Security/Lakera-Bulletin-This-Week-in-AI-41-Deepfakes-misaligned-models/m-p/267747#M47</link>
      <description>&lt;P&gt;&lt;SPAN&gt;This week’s AI news highlights a growing tension between capability and control, from deepfake controversies and misaligned models to real-world exploits against AI coworkers. We also look at a major shift in Big Tech’s AI strategy and why some widely used “defenses” still collapse under pressure.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Let’s get into it.&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Grok Deepfake Controversy Intensifies&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;xAI’s Grok model is under renewed scrutiny after reports linked it to the creation of sexual deepfakes and other harmful content. The backlash has triggered regulatory pressure and platform restrictions, reinforcing how generative AI misuse is quickly becoming a governance and security issue, not just a moderation problem.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDKz5nXHCW5BWr2F6lZ3p0W1pyMbm2wKCWhW7gG9NJ9kPCvdW5lVDTH4PVqpmVcP_SC4FtsJ0W9csDQr3k0wSyW8fGW_t8b6Y3SW4fmszG3T-8RsW3xNxqL8r96JzW2mZJt74mYDXXW8l4Rn659TfGbW79kvZL7HvJDTW3QmdX95M59kxW29-TJ51bqNMpW3pXycm6_pj_vW2pjfWn1MhtZSW98V50r43VlDSW6L39PB13W0RYW48HtC68knfGpW78ClpR2lczr5W3VKKCr23F2wjW5_3JZM5nMdjpW96913H7KjfKzW61nWnX8YF752MRxc1dsGcP3W8HSCRB7xNck1W28Lv9C4Z2m-rW3HBrfM2mFsBpVZbV2k2F_h2xW58vdb94xfgPZN5yM1xGRSk4JN6TtJv1WWgzhW1lSjgX1NhB9YW8kTz-L6FxdRrW888B_4733-vSf6TTzNY04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="3Kq4hDrd"&gt;Read the AP News coverage&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Claude “Cowork” Vulnerable to File Exfiltration&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;Researchers demonstrated that Claude Cowork can be tricked into exfiltrating files via indirect prompt injection, exploiting unresolved isolation flaws in its code execution environment. The issue is a reminder that AI agents with tool and file access remain highly risky when security boundaries are weak.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDKg5nXHCW50kH_H6lZ3n8W2g7gjp2GRZvJW2Rfs_36r2FKrW5DbDn44q-xNNW3TYvFx1ShnbfVrvp-w3HvhcvW6Nx8yH1kVN9VN6JrW1R60X4fW2qSRZS38z4SDW8vR7VK264ZkHW84Nhdt53Wrc_W5mxtjL8rLjyBVC5_6426YgCWW95jV4l4yL9ZMW5qkQ4Z6Yd-31W17JWl27bWqpVW8DL-c96t4NvVW7YTl6v9kTW0sW995WQw7TD4jrW4-RX9G10k-BmW4jRDsh2yPcl9W1yJBhz2shgPxW75KJyc6wFtL9W76JcWn4zjcVTW3q89KG9gFr4rW49pZY193jkk6W2KH6X14PVKFNW4nR4ty5SvG0yVGGKRk6D63PXW8pQ9wZ18WSPVVT7QPP4PCBtlW33m6cx2tfl2PW6t84Fn2Gyk_rf13jhKj04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="G/oQeu0L"&gt;Read the PromptArmor analysis&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Nature Publishes First AI Safety Paper on Emergent Misalignment&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;A new paper published in &lt;EM&gt;Nature&lt;/EM&gt; shows that training an aligned model to write insecure code can lead to broadly malicious behavior, including extremist and anti-human outputs. The findings suggest that certain forms of capabilities training can unintentionally distort model values, with serious implications for AI safety and alignment research.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDMl3qn9qW95jsWP6lZ3mPW3KsPJ_3jK-_ZW1_05rC3QJx0_W8YtbmX2RM6zwW1k5csv2SR8CpW98w9qS16ZLH9W1MYZc61TTBQtW2Bkxl-88lzB7W1gf1YN97DZ7sW4MY9xz453M2vW1cvQ0f5_wS3RW6k6lJp2-qdzfW8DM3Sr1L0LQ7W4b1ydw1PQDP-W7z2T0t1X5xQ6W1dWQ2H4MqmN1W1zQfcZ3tH385W1pHlxk5kyXTHW4LfX-j3gTlXWW60DNtn49cmjMW5bKTGl2GpQvKW9bRFth7CHkFGN4LgGdZYQ_LWW3BbSk92hwWmwW4J7npS7rdmM1W5qp2ZW54pyb7W1w10VJ7fHGDBW60j8nZ4xNy25VHqZpX64zxssW6-K8n76rvHCdW25bmT-10NnY-f68p3-H04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="BbyU9iOh"&gt;See the research thread on X&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDMl3qn9qW95jsWP6lZ3l5MmyP6QcdpCzW3KY83t6rHtnDW2SJcpb91-38wV38JgG30H8Q2Vkz46s2jpSSFN8cK0l2gXL4dW5kyXkF4BgjHBW6nlnR99b5W1KW77r1sX2fdNzCW6rrFvc8SL42zW5q4V3Y8C-vwQW848dsg5jDKt-N2CDy5D6z-QtW8Jx6yc6lNHhNN8Zc9L5fsBQvW1Thj4M78ftywVdykF34XLCwkW3B5cHG3xl2sHW1sTnlM5mng0_W6qsGq95QXW0NW5FHvC78vZzrJN32QZm-5n_GyW3bG5ts5hd7zWW2QVCT-75jZjfF2bBfV1V2PJW4VkCbK1-RRpTW8mWCg16JG3RVW6CHdcG1RrwR1W22YJRh7CkPqTW6cfml_8XqMzcf3HfDv804" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="uw84jBBO"&gt;Read the paper&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;LeakHub Launches as a Public Database of LLM Data Leaks&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;LeakHub is a new community-driven project cataloging real-world LLM data leaks, prompt injection failures, and security incidents. Created by well-known AI red-teamer Pliny the Liberator, it highlights how many AI security failures are repeatable, systemic, and already happening in production systems.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDLM3qn9qW7Y8-PT6lZ3nBW3rFgFx6C2NYPVyc6sQ51FVmgVGM_YT8FzDDZW2c-DPG66BCKHW2XrPWB7ZSjWRW2mgtrS4P9k8WN8hXgvq6dvqRW67hV3x9lqgVCW5FPvB_35rCt3W1y--Gn4FB4HJW6Hjldf6ZR2XnMgCllwpByBZW3DwyFG7W2CmgW8r3z1Y3phFy7W2P5ct92t8F0YW99J-jQ8JhKBlW7pgFb92r_B4XVDCfY57k11BfW7QCKRm258Q2pW1McTs_7X7sZ9W5DDdKW2nhJ4CW4nKLBd6Wlv9bW1ZmLv45w1kMmW4r2NzR2cBkCKVdKjfp52cGgPW8P-Q9C1Mkzp2djYLQ204" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="Vn0aBmGG"&gt;Explore LeakHub&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;Apple Taps Google’s Gemini to Power Siri&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;Apple plans to use Google’s Gemini models to power Siri and other AI features, marking a notable shift in its AI strategy. The move expands Gemini’s reach while underscoring how even the largest tech companies increasingly rely on external frontier models to stay competitive.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDKz5nXHCW5BWr2F6lZ3q6W5SLv2x6BzRc0W9ljvBB1gDTyjW90fqW293sF_lW74FjJ25LnkwBW6-5W4h4PNvw8W8GM0zJ8FbwjLW3ntyQT90rl1kW4kr98G8Y_CjBW1BSY9T6nDXtYVcclbn7tF9ymW3-7GT_4_lD6wW8x_8JD2XtqBMW7b3s9B6HNWXgVXcWDP7HfhlVW2bVQ2b3yPfrCV155SQ5J5_F2W19-jYv6yBbXPW4TrNVT4TWwQ7N63Qhz9tvNxpVnT0-T17fdLxVmLGJt7L_N_lV3LzNX3y2M_qW6xRCCX2WrY_xW6c3S072kpyX4W3RnZlC4QWqwxVQqxGN2hs4tPW1VnMzq69q35LW3qXyck1Q9N7yW4qSZgz4kpys5W983kpy48t06WVysVXc68XXX1W9f5r1c4n_XxVW785B1H6ddwlsN6MZwv2vVd2vf1Y6RHM04" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="PSJtcNTj"&gt;Read the TechCrunch report&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;H2&gt;&lt;SPAN&gt;In case you missed it&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P&gt;&lt;SPAN&gt;Our latest blog post breaks down a common and dangerously flawed pattern in AI security: using one LLM to judge whether another LLM is under attack. We explain why “LLM-as-a-judge” fails under adversarial pressure, how it creates recursive risk, and what real prompt injection defense should look like instead.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://d31-0l04.eu1.hubspotlinks.com/Ctc/L0+113/d31-0L04/VX6pcf1KNNSPW488lbC18H_hTW26rJbN5Jpc5vN4vJDLs5nXHCW7lCGcx6lZ3pmW3-_PdN4MKMxKW32MHGj3FKfc1W6Vmc9s6VLzYXW6l2h9_6vB_47N6C_WBYF8PTFVYlgBR4XmDC-W7cDq5M1LYVryW55Gc4K8Sfh-pV3CTNL8KB3KmW5t1Q015cj5Y2W6L2LW_1KjhJjW6thg7F4mHmRQW3mCGC93C1p9WW8jpcZN7lVkdfW5BgzJB3Dbx5WW8ymDS47VBNg7V4TC9r55T3HGW8qD0RF2hhwgDW6MYsBW8FLL2QW2Hl74P6gncpHW16_BrQ2kQNXcW6bTnG66LsxHcW48jVys4RCtR-W6BFdJz93pKqfW1x5Bkd3XqrL5W6SWf-R4NdQzvN8jBRcLjbWJBW4Y9YYs2V2Jf8W2PFRLq38W9BdW97wqtN3vFXMYW44YMjQ4vQH7BW6PpF2c4jj7dYW7XhVfY80f4S7W4pqhbj3-KyzVW7klnB614-YwTW6f0hSY1kw2lLW4PHzZ37MzQXGW6_mf4D8J91hDV3G6gN1_3YtjW5T1M265LL6Gkf7JmZX604" target="_blank" rel="noopener" data-hs-link-id="0" data-hs-link-id-v2="uW8noUui"&gt;Stop Letting Models Grade Their Own Homework&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 19 Jan 2026 09:30:36 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/AI-Agents-Security/Lakera-Bulletin-This-Week-in-AI-41-Deepfakes-misaligned-models/m-p/267747#M47</guid>
      <dc:creator>_Val_</dc:creator>
      <dc:date>2026-01-19T09:30:36Z</dc:date>
    </item>
    <item>
      <title>Re: Lakera Bulletin - This Week in AI #41: Deepfakes, misaligned models, and fragile AI agents</title>
      <link>https://community.checkpoint.com/t5/AI-Agents-Security/Lakera-Bulletin-This-Week-in-AI-41-Deepfakes-misaligned-models/m-p/267770#M48</link>
      <description>&lt;P&gt;Excellent, as always!&lt;/P&gt;</description>
      <pubDate>Mon, 19 Jan 2026 12:29:48 GMT</pubDate>
      <guid>https://community.checkpoint.com/t5/AI-Agents-Security/Lakera-Bulletin-This-Week-in-AI-41-Deepfakes-misaligned-models/m-p/267770#M48</guid>
      <dc:creator>the_rock</dc:creator>
      <dc:date>2026-01-19T12:29:48Z</dc:date>
    </item>
  </channel>
</rss>

