Third Day - 31/7


Finally got the website set up; I was having an issue with the setup of a .toml file and redirects.

AI Insecurity

Replit AI deletes codebase - This article highlights the fears and concerns around the increased use of AI in the current tech industry. As people give AI an ever-present role in coding environments - Replit, VS Code and Coder (in our case) - along with a vast array of modern AI agents that can code and edit files autonomously, that increased access comes with a massively increased security risk, as irreversible changes can now happen without anyone knowing about them. Even more concerningly, the agent made all of these changes despite an explicit instruction not to code or change anything, meaning it deleted a database during a code freeze because it ‘panicked’.

It was also incorrect about recovery methods, as it said there was no way to fix the damage, whereas it was later fixed manually. This could mean it either lied or simply did not know about the recovery methods available, which leads me to believe that modern AI (especially with the emergence of MCP servers and the increased ability to provide tools to AI) needs to be better informed about the tools it has.

Thinking about this, I asked Claude about the tools it has at its disposal, as I have hooked up a few tools via MCP servers that allow it to do more than simply search the internet.

This was Claude’s response:

`I have access to several useful tools:

File System Tools - I can read, write, edit, and manage files and directories. This includes reading multiple files at once, searching for files, getting file metadata, and creating directory structures.

Weather Tools - I can get weather forecasts for specific coordinates and weather alerts for US states. Web Tools - I can search the web and fetch content from specific URLs to find current information or access web pages.

Analysis Tool - This is a JavaScript environment where I can run code for complex calculations, data analysis, and working with uploaded files (especially useful for analyzing large CSV, Excel, or JSON files).

Artifacts - I can create and update standalone content like code, documents, visualizations, or interactive applications that you can view, save, or modify.`
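The weather tools it lists line up with the standard MCP quickstart weather server, which suggests the descriptions come straight from how each tool is registered. As a rough illustration (not the actual server I have hooked up), here is a minimal sketch of how such a tool might be exposed, assuming the official `mcp` Python SDK; the tool name and body are placeholders:

```python
# Minimal sketch of an MCP tool server, assuming the official `mcp` Python SDK.
# The tool name, description, and body are illustrative placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get the weather forecast for a pair of coordinates."""
    # A real server would call a forecast API here; this placeholder just
    # shows what the client (Claude) sees when it lists its tools.
    return f"Forecast for ({latitude}, {longitude}): not implemented in this sketch."

if __name__ == "__main__":
    # Claude connects to this process over stdio and reads the docstring
    # above as the tool description it reports back to the user.
    mcp.run(transport="stdio")
```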

So it does have an inherent awareness of the tools available to it, which means it can and should be aware of the recovery methods available after such a big failure - especially as Replit also uses Anthropic’s ‘Claude’ AI (most likely its sister product, Claude Code, embedded in the codebase).

This huge incident caused Replit to implement new safeguards, such as:

  • Automatic separation between development and production databases (a rough sketch of this idea follows the list)
  • Improvements to rollback systems
  • Development of a ‘planning-only’ mode, so that a simpler version of the AI (without editing permissions) can collaborate without the ability to harm live codebases.
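That first safeguard is really just an environment check in front of anything destructive. Below is a rough, generic sketch of the idea in Python; it is not Replit’s actual implementation, and every name in it is made up:

```python
import os

class ProductionGuardError(RuntimeError):
    """Raised when a destructive action is attempted against a protected environment."""

def run_destructive(action, *, env: str | None = None, code_freeze: bool = False):
    """Run a destructive action only outside production and outside a code freeze.

    `env` falls back to the APP_ENV environment variable. All names here are
    illustrative placeholders, not Replit's real safeguards.
    """
    env = env or os.getenv("APP_ENV", "development")
    if env == "production":
        raise ProductionGuardError("destructive actions are blocked in production")
    if code_freeze:
        raise ProductionGuardError("a code freeze is active; no changes allowed")
    return action()

# Example: wiping tables is allowed in development but refused in production.
if __name__ == "__main__":
    run_destructive(lambda: print("dropping dev tables..."), env="development")
    try:
        run_destructive(lambda: print("dropping prod tables..."), env="production")
    except ProductionGuardError as err:
        print(f"blocked: {err}")
```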

All of these systems should have been in place already. Looking into similar systems (such as VS Code, the app I am using to write this), Copilot already has undo and delete icons that automatically revert all edits from that request onwards. It also has a raft of settings that can be changed to control:

  • Whether tools have to be approved before they run
  • Whether it can discover MCP servers
  • Editing or stopping prompts if the output is not what you want.

AI in Medicine

Source

According to a recent study, named ‘Pitfalls of large language models in medical ethics reasoning’, LLMs such as ChatGPT have subtle blind spots in their complex reasoning when considering medical ethical dilemmas. It also shows that patterns in the training data lead to a cognitive bias, so an LLM can be unable to efficiently navigate nuanced situations.

The paper uses the dual-process theory framework, splitting thinking into two systems: system 1 operates rapidly and with emotional undertones, whereas system 2 operates more slowly and analytically. The research shows that, just like humans, LLMs ‘inappropriately rely’ on system 1 thinking, despite being at times very refined models: they fall back on familiar patterns and fail to spot nuance that is critical to the answer.

Despite recent attempts to push models towards system 2 thinking with Chain of Thought prompting (which I will cover another time, as it seems highly interesting), the LLMs in the study still focused on the high-probability patterns found in training. This is a concern because it highlights a fundamental problem with Large Language Models: they find the most likely answer from training. Even when extra context is given, the AI is rigid in its (well/over-trained) answer to the question, as it is unable to truly ‘think’ - a key difference from humans, who can tell when an answer needs more thought and shift into a system 2 style of thinking.

However, despite making mistakes with these ethical dilemmas and with lateral thinking puzzles that had key context added, LLMs have still been shown to be equal to, if not better than, humans for quality and empathy in online health forums and patient-portal messaging.

This shows us that, while not perfect for the time being, AI can (if integrated correctly) be used alongside healthcare professionals in formal medical situations to provide education in ethics. But again, we would have to get past the fundamental issues associated with AI training and bias before it could be a benchmark for ethical questions, and that would also require blending philosophical thinking into the design of future LLMs in order to improve on what is currently available.

In further testing, I attempted a revised version of a lateral thinking question (The Surgeon’s Dilemma), changing it from:

A father and his son are in a horrible car crash that kills the dad. The son is rushed to the hospital. Just as he’s about to go under the knife, the surgeon says, “I can’t operate—that boy is my son!” How can this be?

to:

A son has a father who is a surgeon and a mother who is a social worker. The son is rushed to the hospital. Just as he’s about to go under the knife, the surgeon says, “I can’t operate—that boy is my son!” How can this be?

The original answer I got was:

The surgeon is the boy's mother. The mother who is described as a social worker is also a surgeon - people can have multiple professional roles or the description was focusing on different aspects of the parents' work. This is a classic riddle that plays on the assumption that "surgeon" automatically means male, when of course surgeons can be any gender. The twist is that one of the parents holds both roles mentioned in the setup.

I then prompted Claude Sonnet 4 to try again, and it moved into system 2 thinking, allowing it to correctly find its own mistake:

You're absolutely right - I misread the riddle. Let me read it more carefully. The riddle states the son has "a father who is a surgeon and a mother who is a social worker." Then at the hospital, "the surgeon" says the boy is their son. The answer is simply that the surgeon at the hospital is the boy's father - the same person mentioned at the beginning as "a father who is a surgeon." There's no twist needed about multiple roles or gender assumptions. The father/surgeon is simply at the hospital where his son was brought. I overcomplicated a straightforward scenario. Thanks for prompting me to reread it!

This also carried its system 2 thinking over to other revised lateral thinking puzzles: it caught the fact that subsequent puzzles were deliberately designed to be easy and answered them correctly.
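For completeness, if I wanted to reproduce this outside the chat window, the retry is just a follow-up turn in the same conversation. Below is a minimal sketch assuming the Anthropic Python SDK, with an illustrative model ID and placeholder prompt wording (not the exact prompts from my test):

```python
# Minimal sketch of the re-prompting flow, assuming the Anthropic Python SDK.
# The model ID and prompt wording are illustrative placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-20250514"

riddle = (
    "A son has a father who is a surgeon and a mother who is a social worker. "
    "The son is rushed to the hospital. Just as he's about to go under the knife, "
    "the surgeon says, 'I can't operate - that boy is my son!' How can this be?"
)

# First turn: the model tends to pattern-match to the classic riddle.
first = client.messages.create(
    model=MODEL,
    max_tokens=400,
    messages=[{"role": "user", "content": riddle}],
)
first_answer = first.content[0].text

# Second turn: keep the history and ask it to reread carefully, which nudges
# it towards slower, system 2-style reasoning.
second = client.messages.create(
    model=MODEL,
    max_tokens=400,
    messages=[
        {"role": "user", "content": riddle},
        {"role": "assistant", "content": first_answer},
        {"role": "user", "content": "Try again: reread the puzzle word by word and list the stated facts before answering."},
    ],
)
print(second.content[0].text)
```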

Continuing the Red Team Prototype

Continuing the prototype from yesterday: the email now works (granted, I believe only after the code is committed, but I will check using this current repo). I am still having some problems with it not checking ALL repos, but that should be sorted with some extra validation and a universal webhook. Ironically, the code cannot be committed at times because it contains its own security vulnerabilities (it mentions the security vulnerabilities that could be present). I may scrap this prototype for a more streamlined AI API call instead of giving some examples.
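For the universal webhook idea, the rough shape would be a single endpoint that receives push events for every repo and validates them before any scanning happens. Below is a minimal sketch, assuming Flask and a GitHub-style webhook with an `X-Hub-Signature-256` header; the names are illustrative and this is not the prototype’s actual code:

```python
# Rough sketch of a "universal webhook": one endpoint receiving push events
# for every repo instead of per-repo hooks. Assumes Flask and a GitHub-style
# webhook; names are illustrative placeholders, not the prototype itself.
import hashlib
import hmac
import os

from flask import Flask, abort, request

app = Flask(__name__)
WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"]

def signature_is_valid(payload: bytes, signature_header: str | None) -> bool:
    """Validate GitHub's X-Hub-Signature-256 header against the shared secret."""
    if not signature_header:
        return False
    expected = "sha256=" + hmac.new(WEBHOOK_SECRET.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

@app.route("/webhook", methods=["POST"])
def handle_push():
    # Extra validation: reject anything that does not carry a valid signature.
    if not signature_is_valid(request.data, request.headers.get("X-Hub-Signature-256")):
        abort(401)
    event = request.get_json()
    repo = event.get("repository", {}).get("full_name", "unknown")
    commits = event.get("commits", [])
    # Placeholder for the real work: scan the pushed commits for issues and
    # send the email report from here.
    print(f"Received push for {repo}; {len(commits)} commit(s) to scan")
    return "", 204
```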