Behavior of OpenAI models about as consistent as Office 365’s uptime

GPT-3.5 and GPT-4 – the models at the heart of OpenAI’s ChatGPT – appear to have got worse at generating some code and performing other tasks between March and June this year, according to experiments performed by computer scientists in the United States. The same tests also showed the models improved in some areas.