Will GPT-4 take over our jobs?

devlaam · December 5, 2022, 11:56am

Lately I had the following conversation with GPT-3 from OpenAI. It does an impressive job at interactive code generation. Will we still be needed in, say, 10 years’ time? I am interested in your opinions. Below is my complete, uncensored conversation.

After pressing “Retry”:

markehammons · December 6, 2022, 9:43am

Doubtful. I asked it “using scala and cask, create a web server that responds with the nth prime, where n is the number of words in a post body to /primes”.

It’s not a hard question, and Cask is a very simple-to-use framework. It gave this as the proper code:

// Import the necessary libraries
import com.twitter.util.Future
import io.cask.Cask

// Define the main method
object Main extends Cask {
  // Define a function that calculates the nth prime number
  def nthPrime(n: Int): Int = {
    // TODO: Implement a function that calculates the nth prime number

    // For now, return a placeholder value
    0
  }

  // Define the main route that responds to POST requests to the /primes endpoint
  post("/primes") { req =>
    // Parse the post body and count the number of words
    val n = req.postBody.split(" ").length

    // Calculate the nth prime number
    val prime = nthPrime(n)

    // Return the nth prime number as the response
    Future.value(Response(Ok, prime.toString))
  }

  // Start the server on port 8080
  override def start(): Unit = {
    Await.ready(httpServer)
  }
}

Now I’m not a regular user of cask, so I accepted this as a correct answer at first blush. But then I thought “wait a second, lihaoyi doesn’t usually have package names like io.cask.Cask.” And then I realized that the “server code” it handed me was for an imaginary framework of its own invention that it was calling Cask for my benefit.
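For contrast, here is a minimal sketch of the part the generated code left as a TODO: counting the words in a body and computing the nth prime. It is plain Scala with no web framework, so it runs on its own. The names `Primes`, `wordCount`, `isPrime` and `nthPrime` are just illustrative, and treating n as 1-based (with no answer for an empty body) is an assumption, since the prompt doesn’t say what to do with zero words.

```scala
object Primes {
  // Counts whitespace-separated words; an empty or blank body has zero words.
  def wordCount(body: String): Int =
    body.trim.split("\\s+").count(_.nonEmpty)

  // Trial division up to sqrt(k); good enough for the small n a post body yields.
  def isPrime(k: Int): Boolean =
    k >= 2 && Iterator.from(2).takeWhile(d => d.toLong * d <= k).forall(d => k % d != 0)

  // 1-based: nthPrime(1) is Some(2). Returns None when n < 1 (assumed behaviour).
  def nthPrime(n: Int): Option[Int] =
    if (n < 1) None
    else Some(Iterator.from(2).filter(isPrime).drop(n - 1).next())
}
```

With this, `Primes.nthPrime(Primes.wordCount("the quick brown fox"))` yields the 4th prime. As far as I recall from the Cask documentation, a real route is a method annotated with `@cask.post("/primes")` inside an object extending `cask.MainRoutes`, with the body read via `request.text()`; treat that wiring as something to check against the docs rather than as tested code.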

I think this highlights a fundamental danger of GPT-4. In the version that OpenAI is providing, at least, if it doesn’t know the answer, it makes stuff up and then presents it very authoritatively as a correct answer. If you look at the amount of description it gave to the code in question, you’d think it knew very well what it was talking about.

Another fundamental danger is that despite sounding increasingly human, ChatGPT still doesn’t understand the subjects it’s generating words about. As an example, in the description of what its fake webserver does, there’s this line:

This is just flat out wrong, and is a symptom that ChatGPT doesn’t actually understand what it’s outputting. You may say that doesn’t matter too much because it’s managing to get (somewhat) correct answers without understanding, but it seems to me that it can only get approximately correct answers on massively-used technologies, and only then because it’s been trained on them. Otherwise, it falls on its face.

jducoeur · December 6, 2022, 2:20pm

I’m somewhere in-between here. (I think this topic is only marginally appropriate to this board, since it is in no way Scala-specific, but it’s a fair “meta” topic from a career POV.)

I agree with Mark that the current generation is nowhere near as ready for prime time as it looks at first blush – I’ve been watching tons of folks doing these sorts of experiments, and the large majority of the results are clear, confident, well-explained, and wrong in various subtle ways. The “make stuff up” effect is very strong, and more than anything it’s illustrating a point about human nature: we have a tendency to trust stuff that is stated confidently. That’s a problem when it comes to code.

So while there are a variety of ways in which these technologies worry me in the near term, replacing programmers isn’t one of them. Some folks will try it in the next year, and they will find that they spend so much time hunting bugs that they’ll realize it’s a bad idea.

That said, in the 10-year timeframe that @devlaam is asking about? I’m less sure. This is one of those “it’s a wonder that the bear dances at all” situations. The thing is, AFAIK ChatGPT wasn’t primarily designed to do programming. It’s a general-purpose chatbot, and remarkably good at that – by no means perfect, but quickly passing the uncanny valley and starting to look useful for a variety of conversational tests. (Including, probably, some malign ones.)

So assuming that folks are working on building versions of this that are focused on programming (which I suspect wouldn’t be horribly difficult), and given 10 years of evolution – yeah, I think it is likely that they will get to the point where they are producing output that is comparable to at least lowest-quartile programmers, and maybe a lot better than that. I suspect it will always run behind the cutting edge (since it requires training data), at least for the foreseeable future, but it’s worth keeping an eye on this stuff and its implications.

devlaam · December 6, 2022, 4:11pm

Yeah, I was indeed in doubt whether I should post this. I checked the guidelines and could not really find a reason not to. The trigger was the impact these developments may have on our lives, as well as the bot’s awareness of the difference between Scala 2 and Scala 3. This will change our work situation fast. For better or worse remains to be seen. This cartoon I saw on Reddit may say it all: https://i.redd.it/4e7ywl1kj74a1.png