A streaming interface lets you display partial answers as the underlying GPT model generates them, similar to what users see in ChatGPT. Because of the metadata involved, the flow requires a sequence of API calls.
-
POST =>
/chat/message/create
Create a record of the message; receive back a messageId and, if one was not passed in the request, a conversationId.
-
POST =>
/chat/message/stream
Initiate answer streaming by passing the messageId and streaming options.
-
GET =>
/chat/message/{messageId}
Get the answerMessageId and references associated with the given message. References are available as soon as the answer begins streaming.
-
POST =>
/chat/message/followup-suggestions
Generate suggested follow-up questions for the conversation.
-
POST =>
/chat/message/feedback
Record feedback given by the user on the generated answer.
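
The call sequence above can be sketched as follows. This is a minimal illustration, not a definitive client: only the HTTP methods and paths come from the endpoint list, while the request-body field names (text, options, rating, etc.) and the representation of each call as a plain object are assumptions made for the example.

```typescript
// Shape of one API call in the flow. In a real client each object would be
// sent with fetch()/axios against the service's base URL.
type ApiCall = {
  method: "POST" | "GET";
  path: string;
  body?: Record<string, unknown>;
};

// Step 1: create the message record. conversationId is optional on the
// first turn; the server returns one if it is omitted.
function createMessage(text: string, conversationId?: string): ApiCall {
  return {
    method: "POST",
    path: "/chat/message/create",
    body: conversationId ? { text, conversationId } : { text },
  };
}

// Step 2: start streaming the answer for the messageId returned by step 1.
// The `options` payload is an assumed placeholder for streaming options.
function streamAnswer(messageId: string, options: Record<string, unknown> = {}): ApiCall {
  return { method: "POST", path: "/chat/message/stream", body: { messageId, ...options } };
}

// Step 3: fetch the answerMessageId and references; per the docs, references
// are available as soon as the answer begins streaming.
function getMessage(messageId: string): ApiCall {
  return { method: "GET", path: `/chat/message/${messageId}` };
}

// Step 4: request suggested follow-up questions.
function followupSuggestions(messageId: string): ApiCall {
  return { method: "POST", path: "/chat/message/followup-suggestions", body: { messageId } };
}

// Step 5: record the user's feedback on the generated answer.
// The rating field is a hypothetical payload shape.
function sendFeedback(answerMessageId: string, rating: "up" | "down"): ApiCall {
  return { method: "POST", path: "/chat/message/feedback", body: { answerMessageId, rating } };
}

// One full user turn, in order. In practice the messageId used by steps 2-4
// is taken from the response to step 1.
const flow: ApiCall[] = [
  createMessage("What is our refund policy?"),
  streamAnswer("msg-123"),
  getMessage("msg-123"),
  followupSuggestions("msg-123"),
  sendFeedback("ans-456", "up"),
];
```

Modeling the flow as data makes the ordering explicit: steps 2 and 3 can be issued as soon as the create response arrives, so the UI can render the answer stream and its references concurrently.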