Unlike Anthropic's API, OpenAI's API doesn't seem to have a `max_tokens` setting, so I assume the max token output limit is applied to API calls automatically.
Otherwise the max token output limit stated on the models page would be meaningless.
OpenAI does have a `max_tokens` setting. For the `/chat/completions` API it defaults to the model's maximum output, but for the legacy `/completions` API it defaults to 16.
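To make the difference concrete, here is a minimal sketch of the two request payloads. The model names and prompt are illustrative, not a recommendation; the point is only where `max_tokens` sits and what omitting it implies on each endpoint.

```python
import json

# Payload for POST /v1/chat/completions.
# max_tokens is optional; if omitted, the model's maximum output applies.
chat_request = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 256,  # explicit cap on output tokens
}

# Payload for the legacy POST /v1/completions.
# max_tokens is omitted here, so it defaults to 16 on this endpoint.
legacy_request = {
    "model": "gpt-3.5-turbo-instruct",  # illustrative legacy model
    "prompt": "Hello",
}

print(json.dumps(chat_request, indent=2))
print(json.dumps(legacy_request, indent=2))
```

So on the legacy endpoint, leaving `max_tokens` unset quietly truncates responses to 16 tokens, which is an easy thing to trip over.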