Learn ChatGPT
TODO: Continue reading docs from here ..
ChatGPT was released publicly on 30 November 2022.
Quick Links:
Courses I found on Coursera:
- Prompt Engineering for ChatGPT ~ Vanderbilt University (USA, comparable to Cornell University): Click here
- Generative AI with Large Language Models: Click here
- Build AI Apps with ChatGPT, Dall-E, and GPT-4: Click here
Docs Links:
- OpenAI Platform:
- ChatGPT Release Notes of all versions: Click here
- Awesome official set of articles on a variety of topics, ranging from billing to usage of the OpenAI APIs: Click here
- Whisper (audio model by OpenAI, speech-to-text): Click here
- Pricing: $0.006 / minute (rounded to the nearest second)
- Deprecations of models, endpoints, etc.: Click here
ChatGPT can now hear and speak - Official Docs
NOTE: Only for Plus and Enterprise users.
Click here
When I asked ChatGPT to generate 10,000- and 2,000-word articles:

| Query | wordcounter.net | platform.openai.com/tokenizer |
| --- | --- | --- |
| 10,000-word article | 998 words, 6,438 characters | Tokens: 1,285, Characters: 6,484 |
| 2,000-word article | 1,128 words, 7,292 characters | Tokens: 1,491, Characters: 7,358 |
When I asked it to count up to 2,500, 5,000 and 10,000:
- It refuses for 5,000 and 10,000.
- It counts up to 2,500, but I had to keep pressing "Continue generating" after approximately every 500 numbers.
Date: 7 Sep, 2023
Quickstart tutorial - OpenAI end notes
Source: Click here
Completions
Correct
Incorrect:
List of gpt-3.5-turbo models from the API (/models):
gpt-3.5-turbo-16k-0613
gpt-3.5-turbo
gpt-3.5-turbo-16k
gpt-3.5-turbo-0613
gpt-3.5-turbo-0301
General Terminologies
Source: Official Quickstart Guide from OpenAI: Click here
- The completions endpoint is the core of our API and provides a simple interface that’s extremely flexible and powerful. You input some text as a prompt, and the API will return a text completion that attempts to match whatever instructions or context you gave it.
- Designing your prompt is essentially how you “program” the model.
- Prompt design isn’t the only tool you have at your disposal. You can also control completions by adjusting your settings. One of the most important settings is called temperature.
- You may have noticed that if you submitted the same prompt multiple times in the examples above, the model would always return identical or very similar completions. This is because your temperature was set to 0.
- Try re-submitting the same prompt a few times with temperature set to 1.
- See what happened? When temperature is above 0, submitting the same prompt results in different completions each time.
- Remember that the model predicts which text is most likely to follow the text preceding it. Temperature is a value between 0 and 1 that essentially lets you control how confident the model should be when making these predictions. Lowering temperature means it will take fewer risks, and completions will be more accurate and deterministic. Increasing temperature will result in more diverse completions.
- For your pet name generator, you probably want to be able to generate a lot of name ideas. A moderate temperature of 0.6 should work well.
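The temperature behaviour described above can be sketched as a temperature-scaled softmax over next-token scores. This is only an illustrative pure-Python sketch (the function and the example logits are my own, not OpenAI's internals):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Temperature 0 behaves like a hard argmax; for T > 0, divide the
    # logits by T before the softmax. Lower T sharpens the distribution
    # (more deterministic completions), higher T flattens it (more diverse).
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
print(softmax_with_temperature(logits, 0))  # -> [1.0, 0.0, 0.0]
low = softmax_with_temperature(logits, 0.5)
high = softmax_with_temperature(logits, 1.0)
# At lower temperature the top token takes even more of the probability mass.
```

This is why temperature 0 returns identical or near-identical completions for the same prompt, while higher values diversify them.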
Source of below image: Click here
DEEP DIVE - Understanding tokens and probabilities
Source: Official Quickstart Guide from OpenAI: Click here
Pricing - 1/2 - Most cost-effective model
Pricing - 2/2
Source Pricing: Click here
- Cost of 1k input tokens + cost of 1k output tokens = (0.0015 + 0.002) = $0.0035 (Rs. 2.90)
- Article from Open AI - What are tokens and how to count them? : Click here
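The per-request cost at those gpt-3.5-turbo rates can be computed directly. A minimal sketch using the rates noted above (the helper name is my own):

```python
INPUT_RATE = 0.0015   # $ per 1k input tokens (gpt-3.5-turbo, Sep 2023)
OUTPUT_RATE = 0.002   # $ per 1k output tokens

def request_cost_usd(input_tokens, output_tokens):
    # Cost scales linearly with token counts in each direction.
    return (input_tokens / 1000) * INPUT_RATE + (output_tokens / 1000) * OUTPUT_RATE

cost = request_cost_usd(1000, 1000)
print(round(cost, 4))  # -> 0.0035, matching the 1k-in + 1k-out example above
```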
Image - 1/2 - Free Trial gives you 5$ (Date: 5 September, 2023).
Image - 2/2 - Free Trial gives you 5$ (Date: 5 September, 2023).
❤️ ❤️ ❤️ Personalized model training ❤️ ❤️ ❤️ :
Rate Limits
Source: Click here
Tokenizer
Source: Official Tokenizer Page from ChatGPT: platform.openai.com/tokenizer
The GPT family of models process text using tokens, which are common sequences of characters found in text. The models understand the statistical relationships between these tokens, and excel at producing the next token in a sequence of tokens.
You can use the tool below to understand how a piece of text would be tokenized by the API, and the total count of tokens in that piece of text.
A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).
If you need a programmatic interface for tokenizing text, check out our tiktoken package for Python. For JavaScript, the gpt-3-encoder package for node.js works for most GPT-3 models.
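The ~4-characters-per-token rule of thumb above can be turned into a quick estimator. This sketch implements only the heuristic (the function name is my own); for exact counts, use tiktoken as mentioned:

```python
import math

def estimate_tokens(text: str) -> int:
    # Rule of thumb for common English text: ~1 token per 4 characters,
    # i.e. roughly 75 words per 100 tokens. This is an estimate only;
    # tiktoken gives the model's actual tokenization.
    return math.ceil(len(text) / 4)

sample = "The GPT family of models process text using tokens."
print(estimate_tokens(sample))  # 51 chars -> 13 tokens (estimated)
```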
Compatible models for each endpoint
Source - Docs: Click here
Zero Retention
Source - Docs: Click here
- To help identify abuse, API data may be retained for up to 30 days, after which it will be deleted (unless otherwise required by law). For trusted customers with sensitive applications, zero data retention may be available. With zero data retention, request and response bodies are not persisted to any logging mechanism and exist only in memory in order to serve the request.
- Note that this data policy does not apply to OpenAI’s non-API consumer services like ChatGPT or DALL·E Labs.