Complete ChatGPT API Guide | Usage, Pricing, Prompt Engineering & Practical Examples

Key Takeaways

A complete, hands-on guide to using the ChatGPT API, covering everything from API key issuance to pricing, prompt engineering, streaming, and function calling, with practical examples throughout.

Real-World Experience: Written from experience introducing the ChatGPT API into a real-time chat moderation system that processes over 1,000 messages per second.

Introduction: “I Want to Put ChatGPT in My Service”

Real-World Problem Scenarios

Scenario 1: Automated Customer Inquiry Response
A team handles 100+ customer inquiries per day by hand; roughly 80% of them can be automated with the ChatGPT API.

Scenario 2: Automated Content Generation
Writing blog posts, product descriptions, and meta tags by hand takes too long; the API can auto-generate first drafts.

Scenario 3: Automated Code Review
Code reviews pile up on every Pull Request; basic reviews can be automated with the ChatGPT API.

The before-and-after shift looks like this:

flowchart LR
    subgraph Before[Manual Work]
        A1[Customer Inquiry]
        A2[Content Writing]
        A3[Code Review]
    end
    subgraph After[ChatGPT API]
        B1[Auto Response]
        B2[Auto Generation]
        B3[Auto Review]
    end
    Before --> After

1. Getting Started with ChatGPT API

API Key Issuance

  1. Create OpenAI Account: https://platform.openai.com/signup
  2. Issue API Key: https://platform.openai.com/api-keys
  3. Register Payment Method: https://platform.openai.com/account/billing
# Check API key
export OPENAI_API_KEY="sk-..."
echo $OPENAI_API_KEY

First API Call

First, a minimal call from Python:

# Python example
import openai

openai.api_key = "sk-..."

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)

The equivalent in Node.js:

// Node.js example
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(response.choices[0].message.content);

2. Model Selection and Pricing

Model Comparison

Model         | Input Price (1M tokens) | Output Price (1M tokens) | Features
--------------|-------------------------|--------------------------|-----------------------------
gpt-4-turbo   | $10                     | $30                      | Latest, fast, 128K context
gpt-4         | $30                     | $60                      | Most powerful, 8K context
gpt-3.5-turbo | $0.50                   | $1.50                    | Fast and cheap, 16K context

Token Calculation

Use tiktoken to count tokens before sending a request:

import tiktoken

def count_tokens(text, model="gpt-4"):
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

text = "How to use ChatGPT API"
tokens = count_tokens(text)
print(f"Token count: {tokens}")
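If installing tiktoken isn't an option, a common rule of thumb for English text is roughly 4 characters per token. This is only a ballpark heuristic, not the real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    Use tiktoken for exact counts; this is only a ballpark heuristic."""
    return max(1, len(text) // 4)

print(estimate_tokens("How to use ChatGPT API"))  # 5 (heuristic; tiktoken gives the exact count)
```

The heuristic is good enough for budget alarms, but always use tiktoken when truncating near a hard context limit.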

Cost Calculation Example

With token counts in hand, cost is simple arithmetic:

def calculate_cost(input_tokens, output_tokens, model="gpt-4-turbo"):
    prices = {
        "gpt-4-turbo": {"input": 10, "output": 30},
        "gpt-4": {"input": 30, "output": 60},
        "gpt-3.5-turbo": {"input": 0.5, "output": 1.5},
    }
    
    price = prices[model]
    cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
    return cost

# Example: 1000 tokens input, 500 tokens output
cost = calculate_cost(1000, 500, "gpt-4-turbo")
print(f"Cost: ${cost:.4f}")  # $0.0250

3. Prompt Engineering

Basic Principles

# ❌ Bad prompt
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Write code"}
    ]
)

# ✅ Good prompt
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a Python expert."},
        {"role": "user", "content": """
Write code to read a CSV file and analyze data in Python.

Requirements:
1. Use pandas library
2. Handle missing values
3. Output descriptive statistics
4. Include comments

Input: sales.csv (date, product, quantity, price columns)
Output: Total revenue by product
"""}
    ]
)

Few-Shot Learning

A few labeled examples in the message history steer the model toward the desired output format:

messages = [
    {"role": "system", "content": "AI that classifies customer inquiries."},
    {"role": "user", "content": "When will it be delivered?"},
    {"role": "assistant", "content": "Category: Delivery"},
    {"role": "user", "content": "I want a refund"},
    {"role": "assistant", "content": "Category: Refund"},
    {"role": "user", "content": "The product is defective"},
]

response = openai.chat.completions.create(
    model="gpt-4",
    messages=messages
)

Chain of Thought

Asking the model to reason step by step improves accuracy on multi-step problems:

prompt = """
Problem: 3 apples cost 2000 won, 2 bananas cost 3000 won.
How much for 5 apples and 3 bananas?

Let's think step by step:
1. Calculate price per apple
2. Calculate price per banana
3. Calculate price for 5 apples
4. Calculate price for 3 bananas
5. Calculate total
"""

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)

4. Streaming Response

Basic Streaming

Pass stream=True to receive the response chunk by chunk:

# Python
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a long story"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

The same in Node.js:

// Node.js
const stream = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Tell me a long story' }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  process.stdout.write(content);
}

Streaming in Web Application

In a Next.js app, Vercel's ai package handles the streaming plumbing. The API route:

// Next.js API Route
import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();
  
  const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  });

  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages,
    stream: true,
  });

  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}

And the client component:

// Client
'use client';

import { useChat } from 'ai/react';

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}

5. Function Calling

Basic Usage

Define a function schema, let the model decide when to call it, then feed the result back. (Newer API versions prefer the tools/tool_choice parameters; the legacy functions form below still illustrates the flow.)

import json

functions = [
    {
        "name": "get_weather",
        "description": "Get weather for a specific city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name (e.g., Seoul, Busan)"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["city"]
        }
    }
]

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me Seoul weather"}],
    functions=functions,
    function_call="auto"
)

# Check function call
if response.choices[0].message.function_call:
    function_name = response.choices[0].message.function_call.name
    arguments = json.loads(response.choices[0].message.function_call.arguments)
    
    # Execute actual function
    if function_name == "get_weather":
        weather = get_weather(**arguments)
        
        # Pass result back to GPT
        messages = [
            {"role": "user", "content": "Tell me Seoul weather"},
            response.choices[0].message,
            {"role": "function", "name": function_name, "content": str(weather)}
        ]
        
        final_response = openai.chat.completions.create(
            model="gpt-4",
            messages=messages
        )

Real Example: Database Query

A more realistic use: let GPT write SQL and run it against a local database. In production, validate or sandbox generated SQL before executing it:

import json
import sqlite3

def query_database(query: str):
    """Execute SQL query"""
    conn = sqlite3.connect('sales.db')
    cursor = conn.cursor()
    cursor.execute(query)
    results = cursor.fetchall()
    conn.close()
    return results

functions = [
    {
        "name": "query_database",
        "description": "Query information from sales database",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "SQL query to execute"
                }
            },
            "required": ["query"]
        }
    }
]

messages = [
    {"role": "system", "content": "You are a SQL expert. Table: sales (date, product, quantity, price)"},
    {"role": "user", "content": "What was the best-selling product last month?"}
]

response = openai.chat.completions.create(
    model="gpt-4",
    messages=messages,
    functions=functions,
    function_call="auto"
)

if response.choices[0].message.function_call:
    args = json.loads(response.choices[0].message.function_call.arguments)
    results = query_database(args["query"])
    
    messages.append(response.choices[0].message)
    messages.append({"role": "function", "name": "query_database", "content": str(results)})
    
    final_response = openai.chat.completions.create(
        model="gpt-4",
        messages=messages
    )
    
    print(final_response.choices[0].message.content)

6. Real Example: Customer Support Chatbot

Overall Structure

The overall flow:

flowchart TB
    User[User] --> Chat[Chatbot UI]
    Chat --> API[ChatGPT API]
    API --> Intent[Intent Classification]
    Intent --> FAQ[FAQ Search]
    Intent --> Ticket[Create Ticket]
    Intent --> Human[Connect Agent]
    FAQ --> Response[Generate Response]
    Ticket --> Response
    Response --> User

Implementation

A minimal implementation:

import openai
from typing import List, Dict

class CustomerSupportBot:
    def __init__(self, api_key: str):
        openai.api_key = api_key
        self.conversation_history: List[Dict] = []
        
    def classify_intent(self, message: str) -> str:
        """Classify user intent"""
        response = openai.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": """
Classify into one of these categories:
- Delivery: Delivery-related inquiry
- Refund: Refund/exchange inquiry
- Product: Product information inquiry
- Other: Everything else
                """},
                {"role": "user", "content": message}
            ],
            temperature=0
        )
        return response.choices[0].message.content.strip()
    
    def search_faq(self, intent: str, question: str) -> str:
        """Search FAQ for answer"""
        faq_data = {
            "Delivery": "Standard delivery takes 2-3 days, express delivery takes 1 day.",
            "Refund": "Refund available within 7 days of purchase.",
            "Product": "You can check on the product detail page."
        }
        return faq_data.get(intent, "Agent connection needed.")
    
    def generate_response(self, user_message: str) -> str:
        """Generate response"""
        # Classify intent
        intent = self.classify_intent(user_message)
        
        # Search FAQ
        faq_answer = self.search_faq(intent, user_message)
        
        # Add to conversation history
        self.conversation_history.append(
            {"role": "user", "content": user_message}
        )
        
        # Generate final response
        messages = [
            {"role": "system", "content": f"""
You are a friendly customer support AI.
User intent: {intent}
FAQ answer: {faq_answer}

Respond naturally and kindly based on the above information.
            """},
            *self.conversation_history
        ]
        
        response = openai.chat.completions.create(
            model="gpt-4",
            messages=messages,
            temperature=0.7
        )
        
        assistant_message = response.choices[0].message.content
        self.conversation_history.append(
            {"role": "assistant", "content": assistant_message}
        )
        
        return assistant_message

# Usage example
bot = CustomerSupportBot(api_key="sk-...")

print(bot.generate_response("When will it be delivered?"))
print(bot.generate_response("Is express delivery possible?"))
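One caveat: conversation_history above grows without bound, so every turn re-sends (and re-bills) the entire transcript. A simple mitigation is to keep only the most recent turns before each request; a sketch, where the window size of 10 is an arbitrary choice:

```python
def trim_history(history: list, max_messages: int = 10) -> list:
    """Keep only the most recent messages to bound per-request token usage."""
    return history[-max_messages:]

# 25 turns accumulated; only the last 10 are sent with the next request
history = [{"role": "user", "content": f"message {i}"} for i in range(25)]
trimmed = trim_history(history)
print(len(trimmed))            # 10
print(trimmed[0]["content"])   # message 15
```

For conversations where early context matters, a more elaborate approach is to summarize older turns into a single system message instead of dropping them.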

7. Cost Optimization

1. Choose Appropriate Model

Route tasks to the cheapest model that can handle them:

def choose_model(task_complexity: str) -> str:
    """Choose model based on task complexity"""
    if task_complexity == "simple":
        return "gpt-3.5-turbo"  # Classification, simple questions
    elif task_complexity == "medium":
        return "gpt-4-turbo"  # Complex reasoning
    else:
        return "gpt-4"  # Very complex tasks

2. Limit Token Count

Cap output length with max_tokens:

response = openai.chat.completions.create(
    model="gpt-4",
    messages=messages,
    max_tokens=500,  # Limit output tokens
    temperature=0.7
)

3. Use Caching

Cache responses to identical prompts:

from functools import lru_cache

@lru_cache(maxsize=1000)
def get_cached_response(prompt: str) -> str:
    """Cache identical prompts"""
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
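Note that lru_cache keeps entries for the life of the process and never expires them. If stale answers are a concern, a small time-based cache is easy to sketch; a simplified in-memory version (a production setup would more likely use Redis or similar):

```python
import time

class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl seconds."""
    def __init__(self, ttl: float = 3600):
        self.ttl = ttl
        self._store = {}  # prompt -> (timestamp, response)

    def get(self, prompt: str):
        entry = self._store.get(prompt)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]
        return None  # missing or expired

    def set(self, prompt: str, response: str):
        self._store[prompt] = (time.time(), response)

cache = TTLCache(ttl=3600)
cache.set("hello", "cached answer")
print(cache.get("hello"))    # cached answer
print(cache.get("missing"))  # None
```

Check the cache before calling the API and store the result afterwards; identical prompts within the TTL then cost nothing.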

4. Batch Processing

Group bulk work and run it through a cheaper model with short outputs:

def process_batch(prompts: List[str]) -> List[str]:
    """Process a list of prompts sequentially with a cheaper model"""
    responses = []
    for prompt in prompts:
        response = openai.chat.completions.create(
            model="gpt-3.5-turbo",  # Use cheaper model
            messages=[{"role": "user", "content": prompt}],
            max_tokens=100
        )
        responses.append(response.choices[0].message.content)
    return responses
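The loop above runs requests one at a time; for real batches, running them concurrently (within your rate limit) is much faster. A sketch using asyncio with a bounded semaphore — call_model here is a placeholder for an actual async client call (e.g. via AsyncOpenAI):

```python
import asyncio

async def call_model(prompt: str) -> str:
    # Placeholder: swap in an AsyncOpenAI chat.completions.create call here.
    await asyncio.sleep(0)  # simulate I/O
    return f"response to: {prompt}"

async def process_batch_concurrent(prompts, max_concurrency=5):
    """Run up to max_concurrency requests at a time, preserving order."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(prompt):
        async with sem:
            return await call_model(prompt)

    return await asyncio.gather(*(bounded(p) for p in prompts))

results = asyncio.run(process_batch_concurrent([f"Item {i}" for i in range(10)]))
print(len(results))  # 10
```

Keep max_concurrency below your account's rate limit; the semaphore is what prevents a burst of parallel requests from triggering 429 errors.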

8. Error Handling and Retry

Basic Error Handling

Wrap calls with retries and exponential backoff:

import time
from openai import OpenAIError, RateLimitError, APIError

def call_api_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = openai.chat.completions.create(
                model="gpt-4",
                messages=messages
            )
            return response
        
        except RateLimitError:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                print(f"Rate limit reached. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise
        
        except APIError as e:
            print(f"API error: {e}")
            if attempt < max_retries - 1:
                time.sleep(1)
            else:
                raise
        
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise
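The plain 2**attempt backoff above can synchronize retries across many clients hitting the same rate limit. Adding random jitter spreads them out; a small helper (the base and cap defaults are arbitrary choices):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter:
    a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

for attempt in range(5):
    print(f"attempt {attempt}: wait up to {min(30.0, 2.0 ** attempt):.0f}s")
```

To use it, replace `wait_time = 2 ** attempt` in the retry loop with `wait_time = backoff_delay(attempt)`.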

Timeout Settings

Set a request timeout so hung calls fail fast (parameter and exception names per the openai v1 Python SDK):

import openai

try:
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=messages,
        timeout=30  # 30-second request timeout
    )
except openai.APITimeoutError:
    print("Request timed out")

9. Security and Best Practices

Protect API Key

Never hard-code the key; load it from the environment:

# ❌ Bad example
openai.api_key = "sk-..."  # Direct input in code

# ✅ Good example
import os
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

Input Validation

Validate user input before forwarding it to the model:

def validate_input(user_input: str) -> bool:
    """Validate user input"""
    if len(user_input) > 4000:
        return False
    if contains_malicious_content(user_input):
        return False
    return True

def contains_malicious_content(text: str) -> bool:
    """Check for malicious content"""
    blocked_patterns = ["system:", "ignore previous", "jailbreak"]
    return any(pattern in text.lower() for pattern in blocked_patterns)

Output Filtering

Filter sensitive data out of model output before showing it to users:

def filter_output(response: str) -> str:
    """Filter sensitive information"""
    import re
    
    # Mask email
    response = re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
                     '***@***.***', response)
    
    # Mask phone number
    response = re.sub(r'\d{3}-\d{4}-\d{4}', '***-****-****', response)
    
    return response

10. Common Mistakes and Solutions

Problem 1: Token Limit Exceeded

Inputs that exceed the context window fail; truncate by tokens, not characters:

# ❌ Wrong code
long_text = "..." * 10000  # Too long text
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": long_text}]
)

# ✅ Correct code
import tiktoken

def truncate_text(text: str, max_tokens: int = 7000) -> str:
    encoding = tiktoken.encoding_for_model("gpt-4")
    tokens = encoding.encode(text)
    if len(tokens) > max_tokens:
        tokens = tokens[:max_tokens]
        text = encoding.decode(tokens)
    return text

truncated = truncate_text(long_text)
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": truncated}]
)

Problem 2: Cost Explosion

An uncontrolled loop on an expensive model burns money fast:

# ❌ Wrong code
for i in range(1000):
    response = openai.chat.completions.create(
        model="gpt-4",  # Expensive model
        messages=[{"role": "user", "content": f"Item {i}"}]
    )

# ✅ Correct code
# 1. Bundle into batch
batch_prompt = "\n".join([f"Item {i}" for i in range(1000)])
response = openai.chat.completions.create(
    model="gpt-3.5-turbo",  # Cheaper model
    messages=[{"role": "user", "content": batch_prompt}]
)

# 2. Monitor costs
total_cost = 0
for i in range(1000):
    response = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Item {i}"}]
    )
    cost = calculate_cost(
        response.usage.prompt_tokens,
        response.usage.completion_tokens,
        "gpt-3.5-turbo"
    )
    total_cost += cost
    
    if total_cost > 10:  # Stop if exceeds $10
        print("Cost limit exceeded!")
        break

Problem 3: Inconsistent Responses

For classification tasks, pin temperature to 0 and constrain the output:

# ❌ Wrong code
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Classify this"}],
    temperature=1.5  # Too high
)

# ✅ Correct code
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer with only one of these categories: A, B, C"},
        {"role": "user", "content": "Classify this text: ..."}
    ],
    temperature=0,  # Deterministic output
    max_tokens=10
)

Summary and Checklist

Key Summary

  • API Keys: issue one, then manage it securely with environment variables
  • Model Selection: Choose gpt-3.5-turbo / gpt-4 based on task complexity
  • Prompt Engineering: Clear instructions, Few-Shot, Chain of Thought
  • Streaming: Improve UX with real-time responses
  • Function Calling: Integrate external systems
  • Cost Optimization: Token limits, caching, appropriate model selection

Practical Checklist

  • Manage API key with environment variables
  • Input validation and output filtering
  • Error handling and retry logic
  • Monitor token count
  • Set cost limits
  • Response caching
  • Logging and monitoring

Related Guides

  • Practical LangChain Guide | Chain, Memory, Agent, RAG
  • RAG Implementation Guide | Vector DB, Embedding, Retrieval Augmented Generation
  • Complete Next.js 15 Guide | App Router, Server Actions

Keywords Covered

ChatGPT, OpenAI, API, GPT-4, Prompt Engineering, AI, Automation, Chatbot, LLM

Frequently Asked Questions (FAQ)

Q. How much does ChatGPT API cost?

A. gpt-3.5-turbo is $0.50/$1.50 per 1M tokens, gpt-4-turbo is $10/$30. A typical conversation costs about $0.001-0.01.
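As a rough worked example (the token counts are assumptions, not measurements): a conversation with 500 input tokens and 200 output tokens on gpt-3.5-turbo costs (500 × $0.50 + 200 × $1.50) / 1,000,000:

```python
# gpt-3.5-turbo: $0.50 per 1M input tokens, $1.50 per 1M output tokens
cost = (500 * 0.50 + 200 * 1.50) / 1_000_000
print(f"${cost:.5f}")  # $0.00055
```

which lands comfortably inside the $0.001–0.01 range quoted above.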

Q. Can I use it for free?

A. $5 credit is provided for new signups. After that, charged based on usage.

Q. Does it work well with Korean?

A. GPT-4 understands and generates Korean very well. gpt-3.5-turbo is sufficient for most cases.

Q. Is it safe to send personal information via API?

A. OpenAI does not use API data for model training. However, it’s recommended to mask sensitive information before sending.
