❓
물음표살인마 블로그
  • README
  • ALGORITHM
    • Sieve of Eratosthenes
    • Round Up
    • Binary Search
    • Union Find
    • Sorting Array
    • Lcm, Gcd
  • TechTalk Review
    • Template
  • Books
    • CS Note for Interview
      • Ch1. Design Pattern & Programming paradigm
        • 1.1.1 Singleton Pattern
        • 1.1.2 Factory Pattern
        • 1.1.4 Observer Pattern
        • 1.1.5 Proxty Pattern & Proxy Server
        • 1.1.8 Model-View-Controller Pattern
        • 1.2.1 Declarative and Functional Programming
        • 1.2.2 Object Oriented Programming
      • Ch2. Network
        • 2.2.1 TCP/IP Four-Layer Model
        • 2.2.1-1 TCP 3, 4 way handshake
        • 2.3 Network Devices L4, L7
        • 2.4.1 ARP, RARP
        • 2.4.2 Hop By Hop Communication
        • 2.4.3 IP Addressing Scheme
      • Ch3. Operating System
        • 3.1.1 Roles and Structure of Operating Systems
        • 3.2.1 Memory Hierarchy
        • 3.2.2 Memory Management
        • 3.3.1 Processes and Compilation Process
        • 3.3.3 Memory Structure of a Process
        • 3.3.4 Process Control Block (PCB)
        • 3.3.5 Multiprocessing
        • 3.3.6 Threads and Multithreading
        • 3.3.7 Shared Resources and Critical Sections
        • 3.3.8 Deadlock
        • 3.4 CPU Scheduling Algorithm
      • Ch4. Database
        • 4.1 Database Basic
        • 4.2 Normalization
        • 4.3 Transaction and Integrity
        • 4.4 Types of Databases
        • 4.5 Indexes
        • 4.6 Types of Joins
        • 4.7 Principles of Joins
      • Ch5. Data Structure
    • Learning the Basics of Large-Scale System Design through Virtual Interview Cases
      • 1. Scalability based on user counts(1/2)
      • 1. Scalability based on user counts(2/2)
      • 2.Back-of-the-envelope estimation
      • 3. Strategies for System Design Interviews
      • 4. Rate Limiter
      • 5. Consistent Hashing
      • 6. Key-Value System Design
      • 7. Designing a Unique ID Generator for Distributed Systems
      • 8. Designing a URL Shortener
      • 9. Designing a Web Crawler
      • 10. Notification System Design
      • 11. Designing a News Feed System
      • 12. Chat System Design
      • 13. AutoComplete
      • 14. Design YouTube
      • 15. Design Google Drive
      • Loadbalancer Algorithms
      • Cache tier
      • CDN, Content Delivery Network
      • Stateless Web tier
    • Computer System A programmer's perspective
    • Effective Java
      • Item 1. Consider Static Factory Methods Instead of Constructors
      • Item 2. Consider a Builder When Faced with Many Constructor Parameters
      • Item 3. Ensure Singleton with Private Constructor or Enum Type
      • Item 4. Enforce Noninstantiability with a Private Constructor
      • Item 5. Prefer Dependency Injection to Hardwiring Resources
      • Item 6. Avoid Creating Unnecessary Objects
      • Item 7. Eliminate Obsolete Object References
      • Item 8. Avoid Finalizers and Cleaners
      • Item 9.Prefer try-with-resources to try-finally
      • Item10. Adhering to General Rules When Overriding equals
        • Handling Transitivity Issues
        • Ensuring Consistency
      • Item11. Override hashCode When You Override equals
      • Item12. Always Override toString
        • Always Override toString
      • Item13. Override Clone Judiciously
      • Item14. Consider Implementing Comparable
      • Item15. Minimize the Accessibility of Classes and Members
      • Item16. Accessor Methods Over Public Fields
      • Item17. Minimize Mutability
      • Item18. Composition over inherentance
      • Item19. Design and Document for Inheritance, or Else Prohibit It
      • Item20. Prefer Interfaces to Abstract Classes
      • Item21. Design Interfaces with Implementations in Mind
      • Item22. Use Interfaces Only to define Types
      • Item23. Prefer Class Hierarchies to Tagged Classes
      • Item24. Favor Static Member Classes Over Non-Static
      • Item28. Use Lists Instead of Arrays
      • Item29. Prefer Generic Types
      • Item30. Favor Generic Methods
    • Head First Design Patterns
      • Ch1. Strategy Pattern
      • Ch2. Observer Pattern
        • Ver1. Ch2. Observer Pattern
      • Ch3. Decorator Pattern
        • Ch3. Decorator Pattern
      • Ch4. Factory Pattern
      • Ch5. Singleton Pattern
      • Ch6. Command Pattern
      • Ch7. Adapter and Facade Pattern
      • Ch8. Template Method Pattern
    • Digging Deep into JVM
      • Chapter 2. Java Memory Area & Memory Overflow
      • Chapter 3. Garbage Collector & Memory Allocation Strategy (1/2)
      • Chapter 3. Garbage Collector & Memory Allocation Strategy (2/2)
      • Chapter 5. Optimization Practice
      • Chapter 6. Class file structure
      • Chapter 8. Bytecode Executor Engine (1/2)
  • Interview Practices
    • Restful API Practices
      • Url Shortener API
      • Event Ticket Reservation API
      • Course Management API
      • Search posts by tags API
      • Online Code platform API
      • Simple Task Management API
      • Event Participation API
      • Review System API
      • Car management API
      • Online Library
    • Tech Review
      • if(kakao)
        • Kakao Account Cache Migration / if(kakao)2022
        • Improving the Anomaly Detection System for KakaoTalk Messaging Metrics / if(kakao) 2022
        • Standardizing API Case Handling Without Redeployment / if(kakaoAI)2024
        • JVM warm up / if(kakao)2022
    • Naver Computer Science
      • Process & Thread
      • TCP & UDP
      • Spring & Servlet
      • Filter & Interceptor & AOP
      • Equals() & ==
      • Dependency Injection
      • Object Oriented Programming
  • F-Lab
    • Week1
      • Client & Server
      • HTTP
      • TCP/UDP
      • REST API
      • Questions
        • Object Oriented Programming
        • HTTP
        • Process & Thread
        • Data Structure
    • Week2
      • OSI 7 layer
      • Web vs WAS
    • Week3
      • RDB vs NoSQL
      • RDB Index
      • Cache
      • Redis
      • Messaging Queue
    • Week4
      • Project - Ecommerce
    • Week5
      • ERD - 1
    • Week6
      • Ecommerce - 2
      • Role
      • pw hashing && Salt
      • CreatedAt, ModifiedAt
      • JWT
      • Copy of ERD - 1
    • Week7
      • Vault (HashiCorp Vault)
    • Week 8
      • Api Endpoints
    • Week10
      • Product Create Workflow
  • TOY Project
    • CodeMentor
      • Implementation of Kafka
      • Project Improvement (Architectural Enhancements)
      • Communication between servers in msa
  • JAVA
    • MESI protocol in CAS
    • CAS (Compare and Set)
    • BlockingQueue
    • Producer & Consumer
    • Synchronized && ReentrantLock
    • Memory Visibility
    • Checked vs Unchecked Exception
    • Thread
    • Batch delete instead of Cascade
    • Java Questions
      • Week 1(1/2) - Basic Java
      • Week 1(2/2) - OOP
      • Week 2(1/2) - String, Exception, Generic
      • Week2(2/2) Lambda, Stream, Annotation, Reflection
      • Week3(1/2) Collections
      • Week3(2/2) Threads
      • Week4 Java Concurrency Programming
      • Week5 JVM & GC
    • Java 101
      • JVM Structure
      • Java Compiles and Execution Method
      • Override, Overload
      • Interface vs Abstract Class
      • Primitive vs Object Type
      • Identity and equality
      • String, StringBuilder, StringBuffer
      • Checked Exceptions and Unchecked Exceptions
      • Java 8 methods
      • Try-with-reources
      • Strong Coupling and Loose Coupling
      • Serialization and Deserialization
      • Concurrency Programming in Java
      • Mutable vs Immutable
      • JDK vs JRE
  • SPRING
    • DIP. Dependency Inversion Principal
    • Ioc container, di practice
    • @Transactional
    • Proxy Pattern
    • Strategy Pattern
    • Template Method Pattern
    • using profile name as variable
    • Spring Questions
      • Spring Framework
      • Spring MVC & Web Request
      • AOP (Aspect-Oriented Programming)
      • Spring Boot
      • ORM & Data Access
      • Security
      • ETC
  • DATABASE
    • Enhancing Query Performance & Stability - User list
    • Ensuring Data Consistency, Atomicity and UX Optimization (feat.Firebase)
    • Redis: Remote Dictionary Server
    • Database Questions
      • Week1 DBMS, RDBMS basics
      • Week2 SQL
      • Week3 Index
      • Week4 Anomaly, Functional Dependency, Normalization
      • Week5 DB Transaction, Recovery
    • Normalization
      • 1st Normal Form
      • 2nd Normal Form
      • 3rd Normal Form
  • NETWORK
    • HTTP & TCP head of line blocking
    • HTTP 0.9-3.0
    • Blocking, NonBlocking and Sync, Async
    • Network Questions
      • Week1 Computer Network Basic
      • Week2(1/3) Application Layer Protocol - HTTP
      • Week2(2/3) Application Layer Protocol - HTTPS
      • Week2(3/3) Application Layer Protocol - DNS
      • Week3 Application Layer
      • Week4 Transport Layer - UDP, TCP
      • Week5 Network Layer - IP Protocol
    • Network 101
      • https://www.google.com
      • TCP vs UDP
      • Http vs Https
      • TLS Handshake 1.2
      • HTTP Method
      • CORS & SOP
      • Web Server Software
  • OS
    • Operating System Questions
      • Week1 OS & How Computer Systems Work
      • Week2(1/2) Process
      • Week2(2/2) Thread
      • Week3 CPU Scheduling
      • Week4 Process Synchronize
      • Week5 Virtual Memory
    • Operating System 101
      • Operating system
        • The role of the operating system
        • The composition of the operating system.
      • Process
        • In Linux, are all processes except the initial process child processes?
        • Zombie process, orphan process
        • (Linux) Daemon process
        • Process address space
        • Where are uninitialized variables stored?
        • Determination of the size of the Stack and Heap
        • Access speed of Stack vs Heap
        • Reason for memory space partitioning
        • Process of compiling a process
        • sudo kill -9 $CURRENT_PID
      • Thread
        • Composition of a thread's address space
      • Process vs Thread
        • Creation of processes and threads in Linux
      • Multiprocessing
        • Web Browser
        • Implementation of multiprocessing
        • Application areas of multiprocessing
      • Multithreading
        • Application areas of multithreading
      • Interrupt
        • HW / SW Interrupt
        • Method of handling interrupts
        • Occurrence of two or more interrupts simultaneously
      • Polling
      • Dual Mode
        • Reason for distinguishing between user mode and kernel mode
      • System call
        • Differentiation between system calls
        • Types of system calls
        • Execution process of a system call
      • Process Control Block (PCB)
        • PCB의 구조
        • 쓰레드는 PCB를 갖고 있을까?
        • 프로세스 메모리 구조
      • Context switching
        • Timing of context switching
        • Registers saved during context switching
        • Context switching in processes
        • Context switching in threads
        • Difference between context switching in processes and threads
        • Information of the current process during context switching
      • Interprocess Communication (IPC)
        • Cases where IPC is used
        • Process address space in IPC Shared Memory technique
        • Types of IPC
  • COMPUTER SCIENCE
    • Computer Architecture 101
      • 3 components of a computer
      • RAM vs ROM
      • CPU vs GPU
      • SIMD
      • Two's complement
      • Harvard Architecture vs. von Neumann Architecture
      • The structure of a CPU.
      • Instruction cycle (CPU operation method)
      • Instruction pipelining
      • Bus
      • Memory area
      • Memory hierarchy structure
        • Reason for using memory hierarchy structure
      • Cache memory
      • L1, L2, L3 Cache
      • Locality of reference (cache)
      • Fixed-point vs Floating-point
        • epresentation of infinity and NaN (Not a Number) in floating-point
      • RISC vs CISC
      • Hamming code
      • Compiler
      • Linking
      • Compiler vs Interpreter
      • Mutex vs Semaphore
      • 32bit CPU and 64bit CPU
      • Local vs Static Variable
      • Page
  • Programming Paradigm
    • Declarative vs Imperative
  • JPA, QueryDsl
    • why fetchResults() is deprecated
  • PYTHON
    • Icecream
  • FASTAPI
    • Template Page
  • LINUX
    • Template Page
  • DATA STRUCTURE
    • Counting Sort
    • Array vs Linked List
  • GIT, Github
    • git clone, invalid path error
  • INFRA
    • Template Page
  • AWS
    • Server Log Archive Pipeline
    • Image Processing using Lambda
  • DOCKER
    • Docker and VM
    • Python Executable Environment
    • Docker commands
  • docker-compose
    • Kafka, Multi Broker
  • KUBERNATES
    • !Encountered Errors
      • my-sql restarts
      • kafka producer: disconnected
    • Kubernetes Components
    • Helm
      • Helm commands
    • Pod network
    • Service network
      • deployment.yaml
      • services.yaml
    • Service type
      • Cluster IP
      • NodePort
    • service-name-headless?
    • kube-proxy
  • GraphQL
    • Template Page
  • WEB
    • Template Page
  • Reviews
    • Graphic Intern Review
    • Kakao Brain Pathfinder Review
    • JSCODE 자바 1기 Review
  • 😁Dev Jokes
    • Image
      • Plot twist
      • Priorities
      • SQL join guide
      • Google is generous
      • Genie dislikes cloud
      • buggy bugs
      • last day of unpaid internship
      • what if clients know how to inspect
      • its just game
      • how i wrote my achievement on resume
      • self explanatory
      • chr(sum(range(ord(min(str(not))))))
Powered by GitBook
On this page
  • 서론
  • 배경 설명:
  • 산업 내 다른 사례들과의 비교
  • 카프카 도입 배경
  • 효과
  • 결론
  • 특히
  1. TOY Project
  2. CodeMentor

Implementation of Kafka

PreviousCodeMentorNextProject Improvement (Architectural Enhancements)

Last updated 1 year ago

서론

메시징 플랫폼은 현대 웹 서비스의 핵심 구성 요소 중 하나입니다. CodeMentor 서비스에서도 이러한 시스템의 중요성을 인식하고, Apache Kafka를 선택하게 된 배경에는 여러 이유가 있습니다. 본 글에서는 Kafka의 도입 이유를 설명하기에 앞서, CodeMentor 서비스의 기본 구조와 데이터 파이프라인에 대해 간략히 소개하고자 합니다.

배경 설명:

CodeMentor 서비스에서는 메세징 플랫폼이 필요했습니다.

앞서 카프카를 왜 사용하게 됬는지 이유를 얘기하기전에, 현재 서비스에 대한 간단한 배경지식을 설명해드리겠습니다.

해당 사진은 유저의 코드를 제출했을때의 데이터 파이프라인입니다.

CodeMentor 서비스의 데이터 파이프라인:

  1. 유저의 상호작용: 유저는 API Gateway를 통해 Server-Sent Events(SSE) 연결을 맺습니다. 이 연결을 통해 유저는 정답 코드를 제출하게 됩니다.

  2. 인증 및 라우팅: 제출된 코드는 먼저 인증 과정을 거친 후, URI prefix에 기반하여 적절한 서버(이 경우 Execute 서버)로 라우팅됩니다.

  3. Question 서버와의 상호작용: Execute 서버는 Question 서버에 문제 정보, 테스트케이스 및 언어별 실행 코드를 요청합니다.

  4. Kafka를 통한 데이터 처리: Usercode에 정보를 저장한 후, 이를 Kafka로 전송합니다. 여기서 Kafka는 효율적인 메시지 처리를 위한 프로듀서 역할을 합니다.

  5. Evaluation 서버의 역할: Evaluation 서버는 Kafka에서 이벤트를 리스닝하며, 코드 실행 서버와 GPT API를 비동기 방식으로 호출합니다. 이 과정에서 Kafka는 메시지 큐로서의 역할을 수행합니다.

  6. 결과의 전달: 처리된 결과는 다시 Kafka를 통해 Execute 서버로 전송되며, Execute 서버는 이를 리스닝하여 초기에 맺은 SSE 연결을 통해 유저에게 결과를 전달합니다.

산업 내 다른 사례들과의 비교

아래는 쿠팡이츠의 데이터 파이프라인입니다.

쿠팡이츠의 데이터 파이프라인은 Non real-time, Near real-time, Pure real-time 세 부분으로 나뉘어져 있으며, 데이터의 전처리, 후처리, 적재 단계에서 Kafka를 활용합니다. 특히, Pure real-time 부분에서 Kafka의 역할이 중요합니다.

아래는 우버의 데이터 파이프라인입니다.

우버 또한 비실시간, 실시간을 나눠서 처리하고 있습니다.

마지막으로 넷플릭스입니다.

1차적으로 수신의 접점으로 카프카에 적재되고 있습니다.

위 3가지 사레의 공통점은, 이벤트가 발생하는 모든곳에 카프카가 사용되고 있다는 것입니다. 이를 통해, 실시간성이 중요한 유저의 코드제출쪽에 적용할 수 있을거라 생각하였습니다.

카프카 도입 배경

이제 카프카의 도입에 대한 핵심 이유는 다음과 같습니다.

  • Code Execute 서버: 이 서버는 사용자의 코드를 실행하고 정답을 확인합니다. 하지만, 언어에 따라 실행 시간이 길어지는 문제가 있었습니다. 예를 들어, Python 코드는 평균 1.53초, Java 코드는 2.98초가 소요되었습니다.

  • ChatGPT Turbo 3.5 API: 사용자의 코드에 대한 평가를 담당합니다.

이러한 서버들의 특성상, 메시징 큐가 없는 경우 동기식 요청으로 인해 서버 과부하 및 안정성 문제가 발생했습니다.

효과

Kafka의 도입으로 안전하게 메세지를 차곡차곡 쌓아둘 수 있었고, 아래와 같은 효과를 얻었습니다:

  1. 메시지 관리의 효율성: Kafka를 사용하여 메시지를 안전하게 관리하고 순차적으로 처리할 수 있게 되었습니다.

  2. Evaluation 서버의 비동기 처리: Evaluation 서버는 Kafka에서 메시지를 하나씩 꺼내어 나머지 작업들을 비동기 방식으로 안전하게 처리합니다.

  3. 시스템의 동적 확장성: 15초마다 실행되는 Probe를 통한 헬스체크를 통해, Code Execution 파드는 동적으로 확장되어 많은 요청을 유연하게 처리할 수 있게 되었습니다. (Probe에 대한 설명은 다른 문서에서 다루겠습니다.)

결론

Kafka의 도입은 동시에 발생하는 요청들에 대해 신속한 처리를 가능하게 했으며, 요청 누락 문제를 해결했습니다.

특히

특히, 코딩 테스트와 같은 고부하 상황에서도 Kafka의 병렬성과 확장성을 활용하여 더 많은 토픽들을 효과적으로 처리할 수 있도록 시스템을 설계하면, Kafka 관련 문제를 최소화할 수 있을 것으로 기대됩니다.